President's Message: For The Record

Aimee Fifarek

Aimee Fifarek (aimee.fifarek@phoenix.gov) is LITA President 2016-17 and Deputy Director for Customer Support, IT and Digital Initiatives at Phoenix Public Library, Phoenix, AZ.

This is my final column as LITA President. Having just finished the 2016/17 Annual Report, I must admit I'm a little tapped out. Over the last year I've written on the events of ALA Annual and Midwinter Conferences, a LITA Forum, a new strategic plan, information ethics, and advocacy. Even for an English Major and a Librarian, that's a lot of words.

As I work with Executive Director Jenny Levine and the rest of the LITA Board to prepare the agenda for our meetings at Annual, the temptation is to focus on all the work that is yet to be done. But with the end of school and fiscal years approaching, it is the ideal time to celebrate everything that has been accomplished over the last 12 months.

First off, at some magical point during the year we completed the LITA staff transition period. Jenny has truly made the Executive Director position her own, and although she and Mark Beatty have more than enough work for six people, they are well on their way to guiding LITA to a bright new future. With her knowledge of the inner workings of ALA and her desire to make everything easier, faster, and better, Jenny is truly the right person for this job.

Next, we have a great new set of people coming in to lead LITA. Andromeda Yelton is going to be a fabulous LITA President. She is an eloquent speaker, has more determination than anyone I know, and is a kick-ass coder to boot. Bohyun Kim has an amazing talent for organizing and motivating people, and as President-Elect will work wonders with the new Appointments Committee. Our new Directors-at-Large Lindsay Cronk, Amanda Goodman, and Margaret Heller are all devoted LITAns who will be great additions to the Board. I'm glad I get to work with them all in their new roles as I transition to Past-President.

And last but certainly not least, we have started to make inroads on our Advocacy and Information Policy strategic focus. The Privacy Interest Group has already raised LITA's profile by supplementing ALA's Intellectual Freedom Committee's Privacy Policies with Privacy Checklists.1 A group of Board members, along with Office for Information Technology Policy liaison David Lee King and Advocacy Coordinating Committee liaison Callan Bignoli, is working on a new Task Force proposal to outline strategies for effectively collaborating with the ALA Washington Office. These are just the first steps toward a future in which LITA is not only relevant but necessary.

With all that hard work accomplished, it must be time to toast to our successes. I hope that everyone who will be at ALA Annual in Chicago (http://2017.alaannual.org/) later this month will join us as we conclude our 50th Anniversary year. Sunday with LITA promises to be amazing, with Hugo Award winner Kameron Hurley (http://www.kameronhurley.com) speaking at the President's Program, followed by what is sure to be a spectacular LITA Happy Hour at The Beer Bistro (http://www.thebeerbistro.com/). We are still working on our goal to raise $10,000 for Professional Development scholarships. We're only halfway there, so please donate at https://www.crowdrise.com/lita-50th-anniversary.
Being LITA President during the Association's 50th Anniversary year has been both an honor and a challenge. During a milestone year like this you become acutely aware of all of the hard work and innovation that was required for the Association to thrive for half a century, and you feel more than a little pressure to leave an extraordinary legacy that will ensure another fifty years of success. It's a tall order, especially in an era of rapid political and societal change. But as I navigated my presidential year I realized that I didn't have to do anything more than ensure that people who already want to work hard for the greater good have a welcoming place to do just that.

After fifty years, LITA still has the thing that made it a success in the first place: a core group of volunteers committed to the belief that new technologies can empower libraries to do great things. The talented and passionate people I have worked with on the Board, in the Committee and Interest Group leadership, and throughout the membership are the best legacy that an Association can have. Now more than ever the people in libraries who "do tech" can be leaders in their communities and on the national stage. Now more than ever it is LITA's time to shine.

REFERENCES

1. http://litablog.org/2017/02/new-checklists-to-support-library-patron-privacy/

Managing Metadata for Philatelic Materials

Megan Ozeran

Megan Ozeran (megan.ozeran@gmail.com), a recent MLIS graduate of the San Jose State University School of Information, is winner of the 2017 LITA/Ex Libris Student Writing Award.

ABSTRACT

Stamp collectors frequently donate their stamps to cultural heritage institutions. As digitization becomes more prevalent for other kinds of materials, it is worth exploring how cultural heritage institutions are digitizing their philatelic materials. This paper begins with a review of the literature about the purpose of metadata, current metadata standards, and metadata that are relevant to philatelists. The paper then examines the digital philatelic collections of four large cultural heritage institutions, discussing the metadata standards and elements employed by these institutions. The paper concludes with a recommendation to create international standards that describe metadata management explicitly for philatelic materials.

INTRODUCTION

Postage stamps have existed since Great Britain introduced them in 1840 as a way to prepay postage. Historian and professor Winthrop Boggs (1955) points out that postage stamps have been collected by individuals since 1841, just a few months after the first stamps were issued (5). To describe this collecting and research, the term philately was coined by a French stamp collector, Georges Herpin, who "combined two Greek words philos (friend, amateur) and atelia (free, exempt from any charge or tax, franked)" (Boggs 1955, 7). Thus postage stamps and related materials, such as the envelopes to which they have been affixed, are considered philatelic materials. In the United States, numerous societies have formed around philately, such as the American Philatelic Society, the Postal History Society, the Precancel Stamp Society, and the Sacramento Philatelic Society (in northern California). The definitive United States authority on stamps and stamp collecting for nearly 150 years has been the Scott Postage Stamp Catalogue, which was first created by John Walter Scott in 1867 (Boggs 1955, 6). The Scott Catalogue "lists nearly all the postage stamps issued by every country of the world" (American Philatelic Society 2016).
Philately is a massively popular hobby, and cultural heritage institutions have amassed large collections of postage stamps through collectors' donations. In this paper, I will examine how cultural heritage institutions apply metadata to postage stamps in their digital collections. Libraries, archives, and museums have obtained specialized collections of stamps over the decades, and they have used various ways to describe these collections, such as through creating finding aids. Only recently have institutions begun to digitize their stamp collections and make the collections available for online review, as digitization in general has become more common in cultural heritage institutions.

PROBLEM STATEMENT

Textual materials have received much attention with regard to digitization, including the creation and implementation of metadata standards and schemas. Philatelic materials are not like textual materials, and are not even like photographic materials, which have also received some digitization attention. In fact, very little literature currently exists describing how metadata is or should be applied to philatelic materials, even though digital collections of these materials already exist. Therefore, the goal of this paper is to examine exactly how metadata is applied to digital collections of philatelic materials. Several related questions drove the research about this topic: As institutions digitize stamp collections, what metadata schema(s) are they using to do so? Are current metadata standards and schemas appropriate for these collections, or have institutions created localized versions? What metadata elements are most crucial in describing philatelic materials to enhance access in a digital collection?

LITERATURE REVIEW

While there is abundant literature regarding the use of metadata for library, archives, and museum collections, there is a dearth of literature that specifically discusses the use of metadata for philatelic materials. Indeed, there is no literature at all that analyzes best practices for philatelic metadata, despite the fact that several large institutions have already created digital stamp collections. Even among the many metadata standards that have been created, very few specify metadata guidelines for philatelic collections. It is clear that philatelic collections have not been highlighted in discussions over the last few decades about digitization, so best practices must be inferred from the more general discussions that have taken place.

The Purpose and Quality of Metadata

When considering why metadata is important to digital collections (of any type), it is crucial to remember, as David Bade (2008) puts it, "Users of the library do not need bibliographic records at all. . . . What they want is to find what they are looking for" (125). In other words, the descriptive metadata in a digital record is important only to the extent that it facilitates the discovery of materials that are useful to a researcher. As Arms and Arms (2004) point out, "Most searching and browsing is done by the end users themselves. Information discovery services can no longer assume that users are trained in the nuances of cataloging standards and complex search syntaxes" (236).
Echoing these sentiments, Chan and Zeng (2006) write, "Users should not have to know or understand the methods used to describe and represent the contents of the digital collection" (under "Introduction"). When creating digital records, then, institutions need to consider how the creation, display, and organization of metadata (especially within the search system) make it easier or more difficult for those end users to effectively search the digital collection.

How effective metadata is in facilitating user research ultimately depends on the quality of that metadata. Bade (2007) notes that information systems are essentially a way for an institution to communicate with researchers, and that this communication is only effective if metadata creators understand what the end users are looking for in the content and style of communication (3-4). Thus, in somewhat circular fashion, metadata quality is dependent upon understanding how best to communicate with end users. To help frame discussions of metadata quality, Bruce and Hillmann (2004) suggest seven factors to consider: "completeness, accuracy, provenance, conformance to expectations, logical consistency and coherence, timeliness, and accessibility" (243). Deciding how to prioritize one or several factors over the others will depend on the resources and goals of the institution, as well as the ultimate needs of the end users.

The State of Standards

Standards are created by various organizations to define the rules for applying metadata to certain materials in certain settings. Standards generally describe a metadata schema, "a formal structure designed to identify the knowledge structure of a given discipline and to link that structure to the information of the discipline through the creation of an information system that will assist the identification, discovery and use of information within that discipline" (CC:DA 2000, under "Charge #3"). Essentially, a metadata schema standard demonstrates how best to organize and identify materials to enhance discovery and use of those materials. Such standards are helpful to catalogers and digitizers because they define rules for how to include content, how to represent content, and/or what the allowable content values are (Chan and Zeng 2006, under "Metadata Schema").

Unfortunately, very few current metadata standards even mention philatelic materials, despite their unique nature. The only standard that appears to do so with any real purpose is the Canadian Rules for Archival Description (RAD), created by the Bureau of Canadian Archivists in 1990 and revised in 2008. Thirteen chapters comprise the first part of RAD, and these chapters describe the standards for a variety of media. Philatelic materials are given their own focus in chapter 12, which discusses general rules for philatelic description as well as specifics for each of nine areas of description: title and statement of responsibility, edition, issue data, dates of creation and publication, physical description, publisher's series, archival description, note, and standard number. RAD therefore provides a decent set of guidelines for describing philatelic materials.

The Encoded Archival Description Tag Library created by the Society of American Archivists (EAD3, updated in 2015) mentions philatelic materials only in passing. There is no specific section discussing how to properly apply descriptive metadata to philatelic materials.
The single mention of such materials in the entire EAD3 documentation appears in the description of a single element, where it is noted that "jurisdictional and denominational data for philatelic records" (257) may be recorded. Other standards don't appear to mention philatelic materials at all, so implementers of those standards must extrapolate based on the general information provided. For example, Describing Archives: A Content Standard (DACS), also published by the Society of American Archivists (2013), does not discuss philatelic materials in any way. It does note, "Different media of course require different rules to describe their particular characteristics . . ." (xvii), but the recommendations for specific content standards for different media listed in Appendix B still leave out philately (141-142). Institutions using DACS for philatelic materials need to determine how to localize the standard.

Although MARC similarly does not include specific guidelines for philatelic materials, Peter Roberts (2007) suggests ways to use it effectively for cataloging philatelic materials. For instance, in the MARC 655 field he suggests using Getty Art and Architecture Thesaurus terms to describe the form of the materials and Library of Congress Subject Headings to describe the subjects (genres) of the materials (86-87).
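As a concrete illustration of Roberts's MARC suggestion, the following is a minimal sketch built with the pymarc library. It assumes pymarc 4.x, whose Field constructor takes a flat list of subfield codes and values, and the particular vocabulary terms are invented for illustration; this is one plausible encoding, not a published cataloging guideline.

```python
from pymarc import Record, Field

# A skeletal record for a philatelic item, following Roberts's
# suggestion of pairing AAT form terms with LCSH terms in field 655.
record = Record()
record.add_field(
    # Form of the material from the Getty AAT
    # (second indicator 7 = vocabulary named in subfield $2).
    Field(tag='655', indicators=[' ', '7'],
          subfields=['a', 'Postage stamps.', '2', 'aat']),
    # Genre term from LCSH (second indicator 0 = LCSH).
    Field(tag='655', indicators=[' ', '0'],
          subfields=['a', 'Commemorative postage stamps.']),
)
print(record)
```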
In similar ways, most standards could potentially be applied to philatelic materials if an institution were to provide additional local rules for how best to implement the standard.

The Metadata that Philatelists Want

There are actually a good number of resources for determining what metadata is important to philatelic researchers. Boggs (1955) suggests that a philatelist may want to "study the methods of production; the origin, selection, and the subject matter of designs; their relation to the social, political and economic history of the country of issue; the history of the postal service which issued them" (1-2). These few initial research suggestions can provide some insight into what metadata elements would be most useful in a digital record. David Straight (1994) suggests that the most basic crucial items are the date and country of issue for an item (75).

Roberts (2007) provides significant background about philatelic materials and research, and indicates multiple metadata elements that will be helpful for researchers. He reiterates that dates are extremely useful and are often identified on the materials themselves; when specific dates are not visible, a stamp itself may provide evidence of an approximate year based on when the stamp was issued (75). He notes that many of the postal markings also "indicate the time and place of origin, route, destination, and mode of transportation" (78), which will also be of interest to philatelic researchers. If any information is available about the original collector, dealer, or exhibitor of the stamp before it was acquired by a cultural heritage institution, this may also be of great interest to a researcher (81). Roberts also suggests that the finding aids for philatelic collections are more crucial places for description than specific item records, and that controlled vocabulary subject terms are important in these descriptions (86).

Because the Scott Postage Stamp Catalogue is the leading United States authority on stamps, it can also suggest the metadata elements that primarily concern philatelic researchers. Each listing includes a unique Scott number, paper color, variety (e.g., perforation differences), basic information, denomination, color of the stamp, year of issue, value used/unused, any changes in the basic set information, and the total value of the set (Scott Publishing Co. 2014, 14A). The Scott Catalogue also describes a variety of additional components that researchers may be interested in, including the type of paper used, any watermarks, inks used, separation type, printing process used, luminescence, and gum condition (19A-25A).

One additional interesting source for deciding what metadata is important to researchers (aside from directly surveying them, of course) is a piece of software that was created to help philatelists catalog their own private collections. StampManage is available in United States and international versions, and it is largely based on the Scott Postage Stamp Catalogue in creating the full listing of stamps that may be available to a collector. It includes a wide variety of metadata elements for cataloging stamps, such as the Scott number, country of origin, date of issue, location of issue, type of stamp, denomination, condition, color, brief description, presence and type of perforations, category, plate block size, mint sheet size, paper type, presence and type of watermark, gum type, and so forth (Liberty Street Software 2016). As a product that is sold to stamp collectors, StampManage is likely to have a confident grasp of all the metadata that could possibly be important to its customers.
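Taken together, the Scott Catalogue and StampManage element lists imply a fairly stable record structure. The sketch below renders that structure as a Python dataclass; the field names and the split between required and optional elements are my own reading of the lists above, not a published schema.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class StampRecord:
    """One stamp, using elements common to the Scott Catalogue and
    StampManage lists discussed above (names are illustrative)."""
    identifier: str                     # institution's unique ID
    country_of_origin: str
    date_of_issue: str                  # ISO 8601 where known, e.g. "1840-05-01"
    denomination: str                   # e.g. "One Penny"
    scott_number: Optional[str] = None  # Scott Catalogue number, if assigned
    color: Optional[str] = None
    perforation: Optional[str] = None
    watermark: Optional[str] = None
    paper_type: Optional[str] = None
    gum_condition: Optional[str] = None
    description: str = ""
```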
This literature review helps create a holistic view of the issues faced by cultural heritage institutions with digitized stamp collections. Although little progress has been made in the literature to describe how best to apply metadata to philatelic materials, there are ways that institutions can extrapolate guidelines from the literature that does exist.

METHODOLOGY

To explore my research questions, I interviewed (over email) representatives of several large institutions with digitized stamp collections. The information provided by these institutions sheds light on the current state of metadata and metadata schemas for philatelic collections. Note that there are other institutions with online collections of postage stamps that are not discussed in this paper (e.g., the Swedish Postal Museum, https://digitaltmuseum.se/owners/S-PM). Due to my own language limitations, this paper is limited to analysis of online collections that are described in English. Additional research into institutions with non-English displays would support greater analysis of how cultural heritage institutions are currently creating and providing philatelic metadata.

RESULTS

Smithsonian National Postal Museum

In the United States, the largest publicly accessible digital collection of philatelic materials is from the Smithsonian National Postal Museum. I discussed the metadata for this collection with Elizabeth Heydt, Collections Manager at the museum. Ms. Heydt stated that the stamps are primarily identified "by their country and their Scott number" (E. Heydt, pers. comm., October 5, 2016). For digital collections, the Smithsonian National Postal Museum uses a Gallery Systems database called The Museum System, which includes the Getty Art and Architecture Thesaurus as an embedded thesaurus. Ms. Heydt noted that aside from this embedded thesaurus, they "do not use any additional, formalized data standards such as the Dublin Core, MODS," or the like. Of note, The Museum System does claim compliance with "standards including SPECTRUM, CCO, CDWA, DACS, CHIN, LIDO, XMP, and other international standards" (Gallery Systems 2015, 4).

The end-user interface that pulls data from The Museum System is called Arago, which has "an internal structure that built on the Scott Catalogue system and some internal choices for grouping and classifying objects for the philatelic and the postal history collections." Users can search and browse the entire digital collection through Arago, but Ms. Heydt did note that Arago "is in stasis right now as we are in the planning stages for an updated version sometime in the near future." Based on an example record (http://arago.si.edu/record_145471_img_1.html), the descriptive metadata currently available to end users include a title, Scott number, detailed description (including keywords), date of issue, medium, museum ID (a unique identifier), and place of origin. Digital images of the stamps are also included. A set of "breadcrumb" links at the top of the page also allows a user to browse each level of the digital collection, from an individual stamp record up to the entire museum collection as a whole.

Library and Archives Canada

I discussed the Library and Archives Canada (LAC) online philatelic collection with James Bone, Archivist at the LAC. He explained that the philatelic collection has had a complicated history:

Our philatelic collection largely began with the dissolution of the National Postal Museum … in 1989 and the subsequent division and transfer of its collection to the Canadian Postal Museum for artifacts/objects at the former Canadian Museum of Civilization (now the Canadian Museum of History) and to the Canadian Postal Archives at the former National Archives (which was merged with the National Library in the mid-2000s to create Library and Archives Canada). As a side note, both the Canadian Postal Museum and the Canadian Postal Archives are themselves now defunct – although LAC still acquires philatelic records and records related to philately and postal administration, these functions are no longer handled by a dedicated section but rather by archivists within our government records branch and our private records branch (the latter being me). (J. Bone, pers. comm., October 11, 2016)

Regarding the collection's metadata, Mr. Bone confirmed that the archival records at the LAC all conform to the RAD standard (discussed in the literature review above), and that philatelic materials are all given "at least a minimum level of useful file level or item level description for philatelic records based on Chapter 12 of RAD," the chapter that specifically discusses philatelic materials. Unfortunately, to his knowledge, the online database for these records does not support a common harvesting protocol such as OAI-PMH that enables "external metadata harvesting or querying," so the system is not searchable outside of the LAC website.
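To make the gap concrete, the sketch below shows the kind of request that OAI-PMH support would allow an external aggregator to issue. The verb, metadata prefix, and namespaces are part of the published protocol, but the endpoint URL and set name are hypothetical; as Mr. Bone notes, LAC's database exposes no such interface.

```python
import xml.etree.ElementTree as ET
import requests

# Hypothetical harvesting endpoint; LAC does not actually offer one.
ENDPOINT = "https://example.org/oai"

resp = requests.get(ENDPOINT, params={
    "verb": "ListRecords",       # standard OAI-PMH verb
    "metadataPrefix": "oai_dc",  # simple Dublin Core, required of every repository
    "set": "philately",          # hypothetical set of philatelic records
})
resp.raise_for_status()

ns = {
    "oai": "http://www.openarchives.org/OAI/2.0/",
    "dc": "http://purl.org/dc/elements/1.1/",
}
root = ET.fromstring(resp.content)
for record in root.iterfind(".//oai:record", ns):
    for title in record.iterfind(".//dc:title", ns):
        print(title.text)
```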
Mr. Bone also pointed out that there are fields visible on the back end of the LAC online database that are not visible to end users, and the most notable of these omissions is the Scott number (the number assigned to every stamp by the Scott Catalogue). He wrote that it seemed "bizarre" to not have the Scott number visible, "as that's definitely an access point that I would expect philatelic researchers to use to narrow down a result set to the postage stamp issue of interest." However, it appears this invisibility was a decision consciously made by the LAC, based on Mr. Bone's review of an internal LAC standards document. Based on an example record (http://collectionscanada.gc.ca/pam_archives/index.php?fuseaction=genitem.displayItem&lang=eng&rec_nbr=2184475), the following fields are available for end users to view: title, place of origin, denomination, date of issue, title of the collection of which it is a part, extent of item, language, access conditions, terms of use, MIKAN number (a unique identifier), ITEMLEV number (deprecated), and any additional relevant information such as previous exhibitions of the physical item.

The Postal Museum

The Postal Museum in London is set to open its physical doors in 2017, but much of the collection is already available for browsing and searching online. Stuart Aitken, Curator, Philately, explained to me that the online collection uses the General International Standard Archival Description, Second Edition, as the primary metadata schema, but the online collection also includes "non ISAD(G) fields for certain extra-specific data for our archive material, including philatelic material" (S. Aitken, pers. comm., December 1, 2016). Based on my own review of the ISAD(G) standards document (International Council on Archives 1999) and an example record from The Postal Museum's online collection (http://catalogue.postalmuseum.org/collections/getrecord/GB813_P_150_06_02_011_01_001#current), it appears nearly all the fields are based on the ISAD(G) standards. These fields include information such as date, level of description, extent of item, language, description, and conditions for access and reproduction. Only the field for "philatelic number" appears to be extra. There may be additional non-ISAD(G) fields that are not included in the example record above but are included in other records when the extra information is available and relevant. Each digital record also allows end users to submit tags to help with identification and search. No tags had yet been submitted on the example record reviewed above, but this is likely because the online collection is still rather new. Of note, digital records are created at each archival level, from the broadest collection category down to the individual item (similar to the Smithsonian National Postal Museum collection). To provide an additional way to browse the collection, a sidebar in each digital record shows where it exists in the hierarchy of collections and provides links to each broader collection of which the current record is a part.

The British Museum

I reached out to The British Museum to discuss the application of metadata to their online records for postage stamps, but at the time of this writing I have not received any response. However, some information can be gleaned from examining the website. Unlike the other institutions reviewed in this paper, The British Museum's online collection includes a wide variety of objects. Postage stamps are therefore identified in the online collection by specifying "postage-stamp" in the "Object type" field, which likely uses a controlled vocabulary.
Based on an example record (http://www.britishmuseum.org/research/collection_online/collection_object_details.aspx?objectId=1102502&partId=1&searchText=postage+stamp&page=1), each record for a postage stamp lists the museum number (a unique identifier), denomination, description, date issued, country of origin, materials, dimensions, acquisition name and date, department, and registration number (which appears to be the same as the museum number). Digital images of the stamps are occasionally included. The collection website notes that The British Museum is "continuing every day to improve the information recorded in it [the digital collection] and changes are being fed through on a regular basis. In many cases it does not yet represent the best available knowledge about the objects" (Trustees of the British Museum 2016a, under "About these records"). Therefore, end users are encouraged to read the information in any given record with care, and to provide feedback if they have any additional information or corrections about an object.

The online collection is also offered in machine-readable format, via linked data and SPARQL, to encourage wider accessibility and use. The website advises:

The use of the W3C open data standard, RDF, allows the Museum's collection data to join and relate to a growing body of linked data published by other organisations around the world interested in promoting accessibility and collaboration. The data has also been organised using the CIDOC CRM (Conceptual Reference Model) crucial for harmonising with other cultural heritage data. The CIDOC CRM represents British Museum's data completely and, unlike other standards that fit data into a common set of data fields, all of the meaning contained in the Museum's source data is retained. (Trustees of the British Museum 2016b)

Each digital object has RDF and HTML resources, as well as a SPARQL endpoint with an HTML user interface.
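As a rough sketch of what this machine-readable access permits, the query below asks a SPARQL endpoint for objects whose notes mention postage stamps. The endpoint URL follows the collection site cited above, and the class and property names are generic CIDOC CRM terms chosen for illustration, not the museum's verified vocabulary.

```python
import requests

# Endpoint assumed from the museum's linked-data site; the query
# uses illustrative CIDOC CRM terms, not a confirmed ontology.
ENDPOINT = "https://collection.britishmuseum.org/sparql"

QUERY = """
PREFIX crm: <http://www.cidoc-crm.org/cidoc-crm/>
SELECT ?object ?note WHERE {
  ?object a crm:E22_Man-Made_Object ;
          crm:P3_has_note ?note .
  FILTER(CONTAINS(LCASE(STR(?note)), "postage stamp"))
}
LIMIT 10
"""

resp = requests.get(ENDPOINT,
                    params={"query": QUERY},
                    headers={"Accept": "application/sparql-results+json"})
resp.raise_for_status()
for row in resp.json()["results"]["bindings"]:
    print(row["object"]["value"], "|", row["note"]["value"])
```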
DISCUSSION

The information from the four institutions above provides a starting point for examining best practices for philatelic metadata. In the following discussion, I will review the information in light of the research questions: important metadata elements, the standards that were implemented, and whether the standards that currently exist have been sufficient.

As explained in the literature review above, relevant metadata are crucial for enhancing end-user research of digital records. This suggests that similarity of metadata across collections of the same type will improve users' ability to conduct their research. Unfortunately, there are only a few descriptive metadata fields used across all four of the institutions reviewed in this paper. These fields include a title (sometimes used very loosely), the date of issue, the place of issue, a description, and a unique identifier. These fields certainly seem to be the absolute minimum necessary for identifying (and searching for) a postage stamp, since they are among the fields discussed in the literature review as being important to philatelic researchers. Other fields that are included in some but not all of the above collections, such as stamp denomination and access conditions, are nonetheless quite relevant to online collections of postage stamps.

Interestingly, although the Scott Catalogue is recognized as a premier stamp catalogue, only one institution (the Smithsonian National Postal Museum) currently uses the Scott identification number as part of its standard philatelic metadata. As noted above, the Library and Archives Canada does include the Scott number in the behind-the-scenes metadata but does not display it to end users. The Postal Museum and The British Museum don't use the Scott number at all. It appears that only the Smithsonian believes the Scott number is useful to end users, either for search or identification purposes.

Of the four institutions, it appears that only The British Museum uses metadata standards that increase the accessibility of the online collection beyond its own website. The implementation of RDF for linked data creates an open collection that is machine-readable beyond the internal database used by the museum. The Smithsonian National Postal Museum, Library and Archives Canada, and The Postal Museum do not appear to use any similar metadata standard for data harvesting or transmission, which means that these collections can only be searched from within their respective websites.

The most important thing to note in reviewing the online collections of these four institutions is the fact that each institution applies metadata using different standards and in different ways. Frankly, this is not a surprise. As discussed in the literature review above, although metadata standards exist for a variety of materials, philatelic materials are simply not considered. Only the Canadian Rules for Archival Description explicitly include information about philatelic materials; accordingly, the Library and Archives Canada utilizes these rules when creating its online records of postage stamps. No similar standard exists in the United States or internationally, leaving individual institutions with the task of deciding what generic metadata standard to use as a starting point and then modifying it to meet local needs. As described above, the Smithsonian National Postal Museum uses the metadata schema that comes with its collection management software and has created an end-user interface based on internal metadata decisions. The Postal Museum based its metadata primarily on ISAD(G), an international metadata standard with no specific suggestions for philatelic materials. I was unable to confirm the base metadata schema The British Museum employs, although it is clear they use RDF to make the collection's digital records more widely available. Each institution appears to be using a different base metadata standard, essentially requiring each to reinvent the wheel upon deciding to digitize philatelic materials. This is what happens when there is no single, unified standard available for the type of material being described.
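Pending such a standard, the five fields shared by all four collections at least suggest a baseline record. The sketch below maps them onto unqualified Dublin Core purely as an illustration; both the element choices and the sample values are my own assumptions, not any institution's practice.

```python
import xml.etree.ElementTree as ET

DC = "http://purl.org/dc/elements/1.1/"
ET.register_namespace("dc", DC)

# The five fields common to all four collections, mapped onto
# unqualified Dublin Core. Sample values are invented.
baseline = {
    "identifier": "EX-2016.0001",  # unique identifier
    "title": "Penny Black",        # title
    "date": "1840-05-01",          # date of issue
    "coverage": "Great Britain",   # place of issue
    "description": "Imperforate definitive bearing the profile of Queen Victoria.",
}

record = ET.Element("record")
for element, value in baseline.items():
    ET.SubElement(record, f"{{{DC}}}{element}").text = value

print(ET.tostring(record, encoding="unicode"))
```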
CONCLUSION

As this paper has shown, metadata standards are sorely lacking when it comes to philatelic materials. Other kinds of materials have received special consideration because more and more institutions decided it would be important to digitize them, so various groups came together to create standards that provide some guidance. It is time for this to happen for philatelic materials as well. There aren't many cultural heritage institutions that currently manage digital collections of philatelic materials, so this is an opportunity for those who plan to digitize their collections to consider what has been done and what makes sense to pursue. It is clear that philatelic digitization is still nascent, but as with other kinds of materials, it is likely that more and more institutions will attempt digitization projects. It is hoped that this paper can serve as a jumping off point for institutions to discuss the creation of international metadata standards specifically for philatelic materials.

ACKNOWLEDGEMENTS

Many thanks are owed to the people who took time out of their very busy lives to respond to the unrefined inquiries of an MLIS grad student: Stuart Aitken (Curator, Philately, The Postal Museum); James Bone (Archivist, Private Archives Branch, Library and Archives Canada); and Elizabeth Heydt (Collections Manager, Smithsonian National Postal Museum). Their expertise and responsiveness are immensely appreciated.

REFERENCES

AAPE (American Association of Philatelic Exhibitors). 2016a. "AAPE - Join/Renew Your Membership." http://www.aape.org/join_the_aape.asp.

–––––. 2016b. "Exhibits Online." http://www.aape.org/join_the_aape.asp.

American Philatelic Society. 2016. "Stamp Catalogs: Your Guide to the Hobby." Accessed December 8. http://stamps.org/How-to-Read-a-Catalog.

Arms, Caroline R., and William Y. Arms. 2004. "Mixed Content and Mixed Metadata: Information Discovery in a Messy World." In Metadata in Practice, edited by Diane I. Hillman and Elaine L. Westbrooks, 223-37. Chicago, IL: ALA Editions.

Bade, David. 2007. "Structures, Standards, and the People Who Make Them Meaningful." Paper presented at the 2nd meeting of the Library of Congress Working Group on the Future of Bibliographic Control, Chicago, IL, May 9, 2007. https://www.loc.gov/bibliographic-future/meetings/docs/bade-may9-2007.pdf.

Bade, David. 2008. "The Perfect Bibliographic Record: Platonic Ideal, Rhetorical Strategy or Nonsense?" Cataloging & Classification Quarterly 46 (1): 109-33. https://doi.org/10.1080/01639370802183081.

Boggs, Winthrop S. 1955. The Foundations of Philately. Princeton, NJ: D. Van Nostrand Company.

Bruce, Thomas R., and Diane I. Hillmann. 2004. "The Continuum of Metadata Quality: Defining, Expressing, Exploiting." In Metadata in Practice, edited by Diane I. Hillman and Elaine L. Westbrooks, 238-56. Chicago, IL: ALA Editions.

Bureau of Canadian Archivists. 2008. Rules for Archival Description. Rev. ed. Ottawa, Canada: Canadian Council of Archives. http://www.cdncouncilarchives.ca/archdesrules.html.

CC:DA (American Library Association Committee on Cataloging: Description and Access). 2010. "Task Force on Metadata: Final Report." American Library Association. https://www.libraries.psu.edu/tas/jca/ccda/tf-meta6.html.

Chan, Lois M., and Marcia L. Zeng. 2006. "Metadata Interoperability and Standardization – A Study of Methodology Part I: Achieving Interoperability at the Schema Level." D-Lib Magazine 12 (6). https://doi.org/10.1045/june2006-chan.

Gallery Systems. 2015. "TMS: The Museum System." http://go.gallerysystems.com/About-TMS.html.

International Council on Archives. 1999. ISAD(G): General International Standard Archival Description. 2nd ed. Stockholm, Sweden: International Council on Archives. http://www.icacds.org.uk/eng/ISAD(G).pdf.
Liberty Street Software. 2016. "StampManage - The Best Way to Catalog Your Stamp Collection." http://www.libertystreet.com/Stamp-Collecting-Software.htm.

Roberts, Peter J. 2007. "Philatelic Materials in Archival Collections: Their Appraisal, Preservation, and Description." The American Archivist 70 (1): 70-92. https://doi.org/10.17723/aarc.70.1.w3742751w5344275.

Scott Publishing Co. 2014. Scott 2015 Standard Postage Stamp Catalogue. Vol. 3, Countries of the World, G-I. Sidney, OH: Scott Publishing Co.

Society of American Archivists. 2013. Describing Archives: A Content Standard. 2nd ed. Chicago, IL: Society of American Archivists. http://files.archivists.org/pubs/DACS2E-2013_v0315.pdf.

Society of American Archivists. 2015. Encoded Archival Description Tag Library, Version EAD3. Chicago, IL: Society of American Archivists. http://www2.archivists.org/sites/all/files/TagLibrary-VersionEAD3.pdf.

Straight, David. 1994. "Adding Value to Stamp and Coin Collections." Library Journal 119 (10): 75-78. Accessed December 8, 2016. http://libaccess.sjlibrary.org/login?url=http://search.ebscohost.com/login.aspx?direct=true&db=ulh&AN=9406157617&site=ehost-live&scope=site.

Trustees of the British Museum. 2016a. "About the Collection Database Online." Accessed December 8. http://www.britishmuseum.org/research/collection_online/about_the_database.aspx.

–––––. 2016b. "British Museum Semantic Web Collection Online." Accessed December 8. http://collection.britishmuseum.org/.

Editorial Board Thoughts: Developing Relentless Collaborations and Powerful Partnerships

Mark Dehmlow

Mark Dehmlow (mdehmlow@nd.edu), a member of the ITAL Editorial Board, is Director of Library Information Technology, Hesburgh Libraries, University of Notre Dame, South Bend, IN.

With the end of the performance and fiscal year wrapping up, it seemed like a good time to reflect on what change initiatives we have engaged in over the past few years that have strengthened the organizational effectiveness of the IT department in our library. My thoughts almost immediately drifted to our focus on collaboration. Early in my career, it was the profession-wide culture of cross-institutional collaboration that convinced me that becoming a librarian would be the right career move. I am certain that the impetus to collaborate stems from our professional service commitment - a values-based system that at its core believes that the success of all helps the collective do their jobs better in the name of service to our patrons. And yet, over the years, I have heard stories of and observed firsthand internal competitions for resources, vilification of library IT as siloed and opaque factions, and library IT departments that have had strained relationships with their institution's central IT organizations.

As a part of our senior leadership team for the Hesburgh Libraries, two of my core professional interests are organizational effectiveness and staff satisfaction, especially in the face of a rapidly changing technology landscape, competition for talent in the IT sector where it is hard to contend with commercial salaries, and the slow rate of attrition at the University. Retaining talented IT staff requires creating a work culture that is better than the commercial sector's: a work culture that values work/life balance, innovation and experimentation, a culture of teamwork and camaraderie, and a clear sense of strategic priority.
To build these latter two qualities into our work culture, we have strategically emphasized durable internal and external coalitions with a tenacious sense of partnership. True collaboration reinforces a collective sense of goals, allows for maximal efficiency, discourages unnecessary or destructive competition, and opens the door to the coveted but seldom realized ability to "stop doing" through partnering with other units on campus that share a sense of priority around particular services.

Creating sustainable and significant internal collaboration requires etching it into the culture of the organization. Making it a part of the organization's DNA has to be prioritized and modeled by senior leadership, and it begins with advancing shared goals over singular agendas. In our senior leadership team, we have committed to each other as our primary team. We may advocate for staff and initiatives in our own verticals, but our drive is to be holistic stewards for the Libraries, not just our functional departments. We give as much weight, if not more, to the objectives of the collective senior leadership team, which also helps in clarifying priorities. Our executive leadership models cooperation, cross-divisional problem solving, and collective strategic initiative planning. Using this model, decisions get made more quickly, enhancing our ability to accomplish things on time, with a high level of quality, and with a considerable level of satisfaction for our staff and faculty. The IT department is less viewed as a black box where decisions for what to work on are made behind the curtains and more as a group of talented staff who help our organization accomplish its priorities. When our IT department needs to advocate for support and timely completion of work from individuals in other departments, the other senior managers help get their units mobilized. We see ourselves as part of the community, and the community embraces us as part of them.

Historically, it has been tempting to view IT as somewhat separate - a part of the production line - but in an age where every operation in the library is affected by technology, our workflows need to be more integrated and team-based. The problems we are working on are more cross-disciplinary and require a plurality of expertise to solve. Libraries are increasingly becoming an interconnected and interdependent ecosystem that requires thinking holistically about problems and a relentless commitment to building coalitions to drive our services. It may seem obvious that this would be a more effective way to work, yet I have spoken with many people at organizations where there is a clear culture of departmental separation and competition for resources. I have long appreciated the work environment at Notre Dame, in part because we strive to be an organization whose culture has been guided by our core institutional values - accountability, integrity, excellence in leadership, excellence in mission, and teamwork. These values not only drive our internal collaborations, but also the way in which different departments on campus work with each other.
We have had a long-standing, positive relationship with our central Office of Information Technologies (OIT), one that has been tremendously cooperative but that for many years lacked interconnections at a variety of levels and a clear collaborative and strategic focus. In the last five years, our organizations have shifted their focus: the OIT from emphasizing centralized, administrative, enterprise computing to decentralized, academic, enterprise computing, and the Libraries from doing everything in house to leveraging outside providers for standardized services and focusing our staff's time on initiatives where they can create the most value. In part, we developed an in-house IT department because we had service expectations that weren't a priority at the time for the OIT. But during our strategic transitions, we have extended our working relationships at every level throughout our organizations, from our staff in the trenches to our managers and senior leaders.

My focus as the Director for Library IT over the past few years has been to look at ways we can enhance our capacity through partnerships. To that end, there are several interrelated initiatives that we have begun to engage in with the OIT:

1. embedding an OIT presence in the Libraries,
2. shifting support for common IT services to the OIT, and
3. consolidating our customer communication through their service portal, ServiceNow.

The first step in this new collaboration with the OIT was letting go of the past and revisiting where the OIT and the Libraries have strategic overlaps that may not have been aligned before. As two service organizations on campus with a deep concern for supporting the academic endeavor, it was easy to find strategic alignment with each other. For the Libraries, we often get questions at our service points about how to change passwords or install printer drivers, needs that are part of the central IT service portfolio. For the OIT, the Libraries are a major campus hub where hordes of students and faculty conduct research and work on assignments, particularly after classes, when many of the business units leave the University for the day. Working closely with the Libraries' Director for Teaching, Research, and User Services and the OIT's Senior Director for User Services, we began developing a collaboration grounded in our common desire to support end users, which resulted in creating an OIT outpost in the Libraries. While there are many libraries that have this kind of collaboration, this was a revolutionary step for us.

This collaboration opened the door for us to begin a discussion about common technology services that we have been supporting internally: printing and general lab computing. For us, these services are important to function well for our end users, but they are not services that require library expertise to accomplish. The OIT supports these services for much of campus, and as long as we have aligned expectations around service level - expectations that are practical and committed to excellence - the OIT can handle that function much more efficiently, and we can use our staff expertise to support other, emerging services that are core to the Libraries. We are also working closely with the OIT to leverage their IT service portal, ServiceNow, as the Libraries' service portal.
Given that our service portfolio is much broader than strictly IT services, moving in this direction required a willingness from the OIT to think outside of the box and allow us to customize the system to meet our service needs. It has required some reciprocation from the Libraries as well. The ServiceNow platform is more expensive than others we could license, its functionality will require effort from our staff to customize, and it is requiring us to change workflows, especially in the public services areas. Integrating our customer communication into this platform, though, will create a better user experience for our patrons by supporting a common interface they are experienced with, and it will allow us to more easily transfer both staff and patron general IT questions to the OIT.

Beginning to work in truly collaborative ways requires shifting the narrative around our relationships from a client/provider model to one of a coalition. Redefining these relationships as partnerships puts both parties on equal footing around the planning table, where everyone has an equal stake in the objectives and outcomes. These partnerships don't come effortlessly; they require libraries to ardently become more visible on campus, to articulate the complementary value that we can contribute to campus initiatives, and to proactively request to join initiatives that we haven't participated in before. It also takes reaching out and helping campus partners see how we can collectively create value together, using our unique talents to successfully support the campus community. And lastly, it takes engaging a more holistic view of the University and the way we steward its resources; sometimes that will mean allocating more resources for the common good versus taking the narrower view that we should only consider our own context when adopting solutions. But in the end, if we are willing to think about our role at the University in that broader context and build powerful partnerships, we will collectively be able to serve our end users better.

Trope or Trap? Roleplaying Narratives and Length in Instructional Video

Amanda S. Clossen

Amanda S. Clossen (asc17@psu.edu) is Learning Design Librarian, Pennsylvania State University.

ABSTRACT

A concern that librarians face when creating video is whether users will actually watch the video they are directed to. This is a significant issue when it comes to how-to and other point-of-need videos. How should a video be designed to ensure maximum student interest and engagement? Many of the basic skills demonstrated in how-to videos are crucial for success in research but are not always directly connected to a class. Whether a video is selected for inclusion by an instructor or viewed after it is noticed by a student depends on how viewable the video is perceived to be. This article discusses the results of a survey of more than thirteen hundred respondents. The survey was designed to establish the broad preferences of viewers of instructional how-to videos, specifically focusing on the question of whether length and the presence of a role-playing narrative enhance or detract from the viewer experience, depending on demographic.
LITERATURE REVIEW

Length

Since the seminal 2010 study by Bowles-Terry, Hensley, and Hinchliffe established emerging best practices for pace, length, content, look and feel, and video versus text, a variety of works compiling best practices for video have been created.1 The very successful Library Minute videos from Arizona State University resulted in a collection of how-tos and best practices by Rachel Perry.2 These included tips on addressing an audience, planning, content, length, frugality, and experimentation. In 2014, Coastal Carolina nursing students were surveyed for their preferences in video, resulting in another set of best practices; these focused on video length, speaking pace, zoom functionality, and use of callouts.3 Martin and Martin's extensive 2015 review covers content, compatibility, accessibility, and audio.4

The recommended length listed in these best practices varies widely. Thirty seconds to a minute is recommended by Bowles-Terry, Hensley, and Hinchliffe, while Perry recommends no longer than ninety seconds.5 The Coastal Carolina study and the Seminole State review recommend no longer than three minutes.6 Nearly all the articles reviewed stress that complicated concepts should be broken into more easily comprehensible chunks to avoid overwhelming students' cognitive load.

Narrative Roleplay Scenario

The typical roleplay involves a hypothetical student who needs some sort of assistance and is helped through the process using library resources. Often there is also a hypothetical guide, who can be a librarian, friend, or professor. These hypothetical situations are recorded in a variety of ways: from live-action video recordings, to screencast voice-overs, to text. The efficacy of such tools in library video has been explored little, if at all. Devine, Quinn, and Aguilar's 2014 study explores the usage and effectiveness of micro- and macro-narratives in resident information literacy instruction,7 but there is no question that this instructional scenario is very different from how-to instructional videos.

The interplay between student interest and such narratives is addressed by emotional interest theory, which states that adding unrelated but interesting material increases attention by energizing the learner. These unrelated pieces of engaging material are known as seductive details. This "highly interesting and entertaining information . . . is only tangentially related to the topic but is irrelevant to the author's intended theme."8 Exploration of this concept through experimental study has indicated that seductive details are detrimental to learning.9 Some evidence indicates that learners are more likely to remember these details than the important content itself because of cognitive load issues.10 However, there have also been cases where seductive details have improved recall.11 In their 2015 study, Park, Flowerday, and Brünken argue that the format and presentation of seductive details have varying effects on learning processes and that they can be used to positive effect.12 In this paper, the seductive details to be studied are those of the roleplay narrative used to frame instruction in how-to videos.

METHODS

Survey Design

The survey was designed to explore three questions:

• Does the length of the video affect a user's willingness to watch it?
• Do users prefer videos that are pure instruction or those that use a roleplay narrative to deliver content?
• Does the demographic of the viewer affect a video's viewability?

The survey was revised in collaboration with a survey design and statistical specialist at the Penn State Library's Data Learning Center. The completed survey was then entered into Qualtrics for implementation.

Implementation

Implementation and subject-gathering were done through a survey-research sampling company that provided both a wide demographic pool and rapid data collection. This was sponsored by an institutional grant. Subjects from a variety of institution types and geographic locations were solicited via email invitation to complete a survey that explored their perspectives on instructional videos. The twenty-question survey was focused on respondents of a traditional college age. Implementation resulted in 1,305 responses out of 1,528 surveys. After implementation, results were compiled and analyzed by a statistical expert at the institutional data center. Nearly all the analyses to follow are simple cross-tabulations of respondent choices, as correlations between demographics and preference were minor based on a multivariate analysis of variance (MANOVA) test.

RESULTS AND DISCUSSION

Demographics

The survey, which was limited to a traditionally college-aged population (eighteen to twenty-four), produced a nearly 1:1 gender distribution (figure 1).

Figure 1. Age and gender distribution.

The survey had around 64 percent student participants, 77 percent of these attending school full time. Of those full-time students, 60 percent were resident students, and only 9 percent were solely online students. Unemployed participants were more likely to be full-time resident students, whereas online students were more likely to be employed full-time. (See figures 2 and 3.)

Figure 2. Employment and student status distribution.

Figure 3. Resident versus online status distribution.

Information and Video Confidence

The distribution of confidence in information-seeking ability hovered around 90 percent. However, at most, only half of respondents had any familiarity with Google Scholar (see figure 4). This tells us several things, the most important being that what librarians consider appropriate confidence in information-seeking is very different from what the college-aged layperson considers appropriate. This supports Colón-Aguirre and Fleming-May's 2012 study, which indicates that students are likely to use free online websites that require the least effort for their research.13

Figure 4. Information-seeking confidence.

Video Length

Length of a video does play a role for most. About 70 percent of participants indicated that they are either more likely to watch a video with a timestamp or will rarely watch unless the time is indicated (see figure 5). A timestamp is easily provided by most video players. The mean maximum time for college-age participants' willingness to watch was about four and a half minutes. The median was approximately three minutes. In general, shorter appears better: three to four minutes is around the maximum length that most eighteen- to twenty-nine-year-olds are willing to watch.
This contradicts the referenced best practices that described thirty to ninety seconds as ideal video viewing time, but not those proffered by Baker. Her study found that 41 percent of her students preferred videos that were one to three minutes long, but 24 percent preferred three to five minutes. Because of this, she recommends videos that are three minutes or less.14 Figure 5. Perspective on viewing time. Instructions versus Roleplay The bulk of the survey consisted of questions related to two videos. Both videos were under three minutes long and were produced using TechSmith's Camtasia screencast software. The screencast video simply explained how to complete a research task: searching Google Scholar for an article addressing a theme in Shakespeare's Romeo and Juliet. Viewers were guided through the process of finding articles on this topic by a single narrator. No dramatized roleplay situation was presented. The narrative video guided the participants through a hypothetical situation dramatized by two actors. The scenario was a common one: a student procrastinating on a paper and asking her roommate for assistance at the last minute. The roommate guided the student through use of Google Scholar, completing the same tasks as the screencast video. Participants watched both videos and answered a series of questions on their reactions. The number of views was tracked on the media player, verifying that both videos were viewed. Screencasts While watching the screencast video, most participants found that the narrator was trustworthy and that they were learning. Only 15 percent felt the video needed an example scenario. Though there were mixed experiences as to the length of the video, the timing of the video seemed on point, as only 11.6 percent strongly believed that the video took too long and 7.5 percent strongly felt that it went too quickly. (See figure 6.) Figure 6. Screencast reactions. When asked an open-ended question about what struck them the most in the screencast video, respondents most frequently stated that they found it to be informative and interesting, or at least neutral. However, a variety of responses were observed, both negative and positive, or even contradictory. It is worth noting that within this open-ended format, dislike of the narrator's voice independently emerged as one of the top three issues. This stresses the importance of coherent and pleasant narration, as it is something that viewers will likely notice. Figure 7. Open-ended questions: screencast. Narrative While watching the narrative video, participants found that they could relate to the characters or scenario and found that they were learning as much as they were when watching the screencast (see figure 8). However, there were mixed responses regarding video length and credibility of the narrator. When compared across demographics, employed respondents and students were more likely to agree that they could relate to the scenario than unemployed respondents and nonstudents. Male and employed respondents were more likely than female and unemployed respondents to think that the video went too fast.
When asked an open-ended question on what most struck them about the narrative video, respondents most often stated that they found it to be boring and long, though a good number also indicated it was interesting and informative (see figure 9). Just as with the screencast video, a variety of responses, both negative and positive, were observed, some even conflicting. Figure 8. Narrative reactions. Figure 9. Open-ended questions: narrative. In addition, 13.5 percent of respondents were unsatisfied with the content of the video. Screencast versus Narrative The screencast video tended to be preferred by respondents, with higher average scores in content, engagement, learning value, and narrator trustworthiness. In contrast, respondents also thought that the screencast video moved too quickly compared to the narrative video. Additionally, participants were more impatient during the narrative video (see figure 10). Figure 10. Screencast versus narrative. Statistics for differences in screencast and narrative (n = 1,305). *Score defined as 1 = "Not very much" to 5 = "Very much," with difference = screencast score – narrative score. Red rows indicate higher scores for the narrative video. To observe differences between the screencast and narrative videos with regard to respondent reactions within specific population demographics, a MANOVA test was performed. This test revealed that none of the p-values were significant (at α = .05), indicating no correlation between student status, employment status, and reaction to each video. A more liberal interpretation of the data from this analysis might conclude that differences in impatience across student status were possibly significant (α = .10), with students being more likely to exhibit a smaller difference in impatience for the two video styles. The preference for screencast over narrative video did not change when the results were split by demographics. CONCLUSIONS It is impossible to please everyone all the time; at least that is what the survey results suggest. There are several takeaways from this study: Video length matters, especially as a consideration before the video is viewed. Timestamps should be included in video creation, or it is highly likely that the video will not be viewed. The video player is key here, as some video players include video length, while others do not. Videos that exceed four minutes are unlikely to be viewed unless they are required. Voice quality in narration matters. Although preference in type of voice inevitably varies, the actor's voice is noticed over production value. It is important that the narrator speaks evenly and clearly. For brief how-to videos, there is a small preference for screencast instructional videos over a narrative roleplay scenario. The results of the survey indicate that roleplay videos should be well produced, brief, and high quality. However, what constitutes high quality is not very well established.15 Finally, screencast videos should include an example scenario, however brief, to ground the viewer in the task. SUGGESTIONS FOR FURTHER STUDY Next steps for research might include a more refined survey focusing on the results of this study.
Of equal value would be a series of focus groups that are given both a screencast and narrative video and asked to discuss their preferences. Though a wide variety of students were surveyed, limits of this dataset prevented the exploration of specific correlations among students attending different institution types or among those pursuing different majors. Further research addressing the differences among these student bodies would be a welcome addition to the literature. REFERENCES 1 Melissa Bowles-Terry, Merinda Kaye Hensley, and Lisa Janicke Hinchliffe, "Best Practices for Online Video Tutorials in Academic Libraries: A Study of Student Preferences and Understanding," Communications in Information Literacy 4, no. 1 (January 1, 2010): 17–28. 2 Anali Maughan Perry, "Lights, Camera, Action! How to Produce a Library Minute," College & Research Libraries News 72, no. 5 (2011): 278–83. 3 Ariana Baker, "Students' Preferences Regarding Four Characteristics of Information Literacy Screencasts," Journal of Library & Information Services in Distance Learning 8, no. 1–2 (January 2, 2014): 67–80, https://doi.org/10.1080/1533290X.2014.916247. 4 Nichole A. Martin and Ross Martin, "Would You Watch It? Creating Effective and Engaging Video Tutorials," Journal of Library & Information Services in Distance Learning 9, no. 1–2 (January 2, 2015): 40–56, https://doi.org/10.1080/1533290X.2014.946345. 5 Bowles-Terry, Hensley, and Hinchliffe, "Best Practices," 23; Perry, "Lights, Camera, Action!," 282. 6 Baker, "Students' Preferences," 76; Martin and Martin, "Would You Watch It?," 48. 7 Jaclyn R. Devine, Todd Quinn, and Paulita Aguilar, "Teaching and Transforming through Stories: An Exploration of Macro- and Micro-Narratives as Teaching Tools," Reference Librarian 55, no. 4 (October 2, 2014): 273–88, https://doi.org/10.1080/02763877.2014.939537. 8 Shannon F. Harp and Richard E. Mayer, "The Role of Interest in Learning from Scientific Text and Illustrations: On the Distinction between Emotional Interest and Cognitive Interest," Journal of Educational Psychology 89, no. 1 (1997): 92–102, https://doi.org/10.1037/0022-0663.89.1.92. 9 Suzanne Hidi and Valerie Anderson, "Situational Interest and Its Impact on Reading and Expository Writing," in The Role of Interest in Learning and Development, ed. K. Ann Renninger (Hillsdale, NJ: L. Erlbaum Associates, 1992), 213–14. 10 Babette Park et al., "Does Cognitive Load Moderate the Seductive Details Effect? A Multimedia Study," in "Current Research Topics in Cognitive Load Theory," special issue, Computers in Human Behavior 27, no. 1 (January 1, 2011): 5–10, https://doi.org/10.1016/j.chb.2010.05.006. 11 Annette Towler et al., "The Seductive Details Effect in Technology-Delivered Instruction," Performance Improvement Quarterly 21, no. 2 (January 1, 2008): 65–86, https://doi.org/10.1002/piq.20023. 12 Babette Park, Terri Flowerday, and Roland Brünken, "Cognitive and Affective Effects of Seductive Details in Multimedia Learning," Computers in Human Behavior 44 (March 1, 2015): 267–78, https://doi.org/10.1016/j.chb.2014.10.061. 13 Mónica Colón-Aguirre and Rachel A. Fleming-May, "'You Just Type in What You Are Looking For': Undergraduates' Use of Library Resources vs. Wikipedia," Journal of Academic Librarianship 38, no. 6 (November 1, 2012): 391–99, https://doi.org/10.1016/j.acalib.2012.09.013. 14 Baker, "Students' Preferences," 76.
15 Towler et al., "The Seductive Details Effect," 71. 10060 ---- Of the People, For the People: Digital Literature Resource Knowledge Recommendation Based on User Cognition Wen Lou, Hui Wang, and Jiangen He INFORMATION TECHNOLOGY AND LIBRARIES | SEPTEMBER 2018 Wen Lou (wlou@infor.ecnu.edu.cn) is an assistant professor in the Faculty of Economics and Management, East China Normal University. Hui Wang (1830233606@qq.com) is a graduate student in the Faculty of Economics and Management, East China Normal University. Jiangen He (jiangen.he@drexel.edu) is a Doctoral Student in the College of Computing and Informatics, Drexel University. ABSTRACT We attempt to improve user satisfaction with the effects of retrieval results and visual appearance by employing users' own information. User feedback on digital platforms has been proven to be one type of user cognition. By constructing a digital literature resource organization model based on user cognition, our proposal improves both the content and presentation of retrieval systems. This paper takes Powell's City of Books as an example to describe the construction process of a knowledge network. The model consists of two parts. In the unstructured data part, synopses and reviews were recorded as representatives of user cognition. To build the resource category, linguistic and semantic analyses were used to analyze the concepts and the relationships among them. In the structural data part, the metadata of the books were linked to each other by informetrics relationships. The semantic resource was constructed to assist with building the knowledge network. We built a mock-up to compare the new category and knowledge-recommendation system with the current retrieval system. Thirty-nine subjects examined our mock-up and highly valued the differences we made for the improvements in retrieval and appearance. Knowledge recommendation based on user cognition tested positively in user feedback. There could be more research objects for digital resource knowledge recommendations based on user cognition. INTRODUCTION The concept of user cognition originates in cognitive psychology. This concept principally explores the human cognition process through information-processing methods.1 The concept characterizes a process in which a user obtains unknown information and knowledge through acquired information. As information-science workers, we may explore the psychological activities of users by analyzing their cognitive processes when they are using information services.2 A knowledge-recommendation service based on user cognition has become essential since it emphasizes facilitating collaborations between humans and computers and promotes the participation of users, which ultimately improves user satisfaction.
A knowledge-recommendation system is based on a combination of information organization, a retrieval system, and knowledge visualization.3 However, when exploring digital online literature resources, it is difficult to quickly and precisely find what we want because of the problem of information organization and retrieval. Most search results only display a one-by-one list view. Thus, adding visualization techniques to an interface could improve user satisfaction. Furthermore, the retrieval system and visualizations rely on information organization. Only if information is well designed can the retrieval system and visualization be useful. Therefore, we attempt to improve retrieval efficiency by proposing a digital literature resource organization model based on user cognition to improve both the content and presentation of retrieval systems. Taking Powell's City of Books as an example, this paper proposes user feedback as first-hand user information. We will focus on (1) resource organization based on user cognition and (2) new formats for search results based on knowledge recommendations. We will purposefully employ data from users' own information and give knowledge back to users in accordance with the quote "of the people, for the people." RELATED WORK User Cognition and Measurement User cognition usually consists of a series of processes, including feeling, noticing, temporary memory, learning, thinking, and long-term memory.4 Feeling and noticing are at an inferior level, while learning, thinking, and memory are comparatively superior. Researchers have so far tried to identify user cognition processes by analyzing user needs. There are four levels of user needs according to Ma and Yang (see figure 1).5 In turn, user interests normally reflect potential user needs. Users who retrieve information on their own show feeling needs. Users who give feedback show expression needs. Users who ask questions show knowledge needs, which is the highest level. The methods to quantify user cognition require visible and measurable variables. Existing studies have commonly used website log analysis or user surveys. Website log analysis has been proven to be a solid data source to record and analyze both user interests and information needs.6 User surveys, including online questionnaires and face-to-face interviews, have been widely used to comprehend user feelings and user satisfaction.7 User surveys generally measure two kinds of relationship: between users and digital services and between users and the digital community.8 With a survey, we can make the most of statistics and assessment studies to analyze user satisfaction with an array of standards and systems of existing service platforms, service environments, service quality, and service personnel, which provides references and suggestions for future study of user experience quality, platform elements, interaction processes, and more.9 However, neither log data nor surveys can obtain first-hand user information in real-life settings. Eye tracking and the concept-map method can be used to understand user behavior in the course of user testing.10 However, these approaches are difficult to apply to large groups of users. Therefore, a linguistic-oriented review analysis has become an increasingly important method.
User-generated content, including reviews and tags, can be analyzed through text mining and serves as a valuable data source for learning user preferences for products and services in the areas of electronic commerce and digital libraries.11 This type of data has been called "more than words."12 Figure 1. Understanding user cognition by analyzing user needs. User-Oriented Knowledge Service Model The user-oriented service model includes user demand, user cognition, and user information behavior. A service model based on user demand chiefly concentrates on the motives, habits, regularities, and purposes of user demand to identify the pattern of user demand so that the appropriate service is adopted.13 Service models based on user cognition attach importance to the process of user cognition, the influences that users are facing,14 and the change of library information services under the effects of a series of cognitive processes (such as feeling, receiving, memorizing, and thinking).15 A service model based on user information behavior focuses on interactive behavior in the process of library information services that users participate in, such as interactions with academic librarians, knowledge platforms,16 and others. Studies have paid more attention to the preliminary stage of the user-oriented service model, which analyzes information habits and user behaviors.17 Studies have also proposed frameworks of knowledge services, design innovations,18 or personalized systems and frames of the knowledge service model, but they have not implemented these models or performed user testing. Knowledge Service System Construction Most studies of knowledge service system construction are in business areas. Numerous studies have explored knowledge-innovation systems for product services.19 Cheung et al. proposed a knowledge system to improve customer service.20 Vitharana, Jain, and Zahedi built a knowledge repository to enhance the knowledge-analysis skills of business consultants.21 From the angle of user demand, Zhou analyzed the elements of service-platform construction and found that key platforms should support knowledge service system construction.22 Scholars have proposed basic models for knowledge management and knowledge sharing, but they did not simulate their applications.23 Knowledge management from the library-science perspective is very different from that in the business area. Library knowledge management usually refers to a digital library, especially a personal digital library.24 Others explore and attempt to construct personalized knowledge service systems,25 while few system designs are based on the results of documented user surveys. We rarely see a user-feedback study combined with the method of using users' own knowledge. Users themselves know what they desire. If user-oriented studies separate the system design from user-needs analysis or the other way around, the studies may miss the purpose. Therefore, we propose a resource-organization method based on users' own knowledge to close the distance between the users and the system. RESOURCE-ORGANIZATION MODEL BASED ON USER COGNITION There are normally two ways to construct a category system. One method gathers experts to determine categories and assign content to them; the category system comes first and the content second.
The other method is to derive a category tree from the content itself, as we propose in this paper. In this way, the content takes priority over the categorization system. We focus on this second way to organize resources and index content. Resource organization requires a series of steps, including information processing, extraction, and organization. Figure 2 shows the resource-organization model based on user cognition. This model fits the needs of digital resources with comments and reviews. The model has two interrelated parts. One is for indexing the content, and the other is for knowledge recommendations. For the first part, the model integrates all the comments and reviews of all literature in an area or the whole resource. The core concepts and the relationships among the concepts are extracted through natural language processing. The relationships between concepts are either subordination or correlation. A triple consists of two core concepts and their relationship. The triple set includes all triples. Next, all books are indexed by taxonomy in the new category system. However, the indexing of every book is not based upon the traditional method, which is to manually determine each category by reading the literature. We use a method based on the books' content. Alongside extracting the core concepts from all books, we extract the core concepts from each individual book with the same semantic-analysis methods and build triples for that book. Then the triples of this book can match the triple set in the new category system. Once a triple in a single book yields a maximum matching value, the core concepts in the triple set will be indexed as the keywords of the book. A few examples of the matching process will be discussed in the empirical study (in the section "Indexing Books"). The first part is about comments and reviews, which are unstructured data. The second part is to make use of structural data in the bibliography to build a semantic network. Structural data, including titles, keywords, authors, and publishers, is stored separately. We calculate the informetrics relationships among the entities. The relationships can be among different entities, such as between one author and another or between an author and a publisher. Then two entities and their relationship compose a triple. The components in triples are linked to each other, which makes them semantic resources. Furthermore, the keywords in structural data are not the original keywords from before the new category system but the modified keywords. Finally, the reindexed resources (books in the new category) and semantic resources (the triples from structural data) are both used to build the knowledge network. Figure 2. Resource-organization model based on user cognition. However, why is it important to use both unstructured data and structural data? The reason is to capture the entire content of a literature resource. Neither of them can fully represent the whole semantics of a literature resource. Structural data lacks subjective content, and unstructured data lacks basic information. Thus, a full semantic network can be built using both kinds of data.
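To make the flow of the model concrete, the following minimal Python sketch mirrors its two parts on toy data. The sample reviews, the second ISBN, and the flat "related" label are hypothetical placeholders; the paper's actual pipeline derives richer relationship types from linguistic analysis and informetrics calculations.

```python
from collections import defaultdict
from itertools import combinations

# Part 1: unstructured data -> concept triples (toy data; second ISBN is invented).
reviews = {
    "9781607748038": "vegetarian recipes use seasonal vegetables and grains",
    "9780000000000": "baking bread with whole grains and seasonal flour",
}
core_words = {"vegetarian", "vegetables", "grains", "bread", "baking", "seasonal"}

def extract_triples(text, core):
    """Pair the core words that co-occur in one review; 'related' stands in
    for the subordination/correlation labels assigned by linguistic analysis."""
    found = sorted({w for w in text.split() if w in core})
    return {(a, "related", b) for a, b in combinations(found, 2)}

triple_set = set()
for isbn, review in reviews.items():
    triple_set |= extract_triples(review, core_words)

# Part 2: structural data -> informetrics links between bibliographic entities.
records = [
    {"isbn": "9781607748038", "author": "Anna Jones", "publisher": "Ten Speed Press"},
    {"isbn": "9780000000000", "author": "Anna Jones", "publisher": "Ten Speed Press"},
]
links = defaultdict(int)
for r in records:
    links[(r["author"], r["publisher"])] += 1  # co-occurrence strength

# Both outputs would then feed the knowledge network.
print(sorted(triple_set)[:3])
print(dict(links))
```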
RESOURCE-ORGANIZATION EXPERIMENT Object Selection Located in Portland, Oregon, Powell's City of Books (hereafter referred to as "Book City") is one of the largest bookstores in the United States, with 200 million books in its inventory. Book City caught our eye for four reasons. (1) The comments and reviews of books on Book City's website are well constructed and plentiful. The National Geographic Channel listed it as one of the ten best bookstores in the world.26 Atlantis Books, Pendulo, and Munro's Books are also on the list. Among these bookstores, only Book City and Munro's Books have indexed the information of comments and reviews. Since user reviews are fundamental to this study, we restricted ourselves to bookstores that provided user reviews. (2) We excluded libraries because literature resources are already well organized in libraries, and it might not be necessary to reorganize them according to user cognition. However, this topic could be taken up in a future study. (3) Book City is a typical online bookstore that also has a physical bookstore. Unlike Amazon, Book City, Indigo, Barnes & Noble, and Munro's Books have physical bookstores. However, they all have technological limitations in retrieval-system and taxonomy construction compared to Amazon. Thus, it is necessary to investigate these bookstores' online systems and optimize them. (4) The location was geographically convenient to the researchers. The authors are more familiar with Book City than other bookstores. Moreover, we planned to conduct face-to-face interviews for the user study, which is doable only if the authors can get to the bookstore and the users who live there. In all, we chose Book City as a representative object. Data Collection and Processing On December 22, 2015, we randomly selected the field "Cooking and Food" and downloaded bibliographic data for 462 new and old books that included title, picture, synopsis and review, ISBN, publication date, author, and keywords. In our previous work we described how metadata for all kinds of literature can be categorized into one of three types: structural data, semistructural data, and unstructured data (see table 1).27 Title, ISBN, date, publisher, and author are classified as structural data. Titles can be seen as structural data or unstructured data depending on the need. Titles are treated as indivisible entities in this paper because they need to retain their original meanings. Keywords are considered semistructural data for two reasons: (1) normally one book is indexed with multiple keywords, which are natural language; and (2) keywords are separated by punctuation. Each keyword can individually exist with its own meaning. However, in the current category system, keywords are the names of categories and subcategories. Since we are about to reorganize the category system, the current keywords will not be included in the following steps. We use the field "Synopsis and Review" in the downloaded bibliographic records as the source of user cognition. Synopses and reviews are classified as unstructured data. All synopses and reviews of a single book are first incorporated into one paragraph, since some books contain more than one review. Structural data will be stored for constructing a knowledge network. Unstructured data will be part-of-speech tagged and word segmented by the Stanford Segmenter.
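As an illustration of this preprocessing step, the sketch below merges the reviews of one book into a single paragraph and tokenizes it. The paper used the Stanford Segmenter with part-of-speech tagging; plain regex tokenization and a tiny stopword list are substituted here, and the sample reviews are hypothetical, so the sketch runs without external tools.

```python
import re
from collections import Counter

# Hypothetical reviews for a single book, merged into one paragraph as described above.
reviews_for_book = [
    "A beautifully photographed vegetarian cookbook packed with fresh recipes.",
    "Recipes explore grains, nuts, seeds, and seasonal vegetables.",
]
paragraph = " ".join(reviews_for_book)

STOPWORDS = {"a", "and", "with", "the", "of", "for"}

def tokenize(text):
    """Lowercase word tokens with stopwords removed; a stand-in for the
    Stanford Segmenter and part-of-speech tagging used in the paper."""
    return [t for t in re.findall(r"[a-z]+", text.lower()) if t not in STOPWORDS]

tokens = tokenize(paragraph)
print(Counter(tokens).most_common(5))  # candidate core words by frequency
```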
All the books' metadata are stored into the defined three data types and separate fields. Each field is linked by the ISBN as the primary key. Category Organization First, word frequencies across all books are calculated after word segmentation so that core concepts can be identified by frequency. In total, 29,370 words appeared 43,675 times, after excluding stop words. The 206 words in the sample that occurred more than 105 times appeared 34,944 times. This subset was defined as the core words according to the Pareto principle. Table 1. Data sample.
Field               | Content                                                  | Data type
Title               | A Modern Way to Eat: 200+ Satisfying Vegetarian Recipes | Structural data
ISBN                | 9781607748038                                            | Structural data
Date                | 04/21/2015                                               | Structural data
Publisher           | Ten Speed Press                                          | Structural data
Author              | Anna Jones                                               | Structural data
KWDS                | Cooking and Food-Vegetarian and Natural                  | Semistructural data
Synopsis and Review | A beautifully photographed and modern vegetarian cookbook packed with quick, healthy, and fresh recipes that explore the full breadth of vegetarian ingredients—grains, nuts, seeds, and seasonal vegetables—from Jamie Oliver's London-based food stylist and writer Anna Jones. How we want to eat is changing. More and more people cook without meat several nights a week and are constantly seeking to . . . | Unstructured data
We are inspired by Zhang et al., who described a linguistic-keywords-extraction method by defining multiple kinds of relationships among words.28 The relationships include direct relationship, indirect relationship, part-whole relationship, and related relationship. • Direct relationship. Two core words are related directly to each other. • Indirect relationship. Two core words are related and linked by another word as a medium. • Part-whole relationship. The "is a" relation. One core word belongs to the other. It is the most common relationship in context. • Related relationship. Two core words have no direct or indirect relationship, but both appear in a larger context. The first two relationships can be mixed with the second two relationships. For instance, a part-whole relationship can have either a direct relationship or an indirect relationship. For this study, we combined every two core words into pairs for analysis. For example, the sentence "A picnic is a great escape from our day-to-day and a chance to turn a meal into something more festive and memorable" would result in several core-word pairs, including "Picnic" and "Meal," "Picnic" and "Festive," and "Meal" and "Festive." For "Picnic" and "Meal," there is an obvious part-whole relationship in this context. We observed all their relationships in all books and determined their relationship as a direct part-whole relationship because 67 percent of their relationships are part-whole relationships, 80 percent are direct relationships, and the rest are related relationships. This is the case when two core words are in the same sentence. For two words in different sentences but within one context, we treat the relationship as indirect. For example, "Ingredient" and "Meat" in one review in table 1 have an indirect relationship because they are connected by other core concepts between them. Therefore, the relationship between "Ingredient" and "Meat" is an indirect part-whole one in this context.
In the remaining cases, two concepts are related if they appear in the same context and unrelated if they do not appear in the same review. Thus, all concept pairs are calculated and stored as semantic triples. Figure 3. Parts of a modified category in "Cooking and Food" based on user cognition. The next step is to build up a category tree (figure 3). A direct part-whole relationship is that between a parent class and a child class. An indirect part-whole relationship is the relationship between a parent class and a grandchild class. A related relationship is the relationship between sibling classes. Compared to the modified category system (figure 3), the current hierarchical category system (figure 4) has two major issues. First, some categories' names are duplicated. For example, the child class "By Ingredient" contains "Fruit," "Fruits and Vegetables," and "Fruits, Vegetables, and Nuts." Second, there are categories without semantic meaning, such as "Oversized Books." These two problems caused disorderly indexing and the recall of many irrelevant results. For example, if you type one word in the search box, the system first asks you to refine your search. However, refinement by parent and child classes is confusing. Searching for "diet" books, for instance, the system suggests refining the search from five subcategories of "Diet and Nutrition" under three different parent classes. The modified category system, however, avoids duplicated keywords. Furthermore, the hierarchical system based on users' comments retains semantic meaning. Figure 4. Parts of current category system in "Cooking and Food." Indexing Books We found that the list of keywords was confusing due to the inefficiency of the previous category system. It is necessary to re-index the keywords of each book based on the modified category system. We follow the data-oriented indexing process. The method to detect the core concepts of each book is the same as that for all books in section 4.3. Taking the book A Modern Way to Eat as an example, triples are extracted from the book, including "grain-direct part whole-ingredient," "nut-direct part whole-ingredient," "vegetarian-related-health," and so on. Using all triples of the book to match against the triple set from all books in section 4.3, we index this book to categories by the best-matching parent class. In this case, 5 out of 9 triples of A Modern Way to Eat are matched with the parent class "Ingredient." Another two are matched with "Natural" and "Technique," and the other two cannot be correctly matched with the triple set. Then, A Modern Way to Eat will be indexed with "Cooking and Food-Ingredient," "Cooking and Food-Natural," and "Cooking and Food-Technique." 4.5 Semantic-Resource Construction The semantic resource is constructed based on the structural data that was prepared at the beginning. The informetrics method (specifically co-word analysis) is used to extract the precise relationships within the bibliography of books, as we previously proposed.29 We bring all structural data together and compute co-word matrices between each title, publisher, date, author, and keyword. For example, the author "Anna Jones" co-occurred with many keywords to varying degrees. The author co-occurred with the keyword "Natural" four times and "Person" seven times. (This counting step is sketched below.)
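The following minimal sketch illustrates the co-word counting just described: co-occurrence counts between authors and modified keywords across book records. The sample records are hypothetical, and the cut-off values in the labeling function are illustrative placeholders rather than the thresholds actually derived from Qiu and Lou.

```python
from collections import defaultdict

# Hypothetical book records with reindexed keywords.
books = [
    {"author": "Anna Jones", "keywords": ["Natural", "Person"]},
    {"author": "Anna Jones", "keywords": ["Natural", "Ingredient"]},
    {"author": "William Davis", "keywords": ["Health", "Grain and Bread"]},
]

# Count author-keyword co-occurrences across all records.
cooccurrence = defaultdict(int)
for book in books:
    for kw in book["keywords"]:
        cooccurrence[(book["author"], kw)] += 1

def label(count):
    """Turn a raw count into a literal relationship label; the threshold
    here is a placeholder for the ones reported by Qiu and Lou."""
    return "extremely correlated" if count >= 2 else "highly correlated"

# Each (entity, relationship, entity) triple joins the semantic resource.
triples = [(author, label(c), kw) for (author, kw), c in cooccurrence.items()]
for t in triples:
    print(t)
```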
According to Qiu and Lou, the precise relationship needs to be divided by thresholds and formatted as literal words.30 Therefore, among the degrees of all relationships between "Anna Jones" and other keywords, the relationship with "Natural" is highly correlated, and the relationship with "Person" is extremely correlated. Triples are composed of two concepts and their relationship. A semantic resource is then finally constructed that can be used for knowledge retrieval. Figure 5. An example of the knowledge network. Once the semantic resource is ready, the knowledge network is presentable. We adopted D3.js to display the knowledge network (figure 5). The net view automatically exhibits several books related to the author William Davis, who is placed in a conspicuous position on the screen. The force-directed map re-forms when users drag any book with the mouse, making that book the noticeable center of the other books. The network can connect with the database and the website. 5. USER-EXPERIENCE STUDY ON KNOWLEDGE DISPLAY AND RECOMMENDATION There are two common ways to evaluate a retrieval system. One is to test statistical results, such as recall and precision. The other is a user study. Since our aim is "of the people, for the people," we chose to conduct two user-experience studies rather than a statistical evaluation. As such, we can learn what users suggest and how they comment on our approach. User-Experience Study Design In February 2016, with the help of friends, we recruited volunteers by posting fliers in Portland, Oregon. Fifty volunteers contacted us. Thirty-nine responses were received by the end of March 2016 because the other eleven volunteers were not able to enroll in the electronic test. Since we needed to test the feasibility of both the new indexing category and the knowledge recommendation, we set up the user study in two parts: a comparison of simple retrieval and a test of knowledge recommendation. First, we requested permission to use the data source and website frame from Book City. However, we could not construct a new website for Book City due to intellectual-property issues. Therefore, we constructed a practical mock-up to guide users through a simulated retrieval experiment. Following user-experience design procedure, we chose MockingBot (https://mockingbot.com) as the mock-up builder. MockingBot allows demo users to experience a vivid version of a system that will be developed later. The mock-up supports tags that can be linked with other pages so that subjects could click on the mock-up just as they would on a real website. The demo is expected to help us (1) examine whether our changes would meet the users' satisfaction and (2) gather information for a better design. Then we performed guided, face-to-face interviews in which subjects first used the previous retrieval system and then compared it with our results. We concurrently recorded the answers and scores of users' feedback. In the following sections, we will describe the interview process and present the feedback results. Study 1: Comparison of Simple Retrieval First, subjects were asked to search for books written by "Michael Pollan" at Powells.com (figure 6). As such, all subjects instinctively used the search box. Then they were asked to find a new hardcover copy of a book named Cooked: A Natural History of Transformation.
We paid attention to the ways that subjects located the target. Only five of them used keyboard shortcuts to find the target. However, thirteen subjects stated their concerns regarding the absence of refinement options. Furthermore, we noticed that six subjects swept (moused over) the refinement area and then decided to continue eye screening. In the meantime, we recorded the time they spent looking for the item. After they found the target, all subjects gave us a score from one to ten that represented their satisfaction with the current retrieval system. Figure 6. Screenshot of retrieval results in the current system. In the comparison experiment, we placed our mock-up in front of subjects and conducted the same exercise as above. In the mock-up, we used the basic frame of the retrieval system but reframed the refinement area. In the new refinement area (figure 7), we added an optional box with refinement keywords in the left column to narrow the search scope. The logic of the refined keywords comes from the indexing category, as described in the "Indexing Books" section. "Michael Pollan" was indexed in six categories: "Biographies," "Children's Books," "Cooking and Food," "Engineering Manufactures," "Hobby and Leisure," and "Gardening." Thus, when subjects clicked the "Cooking and Food" category, they could refine the results to only twelve books rather than the seventy books in the current system. Users can obtain accurate retrieval results faster. After the subjects completed their tasks, they gave us a score from one to ten representing their satisfaction with the modified retrieval system. Figure 7. Refinement results in the modified category-system mock-up. Study 2: Knowledge Recommendation In this experiment, we conducted two tests for two functions of knowledge visualization. One tested preferences for the net view, and the other tested preferences for the individual recommendation. For the net view, we guided subjects to search for "William Davis" in the mock-up and reminded them to click the net view button after the system recalled a list view. Then, the subjects could see the net view results in figure 5. We recorded the scores that they gave for the net view. As for the recommendation on individual books, we adopted multiple layers of associated retrieval results for every book. Users could click on one book and another related book would show in a new tab window. We asked subjects to conduct a new search for "William Davis." Then they could browse the website and freely click on any book. Once they clicked on Davis's book Wheat Belly: Lose the Wheat, Lose the Weight, and Find Your Path Back to Health, the first recommendation results popped up (figure 8). Recommendation results about wheat in the field of "Grain and Bread" showed up, including Good to the Grain: Baking with Whole Grain Flours and Bread Bakers Apprentice: Mastering the Art of Extraordinary Bread. Others about health and losing weight showed up also, such as Paleo Lunches and Breakfasts on the Go. All related books appeared because the first book is about both wheat and a healthy diet. A new window showing relevant authors and titles would pop up if the mouse glided over any picture. We asked the subjects about their thoughts on the new recommendation format and recorded the scores. Figure 8. An example of knowledge recommendation.
Users' Feedback As a result, knowledge organization and retrieval received a positive response (tables 2 and 3). First, subjects complained about the inefficiency of the current retrieval system in that it took so long to find one book without using shortcut keys (Ctrl-F). Three-quarters of them were not satisfied with the original search style due to the search time length. However, 67 percent of the subjects gave a score of eight points or more for the refined search results of our new system. Only two of them thought that it was useless, since they were the two users who had taken fewer than ten seconds to find the exact result. Second, 67 percent and 74 percent of the subjects, respectively, thought that the knowledge recommendation and net view were useful and gave them six points or more. However, five subjects gave scores of one point because they maintained that it was not necessary to build a new viewer system. Table 2. The time to find the exact result in the current system.
Answer                  | # of users
Fewer than 10 seconds   | 2
10 to 30 seconds        | 4
30 seconds to 1 minute  | 12
More than 1 minute      | 21
Table 3. Statistics of quantitative questions in the questionnaire.
Question                                | 10 | 9  | 8  | 7 | 6 | 5 | 4  | 3 | 2 | 1 | Total
Satisfied with original results         | 0  | 0  | 0  | 0 | 1 | 9 | 14 | 9 | 4 | 2 | 39
Preference for refined results          | 2  | 10 | 14 | 6 | 5 | 0 | 0  | 0 | 0 | 0 | 37
Preference for results in net view      | 1  | 8  | 10 | 6 | 4 | 1 | 2  | 3 | 1 | 3 | 39
Preference for knowledge recommendation | 3  | 6  | 4  | 8 | 5 | 6 | 0  | 3 | 1 | 2 | 38
During the interview, subjects who gave scores of eight points or more spoke positively about the vivid visualization of the retrieval results, using words such as "innovative" and "creative." For instance, User 11 said, "Bravo changes for Powell, that'd be the most innovative experience for the locals." Among the subjects who gave scores of six points or more, the comments were mostly "interesting idea." For instance, User 17 commented, "This is an interesting idea to explore my knowledge. I had no idea Powell could do such an improvement." Some users offered suggestions to improve the system. For example, User 12 suggested that the system was not comprehensive enough to confidently assess whether the modified category system was better than the previous system. User 25 (a possible professional) was very concerned about the recall efficiency since the system might use many matching algorithms. DISCUSSION AND CONCLUSION In this paper, a digital literature resource organization model based on user cognition is proposed. This model aims to let users exercise their own initiative. We noticed a significant difference between the previous category system and the new system based on user cognition. Our aim, which was "of the people, for the people," was fulfilled. Taking Powell's City of Books as an example, we described how to construct a knowledge network based on user cognition. The user-experience study showed that this network provides an optimized presentation of digital-resource knowledge recommendation and knowledge retrieval. Although user cognition includes many other processes of user behavior, we used only users' literal expressions. This proved to be a feasible and effective way to reveal user cognition. We find that there is much more room to extend digital resource knowledge recommendation based on user cognition to other objects.
For one, this paper takes only the familiar Book City as a study object and books as experimental objects and found favorable effects, which indicates that digital resource knowledge linking can be applied to physical libraries, other bookstores, or other types of literature. Even though libraries have well-developed taxonomy systems, these can be compared with or combined with new approaches. For another, users value the visual effects and interactive functions. The results show promise for actualizing improvements to Book City's website or even to other digital platforms. The remaining concerns are how to optimize the retrieval algorithm and reduce time costs in the next study. ACKNOWLEDGEMENTS We thank Carolyn McKay and Powell's City of Books for their great help with distributing the questionnaire and all participants for their feedback. This work was supported by the National Social Science Foundation of China [grant number 17CTQ025]. REFERENCES AND NOTES 1 Peter Carruthers, Stephen Stich, and Michael Siegal, The Cognitive Basis of Science (Cambridge: Cambridge University Press, 2002). 2 Sophie Monchaux et al., "Query Strategies during Information Searching: Effects of Prior Domain Knowledge and Complexity of the Information Problems to Be Solved," Information Processing and Management 51, no. 5 (2015): 557–69, https://doi.org/10.1016/j.ipm.2015.05.004. 3 Hoill Jung and Kyungyong Chung, "Knowledge-Based Dietary Nutrition Recommendation for Obese Management," Information Technology and Management 17, no. 1 (2016): 29–42, https://doi.org/10.1007/s10799-015-0218-4. 4 Dandan Ma, Liren Gan, and Yonghua Cen, "Research on Influence of Individual Cognitive Preferences upon Their Acceptance for Knowledge Classification Recommendation Service," Journal of the China Society for Scientific and Technical Information 33, no. 7 (2014): 712–29. 5 Haiqun Ma and Zhihe Yang, "Study on the Cognitive Model of Information Searchers from the Perspective of Neuro-Language Programming," Journal of Library Science in China 37, no. 3 (2011): 38–47. 6 Paul Gooding, "Exploring the Information Behaviour of Users of Welsh Newspapers Online through Web Log Analysis," Journal of Documentation 72, no. 2 (2016): 232–46, https://doi.org/10.1108/JD-10-2014-0149. 7 Munmun De Choudhury and Scott Counts, "Identifying Relevant Social Media Content: Leveraging Information Diversity and User Cognition," in HT '11: Proceedings of the 22nd ACM Conference on Hypertext and Hypermedia (New York: ACM, 2011), 161–70, https://doi.org/10.1145/1995966.1995990; Carol Tenopir et al., "Academic Users' Interactions with ScienceDirect in Search Tasks: Affective and Cognitive Behaviors," Information Processing and Management 44, no. 1 (2008): 105–21, https://doi.org/10.1016/j.ipm.2006.10.007. 8 Young Han Bae, Jong Woo Jun, and Michelle Hough, "Uses and Gratifications of Digital Signage and Relationships with User Interface," Journal of International Consumer Marketing 28, no. 5 (2016): 323–31, https://doi.org/10.1080/08961530.2016.1189372.
9 Claude Sicotte et al., "Analysing User Satisfaction with the System in Use Prior to the Implementation of a New Electronic Inpatient Record," in Proceedings of the 12th World Congress on Health (Medical) Informatics; Building Sustainable Health Systems (Amsterdam: IOS Press, 2007), 1779–84; Zhenzheng Qian et al., "SatiIndicator: Leveraging User Reviews to Evaluate User Satisfaction of SourceForge Projects," in Proceedings—International Computer Software and Applications Conference 1 (2016): 93–102, https://doi.org/10.1109/COMPSAC.2016.183. 10 Christina Merten and Cristina Conati, "Eye-Tracking to Model and Adapt to User Meta-Cognition in Intelligent Learning Environments," in Proceedings of the 11th International Conference on Intelligent User Interfaces—IUI '06 (New York: ACM, 2006), 39–46, https://doi.org/10.1145/1111449.1111465; Weidong Zhao, Ran Wu, and Haitao Liu, "Paper Recommendation Based on the Knowledge Gap between a Researcher's Background Knowledge and Research Target," Information Processing & Management 52, no. 5 (2016): 976–88, https://doi.org/10.1016/j.ipm.2016.04.004. 11 Haoran Xie et al., "Incorporating Sentiment into Tag-Based User Profiles and Resource Profiles for Personalized Search in Folksonomy," Information Processing and Management 52, no. 1 (2016): 61–72, https://doi.org/10.1016/j.ipm.2015.03.001; Francisco Villarroel Ordenes et al., "Analyzing Customer Experience Feedback Using Text Mining: A Linguistics-Based Approach," Journal of Service Research 17, no. 3 (2014): 278–95, https://doi.org/10.1177/1094670514524625; Yujong Hwang and Jaeseok Jeong, "Electronic Commerce and Online Consumer Behavior Research: A Literature Review," Information Development 32, no. 3 (2016): 377–88, https://doi.org/10.1177/0266666914551071. 12 Stephan Ludwig et al., "More Than Words: The Influence of Affective Content and Linguistic Style Matches in Online Reviews on Conversion Rates," Journal of Marketing 77, no. 1 (2012): 1–52, https://doi.org/10.1509/jm.11.0560. 13 Jun Yang and Yinglong Wang, "A New Framework Based on Cognitive Psychology for Knowledge Discovery," Journal of Software 8, no. 1 (2013): 47–54. 14 Alan Baddeley, "On Applying Cognitive Psychology," British Journal of Psychology 104, no. 4 (2013): 443–56, https://doi.org/10.1111/bjop.12049. 15 Aidan Moran, "Cognitive Psychology in Sport: Progress and Prospects," Psychology of Sport and Exercise 10, no. 4 (2009): 420–26, https://doi.org/10.1016/j.psychsport.2009.02.010. 16 John Van De Pas, "A Framework for Public Information Services in the Twenty-First Century," New Library World 114, no. 1/2 (2013): 67–79, https://doi.org/10.1108/03074801311291974. 17 Enrique Frias-Martinez, Sherry Y. Chen, and Xiaohui Liu, "Evaluation of a Personalized Digital Library Based on Cognitive Styles: Adaptivity vs. Adaptability," International Journal of Information Management 29, no. 1 (2009): 48–56, https://doi.org/10.1016/j.ijinfomgt.2008.01.012.
18 Shing Lee Chung et al., "An Integrated Framework for Managing Knowledge-Intensive Service Innovation," International Journal of Services Technology and Management 13, no. 1/2 (2010): 20, https://doi.org/10.1504/IJSTM.2010.029669. 19 Koteshwar Chirumalla, "Managing Knowledge for Product-Service System Innovation: The Role of Web 2.0 Technologies," Research-Technology Management 56, no. 2 (2013): 45–53, https://doi.org/10.5437/08956308X5602045; Koteshwar Chirumalla et al., "Knowledge-Sharing Network for Product-Service System Development: Is It Atypical?," in International Conference on Industrial Product-Service Systems (2013): 109–14; Fumiya Akasaka et al., "Development of a Knowledge-Based Design Support System for Product-Service Systems," Computers in Industry 63, no. 4 (2012): 309–18, https://doi.org/10.1016/j.compind.2012.02.009. 20 C. F. Cheung et al., "A Multi-Perspective Knowledge-Based System for Customer Service Management," Expert Systems with Applications 24, no. 4 (2003): 457–70, https://doi.org/10.1016/S0957-4174(02)00193-8. 21 Padmal Vitharana, Hemant Jain, and Fatemeh Zahedi, "A Knowledge Based Component/Service Repository to Enhance Analysts' Domain Knowledge for Requirements Analysis," Information and Management 49, no. 1 (2012): 24–35, https://doi.org/10.1016/j.im.2011.12.004. 22 Baihai Zhou, "The Construction of Library Interdisciplinary Knowledge Sharing Service System," in 2014 11th International Conference on Service Systems and Service Management (ICSSSM), June 25–27, 2014, https://doi.org/10.1109/ICSSSM.2014.6874033. 23 Rusli Abdullah, Zeti Darleena Eri, and Amir Mohamed Talib, "A Model of Knowledge Management System for Facilitating Knowledge as a Service (KaaS) in Cloud Computing Environment," 2011 International Conference on Research and Innovation in Information Systems, November 23–24, 2011, 1–4, https://doi.org/10.1109/ICRIIS.2011.6125691. 24 Alan Smeaton and Jamie Callan, "Personalisation and Recommender Systems in Digital Libraries," International Journal on Digital Libraries 5, no. 4 (2005): 299–308, https://doi.org/10.1007/s00799-004-0100-1. 25 Yanwen Wu et al., "Research on Personalized Knowledge Service System in Community E-Learning," Lecture Notes in Computer Science (Berlin: Springer, 2006), https://doi.org/10.1007/11736639_17; Shu-Chen Kao and Chien-Hsing Wu, "PIKIPDL: A Personalized Information and Knowledge Integration Platform for DL Service," Library Hi Tech 30, no. 3 (2012): 490–512, https://doi.org/10.1108/07378831211266627. 26 National Geographic, Destinations of a Lifetime: 225 of the World's Most Amazing Places (Washington D.C.: National Geographic Society, 2016). 27 Wen Lou and Junping Qiu, "Semantic Information Retrieval Research Based on Co-Occurrence Analysis," Online Information Review 38, no.
1 (January 8, 2014): 4–23, https://doi.org/10.1108/OIR-11-2012-0203; Junping Qiu and Wen Lou, "Constructing an Information Science Resource Ontology Based on the Chinese Social Science Citation Index," Aslib Journal of Information Management 66, no. 2 (March 10, 2014): 202–18, https://doi.org/10.1108/AJIM-10-2013-0114; Fan Yu, Junping Qiu, and Wen Lou, "Library Resources Semantization Based on Resource Ontology," Electronic Library 32, no. 3 (2014): 322–40, https://doi.org/10.1108/EL-05-2012-0056. 28 Lei Zhang et al., "Extracting and Ranking Product Features in Opinion Documents," in International Conference on Computational Linguistics (2010): 1462–70. 29 Lou and Qiu, "Semantic Information Retrieval Research," 4; Qiu and Lou, "Constructing an Information Science Resource Ontology," 202; Yu, Qiu, and Lou, "Library Resources Semantization," 322. 30 Qiu and Lou, "Constructing an Information Science Resource Ontology," 202. 10067 ---- Editorial: How Do You Know Whence They Will Come? Dan Marmion Information Technology and Libraries, March 2000 As I write this, I am putting my affairs in order at Western Michigan University, in preparation for a move to a new position at the University of Notre Dame Libraries beginning in April. At each university my responsibilities include overseeing both the online catalog and the libraries' Web presence. I mention this only because I find it interesting, and indicative of an issue with which the library profession in general is grappling, that librarians in both institutions are engaged in discussions regarding the relationship between the two. In talking to librarians at those places and others, from some I hear sentiment for making one or the other the "primary" access point. Thus I've heard arguments that "the online catalog represents our collection, so we should use it as our main access mechanism."
Other librarians state that "the online catalog is fine for searching for books in our collection, but there is so much more to find and so many more options for finding it, that we should use our Web pages to link everything together." My hunch is that probably we can all agree that there are things that an online catalog can do better than a Web site, and things that a Web site can do better than the online catalog. As far as that goes, have we ever had a primary access point (thanks to Karen Coyle for this thought)? But that's not what I want to talk about today.

The debate over a primary access point contains an invalid implicit assumption and asks the wrong question. The implicit assumption is that we can and should control how our patrons come into our systems. The question we should be asking ourselves is not "What is our primary access method?" but rather "How can we ensure that our users, local and remote, will find an avenue that enables them to meet their informational needs?"

Since at this time I'm more familiar with WMU than Notre Dame, I'll draw some examples from the former. We have "Subject Guides to Resources" on our Web site. These consist of pages put together by subject specialists that point to recommended sources, both print and electronic, local and remote, on given subjects. Students can use them to begin researching topics in a large number of subject areas. The catch is that the students have to be browsing around the Web site. If they happen to start out in the online catalog they will never encounter these gateways, because the only reference to them is on the Web site. On the other hand, a student who stays strictly with the Web site is quite possibly going to miss a valuable resource in our library if he/she doesn't consult the online catalog, because we obviously can't list everything we own on the Web site. (Also, obviously, the Web site doesn't provide the patron with status information.) This is why we have to ask ourselves the correct question mentioned above.

What is the solution? Unfortunately I'm not any smarter than everyone else, so I don't have the answer (although I do know some folks who can help us with it: check out www.lita.org/committee/toptech/mainpage.htm). My guess is that we'll have to work it out as a profession, possibly in collaboration with our online system vendors, and that the solution will be neither quick nor simple nor easy. There are some ad hoc moves we can make, of course, such as putting links to the gateways into the catalog, and stressing on our Web pages that the patron really needs to do a catalog search.

The bottom line is that we have a dilemma: we can't control how people come into our electronic systems, so we can't have a "primary access point." If we try, we do harm to those who, for whatever reason, reach us via some other avenue. We need to make sure that we provide equal opportunity for all.

Dan Marmion (dmarmion@nd.edu) is Associate Director of Information Systems and Access at Notre Dame University, Notre Dame, Indiana.
10068 ---- Is This a Geolibrary? A Case of the Idaho Geospatial Data Center. Maria Anna Jankowska and Piotr Jankowski. Information Technology and Libraries, Mar. 2000, 19(1): 4.

The article presents the Idaho Geospatial Data Center (IGDC), a digital library of public-domain geographic data for the state of Idaho. The design and implementation of IGDC are introduced as part of the larger context of a geolibrary model. The article presents methodology and tools used to build IGDC with the focus on a geolibrary map browser. The use of IGDC is evaluated from the perspective of access and demand for geographic data. Finally, the article offers recommendations for future development of geospatial data centers.

In the era of integrated transnational economies, demand for fast and easy access to information has become one of the great challenges faced by libraries, the traditional repositories of information. Globalization and the growth of market-based economies have brought about, faster than ever before, acquisition and dissemination of data, and the increasing demand for open access to information, unrestricted by time and location. These demands are mobilizing libraries to adopt digital information technologies and create new methods of cataloging, storing, and disseminating information in digital formats.

Libraries encounter new challenges constantly. Participation in the global information infrastructure requires them to support public demand for new information services, to help society in the process of self-education, and to promote the Internet as a tool for sharing information. These tasks are becoming easier to accomplish thanks to the growing number of digital libraries. Since 1994, when the Digital Library Initiative originated as part of the National Information Infrastructure Program, the Internet has accommodated many digital libraries with spatial data content. For example, the Electronic Environmental Library Project at the University of California, Berkeley (http://elib.cs.berkeley.edu/) provides botanical and geographic data;
the University of Michigan Digital Library Teaching and Learning Project (www.si.umich.edu/UMDL/) focuses on earth and space sciences; Carnegie Mellon's Informedia Digital Video Library (www.informedia.cs.cmu.edu) distributes digital video, audio, and images with text; and the Alexandria Digital Library at Santa Barbara (http://alexandria.sdc.ucsb.edu/) provides geographically referenced information. The Alexandria Digital Library is of special interest in this article because it implements a model of a geolibrary. A geolibrary stores georeferenced information searchable by geographic location in addition to traditional searching methods such as by author, title, and subject.

Maria Anna Jankowska (majanko@uidaho.edu) is Associate Network Resources Librarian, University of Idaho Library, and Piotr Jankowski (piotrj@uidaho.edu) is Associate Professor, Department of Geography, University of Idaho, Moscow, Idaho.

The purpose of this article is to present the Idaho Geospatial Data Center (IGDC) in the larger context of a geolibrary model. IGDC is a digital library of public-domain geographic and statistical data for the state of Idaho. The article discusses methodology and tools used to build IGDC and contrasts its capabilities with a geolibrary model. The usage of IGDC is evaluated from the perspective of access and demand for geographic data. Finally, the article offers recommendations for future development of geospatial data centers.

Geographic Information Systems for Public Services

Terms such as digital, electronic, virtual, or image libraries have existed long enough to inspire diverse interpretations. The broad definition by Covi and King concentrates on the main objective of digital libraries, which is the collection of electronic resources and services for the delivery of materials in different formats.1 The common motivation for initiatives leading to the development of digital libraries is to allow conventional libraries to move beyond their traditional roles of gathering, selecting, organizing, accessing, and preserving information. Digital libraries provide new tools allowing their users not only to access the existing data but also to create new information. The creation of new information using the existing data sources is essential to the very idea of the digital library. Since the information in a digital library exists in virtual form, it can be manipulated instantaneously by computer-based information processing tools. This is not possible using traditional information media (e.g., paper, microfilm), where the information must first be transferred from non-digital into digital format.

Since late 1994, when the U.S. National Science Foundation founded the Alexandria Digital Library Project, the number of Internet sites devoted to spatially referenced information has grown dramatically. Today, it would require a serious expenditure of time and effort to visit all geographic data sites created by state agencies, universities, and commercial organizations. In 1997 Karl Musser wrote, "there are now more than 140 sites featuring interactive maps, most of which have been created in the last two years."2 This incredible boom in publishing
spatial data is possible thanks to geographic information system (GIS) technology and data development efforts brought about by the rapidly increasing use of GIS. This new technology provides its users with capabilities to automate, search, query, manage, and analyze geographic data using the methods of spatial analysis supported by data visualization.

Traditionally, geographic data were presented on maps considered as public assets. According to a Norwegian survey, the aggregate benefit accrued from using maps was three times the total cost of their production, even though maps provided only static information.3 Today, the conventional distribution of geographic data on printed maps has become less efficient than distributing them in digital format through wide area data networks. This happened largely due to GIS's ability to separate data storage from data presentation. As a result, data can be presented in a dynamic way, according to users' needs. GIS is often termed a "data mixing system" because it can process data from different sources and formats such as vector-format maps with full topological and attribute information, digital images of scanned maps and photos, satellite data, video data, text data, tabular data, and databases.4 All of these data types provide a rich informational infrastructure about locations and properties of entities and phenomena distributed in terrestrial and subterrestrial space.

The definition of GIS changes according to the discipline using it. GIS can be used as a map-making machine, a 3-D visualization tool, and as an analytical, planning, collaboration, and business information management tool. Today, it is hard to find a planning agency, city engineering department, or utility company (not to mention individual Internet users) that has not used digital maps. This is why the number of users seeking spatial data in digital format has increased so dramatically. Data discovery can be for GIS users the most time-consuming part of using the technology.5 As a result, libraries are faced with the growing demand for services that help discover, retrieve, and manipulate spatial data. The Web greatly improved the availability and accessibility of spatial data but, at the same time, stimulated public interest in using geographic information.

The continuing migration to popular operating systems (i.e., the Microsoft Windows family) and the adoption of their common functionality has brought GIS software to many desktops. Tools such as ArcView GIS from Environmental Systems Research Institute, Inc. (ESRI, www.esri.com) or MapInfo from MapInfo Corporation (www.mapinfo.com) have become popular GIS desktop systems. New software tools such as ArcExplorer, released by ESRI, are focused on making GIS more accessible, simpler, and available for use by the public. By taking advantage of the popularity of the Web, attempts are being made to gain a wider acceptance of GIS.
In the wake of the simplification of GIS tools and improved access to spatial data, a new exciting area of GIS use has recently emerged: public participation GIS.6 Public participation GIS by definition is a pluralistic, inclusive, and nondiscriminatory tool that focuses on the possibility of reducing the marginalization of societies by means of introducing geographic information operable on a local level.7 It promotes an understanding of spatial problems by those who are most likely to be affected by the implementation of problem solutions, and encourages transfer of control and knowledge to these parties. This approach leads to a broader use of GIS tools and spatial data and creates new challenges for libraries storing and serving geographic data in digital formats. Broadening the use of data and GIS tools requires attention to data access. Traditional libraries have often fulfilled the crucial role of being an impartial information provider for all parties involved in public decision-making processes. Will they be capable of serving society in this capacity in the digital age?

Geolibrary as a Repository of Georeferenced Information

According to Brandon Plewe, the user of spatial data can choose among seven types of distributed geographic information services available on the Internet.8 They range from raw data download, through static map display, metadata search, dynamic map browsing, data processing, Web-based GIS query and analysis, to net-savvy GIS software. Yet another important new category of geographic data service that can be added to this list is the geolibrary. Goodchild defines a geolibrary as a library filled with georeferenced information where the primary basis of representation and retrieval are spatial footprints that determine the location by geographic coordinates. "The footprints can be precise, when they refer to areas with precise boundaries, or they can be fuzzy when the limits of the area are unclear."9 According to Buttenfield, "the value of a geolibrary is that catalogs and other indexing tools can be used to attach explicit locational information to implicit or fuzzy requests, and once accomplished, can provide links to specific books, maps, photographs, and other materials."10 A geolibrary is distinguished from a traditional library in being fully electronic, with digital tools to access digital catalogs and indexes. It is anticipated that most of the information is archived in digital form. The value of a geolibrary is that it can be more than a traditional, physical library in electronic form.11

Since its introduction, the concept of a geolibrary has been synonymous with the Alexandria Digital Library (ADL) project. ADL was once defined as the Internet-based archive providing comprehensive browsing and retrieval services for maps, images, and spatial information.12 A more recent definition characterizes ADL as a geolibrary where a primary attribute of collection objects is their location on Earth, represented by geographic footprints. A footprint is the latitude and longitude values that represent a point, a bounding box, a linear feature, or a complete polygonal boundary.13 According to Goodchild (1998), a geolibrary's components include:

• The browser: a specialized software application running on the user's computer and providing access to the geolibrary via a computer network.
• The basemap: a geographic frame of reference for the browser's searches. A basemap provides the image of an area corresponding to the geographical extent of the geolibrary collection. For a worldwide collection this would be the image of the Earth; for a statewide collection this could be the image of a state. The basemap may be potentially large, in which case it is more advantageous to include it in the browser than to download it from a geolibrary server each time the geolibrary is accessed.

• The gazetteer: the index that links place names to a map. The gazetteer allows geographic searches by place name instead of by area.

• Server catalogs: collection catalogs maintained on distributed computer servers. The servers can be accessed over a network with the browser, utilizing basic client-server architecture.

The value of a geolibrary lies in providing open access to a multitude of information with geographic footprints regardless of the storage media. Because all information in a digital library is stored using the same digital medium, traditional problems of physical storage, accessibility, portability, and concurrent use (e.g., many patrons wanting to view the one and only copy of a map) do not exist.

Idaho Geospatial Data Center

In 1996, inspired by the ADL project, a team of geographers, geologists, and librarians started to work on a digital library of public-domain geographic data for the state of Idaho. The main goal of the project was the development of a geographic digital data repository accessible through a flexible browsing tool. The project was funded by a grant from the Idaho Board of Education's Technology Incentive Program. The project resulted in the creation of the Idaho Geospatial Data Center (IGDC, http://geolibrary.uidaho.edu). The first in the state of Idaho, this digital library is comprised of a database containing geospatial datasets, and GeoLibrary software that facilitates access, browsing, and retrieval of data in popular GIS data formats including Digital Line Graph (DLG), Digital Raster Graphics (DRG), USGS Digital Elevation Model (DEM), and U.S. Bureau of the Census TIGER boundary files for the state of Idaho. The site also provides an interactive visual analysis of selected demographic/economic data for Idaho counties. Additionally, the site provides interactive links to other Idaho and national spatial data repositories.

The key component of the library is the GeoLibrary software. The name "GeoLibrary" is not synonymous with the model of geolibrary defined by Goodchild (1998); it was rather adopted as a reference to a geolibrary browser, one of the components of the geolibrary. The GeoLibrary browser (GL) supports online retrieval of spatial information related to the state of Idaho. It was implemented using Microsoft Visual Basic 5.0/6.0 and ESRI MapObjects technology. The software allows users to query an area of interest using a search based on map selection, as well as selection by area name (based on the USGS 7.5-minute quad naming convention). Queries return GIS data including DEMs, DLGs, DRGs, and TIGER files. Queries are intended both for professionals seeking GIS-format data and nonprofessionals seeking topographic reference maps in the DRG format.

The interface of GL consists of three panels resembling the Microsoft Outlook user interface. Our intent in designing the interface was to have panels that would be used in the following order.
First, the map panel is used to explore the geographic coverage of the geolibrary and to select the area of interest. Next, the query panel is used to execute a query, and finally the result panel allows the user to analyze results and to download spatial data. Users can use a shortcut to go directly to the query panel and type their query. Both approaches result in the output being displayed as the list of files available for download from participating servers.

The map panel (figure 1) includes a navigable map of Idaho, a vertical command toolbar, and a map finder tool. The command toolbar allows the user to zoom in, zoom out, pan the map, identify by name the entities visible on the map canvas, and select a geographic area of interest. Geographic entity name identification was implemented as a dynamic feature whereby the name of the entity changes as the user moves the mouse over the map. Spatial selection provides a tool to select a rectangular area of interest directly on the map canvas. The map finder provides additional means to simplify the exploration of the map. The user can select a county or a quad name and zoom in on the selected geographic unit.

Figure 1. Map panel. The vertical toolbar provides zooming, panning, as well as labeling and simple feature querying capabilities. The map finder allows finding and selecting an area by county or USGS quad name. The screen copy here presents the selection of Latah County in Idaho.

The query panel (figure 2) allows the user to perform a query, based either on the selection made on the map or a new selection using one of the available query tools (figure 3). In the latter case, the user can enter geographic coordinates (in decimal degrees) defining the area of interest. This approach is equivalent to selecting a rectangular area directly on the map, and will return all data files that spatially intersect with the selected area. Optionally, the user can handpick quads of interest from the list. Finally, a name can be entered to execute a more flexible query. For instance, a search containing the word "Moscow" returns spatial data related to three quads containing "Moscow" within their names. The query is executed when the user presses the Query button.

Figure 2. Query panel. The interface was set to query the spatial selection from the map panel.

Figure 3. Query panel. The query is based on the selection of USGS quads. Optionally, the user can enter geographic coordinates of the area or a text to search.
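The spatial-intersection query behind this panel is a standard bounding-box test. The following Python sketch shows the general idea for illustration only: the actual GL browser was implemented in Visual Basic with ESRI MapObjects, and every name in the sketch (Footprint, DataFile, query) is hypothetical.

from dataclasses import dataclass
from typing import Optional

@dataclass
class Footprint:
    """Bounding box in decimal degrees (west, south, east, north)."""
    west: float
    south: float
    east: float
    north: float

    def intersects(self, other: "Footprint") -> bool:
        # Two axis-aligned boxes intersect unless one lies entirely
        # beyond an edge of the other.
        return not (self.east < other.west or other.east < self.west or
                    self.north < other.south or other.north < self.south)

@dataclass
class DataFile:
    name: str   # e.g., a USGS 7.5-minute quad name such as "Moscow East"
    kind: str   # "DEM", "DLG", "DRG", or "TIGER"
    footprint: Footprint

def query(catalog: list,
          area: Optional[Footprint] = None,
          text: Optional[str] = None) -> list:
    """Return files that intersect the selected area and/or match a name."""
    hits = catalog
    if area is not None:
        hits = [f for f in hits if f.footprint.intersects(area)]
    if text is not None:
        hits = [f for f in hits if text.lower() in f.name.lower()]
    return hits

# A name query for "Moscow" would return every quad whose name contains it,
# matching the behavior the article describes.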
After the results are received, the application automatically switches to the results panel. The results panel shows the outcome of the query and includes important information about the data files: their size, type, projection, scale, the name of the server providing the data, as well as the access path (figure 4). Based on this information, the user has the option of manually connecting to the server, using the FTP protocol, and retrieving the selected files. A much more convenient approach, however, is to rely on GL software to automatically retrieve the files through the software interface. As an option, the result of the query can also be exported to a plain HTML document that contains links to all listed files. This feature can be very useful in the case of multiple files selected by the user and slow or limited-time Internet access. This way the user can open the saved list of files in a Web browser and download individual files as needed, without having to download all the files at once and tie up the Internet connection for a long period of time.

The result panel provides a flexible way to review and organize the outcomes of queries before commencing the download. One can sort files by name, size, scale, projection, and server name. This feature may be useful if the user decides to retrieve data of only one type (e.g., DEMs), of one scale, or when the user prefers to connect only to a specific server. In addition, individual records as well as entire file types can be deselected to prevent files from being downloaded. The user can also remove selected files to scale down the set of data in the list.

Figure 4. The results panel. Results of a query can be sorted; individual items can be removed from the list or can be deselected to prevent them from being downloaded.

One of the most important assets of the GL browser is that all of the user activities described up to this point, with the exception of file download, take place entirely on the client side without any network traffic. In fact, area/file selection as well as queries do not require an active Internet connection. Map exploration is based on vector-format maps contained in GL software, and queries are run against the local database. Such an approach limits bandwidth consumption and unnecessary network traffic. An Internet connection is only necessary to perform retrieval of selected files.

IGDC is an open solution. New local datasets can be added or removed, making the collection easily adaptable to different geographical areas. In addition, datasets can physically reside on multiple servers, taking full advantage of the Internet's distributed nature.

The vulnerability of the client-side approach to data query is being left with a potentially outdated local database. In order to prevent this problem from happening, GL is equipped with a database synchronization mechanism that allows users to keep up with the server database updates. The client-side database contained in GL software, which mirrors the schema of the server database, can be synchronized automatically or at the user's request. In either case, the GL client contacts the database synchronizer on the server side, which handles all necessary processes. Since the synchronization is limited to database record updates, the network traffic is kept low, making GL suitable for limited Internet connections.
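A record-level synchronizer of the kind just described ships only changed catalog records to the client. The Python sketch below assumes a hypothetical JSON delta endpoint; the article does not document the actual GL client-server protocol, so treat this as one plausible shape of it, not the implementation.

import json
from urllib.request import urlopen

SYNC_URL = "http://geolibrary.example.edu/sync"  # placeholder endpoint

def synchronize(local_db: dict, last_sync: str) -> str:
    """Bring the client-side catalog up to date; returns a new sync token."""
    with urlopen(f"{SYNC_URL}?since={last_sync}") as resp:
        delta = json.load(resp)          # only changed records travel
    for rec in delta["updated"]:
        local_db[rec["id"]] = rec        # insert or overwrite changed records
    for rec_id in delta["deleted"]:
        local_db.pop(rec_id, None)       # drop records removed on the server
    return delta["token"]                # remember for the next sync

Because only the delta crosses the network, the scheme stays usable over the limited dial-up connections the authors had in mind.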
Evaluation of IGDC Use

Geospatial information is among the most common public information needs; almost 80 percent of all information is geographic in nature. Published research reflecting those needs and the role of libraries in resolving them is not extensive. The efforts of federal, state, and local agencies collecting digital geospatial data and the growth of GIS created an interest in the role of libraries as repositories of geospatial data.14 The main obstacle to providing access to digital spatial information is its complexity. This is why a user-friendly interface is critical for presenting spatially referenced information.15 The IGDC has been a first attempt at creating a user-friendly interface in the form of a map-based data browser allowing users to access and retrieve geographic datasets about Idaho.

In order to track and evaluate the use of geospatial data, WebTrends software was installed on the IGDC server. The WebTrends software produces customized Web log statistics and allows tracking information on traffic and datasets dissemination. During a one-year timeframe the number of successful hits was more than twenty-five thousand. Almost 40 percent of users came from the .com domain, 35 percent were .net domain users, 15 percent were .org, and 10 percent were .edu users (figure 5). Tracking the geographic origin of users by state, the biggest number of users came from Virginia, followed by Washington, California, Ohio, and Idaho. The high number of users from Virginia can be explained by the linking of the IGDC site to one of the most popular geospatial data sites in the country, the United States Geological Survey (USGS) site. Eighty-four percent of user sessions were from the United States; the rest originated from Sweden, Canada, and Germany. The average number of hits per day on weekdays was around one hundred customers.

Figure 5. Distribution of IGDC users in percent by origin domain.

The most popular retrievable information was Digital Raster Graphics (DRG) data that present scanned images of USGS standard series topographic maps at 1:24,000 scale. Digital Elevation Models (DEM) and Digital Line Graphs (DLG) were less popular. The TIGER boundary files for the state of Idaho were in small demand. The popularity of DRG-format maps and the fact that most of the users accessed IGDC via the USGS Web site make plausible a speculation that most of the users were non-GIS specialists interested in general reference geographic information about Idaho, including topography and basic land use information.

Since the opening of IGDC for public use (April 1998), the GeoLibrary map browser was downloaded 1,352 times. The software proved to be relatively easy to use by the public. Out of forty-four bug reports/user questions submitted to IGDC, most were concerned with filling out the software registration form and not with software failure. The IGDC project spurred an interest in geographic information among students, faculty, and librarians at the University of Idaho. In a direct response to this interest, the University of Idaho library installed a new dedicated computer at the reference desk with GeoLibrary software to access, view, and retrieve IGDC data.

Conclusion

The Idaho Geospatial Data Center is the first geospatial digital library for the state of Idaho. It does not fulfill all requirements of the geolibrary model proposed by Goodchild and others. The IGDC has only two components of the geolibrary model: the GeoLibrary map browser and the basemap. The main difference between the GeoLibrary map browser and the Web-based browser solution adopted by other spatial repositories is a client-side solution to geospatial data query and selection. Spatial data query is done locally on the user's machine, using the library database schema contained in the GeoLibrary map browser.
This saves time by eliminating client-server communication delays during data searches, gives the user an experience of almost instantaneous response to queries, and reduces the network communication to the data download time. In comparison with the geolibrary model, IGDC is missing the gazetteer. This component can help improve the ease of user navigation through a geospatial data collection. The other useful component includes online mapping and spatial data visualization services. The idea of such services is to provide the user with a simple-to-operate mapping tool for visualizing and exploring the results of user-run queries. One such service, currently under implementation at IGDC, includes thematic mapping of economic and demographic variables for Idaho using Descartes software.16 Descartes is a knowledge-based system supporting users in the design and utilization of thematic maps. The knowledge base incorporates domain-independent visualization rules determining which map presentation technique to employ in response to the user selection of variables. An intelligent map generator such as Descartes can enhance the utility of a geolibrary by providing tools to transform georeferenced data into information.

References and Notes

1. L. Covi and R. King, "Organizational Dimensions of Effective Digital Library Use: Closed Rational and Open Natural Systems Models," Journal of the American Society for Information Science 47, no. 9 (1996): 697.
2. K. Musser, "Interactive Mapping on the World Wide Web" (1997). Accessed March 6, 2000, www.min.net/~boggan/mapping/thesis.htm.
3. T. Bernhardsen, Geographic Information Systems (Arendal, Norway: Viak IT and Norwegian Mapping Authority, 1992), 2.
4. Ibid., 4.
5. J. Stone, "Stocking Your GIS Data Library," Issues in Science and Technology Librarianship (Winter 1999). Accessed March 6, 2000, www.library.ucsb.edu/istl/99-winter/article1.html.
6. P. Schroeder, "GIS in Public Participation Settings" (1997). Accessed June 2, 1999, www.spatial.maine.edu/ucgis/testproc/schroeder/ucgisdft.htm.
7. W. J. Craig and others, "Empowerment, Marginalization, and Public Participation GIS," Report of a Specialist Meeting Held under the Auspices of the Varenius Project, Santa Barbara, California, Oct. 15-17, 1998, NCGIA, UC Santa Barbara.
8. B. Plewe, GIS Online: Information Retrieval, Mapping, and the Internet (Santa Fe, N.M.: OnWord Pr., 1997), 71-91.
9. M. F. Goodchild, "The Geolibrary," in Innovations in GIS 5: Selected Papers from the Fifth National Conference on GIS Research UK (GISRUK), ed. S. Carver (London: Taylor and Francis, 1998), 59. Accessed March 6, 2000, www.geog.ucsb.edu/~good/Geolibrary.html.
10. B. P. Buttenfield, "Making the Case for Distributed GeoLibraries" (1998). Accessed March 6, 2000, www.nap.edu/html/geolibraries/app_b.html.
11. Ibid.
12. M. Rock, "Monitoring User Navigation through the Alexandria Digital Library" (master's thesis abstract, 1998). Accessed March 6, 2000, http://greenwich.colorado.edu/projects/rockm.htm.
13. L. L. Hill and others, "Geographic Names: The Implementation of a Gazetteer in a Georeferenced Digital Library," D-Lib Magazine 5, no. 1 (1999).
Accessed March 6, 2000, www.dlib.org/dlib/january99/hill/01hill.html.
14. M. Gluck and others, "Public Librarians' Views of the Public's Geospatial Information Needs," Library Quarterly 66, no. 4 (1996): 409.
15. B. P. Buttenfield, "User Evaluation for the Alexandria Digital Library Project" (1995). Accessed March 6, 2000, http://edfu.lis.uiuc.edu/allerton/95/s2/buttenfield.html.
16. G. Andrienko and others, "Thematic Mapping in the Internet: Exploring Census Data with Descartes," in Proceedings of TeleGeo '99, First International Workshop on Telegeoprocessing, Lyon, May 6-7, R. Laurini, ed. (Seiten, France: Claude Bernard Univ. of Lyon, 1999), 138-45.

10069 ---- The Internet as a Source of Academic Research Information: Findings of Two Pilot Studies. Harry M. Kibirige and Lisa DePalo. Information Technology and Libraries, Mar. 2000, 19(1): 11.

As a source of serious subject-oriented information, the Internet has been a powerful feature in the information arena since its inception in the last quarter of the twentieth century. It was, however, initially restricted to government contractors or major research universities operating under the aegis of the Advanced Research Projects Agency Network (ARPANET).1 In the 1990s, the content and use of the Internet was expanded to include mundane subjects covered in business, industry, education, government, entertainment, and a host of other areas. It has become a magnanimous network of networks, the measurement of whose size, impact, and content often eludes serious scholarly effort.2 Opening the Internet to common usage literally opened the flood gates of what has come to be known as the information superhighway. Currently, there is virtually no subject that cannot be found on the Internet in one form or another.

There is both hype and reality as to what the Internet can generate in terms of substantive information. In their daily pursuits of information, information professionals as well as end users of information are challenged with regard to what their expectations are and what actually is delivered in terms of tangible information products and services on the Net. Academic users are a special breed in that both faculty and students have specific topics covered in their courses of study or faculty research agendas for which they need information. The use of electronic resources found on and off the Internet is becoming increasingly vital for education and training in academic environments.3 Five basic elements often are required in the electronic resources that academic information seekers desire: accessibility, timeliness, readability, relevance, and authority. The Internet excels in the first three, but depending on how and from where the information is gathered, it may not be so reliable with regard to the last two elements.

The two pilot studies discussed in this article involved four academic institutions and were conducted by the researchers approximately twelve months apart. One, covering two institutions, was done in the fall of 1997. It was replicated covering another two institutions in the spring of 1999.
The main goal of the studies was to investigate how academic users perceive search engines and subject-oriented databases as sources of topical information. The basic underlying question was, "When faced with a topical subject, what is the users' predominant recourse: online databases (which may include CD-ROM or DVD databases) or search engines?" Our results indicated that there is a predominant preference for search engines for the group taken as a whole. Further analysis using nonparametric correlation coefficients (Kendall's tau_b and Spearman's rho), however, indicated that those who use the Internet monthly or weekly had high correlations with online databases as their preferred predominant information sources. On the other hand, daily users tended to have high correlations with search engines as preferred predominant information sources.

Information Seeking Behavior of Academic Users

Over the years, several studies have been conducted on how users seek and find information relevant to their needs. For the purposes of our analysis three categories will be used: the undergraduate, the graduate, and the post-doctoral research faculty user. While the levels at which the needed information may be articulated and packaged may be different, the five basic required elements in the electronic information resources needed by academics, already identified, remain the same. The Internet has, however, added another dimension to the information-seeking behavior of all academics in that much of the needed information, if and when found, has a higher chance of appearing as full text (sometimes defined as viewdata) on the Internet.4 With viewdata the end user has the ultimate in information seeking and acquisition in that he or she will get text, images, and sound in one, two, or more resources on the Net. The process also may be accomplished in one sitting or search session at the computer terminal. The Internet thus may be more likely to generate viewdata in contrast to conventional databases, which have for a long time been associated with the less desirable citations. In many instances and with a little persistence, it can provide the analogy of "one stop shopping" whereby a user can get the viewdata needed for a topic. This may explain the tendency to try the Internet first as a potential information source even for experienced searchers. To be effective, such searching needs experience and a lot of patience while sifting through pages of useless verbiage, as the information sources often are garnered from several sites. Categories of academic users have varying levels of expertise in information seeking and have different characteristics in their information-seeking behavior.

Harry M. Kibirige is Associate Professor at Queens College, City University of New York, and Lisa DePalo is Assistant Professor at College of Staten Island, City University of New York.
Critical thinking skills are imperative to much more than completing college-level assignments-they are also imperative to surviving in the job market once students graduate. This premise has been set forth in the 1992 United States government report from the Department of Labor and The Secretary's Commission on Achieving Necessary Skills (SCANS) entitled Skills and Tasks for Jobs: A SCANS Report for America 2000. This report defined two types of skills needed to excel in the workplace and labeled them as competencies and foun- dations. Effective reference and instruction services can help students develop the critical thinking skills needed to meet the information competency, in particular, since it pertains to one who "acquires and evaluates information, organizes and maintains information, interprets and communicates information, and uses computers to process information." 5 Acquiring and evaluating infor- mation can be particularly difficult for undergraduates in the information age since one is bombarded with data in print and electronic formats. One can easily determine the reliability of print sources by looking at the name of the author, editor, or publisher. However, the Internet has become a popular choice for students who need to do research. It has gained the reputation for providing all that one needs right at one's fingertips. The problem is that one cannot readily discern what is reliable and what is not without some instruction. It may be argued that the undergraduates' informa- tion seeking is somewhat eased by the general guides they get from the faculty in the classroom. There is the general professorial lecture which outlines the topics to be covered during the course, as well as associated relevant readings used to broaden the subjects covered. In addition, there is the text book which elaborates on material covered in class. Finally, there are journal articles and other informa- tion sources which ordinarily are placed on reserve. As far as subject content covered in class lectures and discussion is concerned, information is usually well organized and accessible. At that level information seeking is minimal and often guided by the dictates of the professor. But then enters the term paper and the whole student peace of mind with regard to information gathering habits is disturbed. The term paper brings many unknowns to the undergraduate. The magnitude of the subject to be covered is initially fuzzy. The resources needed to get background as well as specific information are also fuzzy. Furthermore, even when the resources are a little clear, sifting through them and making rational selection of rel- evant material may be problematic. The whole academic exercise entails learning and using new information tools, many of which were not covered in high school. Computers and other electronic equipment have accentu- ated the undergraduates' mesmerization process in their information-seeking effort. A trait that most undergradu- ates exhibit in their information-seeking behavior is approaching the reference librarian for suggestions of leads to information sources needed for the term paper topic. They also may request the librarian to evaluate the sources as to their relevance, and sometimes even ask him or her to fetch the actual material needed. 
With the advent of the Internet and other electronic resources, online or otherwise (e.g., Dialog, Lexis-Nexis, CD-ROMs, DVDs, and tapes), the undergraduate may go directly to the Internet terminal and thus skip the librarian's counsel and hand-holding which used to be vital for accessing the printed material. Unless the undergraduate student is well groomed in searching the Internet, this relatively new tendency to act independently of the information professional may result in hours of useless roaming on the Net with little relevant information retrieved.

The Graduate User

In their study of business students, Atkinson and Figueroa found that graduates reported fewer hours spent in the library than undergraduates.7 The researchers did not attempt to explain why that was so. Perhaps because of their search skills, graduates do more focused information seeking and do not waste much of their time browsing and floundering in the unknown information abyss within the library. The researchers reported an equal interest in searching Internet resources and online databases (e.g., Lexis-Nexis, Dow Jones, and ABI/Inform) among graduates and undergraduates. However, their research was done at the end of 1995 and beginning of 1996, before the proliferation of search engines on the Internet.

As an information searcher the graduate is more sophisticated compared with the undergraduate. Subject coverage is usually more clearly defined in many of the assignments encountered. He or she has gone through most of the pitfalls of the undergraduate experience and can select a subject and research it relatively well. Most likely due to the nature of their assignments, undergraduates' information needs may be satisfied by simple information systems that allow users to browse. Their searches also tend to be less exhaustive than graduates'. On the other hand, graduates are faced with relatively
Subject-oriented databases can be searched either in the library or in fac- ulty offices. Curtis et al. researched the information-seek- ing behavior of health sciences faculty and found a relatively new and growing information-seeking charac- teristic. According to Curtis et al., faculty tend to prefer to search electronic resources from their offices rather than go to the library. 10 That is not surprising, for if a faculty member can access library catalogs and electronic data- bases, some of which can provide viewdata (full text), it is not necessary for him or her to go to the library for some of the information needed. In addition, if CD-ROM databases are on a local area network accessible via the college online catalog, faculty may seldom go to a library whose resources are on the network via a library Web site, Telnet, or the traditional dial up. I The Pilot Studies With the general information-seeking behavior of aca- demic users in mind, the researchers decided to investi- gate the use of search engines for information sources in the academe in the New York metropolitan area. Search engines were contrasted to databases which may be URL- (Universal Resource Locator) accessible online via an Internet browser, stand alone on CD-ROM, or on CD- ROM towers linked by a library local area network. In her article on Web search engines, Schwartz discussed recent studies done on their performance . She pointed out the fact that the end user is not often a participant in such studies .11 Although our research was not on evaluation, we deliberately focused on the end user to gather statistics on perception of Web search engine utility in Internet surf- ing and information seeking . Kassel evaluated search engines indicating their variety and complexity when used to search the Internet. 12 Other relevant literature indicated the difficulty of navigating the Internet for both the information professional and the end user. It also indi- cated how direct access to databases was a shortcut to retrieving some of the topical information . Our periodic observations of Internet users revealed heavy use of search engines. We suspected that end users use them to get topical information which might otherwise be easily gotten from online databases. Consequently, we thought it necessary to conduct a study on end-user perception . Objectives Our objectives in embarking on the pilot studies were to: 1. Find the frequency of Internet use by end users. This would allow us to check whether there is a correla- tion between frequency of Internet use and percep- tion of search engine utility. 2. Find the most popular search engine. Examining the most popular search engine with respect to indexing policy might indicate whether it would generate more topical subject type of information. 3. Gauge the use of online and CD-ROM databases in the library. In order to help the end-users' memory as to what databases are involved in the research, common databases were listed on the questionnaire as examples. 4. Gauge the use of search engines in libraries and information centers. Common search engines like- wise were listed to help the end user identify what they were. 5. Relate the results to pragmatic library and informa - tion-center functioning in providing information . Methodology Four metropolitan New York academic institutions were selected : Borough of Manhattan Community College; Iona College; Queens College of the City University of New York; and Wagner College. 
The main criterion for selection was ease of access for the researchers. A composite sample of users was selected from these institutions to participate in the studies. The sample used was dynamic and self-selected in that whoever used the "Internet terminal" was a potential research subject. Only end users, as opposed to information professionals/librarians, were used in the study. While subjects sat at the terminal, they were requested to complete the questionnaire and return it to the reference/information desk. Simplicity dictated the design of the research and data collection instrument (questionnaire). It was one page, multicolored, and was entitled "Internet Use Questionnaire." We estimated that it would take the subjects four to seven minutes to complete. Our assumption in designing it to be simple and least time-consuming was that since the subjects were sitting at the terminals, they were time conscious. While subjects were asked to complete the questionnaire, they had the option not to. Forty copies of the questionnaire were given to each academic institutional library, making a total of 160. Useable returns were 155, or 97 percent.

In addition to the questionnaire, we conducted exit interviews with some of the subjects who were using the Internet terminals after they handed in the completed questionnaires. The purpose of the interviews was to have some idea as to how the users perceived the utility of the Internet in getting electronic-based information. Four questions were used:

1. How do you find the Internet as an information source?
2. Did you get what you needed from the Internet?
3. Do you have a favorite search engine?
4. Is there any point when you would seek the assistance of the reference librarian/information specialist?

Analysis of the data was done using the SPSS (Statistical Package for the Social Sciences) package. We used descriptive statistics for general group tendencies: frequency of Internet use and preferred sources for topical subject search. For inferential statistics we preferred the nonparametric pairwise two-tailed correlation coefficients, Kendall's tau_b and Spearman's rho. Microsoft's Excel program package was used to draw some of the illustrations.

Results

The study revealed that an overwhelming majority of subjects (91 percent) use the Internet at least once a week (this includes those who use it daily). An almost equal number (45 percent) use it weekly; 46 percent use it at least once a day (see figure 1).

Figure 1. Frequency of Internet use: daily 46 percent, weekly 45 percent, monthly 9 percent.

As figure 2 shows, search engines are the predominant preferred tools for searching topical subjects on the Internet as contrasted to online and CD-ROM databases.

Figure 2. Preferred sources for subject search: search engines 84 percent, online databases 16 percent.

We used the two-tailed pairwise correlation coefficients to see whether there are correlations between frequency of Internet use and tool preferences. As table 1 and table 2 indicate, subjects who used the Internet monthly or weekly had high correlations with online databases. Daily users, however, tended to have high correlations with search engines as tools to get to topical subject information sources.

Table 1. Nonparametric correlations (Spearman's rho); N = 4 for each pair, two-tailed significance in parentheses.

          DAILY         SENG          MONTHLY       WEEKLY        ONDB
DAILY     1.000         -.544 (.456)  -.258 (.742)  -.544 (.456)   .258 (.742)
SENG      -.544 (.456)  1.000          .316 (.684)   .500 (.500)   .316 (.684)
MONTHLY   -.258 (.742)   .316 (.684)  1.000          .949 (.051)   .800 (.200)
WEEKLY    -.544 (.456)   .500 (.500)   .949 (.051)  1.000          .632 (.368)
ONDB       .258 (.742)   .316 (.684)   .800 (.200)   .632 (.368)  1.000

Table 2. Nonparametric correlations (Kendall's tau_b); N = 4 for each pair, two-tailed significance in parentheses.

          DAILY         SENG          MONTHLY       WEEKLY        ONDB
DAILY     1.000         -.516 (.346)  -.236 (.655)  -.516 (.346)   .236 (.655)
SENG      -.516 (.346)  1.000          .183 (.718)   .400 (.444)   .183 (.718)
MONTHLY   -.236 (.655)   .183 (.718)  1.000          .913 (.071)   .667 (.174)
WEEKLY    -.516 (.346)   .400 (.444)   .913 (.071)  1.000          .548 (.279)
ONDB       .236 (.655)   .183 (.718)   .667 (.174)   .548 (.279)  1.000
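The inferential step above can be reproduced with standard tools. Here is a sketch in Python using scipy rather than SPSS; the two vectors are made-up stand-ins, not the study's raw data, and scipy's kendalltau computes the tau_b variant the authors report.

from scipy.stats import kendalltau, spearmanr

# Hypothetical aggregated values, one per participating institution (N = 4);
# these stand in for the study's actual frequency/preference measures.
daily = [46, 40, 50, 48]    # percent reporting daily Internet use
seng  = [84, 80, 88, 86]    # percent preferring search engines

tau, tau_p = kendalltau(daily, seng)   # tau_b with a two-tailed p-value
rho, rho_p = spearmanr(daily, seng)    # Spearman's rho with a two-tailed p-value
print(f"Kendall's tau_b = {tau:.3f} (p = {tau_p:.3f})")
print(f"Spearman's rho = {rho:.3f} (p = {rho_p:.3f})")

With only four paired observations, as in the tables above, none of these coefficients can reach conventional significance, which is worth keeping in mind when reading the authors' interpretation.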
Interpretations and Conclusions

Search engines certainly provide the most common access points utilized by library/information center users to get to electronic resources on the Internet. Unfortunately, the average user seems to have the impression that the Internet is a be-all and almost a panacea for all information problems. Kassel suggests that, at best, search engines seem to reach just about half of the Web pages available on the Internet.13 Sullivan has given several reasons why search engine coverage is incomplete and search results sometimes may be misleading.14 Among the most cogent reasons are: documents may be changed after they have been picked up for inclusion; deleted materials may be displayed as available; and Web sites or files which are password accessible are not covered. Much of the information needed in academe is proprietary and available via database vendors. Using search engines as the main recourse to topical information shortchanges the user and may lead to frustration unless the high user expectations are tempered by constant education by the information specialist.

The pilot studies do not give conclusive answers as to why the weekly and monthly Internet users correlated with those who use online and CD-ROM databases. It might be that they search the Internet via search engines as supplements to conventional online sources. Alternatively, they may search using search engines on an exploratory basis when they begin a relatively new subject. Daily users who correlated with search engines might have mistaken the highway function of search engines for the actual sources, for example EDGAR, MEDLINE, or ERIC. It might have been the problem of confusing "the end" with the "means to the end."

Implications for Information Professionals

Our studies indicated that a majority of the users in the sample preferred the search engines as access points to the Internet for topical information. The interest in search engines correlated with the State University of New York at Albany study, which also indicated their predominant use in searching the Internet.15 While the Albany study was general, ours related the search engines to getting topical information and the use of online databases as an alternative. Our findings point to the need to re-educate the Internet user in several aspects of the superhighway. First, content: only a fraction of the possible sites (approximately one half) are indexed by the search engines. Second, authority: because it is so easy to self-publish on the Internet, a lot of information of low integrity or factual inaccuracy may be mistaken for reliable sources.
Third, the transiency of information found on the Internet must be pointed out. The maxim "here today, gone tomorrow" is appropriate for several Web sites on the Internet. Finally, information professionals must emphasize in their training the proven online databases to which users should go directly, if and when those databases are provided by the library or information center.

Table 2. Nonparametric Correlations: Kendall's Tau-b (N = 4 for every pair; two-tailed significance in parentheses)

            DAILY         SENG          MONTHLY       WEEKLY        ONDB
DAILY       1.000         -.516 (.346)  -.236 (.655)  -.516 (.346)  .236 (.655)
SENG        -.516 (.346)  1.000         .183 (.718)   .400 (.444)   .183 (.718)
MONTHLY     -.236 (.655)  .183 (.718)   1.000         .913 (.071)   .667 (.174)
WEEKLY      -.516 (.346)  .400 (.444)   .913 (.071)   1.000         .548 (.279)
ONDB        .236 (.655)   .183 (.718)   .667 (.174)   .548 (.279)   1.000

Information professionals have a direct link to providing users with guidance to proven online databases, specifically during course-integrated instruction. Education for the end user is paramount to the optimum utilization of electronic information sources. A well-developed information resources instruction program is needed in conjunction with the one-on-one instruction that takes place every day at the reference/information desk. Such instruction programs must be cumulative if they are to be effective in an age of burgeoning choices for end users, who can more and more often choose to be remote users of information resources. In an academic environment, early intervention at the freshman level is paramount, but instruction also must be pursued in a structured manner at the upper levels. Many college and university information resources instruction programs are based on a one-shot, approximately fifty-minute session, which often is executed as an orientation to the library/information center. Such a method of instruction offers no guarantee that further guidance will be sought, either at the behest of a teaching faculty member in the form of course-integrated instruction, or on an individual level at the reference desk. Developing effective ways to integrate information resources instruction into the lives of end users is one of the challenges information professionals face in the new millennium, with an increase in the use of electronic resources found on the Internet.

References and Notes

1. Jon Guice, "Looking Backward and Forward at the Internet," The Information Society 14, no. 3 (July/Sept. 1998): 201-11.
2. G. McMurdo, "The Net by Numbers," Journal of Information Science 22, no. 5 (1996): 1397-411.
3. N. L. Pelzer and others, "Library Use and Information Seeking Behavior of Veterinary Medical Students Revisited in the Electronic Environment," Bulletin of the Medical Library Association 86, no. 3 (July 1998): 346-55.
4. Harry M. Kibirige, "Viewdata," in Encyclopedia of Electrical and Electronics Engineering, vol. 23, ed. G. Webster (New York: John Wiley, 1999), 223-31.
5. Department of Labor, The Secretary's Commission on Achieving Necessary Skills, Skills and Tasks for Jobs (Washington, D.C.: Department of Labor, 1992).
6. Gloria L.
Leckie , "Desperately Seeking Citations: Uncovering Faculty Assumptions about the Undergraduate Search Process," Journal of Academic Librarianship 22, no. 3 (1996): 202-208. 7. Joseph D. Atkinson and Miguel Figueroa, "Information Seeking Behavior of Business Students : A Research Study," The Reference Librarian 58, (1997): 59-73. 8. Deborah Shaw, "Bibliographic Database Searching by Graduate Students in Language and Literature: Search Strategies, Systems Interfaces, and Relevance Judgements," Library & Information Science Research 17, no. 4 (Fall 1995): 327-45 . 9. Richard L. Hart, "Information Gathering among the Faculty of a Comprehensive College : Formality and Globality," Journal of Academic Librarianship 23, no . 1 (Jan. 1997): 21-27. 10. K. L. Curtis and others, "Information-Seeking Behavior of Health Science Faculty: The Impact of New Information Technologies," Bulletin of the Medical Library Association 85, no . 4 (Oct. 1997): 402-10. 11. Candy Schwartz, "Web Search Engines," Journal of the American Society for Information Science 49, no. 11 (Sept. 1998) 973-82. 12. Amelia Kassel, "Internet Power Searching : Finding Pearls in a Zillion Grains of Sand," Information Outlook (Apr . 1999): 28-32. 13. Ibid. 14. Danny Sullivan , "Search Engine Coverage Study Published," Search Engine Watch. Accessed March 11, 2000, www .searchenginewatch.com. / sereport/99 /OS-size.html. 15. Wei Peter He, "What Are They Doing on the Internet?: Study of Information Seeking Behaviors," Internet Reference Services Quarterly 1, no. 1 (1996): 31-51 . 16 INFORMATION TECHNOLOGY AND LIBRARIES I MARCH 2000 10070 ---- Reproduced with permission of the copyright owner. Further reproduction prohibited without permission. Harvesting Information from a Library Data Warehouse Su, Siew-Phek T;Needamangala, Ashwin Information Technology and Libraries; Mar 2000; 19, 1; ProQuest pg. 17 Harvesting Information from a Library Data Warehouse Data warehousing technology has been defined by John Ladley as "a set of methods, techniques, and tools that are leveraged together and used to produce a vehicle that deliv- ers data to end users on an integrated platform. "1 This concept has been applied increasingly by industries world- wide to develop data warehouses for decision support and knowledge discovery. In the academic sector, several uni- versities have developed data warehouses containing the universities'ftnancial, payroll, personnel, budget, and stu- dent data.2 These data warehouses across all industries and academia have met with varying degrees of success. Data warehousing technology and its related issues have been widely discussed and published. 3 Little has been done, however, on the application of this cutting edge technology in the library environment using library data. I Motivation of Project Daniel Boorstin, the former Librarian of Congress, men- tions that "for most of Western history, interpretation has far outrun data." 4 However, he points out "that modem tendency is quite the contrary, as we see data outrun meaning." His insights tie directly to many large organi- zations that long have been rich in data but poor in information and knowledge. Library managers are increasingly finding the importance of obtaining a com- prehensive and integrated view of the library operations and the services it provides. This view is helpful for the purpose of making decisions on the current operations and for their improvement. 
Due to financial and human constraints for library support, library managers increasingly encounter the need to justify everything they do, for example, the library's operation budget. The most frustrating problem they face is knowing that the information needed is available somewhere in the ocean of data but there is no easy way to obtain it. For example, it is not easy to ascertain whether the materials of a certain subject area, which consumed a lot of financial resources for their acquisition and processing, are frequently used (i.e., have a high rate of circulation), seldom used, or not used at all, or whether they satisfy users' needs. As another example, an analysis of the methods of acquisition (firm order vs. approval plan) together with the circulation rate could be used as a factor in deciding the best method of acquiring certain types of material. Such information can play a pivotal role in performing collection development and library management more efficiently and effectively. Unfortunately, the data needed to make these types of decisions are often scattered in different files maintained by a large centralized system, such as NOTIS, that does not provide a general querying facility, or by different file/data management or application systems. This situation makes it very difficult and time-consuming to extract useful information. This is precisely where data warehousing technology comes in.

The goal of this research and development work is to apply data warehousing and data mining technologies in the development of a Library Decision Support System (LDSS) to aid the library management's decision making. The first phase of this work is to establish a data warehouse by importing selected data from separately maintained files presently used in the George A. Smathers Libraries of the University of Florida into a relational database system (Microsoft Access). Data stored in the existing files were extracted, cleansed, aggregated, and transformed into the relational representation suitable for processing by the relational database management system. A graphical user interface (GUI) is developed to allow decision makers to query the data warehouse's contents using either some predefined queries or ad hoc queries. The second phase is to apply data mining techniques on the library data warehouse for knowledge discovery. This paper covers the first phase of this research and development work. Our goal is to develop a general methodology and inexpensive software tools that can be used by different functional units of a library to import data from different data sources and to tailor different warehouses to meet their local decision needs. To meet this objective, we do not have to use a very large centralized database management system to establish a single very large data warehouse to support different uses.

Local Environment

The University of Florida Libraries has a collection of more than two million titles, comprising over three million volumes. It shares a NOTIS-based integrated system with nine other State University System (SUS) libraries for acquiring, processing, circulating, and accessing its collection. All ten SUS libraries are under the consortium umbrella of the Florida Center for Library Automation (FCLA).

Siew-Phek T.
Su (pheksu@mail.uflib.ufl.edu) is Associate Chair of the Central Bibliographic Services Section, Resource Services Department, University of Florida Libraries, and Ashwin Needamangala (nsashwin@grove.ufl.edu) is a graduate student at the Electrical and Computer Engineering Department, University of Florida.

Library Data Sources

The University of Florida Libraries' online database, LUIS, stores a wealth of data, such as bibliographic data (author, title, subject, publisher information), acquisitions data (price, order information, fund assignment), circulation data (charge-out and browse information, withdrawn and inventory information), and owning location data (where an item is shelved). These voluminous data are stored in separate files. The NOTIS system as used by the University of Florida does not provide a general querying facility for accessing data across different files. Extracting any information needed by a decision maker has to be done by writing an application program to access and manipulate these files. This is a tedious task, since many application programs would have to be written to meet the different information needs. The challenge of this project is to develop a general methodology and tools for extracting useful data and metadata from these disjointed files, and to bring them into a warehouse that is maintained by a database management system such as Microsoft Access. The selection of Access and PC hardware for this project is motivated by cost considerations. We envision that multiple special-purpose warehouses can be established on multiple PC systems to provide decision support to different library units.

The Library Decision Support System (LDSS) is developed with the capability of handling and analyzing an established data warehouse. For testing our methodology and software system, we established a warehouse based on twenty thousand monograph titles acquired from our major monograph vendor. These titles were published by domestic U.S. publishers and have a high percentage of DLC/DLC records (titles cataloged by the Library of Congress). They were acquired by firm order and approval plan. The publication coverage is the calendar years 1996-1997. Analysis is only on the first item record (a future project will include all copy holdings). Although the size of the test data used is small, it is sufficient to test our general methodology and the functionality of our software system.

FCLA DB2 Tables and Key List

Most of the data from the twenty-thousand-title domain that go into the LDSS warehouse are obtained from the DB2 tables maintained by FCLA. FCLA developed and maintains the database of a system called Ad Hoc Report Request Over the Web (ARROW) to facilitate querying and generating reports on acquisitions activities. The data are stored in DB2 tables.5

For our research and development purpose, we needed DB2 tables for only the twenty thousand titles that we identified as our initial project domain. These titles all have an identifiable 035 field in the bibliographic records (zybp1996, zybcip1996, zybp1997, or zybpcip1997). We used the BatchBAM program developed by Gary Strawn of Northwestern University Library to extract and list the unique bibliographic record numbers in separate files for FCLA to pick up.6
Using the unique bibliographic record numbers, FCLA extracted the DB2 tables from the ARROW database and exported the data to text files. These text files then were transferred to our system using the file transfer protocol (FTP) and inserted as tables into the LDSS warehouse.

Bibliographic and Item Records Extraction

FCLA collects and stores complete acquisitions data from the order records as DB2 tables. However, only brief bibliographic data and no item record data are available. Bibliographic and item record data are essential for inclusion in the LDSS warehouse in order to create a viable integrated system capable of performing cross-file analysis and querying for the relationships among different types of data. Because these required data do not exist in any computer-readable form, we designed a method to obtain them. Using the identical NOTIS key lists to extract the targeted twenty thousand bibliographic and item records, we applied a screen scraping technique to scrape the data from the screen and saved them in a flat file. We then wrote a program in Microsoft Visual Basic to clean the scraped data and save them as text-delimited files that are suitable for importing into the LDSS warehouse.

Screen Scraping Concept

Screen scraping is a process used to capture data from a host application. It is conventionally a three-part process:

• Displaying the host screen or data to be scraped.
• Finding the data to be captured.
• Capturing the data to a PC or host file, or using it in another Windows application.

In other words, we can capture particular data on the screen by providing the corresponding screen coordinates to the screen scraping program. Numerous commercial applications for screen scraping are available on the market. However, we used an approach slightly different from the conventional one. Although we had to capture only certain fields from the NOTIS screen, there were other factors that we had to take into consideration. They are:

• The location of the various fields with respect to the screen coordinates changes from record to record. This makes it impossible for us to lock a particular field to a corresponding screen coordinate.
• The data present on the screen are dynamic because we are working on a "live" database where data are frequently modified. For accurate query results, all the data, especially the item record data where the circulation transactions are housed, need to be captured within a specified time interval so that the data are uniform. This makes the time taken for capturing the data extremely important.
• Most of the fields present on the screen needed to be captured.

Taking the above factors into consideration, it was decided to capture the entire screen instead of scraping only certain parts of the screen. This made the process both simpler and faster. The unnecessary fields were filtered out during the cleanup process.

System Architecture

The architecture of the LDSS is shown in figure 1 and is followed by a discussion of its components' functions.

NOTIS

NOTIS (Northwestern Online Totally Integrated System) was developed at the Northwestern University Library and introduced in 1970. Since its inception, NOTIS has undergone many versions. The University of Florida Libraries is one of the earliest users of NOTIS.
FCLA has made many local modifications to the NOTIS system since the UF Libraries started using it. As a result, the UF NOTIS is different from the rest of the NOTIS world in many respects. NOTIS can be broken down into four subsystems:

• acquisitions
• cataloging
• circulation
• online public access catalog (OPAC)

At the University of Florida Libraries, the NOTIS system runs on an IBM 370 mainframe computer that runs the OS/390 operating system.

Host Explorer

Host Explorer is a software program that provides a TCP/IP link to the mainframe computer. It is a terminal emulation program supporting IBM mainframe, AS/400, and VAX hosts. Host Explorer delivers an enhanced user environment for all Windows NT platforms, Windows 95, and Windows 3.x desktops. Exact TN3270E, TN5250, VT420/320/220/101/100/52, WYSE 50/60, and ANSI-BBS display is extended to leverage the wealth of the Windows desktop. It also supports all TCP/IP-based TN3270 and TN3270E gateways.

Figure 1. LDSS Architecture and Its Components (NOTIS, Host Explorer, Data Cleansing and Extraction, FCLA DB2 Tables, Warehouse, Graphical User Interface)

The Host Explorer program is used as the terminal emulation program in LDSS. It also provides VBA-compatible BASIC scripting tools for complete desktop macro development. Users can run these macros directly or attach them to keyboard keys, toolbar buttons, and screen hotspots for additional productivity.

The function of Host Explorer in the LDSS is very simple. It has to "visit" all screens in the NOTIS system corresponding to each NOTIS number present in the BatchBAM file and capture all the data on the screens. In order to do this, we wrote a macro that read the NOTIS numbers one at a time from the BatchBAM file and input each number into the command string of Host Explorer. The macro essentially performed the following functions:

• Read the NOTIS numbers from the BatchBAM file.
• Inserted the NOTIS number into the command string of Host Explorer.
• Toggled the Screen Capture option in Host Explorer so that data are scraped from the screen only at necessary times.
• Saved all the scraped data into a flat file.

After the macro has been executed, all the data scraped from the NOTIS screens reside in a flat file.
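In outline, the macro's loop can be sketched as follows. The sketch is in Python rather than Host Explorer's BASIC scripting language, and capture_screen is a hypothetical stand-in for the emulator's display-and-capture step; it is not Host Explorer's actual API.

# Minimal sketch of the capture loop the macro performs, assuming one
# NOTIS number per line in the key list. capture_screen simulates the
# emulator's "display the record, capture the whole screen" step.
SIMULATED_HOST = {
    "AKR9234": 'AKR9234\n010:: |a 96012345\n035:: |a zybp1996 ...\n',
}

def capture_screen(notis_number: str) -> str:
    # Replace this lookup with the emulator's screen-capture call.
    return SIMULATED_HOST.get(notis_number, "")

def scrape_records(key_list_path: str, output_path: str) -> None:
    with open(key_list_path) as keys, open(output_path, "w") as out:
        for line in keys:
            notis_number = line.strip()
            if notis_number:
                out.write(capture_screen(notis_number))  # append raw screen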
The data present in this file have to be cleansed in order to make them suitable for insertion into the Library Warehouse. A Visual Basic program is written to perform this function. The details of this program are given in the next section.

Data Cleansing and Extraction

This component of the LDSS is written in the Visual Basic programming language. Its main function is to cleanse the data that have been scraped from the NOTIS screen. The Visual Basic code saves the cleansed data in a text-delimited format that is recognized by Microsoft Access. This file is then imported into the Library Warehouse maintained by Microsoft Access. The detailed working of the code that performs the cleansing operation is discussed below.

The NOTIS screen that comes up for each NOTIS number has several parts that are critical to the working of the code. They are:

• The NOTIS number present in the top-right of the screen (in this case, AKR9234).
• The field numbers that have to be extracted. Example: 010::, 035::.
• Delimiters. The "|" symbol is used as the delimiter throughout this code. For example, in the 260 field of a bibliographic record, "|a" delimits the place of publication, "|b" the name of the publisher, and "|c" the date of publication.

We shall now go step by step through the cleansing process. Initially we have the flat file containing all the data that have been scraped from the NOTIS screens.

• The entire list of NOTIS numbers from the BatchBAM file is read into an array called Bam_Number$.
• The file containing the data that have been scraped is read into a single string called BibRecord$.
• This string is then parsed using the NOTIS numbers from the Bam_Number$ array.
• We now have a string that contains a single NOTIS record. This string is called Single_Record$.
• The program runs in a loop till all the records have been read.
• Each string is now broken down into several smaller strings based on the field numbers. Each of these smaller strings contains data pertaining to the corresponding field number.
• A considerable amount of the data present on the NOTIS screen is unnecessary from the point of view of our project. We need only certain fields from the NOTIS screen, and even from these fields we need the data only from certain delimiters. Therefore, we now scan each of these smaller strings for a certain set of delimiters, which was predefined for each individual field. The data present in the other delimiters are discarded.
• The data collected from the various fields and their corresponding delimiters are assigned to corresponding variables. Some variables contain data from more than one delimiter concatenated together. The reason for this can be explained as follows. Certain fields are present in the database only for informational purposes and will not be used as criteria fields in any query. Since these fields will never be queried upon, they do not need to be cleansed as rigorously as the other fields, and therefore we can afford to leave their data as concatenated strings. Example: the Catalog_source field, which has data from "|a" and "|c", is left in the form "|a DLC |c DLC", while the Lang_code field, which has data from "|a" and "|h" in the form "|a eng |h rus", is split into two fields: Lang_code_1 containing "eng" and Lang_code_2 containing "rus".
• The data collected from the various fields are saved in a flat file in the text-delimited format. Microsoft Access recognizes this format.

A screen dump of the text-delimited file, which is the end result of the cleansing operation, is shown in figure 2.

Figure 2. A Text-Delimited File (sample rows of cleansed records: record number, system control number, cataloging source, language codes, geographic code, Dewey number, and so on, as comma-separated, quoted values)

The flat file, which we now have, can be imported into the Library Warehouse.
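The same cleanup steps can be pictured compactly. The sketch below is a simplified Python rendering of the Visual Basic logic, not the production program; the field tags and subfield codes retained are illustrative examples, and the production code used a predefined delimiter set per field.

# Simplified sketch of the cleanup: split the scraped flat file into
# records by NOTIS number, keep chosen subfields per field, write CSV.
import csv

WANTED = {"260": ("a", "b", "c"), "035": ("a",)}  # example selections

def subfields(field_text: str, codes) -> list:
    """Pull the text following each |code delimiter from one field line."""
    parts = field_text.split("|")
    found = {p[0]: p[1:].strip() for p in parts[1:] if p.strip()}
    return [found.get(c, "") for c in codes]

def clean(flat_path: str, keys_path: str, out_path: str) -> None:
    text = open(flat_path).read()
    keys = [k.strip() for k in open(keys_path) if k.strip()]
    with open(out_path, "w", newline="") as out:
        writer = csv.writer(out, quoting=csv.QUOTE_ALL)
        for i, key in enumerate(keys):
            start = text.find(key)        # one record runs from its key
            if start == -1:
                continue                  # record not captured; skip
            end = text.find(keys[i + 1], start + 1) if i + 1 < len(keys) else -1
            record = text[start:end] if end != -1 else text[start:]
            row = [key]
            for tag, codes in WANTED.items():
                # keep only the line for this field tag, e.g. "260:: ..."
                field = next((ln for ln in record.splitlines()
                              if ln.lstrip().startswith(tag)), "")
                row.extend(subfields(field, codes))
            writer.writerow(row)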
Graphical User Interface

In order to ease the tasks of the user (i.e., the decision maker) in creating the library warehouse and in querying and analyzing its contents, a graphical user interface tool has been developed. Through the GUI, the user can enact the following processes or operations through a main menu:

• connection to NOTIS
• screen scraping
• data cleansing and extracting
• importing data
• viewing collected data
• querying
• report generating

The first option opens Host Explorer and provides a connection to NOTIS; it provides a shortcut to closing or minimizing LDSS and opening Host Explorer. The Screen Scraping option activates the data scraping process. The Data Cleansing and Extracting option filters out the unnecessary data fields and saves the cleansed data in a text-delimited format. The Importing Data option imports the data in the text-delimited format into the warehouse. The Viewing Collected Data option allows the user to view the contents of a selected relational table stored in the warehouse.
The Querying option activates LDSS's querying facility, which provides wizards to guide the formulation of different types of queries, as discussed later in this article. The last option, Report Generating, is for the user to specify the report to be generated.

Data Mining Tool

A very important component of LDSS is the data mining tool for discovering association rules that specify the interrelationships of data stored in the warehouse. Many data mining tools are now available in the commercial world. For our project, we are investigating the use of a neural-network-based data mining tool developed by Limin Fu of the University of Florida.7 The tool allows the discovery of association rules based on a set of training data provided to the tool. This part of our research and development work is still in progress. The existing GUI and report generation facilities will be expanded to include the use of this mining tool.
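The mining component itself (Fu's neural-network-based tool) is not reproduced here. As a plain illustration of what an association rule over the warehouse would assert, the sketch below computes the support and confidence of one hypothetical rule from made-up rows; it is not the tool's algorithm.

# Illustration only: support and confidence of the hypothetical rule
# "titles acquired on approval tend to circulate," over made-up rows.
rows = [
    {"method": "approval", "circulated": True},
    {"method": "approval", "circulated": True},
    {"method": "firm",     "circulated": False},
    {"method": "firm",     "circulated": True},
]

antecedent = [r for r in rows if r["method"] == "approval"]
both = [r for r in antecedent if r["circulated"]]

support = len(both) / len(rows)           # rule holds in this share of rows
confidence = len(both) / len(antecedent)  # holds when the antecedent does
print(f"approval -> circulated: support={support:.2f}, "
      f"confidence={confidence:.2f}")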
Library Warehouse

FCLA exports the data existing in the DB2 tables into text files. As a first step toward creating the database, these text files are transferred using FTP and form separate relational tables in the Library Warehouse. The data that are scraped from the bibliographic and item record screens result in the formation of two more tables.

Characteristics

Data in the warehouse are snapshots of the original data files. Only a subset of the data contents in these files is extracted for querying and analysis, since not all the data are useful for a particular decision-making situation. Data are filtered as they pass from the operational environment to the data warehouse environment. This filtering process is necessary particularly when a PC system, which has limited secondary storage and main memory space, is used. Once extracted and stored in the warehouse, data are not updateable. They form a read-only database. However, different snapshots of the original files can be imported into the warehouse for querying and analysis. The results of the analyses of different snapshots can then be compared.

Structure

Data warehouses have a distinct structure. There are summarization and detail structures that demarcate a data warehouse. The structure of the Library Data Warehouse is shown in figure 3.

Figure 3. Structure of the Library Data Warehouse (NOTIS screen scraping and FCLA DB2 table import feed the warehouse tables Ufbib, Ufpay, Ufinv, Ufcirc, and Uford; bibliographic, circulation, and pay data views serve the user)

The different components of the Library Data Warehouse as shown in figure 3 are:

• NOTIS and DB2 Tables. Bibliographic and circulation data are obtained from NOTIS through the screen scraping process and imported into the warehouse. FCLA maintains acquisitions data in the form of DB2 tables. These are also imported into the warehouse after conversion to a suitable format.
• Warehouse. The warehouse consists of several relational tables that are connected by means of relationships. The universal relation approach could have been used to implement the warehouse by using a single table. The argument for using the universal relation approach would be that all the collected data fall under the same domain. But let us examine why this approach would not have been suitable. The different data collected for import into the warehouse were bibliographic data, circulation data, order data, and pay data. Now, if all these data were incorporated into one single table with many attributes, it would not be of any exceptional use, since each set of attributes has its own unique meaning when grouped together as a bibliographic table, circulation table, and so on. For example, if we grouped the circulation data and the pay data together in a single table, it would not make sense. However, the pay data and the circulation data are related through the Bib_key. Hence, our use of the conventional approach of having several tables connected by means of relationships is more appropriate.
• Views. A view in SQL terminology is a single table that is derived from other tables. These other tables could be base tables or previously defined views. A view does not necessarily exist in physical form; it is considered a virtual table, in contrast to base tables, whose contents are actually stored in the database. In the context of the LDSS, views can be implemented by means of the Ad Hoc Query Wizard. The user can define a query/view using the wizard and save it for future use. The user can then define a query on this query/view.
• Summarization. The process of implementing views falls under the process of summarization. Summarization provides the user with views, which make it easier for users to query the data of their interest.

As explained above, the specific warehouse we established consists of five tables. A table name including "_WH" indicates that the table contains current detailed data of the warehouse. Current detailed data represent the most recent snapshot of data that has been taken from the NOTIS system. The summarized views are derived from the current detailed data of the warehouse. Since current detailed data of the warehouse are the basic data of the application, only the current detailed data tables are shown in appendix A.
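The import step and the view mechanism can be pictured with a small, self-contained sketch. Microsoft Access itself is not scriptable here, so the sketch below uses Python's sqlite3 module instead; the table and column names are borrowed from appendix A, but the snapshot file layout and the view definition are assumptions for illustration.

# Sketch: load a text-delimited snapshot as a base table, then define
# a summarizing view over it. Not the project's Access database.
import csv
import sqlite3

con = sqlite3.connect("ldss.db")
con.execute("CREATE TABLE IF NOT EXISTS UFCIRC_WH "
            "(Bib_key TEXT, Charges INTEGER)")

with open("ufcirc.txt", newline="") as f:   # text-delimited snapshot
    rows = [(r[0], int(r[1])) for r in csv.reader(f)]
con.executemany("INSERT INTO UFCIRC_WH VALUES (?, ?)", rows)

# A "view" in the article's sense: a saved query the user can requery.
con.execute("""
    CREATE VIEW IF NOT EXISTS Circulated AS
    SELECT Bib_key FROM UFCIRC_WH WHERE Charges > 0
""")
print(con.execute("SELECT COUNT(*) FROM Circulated").fetchone()[0])
con.commit()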
Decision Support by Querying the Warehouse

The warehouse contains a set of integrated relational tables whose contents are linked by the common primary key, the Bib_key (Biblio_key). The data stored across these tables can be traversed by matching the key values associated with their tuples or records. Decision makers can issue all sorts of SQL-type queries to retrieve useful information from the warehouse. Two general types of queries can be distinguished: predefined queries and ad hoc queries. The former type refers to queries that are frequently used by decision makers for accessing information from different snapshots of data imported into the warehouse. The latter type refers to queries that are exploratory in nature: a decision maker suspects that there is some relationship between different types of data and issues a query to verify the existence of such a relationship. Alternatively, data mining tools can be applied to analyze the data contents of the warehouse and discover rules of their relationships (or associations).

Predefined Queries

Below are some sample queries posed in English. Their corresponding SQL queries can be processed using LDSS.

1. Number and percentage of approval titles circulated and noncirculated.
2. Number and percentage of firm order titles circulated and noncirculated.
3. Amount of financial resources spent on acquiring noncirculated titles.
4. Number and percentage of DLC/DLC cataloging records in circulated and noncirculated titles.
5. Number and percentage of "shared" cataloging records in circulated and noncirculated titles.
6. Numbers of original and "shared" cataloging records of noncirculated titles.
7. Identify the broad subject areas of circulated and noncirculated titles.
8. Identify titles that have been circulated "n" number of times, by subject.
9. Number of circulated titles without the 505 field.

Each of the above English queries can be realized by a number of SQL queries. We shall use the first two English queries and their corresponding SQL queries to explain how the data warehouse contents and the querying facility of Microsoft Access can be used to support decision making. The results of the SQL queries also are given. The first English query can be divided into two parts, each realized by a number of SQL queries, as shown in figure 4.

Figure 4. Example of an English Query Divided into Two Parts

Approval Titles Circulated

SQL query to retrieve the distinct bibliographic keys of all the approval titles:

SELECT DISTINCT BibScreen.Bib_key
FROM BibScreen RIGHT JOIN pay1 ON BibScreen.Bib_key = pay1.BIB_NUM
WHERE (((pay1.FUND_KEY) Like "*07*"));

SQL query to count the number of approval titles that have been circulated:

SELECT Count(Appr_Title.Bib_key) AS CountOfBib_key
FROM (BibScreen INNER JOIN Appr_Title ON BibScreen.Bib_key = Appr_Title.Bib_key)
INNER JOIN ItemScreen ON BibScreen.Bib_key = ItemScreen.Biblio_key
WHERE (((ItemScreen.CHARGES)>0))
ORDER BY Count(Appr_Title.Bib_key);

SQL query to calculate the percentage:

SELECT Cnt_Appr_Title_Circ.CountOfBib_key,
Int(([Cnt_Appr_Title_Circ]![CountOfBib_key])*100/Count([BibScreen]![Bib_key])) AS Percent_appr_circ
FROM BibScreen, Cnt_Appr_Title_Circ
GROUP BY Cnt_Appr_Title_Circ.CountOfBib_key;

Approval Titles Noncirculated

SQL query for counting the number of approval titles that have not been circulated:

SELECT DISTINCT Count(Appr_Title.Bib_key) AS CountOfBib_key
FROM (Appr_Title INNER JOIN BibScreen ON Appr_Title.Bib_key = BibScreen.Bib_key)
INNER JOIN ItemScreen ON BibScreen.Bib_key = ItemScreen.Biblio_key
WHERE (((ItemScreen.CHARGES)=0));

SQL query to calculate the percentage:

SELECT Cnt_Appr_Title_Noncirc.CountOfBib_key,
Int(([Cnt_Appr_Title_Noncirc]![CountOfBib_key])*100/Count([BibScreen]![Bib_key])) AS Percent_appr_noncirc
FROM BibScreen, Cnt_Appr_Title_Noncirc
GROUP BY Cnt_Appr_Title_Noncirc.CountOfBib_key;

Sample Query Outputs

Query 1: Number and percentage of approval titles circulated and noncirculated.

Result:
Total approval titles    1172
Circulated                980    83.76%
Noncirculated             192    16.24%

Similar to the above SQL queries, we can translate the second English query into a number of SQL queries; the result is given below.

Query 2: Number and percentage of firm order titles circulated and noncirculated.

Result:
Total firm order titles    1829
Circulated                 1302    71.18%
Noncirculated               527    28.82%

Report Generation

The results of the two predefined English queries can be presented to users in the form of a report:

Total titles                  3001
  Approval            1172     39%
    Circulated         980     83.76%
    Noncirculated      192     16.24%
  Firm Order          1829     61%
    Circulated        1302     71.18%
    Noncirculated      527     28.82%

From the above report, we can ascertain that, though 39 percent of the titles were purchased through the approval plan and 61 percent through firm orders, the approval titles have a higher rate of circulation, 83.76 percent, as compared to 71.18 percent for firm order titles. It is important to note that the result of the above queries is taken from only one snapshot of the circulation data. Analysis from several snapshots is needed in order to compare the results and arrive at reliable information.

We now present a report on the financial resources spent on acquiring and processing noncirculated titles. In order to generate this report, we need the output of queries four and five listed earlier in this article. The corresponding outputs are shown below.

Query 4: Number and percentage of DLC/DLC cataloging records in circulated and noncirculated titles.

Result:
Total DLC/DLC records    2852
Circulated               2179    76.40%
Noncirculated             673    23.60%

Query 5: Number and percentage of "shared" cataloging records in circulated and noncirculated titles.

Result:
Total "shared" records    149
Circulated                100    67.11%
Noncirculated              49    32.89%

In order to come up with the financial resources, we need to consider several factors that contribute to the amount of financial resources spent. For the sake of simplicity, we consider only the following factors:

1. the cost of cataloging each item with a DLC/DLC record
2. the cost of cataloging each item with a shared record
3. the average price of noncirculated books
4. the average pages of noncirculated books
5. the value of shelf space per centimeter

Because the values of the above factors differ from institution to institution and might change with more efficient workflow and better equipment, users are required to fill in the values for factors 1, 2, and 5. LDSS can compute factors 3 and 4. The financial report, taking into consideration the values of the above factors, could be as shown below.

Processing cost of each DLC title = $10.00
  673 x $10.00 = $6,730.00
Processing cost of each shared title = $20.00
  49 x $20.00 = $980.00
Average price paid per noncirculated item = $48.00
  722 x $48.00 = $34,656.00 (722 = 673 DLC + 49 shared noncirculated titles)
Average size of book = 288 pages = 3 cm of shelf space
Average cost of 1 cm of shelf space = $0.10
  722 x $0.30 = $216.60
Grand Total = $42,582.60

Again it is important to point out that several snapshots of the circulation data have to be taken to track and compare the different analyses before deriving reliable information.

Ad Hoc Queries

Alternatively, if the user wishes to issue a query that has not been predefined, the Ad Hoc Query Wizard can be used. The following example illustrates its use. Assume the sample query is: How many circulated titles in the English subject area cost more than $35?

We now take you on a walk-through of the Ad Hoc Query Wizard, starting from the first step till the output is obtained. Figure 4 depicts Step 1 of the Ad Hoc Query Wizard. The sample query mentioned above requires the following fields:

• Biblio_key, for a count of all the titles that satisfy the given condition.
• Charges, to specify the criterion of "circulated title."
• Fund_Key, to specify all titles under the "English" subject area.
• Paid_Amt, to specify all titles that cost more than $35.
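The wizard ultimately displays its result as a saved SQL statement (Step 4 in the walk-through below), which the article shows only as a screen shot. The statement here is therefore a guessed reconstruction of the query's shape rather than the system's actual output: the table and column names follow the figure 4 queries and the field list above, and the "*ENG*" fund-key pattern is an assumption about how English-subject funds were coded.

# Guessed reconstruction of the wizard-generated Access-style SQL for
# "How many circulated titles in the English subject area cost more
# than $35?" -- names and the fund pattern are assumptions.
ADHOC_SQL = """
SELECT Count(ItemScreen.Biblio_key) AS CountOfBiblio_key
FROM ItemScreen INNER JOIN pay1
     ON ItemScreen.Biblio_key = pay1.BIB_NUM
WHERE ItemScreen.CHARGES > 0
  AND pay1.FUND_KEY Like "*ENG*"
  AND pay1.PAID_AMT > 35;
"""
print(ADHOC_SQL)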
Step 2 of the Ad Hoc Query Wizard (figure 5) allows the user to specify criteria and thereby narrow the search domain. Step 3 (figure 6) allows the user to specify any mathematical operations or aggregation functions to be performed. Step 4 (figure 7) displays the user-defined query in SQL form and allows the user to save the query for future reuse. The output of the query is shown in figure 8: the number of circulated titles in the English subject area that cost more than $35. Alternatively, the user might wish to obtain a listing of these 33 titles. Figure 9 shows the listing.

Figure 4. Step 1: Ad Hoc Query Wizard
Figure 5. Step 2: Ad Hoc Query Wizard (criteria operators with examples: Like; Less than, e.g. Charges < 4; Greater than, e.g. Charges > 0; Equal to, e.g. Charges = 1; Not)
Figure 6. Step Three: Ad Hoc Query Wizard
Figure 7. Step Four: Ad Hoc Query Wizard
Figure 8. Query Output
Figure 9. Listing of Query Output

Conclusion

In this article, we presented the design and development of a library decision support system based on data warehousing and data mining concepts and techniques. We described the functions of the components of LDSS. The screen scraping and data cleansing and extraction processes were described in detail. The process of importing data stored in LUIS as separate data files into the library data warehouse was also described. The data contents of the warehouse can provide a very rich information source to aid the library management in decision making. Using the implemented system, a decision maker can use the GUI to establish the warehouse and to activate the querying facility provided by Microsoft Access to explore the warehouse contents. Many types of queries can be formulated and issued against the database. Experimental results indicate that the system is effective and can provide pertinent information for aiding the library management in making decisions. We have fully tested the implemented system using a small sample database. Our ongoing work includes the expansion of the database size and the inclusion of a data mining component for association rule discovery. Extensions of the existing GUI and report generation facilities to accommodate data mining needs are expected.

Acknowledgments

We would like to thank Professor Stanley Su for his support and advice on the technical aspects of this project. We would also like to thank Donna Alsbury for providing us with the DB2 data, Daniel Cromwell for loading the DB2 files, and, along with them, Nancy Williams and Tim Hartigan for their helpful comments and valuable discussions on this project.

References and Notes

1. John Ladley, "Operational Data Stores: Building an Effective Strategy," Data Warehouse: Practical Advice from the Experts (Englewood Cliffs, N.J.: Prentice Hall, 1997).
2. Information on Harvard University's ADAPT project. Accessed March 8, 2000, www.adapt.harvard.edu/; Information on the Arizona State University Data Administration and Institutional Analysis warehouse. Accessed March 8, 2000, www.asu.edu/Data_Admin/WH-1.html; Information on the University of Minnesota CLARITY project.
Accessed March 8, 2000, www.clarity.umn.edu/; Information on the UC San Diego DARWIN project. Accessed March 8, 2000, www.act.ucsd.edu/dw/darwin.html; Information on University of Wisconsin-Madison InfoAccess. Accessed March 8, 2000, http://wiscinfo.doit.wisc.edu/infoaccess/; Information on the University of Nebraska Data Warehouse-nulook. Accessed March 8, 2000, www.nulook.uneb.edu/.
3. Ramon Barquin and Herbert Edelstein, eds., Building, Using, and Managing the Data Warehouse (Englewood Cliffs, N.J.: Prentice Hall, 1997); Ramon Barquin and Herbert Edelstein, eds., Planning and Designing the Data Warehouse (Upper Saddle River, N.J.: Prentice Hall, 1996); Joyce Bischoff and Ted Alexander, Data Warehouse: Practical Advice from the Experts (Englewood Cliffs, N.J.: Prentice Hall, 1997); Jeff Byard and Donovan Schneider, "The Ins and Outs (and Everything in Between) of Data Warehousing," ACM SIGMOD 1996 Tutorial Notes, May 1996. Accessed March 8, 2000, www.redbrick.com/products/white/pdf/sigmod96.pdf; Surajit Chaudhuri and Umesh Dayal, "An Overview of Data Warehousing and OLAP Technology," ACM SIGMOD Record 26(1), March 1997. Accessed March 8, 2000, www.acm.org/sigmod/record/issues/9703/chaudhuri.ps; B. Devlin, Data Warehouse: From Architecture to Implementation (Reading, Mass.: Addison-Wesley, 1997); U. Fayyad and others, eds., Advances in Knowledge Discovery and Data Mining (Cambridge, Mass.: The MIT Pr., 1996); Joachim Hammer, "Data Warehousing Overview, Terminology, and Research Issues." Accessed March 8, 2000, www.cise.ufl.edu/~jhammer/classes/wh-seminar/Overview/index.htm; W. H. Inmon, Building the Data Warehouse (New York, N.Y.: John Wiley, 1996); Ralph Kimball, "Dangerous Preconceptions." Accessed March 8, 2000, www.dbmsmag.com/9608d05.html; Ralph Kimball, The Data Warehouse Toolkit (New York, N.Y.: John Wiley, 1996); Ralph Kimball, "Mastering Data Extraction," DBMS Magazine, June 1996 (provides an overview of the process of extracting, cleaning, and loading data). Accessed March 8, 2000, www.dbmsmag.com/9606d05.html; Alberto Mendelzon, "Bibliography on Data Warehousing and OLAP." Accessed March 8, 2000, www.cs.toronto.edu/~mendel/dwbib.html.
4. Daniel J. Boorstin, "The Age of Negative Discovery," Cleopatra's Nose: Essays on the Unexpected (New York: Random House, 1994).
5. Information on the ARROW system. Accessed March 8, 2000, www.fcla.edu/system/intro_arrow.html.
6. Gary Strawn, "BatchBAMing." Accessed March 8, 2000, http://web.uflib.ufl.edu/rs/rsd/batchbam.html.
7. Li-Min Fu, "DOMRUL: Learning the Domain Rules." Accessed March 8, 2000, www.cise.ufl.edu/~fu/domrul.html.
Appendix A
Warehouse Data Tables

UFCIRC_WH (attribute: domain)
Bib_key: Text(50); Status: Text(20); Enum/Chron: Text(20); MidSpine: Text(20); Temp_Locatn: Text(20); Pieces: Number; Charges: Number; Last_Use: Date/Time; Browses: Number; Value: Text(20); Invnt_Date: Date/Time; Created: Date/Time.

UFORD_WH (attribute: domain)
Id: AutoNumber; Ord_num: Text(20); Ord_Div: Number; Process_Unit: Text(20); Bib_num: Text(20); Order_date: Date/Time; Mod_Date: Date/Time; Vendor_Code: Text(20); VndAdr_Order: Text(20); VndAdr_Claim: Text(20); VndAdr_Return: Text(20); Vend_Title_Num: Text(20); Ord_Unit: Text(20); Rcv_Unit: Text(20); Ord_Scope: Text(20); Pur_Ord_prod: Text(20); Action_Int: Number; LibSpec1: Text(20); LibSpec2: Text(20); Vend_Note: Text(20); Ord_Note: Text(20); Vend_Code: Text(20); Action_Date: Text(20); CopyCtl_Num: Number; Medium: Text(20); Piece_Cnt: Number; Div_Note: Text(20); Acr_Stat: Text(20); Rel_Stat: Text(20); Lst_Date: Date/Time; LibSpec3: Text(20); LibSpec4: Text(20); Encumb_Units: Number; Currency: Text(20); Est_Price: Number; Encumb_outs: Number; Fund_key: Text(20); Fiscal_Year: Text(20); Copies: Number; Xpay_Method: Text(20); Vol_Isu_Date: Text(20); Title_Author: Text(20); Series: Text(20); DB2_Timestamp: Date/Time.

UFPAY_WH (attribute: domain)
Inv_key: Text(20); Ord_num: Text(20); Ord_div: Number; Process_Unit: Text(20); Bib_key: Text(20); Ord_Seq_Num: Number; Inv_Seq_Num: Number; Status: Text(20); Create_Date: Date/Time; Lst_update: Date/Time; Currency: Text(20); Paid_amt: Number; USD_amt: Number; Fund_Key: Text(20); Exp_class: Text(20); Fiscal_year: Text(20); Copies: Number; Type_pay: Text(10); Text: Text(20); DB2_TimeStamp: Date/Time.

UFINV_WH (attribute: domain)
Inv_Key: Text(20); Create_Date: Date/Time; Mod_Date: Date/Time; Approv_Stat: Text(20); Source: Text(20); Vend_Adr_Code: Text(20); Ref: Text(20); Vend_Inv_Date: Date/Time; Approval_Date: Date/Time; Approver_Id: Text(20); Vend_Inv_Num: Text(20); Inv_Tot: Number; Calc_Tot_Pymts: Number; Calc_Net_Tot_Pymts: Number; Currency: Text(20); Discount_Percent: Number; Vouch_Note: Text(20); Official_Vend: Text(20); Process_Unit: Text(20); Internal_Note: Text(20); DB2_Timestamp: Text(20).

UFBIB_WH (attribute: domain)
Bib_key: Text(20); System_Control_Num: Text(50); Catalog_Source: Text(20); Lang_Code_1: Text(20); Lang_Code_2: Text(20); Geo_Code: Text(20); Dewey_Num: Text(20); Edition: Text(20); Pagination: Text(20); Size: Text(20); Series_440: Text(20); Series_490: Text(20); Content: Text(20); Subject_1: Text(20); Subject_2: Text(20); Subject_3: Text(20); Authors_1: Text(20); Authors_2: Text(20); Authors_3: Text(20).

10071 ----

Communications

That's My Bailiwick: A Library-Sponsored Faculty Research Web Server

Paul A. Soderdahl and Carol Ann Hughes

The University of Iowa Libraries provide a unique, new scholarly publishing outlet for their faculty and graduate students. With the prevalence of personal faculty home pages and course Web sites in just about every department on campus, it's not very hard for faculty to find a Web server somewhere for storing an HTML file. And, with some work, faculty can often find some "techie" to help convert a document to HTML or to save a list of links.
What is rare, however, is a space on the Web where faculty from all disciplines can find a home for their scholarly research interests, coupled with a computing environment and a knowledgeable staff to help them "follow their bliss" in digital form. The Information Arcade's new Bailiwick project does just that.

The Need for Something New

For a number of years, academic departments in the humanities and social sciences have been able to mount departmental information on the University of Iowa's central Web server maintained by academic computing. More recently, two centrally administered course Web servers have been made available to any faculty member or teaching assistant offering a credit course. Based on feedback from faculty and graduate students, however, the university libraries learned that there was no place for a research idea or other academically oriented "pet project" to be published on the Web. Instead, faculty and students needed to bury these somewhere on a personal home page, often with a commercial Internet Service Provider at their own expense.

Rising to address this need, the university libraries sought to provide a well-respected, institutionally supported Web server for just this sort of electronic publishing endeavor. What originally started as simply a "projects" directory on the library's general Web server has now grown into the Bailiwick project. Officially launched in March 1998, Bailiwick provides a space on the Web where academic passions can be realized as highly specialized and creative Web sites. It is not simply a place for personal home pages, nor is it intended for course Web sites or academic departmental information. Rather, Bailiwick is designed to provide faculty, staff, and graduate students with Web space where they can focus on a particular area of scholarly interest.

Bailiwick is not meant to serve as the new model for scholarly publishing in peer-reviewed journals. Most electronic publishing initiatives arise from an attempt to transfer existing models of print publishing to the digital environment. A small number of electronic scholarly journals are currently published on the University of Iowa campus, and the university libraries already provide a number of ways to support this medium, from archiving to cataloging to hosting journal sites, as one element of the university libraries' new Scholarly Digital Resources Center. Bailiwick, instead, provides a Web space that allows authors to harness and exploit this new electronic medium, permitting new models of expression with multimedia, hypertext, and the ability to incorporate anything in digital form. It is not intended to substitute for or even compete with traditional scholarly publishing or electronic journal publishing. Rather, Bailiwick provides an opportunity to engage in an entirely new medium for scholarly communication.

A History of Innovation

The heart of the Bailiwick project within the library environment is the Information Arcade, an award-winning facility located in the University of Iowa's Main Library. Opened in 1992, the Information Arcade is a place that provides access to published electronic information resources coupled with state-of-the-art multimedia development workstations that allow faculty and students to digitize and manipulate source materials that are not already in electronic form.
The facility also houses a fully networked electronic classroom, with twenty-four student workstations, where classes from throughout the university are held, some for the whole term and others for one or two class sessions.

In support of its unique service mission "to facilitate the integration of new technologies into teaching, learning, and research," the Information Arcade is well regarded as a place for innovation and risk-taking on the University of Iowa campus. It is a place where ideas can be fleshed out; a place that can respond to the real technology needs presented by faculty and students. When the Information Arcade first opened, it was the only fully wired electronic classroom on campus, with a workstation at every student's desk. It was the only publicly accessible facility on campus where any faculty member or student could create digital video on a drop-in basis. It was the only computer facility on campus where anyone could access the Internet for free. All of these innovations are now mainstays on campus. In 1998 the Information Arcade expanded its offerings with three new innovative Web-based services.

The MOO Project

This text-based virtual reality campus for the University of Iowa community is made possible through the magic of MOO, a piece of software that creates a networked environment on the Internet that is part e-mail, part chat-room, and part programming interface. Known collectively as "The Mediatrix," this educational MOO currently houses two distinct academic projects. The Scholar's Web Project, devoted to the possibilities of digital communication in graduate education, makes its MOO home in "The Cave." The MOOniversity Project, which strives to provide a virtual undergraduate learning environment that encourages collaboration across campuses and disciplines, is located in "The MOOniversity." Coadministered by D. Diane Davis, assistant professor of rhetoric in the rhetoric department, and Michael Calvin McGee, a professor of rhetoric in the communications studies department, the Mediatrix is available to any faculty member wishing to make use of either of them for teaching and research.

The Streaming Video Project

With text-based virtual reality at one end of the spectrum, the Information Arcade simultaneously launched a new streaming video server to meet high-end multimedia needs for delivering real-time motion video and audio over the Internet. With a fifty-user license to Real Networks' Real Server, the Information Arcade now provides students and faculty with the ability to serve digital movie files to several locations simultaneously. Because of the streaming quality of the video files, users do not need to wait for an entire file to download before playing it. Already used by Bob Boynton, professor of political science, for his Multimedia Politics class, the streaming video server provides a delivery mechanism for the digital videos created by students and faculty at the Information Arcade's multimedia development stations.

Paul A. Soderdahl (paul-soderdahl@uiowa.edu) is Head of Information Arcade, and Carol Ann Hughes (carol-hughes@uiowa.edu) is Head of Information, Research, and Instructional Services at the University of Iowa Libraries.
The Bailiwick Project

By linking new modes of communication and providing an outlet for any number of innovative scholarly projects, the Bailiwick server has become a home for research projects, complementing the university libraries' course Web server. Space is available on this research Web server to any University of Iowa faculty, staff, or graduate student developing a scholarly academic Web site or Web-based tool that might be experimental in nature.

Open by simple proposal, Bailiwick runs on a dedicated Web server within the library and is supported by the university libraries' Web server infrastructure. Content providers retain editorial control and freedom, and have the ability to define their topic of interest, identify the target audience, and design a customized Web site. Each bailiwick is initially limited to 5MB of space, with the ability to petition for more based on specific needs for a given project. In addition to disk space, authors can turn to library staff at the Information Arcade for consultation on site design, graphics and layout, technical support, and training.

An individual bailiwick might:

• serve as a home page for artistic expression and collaboration among artists working in Iowa and other states;
• be a showcase for digitally produced art that incorporates interactivity meant to be viewed on a computer screen;
• provide a natural home for hypertext experiments that explore new forms of multilinear argument or open-system documents that welcome, even depend on, links to other Web sites to expand or counter those arguments;
• host a site not full of bells and whistles, but simply a collection of narrowly focused pages of links to resources on a given topic; and
• offer an electronic publishing medium for delivery of specialized bibliographies or digital reproductions of rare documents.

There are currently eleven bailiwicks in production, with another eight in development. The authors of bailiwicks represent thirteen different academic departments, including communication studies, political science, athletics administration, and theatre arts. They range in rank from teaching and research assistants to full professors.

Sample Bailiwicks

Currently, developed bailiwicks fall into one of four categories: (1) a collection of Internet links on a specialized topic of study, ranging from a small set of links on a particular page to an annotated Internet bibliography of thousands of links; (2) a hypertextual or multimedia essay or thesis that necessitates publishing in this medium; (3) a scholarly research project that is dynamic or updated with such frequency that print publishing would be ineffective, including, for example, ongoing findings from a research study; or (4) a collaborative project that makes use of a shared electronically accessible work space.

Figure 1. Karla Tonella's Award-Winning "Border Crossings" Bailiwick

The Internet Bibliography

Karla Tonella, a graduate student in mass communication, has authored three different bailiwick sites that loosely fall into the category of Internet bibliography.
As a former graduate assistant and Information Arcade staff member, Tonella first identified the need for this sort of publishing medium on campus and articulated the concept of the Bailiwick project. She was instrumental in bringing the server to fruition and quickly adopted it as a home for two comprehensive and award-winning sites of Internet resources in her areas of expertise: "Women's Studies Online" and "Journalism and Mass Media." Both of these sites have been given widespread praise in those subject areas and have helped bring attention to the Bailiwick project, both on campus and around the country.

Her "Border Crossings" site (see figure 1) also relies on Internet links as its core content, but it is experimental in design and published in a way that is intended to "encourage the browsing readers to consider the areas of their postmodern world where traditional boundaries are being renegotiated and blurred." The site explores the notion of "border crossings" from a number of different perspectives. "Border Crossings" has received numerous citations in the mainstream press, including a sidebar in the Chronicle of Higher Education, inclusion in Britannica Online's catalog of recommended sites, and a feature article in Search, a monthly newsletter for advanced graduate students published by Northeastern University in Boston.

The Multimedia Essay

The most popular use for Bailiwick thus far has been for publishing multimedia essays. The Information Arcade itself has been a proponent of the multimedia essay since it first opened in 1992, and most semester-long courses now held in the Information Arcade's electronic classroom incorporate some sort of multimedia term paper as part of the course requirements. The Information Arcade is one of the leaders on campus in the adoption of electronic theses and dissertations, working closely with the graduate college and academic computing on a pilot project this semester. It is not surprising, then, that faculty members and graduate students are turning to Bailiwick as a medium for publishing these sorts of materials.

Michael Calvin McGee, professor of communication studies, has published his essay, "Suffix it to Say that Reality is at Issue," as a bailiwick (see figure 2). Jennifer Lawrence-Gentry, a Ph.D. candidate also in communication studies, created a comprehensive site on the work of Mikhail Bakhtin, which is now seen as one of the most complete online resources on Bakhtin. Patrick Muller, a teaching assistant in preventive and community dentistry, developed a bailiwick essay titled, "Complexity Studies: The Fluid Multifaceted Nature of Knowledge." The sites are all very different in design, target audience, and perhaps even scholarly value. Nonetheless, Bailiwick provides an ideal way for the University of Iowa to support this sort of experimental multimedia publishing outside the rubric of a class assignment for a multimedia term paper or a more traditional electronic scholarly publishing environment.

Figure 2. Professor Michael Calvin McGee's Essay Published as a Bailiwick

The Scholarly Research Project

Aside from the hypertextual and multimedia aspects of publishing on the Web, the most unique advantage to the Web for publishing scholarly research is the ability to maintain currency on a published project. The most well-developed example of this is a bailiwick on gender equity in sports (see figure 3), sponsored by the women's intercollegiate athletics department. The site monitors the current state of affairs of gender equity in intercollegiate and interscholastic sport, and tracks Title IX compliance and pending Title IX litigation at colleges and universities. This resource has received significant national attention and acts as a research tool in and of itself that is published out of the University of Iowa Libraries and now available for students and scholars across the country.

Figure 3. Bailiwick on Gender Equity in Sports Sponsored by the Women's Intercollegiate Athletics Department

Another example is the Dogon bailiwick, published by Chris Culy, associate professor of linguistics. Marcel Kervran, a member of the congregation of Catholic missionaries known as Peres Blancs, who lived in the town of Bandiagara, Mali for about thirty years, compiled this dictionary of the Dogon language. The dictionary has more than seven thousand head words. A second expanded edition was published in 1993. Partially representing the varieties of Dogon spoken in and around Bandiagara, the dictionary is being expanded from its earlier HyperCard format, and it may soon be ported into an SGML environment. It is an excellent example of an academic tool that would be difficult to create and deliver in paper form.

The Collaborative Work Space

The Bailiwick server provides a way for researchers at the University of Iowa to work collaboratively and in a public forum with colleagues at other institutions. This collaborative space can be used as a way to gather research data, or to allow others to comment on or contribute to the development of a site. Barbara Bianchi, a graduate student in counselor education and an art therapist, has established a bailiwick for Global Connections, a set of online art and notes from travel journals. One component of the Global Connections site, called "Russia Revisited," includes materials from a number of contributing artists and students in Russia, who are jointly working together to create a collaborative artistic travel journal.

International collaboration is being tested in another project as well. With grant funding, two scholars, one at the University of Iowa and one in Germany, are working with University of Iowa Libraries staff to create a new academic resource consisting of a Web-searchable critical edition of the work of Ingeborg Bachmann. This bailiwick will eventually contain bibliographies, a hypertext archive of materials not yet published in any form relating to Bachmann's life, work, and cultural context, and a searchable corpus of commentaries and translations. An advisory group for the project consisting of additional international scholars has already been named to oversee the development of content. As it grows, this bailiwick will result in an unprecedented resource for scholars from many disciplines. It presents a new model for the development of academic Web sites that not only reflect serious study but actually nurture the creation of new, international scholarship. Other Bailiwick proposals are also candidates for outside funding and can follow this exciting lead.

Policies Regarding Bailiwick Sites

Bailiwick sites run the gamut in subject area, nature, and scope. No attempt is made to centrally control the content of someone's site. After all, it is their bailiwick and they have complete editorial freedom. On the other hand, there are certain guidelines in place for establishing a bailiwick to maintain the focus of the project as an innovative research Web server.

First, the site is not intended to be a space for student class assignments. Short-term projects intended to meet course requirements can be accommodated currently on one of the university's centrally administered course Web servers. In addition, the site is not meant to be a place to mount a personal home page or even a student's career portfolio. This type of activity can be better accommodated on a student's personal account through academic computing or through a commercial Internet Service Provider. Sites that are commercial in nature are refused, as are sites that are completely divorced from the University's mission. Content providers need to abide by the University's Acceptable Use Policy, which identifies inappropriate uses of information technology resources on campus, such as hacking, forgery, inserting viruses, violating intellectual property rights and software licenses, interfering with others' access to information technology resources, or personal campaigning, lobbying, or commercial activities.

These modest restrictions notwithstanding, most proposals for bailiwicks have been approved. Inappropriate use of bailiwick Web space has not yet been an issue.

Library Resources to Support the Project

The hallmark of the Information Arcade is its dual strength in providing a facility with state-of-the-art, high-end computing equipment for electronic publishing and multimedia development as well as providing a diverse public services staff who can work closely with faculty and students, often one-on-one, to help them harness the technology and integrate it effectively into their teaching, learning, and research. The facility is staffed with six half-time graduate assistants selected from a variety of academic programs in an attempt to achieve a balance of technologists, information specialists, graphics artists, and instructional designers. The primary benefit of this unique staffing arrangement is that the Information Arcade is much more than just another computer or library lab. It is a place where faculty and students can find qualified consultants trained in a subject specialty with expertise in almost any area related to technology.
With this high-tech and high-touch model, the Information Arcade is uniquely suited to host a project like Bailiwick. Within the walls of this facility, the library provides support for every step of development from inception to creation to delivery. With expert consultation, access to equipment, technical support, and Web server space, the Arcade becomes a one-stop place for presenting scholarly research.

Staff support includes consultation in any aspect of the Bailiwick project, including design issues, interface development, and training in software. Staff members do not provide programming, nor do they do any work in researching or assembling sites. Each faculty member is assigned an Information Arcade consultant at the point of submitting a bailiwick application. The consultant serves as a primary contact person for technical support, troubleshooting, basic interface design guidance, and referrals to other staff both in the libraries and on campus. At present, the current level of staffing has been sufficient to accommodate this sort of assistance, which is not unlike the assistance provided to any patron who walks in the door of the Information Arcade.

As a computing facility, the Information Arcade provides public access to a host of multimedia development workstations for scanning images, slides, and text, and for digitizing video and audio. At these multimedia stations, a large suite of multimedia integration software and Web publishing software is made available for public use. Staff at the public services desk have a strong background in multimedia development and Web design and can provide some one-on-one training on a walk-in basis beyond technical support and troubleshooting. All of these hardware and software resources are available to Bailiwick content providers, who can choose to do their development work in the Information Arcade or at their home or office.

Finally, since there is a close relationship between the Information Arcade and the university libraries' Web site, system administration and Web server support is all handled in-house as well. There are few artificial barriers imposed by the technology, thereby permitting content providers to focus on their creative expression and scholarly work.

With only minimal reallocation of existing resources, the University of Iowa Libraries has been able to launch the Bailiwick project and continue to develop it at a modest pace. One of the components most essential for its continued success, however, is the ability to scale up to meet the expected demand over the next several years. Technical infrastructure challenges are not overwhelming as yet. An analysis still needs to be made to determine how quickly creators are developing their sites, what the implications are for network delivery of these resources, what reasonable projections there are for disk space, and who is using the resources.

Perhaps more importantly, though, adequate staffing will always remain a concern. Some faculty wish to work more closely with library staff consultants than time allows, and the consultants would certainly find it enriching to be more intimately involved with the development of each bailiwick site. Marketing of the Bailiwick project has been discreet (to say the least) because of the limited staffing available.
However, embedded in the collaboration inherent in bailiwicks is the potential for stronger involvement with faculty in obtaining grant funding to support the development of specific bailiwick sites.

A Model for Research Libraries

Bailiwick is a project that allows the University of Iowa Libraries, and specifically the Information Arcade, to focus on the integration of technology, multimedia, and hypertext in the context of scholarship and research. To date, most of the bailiwick sites represent disciplines in the arts, humanities, and social sciences. This matches the overall clientele of the Information Arcade (given its location in the University of Iowa's Main Library), but it also reflects the fact that these disciplines have been traditionally undersupported with respect to technology. Nevertheless, individual faculty in these disciplines have integrated some of the most creative applications of the technology in their everyday teaching and research, in part because of the existence of the Information Arcade and the groundwork laid by the libraries for the past several years.

With the Information Arcade's visibility on campus, and with similar resources and support in the Information Commons (a sister facility in the Hardin Library for the Health Sciences), the University of Iowa Libraries are well regarded on campus as a leader in information technology, electronic publishing, and new media. Thus, faculty and students alike are accustomed to turning to the libraries for innovation in technology, and the Bailiwick project is a natural fit. Bailiwick is now fully integrated as part of a palette of new technology services and scholarly resources included within the libraries' support of teaching, learning, and research at the University of Iowa.

Engelond: A Model for Faculty-Librarian Collaboration in the Information Age

Scott Walter

Scott Walter (walter.123@osu.edu), formerly Humanities and Education Reference Librarian, University of Missouri-Kansas City, now is Information Services Librarian, Ohio State University.

The question of how best to incorporate information literacy instruction into the academic curriculum has long been a leading concern of academic librarians. In
the ability to locate, evaluate, and use effectively the needed informa- tion."2 It has become increasingly clear over the past decade that edu- cators at every level consider infor- mation literacy a critical educational issue in contemporary society. Perhaps the most frequently cited example of concern among educa- tional policy-makers for the informa- tion literacy skills of the student body can be found in Ernest Boyer's report to the Carnegie Foundation, College: The Undergraduate Experience in America (1987), in which the author concludes that "all undergraduates should be introduced to the full range of resources for learning on campus," and that students should spend "at least as much time in the library ... as they spend in classes."3 But while Boyer's report may be the most famil- iar example of such concern, it is hardly unique. As Breivik and Gee have described, a small group of edu- cational leaders have regularly expressed similar concerns over the past several decades. Moreover, as Bodi et al. among others, have demonstrated, the rise in professional interest in information literacy issues among librarians in the past decade is closely related to more general con- cerns among the educational commu- nity, especially the desire to foster critical thinking skills among the stu- dent body. By the mid-1990s, profes- sional organizations such as the National Education Association, accrediting bodies such as the Middle States Association of Colleges and Schools, and even state legislators began to incorporate information lit- eracy competencies into proposals for educational reform at both the sec- ondary and the post-secondary lev- els. The confluence over the past decade of new priorities in educa- tional reform with rapid develop- ments in information technology provided a perfect opportunity for academic librarians to develop and implement formal information litera- cy programs on their campuses, and to assume a higher profile in terms of classroom instruction. For the past two years, a pilot project has been underway at the Miller Nichols Library of the University of Missouri-Kansas City that not only fosters collaborative relations between classroom faculty members and librarians, but pro- motes the development of higher- order information literacy skills among participating members of the student body. Engelond: Resources for 14th-Century English Studies (www.umkc.edu/lib I engelond/) incorporates traditional library instruction in information access as well as instruction in how to apply critical thinking skills to the contem- porary information environment into the academic curriculum of partici- pating courses in the field of medieval studies. Our experience with the Engelond project provides a model for the ways in which informa- tion literacy instruction can be effec- tively integrated into the academic curriculum, and for the ways in which a successful pilot program can both lead the way for further devel- opment of the general instructional program in an academic library, and serve as a springboard for future col- laborative projects between class- room faculty members and librarians. The Impetus for Collaboration "Most medieval Web sites are dreck," or so wrote Linda E. Voigts, curators' professor of English at the University COMMUNICATIONS I WALTER 35 Reproduced with permission of the copyright owner. Further reproduction prohibited without permission. 
of Missouri-Kansas City, in a recent review of her participation in the Engelond project for the Medieval Academy News. Describing the impe- tus for the development of the proj- ect in terms of a complaint increasingly common among mem- bers of the classroom faculty, Voigts provides a number of examples from recent years in which students made extensive, but inappropriate, use of Web-based information resources in their academic research. In one example, Voigts describes a student who made the mistake of relying heavily on what appeared to be an authoritative essay for her report on medieval medical practices. The report was actually authored by a radiologist "with little apparent knowledge of either the Middle Ages or of premodern medicine." "How can those of us who teach the Middle Ages," Voigts asked, "help our stu- dents find in the morass of rubbish on the Internet the relatively few pearls? How can we foster skills for distinguishing between true pearls and those glittery paste jewels that dissolve upon close examination?"4 By the time Voigts approached the Miller Nichols Library during the fall 1997 semester for suggestions about the best ways to teach her stu- dents how to "sift the Web" in their search for resources suitable for aca- demic research in medieval studies, the issue of faculty-librarian collabo- ration in Internet instruction was a familiar one. In a representative review of the literature, Jayne and Vander Meer identified three "com- mon approaches" that libraries have taken to the problem of teaching stu- dents how to apply critical thinking skills to the use of Web-based infor- mation resources: (1) the develop- ment of generic evaluative criteria that may be applied to Web-based information resources; (2) the inclu- sion of Web-based information resources as simply one more materi- al type to be evaluated during the course of one's research (i.e., adding Lo.st updated : 27.April 1999 ! Enge/and supports the research of students in Dr . Linda Ehrsam Voigts' Chaucer (English 4121512 ) and Medieval Literature II (English 555A) courses at the University of Miss ouri-K ansas City. The site was created by the University Libraries' Public Services Staff with the collaboration of Dr . Voigts . We hope it will serve as a prototype for future collaborative efforts integrating library resources, course content. and multi-media technologies These pages contain syllabi for both courses, links to Internet resources (including web sites. news groups and online discussion groups relevant to medieval studies) . a guide to evaluating both online and print research tools. a list of materials held on reserve at Miller Nich ols Library for the use of these classes. and links to the MERLIN Library Catalog and a wide range of databases available through the University Libraries . AudioNisual resources include Rea!Audio streams of Dr. Voigts reading from Chaucer's Canterbury Tales and Troi/us and Criseyde . Also included is Joshua Merrill's 'From Gatehouse to Cathedral A Phot ograp hic Pilgrimage to Chaucerian Landmarks .' , ,~ • • I ~ I I Ju l . • I t• II I I I ' • "I h d1,t I lt.;l 1.,,ld~.;, ..,;,.,"' r'\UU,V I io 1.;:,U.Ji .;l ,t. '•·" ;,t.,:, ),J l..,l<;i~.;tl.., .... J -'-' ' • J ~ Figure 1. 
Engelond Home Page the Web to the litany of resources, popular and scholarly, print and elec- tronic, typically addressed in a gen- eral instructional session); and (3) working with faculty to integrate critical thinking skills into an aca- demic assignment that asks students to use or evaluate Web-based infor- mation resources relevant to their coursework. 5 While the Engelond project focused primarily on the last of these options, our work on the project also fostered the use of the first two approaches in our broader instructional program. Engelond's Landscape The Engelond Web site provides access to a number of resources for participat- ing students. These resources may be categorized as course-specific (e.g., course syllabi), information literacy- related (e.g., a set of evaluative criteria for use with Web-based information resources), or multimedia (e.g., sound recordings of Voigts reading excerpts from Chaucer's works in Middle English). All of these resources are accessible from the Engelond home page www.umkc.edu/lib/engelond/) (see figure 1). Several links are also provided throughout the site to resources housed on the library's Web site, including access to elec- tronic databases and subject-specif- ic guides to relevant resources in the print collection. Although stu- dents make use of all of these resources during the course of the semester, the emphasis in this essay will be on describing the nature and use of the information literacy-relat- ed resources. As Behrens and Euster have noted, recent interest in information literacy instruction has been guided to a degree by concern over student ability to make effective use of new forms of information technology. This concern is addressed in the Engelond project by its "Internet Resources" page, through which stu- dents are acquainted with the archi- tecture of the Internet and are provided with annotated references (and links) to a number of electronic resources (including Web portals) that will allow them to begin their research in medieval studies. Students making use of the page are 36 INFORMATION TECHNOLOGY AND LIBRARIES I MARCH 2000 Reproduced with permission of the copyright owner. Further reproduction prohibited without permission. introduced, for example, to a variety of the different types of information resourc e s available through the Internet, including Web sites, Telnet sites, news groups, and discussion lists. Users are also directed to relat- ed resources on the library Web site, including a guide to print resources for the study of Chaucer and an annotat ed guide to Web-based infor- mation resources generally useful for the study of literatur e . 6 Also provided on the Engelond site is a discussion of evaluative cri- teria that students might apply to their selection of Web-based informa- tion resources for academic research. Designed to address Voigts' initial concern about the issue of teaching students how to apply critical think- ing skills to their use of the Web, the "Criteria" page provides a general discussion of the nature of Web- based information resources, the ways in which such resources differ from traditional resources, and the kinds of questions that students must ask of any Web-based resource before making use of it in their aca- demic work . Reflecting the idea that information literacy skills are best taught in connection with a specific subject matter, the "Criteria" page includes references to a number of illustrative examples of Web-based resources in medieval studies. 
This page also reflects the evolutionary nature of the Engelond project, since new illustrations are added as each successive group of student users discovers different examples (both positive and negative). Also included on this page is a link to the library 's "Quick Reference Guide to Evaluating Resources on the World Wide Web," a generic version of the criteria developed for use with the broader instruction program at the Miller Nichols Library . 7 While the resources described above introduce students to the information landscape in the field of medieval studies and provide them with evaluative tools tailored to sub- ject-specific concerns in making use of Web-based information resources in their academic work, the final information literacy-related resource made available through the Engelond site is perhaps of the great- est interest. The "Class Picks" page presents the results of participating students' Web site evaluati on assign- ments. On this page , user s will find student evaluations of Web-based resources in medieval studies that draw not only on the information lit- eracy skills provided through tradi- tional library instruction, but also on the subject-specific knowledge that students gain as part of their aca- demic coursework. Jayne and Vander Meer wrote that faculty-librarian collaboration in Internet instruction is most effective when students are asked to draw both on generic informational litera- cy skills and on information and evaluative criteria specific to the sub- ject matter being addressed.8 As they concluded, " [to] benefit fully from the Web's potential, stud ents need training and guidance from librari- ans and faculty." Incorporating dis- cussions of site design, organization of information, and veracity of con- tent, the Web site evaluations found on the "Class Picks" page demon- strate that participating students have learned both from the librarian and the scholar, and hav e begun to consider the best ways to incorporate Web-based information resources into their day-to-day academic work. In a review of "The Harvard Chaucer Page " (http:/ / icg.fas. harvard .edu / -chaucer /) , for exam- ple, students note the general appeal of the site, but criticize it both for technical problems in its design and for editorial choices that limit its util- ity for academic research: The Harvard Chaucer is an insightful , colorful look at the author and his times, but is dappled conspicuously with misspellings, repeated phrases , sentence fragments, broken links, and unfinished pages . Translations of medieval texts provided on the site are often anonymou s, making it hard to tell if the translation is credible and an acceptable resourc e for serious research in Chauce r studies. If one is interested in pursuing a topic found on the Harvard Chaucer , s / he is well advised to explore the site for ideas and background infor- mation, but to go elsewhere for authoritative sources .. . 9 In another review , this one of "The Medieval Feminist Index" (www.haverford.edu / library/ reference/mschaus/mfi/mfi.html), students provide a discussion of the scholarly authority of the site as well as a description of the results retrieved in sample searches of the index for materials relevant to the study of Chaucer. 10 The review con- cludes with further examples of issues relevant to Chaucer studies that might be effectively investigated with information identified through this resource. 
In both reviews, students demonstrate the ability to critically evaluate a Web site both for its design and for its content, and the ability to express the strengths and weaknesses of a site from the point of view of a student concerned with how to make use of a Web-based information resource in his or her academic work. As a result, the reviews found on the "Class Picks" page not only demonstrate the successful approach to course-integrated information literacy instruction promoted through the Engelond project, but also provide a useful student resource in their own right.

The Collaborative Approach

In her review of faculty-librarian partnerships in information literacy instruction, Smalley wrote that, in the best-case scenario, "the student gains mastery in using some portion of Internet resources, as well as exposure to resources intrinsically important to disciplinary pursuits. In doing the Web-based exercises, students see information seeking and evaluation as essential parts of problem solving within the field of study."11 The three information literacy-related resources found on the Engelond site ("Internet Resources," "Criteria," and "Class Picks") demonstrate one approach to providing course-integrated information literacy instruction in such a way that the classroom faculty member and the academic librarian can work collaboratively and productively to meet their mutual instructional goals.

Both the classroom faculty member and the cooperating librarian are able to meet their instructional goals using the Engelond model because of the collaborative nature of the information literacy instruction provided to the participating students. Students enrolled in Voigts' Chaucer class during the winter 1999 semester received information literacy instruction focused both on information access and critical thinking while completing successive iterations of the Web site evaluation assignment required for the course. A brief overview of the collaborative teaching process should suggest ways in which the participating faculty member and librarian were able to draw successfully both on generic information literacy skills and on subject-specific knowledge while conducting course-integrated library instruction using the Engelond site.

Participating students during the winter 1999 semester began with a general introduction to the electronic resources available through the Miller Nichols Library at the University of Missouri-Kansas City (e.g., using the online catalog and databases such as the MLA Bibliography). Students were then presented with an introduction to the problem of applying critical thinking skills to the use of Web-based information resources, as described on Engelond's "Criteria" page. Following this introductory session conducted by the cooperating librarian, the cooperating faculty member provided students with a number of illustrative examples of the inappropriate use of electronic resources for academic research in medieval studies. From the beginning, the librarian and the faculty member modeled an integrated approach to the evaluation of information resources for their students; one that drew both on generic critical thinking skills and on specific examples of how such skills might be applied to resources in their field.
Following this initial session (which took place during the first week of the semester), students were asked to complete an evaluation of a Web site containing information they might consider using as part of their academic work. Individual sites were chosen from among those accessible through the subject-specific Web portals provided on the "Internet Resources" page. Students were provided both with the library's "Quick Reference Guide to Evaluating Resources on the World Wide Web" and with the more extensive description of Web site evaluation available on the "Criteria" page. Students completed these initial reviews over the following week and submitted copies to both the faculty member and the librarian.

In preparation for the second instructional session (which took place during the third week of the semester), the faculty member and the librarian evaluated each review twice (individually, and then together). Reviews were evaluated for the clarity of their criticism of a site, both from the point of view of information organization and design and from the point of view of the significance of the information for student research in the field. Sites that seemed to merit further review by the entire class were selected from this pool of evaluations and were discussed in greater detail by the instructors.

The second instructional session took the form of an extended review of the sites selected in the meeting described above. In each case, students were asked to describe their reaction to the site in question. In cases where more than one student had evaluated the same site, each student was asked to present one or two distinct points from his or her review. The instructors then presented their reactions to the site. Again, the librarian and the faculty member modeled for the students an approach to the critical evaluation of information resources that drew not only on the professional expertise of the librarian, but also on the scholarly expertise of the faculty member. By the end of this session, students had been exposed to three separate critiques of the selected Web sites: the student's opinion of how the information presented on the site might be used in academic research; the librarian's opinion of how effectively the information was organized and presented, and how its authority, currency, etc., might differ from that of comparable print resources; and, finally, the faculty member's opinion of the place and value of the information provided on the site in the broader scheme of the discipline.

Following this session, the students were assigned to groups in order to develop more detailed evaluations of the Web sites discussed in class. As before, these assignments were submitted both to the faculty member and to the librarian. After further review by both instructors, the assignments were returned to the students for a third (and final) iteration, and then mounted to the "Class Picks" page. By the conclusion of this assignment, participating students had learned not only how to apply critical thinking skills to Web-based information resources, but had begun to think about the nature of electronic information and the many forms that such information can take. The Web site evaluations included on the "Class Picks" page demonstrate the students' ability to successfully evaluate a Web-based information resource both for its design and for its content, and to suggest the academic situations in which its use might be warranted for a student of medieval literature.

Evaluating Engelond

During the winter 1999 semester, we attempted to evaluate the success of the information literacy instruction provided through the Engelond project. While the Web site evaluations produced by the students provided one obvious measure of our instructional success, we attempted to learn more about the ways in which students used the materials provided through the Engelond site by polling users and by examining use patterns on the site. Both of these latter measures confirmed what the instructors already suspected: students enrolled in participating courses were making heavy use of the information literacy-related resources housed on the Engelond site and saw the skills fostered by those resources as a valuable complement to the disciplinary knowledge being gained in the traditional classroom.

As part of a general evaluation of the instructional services provided by the library during the course of the semester, students participating in the Engelond project were asked open-ended questions such as: "What features of the Engelond Web site did you find most useful as a student in this course?"; "How did the existence of the Engelond site and the collaboration between your classroom instructor and the library enhance your learning experience in this course?"; and "What aspects of the library instruction that you received as part of this course do you believe will be useful to you in other courses or in regards to lifelong learning?" Among the specific items cited most often by students as being useful to them in their academic work were two of the information literacy-related resources: "Internet Resources" and "Class Picks." Likewise, information literacy skills such as familiarity with the structure of the Internet and the ability to critically evaluate Web-based information resources were listed by almost every student as skills that would be useful both in other academic courses and in their daily lives. Moreover, two graduate students who were participants reported that their experience with Engelond had led them to incorporate information literacy instruction into the undergraduate courses that they taught themselves.

Any conclusions about the appeal of the information literacy-related resources housed on the Engelond site based on these narrative responses were reinforced by a study of the use statistics for the same period. Through the first three months of the winter 1999 semester (January-March), the Engelond site recorded approximately one thousand "hits" on its main page.12 In each month, the most frequently accessed pages were the three information literacy-related resources described above, with the "Criteria" page regularly recording the greatest number of hits. Among the other most-frequently visited pages on the site were the multimedia resource page ("Audio-Visual"), the "Syllabi" page, and the "Quick Reference Guide to Chaucer" (housed on the library Web site, but accessible through the "Internet Resources" page).
Taken in conjunction with the narrative responses provided on the evaluation form, these use statistics suggest that the information literacy resources provided through the Engelond site have become a fully integrated, and greatly appreciated, feature of the academic curriculum in medieval studies in the Department of English at the University of Missouri-Kansas City.

A Model for Future Collaboration

The Engelond project has not only been a success with students who have enrolled in participating courses, but has had a significant influence on the broader instructional program at the Miller Nichols Library. It has served as a template for future collaborative efforts between the classroom faculty and the library in terms of integrating information technology and information literacy into the academic curriculum.

In terms of the instructional program at the Miller Nichols Library, our experience with Engelond helped lay the groundwork for the development of new instructional materials and for new instructional programs. It was through Engelond, for example, that we first provided electronic access to our point-of-service guides to library materials in various subjects (e.g., the "Library Guide to Chaucer"). As of the end of the winter 1999 semester, we have made almost all of our pathfinders available on the library Web site and are now considering ways in which these might be effectively incorporated into the work being done by our faculty in developing Web-based coursework.

Also, it was through Engelond that our subject specialists started collecting and annotating Web-based information resources of potential use to our students and faculty. Now, subject specialists are developing "subject guides" to Web-based resources in a number of fields and promoting their use among faculty members who, like Voigts, are concerned about the quality of the Web-based information being used by their students in their academic work. Both our pathfinders and our subject guides to Web-based resources are available online (www.umkc.edu/lib/instruction/guides/index.html).

Figure 2. TLTC Home Page

Finally, the instructional session on the critical evaluation of Web-based resources that has been the centerpiece of library instruction for the Engelond project has now been adapted for inclusion in our normal round of instructional workshops. While support for such innovations in our instructional program clearly existed within the library prior to the initiation of the Engelond project, the project's success has provided an important spur to the development of instructional services in the library.

The commitment to collaborative instructional programming demonstrated by the Engelond project has also helped pave the way for the development of the University of Missouri-Kansas City's new Technology for Learning and Teaching (TLT) Center. Housed in the Miller Nichols Library, the TLT Center offers faculty workshops in the use of information technology and a place in which classroom faculty, subject specialists, and educational technologists may collaborate on the development of projects such as Engelond. Further information on the TLT Center is available online (www.umkc.edu/tltc/) (see figure 2).
Initiating a culture of collabora- tion between members of the class- room faculty and academic librarians can be a difficult task (as so much of the literature has shown). In reviewing our experi- ence with Engelond, we have bene- fited from the suggestions that Hardesty made some years ago about the means of supporting the adoption across campus of an inno- vative instructional model: (1) the librarian must present information literacy instruction in such a way that it does not threaten the role of the classroom faculty member as an authority in the subject matter of the course; (2) the new approach to instructional collaboration must be adopted on a limited basis at first, rather than requiring that all instructional programs immediately adopt the new approach; and (3) the results of a successful pilot projects 40 INFORMATION TECHNOLOGY AND LIBRARIES I MARCH 2000 must be "readily visible to others" on campus. 13 Designed as a pilot project, Engelond has successfully demon- strated that classroom faculty and academic librarians can collaborate to meet their mutual instructional objectives, both in terms of informa- tion literacy instruction and in terms of academic course content. As infor- mation technology continues to gain a central place in the educational mission of the college and university, it is likely that the sphere of mutual instructional objectives between classroom faculty and academic librarians will only increase. Our careful approach to raising the instructional profile of librarians on campus has been rewarded, too, both by an increasing number of faculty members seeking course-related instruction in our electronic class- room as part of the regular instruc- tional program of the library, and by the institutional commitment of resources to the TLT Center, which will become the nexus of instruction- al collaboration between faculty and librarians on our campus. During the 1999-2000 academic year, no fewer than three academic courses in medieval studies will make use of the Engelond site. As more faculty become aware of the services provided by the TLT Center, such collaborative approaches to information literacy instruction will likely become more evident across a variety of disciplines. The lessons learned over the past two years of project development will be invalu- able as we move to provide course- integrated information literacy instruction to an increasing number of students in an increasingly broad variety of courses. Acknowledgments The Engelond project has benefited from the work of a number of indi- viduals over the past two years, Reproduced with permission of the copyright owner. Further reproduction prohibited without permission. especially Ted P. Sheldon, director of libraries at the University of Missouri-Kansas City, and Marilyn Carbonell, assistant director for col- lection development, both of whom were instrumental in developing the plan for a pilot project in course- integrated information literacy instruction with Professor Voigts. The design for the Engelond site was developed by John LaRoe, former multimedia design technologist at the Miller Nichols Library. The orig- inal text for the site was written by Voigts, LaRoe, and T. Michael Kelly, former humanities reference librari- an at the Miller Nichols Library. Additional text and resources for the site have been developed over the past year by Voigts and myself. 
especially Ted P. Sheldon, director of libraries at the University of Missouri-Kansas City, and Marilyn Carbonell, assistant director for collection development, both of whom were instrumental in developing the plan for a pilot project in course-integrated information literacy instruction with Professor Voigts. The design for the Engelond site was developed by John LaRoe, former multimedia design technologist at the Miller Nichols Library. The original text for the site was written by Voigts, LaRoe, and T. Michael Kelly, former humanities reference librarian at the Miller Nichols Library. Additional text and resources for the site have been developed over the past year by Voigts and myself. In addition, a number of librarians and staff members in the public services division of the Miller Nichols Library devoted time to critiquing the site and to assisting with the creation of the embedded audio files. These contributions may not always be evident to the students who benefit from the project, but they were instrumental in our ability to successfully meet our instructional objectives during the 1998-99 academic year.

References and Notes

1. Thomas J. DeLoughry, "Professors Are Urged to Devise Strategies to Help Students Deal with 'Information Explosion' Spurred by Technology," Chronicle of Higher Education 35 (March 8, 1989): A13, A15.

2. Shirley J. Behrens, "A Conceptual Analysis and Historical Overview of Information Literacy," College & Research Libraries 55 (July 1994): 309-22; Patricia Senn Breivik, Student Learning in the Information Age (Phoenix, Ariz.: Oryx Pr., 1998); "Final Report of the American Library Association Presidential Committee on Information Literacy" (1989), as reproduced in Breivik, Student Learning in the Information Age, 121-37 (quotation is from pp. 121-22). For another recent overview of the development of the theory and practice of information literacy at every level of American education over the past two decades, see Kathleen L. Spitzer and others, Information Literacy: Essential Skills for the Information Age (Syracuse, N.Y.: ERIC Clearinghouse on Information and Technology, 1998).

3. Ernest L. Boyer, College: The Undergraduate Experience in America (New York: Harper & Row, 1987), 165; Patricia Senn Breivik and E. Gordon Gee, Information Literacy: Revolution in the Library (New York: Macmillan, 1989); Sonia Bodi, "Critical Thinking and Bibliographic Instruction: The Relationship," Journal of Academic Librarianship 14 (July 1988): 150-53; Barbara B. Moran, "Library/Classroom Partnerships for the 1990s," C&RL News 51 (June 1990): 511-14; Sonia Bodi, "Collaborating with Faculty in Teaching Critical Thinking: The Role of Librarians," Research Strategies 10 (Spring 1992): 69-76; Hannelore B. Rader, "Information Literacy and the Undergraduate Curriculum," Library Trends 44 (Fall 1995): 270-78; Spitzer and others, Information Literacy; and Breivik, Student Learning in the Information Age, 7-8. On the relationship between trends in educational reform favoring the development of critical thinking skills and their relationship to the place of information literacy instruction in higher education, see also Joanne R. Euster, "The Academic Library: Its Place and Role in the Institution," in Academic Libraries: Their Rationale and Role in American Higher Education, Gerard B. McCabe and Ruth J. Person eds. (Westport: Greenwood Pr., 1995), 7; Craig Gibson, "Critical Thinking: Implications for Instruction," RQ 35 (Fall 1995): 27-35.

4. Linda Ehrsam Voigts, "Teaching Students to Sift the Web," Medieval Academy News (Nov. 1998): 5.

5. Elaine Jayne and Patricia Vander Meer, "The Library's Role in Academic Instructional Use of the World Wide Web," Research Strategies 15 (1997): 125. See also Topsy N. Smalley, "Partnering with Faculty to Interweave Internet Instruction into College Coursework," Reference Services Review 26 (Summer 1998): 19-27.

6. Behrens, "A Conceptual Analysis and Historical Overview of Information Literacy," 312; Euster, "The Academic Library," 6; Scott Walter, "UMKC University Libraries: Quick Reference Guide to Chaucer."
Accessed Sept. 24, 1999, www.umkc.edu/lib/instruction/guides/chaucer.html; Scott Walter, "UMKC University Libraries: Subject Guide to Literature." Accessed Sept. 24, 1999, www.umkc.edu/lib/instruction/guides/literature.html. All references to specific pages on the Engelond site will be made to the page title, e.g., "Internet Resources." Because Engelond has been designed in a frameset, it will be easier for interested readers to access the main page at the URL provided in the text and then make use of the navigational buttons provided there.

7. Scott Walter, "UMKC University Libraries: Quick Reference Guide to Evaluating Resources on the World Wide Web." Accessed Sept. 24, 1999, www.umkc.edu/lib/instruction/guides/webeval.html.

8. Jayne and Vander Meer, "The Library's Role in Academic Instructional Use of the World Wide Web," 125.

9. Laura Arruda and others, review of "The Harvard Chaucer Page." Accessed Sept. 24, 1999, www.umkc.edu/lib/engelond.

10. Sherrida D. Harris and Jennifer Kearney, review of "The Medieval Feminist Index: Scholarship on Women, Sexuality, and Gender." Accessed Sept. 24, 1999, www.umkc.edu/lib/engelond.

11. Smalley, "Partnering with Faculty to Interweave Internet Instruction into College Coursework," 20.

12. In January 1999 Engelond received 368 hits, with the three most frequently accessed items being "Criteria" (157), "Internet Resources" (130), and "Class Picks" (128). In February the total number of hits dropped to 216, with the most frequently accessed items being "Criteria" (130), "Audio-Visual" (59), and "Internet Resources" and "Class Picks" (both with 46). In March the total number of hits was 323, with the favorite resources again being "Criteria" (113), "Internet Resources" (74), and "Class Picks" (65). Statistics are based on a study of the daily use logs. Accessed Sept. 24, 1999, www.umkc.edu/_reports/.

13. Larry Hardesty, "The Role of the Classroom Faculty in Bibliographic Instruction," in Teaching Librarians to Teach: On-the-Job Training for Bibliographic Instruction Librarians, Alice F. Clark and Kay F. Jones eds. (Metuchen: Scarecrow Pr., 1986), 171-72.

10073 ----

Site License Initiatives in the United Kingdom: The PSLI and NESLI Experience

Jacqueline Borin

This article examines the development of site licensing within the United Kingdom higher education community. In particular, it looks at how the pressure to make better use of dwindling fiscal resources led to the conclusion that information technology and its exploitation was necessary in order to create an effective library service. These conclusions, reached in the Follett Report of 1993, led to the establishment of a Pilot Site License Initiative and then a National Electronic Site License Initiative. The focus of this article is these initiatives and the issues they faced, which included off-site access, definition of a site and, perhaps most importantly, the unbundling of print and electronic journals.

Increased competition for institution funding around the world has resulted in an erosion of library funding.
In the United States state universities are receiving a decreasing portion of their funds from the state while private universities are forced to limit tuition increases due to outside market forces. In the United Kingdom the entitlement to free higher education is currently under attack and losing ground. Today's economic pressures are requiring individual libraries to make better use of their fiscal resources while the emphasis moves from being a repository for information to providing access to information.

Jacqueline Borin (jborin@csusm.edu) is Coordinator of Reference and Electronic Resources, Library and Information Services, California State University, San Marcos.

As in the United States, the use of consortia for cost sharing in the United Kingdom is becoming imperative as producers produce more electronic materials and make them available in full-text formats. Consortia, while originally formed to cooperate on interlibrary loans and union catalogs, have recently taken on a new role, driven by financial expediency, in negotiating electronic licenses for their members, and the percentage of vendor contracts with consortia is rising. Academic libraries cannot afford the prevalent pricing model that asks for the current print price plus an electronic surcharge plus projected inflation surcharges; therefore group purchasing power allows higher education institutions to leverage the money they have and to provide resources that would otherwise be unavailable. Advantages for the vendor include one negotiator and one technical person for the consortia as a whole. In addition, the use of consortia provides greater leverage in pushing for the need for stable archiving and for retaining the principles of fair use within the electronic environment, as well as reminding publishers of the need for flexible and multiple economic models to deal with the diverse needs and funding structures of consortia.1

During the spring of 1998, while visiting academic libraries in the United Kingdom, I looked at an existing initiative within the UK higher education community, the Pilot Site License Initiative (PSLI), which had begun as a response to the Follett Report and to rising journal prices. At the time the three-year initiative was nearing its end and its successor, the National Electronic Site License Initiative (NESLI), was already the topic of much discussion.

History

The concept of site licensing in the United Kingdom higher education community had already been established, since 1988, by the Combined Higher Education Software Team (CHEST), based at the University of Bath. CHEST has negotiated site licenses with software suppliers and some large database producers through two different methods. Either the supplier sells a national license to CHEST, which passes it on to the individual institution, or CHEST sells licenses to the institution on the supplier's behalf and passes the fees on to them (see figure 1). CHEST works closely with National Information Services and Systems (NISS). NISS provides a focal point for the UK education and research communities to access information resources. NISS's Web service, the NISS Information Gateway, provides a host for CHEST information such as Ebsco Masterfile and OCLC NetFirst. Most CHEST agreements are institution-wide site licenses that allow for all noncommercial use of the product, normally for five years to allow for incorporation into the curriculum.
Once an institution signs up it is committed for the full term of the agreement. CHEST is not in the business of either evaluating products or differentiating among competing suppliers. Evaluations and purchase decisions are left up to the individual institutions.2

CHEST does set up and support e-mail discussion lists for each agreement so that users can discuss features and problems of the product among themselves. They also send out electronic news bulletins to provide advance warning of forthcoming agreements and to assess the level of interest in future agreements. CHEST operates in a similar manner to many library consortia in the United States. The major difference is that it sells to higher education institutions as a whole, so the products it sells include not only databases but also, for example, software programs. This is also beginning to change in the United States. A recent article in the Chronicle of Higher Education mentions that institutions will not stop with library databases: "in the future we'll be negotiating site licenses for software and all sorts of things . . . not just databases."3

Although CHEST is substantially self-funding, it is strongly supported (as is NISS) by the Joint Information Systems Committee (JISC) of the Higher Education Funding Councils (HEFCs). The majority of public funding for higher education in the United Kingdom is funneled through the HEFCs (one each for England, Scotland, Wales, and Northern Ireland). One of the JISC committees, the Information Services Subcommittee (ISSC), which in 1997 became part of the Committee for Electronic Information (CEI), defined principles for the delivery of content.4 They were:

• free at the point of use;
• subscriptions not transaction based;
• lowest common denominator;
• universality;
• commonality of interfaces; and
• mass instruction.

Follett Report

In 1993 an investigation into how to deal with the pressures on library resources caused by the rapid expansion of student numbers and the worldwide explosion in academic knowledge and information was undertaken by the Joint Funding Council's Libraries Review Group, chaired by Sir Brian Follett. This investigation resulted in the Follett Report. One of the key conclusions of the report was "The exploitation of IT is essential to create the effective library service of the future." The review group recommended that as a starting point "a pilot initiative between a small number of institutions and a similar number of publishing houses should be sponsored by the funding councils to demonstrate in practical terms how material can be handled and distributed electronically."5 As a consequence £15 million was allocated to an Electronic Libraries Program, managed by JISC on behalf of the Higher Education Funding Council for England (HEFCE). The Electronic Libraries Program was to "engage the higher education community in developing and shaping the implementation of the electronic library."6

Figure 1. CHEST Diagram
This project provided a body of electronic resources and services for UK higher education and influenced a cultural shift towards the acceptance and use of electronic resources instead of more traditional information storage and access methods.

PSLI

In May 1995 a pilot site license initiative subsidized by the funding councils was set up to:

• test if the site license concept could provide wider access to journals for those in the academic community;
• see if it would allow more flexibility in the use of scholarly material;
• test the methods for dissemination of scholarly material to the higher education sector in a variety of formats;
• test legal models for a national site license program; and
• explore the possibility for increased value for money from scholarly journals.7

Sixty-five publishers were invited by HEFCE to participate for three years commencing January 1, 1996. HEFCE was also responsible through JISC for the funding of the elib program, but no formal links were established between the elib project and the PSLI.8 The final selection of four companies included Academic Press Ltd., Blackwell Publishers Ltd., Blackwell Science Ltd., and IOP Publishing Ltd. The publishers agreed to offer print journals to higher education institutions for discounts of between 30 and 40 percent over the three-year period, as well as electronic access as available. Originally the electronic journals were supposed to be the subsidiary component of the agreement, but by the end of the agreement they had become the major focus. The PSLI achieved almost 100 percent take-up among the higher education institutions due to the anticipated savings through the program.9

HEFCE did not specify how the publishers were to deliver their content. IOPP hosted the journals on their own server, for example, while Academic Press linked their IDEAL server to the Journals Online service at the University of Bath. One of the key provisions of the site license was the unlimited rights of authorized users to make photocopies (including their use within course packs) of the journals. Academic Press and IOPP provided full-text access to all their journals, while Blackwell and Blackwell Science only allowed reading of full text where a print subscription existed. An integral part of the PSLI was that the funding from HEFCE to the higher education institutions was top sliced to support the discounted price offered to the institutions.

Several assessments of the initiative were made and a final evaluation of the pilot was concluded at the end of 1997. Initial surveys indicated subscription savings through the program (average annual savings were approximately £11,800 per annum) and the first report of the evaluation team showed a wide level of support for the project despite major problems with lack of communication in a timely manner.10 The team recommended an extension of the PSLI to include more publishers and more emphasis on electronic delivery. One concern that was raised was ease of access: students had to know which system hosted a journal they required. This was not easily discernible or user friendly. Evaluations by focus groups showed users wanted one single access point to all electronic journals.11 Also unresolved was the need for one consistent interface to the electronic journals and a solution to the archiving issue.
At the end of the PSLI, HEFCE handed the next phase over to JISC. In the fall of 1997 JISC announced that a NESLI would be set up and a new steering group was established. NESLI was to be an electronic-only scheme, and the invitation to tender went out at the end of 1997 with a decision to be made mid-1998.

National Electronic Site License Initiative

NESLI, a three-year JISC funded program, began on January 1, 1999, although the "official" launch was held at the British Library on June 15, 1999. It is an initiative to deliver a national electronic journal service to the United Kingdom higher education and research community (approximately 180 institutions) and is a successor program to the Pilot Site License Initiative (PSLI). In May 1998 JISC appointed a consortium of Swets and Zeitlinger and Manchester Computing (University of Manchester) to act as a managing agent (Swets and Blackwell Ltd. announced in June 1999 their intention to combine Swets Subscription Service and Blackwell's Information Services, the two subscription agency services). The managing agent represents the higher education institutions in negotiations with publishers, manages delivery of the electronic material through a single Web interface, and oversees day-to-day operation of the program, including the handling of subscriptions.12

The managing agent also encourages the widespread acceptance by publishers of a standard model site license, one of the objectives of this being to reduce the number and diversity of site definitions used by publishers. Other important provisions of the model site license addressed the issues of walk-in use by clients and the need for publishers to provide access to material previously subscribed to when a subscription is cancelled. The subscription model is currently the prevalent option, although they are also working towards a pay-per-view option.13

Priority has been given to publishers who had been involved in the PSLI and to those publishers participating in SwetsNet, the delivery mechanism for the NESLI. SwetsNet is an electronic journal aggregation service that offers access to and management of Internet journals. Its search engine allows searching and browsing through titles from all publishers with links to the full-text articles. NESLI is not a mandatory initiative; the higher education institutions can choose whether to participate in proposals and can pursue their own arrangements individually or through their own consortiums if they wish.

While PSLI was basically a print-based initiative limited to a small number of publishers and funded via top slicing, NESLI is an electronic initiative aimed at involving many more publishers. It is designed to be self-funding, although it did receive some start-up funding. Although it is an electronic initiative, proposals that include print will be considered, as it is still not easy to separate print and electronic materials.14 The initiative addresses the most effective use, access, and purchase of electronic journals in the academic library community. Its aims include:

• access control, for on-site and remote users;
• cost;
• definition of a site;
• archiving; and
• unbundling print from electronic.

Access to SwetsNet, the delivery mechanism for journals included in NESLI, has now been supplemented by the option of Athens authentication.
Athens, an authentication system developed by NISS, provides individuals affiliated with higher education institutions a single username and password for all electronic services they have permission to access. Athens is linked to SwetsNet to ensure access for off-site, remote, and distance learners who do not have a fixed IP address. This supplements SwetsNet's IP address authentication, which does not allow for individual access to TOC and SDI alerting. A help desk is available for all NESLI users through the University of Manchester.

The definition of a site is being addressed by the NESLI model site license, which tries to standardize site definitions (including access from places that authorized users work or study, including homes and residence halls); interlibrary loan (supplying an authorized user of another library a single paper copy of an electronic original of an individual document); walk-in users; access to subscribed material in perpetuity (it provides for an archive to be made of the licensed material, with access to the archive permissible after termination of the license); and inclusion of material in course packs. JISC's NESLI steering group approved the model NESLI site license on May 11, 1999 for use by the NESLI managing agent.15

The managing agent asks publishers to accept the model license with as few alterations as possible. During the term of the initiative the managing agent will be working on additional value-added services. These include links from key indexing and abstracting services, provision of access via Z39.50, linking from library OPACs, creation of catalog records, and assessing a model for e-journal delivery via subject clusters. In particular, they have begun to look at the technical issues concerned with providing MARC records for all electronic journals included in NESLI offers. Additionally, they will be looking at solutions for longer-term archiving of electronic journals to provide a comfort level for librarians purchasing electronic-only copies.16

Two offers that have been made under the NESLI umbrella so far are Blackwell Science for 130 electronic journals and Johns Hopkins University Press for 46 electronic titles. Most recently two additional vendors have been added to the list. Elsevier has made a proposal to deliver full-text content via the publisher's ScienceDirect platform, which includes the full text of more than 1,000 Elsevier science journals along with those of other publishers. A total of more than 3,800 journals would be included in the service.17 MCB University Press, an independent niche publisher, is offering access to 114 full-text journals and secondary information in the area of management through its Emerald Intelligence + Fulltext service.

Similarly, here in the United States, California State University (CSU) put out for competitive tender a contract for the building of a customized database of 1200+ electronic journals based on the print titles subscribed to by 15 or more of the 22 campuses, the Journal Access Core Collection (JACC). The journals will be made available via Pharos, a new Unified Information Access System for the CSU. Like OhioLINK, a consortium of 74 Ohio libraries, it will provide a common interface to electronic journals for students and faculty and will facilitate the development of distance learning programs.18 By unbundling the journals, libraries will no longer be required to pay for journals they do not want or need, leading to moderate price savings.
Additional savings can be realized through the lowering of overhead costs achieved by system-wide purchasing of core resources. Other issues being addressed within the JACC RFP included archiving and perpetual access to journal articles the university system has paid for, availability of e-journals in multiple formats, interlibrary loan of electronic documents, currency of content, and cost value at the journal-title level.19 Currently 500 core journals are being provided under the JACC by Ebsco Information Services, and the CSU plans on expanding those offerings.

Conclusion

As we move into the next millennium, library consortia will continue to work together with vendors to further customize journal offerings. However, it is still far too early to say whether NESLI will be successful or whether it will succeed in getting the publishing industry to accept the model site license. If it is to work within the higher education community, it will depend greatly on the flexibility and willingness of the publishers of scholarly journals. It has made a start by developing a license that sets a wider definition of a site and that deals realistically with the question of off-site access. By encouraging the unbundling of electronic and print subscriptions, NESLI allows services to be tailored to specific needs of the information community, but it remains to be seen how many publishers are prepared to accept unbundled deals at this stage. Also, as technology stabilizes and libraries acquire increasingly larger electronic collections, we will not be able to rely on license negotiations as the only way to influence pricing, access, and distribution. An additional problem that remains unaddressed by either PSLI or NESLI is the pressure on academics to publish in traditional journals and the corresponding rise in scholarly journal prices. NESLI neither encourages nor hinders changes in scholarly communication, and therefore the question of restructuring the scholarly communication process remains.20

References and Notes

1. Barbara McFadden and Arnold Hirshon, "Hanging Together to Avoid Hanging Separately: Opportunities for Academic Libraries and Consortia," Information Technology and Libraries 17, no. 1 (March 1998): 36. See also International Coalition of Library Consortia, "Statement of Current Perspective and Preferred Practices for the Selection and Purchase of Electronic Information," Information Technology and Libraries 17, no. 1 (March 1998): 45.

2. Martin S. White, "From PSLI to NESLI: Site Licensing for Electronic Journals," New Review of Academic Librarianship 3 (1997): 139-50. See also CHEST, CHEST: Software, Data, and Information for Education (1996).

3. Thomas J. DeLoughry, "Library Consortia Save Members Money on Electronic Materials," The Chronicle of Higher Education (Feb. 9, 1996): A21.

4. Information Services Subcommittee, "Principles for the Delivery of Content." Accessed Nov. 17, 1999, www.jisc.ac.uk/pub97/nl_97.html#issc.

5. Joint Funding Council's Libraries Review Group, The Follett Report (Dec. 1993). Accessed Nov. 20, 1999, www.niss.ac.uk/education/hefc/follett/report/.

6. John Kirriemuir, "Background of the eLib programme." Accessed Nov. 21, 1999, www.ukoln.ac.uk/services/elib/background/history.html.

7. PSLI Evaluation Team, "UK Pilot Site License Initiative: A Progress Report," Serials 10, no. 1 (1997): 17-20.
8. White, "From PSLI to NESLI," 149.

9. Tony Kidd, "Electronic Journals: Their Introduction and Exploitation in Academic Libraries in the UK," Serials Review 24, no. 1 (1998): 7-14.

10. Jill Taylor Roe, "United We Save, Divided We Spend: Current Purchasing Trends in Serials Acquisitions in the UK Academic Sector," Serials Review 24, no. 1 (1998).

11. PSLI Evaluation Team, "UK Pilot Site License Initiative," 17-20.

12. Beverly Friedgood, "The UK National Site Licensing Initiative," Serials 11, no. 1 (1998): 37-39.

13. University of Manchester and Swets & Zeitlinger, NESLI: National Electronic Site License Initiative (1999). Accessed Nov. 21, 1999, www.nesli.ac.uk/.

14. NESLI Brochure, "Further Information for Librarians." Accessed Nov. 21, 1999, www.nesli.ac.uk/nesli-librarians-leaflet.html.

15. A copy of the model site license is available on the NESLI Web site. Accessed Nov. 22, 1999, www.nesli.ac.uk/Mode1License8.html.

16. Albert Prior, "NESLI Progress through Collaboration," Learned Publishing 12, no. 1 (1999).

17. Science Direct. Accessed Nov. 24, 1999, www.sciencedirect.com.

18. Declan Butler, "The Writing is on the Web for Science Journals in Print," Nature 397 (Jan. 21, 1998).

19. The Journal Access Core Collection Request for Proposal. Accessed Nov. 22, 1999, www.calstate.edu/tier3/cs+p/rfp_ifb/980160/980160.pdf.

20. Frederick J. Friend, "UK Pilot Site License Initiative: Is it Guiding Libraries Away from Disaster on the Rocks of Price Rises?" Serials 9, no. 2 (1996): 129-33.

10074 ----

A Low-Cost Library Database Solution

Mark England, Lura Joseph, and Nem W. Schlecht

Two locally created databases are made available to the world via the Web using an inexpensive but highly functional search engine created in-house. The technology consists of a microcomputer running UNIX to serve relational databases. CGI forms created using the programming language Perl offer flexible interface designs for database users and database maintainers.

Mark England (england@badlands.nodak.edu) is Assistant Director, Lura Joseph (ljoseph@badlands.nodak.edu) is Physical Sciences Librarian, and Nem W. Schlecht (schlecht@plains.nodak.edu) is a Systems Administrator at the North Dakota State University Libraries, Fargo, North Dakota.

Many libraries maintain indexes to local collections or resources and create databases or bibliographies concerning subjects of local or regional interest. These local resource indexes are of great value to researchers. The Web provides an inexpensive means for broadly disseminating these indexes. For example, Kilcullen has described a nonsearchable, Web-based newspaper index that uses Microsoft Access 97.1 Jacso has written about the use of Java applets to publish small directories and bibliographies.2 Sturr has discussed the use of WAIS software to provide searchable online indexes.3 Many of the Web-based local databases and search interfaces currently used by libraries may:

• have problems with functionality;
• lack provisions for efficient searching;
• be based on unreliable software;
• be based on software and hardware that is expensive to purchase or implement;
• be difficult for patrons to use; and
• be difficult for staff to maintain.

After trying several alternatives, staff members at the North Dakota State University Libraries have implemented an inexpensive but highly functional and reliable solution. We are now providing searchable indexes on the Web using a microcomputer running UNIX to serve relational databases. CGI forms created at the North Dakota State University Libraries using the programming language Perl offer flexible interface designs for database users and database maintainers. This article describes how we have implemented this technology to distribute two local databases to the world via the Web. It is hoped that recounting our experiences will facilitate other such projects.
Creating the Databases

The two databases that we selected to use as demonstrations of this technology are a community newspaper index and a bibliography of publications related to North Dakota geology.

The Forum Index

The Fargo Forum is a daily newspaper published in Fargo, North Dakota. It began publication in 1879 and is the paper of record for North Dakota. For many years, the North Dakota State University Libraries have maintained an index to the Forum. Beginning with the selective indexing of notable events and editions, we started offering full-text indexing of the entire paper in 1996. Until early in the 1980s, all indexing was done manually and preserved on cards or paper. Then for several years, indexing was done on one of the university's mainframe computers.
Starting in 1987, microcomputers were used to compile the index, first using DBASE and then using Pro-Cite as the database management software. Printed copies of the database were sold annually to subscribing libraries and businesses. Starting in the summer of 1996, the library made arrangements with the publisher of the paper to acquire digital copy of the text of each newspaper.

In early 1997, the NDSU Libraries began a project to place all of our Forum indexes on the Web. DBASE, Pro-Cite, WordPerfect, or Microsoft Access computer files existed for the newspaper index from 1879 to 1975, 1988, and from 1990 to 1996. All other data was unavailable or unreadable. Printed indexes from 1976 to 1987 and 1989 were scanned using a Hewlett Packard 4C scanner fitted with a page feeder. Optical character recognition was accomplished using the software OmniPage Pro. Once experience was gained with scanner and software settings, the scanning went very quickly with very few errors appearing in the data. Various members of the library staff volunteered to check and edit the data, and the digitizing of approximately 1,500 pages was completed in about three weeks.

All data were checked and normalized using Microsoft's Excel spreadsheet software and then saved as tab-delimited text. Programmer's File Editor was used to do the final text editing. Because of variations in the completeness of the indexing, three separate relational database tables were created: one each for the years 1879-1975, 1976-1996, and 1996 to the present.

The Collective Bibliography of North Dakota Geology

In 1996 a project was initiated to combine three bibliographies of North Dakota geology and to make the final product searchable and browsable on the Web. All three of the original print bibliographies were published by the North Dakota Geological Survey. Scott published the first bibliography as a thesis. It is a bibliography of all then-known North Dakota geological literature published between 1805 and 1960, and most entries are annotated.4 The second print bibliography, also by Scott, focuses on North Dakota geological literature published in the years 1960 through 1979, and also includes some material omitted in the first bibliography.5 Most entries in the second bibliography include annotations in the form of keywords or keyword phrases. The third bibliography covers the years 1980 through 1993, and is not annotated.6 All three bibliographies are indexed. The third bibliography was available in digital format, whereas the first two were in print format only.

Library staff members began rekeying the two print bibliographies using Microsoft Word. The remaining pages were digitally scanned using a new Hewlett Packard 4C scanner and the optical character recognition software OmniPage Pro. There were many errors in the resulting text. Different font sizes in the original documents may have contributed to optical recognition errors. Editing of the scanned pages was nearly as time consuming and tedious as rekeying the documents. The Microsoft Word documents were saved as text files and combined as a single text file. Programmer's File Editor was used as a final editor to remove any line breaks or other undesirable formatting. Each record was edited to occupy one line, and each field was delimited by two asterisks. Asterisks were used because there were many occurrences of commas, semicolons, and other symbols that would have made it difficult to parse any other way.
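To make the record format concrete, the following is a minimal Perl sketch (not the libraries' actual script) of how such one-record-per-line, double-asterisk-delimited text can be split into fields. The file name and the field order shown (author, date, title) are assumptions for illustration only.

#!/usr/bin/perl
# Minimal sketch (not the NDSU script): parse one-record-per-line text
# in which fields are delimited by two asterisks. The file name and
# field order are assumptions for illustration.
use strict;
use warnings;

open my $in, '<', 'bibliography.txt' or die "Cannot open input: $!";
while (my $line = <$in>) {
    chomp $line;
    next if $line =~ /^\s*$/;              # skip blank lines
    # The -1 limit keeps trailing empty fields, e.g. unannotated records.
    my @fields = split /\*\*/, $line, -1;
    next unless @fields >= 3;              # skip malformed records
    my ($author, $date, $title) = @fields[0, 1, 2];
    print "$author ($date): $title\n";
}
close $in;

Splitting on a two-character delimiter like this sidesteps the quoting problem the authors describe: commas and semicolons inside the data never collide with the field separator.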
Because italics were removed by converting to a text file, some errors were made in parsing. In retrospect, parsing should have been done before the document was saved as a text file. Punctuation between fields was removed because the database would be converted to a large table. It would have been better to leave the punctuation intact, since it cannot easily be put back in for the output to be presented in bibliographic form. The alphabetical additions to publication dates (e.g., Baker, 1966a) were left intact to aid in hand-cutting and pasting index terms into the records at a later date.

Initially, the resulting document was converted to a Microsoft Access file so that it would be in a table format. However, many of the fields were well over the 256-character limit of individual fields. To solve this problem, the data were imported into a relational database called MySQL, which allows large data fields called "blobs." Running under UNIX, MySQL is very flexible and powerful.

Database and Search Engine Design

We examined the features and capabilities of various online bibliographies and indexes when deciding on our search interfaces and search engine designs. We wanted our databases to be both searchable and browsable and, in the case of the Collective Bibliography of North Dakota Geology, we wanted to provide the option of receiving search results accurately in a specific bibliographic format. We wanted both simple and advanced search capabilities, including the ability to do highly sophisticated Boolean searching. Finally, we wanted to provide those maintaining the databases with the ability to easily add, delete, and change records from within simple forms on the Web and immediately see the results of this editing.

MySQL uses a Perl interface, DBI (Database Independent Interface), which makes accessing the database simple from a Perl script. Essentially, a SQL statement is generated, based on data from an HTML form. This SQL statement is then run against the MySQL database, returning matching rows that the same script can handle and display as needed. All of the dynamically generated pages in this database are created this way. Using both MySQL and Perl provided a nice, elegant way to integrate database functionality with the Web.

The databases were installed on a server and made available via the Web. It soon became apparent that there were problems with large numbers of returns. Depending upon the client machine's hardware configuration, browsers could lock up the machine. While an efficient search should not result in such a large number of hits, we decided to limit returns to reduce this problem. Following suggestions from users, various search tips were added, and some search interface terminology was changed.
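As a rough illustration of the form-to-SQL flow just described, here is a minimal Perl CGI sketch; it is not the production script, and the database, table, and column names are assumptions. A placeholder guards against quoting problems, and a LIMIT clause caps the result set, addressing the browser lock-up problem noted above.

#!/usr/bin/perl
# Minimal sketch (not the production script): build a SQL statement from
# an HTML form field, run it through DBI against MySQL, and print the
# matching rows. Database, table, and column names are assumptions.
use strict;
use warnings;
use CGI;
use DBI;

my $q      = CGI->new;
my $author = $q->param('author') || '';

my $dbh = DBI->connect('DBI:mysql:database=ndgeology;host=localhost',
                       'webuser', 'secret', { RaiseError => 1 });

# LIMIT keeps a very broad search from returning so many rows that the
# patron's browser locks up, as described above.
my $sth = $dbh->prepare(
    'SELECT author, date, title, source FROM citations
     WHERE author LIKE ? ORDER BY author LIMIT 200');
$sth->execute('%' . $author . '%');

print $q->header('text/html');
while (my ($au, $dt, $ti, $so) = $sth->fetchrow_array) {
    print "<p>$au ($dt). $ti. $so</p>\n";
}
$dbh->disconnect;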
From a secure gateway, it is possible to call up different forms that allow individual records to be displayed, edited, and saved (see figure 1). New records are added by using a simple HTML form. It is also possible to bulk-load large numbers of records by using a special Perl program to load the data directly from a text file.

Figure 1: Secure Database Editing Interface
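The bulk-load step just mentioned might look something like the following minimal Perl sketch. It is not the libraries' actual loader; the file, database, table, and column names are assumptions carried over from the earlier parsing example.

#!/usr/bin/perl
# Minimal sketch (not the NDSU loader): read double-asterisk-delimited
# records from a text file and insert them into MySQL through DBI.
# File, database, table, and column names are assumptions.
use strict;
use warnings;
use DBI;

my $dbh = DBI->connect('DBI:mysql:database=ndgeology;host=localhost',
                       'webuser', 'secret', { RaiseError => 1 });

# Prepare the statement once and reuse it for speed; the placeholders
# handle any quoting in the data.
my $sth = $dbh->prepare(
    'INSERT INTO citations (author, date, title, source, annotations)
     VALUES (?, ?, ?, ?, ?)');

open my $in, '<', 'bibliography.txt' or die "Cannot open input: $!";
while (my $line = <$in>) {
    chomp $line;
    next if $line =~ /^\s*$/;
    my @fields = split /\*\*/, $line, -1;
    $sth->execute(@fields[0 .. 4]);    # missing trailing fields load as NULL
}
close $in;
$dbh->disconnect;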
I Conclusion and Future Plans Work is under way to refine and update The Collective Bibliography of North Dakota Geology. Because bibli- ography number three was not anno- tated, index terms are being added to facilitate searching and retrieval of citations. We have recently updated The Collective Bibliography of North Dakota Geology to include citations to publications through 1998, and we plan to update the database annually. Additionally, we receive monthly updates of Forum articles, which are added using a simple Perl script as soon as they are received. We have successfully implemented a number of other databases using these meth- ods. We realize that this UNIX/ MySQL solution is likely to be most helpful to other academic libraries: there are generally students and staff available on many campuses who are capable of programming in Perl and maintaining SQL databases on UNIX servers. Our Perl scripts are available at the URL ww.lib.ndsu .nodak.edu/ kids. References and Notes 1. M . Kilcullen, "Publishing a Newspaper Index on the World Wide Web Using Microsoft Access 97," The Indexer 20, no . 4 (1997): 195-96 . 2. P . Jacso, "Publishing Textual Databases on the Web," Information Today 15, no . 11 (1998): 33, 36 3. N .O . Sturr, "WAIS: An Internet Tool for Full-Text Indexing," Computers in Libraries 15 (June 1995): 52-54. 4. M .W . Scott, Annotated Bibliography of the Geology of North Dakota 1806-1959 North Dakota Geological Survey Miscellaneous Series, no. 49 . (Grand Forks , N .D .: North Dakota Geological Survey , 1972). 5. M . W . Scott , Annotated Bibliography of the Geology of North Dakota 1960-1979 North Dakota Geological Survey Miscellaneous Series, no. 60. (Grand Forks, N.D.: North Dakota Geological Survey, 1981). 6. L. Greenwood and others, Bibliography of the Geology of North Dakota 1980-1993 North Dakota Geological Survey Miscellaneous Series, no. 83. (Bismarck, N .D .: North Dakota Geological Survey, 1996). Related URLs Linux Homepage: www.linux.org/ MySQL Homepage: www.mysql.com/ Perl Homepage: www.perl.com/ Apache Homepage: www.apache.org/ NDSU Forum Index: www.lib.ndsu. nodak.edu/Forum/ Collective Bibliography of North Dakota Geology: www.lib.ndsu.nodak.edu/ ndgs/ COMMUNICATIONS I ENGLAND, JOSEPH, AND SCHLECHT 49 10075 ---- Reproduced with permission of the copyright owner. Further reproduction prohibited without permission. The Internet, the World Wide Web, Library Web Browsers, and Library Web Servers Jian-Zhong, Zhou Information Technology and Libraries; Mar 2000; 19, 1; ProQuest pg. 50 Tutorial The Internet, the World Wide Web, Library Web Browsers, and Library Web Servers Jian-Zhong (Joe) Zhou This article first examines the difference between two very familiar and sometimes synonymous terms, the Internet and the Web. The article then explains the relation- ship between the Web's protocol HTTP and other high-level Internet protocols, such as Telnet and FTP, as well as provides a brief history of Web development. Next, the article analyzes the mechanism in which a Web browser (client) "talks" to a Web server on the Internet. Finally, the article studies the market growth for Web browsers and Web servers between 1993 and 1999. Two statis- tical sources were used in the Web market analysis: a survey conducted by the University of Delaware Libraries for the 122 members of the Association of Research Libraries, and the data for the entire Web industry from different Web survey agencies. Many librarians are now dealing with the Internet and the Web on a daily basis. 
While the Web is sometimes synonymous with the Internet in many people's minds, the two terms are quite distinct, and they refer to different but related concepts in the modern computerized telecommunication system. The Internet is nothing more than many small computer networks that have been wired together and allow electronic information to be sent from one network to the next around the world. A piece of data from Beijing, China may traverse more than a dozen networks while making its way to Washington, D.C.

Joe Zhou (joezhou@udel.edu) is Associate Librarian at the University of Delaware Library, Newark.

We can compare the Internet to the Great Wall of China, which was built in the Qin dynasty around the third century B.C. by connecting many existing short defense walls built by previous feudal states. The Great Wall not only served as a national defense system for ancient China, but also as a fast military communication system. A border alarm was raised by means of smoke signals by day, and beacon fires at night, ignited by burning a mixture of wolf dung, sulfur, and saltpeter. The alarm signal could be relayed over many beacon-fire towers from the western end of the Great Wall to the eastern end (4,500 miles away) within a day. This was considered light speed two thousand years ago. However, while the Great Wall transferred the message in a linear mode, the Internet is a multidimensional network.

The Web is a late-comer to the Internet, one of the many types of high-level data exchange protocols on the Internet. Before the Web, there was Telnet, the traditional command-driven style of interaction. There was FTP, a file transfer protocol useful for retrieving information from large file archives. There was Usenet, a communal bulletin board and news system. There was also e-mail for individual information exchange, and e-mail lists, for one-to-many broadcasts. In addition, there was Gopher, a campus-wide information system shared among universities and research institutions, and WAIS, a powerful search and retrieval system developed by Thinking Machines, Inc. In 1990 Tim Berners-Lee and Robert Cailliau at CERN (www.cern.ch), the European Laboratory for Particle Physics, created a new information system called the "World Wide Web" (WWW). Designed to help the CERN scientists with the increasingly confusing task of exchanging information on the Internet, the Web system was to act as a unifying force, a system that would seamlessly bind all file protocols into a single point of access. Instead of having to invoke different programs to retrieve information via various protocols, users would be able to use a single program, called a "browser," and allow it to handle all the details of retrieving and displaying information. In December 1993 WWW received the IMA award, and in 1995 Berners-Lee and Cailliau received the Association for Computing Machinery (ACM) Software System Award for its development.

The Web is best known for its ability to combine text with graphics and other multimedia on the Internet. In addition, the Web has some other key features that make it stand out from earlier Internet information exchange protocols. Since the Web is a late-comer to the Internet, it has to be backwards compatible with other communications protocols in addition to its native language, HyperText Transfer Protocol (HTTP).
Among the foreign languages spoken by Web browsers are Telnet, FTP, and other high-level communication protocols mentioned earlier. This support for foreign protocols lets people use a single piece of software, the Web browser, to access information without worrying about shifting from protocol to protocol and software incompatibility.

Despite different high-level protocols, including HTTP for the Web, there is one thing in common for all parts of the Internet: TCP/IP, the lower level of the Internet protocol. TCP/IP is responsible for establishing the connection between two computers on the Internet and guarantees that the data can be sent and received intact. The format and content of the data are left for high-level communication protocols to manage, among which the Web is the best known one. At the TCP/IP level all computers "are created equal." Two computers establish a connection and start to communicate. In reality, however, most conversations are asymmetric. The end user's machine (the client) usually sends a short request for information, and the remote machine (the server) answers with a long-winded response. The medium is the Internet. The common language on the Internet can be the Web or any other high-level protocols.

On the Web, the client is the Web browser; it handles the user's request for a document. The first Web browser, NCSA Mosaic, developed by the National Center for Supercomputing Applications (NCSA) at the University of Illinois at Urbana-Champaign, was released in mid-November 1993 for Unix, Windows, and Macintosh platforms. Version 3.0 of NCSA Mosaic is available at www.ncsa.uiuc.edu/SDG/Software/Mosaic. Both source code and binaries are free for academic use. Mosaic lost market share to Netscape after its key developer left NCSA and joined Netscape. Even after Mosaic introduced an innovative 32-bit version in early 1997, which could perform feats that other major browsers had not even thought of back then, Mosaic remained out of the major browsers' market.

The two most widely used browsers today are Microsoft's Internet Explorer (IE) and Netscape's Navigator (part of the Netscape Communicator suite). Recent Web browser surveys conducted by different Internet survey companies such as www.zonaresearch.com/browserstudy, www.psrinc.com/Trends.htm, and www.statmarket.com all indicate that IE is the market leader with more than 60 percent market share, leaving Navigator with between 35 percent and 40 percent. In 1995 IE had only 1 percent share versus Navigator's more than 90 percent, an unimaginable rise critics have attributed to Microsoft's strategy of bundling the browser with its near-monopoly Windows operating system. However, a survey conducted in December 1998 by the University of Delaware Library of 122 members of the Association of Research Libraries (ARL) showed that Netscape still remained the market leader among big academic libraries. More than 90 percent of ARL libraries supported Netscape, and about 50 percent also supported IE. Most ARL libraries supported both browsers, and unlike the browser industry survey mentioned earlier, in which only one product can be picked as the primary browser, the sum of the percentages for the ARL survey was greater than 100 percent.
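The asymmetric client-server conversation described above is easy to observe directly. The following minimal Perl sketch does, in miniature, what a browser does underneath: it opens a TCP/IP connection to port 80 and speaks HTTP. The host name is an assumption for illustration; any public Web server of the period would answer such an HTTP/1.0 request.

#!/usr/bin/perl
# Minimal sketch: open a TCP/IP connection to a Web server and send a
# bare HTTP request, as a browser does underneath. The host name is an
# assumption for illustration.
use strict;
use warnings;
use IO::Socket::INET;

my $host = 'www.example.org';
my $sock = IO::Socket::INET->new(
    PeerAddr => $host,
    PeerPort => 80,
    Proto    => 'tcp',
) or die "Cannot connect to $host: $!";

# The client's short request ...
print $sock "GET / HTTP/1.0\r\nHost: $host\r\n\r\n";

# ... and the server's long-winded response: a status line, headers,
# a blank line, and then the document itself.
print while <$sock>;
close $sock;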
The main function of the Web browser is to request a document available from a specific server through the Internet using the information in the document's URL. The server on a remote machine returns the document, usually physically stored on one of the server's disks. With the use of the Common Gateway Interface (CGI), the documents do not have to be static. Rather, they can be synthesized at the point of being requested by CGI scripts running on the server's side of the connection. In some database-driven Web servers that form the core of today's e-commerce, the documents provided may never exist as physical files but are generated as needed from database records.
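To make the CGI idea concrete, here is a minimal sketch of such a script (a hypothetical example added for illustration, not code from the article; Python is used here, although CGI programs of the period were more often written in Perl or C). The server runs the program for each request, and whatever it writes to standard output, headers first, becomes the returned document:

    #!/usr/bin/env python3
    # Hypothetical CGI script: the Web server executes it per request, and
    # its standard output (headers, a blank line, then the body) becomes
    # the document, so the "page" never exists as a file on disk.
    import datetime

    print("Content-Type: text/html")  # response header
    print()                           # a blank line ends the headers
    print("<html><body>")
    print("<p>Document synthesized at",
          datetime.datetime.now().isoformat(), "</p>")
    print("</body></html>")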
The Web server can be run on almost any computer, and server software is available for almost all operating systems, such as Unix, Windows 95/98/NT, Macintosh, and OS/2. According to the University of Delaware Library's 1998 survey of Internet Web servers among ARL member libraries, more than 32 percent of ARL libraries chose Apache as their Web server software, followed by the Netscape series at 29.32 percent, NCSA HTTPd at 11.28 percent, and Microsoft Internet Information Server (IIS) at 7.52 percent. In July 1999 the author checked the Netcraft survey at www.netcraft.com/Survey. The top three Web server software programs for more than 6.5 million Web sites are Apache (56.35 percent), Microsoft-IIS (22.33 percent), and Netscape (5.65 percent). The Netcraft survey also provides the historical market share information of major Web servers since August 1995.

NCSA HTTPd was the first Web server software released, about the same time as the release of Mosaic in 1993. However, it slipped from the number-one position, with more than 90 percent market share in 1993 and almost 60 percent in 1995, to less than 1 percent in July 1999. Although it is no longer supported by NCSA, HTTPd remains a popular choice for Web servers due to its small size, fast performance, and solid collection of features. The "inertia effect" of the existing sites (if it runs well, why bother to change?) will likely keep NCSA on the major Web server software list for some time. NCSA HTTPd is free, but available only for the Unix platform. It is available from http://hoohoo.ncsa.uiuc.edu. However, when the author visited the site in July 1999, the following message appeared on the main page: "THE NCSA HTTPd IS NO LONGER UNDER DEVELOPMENT. It is an unsupported product. We recommend that you check out the Apache server, instead of installing our server."

Most people who use only Web browsers may have heard of Apache only as an Indian nation or a military helicopter, not the most popular Web server software with more than 50 percent market share. It was first introduced as a set of fixes or "patches" to the NCSA HTTPd. Apache 1.0 was released in December 1995 as open-source server software by a group of webmasters who named themselves the Apache Group. Open-source means the source code is available and freely distributed, and it is the key to Apache's attractiveness and popularity. The Apache Group members were NCSA users who decided to coordinate development work on the server software after NCSA stopped. In July 1999 the Apache Group announced that it was establishing a more formal organization called the Apache Software Foundation (ASF). In the future, the ASF (www.apache.org) will monitor development of the free software, but it will remain a not-for-profit foundation. Apache is high-end, enterprise-level server software and can be run on OS/2, Unix (including Linux), and Windows platforms, but a Mac version is still not available.

The Netscape series includes Netscape-Enterprise, Netscape-FastTrack, Netscape-Commerce, and Netscape-Communication. Enterprise is a high-end, enterprise-level server, while FastTrack serves as an entry-level server for small workgroups. Netscape supports both the Unix and the Windows NT platforms. The other major commercial Web server, Microsoft Internet Information Server (IIS), as of 1999, is only available for the Windows platform. However, one advantage of IIS over Netscape is that it can be downloaded for free as part of the Windows Option Pack. In addition, IIS can handle MS Office documents very well. While both the Microsoft and Netscape brand names are well recognized by millions of end users, a name alone does not necessarily equate to large market share, nor does a deep pocket. Apache remains the top Web server despite intense competition. One of the keys to Apache's success, in addition to its outstanding performance, lies in the open-source code movement and active user support on a wide basis.

The Web server of choice for the Macintosh platform is WebStar. However, due to the limitations of the operating system networking software, the performance of Macintosh-based servers has not been great. WebStar can be downloaded as a free evaluation release from www.starnine.com/webstar.

The Web server market is dynamic and competition intense. There are more than sixty Web server products on the top list (of Web servers with more than one thousand Web sites) as of July 1999, and newcomers are being added frequently.

Acknowledgments

The author thanks Peter Liu, Head of the Systems Department at the University of Delaware Library, for providing the Web survey data of ARL libraries. After this article was submitted, the survey data was published by ARL in 1999 as SPEC Kit 246: Web Page Development and Management. The author also wants to thank his dear wife Min Yang for her technical assistance. Min is Webmaster and System Administrator for the Web site of the A. I. duPont Nemours Foundation and Hospital for Children, http://kidshealth.org.

10076 ---- Pearls. Marmion, Dan. Information Technology and Libraries; Mar 2000; 19, 1.

Pearls

Ed. Note: "Pearls" is a new section that will appear in these pages from time to time. It will be ITAL's own version of the "Top Technology Trends" topic begun by Pat Ensor. These Pearls might be gleaned from a variety of places, but most often will come from discussion lists on the Net. Our first pearl, from Thomas Dowling, appeared on Web4Lib on August 19, 1999 under the subject "Pixel sizes for web pages." He is responding to a query that asked if Web site developers should assume the standard monitor resolution is 640x480 pixels, or something else.
Dan Marmion

From: Thomas Dowling
To: Multiple recipients of list
Sent: Thu, 19 Aug 1999 06:07:08 -0700 (PDT)
Subject: [WEB4LIB] Pixel sizes for web pages

You may want to consult the Web4Lib archive for comments from the last few merry-go-rounds on this topic.
Monitor size in inches is different from monitor size in pixels, which is different from window size in pixels, which is different from the rendered size of a browser's default font. Not only are these four measurements different, they operate almost wholly independently of each other. So a statement like "I have trouble reading text at 600x800" puts the blame in the wrong place.

HTML inherently has no sense of screen or window dimensions. Many Web designers will argue that the only aspects of a page with fixed pixel dimensions should be inline images; such designers typically restrain their use of images so that no single image or horizontal chain of images is wider than, say, 550px (with obvious exceptions for sites like image archives where the main purpose of a page is to display a larger image). Outside of images, find ways to express measurements relative to window size (percentages) or relative to text size (ems).

Users detest horizontal scrolling. In my experience, users with higher screen resolutions and/or larger monitors are less likely to run any application full screen; average window size on a 1280x1024 19" or 21" monitor is very likely to be less than 800px wide. (The browser window I currently have open is 587px wide and 737px high.)

I applaud your decision to support Web access for the visually impaired. Since that entails much, much more than monitor resolution, I trust the people actually writing your pages are familiar with the Web Content Accessibility Guidelines.

It is actually possible to design web sites that are equally usable, even equally beautiful, under a wide range of viewing conditions. Failing to accomplish that completely is understandable; failing to identify it as a goal is not.

My recommendations to your committee would be A) find a starting point that isn't tied up in presentational nitpicking; B) find a design that looks attractive anywhere from 550 to 1550 pixels wide; C) crank up both your workstations' resolution and font size; and D) continue to run your browsers in windows that are approximately 600 to 640 pixels wide.

Thomas Dowling
OhioLINK - Ohio Library and Information Network
tdowling@ohiolink.edu

10077 ---- Book Reviews. Zillner, Tom. Information Technology and Libraries; Mar 2000; 19, 1.

Book Reviews
Tom Zillner, Editor

Information Ecologies: Using Technology With Heart by Bonnie A. Nardi and Vicki L. O'Day. Cambridge: MIT Pr., 1999. 232p. $27.50 (ISBN 0-262-14066-7).

The Media Equation: How People Treat Computers, Television, and New Media Like Real People and Places by Byron Reeves and Clifford Nass. Cambridge: Cambridge Univ. Pr., 1996 and 1999. 305p. $28.95 (ISBN 1-575-86052-X); paper, $15.95 (ISBN 1-575-86053-8).

The books I am reviewing this month are interrelated because they both focus on information technology and our changing world, with the two volumes looking at different levels of the picture. The broader, and to me more intriguing, view is presented by Nardi and O'Day in their wonderful book Information Ecologies. Although it is not clear from the capsule biographies on the dust jacket, Nardi and O'Day are anthropologists who study the world of technology in a number of locales, and they here report the findings from their field work.
Among the case studies they discuss are an examination of the activities of reference librarians at two corporations and a look at a virtual world created for and by elementary school students. But they do much more than simply present case studies, although these alone make the book a worthwhile read. In addition, they argue that the most useful way to look at information technology is through the metaphor of "information ecologies," "system[s] of people, practices, values, and technologies in ... particular local environment[s]." They adopt this biological metaphor after carefully considering the most commonly employed information technology metaphors: technology as tool, text, or system. In turn, they find each of these metaphors wanting.

It is particularly important to choose carefully the metaphorical lenses through which technological developments are viewed. Each particular metaphor has consequences for how sanguinely we view a technology, and it is often worthwhile to use multiple metaphors to enhance our world view. The information ecology metaphor is particularly appropriate for an anthropological view of local "habitats" and their inhabitants and artifacts. In turn, an anthropological view is particularly apt for capturing the human side of technology (thus the subtitle: Using Technology With Heart). This is a side of things that can be overlooked in other metaphorical views, particularly since it requires that the sticky issue of values be considered.

Unfortunately for all of us, there is a reluctance to talk of human values when considering technology. As Nardi and O'Day note, there is a tendency to either enthusiastically applaud new technology without regard to its effects, or to condemn all new technology as inherently debasing to humanity, or to simply resign oneself pessimistically to the inevitable development of technology and our lack of control over it. Nardi and O'Day tend to be cautious optimists, claiming that we can control technology, and the way to exercise that control is through our own local encounters with information ecologies. Thus, rather than bemoaning the dehumanizing effects of the Internet, Information Ecologies explores the successful use of Internet technologies to set up a virtual world for students and the elderly in Phoenix, Arizona. Instead of thinking or acting globally, exploit the technology locally, but do so in a way that makes sense in terms of human values.

On the taxonomic scale of technology views, ranging from gloom and doom (e.g., the views of Clifford Stoll) to perpetual optimism (e.g., Nicholas Negroponte), I place Nardi and O'Day somewhere in the middle, but as I suggested, leaning toward cautious optimism. In fact, they spend several chapters discussing the views of others and offering prescient criticism of the deficiencies of those views. Of particular interest to me was their analysis of the French sociologist Jacques Ellul, who apparently sounded the alarm concerning the stress to mind and soul of constant technological change in 1954, well before the current crop of doomsayers. Nardi and O'Day find Ellul's views, as articulated in The Technological Society, to be compelling. Yet, they claim, the rise of the Internet can counteract the trend that Ellul saw toward monotonous sameness and lack of diversity in the face of technological efficiency. Perhaps so.
One thing I was looking for in Information Ecologies was a set of practical tools for engaging in the kind of exploration of information habitats that Nardi, O'Day, and other anthropologists engage in. There has been a spate of interest lately in the role of anthropologists in the design and deployment of new technologies, and I would like to determine its applicability to my modest software development projects. Unfortunately, I was mainly disappointed on this score. In fairness to the authors, they did not set out to spell out the anthropological methodology of exploring information ecologies in any detail. The purpose of the book is rather to argue that viewing the world of technology as a set of interconnected information ecologies is useful and accurate, and in many cases superior to other metaphorical views. They succeed in this goal. Now I want them to go on to write a book on using anthropological methods in these ecologies without necessarily becoming a professional anthropologist.

Nardi and O'Day do touch extremely briefly on a few conventions of interviewing subjects, with their most important technical discussion centering on what they call "strategic questioning," which they present in the context of evolving information ecologies. They provide useful categories of questions to be asked, and specific examples. Although it may seem obvious to ask penetrating questions of members of an information habitat, this is one area in which software developers in particular fail miserably. Another seemingly obvious pointer is to pay attention. Again, its obviousness is deceptive, since most of us are poor observers who make many assumptions about the characteristics of a work activity without observational evidence.

As evidence that people introducing new technologies to an ecology do not follow these simplest pieces of advice, you can turn to the chapter "A Dysfunctional Ecology" to see how badly technology can fail for nontechnological reasons. This case study deals with a major teaching hospital that introduced a monitoring system into its neurosurgical operating suites that captured instrument readings as well as complete audio and video. The system was installed to aid neurophysiologists, experts who are called in to advise neurosurgeons at key points during complex surgeries to ensure that patient neurological function is not compromised. The neurosurgeons and neurophysiologists at this hospital decided that it would be more efficient for the neurophysiologists to be able to remotely monitor multiple surgeries simultaneously. Both groups failed to consult with the other constituencies among the operating team, the nurses and anesthesiology staff. These groups believed that their privacy was being compromised, particularly since it was possible to tape any procedure at multiple workstations throughout the hospital. I can easily envision similar sorts of problems due to lack of communication in introducing new or modified technology into other milieus, e.g., libraries. Although the consequences might not lead to the potentially life-threatening situations that could arise in an operating suite, there are certainly possible outcomes where service to users could be undermined.
Although the book is not exactly what I (rather selfishly) want, Information Ecologies is a first-rate read and an important starting point for those concerned with better controlling technological change in the world of information.

Turning from an anthropological point of view to a psychological one, The Media Equation offers another important basis for technological design and implementation, particularly of computer software and multimedia. The release last year of a paperback edition of this volume, first published in 1996, provides a convenient pretext for reviewing this work. Reeves and Nass have supervised years of study and experimentation that have consistently demonstrated the truth of what they call the "media equation": that our relations with media, including computers and multimedia, are identical in key ways to our relationships with other human beings. This is true of all of us, even those of us sophisticated enough to understand that we are dealing with devices and human artifacts rather than people.

Reeves and Nass quite entertainingly present the technique they've used over the years to perform their research, on a step-by-step basis:

1. Pick a research finding on how people respond to each other or their environment.
2. Find the summary of the social or natural rule that the study has yielded.
3. Replace the words "person" or "environment" in the summary with media of some sort (television, movies, computers, etc.).
4. Find the research procedure.
5. Substitute media for one of the people or the environment in the procedure.
6. Run the experiment.
7. Draw conclusions.

Although this may sound facetious, it is in fact the recipe that produced the startling conclusions that we all tend to behave toward media much as we do toward other people. What's perhaps more important is that Reeves and Nass point toward techniques that practitioners can use to produce more effective media, including computer software.

As a simple example, consider politeness. Reeves and Nass discovered that people treated computers with the same sort of politeness that they would other human beings, and in turn Reeves and Nass suggest that people respond better to "polite media." They then provide some fairly straightforward advice on producing polite computer programs, starting with Grice's Maxims, a set of politeness rules assembled by H. Paul Grice, a philosopher and psychologist. These center around truth telling, appropriate quantity of information (neither too much nor too little), relevance, and clarity. All of this is fairly unsurprising, but the authors spell out just how the maxims can be applied to the construction of computer programs. Further, they go on to suggest some rules of thumb of their own. For example, some computer programs produce verbal output but expect the user to key in his or her responses. This may be perceived by the user, possibly subconsciously, as forcing an impolite response, since mixing communications modalities is a faux pas. Thus, they suggest that if text input is required, perhaps only text output should be supplied.

This should provide you with some of the flavor of The Media Equation, and in turn you may be able to see a set of potential ethical dilemmas that can arise from utilizing techniques that result from the research of Reeves and Nass.
This set of problems can be seen most clearly in the chapter "Subliminal Images," where they discuss how subliminal messages could be inserted into new media to advertise products or to attempt to bolster employee morale. In fact, they say, "... it might be easier to accomplish subliminal intrusions with a computer than with a television, because software can respond to the particular input of individual users and timing is more precise." They immediately temper this insight with the caution that "... ethical and legal issues abound." Indeed.

Although some of the techniques that can be applied to new media do lead to ethical problems, I think that most of what Reeves and Nass talk about are just elements of good design. Subliminal suggestion seems to most of us to be out of bounds because it unfairly manipulates user response in a powerful way. The unfairness is that someone can be manipulated without his or her knowledge to do something outside of the person's normal behavior. Although the other techniques tend to subtly alter behavior, they don't generally result in an anomalous action by the user. If you think this is a kind of philosophical hairsplitting, you're right. The onus is upon the programmer or multimedia designer to use these techniques with great care.

In a past professional life I wrote computerized patient interviews for the psychiatry department of the University of Wisconsin. Researchers there and elsewhere found that people were generally more candid with the computer than they were with human clinicians. So the findings of Reeves and Nass were not quite as surprising to me as they might be to others. What did surprise me, however, is that the media equation is not a phenomenon solely of naïve or inexperienced media and computer users. On the contrary, all of us, no matter how conversant we are with the underlying technology, are susceptible to the effects described in The Media Equation. This vastly increases the power of computer programs and other media for both good and ill.

I want to emphasize that not all of the possible effects of human-media interaction are pernicious. Most are simply innocuous, and if techniques that benefit users can result from these effects there should be no harm in applying them in software or multimedia. In general, it's desirable to make user experiences of software and media pleasanter and more productive, and Reeves and Nass do an excellent job of providing pointers throughout the book. There are suggestions with regard to personality, emotion (including arousal), social roles, and form (e.g., image size, fidelity of sound, and video). None of them comes close to being as controversial as subliminal suggestion, although it continues to make me uncomfortable that people react to media as if they were dealing directly with other human beings. This is a disquieting finding, but it should not dissuade us from our jobs of designing good systems for users.

All in all, Information Ecologies and The Media Equation are both first-rate books that belong in our libraries and on our professional bookshelves. Both provide methodologies and techniques for making user interactions with automated systems a better experience, both in terms of accomplishing tasks efficiently and in terms of user satisfaction.
Tom Zillner
10078 ---- Editorial: I Inhaled. Helmer, John F. Information Technology and Libraries; Jun 2000; 19, 2.

Editorial: I Inhaled
John F. Helmer (jhelmer@darkwing.uoregon.edu) is Executive Director, Orbis Library Consortium.

This editorial introduces the third special issue of Information Technology and Libraries dedicated to library consortia, and the second primarily aimed at surveying consortial activities outside the United States.1 The concept of a special consortial issue began in 1997 as an outgrowth of a sporadic and wide-ranging discussion with Jim Kopp, editor of ITAL 1996-98. At the time, Jim and I were involved in the creation and maturation of the Orbis consortium in Oregon and Washington. Jim was a member and later chair of the governing council, and I was chief volunteer staff person, finding myself increasingly absorbed by consortial work. Our discussions lasted more than a year and were sustained by many e-mail messages and several enjoyable conversations over bottles of Nut Brown Ale.

In the mid-1990s it seemed obvious that we were witnessing the beginning of a renaissance in library consortia. Consortia had been around for many years, but now established groups were showing renewed vigor and new groups seemed to be forming every day. Why was this happening? What were all these consortia doing? Jim and I discussed these questions and speculated on future roles for library consortia and their impact on member libraries. Library consortia seemed an ideal topic for a special issue of ITAL.

My initial goal as guest editor of ITAL was to take a snapshot of a variety of consortia and begin to better understand the implications of the explosive growth we were witnessing. While assembling the March 1998 issue I soon realized that consortia were all over the map, both figuratively and literally. A small amount of study revealed a tremendous variety of consortia and a truly worldwide distribution. Although American consortia were starting to receive attention in the professional literature, a great deal of important work was occurring abroad. This realization gave rise to the September 1999 issue and the present issue dedicated to consortia from around the world.

In addition to six articles from the United States, these three special issues of ITAL include contributions from South Africa, Canada, Israel, Spain, Australia, Brazil, China, Italy, Micronesia, and the United Kingdom. Taken together these groups represent a dizzying array of organizing principles, membership models, governance structures, and funding models. Although most are geographically defined, the type of library they serve also defines many. Virtually all license electronic resources for their membership, but many offer a wide variety of other services, including shared catalogs, union catalogs, patron-initiated borrowing systems, authentication systems, cooperative collection development, digitizing, instruction, preservation, courier systems, and shared human resources.

Each consortium is formed by unique political and cultural circumstances, but a few themes are common to all. It is clear that the technology of the Web, the increasing importance of electronic resources, and advances in resource-sharing systems have created new opportunities for consortia. Beyond these technological and economic motivations, I believe that in consortia we see the librarian's instinct for collaboration being brought to bear at a time of great uncertainty and rapid change.
Librarians often forget that as a profession we collaborate and cooperate with an ease seldom seen in other endeavors. There is safety in numbers, and in uncertain times it helps to confer with others, spread risk over a larger group, and speak with a collective voice. Library consortia fulfill these functions very well, and their future continues to look bright.

As I conclude my duties as guest editor I would like to thank Jim Kopp for sparking my interest in this project and for several years of stimulating conversation. Special thanks are due to managing editors Ann Jones and Judith Carter as well as the helpful and professional staff at ALA Production Services. Obstacles of language and time differences make composing and editing a publication such as this unusually challenging. The quality and cohesiveness of these issues of ITAL are due in large measure to the efforts of these individuals.

In Inhaling the Spore, the editorial introduction to the first special consortial issue, I compared a librarian's involvement in consortia to the Cameroonian stink ant's inhalation of a contagious spore. The effect of this spore is featured in Mr. Wilson's Cabinet of Wonder, Lawrence Weschler's remarkable history of the Museum of Jurassic Technology.2 Weschler explains that, once inhaled, the spore lodges in the brain and "immediately begins to grow, quickly fomenting bizarre behavioral changes in its ant host." Although the concept of a consortial spore is somewhat extreme (or "icky" according to my nine-year-old daughter), the editorial was an accurate reflection of my own sense of being inexorably drawn into a consortium, drawn not so much against my will as a willing, crazed participant.
At the time I was nominally working for the University of Oregon Library System and vainly trying to keep consortial work in perspective. By the time of my second editorial, Epidemiology of the Consortial Spore, I was exploring consortia around the world but still laboring under the illusion that I could keep my own consortium at arm's length. I must have failed since, as of this writing, I have left my position at the UO and now serve as the executive director of the Orbis Library Consortium. Like the Cameroonian stink ant, I have inhaled the spore and am now happily laboring under its influence.

References and Notes

1. See ITAL 17, no. 1 (Mar. 1998) and ITAL 18, no. 3 (Sept. 1999).
2. Lawrence Weschler, Mr. Wilson's Cabinet of Wonder (New York: Vintage Books, 1995). The Museum of Jurassic Technology (www.mjt.org) is located in Culver City, Calif. See www.mjt.org/exhibits/stinkant.html for more on the Cameroonian stink ant.

10079 ---- Electronic Library for Scientific Journals: Consortium Project in Brazil. Krzyzanowski, Rosaly Favero; Taruhn, Rosane. Information Technology and Libraries; Jun 2000; 19, 2.

Electronic Library for Scientific Journals: Consortium Project in Brazil
Rosaly Favero Krzyzanowski and Rosane Taruhn
Making these enormous archives available in a clear and organized manner by using the proper technology is currently the greatest challenge for all those involved in knowledge manage- ment-the production , organization, and transmission of information. Rosaly Favero Krzyzanowski Rosane Taruhn I The Advent and Implications of Electronic Publications Among the major contributions of the industrial era, out- standing are the evolution and growth of information publi shing and printing facilities that use tools to record, store, and distribute information. In the last ten years, the first steps were taken toward the storage and reproduc- tion of sounds and images in new multimedia formats. Technological advances also have brought new pos- sibilities in accessing and disseminating information . Electronic publishing has been particularly effective in accelerating access and contributing to the generation of additional knowledge; consequently, an exponential increase in data has taken place, most notably in the sec- ond half of the twentieth century. Current journals num- bered about 10,000 at the beginning of the century; by the year 2000 the number had reached an estimated 1 million. 2 As a result, specialized literature has been warning about a possible crisis in the traditional system of scien- tific publications on paper . In addition to the difficulty of financing the publication of these works, the prices of subscriptions to scientific periodicals on paper have been rising every year. At times, this makes it impracticable to update collections in all libraries, which interferes sub- stantially in development. On the other hand, access to electronic scientific pub- lications via Internet is proving to be an alternative for maintaining these collections at lower cost. It also pro- vides greater agility in publishing and distributing the periodical, and in the final user's accessing of the infor- mation. Due to this, it is important that institutions that wish to support and promote research developed by their scientific communities facilitate access to these publica- tions on electronic media . To paraphrase Line, we can say that although pub- lishers are still uncertain as to all the aspects of transmit- ting information electronically, because authors and institutions will be increasingly able to distribute their works on the Web without the direct involvements of publishers, there is an escalation in electronic publica- tions being published by scientific publishers.3 Rosaly Favero Krzyzanowski is Technical Director of the Integrated Library System of the University of Sao Paulo- SIBi/USP, Brazil. Rosane Taruhn is Director of the Development and Maintenance of Holdings Service of the Technical Department of the University of Sao Paulo-SIBi/USP, Brazil. ELECTRONIC LIBRARY FOR SCIENTIFIC JOURNALS I KRZYZANOWSKI ANDTARUHN 61 ! Reproduced with permission of the copyright owner. Further reproduction prohibited without permission. Physical Figure 1. Infrastructure Resources for Consortium Formation Line also savs that one of the reasons for the growth in the number o'f electronic publications is "that it is tech- nically possible to make them [journals] accessible in this way, and in fact easy and cheap, since nearly all te_xt ~oes through a digital version on the way to pubh~ahon. Secondly, journal publishers believe that electronic ve~- sions provide a second market in addition to that for t~eir printed versions, or at least in an expanded market, since many users will be the same." 4 . . . 
. . It is important to point out that the sC1enhhc penod1- cal, be it paper or electronic, must ensure market valu_e and academic community receptivity, have a staff quali- fied for scientific publishing, be consistent in publishing release dates, comply with international standards, and use established distribution and sales mechanisms. 5 Line goes further: "Electronic publication as an_ 'extra' to printed publication has few added costs of J~urnal publication other than those of printing, and pubhshe~s are not going to want to make less money fro~ elect~onic journals than they do from printed ones. While p~inted journals once acquired can be used and reused without extra cost, each access to an electronic article has to be paid for. And although the costs of storage and binding may be saved, these are offset by the costs of printing out."6 He then notes that this technology demands an active equipment and telecommunication infrastructure. Another point he addresses is the need for users to master the search strategies required to efficiently recover information, thus reducing the time spent and costs. In turn, Saunders points out that, depending on the contracts made with the publishers or their agents: 62 INFORMATION TECHNOLOGY AND LIBRARIES I JUNE 2000 libraries, through their development, formation, and maintenance policies, should be receptive to this tran- sition by accommodating the different means of com- munication to the different user needs and striving for a new balance. These policies should certainly stress the cooperation and sharing of remote access to the information demanded. Budget estimates should, therefore, foresee, in addition to the subscriptions to electronic titles with complete texts, other possible items like licensing rates for multi-user remote access and the right to copy articles on electronic media to paper, depending on the contracts made with the pub- lishers or their agents.7 I Electronic Publication Consortiums Catering to mutual interests by setting up a library con- sortium to select, acquire, maintain, and preserve elec- tronic information is one means of reducing or sharing costs as well as expanding the universe of information available to users and ensuring a successful outcome. Resources-physical, human, financial, and elec- tronic-are combined for the common good; in this case, the consortium, as shown in figure 1, which was extracted and adapted from an OCLC Institute. 8 The consortium presupposes invigoration of coopera- tive activities among member libraries by promoting the central administration of electronic publication databases as part of a shared library system visible to all and replete with access facilities. In addition to putting in place sim- plified, reciprocal lending progra~s and spu_rring _the cooperative development of collections and the~r st~nng, the consortium has the objective of implementing infor- mation distribution by electronic means, provided that copyright and fair use rights are complied _wi~h.9 On t~e other hand, "the research library community is commit- ted to working with publishers and database producers to develop model agreements that deploy lice~ses that d? not contract around fair use or other copynght provi- sions. In this way, one seeks to insure the library practices being disseminated, especially interli?~ary lendi~g."'. 
0 Experience shows that acqumng ~ubhcahons through consortia has brought great benefits and has equally favored different size institutions that would not be able to afford single subscriptions, whether on paper or in electronic format. North American and European universities have been opting for this type of alliance to augment inve~tment cost-benefit. Important examples of these consortia cur- rently operative are: • Washington Research Library Consortium, Washington, D.C., www.wric.org; Reproduced with permission of the copyright owner. Further reproduction prohibited without permission. • University System of Georgia, Galileo Project, www.galileo.peachnet.edu; • Committee on Institutional Cooperation, Mich- igan, www.cedar.cic.net/ cic; and • Ohio Library and Information Network, Ohio Link, www.ohiolink.edu. I The Electronic Consortium in the State of Sao Paulo Considering that Brazilian institutions also are being affected by the high cost of maintaining periodical collec- tions and that alternative means of distributing this infor- mation are available, the model used abroad has shown itself as appropriate for developing the International Scientific Publications Electronic Library in the state of Sao Paulo. The location has a favorable information infra- structure available, particularly that of the electronic net- work of the Academic Network of Sao Paulo (ANSP), thanks to the support of the Research Support Foundation of the State of Sao Paulo (FAPESP). 11 Growing user demand for direct, convenient access to information in the state of Sao Paulo also was a factor in location choice. The final decision was to compose the consortium of five Sao Paulo state universities- Universidade de Sao Paulo (USP), Universidade Estadual Paulista (UNESP), Universidade de Campinas (UNI- CAMP), Universidade Federal de Sao Carlos (UFSCAR), and Universidade Federal de Sao Paulo (UNIFESP)-as well as the Latin American and Caribbean Center for Health Science Information (BIREME). The consortium's goal was to make available to the member institutions' entire scientific community-10,492 faculty and researchers -rapid access to the complete, updated texts of the Elsevier Science scientific journals. This publishing house, an umbrella for North Holland, Pergamon Press, Butterworth-Einemann, and Excerpta Medica, presently publishes electronic versions of its journals. Selection of the member institutions that would serve as a pilot group for this project was based on prior expe- rience with the cooperative work in preparing the Unibibli Collective Catalog CD-ROM, which, using Bireme/OPAS/OMS technology, consolidates the collec- tions of these three universities. The project was initially funded by the FAPESP; since its fourth edition the CD- ROM has been published through funds provided by the universities themselves, by means of a signed agreement. Moreover, choice of Elsevier Science, which would be justified solely by its premier ranking in the global pub- lishing market, also is due to the fact that consortium member institutions maintain subscriptions to a great number (606) of this publishing house's titles on paper. Already fully available on electronic media, these titles are components of a representative collection initiating the building of the International Scientific Publications Electronic Library in the state of Sao Paulo. 
Furthermore, the majority of the titles are covered by the Institute for Scientific Information's Web of Science site, which has been at the disposal of researchers and libraries in the state of São Paulo since 1998.

Consortium Objectives

The consortium was formed to contribute to the development of research through the acquisition of electronic publications for the state of São Paulo's scientific community. Using the ANSP Network, in addition to augmenting and speeding up access to current scientific information in all the member institutions, will:

• increase the cost-benefit per subscription;
• promote the rational use of funds;
• ensure continuous subscription to these periodicals;
• increase the universe of publications available to users through collection sharing;
• guarantee local storage of the information acquired and thus ensure the collection's maintenance and its continual use by present and future researchers; and
• develop the technical capabilities of the personnel of the state of São Paulo institutions in operating and using electronic publication databases.

Initially, the project will not interfere in the current process of acquiring periodicals on paper and in distributing collections in member institutions. However, as electronic collection utilization becomes predominant, duplicate subscriptions on paper may be eliminated so as to allow new subscriptions to be available to the consortium at no additional cost.

Implementation of the Electronic Library for International Scientific Publications

Implementation of this project includes the following stages already achieved:

• constitution of the consortium by the six member institutions; and
• setup of an administrative board.

The following stages are in progress:

• purchase of hardware (central server) and a software manager; and
• estimate for the installation of the operational system.

And the following stages are planned:

• training for qualified personnel and maintenance of the infrastructure built up;
• acquisition and implementation of the electronic library on the central server; and
• permanent utilization assessment.

The pilot project proposes that the central server, for storage and availability of electronic scientific periodical collections on the ANSP network, be located at FAPESP in order to facilitate development of an electronic bank. In the future, the bank should, in addition to the collection in mind for the project, include international collections of other publishing houses: the SciELO collection of Brazilian scientific journals (Project FAPESP/BIREME) as well as the Web of Science and Current Contents Connect reference databases (see figure 2).

[Figure 2. Reference database and full-text interconnectivity to optimize information access. The figure shows the BIREME and FAPESP servers linking full-text and reference databases to users in consortium institutions: Web of Science (8,000 titles), Current Contents Connect (9,000 titles), SciELO, the Scientific Electronic Library Online (100 titles), and the International Scientific Periodical Electronic Library (606 titles).]
Consortium Management

The electronic library will be administered by the consortium's administrative board, made up of a general coordinator, an operations coordinator, and the directors and coordinators of the library systems and central libraries of member institutions, as well as consultants recommended by FAPESP.

The administrative board shall be in charge of the implementation, operation, dissemination, and assessment of electronic library utilization. It also is charged with supervising the training of qualified personnel in order to guarantee the success of the project. An agreement specifying the consortium's objective, its constitution, the manner in which it shall be executed, and the obligations of consortium members was signed. Shortly, a contract to use Elsevier Science electronic publications shall be signed by FAPESP and by the provider. The agreement's documents and use license were drawn up in compliance with the principles for licensing electronic resources recommended by the American Library Association, published in final version at the 1997 American Library Association Annual Conference.12

Recovery System and Information Use Evaluation

Research on electronic media suggests that use of a single software program offering different strategies and forms of interaction for searching the collections requires an evaluation of the efficiency of individual research strategies. This evaluation is critical for the preparation of guidelines that orient the choice of systems and proper training programs.13

For the electronic library, the challenge of measuring not only the amount of file use but also the efficacy and efficiency of its information access systems and training for its users is an imperative task. In the project described, evaluation shall be made by indicators that demonstrate use of the electronic library and of the collections on paper, per journal title, subject researched, user institution, number of accesses per day, and user satisfaction regarding the service provided (interface, response time, text copies), among other factors to be studied.
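Indicators of this kind can be computed mechanically from server access logs. The sketch below is an editorial illustration under stated assumptions: the tab-separated log format, the field order, and the sample records are all hypothetical, not those of any system described in this article.

    from collections import Counter

    # Hypothetical log records: "date<TAB>institution<TAB>journal title"
    log_lines = [
        "1999-08-02\tUSP\tBrain Research",
        "1999-08-02\tUNICAMP\tBrain Research",
        "1999-08-03\tUSP\tTetrahedron",
    ]

    accesses_per_title = Counter()
    accesses_per_day = Counter()
    accesses_per_institution = Counter()
    for line in log_lines:
        day, institution, title = line.split("\t")
        accesses_per_title[title] += 1
        accesses_per_day[day] += 1
        accesses_per_institution[institution] += 1

    print("Accesses per journal title:", dict(accesses_per_title))
    print("Accesses per day:", dict(accesses_per_day))
    print("Accesses per institution:", dict(accesses_per_institution))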
Final Remarks

The way in which electronic media are read by users is a code far beyond the written, because sound and image are being added increasingly. In this first generation of electronic publications, FAPESP supported the availability of the Web of Science and of SciELO, and the creation of the International Scientific Publications Electronic Library in the state of São Paulo. The possible introduction of Current Contents Connect will trigger an extraordinary leap in research development, facilitating the access of scientific information and the acquisition and transmission of human knowledge as well as enhancing the cooperative and sharing enterprise of member libraries.

References and Notes

1. Annual Report of the Board of Regents of the Smithsonian Institution ... During the Year 1851 (Washington, D.C., 1852), 22.
2. Leo Wieers, "A Vision of the Library of the Future," in Developing the Library of the Future: The Tilburg Experience, H. Geleijnse and C. Grootaers, eds. (Tilburg, The Netherlands: Tilburg Univ., 1994), 1-11.
3. M. B. Line, "The Case for Retaining Printed LIS Journals," IFLA Journal 24, no. 1 (Oct./Nov. 1998): 15-19.
4. Ibid.
5. R. F. Krzyzanowski, "Administração de Revistas Científicas," in Reunião Anual da Sociedade de Pesquisa Odontológica, Águas de São Pedro, 14, 1997 (lecture).
6. Line, "The Case for Retaining Printed LIS Journals."
7. L. M. Saunders, "Transforming Acquisitions to Support Virtual Libraries," Information Technology and Libraries 14, no. 1 (Mar. 1995): 41-46.
8. OCLC Institute, OCLC Institute Seminar: Information Technology Trends for the Global Library Community, 1997, Ohio (Dublin, Ohio: OCLC Institute/The Andrew W. Mellon Foundation/Fundação Getúlio Vargas/Bibliodata Library Network, 1997).
9. A definition of fair use is the "legal use of information: permission to reproduce texts for the purposes of teaching, study, commentary or other specific social purposes." Found in J. S. D. O'Connor, "Intellectual Property: An Association of Research Libraries Statement of Principles." Accessed July 28, 1999, http://arl.cni.org/scomm/copyright/principles.html.
10. Statement of Current Perspective and Preferred Practices for the Selection and Purchase of Electronic Information. ICOLC Statement on Electronic Information. Accessed July 2, 1998, www.library.yale.edu/consortia/statement.html.
11. R. F. Krzyzanowski and others, Biblioteca Eletrônica de Publicações Científicas Internacionais para as Universidades e Institutos de Pesquisa do Estado de São Paulo. São Paulo, 1998 (project presented to FAPESP, Fundação de Amparo à Pesquisa do Estado de São Paulo).
12. B. E. C. Schottlaender, "The Development of National Principles to Guide Librarians in Licensing Electronic Resources," Library Acquisitions: Practice and Theory 22, no. 1 (Spring 1998): 49-54.
13. W. S. Lang and M. Grigsby, "Statistics for Measuring the Efficiency of Electronic Information Retrieval," Journal of the American Society for Information Science 47, no. 2 (Feb. 1996): 159-66.

10080 ---- China Academic Library and Information System: An Academic Library Consortium in China. Dai, Longji; Chen, Ling; Zhang, Hongyang. Information Technology and Libraries; Jun 2000; 19, 2.

China Academic Library and Information System: An Academic Library Consortium in China
Longji Dai, Ling Chen, and Hongyang Zhang

Since its inception in 1998, the China Academic Library and Information System (CALIS) has become the most important academic library consortium in China. CALIS is centrally funded and organized in a tiered structure. It currently consists of thirteen management or information centers and seventy member libraries serving 700,000 students. After more than a year of development in information infrastructure, a CALIS resource-sharing network is gradually taking shape.

Like their counterparts in other countries, academic libraries in China are facing such thorny problems as shrinking budgets, growing patron demands, and rising costs for purchasing books and subscribing to periodicals. It has thus become increasingly difficult for a single library to serve its patrons to their satisfaction. Under these circumstances, the idea of resource sharing among academic libraries was born. Library consortia provide an organizational form for libraries to share their resources.
The Georgia Library Learning Online (GALILEO), the Virtual Library of Virginia (VIVA), and OhioLINK are among the well-known library consortia in the United States.1 Traditionally, the primary purpose of establishing a library consortium is to share physical resources such as books and periodicals among members. More recently, however, advances in computer, information, and telecommunication technologies have dramatically revolutionized the way in which information is acquired, stored, accessed, and transferred. Sharing electronic resources has rapidly become another important goal for library consortia.

What Is CALIS?

In May 1998, as one of the two public service systems in "Project 211," the China Academic Library and Information System (CALIS) project was approved by the State Development and Planning Commission of China after a two-year feasibility study by experts from academic libraries across the country.

CALIS is a nationwide academic library consortium. Funded primarily by the Chinese government, it is intended to serve multiple resource-sharing functions among the participating libraries, including online searching, interlibrary loan, document delivery, and coordinated purchasing and cataloguing, by digitizing resources and developing an information service network.

Longji Dai is Director, Peking University Library, and Deputy Director, CALIS Administrative Center; Ling Chen is Deputy Director, CALIS Administrative Center; and Hongyang Zhang is Deputy Director, Reference Department, Peking University Library.

Structure and Management of CALIS

A library consortium is an alliance formed by member libraries on a voluntary basis to facilitate resource sharing in pursuit of common interests. Whether a consortium can operate successfully depends in large part on how it is managed.

CALIS differs from library consortia in the United States in that it is a national network. It resembles multistate consortia in the United States with respect to the geographic distribution of member libraries, but it is like tightly knit or even centrally funded statewide ones in terms of management.2

The CALIS members are distributed in twenty-seven provinces, cities, and autonomous regions in China, making an entirely centralized management difficult. After surveying some of the major library consortia in the United States, Europe, and Russia, CALIS adopted an organizational mode characterized by a combination of both centralized and localized management, that is, a three-tiered structure (figure 1).

In order to improve management efficiency and maximize the sharing of various resources including funds, CALIS has established a coordination and management network comprising one national administrative center (which also serves as the North Regional Center), five national information centers (see table 1), and seven regional information centers (see table 2). The thirteen centers are maintained by full-time staff members provided by the libraries in which these centers are located.

The National Administrative Center (located in Peking University), overseen by officials from the concerned office at the Ministry of Education and the presidents of Peking and Tsinghua Universities and advised by an advisory committee consisting of experts from major member libraries, is responsible for the construction and management of CALIS, makes policies and regulations, and prepares resource-sharing agreements.
The center has an office handling routine management needs and several specialized work groups overseeing CALIS' national projects, such as those for the development of databases for union catalogues, current Chinese periodicals, and CALIS' service software.

Under the guidance of the National Administrative Center, five national information centers are each responsible for building and maintaining an information system in one of five general areas-humanities, social science, and science; engineering and technology; agriculture and forestry; medicine; and national defense-in coordination with regional centers and member libraries. The host libraries where these centers are located possess relatively abundant collections in their respective areas. These centers, which are intended to be information bases that cover all major disciplines of science, are responsible for importing databases for sharing, for constructing resource-sharing networks among member libraries, and for providing searching and document delivery services to member libraries.

Depending on their location, academic libraries in China are divided into eight groups, with each forming a regional library consortium. Each regional consortium is overseen by a regional management center, except for the consortium in the north, which is directly managed by the national management center. The regional centers not only participate in nationwide projects in coordination with the national centers and other regional centers, but they also are responsible for promoting cooperation among libraries in their particular regions.

All the centers are located in member universities and staffed by the host universities. The concerned vice president or library director of a host university is in charge of the associated center. The regional centers also are assisted by regional coordination committees and advisory committees of provincial and municipal officials in charge of education; university presidents; library directors; and senior librarians in the concerned regions.

[Figure 1. The Three-Tiered Structure of CALIS: five national information centers, eight regional information centers, and seventy member libraries.]

Table 1. Five National Information Centers

Area of specialization                     Location
Humanities, Social Science and Science     Peking University, Beijing
Engineering and Technology                 Tsinghua University, Beijing
Agriculture and Forestry                   China Agricultural University, Beijing
Medicine                                   Beijing Medical University, Beijing
National Defense                           Haerbin Industrial University, Haerbin, Heilongjiang

Table 2. Regional Information Centers and Areas of Their Jurisdiction

Name                               Location    Areas of jurisdiction
National Administrative Center     Beijing     Beijing, Tianjin, Hebei, Shanxi, and Inner Mongolia
Southeast (South) Regional Center  Shanghai    Shanghai, Zhejiang, Fujian, and Jiangxi
Southeast (North) Regional Center  Nanjing     Jiangsu, Anhui, and Shandong
Central Regional Center            Wuhan       Hubei, Hunan, and Henan
South Regional Center              Guangzhou   Guangdong, Hainan, and Guangxi
Southwest Regional Center          Chengdu     Sichuan, Chongqing, Yunnan, and Guizhou
Northwest Regional Center          Xi'an       Shaanxi, Gansu, Ningxia, and Xinjiang
Northeast Regional Center          Jilin       Jilin, Liaoning, and Heilongjiang
These committees serve a coordinating role in the regions.

Funding

The development and operation of CALIS has been funded in large part by the Chinese government. The sources of funding for CALIS at the present time are as follows:

• Government grants. Much of the funds for the CALIS project during the first phase of construction came from the government. Because of the demonstrated benefits of the ongoing project, it is expected that the government will provide funds for the second phase of CALIS construction. These government funds have been used in the purchase of software and hardware for the CALIS centers and commercial databases, development of service software and databases, training of staff members, etc.
• Local matching funds. According to prior agreements, a province or city that desires to have a regional center is required to provide funds in supplementation to the government funds for the construction of its local center.
• Member library funds. These funds, primarily derived from the university budgets, have been used to purchase electronic resources and cover the expenses incurred from the use of the CALIS service software platforms.

Although CALIS is currently funded by the government, the future expansion and operation of the system is expected to rely in large part on other sources of funds. The funding needs for CALIS may be met by operating the system in a commercial mode.

Principles for Cooperation among Members

The successful operation of a library consortium clearly depends on good working relationships among members and between members and the consortium. At CALIS, all members are required to adhere to a set of principles (see below) in dealing with these relationships. It is based on these principles, known as the CALIS Principles for Cooperation among Members, that CALIS policies and rules are made.

• The common interests of CALIS are above those of individual member libraries.
• Member libraries should not cooperate at the expense of the interests of others.
• CALIS provides services to member libraries for no profit.
• Member libraries are all equal and enjoy the same privileges.
• Larger member libraries are obliged to make more contributions.

What Has Been Achieved?

When it was first established, CALIS had sixty-one member libraries from major universities participating in "Project 211." Later, as many other major universities were interested in joining the alliance, the number of CALIS members has climbed to seventy. At present, CALIS serves about 700,000 students.

Construction of CALIS is a long-term, strategic undertaking. The system provides service functions as they become available and is constantly being improved in the process. In the first phase (1998 to 2000) of the project, CALIS successfully started the following information-sharing functions in its member libraries:

• primary and secondary data searching;
• interlibrary borrowing and lending;
• document delivery;
• coordinated purchasing; and
• online cataloguing.

The following tasks have been completed:

• purchase of computer hardware (e.g., Sun E4500);
• construction of a CERNET- or Internet-based information-sharing network connecting academic libraries across the country; and
• group purchase of databases, such as UMI, EBSCO, EI Village, INSPEC, Elsevier, and Web of Science, that are shared among member libraries either directly online or indirectly through requested service/document delivery.
CALIS also has completed development of a number of databases, including:

• Union Catalogues. These databases currently contain 500,000 bibliographic records of the Chinese and Western language books and periodicals in all member libraries.
• Dissertation Abstracts and Conference Proceedings. These databases now contain abstracts of doctoral dissertations (12,000 bibliographic records) and proceedings of national and international conferences (8,000 records) collected from more than thirty member libraries. The databases are expected to have 40,000 records in total by the end of 2000.
• Current Chinese Periodicals. These databases (5,000 titles, 1.5 million bibliographic records) contain contents and indexes of current Chinese periodicals from about thirty member libraries.
• Key Disciplines Databases. CALIS has sponsored the development of twenty-five discipline-specific databases by member libraries. Each of these databases contains about 50,000 to 100,000 records.

The first three classes of databases are prepared in the USMARC, UNIMARC, or CCFC format for ease of use by patrons and cataloguing staff and in data exchange. Clients from member libraries may perform a Web-based search of the above databases, most of which contain secondary documents and abstracts, and may access CALIS online resources using browsers.

Development of software platforms includes the following:

• Cooperative online cataloguing systems. The systems include Z39.50-based search and uploading servers and terminal software platforms for cataloguing staff. Acquisition and cataloguing staff in each member library may participate in cooperative online cataloguing using the terminal software platforms on their local system. The systems have been used for the development and operation of the union catalogue databases.
• Systems for database development. These systems can be used in the development of shared databases containing secondary data information in USMARC, UNIMARC, CCFC, or Dublin Core format. The systems for database development in the USMARC, UNIMARC, or CCFC formats are equipped with a search server based on the Z39.50 protocol to permit use by cataloguing staff and for data exchange.
• An interlibrary loan system. The system, developed based on the ISO 10160/10161 protocol, consists of ILL protocol machines and client terminals. These systems, located in member libraries, are interconnected to form a CALIS interlibrary loan network. Primary document delivery software based on the FTP protocol also has been developed for the delivery of scanned documents between libraries (a minimal sketch of this idea follows this list).
• An OPAC system. The system has both Web/Z39.50 and Web/ILL gateways. Patrons may visit the system using common browsers, search all CALIS databases, and send search results directly to the CALIS interlibrary loan service. Patrons also may access an ILL server through Web/ILL, tracking the status of submitted interlibrary loan requests, inquiring about fees, and so on.
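The FTP-based delivery step lends itself to a brief illustration. What follows is a minimal sketch only, not CALIS's actual software: the host, credentials, directory, and file name are all hypothetical, and Python's standard ftplib stands in for whatever client CALIS built.

```python
from ftplib import FTP
from pathlib import Path

def deliver_scanned_document(host: str, user: str, password: str,
                             local_file: Path, remote_dir: str) -> None:
    """Upload one scanned document to a partner library's FTP drop box."""
    with FTP(host) as ftp:           # connect on the standard FTP port
        ftp.login(user=user, passwd=password)
        ftp.cwd(remote_dir)          # change to the agreed delivery directory
        with open(local_file, "rb") as fh:
            # STOR transfers the file in binary mode under its original name
            ftp.storbinary(f"STOR {local_file.name}", fh)

# Hypothetical usage: send a scanned article to a partner's server.
# deliver_scanned_document("ftp.partner-library.example", "ill", "secret",
#                          Path("request-4711.tif"), "/incoming/ill")
```

In practice such a transfer would be wrapped with request tracking on both the sending and receiving ends, which is what ties delivery back to the ILL system described above.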
The databases that are centrally located and those that are distributed at various locations, as well as the service platforms in member libraries, form a CALIS information service network.

Future Considerations

In a period of just over a year, considerable progress has been made in forming a nationwide resource-sharing library consortium in China. However, because member libraries vary in size, available funds, staff quality, and automation level, CALIS has yet to realize its potential. There are a number of problems that remain to be solved. For example, the CALIS union catalogue databases do not work well on some of the old automation systems in member libraries, and the CALIS service platforms are incompatible with a dozen automation systems currently in use; as a result, the union catalogues cannot tell the real-time circulation status in all member libraries, affecting interlibrary loan service. In addition, primary resources are not sufficiently abundant. Therefore, the extent to which resources are shared among member libraries remains quite limited.

In the next phase of development, CALIS will improve service systems (including hardware and software platforms) and the distribution of shared databases. At the same time, CALIS will develop more electronic resource databases and be actively involved in the research and development of digital libraries, expanding the scale and extent of resource sharing.
Williams, "Living in a Cooperative World: Meeting Local Needs Through OhioLINK," in Electronic Resources and Consortia, Ching-Chin Chen, ed. (Taiwan: Science and Technology Information Center, 1999), 137-61. 2. Jordan M. Scepanski, "Collaborating on New Missions: Library Consortia and the Future of Academic Libraries," in Proceedings of the International Conference on New Missions of Academic Libraries in the 21st Century, Duan Xiaoqing and He Zhaohui, eds. (Peking: Peking Univ. Pr., 1998), 271-75. 10081 ---- Reproduced with permission of the copyright owner. Further reproduction prohibited without permission. Prospector: A multivendor, multitype, and multistate Western Union catalog Bush, Carmel;Garrison, William A;Machovec, George;Reed, Helen I Information Technology and Libraries; Jun 2000; 19, 2; ProQuest pg. 71 Prospector: A Multivendor, Multitype, and Multistate Western Union Catalog The Prospector project represents a unique union catalog. The origin, goals, and design of the union catalog that uses the INN-Reach system are presented. Challenges of the union catalog include the integration of records from libraries that do not use the Innovative Interfaces system and the development of best practices for participating libraries. T he Prospector project is a union catalog of sixteen libraries in Colorado and Wyoming built around the INN-Reach software from Innovative Interfaces, Inc. (III).1 In 1997, the Colorado Alliance of Research Libraries (the Colorado Alliance) and the University of Northern Colorado submitted a joint grant proposal to create a regional union catalog for many of the major academic and public libraries in the region. The project would allow users to view library holdings and circulation information with a single query of the central database. The union catalog also would allow patrons to request items from any of the participating libraries and have them delivered to a nearby local library. However, unlike many of the other union catalogs in the country, Prospector has several unique elements: • It is multistate (Colorado and Wyoming). • It is multisystem (incorporating systems from Innovative Interfaces and CARL Corporation; plans call for Voyager from Endeavor). • It is multi-library-type (academic, public, and spe- cial libraries). Regional union catalogs representing the cataloged collections of libraries that are related by geography, sub- ject, or library type have been extant for many years. Early leaders in the field spearheaded locally developed systems such as the University of California's MELVYL system and the Illinois Library Computer Systems Organization's (ILCSO) ILLINET Online system, which became operational in 1980.2 The commercial integrated library system market began to emerge in the late 1980s and the 1990s with such vendors as Innovative Interfaces and its work with OhioLink through its INN-Reach union catalog product, and the CARL System.3 Many major vendors now have union catalog solutions for a single physical union catalog, although most have the requirement that participating libraries all use the same integrated library system. An alternative approach that is also becoming popular, because of the heterogeneous nature of the ILS marketplace and the widespread imple- mentation of Z39.50, is for libraries to create virtual union catalogs through broadcast searching. This solution is available from many ILS vendors as well as through organizations such as OCLC and its WebZ software. Carmel Bush, William A. 
There is not a single "right" answer for whether regional catalog searching and document delivery is best accomplished through a physical or virtual union catalog. Each solution has benefits and drawbacks that must be balanced against the mix of vendors, economics, politics, and technical issues within a state. Prospector is somewhat unusual in that it does create a single physical union catalog but allows for the incorporation of other library systems, made possible through a published specification from Innovative Interfaces.

Prospector History, Funding, and Project Goals

Colorado has a long history of resource sharing through a variety of programs, including use of the Colorado Library Card statewide borrower's card and access to individual libraries' online catalogs through the Access Colorado Library Information Network (ACLIN) and other regional catalogs. The Colorado Alliance has taken a leadership role within the state in promoting cooperation among major academic and public libraries in the areas of automation, joint acquisitions, and other cooperative endeavors. Existing online catalog software enabled patrons to easily search individual online catalogs, but searching several catalogs was a tedious task requiring many steps. It has long been a goal of the alliance to have a true union catalog of holdings for all member libraries.

To forward this goal, in 1997 the Colorado Alliance of Research Libraries and the University of Northern Colorado jointly applied for and received a grant from the Colorado Technology Grant and Revolving Loan Program to establish the Colorado Unified Catalog, a unified catalog of holdings for sixteen of the major academic, public, and special libraries in Colorado.4 The University of Wyoming was included in the project through separate funding. The grant of $640,000 was used to develop a union catalog that would support searching and patron borrowing from a single database.

Carmel Bush (cbush@manta.library.colostate.edu) is Assistant Dean for Technical Services at the Colorado State University Libraries, Fort Collins; William A. Garrison (garrisow@spot.colorado.edu) is Head of Cataloging at the University of Colorado at Boulder (Colo.) Libraries; George Machovec (gmachove@coalliance.org) is the Associate Director of the Colorado Alliance of Research Libraries, Denver; and Helen I. Reed (hreed@unco.edu) is Associate Dean, University of Northern Colorado Libraries, Greeley.

The Colorado Alliance and the University of Northern Colorado contributed an additional $189,500 of in-kind services to the unified catalog project. Additionally, the Colorado Alliance contributed $119,000 of in-kind funds to support purchase of distributed system software. The Colorado Unified Catalog project, later named Prospector, was based upon the INN-Reach software developed by Innovative Interfaces, Inc.
It included all Innovative Interfaces sites in Colorado as of December 1996 as well as the CARL system sites that were members of the nonprofit Colorado Alliance of Research Libraries.5

The Colorado Unified Catalog project had two major goals:

• the development of a global catalog database containing the library holdings of the largest public and academic libraries in the region; and
• the development of an automated borrowing system so that users at any of the participating libraries could easily request materials electronically from any other participating libraries.6

The union catalog would allow users to view library holdings and circulation information on titles with a single query of the global database. Once titles were located, patrons could request available items and have them delivered to their home library.

The grant proposal identified four major goals and outcomes of the project: access, equity, connections, and content and training. By creating a global catalog, the Colorado Unified Catalog project would provide students, faculty, staff, and patrons free and open access to the union catalog via the Internet. Patrons from all participating libraries would have equal access to the combined holdings of all sixteen participating libraries, thus greatly enhancing resources available to patrons without the necessity of travel across the state. Connectivity was greatly enhanced by the installation of high-speed Internet access in the Colorado Alliance office where the union catalog server was housed. The unified catalog project amassed, in one place, the complete cataloged collections of the major libraries in the region, creating a single, easy-to-use public interface. Training for the catalog would be conducted in each library so that it could be integrated into the standard training and reference services of each participating library.7

Addressing statewide goals for libraries, the Colorado Unified Catalog was designed to dovetail with an existing project in Colorado called the Access Colorado Library and Information Network (ACLIN) in several ways. The goal of ACLIN was to provide statewide searching of several hundred library catalogs in Colorado through broadcast Z39.50 searching. However, because of the large number of online library catalogs (too many Z39.50 targets cause broadcast searching to be slow, a point illustrated in the sketch after the list below) and poor network infrastructure in some parts of the state, the creation of physical union catalogs, such as Prospector, greatly enhanced the ability for a project such as ACLIN to be successful. As stated in the grant proposal it will:

• make ACLIN more efficient since sixteen libraries will be grouped together and can be accessed via a single search, thus saving ACLIN users steps in searching;
• enhance ACLIN's document delivery plans since patrons can make requests themselves;
• offer both Web and character interfaces for various levels of users;
• provide access via ACLIN's dial-in ports as well as via the Internet; and
• support ACLIN's future developments based on a Z39.50 environment.8
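Why too many targets slow a broadcast search can be shown with a small, purely conceptual sketch. Nothing here is ACLIN's software: search_catalog is a hypothetical stand-in for a real Z39.50 client call, and the timeout models the wait imposed by the slowest or least reliable catalog.

```python
from concurrent.futures import ThreadPoolExecutor

TIMEOUT_SECONDS = 10  # the broadcast answer can be no faster than its slowest target

def search_catalog(target: str, query: str) -> list[str]:
    """Hypothetical stand-in for a Z39.50 search against a single catalog."""
    raise NotImplementedError  # a real client would connect and search here

def broadcast_search(targets: list[str], query: str) -> dict[str, list[str]]:
    """Query every catalog in parallel, dropping targets that fail or time out.

    Every additional target raises the odds of waiting out a slow or dead
    server, which is why a prebuilt physical union catalog such as
    Prospector can answer the same question with a single lookup.
    """
    hits: dict[str, list[str]] = {}
    with ThreadPoolExecutor(max_workers=16) as pool:
        futures = {target: pool.submit(search_catalog, target, query)
                   for target in targets}
        for target, future in futures.items():
            try:
                hits[target] = future.result(timeout=TIMEOUT_SECONDS)
            except Exception:  # timeout, network error, or unreachable server
                continue
    return hits
```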
Work on the development of the Colorado Unified Catalog began in mid-1997. Even while contract negotiations were underway in mid- to late 1997, groups were busy undertaking discussions on the design and structure of the unified catalog. Work on development of profiling and system specifications continued through July 1998. This data was entered onto the server at the Colorado Alliance office and a test database was created in August 1998. Testing was completed in November 1998 and the first records were loaded in December 1998. The creation of the database for the first twelve libraries took seven months. During the database load the catalog was available for searching, although most participating libraries did not highlight the system in their local OPACs. Innovative Interfaces, Inc. conducted training on the actual patron requesting and circulation functions at three sites over the period from May through July 1999.

As of January 2000 the catalog included more than 3.6 million unique bibliographic records of the twelve largest libraries in Colorado (more than 6.6 million MARC records have been contributed, which has resulted in 3.6 million unique records after de-duplication). With the database in place and OPAC and circulation training complete, Prospector went "live" for patron-initiated requests in the first eight libraries on July 23, 1999. As of December 31, 1999, all twelve Innovative sites were "live" in Prospector.

The final programming for loading the records from CARL-system sites will be completed in spring 2000. It is anticipated that CARL-system library records will be loaded in late spring 2000 and will bring the database to more than five million unique MARC records, with more than ten million item records. Since the receipt of the grant, two participating libraries have selected Endeavor as their online integrated system. Contract negotiations are underway between Innovative Interfaces and the Colorado Alliance to come to an agreement on loading records for the Endeavor libraries into Prospector.

Politics and Marketing of Prospector

Planning and policy making are inherently political processes in which participants choose among goals and options in order to make decisions and to direct actions. For Prospector, the diverse makeup of multitype libraries and multiple systems augured for different perspectives on implementation from the onset. Nearly every department in member libraries would be affected by the project. To be successful in carrying out their charges, the work of the task forces appointed to implement Prospector had to address how these staff could influence the process and how local practices would be affected. The challenge was to engage staff in the process since the task force structure precluded representation from every member library. Meeting this challenge would be vital to ensuring input and fostering buy-in and advocacy for Prospector in member institutions. Consequently, in addition to reviewing standards or best practices and focusing on the goals stipulated in the grant, obtaining factual knowledge about member practices and resources and encouraging communications served as key ingredients in planning and policy development.

General Process

Profiling Prospector, a main charge for the Cataloging/Reference Task Force, illustrates the general process employed in planning and how key ingredients were applied to gain input and produce results. The first step involved the task force's review of the grant's aims for the unified catalog. With that framework as a basis, a planning process was outlined and shared with participants. The Prospector Web site detailed the specification development process, including the schedule and opportunities for input.
Next the task force surveyed participants for information on their systems: bibliographic indexing rules, types of indexes, characters indexed in phrase indexes, indexes on which authority control is performed, and suppliers of authority records. Using this data, the task force identified the commonalities and differences to determine what to create in the unified catalog. Members also consulted Innovative Interfaces and reviewed what previous INN-Reach customers had established.

Draft recommendations for indexing, indexes, record overlay, and record display specifications were then posted on the Prospector Web site, and participants were requested to review and provide input. A notice in DataLink: The Alliance Newsletter (www.coalliance.org/datalink) also referenced the site. At the same time, testing was performed using the draft specifications in order to assess them and to check for other concerns that testing might reveal. Because of the importance of the recommendations, an open forum was held to receive additional comments. Following the forum, the task force members made final adjustments to the specifications.

After the period for public comment ended, the specifications were submitted as recommendations to the Prospector Steering Committee for approval. Once approved, the specifications became official and were referenced in all site visits.

Issues

Because of the design of INN-Reach, participants must make decisions about contribution of records, priorities for what record would serve as the master record, order of loading, indexing, indexes, and displays for the unified catalog. Circulation functions require decisions about services for patron types, circulation statuses, loan periods, numbers of loans, renewals, recalls, checkouts, holds, overdues, fines, notices, pick-up locations, and billing.

In the case of Prospector, expectations regarding what would be controversial met with a few surprises. For example, the master record, the bibliographic record from one participating library to which holdings of other libraries are displayed, is based upon encoding level and the library priority list. The latter determines whether an incoming record should replace an existing one; a record with a higher level will replace a lower one. Based upon the data collected from libraries, a proposal categorized libraries into the following order: large, special, and "all others." The order was further factored by a member library's application of authority control and participation in Program for Cooperative Cataloging programs. The proposal drew minimal comment from libraries. Pride of ownership was not an obstacle. Everyone was committed to the fullest authorized form of the record.

How many loans an individual could request was the subject of early debate. There were concerns about discrepancies between local limits for borrowing and the possible setting of a higher number of loans on Prospector. A corollary concern was that a high number might result in depleting a member library's collection. Previous experience with borrowing by a subset of members shed light on the issue; there were no problems with loan limits. In fact, INN-Reach supports "load leveling" across participating libraries randomly as well as by precedence tables, thus avoiding systematic checkout from one library only.
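The idea of load leveling can be pictured in a brief sketch. This is an illustration of the concept only, not Innovative's algorithm; the library names and precedence values are hypothetical.

```python
import random

# Hypothetical precedence table: a lower number means "try this library first."
PRECEDENCE = {"Auraria": 2, "CSU": 1, "CU Boulder": 1, "Denver PL": 3}

def route_request(holdings: dict[str, bool], randomize: bool = True) -> str | None:
    """Pick one owning library with an available copy for an incoming request.

    holdings maps library name -> whether its copy is currently available.
    Random choice spreads lending evenly across owners, while precedence
    order lets the consortium steer traffic deliberately. Returns None
    when no copy is free.
    """
    available = [lib for lib, free in holdings.items() if free]
    if not available:
        return None
    if randomize:
        return random.choice(available)            # even load across owners
    return min(available, key=lambda lib: PRECEDENCE.get(lib, 99))

# Example: two libraries hold free copies; the request goes to one of them.
print(route_request({"CSU": True, "Auraria": True, "Denver PL": False}))
```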
Members decided that they could always pass a request on to another owning library if necessary and monitor loans to determine if any abuses would develop. With these options, it then became possible to establish a forty checkout limit for individual patrons in Prospector.

Differences in cataloging practices engendered more discussion because of the potential for a policy that might affect local practice. In the course of comparing practices of institutions, the Cataloging/Reference Task Force identified multiple records for the same serial titles that reflected differences in forms of entry and multiple versions treated either in separate records or on the same record. There was wide variety in statements of holdings. These differences warranted gathering further information on holdings; multiple versions, especially those involving electronic versions; and successive/latest entry for cataloging. The task force decided to hold a focus group on serials and invited staff in member libraries from serials, cataloging, and reference to attend. In the meantime, visits to participating libraries were instituted, the first of the roadshows, to discuss serials practices, their implications for overlays and displays, and options for handling them.

The focus group attracted a large attendance and proved useful in gathering information about practices and the concerns of participating libraries regarding serials. Most libraries reported individual practices for recording holdings. Although participants expressed a desire for consistency, attendees also shared that resources are not available to retroactively change them. Instead attendees encouraged development of a best practice recommendation that would follow the NISO standards for those libraries wishing to change practices.

With the exception of electronic versions of serials, focus group participants had no problem with multiple formats in the same bibliographic record as long as it was clear to users. Electronic versions prompted a lot of questions about what to do with 856 links to restricted access resources and about changes in software. It was clear that this issue would need further investigation by the task force.

The hottest area, successive or latest entry cataloging of serials, registered strong preferences by proponents. Attendees did not welcome changing practice in either direction. Instead there were questions asked about possible system changes and about the conduct of use studies to determine what problems might arise from latest entry records in the system.

With the information gained from the focus group meeting, the task force assigned priority to the areas and pursued latest/successive entry as the top priority. Already the task force had consulted Innovative Interfaces, Inc. and received a negative reply to possible changes to matching algorithms, loading programs, and record values that could deal with practices of participants, because of the software structure. It was technically impossible for a latest entry and a successive entry record to load separately given their match on the OCLC number.

The predominant use of successive entry and its status as the current national standard persuaded the task force initially to recommend coding latest entry in a special way so that the record for such an entry would not be the master record in the system unless it was unique.
This interim measure led to the policy recommendation that successive entry serve as the standard for Prospector. As a part of the recommendation, members are asked to not undertake retroactive conversion/recataloging projects to change existing latest entry records. Up to the meeting of the Prospector Board of Directors, the serials policy was argued. The approval by the board illustrates that controversial issues may require that leadership commit their libraries to policies.

Marketing

Marketing incorporates an overall strategy of identifying patrons' needs, designing products to meet those needs, implementing the products, and promoting and evaluating them. The twin goals of Prospector are: (1) one-stop shopping and expanded access regardless of location, and (2) an automated borrowing system to facilitate fast delivery of materials, addressing problems experienced by patrons in searching and obtaining materials. The grant proposal outlined a plan for member libraries to meet these goals through INN-Reach software and the cooperative efforts of participating members. With the implementation of the unified catalog and patron-initiated borrowing, the next pieces of the strategy, promotion and evaluation, come into play.

Member Libraries

Commitment to a cooperative venture takes time and energy. The support for Prospector at the library director and dean level had to be translated to staff in member libraries whose efforts would be necessary to support the unified catalog and patron-initiated loans. Staff members had to become acquainted with how Prospector would benefit patrons and their work. Hence internal promotion was a necessary component throughout planning and policy development and with implementation to users.

Because of the numbers of staff in member libraries, no one method would assure awareness of developments for Prospector. The approach involved the Alliance's newsletter (DataLink), a Prospector Web site, electronic discussion lists, e-mail, correspondence, phone calls, documentation, training sessions, and many site visits. The site visits facilitated interaction across institutional lines and were important for discussing critical issues at the local level. In arranging for site visits, it was important to clarify what the staff members wanted to discuss. A general update on Prospector might be followed by other technical sessions such as preparing the library's database for load into the Prospector system.

Participants' questions emphasized the importance of sharing the plan for developing Prospector and the basic concepts guiding the implementation planning and policy process as listed below. These concepts bore repeating because a staff member could have been hearing about Prospector for the first time.

• Decisions and directions are guided by data and input gathered from participants, standards/best practices, system capabilities, and the aims for Prospector described in the grant.
• Relatively few local practices are affected by participating in Prospector.
• Inclusiveness in record contributions would build Prospector into a rich resource for users; however, participating libraries can exert control over contributions.
• Global policies are developed for Prospector only; local sites define their own local policies.
• Assistance is available to participating libraries in coming up with solutions for special circumstances.
• Prospector is not reinventing the wheel. Although the multitype library and multisystem involvement would produce a new model of INN-Reach, other INN-Reach sites could serve as models.
• Think globally but act locally. More than a catchphrase, this statement acknowledges the reality of individual library circumstances and the balancing of Prospector goals to maximize access and use of resources by patrons.

Patrons

The design of the PAC, a promotional brochure, and individual library public relations efforts all served to promote Prospector's availability to users. Prospector provides access via Telnet and the Web. The impetus, however, was to examine member WebPACs and create a Prospector WebPAC that exemplified the best in menu design, including caption descriptions, navigational aids, and consistency in display of elements among search screens. Special attention was paid to providing example searches that would have appeal for the diversity of patrons served by the membership.

After mulling over several name possibilities, the Alliance staff suggested the name Prospector for the unified catalog, connoting the rich mining history of the Rocky Mountain area. This identity found its depiction in a classic picture of a gold miner supplied by the Colorado Historical Society. Representing the user, the miner is at the center panning for gold, an apt image for users exploring the richness of resources from the unified catalog. The incorporation of the image as the logo on the Web site and the catalog was followed by its adoption for the entire cooperative venture. Name recognition spread quickly.

To facilitate promotion at member libraries, the Alliance staff designed a brochure. The design features a brief description of the unified catalog, a list of members, and information for patrons on how to connect, what's available on Prospector, how to use the self-service borrowing, and how to view their circulation record. Many libraries have Web-mounted guides or paper handouts in their instructional service, using the Alliance-designed brochure as a model.

Finally, staff in member libraries exercised individual approaches to promote Prospector to users. Denison Library describes and provides a link to Prospector on its Web list of databases and help guides. Colorado State University Libraries devoted the front page of its library newsletter to "hunting for hidden gold," the introduction of Prospector. A special newsletter for Auraria's history faculty highlighted Prospector in its database news section. The University Libraries of the University of Colorado at Boulder describes the unified catalog on the State Services page of its Web site. More introductions came from instructional classes held by every member library.

Profile of Participating Libraries

Prospector is unique since it is multistate, multitype, and multisystem. Of the sixteen members (see appendix A), almost all are located along the front range of the Rocky Mountains extending from Laramie, Wyoming, southward to Colorado Springs, Colorado. Only Fort Lewis College is located on the western slope of the mountains. Despite the distances, a network of courier service connects all members. Within the membership are eleven public and private academic libraries, three special libraries representing law and medicine, and two public libraries that serve almost one million registered patrons. Twelve of the libraries operate Innopac and are loaded into Prospector.
Two libraries on the CARL System are slated for loading in mid-2000. Two other libraries are migrating to the Voyager System by Endeavor Information Systems in the summer of 2000. Hopes are to incorporate them into the system in 2001.

Description of How INN-Reach Works

The INN-Reach software is designed to provide a union catalog with one record per title, with all of the libraries holding a title represented. After databases are loaded initially, the software automatically queues transactions that occur to bibliographic, item, order, or summary serial holdings records and sends those transactions up to the central catalog. Staff in the local library have no extra work or steps to take to send transactions to the union catalog.

The union catalog uses a "master" record to maintain only one bibliographic record per title. The "owner" of the master record is determined by several factors. A bibliographic record with only one holding library automatically has that library as the owner of the master record. If more than one library holds a title, the system uses an algorithm to determine which record coming into the system has the highest encoding level. The library that has the record with the highest encoding level becomes the owner of the record, and its version of the record is displayed and indexed in the catalog. In addition, a table is created which has a list of the libraries in priority order for determining the master record if two or more matching records enter the system with the same encoding level.

For the Prospector catalog, a survey was conducted of the participating institutions to determine which libraries might have the best or fullest records. Questions in the survey included size of database, source of bibliographic records, participation in national projects (e.g., Program for Cooperative Cataloging, OCLC Enhance), amount of authority work done and level of authority control in the local database, level of cataloging given to records, and type of institution. The task force charged with designing the catalog examined these surveys and determined a priority order of the participating institutions for selecting bibliographic records.

The system also uses a set of match points each time a bibliographic record is added to the union catalog. Whenever a match occurs, the system examines the encoding level of the incoming record and the library from which the record is coming to determine if a change in the master record is required. The existing record is overlaid by the incoming record if the master record holder is changed. The first check is done on the OCLC record number. If there is a match on that, the system adds the holdings to the existing record. If there is no match on the OCLC number, the system attempts to match on the ISBN or ISSN in combination with the title in the 245 field. Again, if a match occurs, the system adds the holdings to the existing record. If no match occurs, a new bibliographic record is added to the catalog.
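The matching sequence just described condenses into a short sketch. This illustrates the published behavior rather than III's code; the priority list is a hypothetical stand-in for the survey-derived table, and the integer encoding levels are a simplification (actual MARC encoding levels are coded characters).

```python
from dataclasses import dataclass, field

# Hypothetical priority list; earlier entries win ties between records
# of equal encoding level.
LIBRARY_PRIORITY = ["CU Boulder", "CSU", "Denver PL", "Auraria"]

@dataclass
class BibRecord:
    oclc_number: str
    isbn_or_issn: str
    title_245: str            # the MARC 245 title used as a match point
    encoding_level: int       # simplified: higher = fuller cataloging
    library: str
    holdings: set[str] = field(default_factory=set)

def priority_rank(library: str) -> int:
    """Lower rank = higher priority; libraries missing from the table rank last."""
    try:
        return LIBRARY_PRIORITY.index(library)
    except ValueError:
        return len(LIBRARY_PRIORITY)

def find_match(catalog: list[BibRecord], incoming: BibRecord) -> BibRecord | None:
    """Match point 1: OCLC number. Match point 2: ISBN/ISSN plus 245 title."""
    for rec in catalog:
        if rec.oclc_number == incoming.oclc_number:
            return rec
    for rec in catalog:
        if (rec.isbn_or_issn == incoming.isbn_or_issn
                and rec.title_245 == incoming.title_245):
            return rec
    return None

def add_record(catalog: list[BibRecord], incoming: BibRecord) -> None:
    incoming.holdings.add(incoming.library)
    existing = find_match(catalog, incoming)
    if existing is None:
        catalog.append(incoming)               # new title in the union catalog
        return
    existing.holdings.add(incoming.library)    # a match always adds holdings
    fuller = incoming.encoding_level > existing.encoding_level
    tie = (incoming.encoding_level == existing.encoding_level
           and priority_rank(incoming.library) < priority_rank(existing.library))
    if fuller or tie:                          # incoming record overlays the master
        incoming.holdings |= existing.holdings
        catalog[catalog.index(existing)] = incoming
```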
In addition, each library that has a local Innovative Interfaces system has the ability to exclude bibliographic, item, order, or check-in records from being sent to the union catalog. Suppression may occur in each of these record types. The library may also choose to send a record to the union catalog but exclude it from public display in the union catalog, or to suppress a record from displaying in the public catalog both locally and centrally.

The INN-Reach system has no central database maintenance module, though it does provide a staff mode in which to view records, to create lists, and to monitor transaction queues. The staff module that is available via a telnet connection allows authorized users to view those records that have been contributed to the union catalog but are not displayed to the public in the union catalog. For example, a library may contribute its order records to the union catalog but choose to suppress those records from public display; however, authorized staff may view these records in the INN-Reach staff mode or create lists for collection development purposes that include those order records.

Circulation status of individual items and volumes also appears to the user. The Prospector member libraries with local Innovative Interfaces systems also maintain a set of circulation or item status codes that display various messages to users of their individual public catalogs. The INN-Reach system also has a set of circulation or item status codes. Agreement was reached on what the status codes were to be in the central catalog, and each member library then had to map its local codes to the codes used in the central catalog to ensure proper message display in the union catalog. In some cases, the member libraries had to adjust local status codes.

Indexes for the Prospector catalog were determined during the profiling process. In general, there are more indexes in the union catalog than are available in the member libraries' local catalogs. Indexes in Prospector include author, author/title, Library of Congress Subject Headings, Medical Subject Headings, Library of Congress Children's Subject Headings, journal title, keyword, Library of Congress classification numbers, National Library of Medicine classification numbers, Dewey Decimal classification numbers, government documents numbers, OCLC numbers, and special numbers (e.g., ISBN, ISSN, music publisher numbers, etc.). The classification number indexes are derived using the classification numbers that appear in the defined MARC tags for the various classification schemes in the bibliographic record and do not represent local call numbers. Local call numbers are always stored at the item record level in the union catalog. It was decided that many local MARC fields that are defined for local notes or local access would not transfer from the local catalog to the union catalog (e.g., 59x, 69x, 79x, 9xx) to avoid ambiguities and excessive heading conflicts. Therefore, there may be access points or index entries in the local catalog that are not available in the union catalog; the local catalog may still contain "richer" or "fuller" searching than the union catalog. The local catalog may have materials accessible in it as well that do not appear in the union catalog.

Patrons using a local catalog may transfer their searches up to Prospector simply by clicking on a button in their local public catalogs and have the search automatically occur in the union catalog. Patrons may access Prospector directly either via the World Wide Web or via telnet.
Navigation between local catalogs and Prospector, as well as navigation within Prospector, has been designed to be clear and simple. Patrons may also go from Prospector either back to their local catalog or to the local catalogs of other member libraries. When a patron locates an item that he or she wishes to borrow from Prospector, he or she may initiate the request for the item online. The borrowing and lending process is described below.

Prospector member libraries have been asked to be as inclusive as possible in contributing bibliographic records to the union catalog. Member libraries have been asked to contribute the following:

• items that users may borrow, including all monographic materials that circulate, and other material types as specified by individual institutions that are listed as available for circulation;
• items that users may not borrow but may use on-site, including reference materials, archival materials, rare books, and others as determined by individual institutions (virtual items, such as electronic journals, which have IP limiting and authentication, are included in this category); and
• items that are owned virtually and have URLs or IP addresses that are open and unrestricted, including government publications and selected home pages as determined by the local institution.

Bibliographic records that are contributed should have as full cataloging as possible for identification and retrieval. Materials that are on reserve and other locally defined special materials (e.g., materials that have use restrictions placed upon them) may be excluded from Prospector.

The Prospector union catalog will also include bibliographic and circulation information from libraries that do not use Innovative Interfaces as their local system vendor.

The Integration of Non-Innovative Libraries into INN-Reach

One of the major efforts in the Prospector project was to be able to incorporate bibliographic, item, summary serial holdings, and acquisitions records from other vendors with the INN-Reach union catalog software. In 1997, when the grant was written, it was envisioned that the system would incorporate libraries using two ILS vendors-Innovative Interfaces, Inc. and CARL Corporation-two of the major vendors in Colorado at the time. Twelve libraries used Innovative Interfaces and four used the CARL system (Denver Public Library, Regis University, Colorado School of Mines, and the University of Wyoming). However, in late 1999, the Colorado School of Mines and the University of Wyoming decided to migrate to the Voyager system by Endeavor Information Systems (this is occurring in 2000). Both of these institutions have still expressed an interest in being part of Prospector, so they will need to be integrated in 2001 after they are stable on their new system. The remaining CARL sites will be fully integrated in 2000.

The integration of records that allows document requests from different vendors is being accomplished as follows:

• Innovative Interfaces, Inc. has published a set of specifications for how bibliographic, item, summary serial holdings, and acquisitions order records should be formatted to be loaded into the union catalog.
• Published specifications were also created for patron verification and for how document requests are to be transferred.
• The Alliance office is developing the software to package USMARC bibliographic records, item records, summary serial holding records, and order records to transfer to Prospector (a simplified sketch of such packaging follows this list).
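That packaging step might look like the following simplified sketch, which uses the pymarc library (a later open-source tool, not the Alliance's software) and hypothetical file names. It also applies the profiling rule noted earlier by withholding local 59x, 69x, 79x, and 9xx fields from the transfer file.

```python
from pymarc import MARCReader  # assumes the pymarc package is installed

LOCAL_TAG_PREFIXES = ("59", "69", "79", "9")   # local-use fields kept out of Prospector

def package_for_union_catalog(infile: str, outfile: str) -> int:
    """Copy a library's MARC export, stripping locally defined fields.

    Mirrors, in simplified form, the packaging software described above:
    records are read from the local system's export and rewritten into a
    file suitable for loading into the central catalog.
    """
    count = 0
    with open(infile, "rb") as src, open(outfile, "wb") as dst:
        for record in MARCReader(src):
            local = [f for f in record.get_fields()
                     if f.tag.startswith(LOCAL_TAG_PREFIXES)]
            for f in local:
                record.remove_field(f)          # drop 59x/69x/79x/9xx fields
            dst.write(record.as_marc())         # binary ISO 2709 output
            count += 1
    return count

# Hypothetical usage:
# package_for_union_catalog("local_export.mrc", "prospector_load.mrc")
```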
Work is also being done so that document requests may be relayed between the different systems using an intermediate Unix server running an SQL database with a Web interface for circulation and interlibrary loan staff. Because the CARL and Endeavor systems are built differently, the record updating may be done on a "batch" basis several times a day. Patron verification, to determine if a CARL or Endeavor patron is in good standing before allowing a document request, will be done in real time.
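The intermediate relay server can be pictured as a small queue in an SQL database: requests are accepted in real time after patron verification and handed off by batch jobs several times a day. The sketch below is conceptual; the article does not document the Alliance's actual schema, so the table layout and identifiers here are invented, with SQLite standing in for the server's database.

```python
import sqlite3

# Invented schema for the request-relay queue; illustrative only.
SCHEMA = """
CREATE TABLE IF NOT EXISTS requests (
    id          INTEGER PRIMARY KEY,
    patron_id   TEXT NOT NULL,
    home_system TEXT NOT NULL,     -- e.g. 'CARL', 'Endeavor', 'III'
    item_id     TEXT NOT NULL,
    status      TEXT NOT NULL DEFAULT 'queued'
);
"""

def enqueue_request(db: sqlite3.Connection, patron_id: str,
                    home_system: str, item_id: str) -> None:
    """Accept a document request in real time (after patron verification)."""
    db.execute("INSERT INTO requests (patron_id, home_system, item_id) "
               "VALUES (?, ?, ?)", (patron_id, home_system, item_id))
    db.commit()

def relay_batch(db: sqlite3.Connection, system: str) -> list[tuple]:
    """Batch job run several times a day: pull queued requests for one system."""
    rows = db.execute(
        "SELECT id, patron_id, item_id FROM requests "
        "WHERE home_system = ? AND status = 'queued'", (system,)).fetchall()
    db.executemany("UPDATE requests SET status = 'relayed' WHERE id = ?",
                   [(r[0],) for r in rows])
    db.commit()
    return rows

db = sqlite3.connect(":memory:")
db.executescript(SCHEMA)
enqueue_request(db, "patron-123", "CARL", "b1024x")   # hypothetical identifiers
print(relay_batch(db, "CARL"))
```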
Administrative and Committee Structures

Under provisions of the grant, the Dean of Libraries at the University of Northern Colorado provides administrative management for the project, while the Colorado Alliance of Research Libraries houses the server, maintains the union catalog software, provides network connectivity, develops the software to integrate the non-Innovative sites into the union catalog, and provides ongoing system administration support for the project. A Prospector Steering Committee comprised of deans and directors of three participating libraries provided general overview for the project during the initial stages. To carry out the initial work of the project, two task forces were appointed with responsibility for detailed design and implementation of the system: the Catalog/Reference Task Force and the Circulation/Document Delivery Task Force.

The Catalog/Reference Task Force was charged with making all bibliographic and display decisions relating to the catalog. This included establishing the criteria for determining which institution's bibliographic record displays in the catalog, developing display and overlay hierarchies for bibliographic records coming into the system, and identifying MARC fields that would be indexed and displayed in the catalog. Membership on this task force included both public services and technical services personnel, but did not include representation from every participating library.9

The Circulation/Document Delivery Task Force was charged with developing common circulation policies to be applied in the union catalog, including loan periods, fines, renewals, holds, recalls, checkout limits, and patron blocks. The task force was also responsible for developing the precedence table for routing patron requests. The members of this task force represented each participating library, and several libraries had representation from both their circulation and interlibrary loan departments.10

These two task forces conducted meetings from July 1997 through December of 1999. The stage was set for the task forces' work at a training session held by Innovative Interfaces, Inc. on system operation and functionality. Each group received direction on what policy issues needed to be determined to lay the groundwork for establishing the codes that drive system functionality. After the initial training, each task force met several times a month, often consulting with Innovative Interfaces, Inc. and/or their local libraries as their planning and deliberations continued.

Communication was an important component during the development of the system. Soon after the grant was awarded, staff from the Alliance office visited each participating library and met with library personnel to explain the overall goals of the project and how work would be conducted. As detailed development progressed, open forums were held in central locations to keep representatives of all libraries apprised of progress and to get feedback regarding specific policy issues. Completed work from the task forces was mounted on the Prospector Web site. In addition, regular articles appeared in DataLink, the Alliance monthly newsletter. Specific training sessions were conducted both by the task forces and by Innovative Interfaces.

As the actual database loading process began, the Catalog/Reference Task Force conducted sessions at each Prospector library. These sessions were twofold in purpose: to provide an opportunity for a general overview of how the database structure and indexing worked for all library personnel, and to train technical services personnel in how local coding of records impacted the display of their local records in the global catalog. In preparation for going live with patron requesting, Innovative Interfaces, Inc. conducted PAC searching and circulation training sessions at several central locations for frontline staff from all institutions. In addition, the Circulation/Document Delivery Task Force held a central session for representatives from all libraries to discuss issues relating to the flow of materials among libraries.

During system implementation, it became apparent that some ongoing structure would be required for ongoing maintenance and development of the global catalog. In completion of their charges, each task force prepared a final report, which was submitted to the Steering Committee and to the Prospector Directors Group. Each task force recommended its own termination but outlined a structure to address ongoing issues.

As approved by the Prospector Directors Group, the ongoing governance structure is multilayered, with frontline operations groups, broader planning and policy-setting committees, an Advisory Committee, a Directors Group, and electronic discussion lists for communication. Monitoring of the day-to-day work of the cataloging and circulation/document delivery operations is handled by frontline staff via e-mail, electronic discussion lists, and/or telephone. Broader planning and policy issues are addressed through smaller, representative standing committees. The Advisory Committee and Directors Group operate at a policy level. The new structure includes:

• a Catalog Site Liaison group comprised of one representative from each participating library and charged with serving as the point of contact for inquiries regarding catalog maintenance, access, and record merging;
• a Catalog/Reference Committee comprised of members selected from the participating libraries and charged with responsibility for all bibliographic and display issues relating to Prospector, including monitoring details of the current implementation as well as addressing ongoing policy issues, recommending system enhancements, testing new system functionality, and training staff at new sites coming into the system;
• a Document Delivery Site Liaison group comprised of one or more representatives from each participating institution with responsibility to serve as a point of contact for other Prospector libraries that have inquiries concerning issues, lost books, courier delivery, or related topics;
serve as a point of contact for other Prospector libraries that have inquiries concerning issues, lost books, courier delivery, or related topics;

• a Circulation/Document Delivery Committee comprised of representatives selected from the participating libraries and responsible for issues relating to the courier delivery service, circulation load balancing, monitoring member compliance with circulation policies, recommending system enhancements, testing new system functionality, and the year-end reconciliation of lost book charges; and

• a Prospector Advisory Committee comprised of twenty-four deans and directors from participating libraries to address issues requiring quick response relating to project specifications and operating rules.

The Prospector Directors Group is comprised of the deans/directors of all participating libraries and is charged with making recommendations on high-level policy and on the admission of new participants. Since Prospector is a project of the nonprofit Colorado Alliance of Research Libraries consortium, all final high-level decisions and financial commitments are subject to the approval of the Board of Directors of the consortium. At present, five of the sixteen Prospector libraries are not part of the formal consortium but participate in this one project.

The newly formed committees will continue to address broad policy and operational issues: they will maintain the load-balancing tables for routing patron requests to owning libraries, document best practices for local libraries to follow in implementing certain functionality within their local systems to achieve maximal results in the central catalog, identify enhancements to the system, and test new release functionality.

Borrowing and Lending Policies and Specifications

As a prelude to its work, the Circulation/Document Delivery Task Force examined borrowing and lending practices from other Innovative Interfaces INN-Reach sites and reviewed the borrowing policies for consortial borrowers that had been developed and agreed to by a subset of Alliance libraries (University of Northern Colorado, Auraria Library, and Denver Public Library) several years earlier. The first major duty of the task force was to establish circulation and document delivery policies that would govern those functions within the Prospector system. These common circulation and document delivery policies were based on a series of assumptions:

• the task force policies apply to the unified catalog only; local sites define local policies;
• local workflow remains local purview;
• policies should be kept simple;
• circulating materials are commonly circulated materials, primarily books, at each site;
• the task force will work within the confines of the INN-Reach system;
• if a patron is blocked locally, he or she will be blocked at the global level;
• for routing purposes, each institution (rather than each branch) is the routing site; and
• local sites will determine when their items are declared lost.

The task force established a series of recommendations for policies that applied to the Prospector system. The proposed policies were discussed within the local institutions as well as with various administrative groups.
The final policies for Prospector lending as adopted and implemented in the system are:

• loan period: twenty-one days
• renewals: one
• number of holds allowed: forty
• checkout limit: forty items
• recalls: none, except for academic library reserve collections
• lost book charge: $100, comprised of a $75 refundable lost book charge and a $25 nonrefundable processing fee
• libraries establish their own local rules for overdue fines on Prospector materials

Key features of the INN-Reach software that were emphasized with each local library during training sessions are:

• Libraries have local control over what is loaned through the global catalog.
• Libraries have local control over which of their patrons can borrow materials through the global catalog.
• If the local copy is checked out or missing, a copy may be requested through Prospector.
• The system is sensitive to multivolume works and allows particular volumes to be selected.

The ongoing Document Delivery Committee has developed a series of "best practices" that establish benchmark policies each library is urged to adopt in the spirit of uniform cooperation among participating libraries. Individual libraries, however, may choose not to adopt these practices.

System Functionality

The actual steps for a patron to request an item within the Prospector system are simple and self-explanatory. Once a patron has identified an item they wish to order, the following steps take place (sketched in code at the end of this section):

• The user is prompted for institutional affiliation, name, and library card number.
• The system checks the local system to ensure that the patron is in good standing.
• The user selects a pick-up location from those offered by their home institution.
• The system forwards the patron request to an owning library with an available circulation status, performing load balancing among the libraries with available copies.

Once the patron request is forwarded to a lending library, the request goes into the queue of requested items from that library. Each library has established its own workflow for handling requests; however, that workflow must include interaction with the system to record the status of the request. Once the item is located by the lending library, it is checked out to the requesting patron's "home" library and is sent, via courier, to that library. The "home" library then receives the item in the system and holds it pending pick-up by the patron. When the patron arrives to borrow the item, it is checked out to that patron's record according to the Prospector loan rules. Having a common set of loan rules for all Prospector loans provides consistency for the patron. The patron may still have multiple due dates on items checked out at the same time, depending on the loan rules for local checkouts.

The system maintains statistics on several elements of the borrowing and lending processes. It tracks the total number of items borrowed and loaned and calculates the ratio of borrowing to lending per institution. In addition, it tracks the number of items cancelled and the reasons why, the number of holds filled and cancelled, and several other groupings.
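A minimal sketch of the request flow described above, folding in the published loan rules; the data structures, the handling of the patron's home site, and the random load-balancing choice (noted in appendix B as Prospector's early configuration) are illustrative assumptions, not the INN-Reach implementation.

import random

# Prospector loan rules as published: twenty-one-day loans, one renewal,
# forty holds, forty checkouts. The structures below are illustrative.
LOAN_DAYS, RENEWALS, MAX_HOLDS, MAX_CHECKOUTS = 21, 1, 40, 40

# owning sites with an available circulating copy of each title (sample data)
AVAILABLE = {"bib42": ["CSU", "DU", "UNC"]}

def patron_in_good_standing(home_site, patron_id):
    # step 2: the local system confirms the patron is in good standing (stub)
    return True

def route_request(bib_id, home_site, patron_id, pickup_location):
    if not patron_in_good_standing(home_site, patron_id):
        return None  # blocked locally, therefore blocked at the global level
    # a request is placed against other libraries' copies, e.g. when the
    # local copy is checked out or missing
    sites = [s for s in AVAILABLE.get(bib_id, []) if s != home_site]
    if not sites:
        return None  # no owning library with an available copy
    # early Prospector used random load balancing, without precedence tables
    lender = random.choice(sites)
    return {"bib": bib_id, "lender": lender, "pickup": pickup_location,
            "loan_days": LOAN_DAYS, "renewals": RENEWALS}

print(route_request("bib42", "AUR", "P0001", "Auraria Library"))

Keeping the good-standing check and the lender selection in one routine reflects the flow as described: verification precedes routing, and the owning library is chosen only from sites whose copy carries an available circulation status.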
Challenges and Issues

With the building of Prospector still underway and public access available only since late July 1999, Prospector is doing a respectable volume of loans in its infancy. Over ten thousand items were delivered during the first six months of operation. This number is expected to rise dramatically as the system grows and as local libraries promote the service. This auspicious start provides a sense of accomplishment tempered by the recognition that there is more to do. Some of the major challenges facing the project include:

• Development is underway to integrate records for the CARL system libraries into the central catalog and provide borrowing capabilities for their patrons.
• As member libraries choose other online system providers, these systems likewise need to be interfaced with the Prospector system. Coming to agreement with all vendors involved will require careful negotiation and wording of contracts. Discussions are underway with Innovative Interfaces and Endeavor Information Systems for merging Endeavor libraries into INN-Reach.
• Monitoring is planned of how the fiscal accounting will work for the first end-of-year reconciliation of lost books.
• Developing best practices and evaluating software enhancements for INN-Reach are necessary.
• We need to determine how to handle electronic resources and multiple formats, and how to load records from commercial electronic resources, for example, netLibrary.
• We must improve matching within the system and make additional enhancements to the Prospector Web site.
• With growth of the system, full-time operations and management staff may be required. Securing funding for new ventures and new staffing will require development efforts or a sharing of costs by members. There is no state-based funding for ongoing maintenance and new product acquisition.
• With the increasing flow of materials between libraries, the courier delivery service must be monitored on an ongoing basis.

The statewide courier service has recently been restructured and was contracted based on pre-Prospector activity levels for interlibrary loan materials. With the ever-growing popularity of Prospector, there will be a corresponding increase in volume for the courier. Service levels need to be monitored closely to ensure that the speed of delivery is maintained and that the rate of loss and incorrect routing stays within acceptable limits.

The balance of borrowing and lending will have financial impacts on some of the participating libraries. Through a legislative allocation, the State Library of Colorado provides funding on a per-transaction basis to libraries that are net lenders, that is, libraries that loan more materials than they borrow. Most libraries are treating Prospector transactions as equivalent to interlibrary loan transactions and counting them toward the payment-for-lending program. It is anticipated that the inclusion of Prospector activity in the interlibrary loan borrowing and lending statistics will significantly alter the balance of payment for lending among the Prospector libraries.

Already Prospector has shown that it is changing behaviors. The cooperation between libraries has been impressive. In member libraries, staff are factoring Prospector into their plans and realizing that keeping Prospector operations staff informed of problems is a good habit. User searching and document delivery patterns are changing. Margaret Landrum, director at the Fort Lewis College Library, predicts that Prospector will have a dramatic effect on researchers in the geographic area.
Its start has given all members a share in that expectation.

The Future and Interesting Spin-Offs

Union catalog projects often take on a "life of their own" far beyond what was originally envisioned. Some of the future spin-offs may include:

• The addition of other research libraries in nearby states.
• Collection overlap studies and improved coordination on acquisition and weeding projects between libraries.
• With the full implementation of the union catalog, opportunities for resource sharing at a broader level. The central catalog has the functionality to support bibliographic records for, and access to, "consortial" resources, thus enabling libraries to jointly purchase resources and provide centralized access to them.
• As database and online information providers develop new methodologies for access to their resources, opportunities to easily link from either the local or the central catalog to these online resources, a process that is cumbersome and/or impossible in the nonglobal environment. For instance, where databases are centrally mounted at the Alliance office with shared ownership, the link to the serial holdings feature is pointed at Prospector, thus providing patron access to consortiawide holdings.
• Use of the system as a central repository for cataloged metadata for electronic resources on the Web.
• Encouraging Innovative Interfaces, Inc. to allow document requests that "fail" in the system to be forwarded to national ILL subsystems or commercial document suppliers using national standards.

Conclusion

Prospector dramatically alters the bibliographic landscape in Colorado, offering patrons easy access to the bibliographic wealth of the state. Patrons will be able to move easily from a local catalog to this regional system and request materials. Librarians will find the system useful for collection overlap studies, improved coordination on acquisitions and weeding projects, Z39.50 links with other indexing/abstracting services for serials holdings information (e.g., Ovid or SilverPlatter), and expedited book delivery. The high level of cooperation among the diverse participating libraries is exemplary. The incorporation of public and private universities, public libraries, and special libraries offers a model for cooperation.

References

1. Anthony J. Dedrick, "The Colorado Union Catalog Project," College and Research Libraries News 59, no. 10 (1998): 754-55; George Machovec, "Prospector: A Regional Union Catalog," Colorado Libraries 25, no. 2 (1999): 43-45.
2. Clifford A. Lynch, "The Next Generation of Public Access Information Retrieval Systems for Research Libraries: Lessons from Ten Years of the MELVYL System," Information Technology and Libraries 11, no. 4 (1992): 405-15; Bernie Sloan, "Testing Common Assumptions about Resource Sharing," Information Technology and Libraries 17, no. 1 (1998): 18-29.
3. Thomas Dowling, "OhioLINK: The Ohio Library and Information Network," Library Hi Tech 15, no. 3/4 (1997): 136-39; Lindy Naj, "The CARL System at the University of Hawaii UHM Library," Library Software Review 12, no. 1 (1993): 5-11.
4. Gary Pitkin and George Machovec, Colorado Union Catalog, Senate Bill 96-197, Technology Grant and Revolving Loan Program, Excellence in Learning Through Technology, December 1996. Grant proposal by the University of Northern Colorado and the Colorado Alliance of Research Libraries.
5. Gary Pitkin, Colorado Union Catalog-Prospector: Final Report, July 27, 1999.
Machovec, "Prospector: A Regional Union Catalog." 7. Ibid. 8. Ibid. 9. Prospector Staff Web site, www.coalliance.org/prospector. 10. Ibid. PROSPECTOR I BUSH, GARRISON, MACHOVEC, AND REED 81 Reproduced with permission of the copyright owner. Further reproduction prohibited without permission. APPENDIX A General Statistics about Prospector: • sixteen libraries (see below) • twelve Innovative Interfaces sites (went live in fall 1999) • two CARL sites (to go live in 2000) • two Voyager Endeavor sites (to be incorporated in 2001 pending final negotiations with both vendors) • 3.6 million unique MARC records as of January 2000, which are expected to grow to more than 5 million after the incorporation of the CARL and Endeavor sites. • 9 million item records, which are expected to grow to more than 12 million after the incorporation of the CARL and Endeavor sites. • Currently 61 percent of the records in the system are held by only one library. • Greater than 1 million registered patrons are possible users . Denver Public Library has over 500,000 patrons and Jefferson County Public Library has over 300,000 patrons . • Prospector URL for public use : http:/ /prospector.coalliance.org • Prospector staff URL, which includes policies, committee minutes, and profiling tables: www.coalliance.org/ prospector Prospector Libraries Auraria Library Colorado College Colorado School of Mines Colorado State University Denver Public Library Fort Lewis College Jefferson County Public Library Regis University University of Colorado at Boulder University of Colorado/Colorado Springs University of Colorado/Health Sciences University of Colorado/Law Library University of Denver University of Denver/Law Library University of Northern Colorado University of Wyoming Web site http://carbon.cudenver.edu/public/library http://www.coloradocollege.edu/library http://www.mines.edu/academic/library http://manta.library.colostate.edu http://www.denver.lib.co.us http:/ !library. fortlewis.edu http://www.jefferson.lib.co .us http://www.regis.edu/1 ib/wlibhome.htm http://www.libraries.colorado.edu http://web.uccs.edu/library http://www.uchsc.edu/library/index.html http://www.colorado.edu/law/lawlib http://www.penlib.du.edu http://www.law.du.edu/library http://www.unco.edu/library http://www-lib.uwyo.edu 82 INFORMATION TECHNOLOGY AND LIBRARIES I JUNE 2000 Reproduced with permission of the copyright owner. Further reproduction prohibited without permission. APPENDIX B Early Borrowing/Lending Data The borrowing and lending patterns in Prospector will be of interest to monitor because of the wide variety of partici- pating libraries in the system. The incorporation of both academic and public libraries has the potential for different use patterns as seen in more homogeneous academic union catalogs. The following data represents some of the very early borrowing and lending patterns in Prospector . All of the libraries in the table went "live" in terms of borrowing and lend- ing in late July or August 1999, with the exception of Jefferson County Public Library, which went live in November 1999. History with other similar projects has shown that use will dramatically grow as libraries and users gain familiarity with the service. The incorporation of Denver Public Library in 2000 should provide significant impact on the service. At the present (and in the accompanying table), Prospector has been configured to do random load balancing without the use of any precedence tables to force document requests to one site or another. 
Lending (Owning)                                      Borrowing Site
Site      Ratio   Total    AUR   CCC   CSU   CUL   CUB    DU   DUL   FTL  JCPL  UCCS  UCHSC   UNC
TOTALS (borrowed)         1879   930  2301   225  1520  1132   129   946  1775   882    364  2063
AUR        0.89    1667      -   108   282    33   232   187    17   113   234   128     70   263
CCC        0.72     673    114     -   109    11    96    57          66    89    53     10    68
CSU        0.86    1985    267   156     -    29   272   221    18   130   288   134     55   415
CUL        0.55     123     24     9    20     -     5    11    12     3    10     7      3    19
CUB        2.05    3120    396   231   590    26     -   260    21   246   420   233     56   641
DU         2.07    2341    361   153   464    42   315     -    20   163   279   131     69   344
DUL        1.12     145     27     7    14    27    15    25     -     3    11     6      4     6
FTL        0.54     511     66    36   130     3    66    36     7     -    72    31     11    53
JCPL       0.54     962    187    81   201    11   154    65    11    64     -    33     38   117
UCCS       1.02     900    170    65   148    12   130    65     5     3   137     -     15    90
UCHSC      0.83     301     63     5    49     5    26    31     3     5    32    36      -    46
UNC        0.69    1422    219    81   291    27   207   153    13    89   222    90     30     -

Prospector Fulfillments Report, August 1999 through February 14, 2000

10082 ----

Digital Resource Sharing and Library Consortia in Italy

Tommaso Giordano (giordano@datacomm.iue.it) is Deputy Director of the Library at the European University Institute, Florence.

Interlibrary cooperation in Italy is a fairly recent and not very widespread practice. Attention to the topic was aroused in the eighties with the Italian library network project. More recently, under the impetus toward technological innovation, there has been renewed (and more pragmatic) interest in cooperation in all library sectors. Sharing electronic resources is the theme of greatest interest today in university libraries, where various initiatives are aimed at setting up consortia to purchase licenses and run digital products. A number of projects in hand are described, and emerging trends analyzed.

The state of progress and the details of implementation in various countries of initiatives to share digital information resources obviously depend, apart from current investment policies to develop the information society, on many factors of a historical, social, and cultural nature that have determined the evolution and consolidation of cooperation practices specific to each context. Before going to the heart of the specific subject of this article, and in order to foster an understanding of the environment in which the trends and problems we shall be considering are set, I feel it best to give a quick (and necessarily summary) sketch of the library cooperation position in Italy.

The word "cooperation" became established in the language of Italian librarians only toward the mid-'70s, when, in the sector of public libraries (which were transferred in those years from central government to local authorities), the "territorial library systems" were taking shape: a form of cooperation provided for and encouraged by regional laws that brought together groups of small and medium-sized libraries, often around a system centre supplying shared services. A few years later, in the wake of the new information technologies and in line with ongoing trends in the most advanced countries, in Italy, too, the term "cooperation" became increasingly associated with the concept of computerized library networks.
The decisive impulse in this direction came from a project of the National Library Service (SBN), the national network of Italian libraries, then in a gestation stage, which also had the merit of speeding up the opening of the Italian librarianship profession to experiences underway in the most advanced countries.1

In the '80s, cooperation, together with automation, was the dominant theme at conferences and in Italian professional literature. However, the heat of the debate had no satisfactory counterpart in terms of practical implementation, because of both resistance attributable to a noninnovative administrative culture and the polarization of the bulk of the investments around a single major project (the SBN network), the technical and organizational choices of which were shared by only part of the libraries, while others remained completely outside this programme. Many librarians, while recognizing the progress over the last fifteen or twenty years (including the possibility of accessing the collective catalogue of SBN libraries through the Internet), maintain that results obtained in the area of cooperation are well below expectations, or the energy involved. I am touching here on one of the most sensitive, controversial points in the ongoing professional debate, which I do not wish to dwell on except to note the split that came in Italian libraries following the vicissitudes of a project that ought, instead, to have united them and stimulated large-scale cooperation.2

I shall now seek to summarize the cooperation position in Italy in relation to the subject of this article. Very schematically (and arbitrarily) I have grouped the experiences I feel most significant under three heads: the SBN network, territorial library systems, and sectoral cooperation.

SBN brings together some eight hundred large, medium-sized, and small libraries (national, local-authority, university, and research-institute). The programme, funded by the central government, supports cooperation in the following main sectors:

• hardware sharing,
• development and maintenance of library software packages,
• network administration,
• shared cataloguing, and
• interlibrary loans.

The SBN is a star network with its central node consisting of a database (the so-called "index") containing the collective catalogue of the participating libraries (currently some four million relevant bibliographic titles and 7.5 million locations). To the index are linked the thirty-seven local systems, single libraries or multiple libraries, that apply the computerized procedures developed by the SBN programme. Thus the SBN is a closed network in which only those libraries that agree to adopt the automation systems distributed by the Central Institute for the Union Catalogue (ICCU), the central office coordinating the programme, take part.
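A minimal sketch of this hub-and-spoke arrangement: a central index mapping each bibliographic title to the locations (participating local systems) that hold it. The classes, identifiers, and method names are illustrative assumptions, not the SBN data model.

from dataclasses import dataclass, field

@dataclass
class IndexEntry:
    # one record in the central "index": a title plus its locations
    title: str
    locations: set = field(default_factory=set)

class UnionCatalogue:
    def __init__(self):
        self.index = {}  # bibliographic id -> IndexEntry

    def report_holding(self, bib_id, title, library):
        # a local system contributes its holding to the central node
        entry = self.index.setdefault(bib_id, IndexEntry(title))
        entry.locations.add(library)

    def locate(self, bib_id):
        # which participating libraries hold this title?
        entry = self.index.get(bib_id)
        return entry.locations if entry else set()

# usage: two local systems report holdings of the same title
catalogue = UnionCatalogue()
catalogue.report_holding("bib001", "I promessi sposi", "Library A")
catalogue.report_holding("bib001", "I promessi sposi", "Library B")
print(catalogue.locate("bib001"))  # {'Library A', 'Library B'}

The star topology is visible in the code: local systems never talk to one another directly; all shared cataloguing and location data flow through the single central index.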
From the organizational viewpoint, the SBN can be regarded as a de facto consortium (i.e., not in the legal sense of the term), even if the management bodies, participation structures, and funding mechanisms differ considerably from consortia that have been set up in other countries. In fact, libraries join the SBN through an agreement among state, regions, and universities, and the governing bodies represent not the libraries but their parent institutions. Participating libraries receive the services free, and funding for developing the systems and network administration comes from the central government, which coordinates the technical level of the project through ICCU.3 Currently, ideas are moving toward evolving the SBN into an open network system and reorganizing its management bodies: if this provision becomes a reality, the SBN will have the potential for taking on an important role in developing digital cooperation.

The territorial library systems, developed especially in the central and northern regions, consist of small groups of public libraries cooperating in one or more sectors of activity, such as:

• sharing computer systems,
• cataloguing,
• centralized management of purchases,
• interlibrary loans, and
• professional training and other activities.

The library systems are based on conventions and formal or informal agreements between local institutions (the municipalities) and receive support from the provincial and regional administrations. In more recent years some systems (e.g., Abano Terme, in the Veneto) have formed themselves into formal, legal consortia. The most advanced experiences in this sector, for example, the libraries in the Valseriana (an industrial valley in Lombardy), which have been operating on the basis of an informal consortium for some twenty years now, have reached a high level of efficiency comparable with the most developed European situations and may rightly be regarded as reference models for the organization of cooperation. However, given their limited size, they are unlikely to achieve economies of scale in the digital context unless they develop broader alliances. It is not unlikely that these consortia, given their capacity to work together, will in the near future develop broader forms of cooperation suited to tackling current technological challenges.

Sectoral cooperation (cooperation by area of specialization) is meeting today with steadily increasing interest, though it did not fare very well in the past. Among the rare initiatives embarked upon by university and research libraries in this direction, particular importance in our context attaches to the National Coordination of Architectural Libraries (CNBA), started some twenty years ago, which became an association in 1991. The CNBA has various projects on its programme and can be regarded as an established reference point for cooperation among architectural libraries. We should also mention one of the "oldest" cooperation projects among research libraries: the Italian periodicals catalogue promoted by the National Research Council (CNR), recently made available online by the University of Bologna.4

To complete this sketch, at least a mention should be made of the participation of Italian libraries in the European Commission's technical programme in favor of libraries. This programme, which since 1991 has mobilized the world of libraries in the European Union, not only favors and guides the spread of technologies into libraries in accordance with preset objectives, but also has the aim of encouraging cooperation among libraries in the various countries. The programme, the latest edition of which includes not just libraries but also archives and museums, has secured significant participation from many Italian libraries.
Over and above the validity of the projects already carried out or under way (important as that is), this programme has been very valuable to Italian libraries in terms of exchanges of experience and of opening up professional horizons, especially as regards cooperation practice.5

Digital Cooperation

Recently, following the expansion of electronic publishing, university libraries have been displaying renewed interest in cooperation activities, with particular reference to acquiring licenses and sharing electronic resources. This movement is at present in full swing and is giving rise to manifold cooperation initiatives. To get an idea of the trends under way, one may leaf through a session on database networking in Italian universities in the proceedings of the AIB Congress at Genoa.6 On that occasion a group of universities presented a "Draft Proposal of Agreement on Access to Electronic Information." The document is divided into two parts, the first defining the purposes and object of university cooperation in the sphere of electronic information. The second part indicates operational objectives for cooperation in acquiring electronic information and proposes a model contract for purchasing licenses, to which member universities are to keep. The content of this second part coincides with the recommendations and understandings signed by associations, consortia, and groups of libraries in other countries, and largely follows the indications and recommendations issued by the European Bureau of Library, Information and Documentation Associations (EBLIDA), the organization that brings together the library associations of the various European countries; by the International Coalition of Library Consortia (ICOLC); and by other library organizations.

There is no point here in listing all initiatives under way in Italian libraries, in part because most of them are only just started or in the experimental stage. I shall mention a few only to bring out the trends that seem, from my point of view, to be emerging.

Development of Digital Collections

At the moment initiatives in this sector are much fewer and less substantial than in other industrialized countries. Among them the Biblioteca Telematica Italiana stands out: in it, fourteen Italian and two foreign universities digitize, archive, and put online works in Italian. The project is based on a consortium, the Italian Interuniversity Library Center for the Italian Telematic Library (CIBIT), supported by funds from the National Research Council (CNR) and made up of the fourteen Italian and two foreign universities that have signed the agreement. Technical support is provided by the CNR Institute for Computational Linguistics, located in Pisa.7

In this context we must also note, especially for the consequences it may have for the future growth of digital collections, an agreement between the National Central Library in Florence and the publishers and authors associations aimed at accomplishing the National Legal Depository for Electronic Publishing project, which also provides for production of a section of the Italian National Bibliography to be called BNI-Documenti Elettronici. The publishers who have signed the agreement undertake to supply a copy of their electronic products to the National Central Library in Florence.
The latter undertakes to guarantee conservation of the electronic products deposited, and to make them accessible to the public in accordance with the agreements reached.8

Description of Electronic Resources

In this area the bulk of the initiatives are still in an embryonic stage. In the sector of periodicals index production (i.e., TOCs), mention should be made of Economic Social Science Periodicals (ESSPER), a cooperation project on Italian economics periodicals launched by the Libero Istituto Universitario Carlo Cattaneo (Castellanza, Varese), to which some forty libraries are contributing.9 Recently the project has been extended to Italian legal journals. ESSPER is a cooperative programme based on an informal agreement among the libraries, each of which undertakes to supply in good time the TOCs of the periodical titles they have undertaken to monitor. The programme does not benefit from any outside funds, being supported entirely by the participating libraries, which have recently been endeavouring to evolve into a more structured form of cooperation.

Administration of Electronic Resources and Licenses

In this sphere there have been numerous initiatives recently, particularly by university libraries. One may note, first, a certain activism by university data-processing consortia (big computing centres created at the start of the computer era to support applications in scientific and then university and library administration areas). The Interuniversity Consortium for Automation (CILEA) in Milan, which has for some time been operating in the area of library systems and electronic information distribution (especially in the biomedical sector), has extended its activities by offering services to nonmembers of the consortium too. Recently CILEA, in connection with a broader programme, the CDL (CILEA Digital Library), has been negotiating with a number of major publishers the distribution of electronic journals and online bibliographic services on the basis of needs expressed by the libraries in the consortium. CASPUR (the university computing consortium in Rome) is working on several projects, among them shared management of electronic resources on CD-ROM in a network among five universities of the Centre-South. CASPUR, too, has opened its services to libraries not in the consortium and is negotiating with a number of major publishers the licenses for establishing a mirror site for electronic periodicals. The University of Genoa, through CSITA, its computing services centre, has concluded an agreement with an Italian distributor of electronic services to enable multisite license-sharing for biomedical databases by institutions operating on the territory of Liguria. Very recently the universities of Florence, Bologna, Modena, Genoa, and Venice and the European University Institute in Florence have initiated a pilot project (CIPE) for shared administration of electronic periodicals, and have begun negotiations with a number of publishers.

Let us now seek to draw some conclusions from this initial, brief consideration of current initiatives:

• Initiatives in the area of digital cooperation are coming mainly from the world of university and research-institute libraries.
• No projects are big enough to achieve economies of scale, with most initiatives in hand having a very limited number of partners and often being experimental in nature.
• Projects under way do not provide for the formation of proper consortia, most likely because the legal form of the consortium is hard to set up in Italy because of the burdens involved, especially the complexity and length of the decision-making processes needed to constitute such an organization.
• Librarians prefer decentralized forms of cooperation, partly because, shaken by experiences of the past, they fear losing autonomy and efficiency and finding themselves caught up in the bureaucracy of centralized organizations. "However, there can also be a correlation between the amount of autonomy that the individual institution retains and the ability of the consortium to achieve goals as a group." This observation by Allen and Hirshon obviously holds for Italy too.10 It is no coincidence, in fact, that university computing consortia, which have centralized staff and funds available, are able to carry out more incisive actions in this sector.
• Except for the Biblioteca Telematica Italiana, no initiatives seem to have been incentivized by ad hoc government programmes or funds.
• A part of the cooperation projects concerns sharing of databases on CD-ROM. The traditional Italian resistance to online materials would seem to be due partly to the still inadequate network infrastructures in our country; improvements in this sector might bring a quick turnaround here.
• Some initiatives in hand have been inspired more by suppliers than by librarians: the risk is to cooperate in distributing a particular product, not to enhance libraries' bargaining power. Without wishing to deny anything to the suppliers, who today play an essential part in terms of professional information too, I feel that keeping the roles clearly separate may help to develop clear, upright, and mutually advantageous cooperation.
• Some major projects are being led by university computing consortia that have begun to take an interest in the library sector. The university computing consortia would indeed have some of the requirements to play a first-rank role in this sphere if they can manage to bring themselves into their most natural position, i.e., to operate as agents of libraries rather than as distributors of services on behalf of the commercial suppliers. Moreover, it ought to be clear that the computing consortia should act as partners with the library consortia and not as substitutes for them; otherwise the libraries risk limiting their autonomy of decision.
• Some attention is turning toward university electronic publishing, though at the present stage there do not seem to be practical projects for cooperation in this area.
• Finally, one has to note the low level of initiative by libraries (compared with other countries) in developing content and in storing digital collections.

The analysis I have rapidly summarized here is the basis for an initiative which has in recent months been stimulating the debate on digital cooperation in Italy. I am referring to the Italian National Forum on Electronic Information Resources (INFER), a coordination group initially promoted by the European University Institute, the University of Florence, and a number of universities in the Centre-North, which is today extending beyond the sphere of university and research libraries.
The forum's chief mission is to cooperate to promote efficient use of electronic information resources and facilitate access by the public. To this end it encourages libraries to set up consortia and other types of agreement on acquisition and management of electronic resources and access to them. INFER's objectives can be summarized as follows:

• to act as a reference and linkage point and develop initiatives to promote activities and programmes in the area of library electronic resource sharing;
• to enhance awareness both at institutional and political levels (ministries, universities, local authorities, etc.) and among librarians and end users;
• to facilitate dialogue and mutual collaboration between libraries and all others in the knowledge production and distribution chain, to help them all (authors, publishers, intermediaries, end users) to take advantage of the opportunities offered by the information society; and
• to maintain contacts with similar initiatives under way in other countries.

INFER has immediately embarked on a rich programme of activities which is giving appreciable results, especially in terms of raising awareness of the problem and coordinating initiatives in the area. We shall here briefly mention some of the actions in hand that seem to us most important.

Dissemination of information. INFER has developed a Web site where, as well as information on the Forum's activities, important information and documents can be found relating to the consortia, the negotiations and licenses, and in general the digital resource-sharing programmes in Italy and around the world.11 A discussion list for INFER members has also been activated.

Seminars and workshops. This activity is aimed at further exploration of themes of particular interest (e.g., legal aspects of license contracts, or programmes under way in other countries).

Data collection. The two main programmes coming under this heading are: (a) monitoring of Italian cooperation initiatives under way in the digital sector; and (b) collecting data on acquisitions of electronic information resources in university libraries. This information will enable the libraries to have a more exact picture of the situation, so as to assess their bargaining power and achieve the necessary support to adopt the most appropriate strategies.

Indications and recommendations. As well as translating and distributing documents from the most important associations operating in this area (such as EBLIDA, ICOLC, and IFLA), INFER is developing a model license for the Italian consortia.

INFER was set up in May 1999 and currently has some forty members, most of them representatives of university library systems, university computing consortia or research libraries, or university professors. One of INFER's aspirations is to persuade decision-makers to develop a programme of incentives on a national scale for the creation of library consortia.

Critical Factors

As to the delay we note in terms of shared management of electronic resources, weight clearly attaches to the fact that cooperation is not very established, nor are the national structures that ought to have supported it.
It would be all too easy, and perhaps also more fun, to attribute this situation to the so-called individualism of Italians and to abandon inquiry into the structural limitations that may have determined it. First of all, except in very few cases, libraries have no administrative autonomy, or only very little, with hardly any decision-making powers. This factor favors interference in decision-making processes, complicates them, slows down procedures, and strips librarians of their responsibility. One of the reasons why the SBN has not managed to generate cooperation is to be sought in the mechanisms for joining and participating in the programme. In other words, many libraries have joined the SBN following decisions taken from above, at the political and administrative levels, and not on the basis of an autonomous, weighted assessment of attitudes, needs, and alternatives. These experiences have augmented libraries' reluctance to embark on centrally steered national programmes. On the other hand, the low administrative autonomy they have prevents them from implementing truly effective alternative solutions, i.e., ones able to realize economies of scale.

Another factor is the administrative fragmentation of libraries. The big universities have fifty or so libraries each (often one per department). Some universities have an office coordinating the libraries, but only in very few cases does this structure have the powers and the necessary support to coordinate; more often it acts as a mediation office with no real administrative powers. In short, the result is that since (perhaps also because of a misunderstood sense of departmental autonomy) there is no decision-making centre for libraries in each university, decisional processes prove slow and cumbersome. Clearly, all this brings many problems in establishing understandings and cooperative programmes with other libraries and weakens the universities in negotiating licenses. This position, while objectively favoring suppliers in the short term, in the long term risks facing them with difficulties, given a market made increasingly impoverished and uncertain by the fragmentation and the limited capacity of possible purchasers.

Another limit is the insufficient awareness, especially on the academic side, of the challenges of electronic information. In early 1999 the French daily Le Monde published an extensive feature on scientific publishing, showing how current publishing production mechanisms, while assuring a few big publishers of ample profit margins, are suffocating libraries and universities under the continuous rises in prices for scientific journals.12 The argument, immediately taken up by the Spanish El País and other European newspapers, met with very little response in Italy. Clearly, in Italy today, the conditions do not exist to embark on initiatives like the incisive open letter to publishers sent by the Kommission des Deutschen Bibliotheksinstituts für Erwerbung und Bestandsentwicklung in Germany, supported by similar Swiss, Austrian, and Dutch organizations.13

The lack of an adequate national policy in the area of electronic information is probably the direct consequence of the problems I have just mentioned. In this context, however praiseworthy the initiatives, they tend, in the absence of reference points and practical support, to break up or fritter away.
Under the Ministry for Universities there are no leadership or action bodies in the area of academic information, like the Joint Information Systems Committee in Britain, that stimulate programmes aimed at developing and utilizing information technologies in university and research libraries. These observations are also valid for the state libraries and public libraries, where the central (Ministry for Cultural Affairs) and regional authorities could play a more effective part in promoting digital cooperation.

Conclusions

The picture I have presented is not very rosy. However, it does reveal considerable elements of vitality and greater awareness of the problems emerging, starting with a few representatives of academic sectors who might be able to wield influence and bring about a turnaround. At the moment, the consortium movement to share electronic resources chiefly involves university libraries, but a few initiatives by public libraries are starting to appear, especially in the multimedia products sector. No specific lines of action are yet emerging at the level of the national authorities, especially the Ministry for Education and Research and the Ministry of Cultural Activities, on which the national libraries and many research libraries depend. It is likely that in the near future the entry of these agencies may be able to modify the current scenario and considerably influence the approach to cooperation. From this viewpoint, the impression is that a few consortium initiatives that have been flourishing in recent months, on the part of both libraries and suppliers, have the principal aim of proposing cooperation models to guide future choices. In conclusion, we are only at the outset, and the game is still waiting to be played.

References and Notes

1. Michel Boissel, "L'Organisation Automatisée de la Bibliothèque de l'Institut Universitaire Européen de Florence," Bulletin des Bibliothèques de France 24, no. 5 (1979): 231-39. For an overall picture of the debate, see La Cooperazione: Il Servizio Bibliotecario Nazionale: Atti del 30° Congresso dell'Associazione Italiana Biblioteche, Giardini Naxos, November 21-24, 1982 (Messina: Università di Messina, 1986).
2. Tommaso Giordano, "Biblioteche tra Conservazione e Innovazione," in Giornate Lincee sulle Biblioteche Pubbliche Statali, Roma, January 21-22, 1993 (Roma: Accademia Nazionale dei Lincei, 1994): 57-65. For the most recent developments in the debate, see the articles by Antonio Scolari, "A Proposito di SBN"; Giovanna Mazzola Merola, "Lo Studio sull'Evoluzione del Servizio Bibliotecario Nazionale"; and Claudio Leombroni, "SBN: un Bilancio per il Futuro," Bollettino AIB 37, no. 4 (1997): 437-66.
3. Further information on SBN can be found at www.iccu.sbn.it/sbn.htm, accessed Oct. 27, 1999, where the collective catalogue of participating libraries is also accessible.
4. Catalogo Italiano dei Periodici (ACNP), www.cib.unibo.it/cataloghi/infoACNP.htm, accessed Sept. 19, 1999.
5. There is a considerable literature on the European Commission's "Libraries Programme"; for a summary of projects in the programme, see Telematics for Libraries: Synopses of Projects (Luxembourg: Office for Official Publications of the European Communities, 1998). Updated information on the latest version of the programme can be found at www.echo.lu/digicult, accessed Oct. 26, 1999.
On Italian participation in the programme, see Ministero per i Beni Culturali e Ambientali, L'Osservatorio dei Programmi Internazionali delle Biblioteche 1995-1998 (Roma: MBAC, 1999).
6. Associazione Italiana Biblioteche (AIB), XLIV Congresso Nazionale AIB, Genova, 1998: www.aib.it/aib/congr/co98univ.htm, accessed Oct. 27, 1999.
7. More information about CIBIT can be found at www.ilc.pi.cnr.it/pesystem/19.htm, accessed May 19, 2000.
8. Progetto EDEN: Deposito Legale Editoria Elettronica Nazionale, www.bncf.firenze.sbn.it/progetti.html, accessed Sept. 29, 1999.
9. More information about ESSPER may be found at www.liuc.it/biblio/essper/Default.htm, accessed May 19, 2000.
10. Barbara McFadden Allen and Arnold Hirshon, "Hanging Together to Avoid Hanging Separately: Opportunities for Academic Libraries and Consortia," Information Technology and Libraries 17, no. 1 (1998): 37-44.
11. The INFER Web page can be found on the Università di Roma 1 site, www.uniroma1.it/infer, accessed May 19, 1999.
12. Le Monde, 22 Jan. 1999; a whole page is devoted to this topic. See especially the article titled "Les Journaux Scientifiques Menacés par la Concurrence d'Internet," www.lemonde.fr/nvtechno/branche/journo/index.html, accessed Feb. 4, 1999. The point was taken up again by El País, 27 Jan. 1999; see the article titled "Las Revistas Científicas, Amenazadas por Internet."
13. The letter, signed by Werner Reinhardt, DBI president, is available at www.ub.uni-siegen.de/pub/misc/Offener_Brief-engl.pdf, accessed Feb. 4, 1999.

10083 ----

Consortia Building: A Handshake and a Smile, Island Style

Patricia J. Cutright (cutright@eou.edu) is Library Director of the Pierce Library at Eastern Oregon University.

In the evaluation of consortia and what constitutes these entities, the discussion runs the gamut: from small, loosely knit groups interested in cooperation for the sake of improving services to large, membership-driven organizations addressing multiple interests, all recognize the benefits of partnerships. The Federated States of Micronesia are located in the western Pacific Ocean and cover 3.2 million square miles. Throughout this scattering of small islands exists an enthusiastic library community of staff and users that has changed the outlook of libraries since 1991. Motivated by the collaborative efforts of this group, a project has unfolded over the past year that will further enhance library services through staff training and education while utilizing innovative technology. In assessing the library needs of the region, this group crafted the document "The Federated States of Micronesia Library Services Plan, 1999-2003," which coalesces the concepts, goals, and priorities put forward by a broad-based contingent of librarians. The compilation of the plan and its implementation demonstrate an understanding of the issues and exhibit the ingenuity, creativity, and willingness to solve problems on a grand scale, addressing the needs of all libraries in this vast Pacific region.

The basic philosophy inherent in librarianship is the concept of sharing. The dissemination of information through material exchange and interlibrary communication has enriched societies for centuries.
There are few institutions other than libraries that are better equipped or suited for such cooperation and collaborative endeavors. With service as the lifeblood that runs through its inky veins, the library has the potential to be the driving force in any community toward partnerships that afford mutual benefit for all.

The examination of the literature exposes a wide range of perceptions as to the definition of what is a consortium. The term "consortia" conjures up impressions that span the spectrum from highly organized, membership-driven groups to loosely knit cadres focusing on improving services to their patrons however they can make it happen. In Kopp's paper "Library Consortia and Information Technology: The Past, the Present, the Promise" he presents information from a study conducted by Ruth Patrick on academic library consortia. In that study she identified four general types of consortia:

• large consortia concerned primarily with computerized large-scale technical processing;
• small consortia concerned with user services and everyday problems;
• limited-purpose consortia cooperating with respect to limited special subject areas; and
• limited-purpose consortia concerned primarily with interlibrary loan or reference network operations.1

With this distinction in mind, this paper will focus on the second category, typifying a small, less structured organization. While on a visiting assistantship in the Federated States of Micronesia (FSM), I worked with a partnership of libraries that believe that in order for cooperation to succeed, results for the patron must be the goal, not equity between libraries or some magical balance between resources lent by one library and resources received from another library.2 Unified effort to provide service to the patron is the key.

The libraries on a small, remote island situated in the western Pacific Ocean exhibit this grassroots effort that defines the true meaning of consortia, demonstrating collaboration, cooperation, and partnerships. It is a multitype library cooperative that encompasses interaction not only among libraries but also between agencies as well as governments. The librarians on the island of Pohnpei, Micronesia, and all the islands throughout the Federated States of Micronesia have embraced this consortial attitude while achieving much through these collaborative efforts:

• the joint work done on crafting the Library Services Plan, 1999-2003 for the libraries throughout the Federated States of Micronesia;
• initiating successful grant-writing efforts which target national goals and priorities;
• implementing a collaborative library automation project which is designed to evolve into a national union catalog; and
• the implementation of a viable resource-sharing and document delivery service for the nation.

Background and Socioeconomic Overview

Micronesia, a name meaning "tiny islands," comprises some 2,200 volcanic and coral islands spread throughout
Lying west of Hawaii, east of the Philippines, south of Japan and north of Australia, the total land mass of all these tropical islands is fewer than 1,200 square miles with a population base estimated at no more than 111,500.3 A location- unique region, but nonetheless still plagued with all the problems associated with any geographically remote, economically depressed area found anywhere in the United States or elsewhere in the world. The Federated States of Micronesia is a small-island, developing nation that is aligned with the United States through a Compact of Free Association, making it eligible for many U.S. federal programs. The economic base is cen- tered around fisheries and marine-related industries, tourism, agriculture, and small-scale manufacturing. The average per capita income in 1996 was $1,657 for the four states of the FSM: Kosrae, Pohnpei, Yap, and Chuuk. Thirteen major languages exist in the country, with English as the primary second language. The 607 different islands, atolls, and islets dot an immense expanse of ocean; this geographic condition presents challenges in implementing and enhancing library services and technology. 4 Despite the extreme geographic and economic condi- tions, the College of Micronesia-FSM National campus in collaboration with the librarians throughout the states have been successful in implementing nationwide proj- ects. These endeavors have resulted in technical infra- structure and the foundation for information technology instruction supported through awards from the U.S. Department of Education, the Title III program, and the National Science Foundation. I Collaboration: Building Bridges that Cross the Oceans The libraries in Micronesia have shown an ongoing com- mitment to librarianship and cooperation since the estab- lishment of the Pacific Islands Association of Libraries and Archives (PIALA) in 1991. The organization is a Micronesia-based regional association committed to fos- tering awareness and encouraging cooperation and resource sharing among libraries, archives, museums, and related institutions. PIALA was formed to address the needs of Pacific Islands librarians and archivists, with a special focus on Micronesia; it is responsible for the common-thread cohe- siveness shared by the librarians over the past eight years. The organization has grown to become an effective champion of the needs of libraries and librarians in the Pacific region.s When PIALA was established, the most pressing areas of concern within the region were development of resource-sharing tools and networks among the libraries, archives, museums, and related institutions of the Pacific Islands. The development of continuing education pro- grams and the promotion of technology and telecommu- nications applications throughout the region were areas targeted for attention. Those concerns have changed little since the group's inception. Building upon that original premise, in January 1999 a group of interested parties from throughout the Federated States of Micronesia met to draft a document they envisioned would lay the groundwork for library planning over the next five years. This strategic plan encompasses all library activity-services, staffing, and the impact technology will have on libraries in the region. The document, "The Federated States of Micronesia Library Services Plan, 1999-2003," coalesces the concepts, goals, and priorities put forward by a broad-based contin- gent. 
In this meeting, the group addressed basic issues of library and museum service, barriers and solutions to improve service delivery, and additional funding and training resources for libraries and museums.6 The compilation of the plan crafted at the gathering demonstrated a thorough understanding of the issues that face the librarians of the vast region. It exhibits the ingenuity, creativity, and willingness to problem-solve on a grand scale in a way that addresses the needs of all libraries in the Pacific region.

The goals set forth by the writing-session group illustrate the concerns impacting library populations throughout the FSM. The FSM has now established six major goals to carry out its responsibilities and address the need for overall improvement in the delivery of library services:

1. Establish or enhance electronic linkages between and among libraries, archives, and museums in the FSM.
2. Enhance basic services delivery and promote improvement of infrastructure and facilities.
3. Develop and deliver training programs for library staff and users of the libraries.
4. Promote public education and awareness of libraries as information systems and sources for lifelong learning.
5. Develop local and nationwide partnerships for the establishment and enhancement of libraries, museums, and archives.
6. Improve quality of information access for all segments of the FSM population and extend access to information to underserved segments of the population.

Priorities

The following are general priorities for the FSM Library Services Plan. The priorities represent needs for overall improvement of the libraries, museums, and archives, and are based on the fact that library, museum, and archive development is currently in its infancy in the FSM. Specific priorities will change from year to year as programs are developed.

1. Establishment of new libraries and enhancement of existing library facilities to increase the accessibility of all FSM citizens to library resources and services. Outer islands and remote areas generally have no access to libraries or information sources. New facilities or mechanisms need to be established to provide access to information resources for the public. Existing public and school library facilities often lack the adequate staffing, climate control, and electrical connections needed to meet the needs of the community. Existing public and school libraries also need to improve their facilities and services delivery to meet the needs of disabled individuals and other special populations.
2. Provide training and professional development for library operation and use of new information technologies. A survey held during the writing session indicated that public and school library staff do not currently possess the skills needed to effectively provide assistance in the use of new information technologies. Well-designed training programs with mechanisms for follow-up technical assistance and support need to be developed and implemented.
3. Promote collaboration and cooperation among libraries, museums, and archives for sharing of holdings and technical ability. Limited holdings, financial capacity, and human resources are major barriers to improving library services. Collaboration and cooperation are needed among libraries, museums, and archives to maximize scarce resources.
4. Develop recommended standards and guidelines for library services in the FSM. The ability to share resources and information could be significantly increased by the development and implementation of recommended standards and guidelines for library services. Standardization could assist with sharing of holdings and holdings information, increase availability of technical assistance, and provide guidance as new libraries and library services are set up.
5. Increase access to electronic information sources. Existing public and school libraries have limited or no access to electronic linkages, including basic services such as e-mail and connections to the Internet. The priority need is to establish basic electronic linkages for all libraries, followed by extending access to electronic information to all users.7

Shifting into Action

With the drafting of this five-year plan, the librarians stated emphatically the need and desire to move ahead with haste and determination. As the plan was conceptualized and documented, a small cadre of librarians from the College of Micronesia-FSM National campus, the public library, and the high school library crafted two successful grant proposals, which addressed:

• a cooperative library automation project designed to evolve into a national union catalog (goal 1; priorities 3, 5);
• the installation of Internet services that would link the College of Micronesia-FSM campuses, the public library, and the high school library (goals 1, 2, 6; priorities 1, 2, 3, 5);
• the development and delivery of training programs for library staff and users of the libraries (goals 3, 4, 6; priority 2); and
• the implementation of a viable resource-sharing and document delivery service for the nation (goals 1, 2, 5, 6; priorities 3, 4, 5).

Over the past year the awarding of grant funds has shifted the library community into high gear with the design and implementation of project activities that will fulfill the targeted needs.

The Automation Project and Internet Connectivity

A collaborative request submitted by the Bailey Olter High School (BOHS) library and the Pohnpei Public Library provided the funding necessary to computerize the manual card catalog system at BOHS and upgrade the dated automated library system at Pohnpei Public Library.

Since the College of Micronesia-FSM campuses are automated, it was important for the high school library and the public library to install like systems in order to achieve a networkable automated system, facilitating the development of a union catalog for all the libraries' holdings. This migration to an automated system promoted cooperation and resource sharing for the island libraries, opening a wealth of information for all island residents. The project entailed purchasing a turnkey cataloging and circulation system that will facilitate the cataloging and processing of new acquisitions for each library as well as the conversion of approximately five thousand volumes of material already owned by the public and high school libraries. Through Internet connectivity, which was integral to the project, the system would also serve as public access to the many holdings of the libraries for students, faculty, and town patrons through a union catalog to be established in the future.
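To make the union catalog concept concrete, the toy sketch below merges the holdings of several libraries into a single deduplicated catalog keyed on a normalized identifier. It is only an illustration of the general idea: the record fields, matching heuristic, and sample data are invented for this example and do not describe the turnkey system actually purchased.

```python
from collections import defaultdict

def normalize_key(record):
    """Reduce a bibliographic record to a match key.
    Real union catalogs match on carefully normalized MARC fields;
    this ISBN-then-title heuristic is only a toy stand-in."""
    if record.get("isbn"):
        return ("isbn", record["isbn"].replace("-", ""))
    return ("title-author",
            record["title"].strip().lower(),
            record.get("author", "").strip().lower())

def build_union_catalog(catalogs):
    """Merge per-library record lists into one deduplicated catalog,
    keeping a holdings list showing which libraries own each title."""
    union = defaultdict(lambda: {"record": None, "holdings": []})
    for library, records in catalogs.items():
        for record in records:
            entry = union[normalize_key(record)]
            if entry["record"] is None:
                entry["record"] = record   # first copy seen becomes the master record
            entry["holdings"].append(library)
    return dict(union)

# Invented sample data for two of the participating libraries.
catalogs = {
    "Pohnpei Public Library": [
        {"isbn": "0-000-00000-0", "title": "Island Navigation", "author": "Example, A."},
    ],
    "Bailey Olter High School Library": [
        {"isbn": "0-000-00000-0", "title": "Island Navigation", "author": "Example, A."},
        {"title": "A History of Pohnpei", "author": "Example, B."},
    ],
}

for entry in build_union_catalog(catalogs).values():
    print(entry["record"]["title"], "held by:", ", ".join(entry["holdings"]))
```

A production union catalog would of course handle conflicting records and far messier matching, but the essential output is the same: one master record per title, with a holdings list telling any patron on the island which libraries own a copy.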
The development and delivery of training programs for library staff and users is linked to the implementation of a viable resource-sharing and document delivery service for the nation. As stated earlier, the librarians of the Federated States of Micronesia accepted the challenge facing them in ramping up for the twenty-first century. Their prior experience laid the groundwork necessary to implement the training programs needed to bring the library community the required knowledge and skills.

A survey administered during the writing session indicated that few public and school librarians have significant training in or use of electronic linkages or information technologies, nor are they actively using such technologies at present. Of the fourteen public and school librarians in the four states of Micronesia, none hold a master's degree from an accredited library school or Library Media Specialist certification. An exception is the library staff at the COM-FSM National campus, where two-thirds of the librarians hold professional credentials.

Significant effort is needed on a sustained basis for effective training in the understanding and use of information systems throughout the nation. Where training has occurred, it has often been of an infrequent, short variety with little support for ensuring implementation at the work site. Additionally, there are often no formal systems for getting answers to questions when problems do arise.

In addressing the information needs of this population it is apparent that education is the key component for continued improvement of library services. This concern is evident in a paper by Daniel Barron, where it is stated that only 54 percent of librarians and 19 percent of staff in libraries serving communities considered to be rural (i.e., 25,000 people or fewer) have an ALA-MLS.8 And Dowlin proposes even more perplexing questions: "How can a staff with such an educational deficit be expected to accomplish all that will be demanded to enable their libraries to go beyond being a warehouse of popular reading materials? How can we expect them to change from pointers and retrievers to organizers and facilitators?"9

Micronesia is no different from any other state or country in wanting its population to have access to qualified staff, current resources, and services. It recognizes that its libraries are inadequately staffed and that many have staff who are seriously undereducated to meet the expanded information needs of the people in their communities. If these libraries are to seize the opportunities suggested by the developing positive view, develop services to support this view, and market such a view to a wider range of citizens in their communities, they must invest in the intellectual capital of their staffs.

In order to carry out this charge, the following activities were designed to address the educational and training needs of the librarians in the FSM. As outlined in a recently funded Institute of Museum and Library Services (IMLS) National Leadership grant, preparation has begun with the following activities, which will address the staffing and technology concerns described in FSM libraries:

1. Recruit and hire an outreach services librarian to survey training needs, coordinate and plan training, and deliver or arrange for needed training.
2. Develop a skills profile for all library, museum, and archival staff positions.
3. Identify a training contact or coordinator for each state.
4. Develop and provide periodic updates to operational manuals for school and public libraries, museums, and archives.
5. Recruit local students and assist them in seeking out scholarships for professional training off island.
6. Design and implement programs to provide continuous training and on-site support in new technological developments and information systems (provided on-site and virtually).
7. Establish a Summer Training Institute offering training based on needs as determined by the outreach services librarian in collaboration with state coordinators, recruiting on- and off-island expertise as instructors.
8. Design and develop programs for orientation and training of users of information systems (provided on-site and virtually).
9. Develop and implement a "train the trainer" program, with representation from all four states, that will ensure continuity and sustainability of the project for years to come.10

The primary requisite to initiating this project is the recruitment and hiring of the outreach services librarian, who will then begin the activities as listed. A beginning cadre of librarians gleaned from the summer institute will become the trainers of the future, perpetuating a learning environment enhanced with advanced technology.

Breakthroughs in distance education, aided by advances in telecommunications, will significantly impact this project. On-site training will be imperative for the initial cadre of summer institute attendees to provide sound teaching skills and a firm understanding of the material at hand. Follow-up training will be presented on each island by the trainer, either on location or virtually with available technology. Products such as Web Course in a Box, WebCT, and Nicenet will be analyzed for appropriate utilization as teaching tools. These products will take advantage of newly established Internet connections on each island and, more importantly, will provide the interactive element that distinguishes this learning methodology from the "talking head" or traditional correspondence course approach. A Web site designed for this project will provide valuable information and connectivity not only for the Pacific library community but for anyone worldwide who may be interested in innovative methods of serving remote populations.

Using computer conferencing and virtual communities technology, a video conferencing system such as 8x8 Technologies will be used, allowing face-to-face interaction between trainer and student in an intra-island situation (interisland telephone rates are too expensive for regular use as a teaching tool).

To enhance the learning experience and information retrieval component for these librarians and the population they serve, the project also incorporates implementation of a viable resource-sharing, document delivery system capitalizing on a shared union catalog and using a service such as Research Libraries Group's Ariel product. With library budgets reflecting the critical economic climate of the nation, it becomes even more crucial for collaborative collection development and resource sharing to satisfy the needs of the library user.

To maintain cost-effective communication and build a sense of community among the librarians, the messaging software ICQ has been installed on all participant hardware and is utilized for group meetings, question and answer, and general correspondence.
Since ICQ operates as part of the Internet, this package allows low-cost communication with maximum benefit in connecting the group. This technology will also be used as the primary mechanism for communication with an outside advisor who will provide expertise in the area of outreach services for rural populations.

The realm of outreach services in libraries has always presented unique challenges that can now benefit greatly from current and emerging technologies. The definition of "outreach" is truly a matter of perspective, with the more traditional sense relating to a specific library serving its own user or patron. But current practice regards "outreach" more broadly, as an extension of services to all users, whether registered patron, colleague, or peer.

Micronesia is a country where the proverbial phrase "the haves and the have-nots" is amplified. The recent (and ongoing) installation of Internet services in the region has made possible many basic changes, but there still exists the reality that some of the sites proposed for services have nothing more than a common analog line and rudimentary services. As an example of the realities that exist, only 38 percent of the approximately 180 public schools in the FSM have access to reliable sources of electricity. Another challenge for these libraries is the climate and environment, which have a significant impact on library facilities, equipment, and holdings. The FSM lies in the tropics, with temperatures ranging daily from 85 to 95 degrees and humidity normally 85 percent or higher.11 The high salt content in the ocean air wreaks havoc upon electrical equipment, and the favorable environs inside a library often entice everything from termites in the wooden bookcases to nesting ants in keyboards. From these examples it is apparent that the problems that trouble these libraries are not going to be solved with the magic bullet of technology. This reality creates the need for varying strategies and different approaches to address the training requirements of the library staff.

Summary

The FSM library group, in particular the Pohnpeian librarians, has accomplished much in the past year. The flurry of activity that enveloped the libraries on Pohnpei was spurred by the collaborative writing session in January 1999. A week-long "meeting of the minds" from libraries throughout Micronesia produced the blueprint that will map the future of libraries and library service for years to come. These librarians stated their primary issues in delivering library services and came to a consensus on activities needed to address the issues. The "Federated States of Micronesia Library Services Plan, 1999-2003" was crafted as a working document, a strategic plan for improving library services in the Pacific region, and a commitment to achievement through collaboration.

While in Micronesia I observed the impact that the unification of ideas can have on the citizens of a community. In my fourteen-year tenure at Eastern Oregon University I have been exposed to the benefits of the "consortium attitude" that come from cooperation and partnerships. Time and again the university demonstrates the positive effects of what is referred to as the "politics of entanglement." Shepard describes the overriding philosophy that has been the recipe for success:

The politics are really quite simple.
We maintain an intricate pattern of relationships, any one of which might seem inconsequential. Yet there is strength in the whole that is largely unaffected if a single relationship wanes. Rather than mindlessly guarding turf, we seek to involve larger outside entities and, in the ensnaring, to turn potential competitors into helpful partners.12

Just as Eastern Oregon University has discovered, the libraries of the Federated States of Micronesia are learning the merits of entanglement.

References and Notes

1. James J. Kopp, "Library Consortia and Information Technology: The Past, the Present, the Promise," Information Technology and Libraries 17 (Mar. 1998): 7-12.
2. Jan Ison, "Rural Public Libraries in Multi-type Library Cooperatives," Library Trends 44 (Summer 1995): 29-52.
3. Pacific Islands Association of Libraries and Archives, www.uog.edu/rfk/piala.html, accessed June 6, 2000.
4. Division of Education, Department of Health, Education and Social Affairs, Federated States of Micronesia, "Federated States of Micronesia Library Services Plan 1999-2003" (March 3, 1999): 2.
5. Pacific Islands Association of Libraries and Archives, www.uog.edu/rfk/piala.html, accessed June 6, 2000.
6. Division of Education and others, "Library Services Plan," 4.
7. Ibid., 6.
8. Daniel D. Barron, "Staffing Rural Public Libraries: The Need to Invest in Intellectual Capital," Library Trends 44 (Summer 1995): 77-88.
9. K. E. Dowlin, "The Neographic Library: A 30-Year Perspective on Public Libraries," in Libraries and the Future: Essays on the Library in the Twenty-First Century, F. W. Lancaster, ed. (New York: Haworth Pr., 1993).
10. Patricia J. Cutright and Jean Thoulag, College of Micronesia-FSM National campus, "Institute of Museum and Library Services, National Leadership Grant" (Mar. 19, 1999).
11. Division of Education and others, "Library Services Plan," 2.
12. W. Bruce Shepard, "Spinning Interinstitutional Webs," AAHE Bulletin 49 (Feb. 1997): 3-6.

10084 ----
New Strategies in Library Services Organization: Consortia University Libraries in Spain

Miguel Duarte Barrionuevo

New political, economic, and technological developments, as well as the growth of information markets, in Spain have created a foundation for the creation of library consortia. The author describes the process by which different regions in Spain have organized university library consortia.

Spanish libraries are public entities that depend either on central or local governments and are funded through either the national general budget or the regional government (Comunidades Autónomas) budget. On one hand, the player at the national level is the Education and Culture Ministry, which contributes to the fifty-two state public libraries and shares jurisdiction with the regional governments. On the other hand, universities are self-governed institutions of a public nature regulated by the Ley de Reforma Universitaria, or University Reform Law, which was approved by the Spanish parliament in 1983 to promote scientific study and greater self-government of Spanish universities. Universities have their own budgets, and they are mainly funded by the regional governments. The university library system is currently made up of about fifty public libraries and twelve private libraries.

Miguel Duarte Barrionuevo is Director of the Central Library of the University of Cadiz (Andalucia) and an active contributor to the University Libraries Consortium of Andalucia.

Since the second half of the 1980s, a new philosophy concerning public services has spread in Spain, as in other European countries: a philosophy calling for higher quality and more efficiency in the management and administration of public capital. There has also arisen a claim to the government's satisfactory use of public funds as a social right, as well as a claim to a return on that capital in social terms. This is where libraries' public services come into play. There is a clear interest in all aspects related to the introduction of new management techniques. Quality management, effectiveness and efficiency measurement, cost control, services assessment, and user satisfaction analysis from the stakeholders' point of view are concepts that are emerging in university libraries. In order to adjust to the circumstances, universities are changing their management procedures, and university libraries have been forced into managing their "business" according to managerial criteria.

The commonality of their activities, and the relaxation of geographical boundaries fostered by information technologies, have encouraged libraries to join consortia in order to remain relevant in the current library services context. Such concepts as the "electronic," "digital," and "virtual" library lead, from my point of view, to a different configuration of the library services context; they have pushed library managers to consider strategically where they are and what their most adequate position is within this new configuration. Departments dealing with information are becoming wider, more heterogeneous, and multidisciplinary.
New organization strategies need to be defined in order to offer services in a different way. When library managers are forced to obtain the best results from their limited resources, the organization of consortia represents a qualitative leap forward in cooperation, efficiency, and cost savings. Library consortia aim to share resources and to promote participation on the basis of the mutual benefit of the libraries involved, and although the concepts of cooperation, coordination, and sharing resources are not new in the library world, the organization of library consortia introduces a major level of commitment and involvement among the participants.

New Settings, New Facts

Libraries are going through a crisis. A library is still an institution with a strong traditional character, but its traditional duties as a depository of knowledge no longer justify its costs, and the crisis is exacerbated by an accelerated technological and informative revolution.1

Within the changing atmosphere of the Spanish university in the last few years, goals and objectives are affected by a number of socioeconomic, institutional, and technological factors, as well as others of an internal character, that push these institutions to treat change as an opportunity for continuous improvement. Materials and services are more expensive, and technology is more sophisticated every day, which leads to a need for strong investments. Public financing is more and more limited while costs are growing. The university, in general, suffers from a lack of efficiency and organizational flexibility; staff reject monotonous tasks and hold high expectations; and the fast pace of information technology implementation in the last few years has caused a very serious imbalance between people's skill levels and job-position demands. All these factors generate a new setting of weaknesses and hopes to which the university libraries have to respond in order to maintain their competitive advantages.

Technology

Technology has recently become a strategic element in the development of libraries. Technology is more and more sophisticated and its life span is shorter. Its use implies the need for strong investments in computer and communication infrastructure.

Economic Pressure on Information Market Agents

Materials costs have diversified and are more and more burdensome, with annual growth far exceeding even inflation rates. An absolute change has been produced in the supply and demand of the information market, which causes the agents' utter disorientation: the publishing sector is adapting very slowly to the electronic context, and the distribution sector needs a deep technological and organizational transformation (few Spanish suppliers offer added-value services such as cataloging, outsourcing, or material preparation; Puvill Libros, or multinational affiliates such as Blackwell or Dawson, are exceptions). Electronic Data Interchange, a European standard like SISAC, is not a standard format within the sector, and there is no national supplier that offers services of the approval-plan type.

Additionally, the agents of the information market are heavily conditioned by the change in demand orientation. Specialized users (teachers, researchers, thesis students, etc.) demand from libraries electronic resources, quick information, and access at all times from remote locations.
This conflicts with the restrictive tendencies in the maintenance of public services and drastic budget cuts. Libraries are forced to obtain the highest possible ratio of efficiency in the use of the fewest resources.

Total Quality Management Implementation and Other Management Techniques

The result is the implementation of Total Quality Management (TQM), which guarantees quality of services. It is important to consider TQM as an instrument that develops organizational strategies. It is a continuous process developed in order to replace obsolete types of organization, to orient corporate activity on a permanent basis toward process optimization, and to obtain a coherent relation between efficacy in reaching objectives and efficiency in the use of resources.

Changes in the publishing industry, budget cuts, the quick expansion of electronic resources, new pricing policies, and the problems related to copyright and intellectual property form the new setting. In this context, consortia organization is considered by university and library managers as a means to face the challenges that the new settings imply, to unify their bargaining power with regard to the different agents, and to take advantage of the system's strength in order to adjust to the new situation and improve their competitive advantage.

Adequate Information Technologies

The Spanish university libraries are connected to the academic information network maintained by RedIRIS, a scientific-technical installation that depends on the Science and Technology Office of the Prime Minister. The main line that maintains the RedIRIS services is formed by seventeen nodes, one in each region (Comunidad Autónoma), connected by ATM circuits on ATM accesses of 34/155 Mbps. Each node is formed by a set of communication equipment that allows the coordination of the main transmission means and of the access lines from the centers of each region. RedIRIS participates in the TEN-34 Project, which aims at building a pan-European IP net of 34 Mbps that interconnects us with the different academic and research nets and that is planned to become TEN-155 in 1999.2

On the other hand, the regions (Comunidades Autónomas) incorporate added-value elements into the net segments they manage, such as faster access speeds that allow centralized architecture (for instance, the union catalog of the Galician Libraries Consortium is managed through a broadband net of 155 Mbps). The regions also allow access to databases in CD-ROM and electronic formats oriented to final users in a regional context. For instance, the Scientific Computer Center of Andalucia manages twenty-two databases in CD-ROM and other electronic formats that can be searched by all the Andalusian universities and research centers through the Andalusian Scientific Research Net.

Homogeneous Automation Level

The automation of library services, initiated at the end of the '80s, is practically completed. Dobis-Libis, Libertas, VTLS, Absys, and Sabini are the most widely used library management systems.3 Since 1997 some libraries have updated their library automation system to Unicorn (Sirsi) or Innopac (Innovative Interfaces). The Spanish university libraries have a homogeneous automation level and can establish projects from the consortia perspective, such as regional union catalogs, sharing electronic information resources, and shared purchase policies.
Favorable Political Situation

Traditionally, cooperative efforts have obtained little official support. In recent years, however, a positive attitude can be perceived from the academic authorities in relation to cooperation activities and the development of cooperative projects, both as an answer to the need to reduce costs by sharing resources and as a means to face the growing and unstoppable demand from users. The initiatives for consortia organization are supported by institutional agreements at the highest academic level among the universities: principals and vice-principals of research (such is the case of the consortia of Andalucia and Madrid); or they are the result of initiatives taken by the autonomous government (Galicia Consortium) or of a confluence of interests between the autonomous government and the universities (Cataluña Consortium).

Remote Access to End Users' Information Resources

Following the automation projects and the development of network technologies and data transmission, most university libraries have undertaken projects to integrate all information resources and maintain a wide group of services: campuswide networks, catalogs, databases on CD-ROM (e.g., Índice Español de Ciencias Sociales y Humanidades, Índice Español de Ciencia y Tecnología, Aranzadi Legislación y Jurisprudencia, Medline, ABI Inform, Academic Search), e-mail, and remote access via the Internet. Access to all resources is available through the library management systems' Web OPACs. There is access to any of these resources from any point connected to the network, whether from terminal servers, workstations, PCs, Unix stations, or Macs.

Cooperation in Spain

Up to the middle of the '80s, university libraries were separate realities with scattered funds and disorganized services; they were not structured as a system and they lacked any tradition or mentality of cooperation. In a 1994 poll, only 40 percent of university library directors declared that cooperation among libraries was important.4 We could say that cooperation initiatives depended on the will of people who obtained little support from the government. Therefore, two different stages can be distinguished: one in which cooperation is the result of personal actions, taken with no institutional support, in which local projects are undertaken; and one in which individual initiatives taken by the people in charge of libraries converge with a certain concern from the central government.

Will to Share Resources

Spain did not join the movement toward library automation until the '80s. At that time, the cooperative tendencies now associated with information and communication technologies were only slightly realized in the libraries. Eventually, however, a consolidation of efforts took place, helping to bring about, at the end of the '80s and the beginning of the '90s, some important cooperative initiatives out of which some specialized union catalogs emerged.

Some of the first cooperative initiatives arose from the Association of Specialized Libraries.5 Among these we can point out the Coordinating Committee of Biomedical Documentation, whose mission was to promote the cooperation and rationalization of document resources in the field of biomedicine.
This committee holds conferences and maintains a union catalog of periodical publications on health sciences accessible through the Internet.6 Documat, created in 1988, groups together the libraries specializing in mathematics and maintains a union catalog of journals, on the basis of which plans for shared acquisition are organized. MECANO groups together the libraries of the schools of engineering and maintains a union catalog accessible through the Internet.7

Early cooperative initiatives were also promoted by the library automation systems users groups. Red Universitaria Española de Dobis/Libis began in 1990, when twelve universities using the system decided to create an online union catalog maintained by the University of Oviedo. The Libertas Spanish Users Group maintains its union catalog associated with the SLS database, accessible online from Bristol. RUECA is the union catalog of Absys users.8

Need to Cooperate

In the early '80s a forum started in the universities that attempted to influence the writing of the university statutes (a result of the Ley de Reforma Universitaria) and establish general criteria for regulations. As a result of this debate, two documents were published that proved to be essential for subsequent cooperative development.9 Some reports from conferences on university libraries held in 1989 at the University Complutense of Madrid had a wide influence at the national level, and the same year FUNDESCO produced a report about the state of the art in automation in the Spanish university libraries.10 The picture repeated in these reports about the libraries is extremely pessimistic. Their evolution from 1985 to 1995 has been perfectly described by M. Taladriz and L. Anglada as "the lack of recognition of the role of university libraries ... the dispersion of bibliographical funds ... the general disorganization of the library services ...."11

In 1988, the Red de Bibliotecas Universitarias (REBIUN, University Libraries Network) was created.
REBIUN activities include a union catalog published on CD-ROM, "Regulation s for University and Scientific Librari es," agreements on int erli brary loans, and activi- ties in different working groups .13 I University Libraries' Consortia In the past few years the tran sfer of powers to the autonomous regions on ed u ca tion and culture, a conse- quence of a constitutional order, has brought about another political and administrative context for the achievement of the libraries ' objectives. Th e autonomous regions are now working on the design of regional developm en t plans or regional infor- mation systems that are related, unfailingly, to the coop- erative activity of the libraries of the territor y. Thi s initiati ve can be applied to university librarie s as well as any other type of library , which, through their institutions, request their autonomous governments' assistance or funding in order to achieve cooperative projects. Or it could be done the other way round: a gov- ernm ent can outline an action plan for its libraries and suggest it to the potential participants. Thus, the basis for consortia development was set in the second half of the '90s, and encouraged by events like the celebrated conference in Ca diz , organized by the University of Carlos III de Madrid and the University of Cadiz libraries , and Ebsco Information Services (Spanish branch) in 1998. Catalonia Consortium of University Libraries (Consorcio de Bibliotecas Universitarias de Catalufia) We could sum up the situation in Catalonia according to the following: the existenc e of new automated libraries, few automated records, the us e of their own automation systems, and the existence of only three universitie s. We can es tablish some cooperation background developed at this time : CRUC, CAPS , and the joint selection of an automation system realiz ed by Universidad Aut6noma de Barcelona and Universidad Politecnica de Cataluna. It is not until the '90s that positive factors combined to move the cooperative movement a step forward in Catalonia. These positive factors were a homogeneous s ta te of automation among university libraries, a good communications network, and the use of standards for library data recording . The previous cooperative move- ments and an analysis of the worldwide evolution of libraries helped in the building of a united view in which coop era tion appeared as an additional instrument for the improvement of the library world. The university library directors of Catalonia consid- ered cooperation a way to accelerate the evolution of libraries, to create new services, to facilitate changes, and to save expenses. With this conviction, they wrote a pro- proposal for the creat io n of a library network in Catalonia , which in 1993 resu lted in the interconnection of the university librarie s in Catalonia, followed in 1995 with the first steps toward the cre ation of the United Catalog of the Univer sities of Catalonia. This catalog was fully operative in early 1996. At the end of 1996 th e Univ ersity Library Consortium of Catalonia (CBUC) was created with the task of improv- ing library services through cooperation. 
Its objectives are:

• To create new working tools
• To improve services
• To build a digital library
• To take better advantage of resources
• To face together the changing role of libraries

The CBUC comprises the University of Barcelona, the Universidad Autónoma de Barcelona, the Polytechnic University of Catalonia, Pompeu Fabra University, the University of Girona, the University of Lleida, Rovira i Virgili University, the University Oberta of Catalonia, and the Library of Catalonia. The direction of CBUC is determined by a board of representatives from each of the institutions, an executive committee of six members, and a technical committee of library directors. A staff of seven runs the CBUC office, and different working groups audit active plans and study possible issues of concern.

University Libraries Consortium of the Madrid Region (Comunidad Autónoma de Madrid)

The public university libraries based in the Madrid region (Universidad de Alcalá, Universidad Carlos III, Universidad Complutense, Universidad Politécnica, Universidad Rey Juan Carlos, and Universidad Nacional de Educación a Distancia) are developing many cooperation programs with the following objectives:

• To facilitate access to information resources
• To improve the existing library services
• To test and promote the use of information and communication technologies
• To reduce costs by sharing resources15

Two programs have already been initiated:

Interlibrary loan. An agreement has been established to obtain a faster delivery system for books and journal articles. Using the services of a private courier company, maximum delivery time from one university to another will be set at forty-eight hours. This service started working on the first of September.

Training. Different courses for the joint training of library staff are being organized on a cooperative basis.

In the future, other programs will be developed, including a union catalog (with the creation of a collective database that will also save cataloging costs by sharing bibliographical resources) and an electronic library, which will allow common access to electronic resources.

Galician Libraries Consortium

The Galician Libraries Consortium is the result of a regional government initiative.16 In November 1996 the Xunta de Galicia signed an agreement of scientific and technological collaboration with Fujitsu ICL Spain in which the company agreed to develop the telecommunications infrastructure of the community: the Galician Information Highway (AGI: Autopista Gallega de la Información). Inaugurated in 1997, the AGI serves as the basis for projects with great political and social appeal. Three projects were embarked upon:

• tele-teaching,
• tele-medicine, and
• access to libraries.

Users have access to a loan service by which a loan may be requested from any library in the consortium. The loan works as it would in a local setting, with the same limitations, controls, and blocking as any other local loan system. The request to the system is sent online and is fulfilled within twenty-four to forty-eight hours.

The consortium originally was to encompass all types of libraries, but as the project advanced, it was decided to restrict the collaboration to university libraries.
This allowed the project to move forward with greater speed, because the member libraries had more narrowly defined interests and concerns. The Xunta de Galicia prepared the "Protocol of Intentions," which has been signed by the highest representatives of the three Galician universities (Universidad de Santiago, Universidad de la Coruña, and Universidad de Vigo). This protocol is characterized by two essential ideas:

1. Allow adequate time for planning individual incorporation into the consortium, so that each institution may participate at the rate it deems appropriate.
2. Create a permanent working commission formed by representatives of the institutions involved, which will:
• answer existing and future questions;
• define the model of consortium that each organization desires to establish through specific objectives; and
• promote adequate measurement in order to achieve the objectives that have been designed.

Andalucian University Libraries Consortium

In the era of the Internet, electronic documents, and the virtual library, maintaining isolated, independent libraries is no longer tenable. In addition, the efforts needed to face the challenges of the information society and the changes that society is demanding of universities are destined to become weaknesses more than strengths in those institutions that face them individually. There are many reasons why it is advisable for libraries to approach these challenges collaboratively:

• The productivity and competitiveness that society demands of the universities
• The huge technological opportunities to share information
• The importance of the changes that are taking place in the products and services that the information market offers
• The high cost of the new products (e.g., e-journals)
• The need for very specialized knowledge in order to activate some of these services
• The growing demands of library users

The Andalucian university libraries concluded that if they wished to stay current with information technologies, if they wished to continue implementing improved services, and if they wished to do so within their budgets, solid cooperation mechanisms would have to be established.

In March 1998 the Andalucian vice-principals of research requested that the directors of the Andalucian university libraries analyze possible cooperative activities among the university libraries of the community. Two goals were set in this meeting:

• The analysis of library automation products currently on the market.
• The analysis of the current individual management systems within the Andalucian libraries (which, though automation varied among them, were each considered to be outdated) and the potential for sharing resources with the present systems, which is difficult because the currently installed systems may not be compatible with Z39.50.

The object of this analysis is to define essential requirements so that the new systems to be implemented facilitate possible cooperative actions. This integration will not be simple: the University Pablo de Olavide, recently created, is planning to purchase its own system; the universities of Seville, Granada, and Cordoba are using Dobis-Libis; and the universities of Cadiz and Malaga are using Libertas and are preparing to update to Innopac.
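Z39.50, the ANSI/NISO search-and-retrieval protocol, is what makes such cross-vendor sharing practical: any compliant client can query any compliant server with the same query syntax, regardless of which vendor built either end. As a minimal sketch, the following uses the ZOOM-style API of the PyZ3950 package (a Python 2-era library) against the Library of Congress's well-known public Z39.50 target; the host, port, and database name are simply a familiar public example, not any of the Andalucian systems named above.

```python
# A minimal Z39.50 search sketch using PyZ3950's ZOOM-style API
# (pip install PyZ3950). Target shown is the Library of Congress's
# public server, used here purely as an example.
from PyZ3950 import zoom

conn = zoom.Connection('z3950.loc.gov', 7090)
conn.databaseName = 'VOYAGER'
conn.preferredRecordSyntax = 'USMARC'   # ask the server for MARC records

# CCL is one of the standard query languages; any conformant server
# understands the same query, which is the whole point of Z39.50.
query = zoom.Query('CCL', 'ti="library consortia"')
results = conn.search(query)

print('%d records found' % len(results))
for record in results[:3]:
    print(record)   # raw MARC; a real client would parse and display this

conn.close()
```

The same few lines would work against any consortium member's catalog once its system exposed a Z39.50 target; only the connection parameters would change, which is precisely why protocol compatibility was made a requirement for the new systems.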
The Andalucian university libraries have studied some of the systems that the Spanish market offers: Absys (Baratz, Document Systems), Amicus (Elias), Innopac (SLS), Sabini (Sabini Library Automation), and Unicorn (Sirsi). They are preparing a catalog of the electronic information resources available in the Andalucian university libraries in order to know which resources are available and preferred by the different universities.

The Andalucian University Libraries Consortium is in an early stage; while its organizational structure and functions are defined, its tasks are still being elaborated. The Delegate Commission of the Vice-principals of Research of the Andalucian Universities is responsible for this work. The commission is presided over by the vice-principal of the University of Seville and formed by the directors of the Andalucian libraries and the juridical consultant of the University of Cordoba. The commission will produce a working paper that outlines the main facets of the organization, based on the following general principles:

• To add value to the research computer network
• To favor the use of technologies that contribute to the improvement of production times and the design of efficient processes
• To apply economies of scale:
  • in the purchase of products and services
  • in repetitive tasks and activities
• To favor the use of information resources among the members of the Andalusian universities and society in general

In order for the project to succeed, the following conditions must exist:

• A homogeneous situation among the libraries in terms of the regulations and technical instruments used in the description of materials, data format, and information interchange format;
• The Andalucian universities are connected with high-speed fiber optic lines (32 Mbps);
• The administrative framework is clearly defined; and
• The responsible members of the Andalusian university libraries are convinced that cooperation will substantially improve the quality of the library services in each university.

Additionally, the following advantages must result:

• Decline or leveling of production expenses
• Economies of scale in the purchase of products such as computer systems, databases, and journal and electronic information subscriptions
• Shared technical support
• Shared training costs
• Shared information resources through interlibrary loan

Conclusions

The ultimate goal of cooperation is to join users with the documents and information they need; establishing relations among participant institutions is a means to that end. Consortia represent the possibility to test alternatives to the traditional automated library. They represent the potential to offer the best library services to a wider number of users with all the resources they possess. Beyond simple cooperation that unites efforts and resources, consortia represent the possibility to test innovative formulas of process management and services organization from a regional perspective.

References

1. Miguel Duarte, "Evaluación del rendimiento aplicando sistemas de gestión de calidad: la experiencia de la biblioteca de la Universidad de Cádiz" [Performance Assessment Implementing Total Quality Management Systems:
The University Library of Cadiz Experience], in XV Jornadas de Gerencia Universitaria: Modelos de financiación, evaluación y mejora de la calidad de la gestión de los servicios [15th University Managers Meeting: Financing Models, Assessment, and Quality Assurance of Services] (Cadiz: University Pr., 1997), 309-10; Marta Torres, "El impacto de las autopistas de la información para la comunidad académica y los bibliotecarios" [The Impact of the Information Highway on the Academic Community and Librarians], in Autopistas de la información: el reto del Siglo XXI (Madrid: Editorial Complutense, 1996), 37-55.
2. Víctor Castelo, "¿Sueñan los informáticos con bibliotecas electrónicas?" [Do Computer Scientists Dream of Electronic Libraries?], round table in Seminario sobre Consorcios de Bibliotecas [Library Consortia Conference] (Cadiz: Cadiz Univ. Pr., 1999), 130; see also www.rediris.es, accessed Apr. 24, 2000.
3. M. Jiménez and Alice Keefer, "Library Automation in Spain," Program 26, no. 3 (1992): 225-37; Assumpció Estivill, "Automation of University Libraries in Spain," Telephassa Seminar on Innovative Information Services and Information Handling (Tilburg, June 10-12, 1991). REBIUN's statistical annual offers data about catalog automation.
4. Luis Anglada and Margarita Taladriz, "Pasado, presente y futuro de las bibliotecas universitarias españolas" [Past, Present, and Future of Spanish University Libraries], in IX Jornadas de Bibliotecas de Andalucía (Granada: Asociación Andaluza de Bibliotecarios, 1996), 108-31.
5. L. Anglada, "Cooperació bibliotecària a Espanya" [Library Cooperation in Spain], Item, no. 16 (1995): 51-67.
6. See www.doc6.es/cdb, accessed Apr. 24, 2000.
7. See http://biblioteca.upv.es/bib/mecano, accessed Apr. 24, 2000.
8. See www.uned.es/bibliote/biblio/ruedo.htm and www.baratz.es/RUECA, accessed Apr. 24, 2000.
9. "The Library in the University: Report on the University Libraries in Spain, Produced by a Working Team Formed by University Librarians and Teachers" (Madrid: Ministry of Culture, General Directorate of the Book and Libraries, 1985); "University Libraries: Recommendations about Their Regulations, Conference on University Libraries, 'Castillo Magalia,' Las Navas del Marqués, Ávila, May 27-28, 1986" (Madrid: Library Coordination Centre, 1987).
10. Situación de las bibliotecas universitarias dependientes del MEC [The State of the Academic Libraries Dependent on the Ministry of Education] (Madrid: Universidad Complutense, Biblioteca, 1988); Estudio sobre normalización e informatización de las bibliotecas científicas españolas [Study on the Standardization and Automation of the Spanish Scientific Libraries] (Fundesco, 1989, unpublished).
11. Luis Anglada and Margarita Taladriz, 108.
12. See Consorcios de Bibliotecas [Library Consortia Conference], Maribel Gómez Campillejo, ed. (Cadiz: Cadiz Univ. Pr., 1999).
13. See www2.uji.es/rebiun, accessed Apr. 24, 2000.
14. For more information about CBUC, see www.cbuc.es, accessed Apr. 24, 2000.
15. Marta Torres, "Los consorcios, forma de organización bibliotecaria en el S. XXI: una aproximación desde la perspectiva española" [Consortia, a Form of Library Organization in the 21st Century: An Approach from the Spanish Perspective], in Consorcios de Bibliotecas [Library Consortia Conference], 17-35.
16. Santiago Raya, "El Consorcio de Bibliotecas de Galicia" [Galician Library Consortium], in Consorcios de Bibliotecas [Library Consortia Conference], cit., 117-25.

10085 ----
Book Reviews
Tom Zillner, Editor

In the Beginning ... Was the Command Line by Neal Stephenson. New York: Avon Books, Inc., 1999. 151p. $10 (ISBN 0-380-81593-1)

Neal Stephenson is best known for his cyberfiction, including Snow Crash and most recently Cryptonomicon. In the Beginning ... Was the Command Line is a quite different kettle of fish. Command Line is a short book with a succinct message: the command line is a good thing, because the full power of the computer is only available to those who can access the command line and type in the magic commands that make things happen. Stephenson learned this lesson the hard way, after first spending much time as a Macintosh-devoted GUI-head. The revelation came when he lost a document he was editing on his PowerBook, completely and without a trace, forever irretrievable.

Actually, I say the book has a succinct message, but it has many messages and many metaphors, all artfully constructed by a master of prose. Stephenson constructs his arguments along multiple lines, providing a discursive tour through Windows, Macintosh, and UNIX history, offering personal history as well as his own take on the economics of the software industry. For example, he believes that Microsoft would be better off as an applications company rather than carrying the millstone of a family of operating systems. As for Apple, he suggests that they have been doing their best to destroy themselves for years, so far unsuccessfully (but give them time).

The real meat of the book is whether it is better to offer people the flash of metaphor with the recognition that power and certain levels of choice are lost, as with the graphical user interfaces exemplified by Windows and the Macintosh, or whether it is better to have at least some access to the command line interface, which MS-DOS offered and members of the UNIX family (e.g., Linux) afford. This is, in fact, both a silly and an important question at the same time. Silly because many people would wonder why anyone would want command line access to any software. Silly because others might wonder why you couldn't have both. Important, or at least apparently important, because we seem to have become, without much warning, a world wrapped in GUIs of one sort or another. Important in the library automation world, because end-user tools are moving increasingly toward GUI-based or Web-based interfaces without text-based alternatives (except, perhaps, Lynx or similar Web browsers, which have their own problems).

For much of the book, Stephenson dances around the question, among others, of why not both GUI and text-based interfaces, and finally finds the answer in the Be operating system. My question is, why not as many interfaces as it takes, of whatever sort? To repeat the trite saw, there are two kinds of people in the world: those who divide the world into two kinds of people and those who don't. Stephenson has a lot of fun trying to make the division in this case, then ultimately comes out from behind the posturing and admits that he believes in the availability of both worlds. There are many people who do, indeed, want hard things hidden from them, at least some of the time.
When I am dealing with an automated teller machine, I don't want to have to use mechanical levers or pedals as I might have needed were ATMs invented in an earlier age, nor do I want to type in commands, although I am comfortable using a command line environment in my workplace. I just want to be prompted through a minimal number of steps to walk away with some cash from my checking account. The world is a complicated and challenging place to navigate. Some people would like to be helped by other people in this navigation, although many have found that they would far rather deal with the dumbed-down interface of an ATM than interact with not-so-friendly, underpaid bank tellers.

Similarly, many people want to accomplish a particular task requiring the use of a computer and don't mind having the details hidden from them, no matter how much power knowing the details would provide. Or, they want to do that at least some of the time. As an example in the library world, let's consider a naïve patron who enters the library desiring to perform a known-item search. Such a user might be quite comfortable with an interface with a single type-in box and a set of clickable buttons labeled Title, Author, and Subject. Or maybe just a single button, "click to start search." Although naïve users may consult library staff, who are most often more friendly than bank tellers, many people want to find their own materials. At the same time, more sophisticated users want more sophisticated capabilities and interfaces from the same catalogs. Although vendors have gotten better at providing a couple of levels of complexity and corresponding user interfaces, why not go further?

There aren't just two kinds of people. There are lots of kinds of people, with lots of kinds of information needs, representing lots of experience levels. Why the restrictions at the user interface? In the history of microcomputing, Stephenson points to the evolution of two major players, Microsoft and Apple, with Linux coming on strong and Be representing an interesting offshoot. I think the important insight implicit in what Stephenson discusses is that much of the appearance and behavior of Windows and the Macintosh desktop are historically based artifacts. In order to maintain backward compatibility with existing applications, the Windows and Macintosh operating systems have picked up a great deal of "cruft," computer code that allows multitasking and other improvements cobbled on to the fragile inner shell of ancient code required for compatibility with older applications. At the same time, Stephenson invokes the familiar refrain that the user interfaces of both platforms are tied to a tired set of metaphors that attempt to mimic the real-world office (e.g., desktop, folder) but do not do so with any kind of useful fidelity. In the library world, I think a similar kind of lineage might be traced from command line interfaces to the current Windows- and Web-based front-ends. Although many libraries and librarians have faced painful conversion processes over the years in moving through generations of automated systems, it might be interesting to see if there are still traces of underlying code that owe their existence to backward compatibility.

Where does Stephenson turn in the face of the inelegance of the Windows and Macintosh worlds?
He finds solace in the power and integrity of Linux. It may take a long time to successfully install the operating system and get it to function with all of the hardware components of a particular computer configuration, but it has all that power, and all of those cool applications carefully constructed by people who care. Bugs are fixed quickly. It's a community effort. That's all very appealing, particularly when compared to the appalling response (or lack of it) to Windows or Macintosh bugs. The problem is that so far most of us aren't equipped to deal with the steep curve required to install Linux on personal computers, and the corporate or library environment usually isn't politically prepared for Linux to be adopted as an institutionwide standard. So, while Linux boxes are frequent choices for servers, they are not widespread personal PC choices. Nor should they be until easy installation tools are available.

Again, Stephenson is ambivalent. On the one hand, he recognizes that there are many people who don't want the kind of power offered by being so close to the machine if it means becoming experts in arcane commands and codes. Even though he wants the power and simplicity, and decries the limitations imposed by the GUI, he recognizes that Linux is not for everyone. He's right. Most people use computers to get some work done (or to play). To the extent that the software gets in the way, it isn't operating properly. By that criterion, none of the three environments described are particularly useful in a desktop world.

In spite of the fact that the old metaphors have been rightly criticized for years for their tiredness, there doesn't seem to be much movement beyond them, except in limited research operating environments and applications. Similarly, it seems, in the library and information world, at least in most people's routine interactions with OPACs and databases. Yes, I am waffling, because I'm sure that someone could point out the "Snarfle™ Virtual Reality interface to the LC catalog that affords a walkthrough browsing experience," but of course only six computer science researchers have actually experienced the Snarfle™ interface, and it requires a $25,000 workstation and $10,000 in virtual reality gear to work, plus it is s-l-o-w. Pardon the sarcastic riff, but there is a lot of wonderful user interface work that is certainly not finding its way onto mainstream computer users' desktops, or to the library or information center.

So what's the answer? Criticism is fun, because critics don't necessarily have to provide a positive account to match their nay-saying function. If things are bleak in the world of the user interface, both on the average user desktop and on the library desktop as well, what is to be done? For a taste of what is to come in the library world, take a look at MyLibrary (http://my.lib.ncsu.edu/), which allows profiling of user preferences and customization based on academic discipline. Similarly, there are a number of Web portals and other sites that allow customization for users (e.g., My Yahoo, My Excite, etc.). Suppose that these first steps in customization are carried further, so that each user's unique profile generates a unique user interface experience across all databases he or she deals with in a session. The interface unification could be accomplished across heterogeneous databases in a couple of different ways.
A simple initial step that many libraries already employ is to obtain databases from a single aggregator, so that a uniform interface is presented to the user. For example, OCLC's FirstSearch offers a single interface to a number of commercial databases. This type of solution is not possible for libraries that need access to a diverse array of databases not available through a single aggregator or vendor. Of course, this situation can present patrons and staff with a bewildering array of interfaces and search methods. A more elaborate solution is to employ Z39.50 to access the databases and build a single interface at the front end. There may be aggregators that already use this strategy with the databases they provide, but in the future perhaps there would be an incentive to offer unified interfaces with fine-grain customization possible by users.

Getting back to Stephenson's more generalized view of the user interface, I think there are also opportunities here for more fine-grained customization. Stephenson points to the BeOS, which apparently allows both command-line and GUI-based interactions, as an example of what can be done when an operating system is constructed anew, from the bottom up, with no pre-existing audience to satisfy. At the same time, and in contrast, Stephenson extols the power of open software development, which he believes is most apparent in operating systems, the production of which he describes as money-losing propositions. Yet Linux is tremendously successful without, for the most part, commercial gain for developers. Can this same model be applied to interface and other development in the library world? In this example, might not some group of librarian coders (or coder librarians) work together to put MyLibrary together with Z39.50 capabilities and customization of interfaces to produce a little slice of paradise for library patrons? Promising moves are being made within the library community to get open source efforts off the ground. This could be one of many especially useful and fruitful projects to come out of open software development for libraries.
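The single front end described here reduces to a small pattern: one query fans out to several heterogeneous backends, and the merged results come back through one interface. The following Python sketch is illustrative only; the backend names, record fields, and sample data are all invented, and a real implementation would have each backend wrap an actual Z39.50 client or vendor API rather than canned lists.

from dataclasses import dataclass

@dataclass
class Record:
    title: str
    author: str
    source: str

class Backend:
    """One search target. A real adapter would translate the query
    into a Z39.50 or vendor-API request instead of scanning a list."""
    def __init__(self, name, records):
        self.name = name
        self.records = records

    def search(self, term):
        term = term.lower()
        return [r for r in self.records if term in r.title.lower()]

def federated_search(term, backends):
    """Fan one query out to every backend and merge the results,
    so the patron sees a single interface and a single result list."""
    merged = []
    for backend in backends:
        merged.extend(backend.search(term))
    return merged

# Invented sample data, for illustration only.
catalog = Backend("OPAC", [Record("Library Automation", "Smith, J.", "OPAC")])
index = Backend("Article Index", [Record("Automation in Libraries", "Lee, K.", "Article Index")])

for rec in federated_search("automation", [catalog, index]):
    print(f"{rec.title} / {rec.author} ({rec.source})")

The per-user customization Zillner imagines would then be a matter of choosing, per profile, which backends to fan out to and how to render the merged list.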
Although his book is ostensibly about a few issues that elicit yawns from most of the world, Stephenson is really using In the Beginning ... Was the Command Line to look at a much bigger picture than simply the command line versus the GUI at its microscopic level. Stephenson looks at the cloaking, obfuscation, or replacement of underlying text by images and multimedia as contributing to the decline of civilization. That seems like a radical claim, but at heart it is the one that Stephenson makes in his discussion of the Disney-ification of the world—that visual metaphors and explanations oversimplify and obscure the truth. In fact, Stephenson goes further, discussing this trend toward anti-word as our attempt at an antidote for the kind of intellectualism that resulted in a lot of death, pain, and suffering for people in the twentieth century. He, as a person who lives by words and loves the intellectual life, thinks we've gone too far, reaching a state of cultural relativism where there is neither good nor bad remaining. This discussion includes my favorite quote of the book:

The problem is that once you have done away with the ability to make judgments as to right and wrong, true and false, etc., there's no real culture left. All that remains is clog dancing and macramé. The ability to make judgments, to believe things, is the entire point of having a culture. I think this is why guys with machine guns sometimes pop up in places like Luxor and begin pumping bullets into Westerners.... When their sons come home wearing Chicago Bulls caps with the bills turned sideways, the dads go out of their minds. (p. 56)

It's a pretty startling move to try to connect up the decline in use of the command line to an anti-intellectualism following World War II that resulted in cultural relativism. I think it actually has some merit, although in the case of visual interfaces versus the command line the ethical import is minimal, i.e., I don't believe my decision to accomplish certain tasks using visual metaphors contributes to the decline of civilization, and I think the fact that I like to work on other tasks utilizing a command line won't serve to save our written culture. It's too much of a stretch.

I think that something Stephenson misses in his discussion of the replacement of the written word by visual images is that there is still a creative force and judgment involved in the creation of the images. There is still script writing. Isn't this, after all, what a writer does in any case, creating images, metaphorically, through his or her work? Certainly, we are moving through a perilous time, when the world really is changing from a reliance on the written word to more dependence on the visual. There will be many things lost in this transition. Plato had some major, well-founded doubts about the transition from Greece's oral cultural tradition to a written one. The change happened anyway. Civilization has been declining for a long time. My fearless prediction is that it will continue to decline for a long time.

I think Stephenson has done a masterful job of writing a brief glimpse of the overall picture that represents the state of culture and intellectual life in the world today, and has also made some important points about the economics and character of the world of software and operating environments. His writing skills make this fairly short book a pleasurable read and a worthwhile one. As I did, I think you might find this long essay a useful starting point for thoughts about issues large and small.—Tom Zillner, WILS

The Cathedral & the Bazaar: Musings on Linux and Open Source by an Accidental Revolutionary by Eric S. Raymond. Sebastopol, Calif.: O'Reilly, 1999. 288p. $19.95 (ISBN 1-56592-724-9)

This short essay examines, in the guise of a book review, the concept of a "gift culture" and how it may or may not be related to librarianship. As a result of this examination, and with a few qualifications, I believe my judgements about open source software and librarianship are true: open source software development and librarianship have a number of similarities—both are examples of gift cultures.

I have recently read a book about open source software development by Eric Raymond. The Cathedral & the Bazaar describes the environment of free software and tries to explain why some programmers are willing to give away the products of their labors. It describes the "hacker milieu" as a "gift culture":

Gift cultures are adaptations not to scarcity but to abundance. They arise in populations that do not have significant material scarcity problems with survival goods.
We can observe gift cultures in action among aboriginal cultures living in ecozones with mild climates and abundant food. We can also observe them in certain strata of our own society, especially in show business and among the very wealthy.1

Raymond alludes to the definition of "gift cultures," but not enough to satisfy my curiosity. Being the good librarian, I was off to the reference department for more specific answers. More often than not, I found information about "gift exchange" and "gift economies" as opposed to "gift cultures." (Yes, I did look on the Internet but found little.) Probably one of the earliest and more comprehensive studies of gift exchange was written by Marcel Mauss.2 In his analysis he says gifts, with their three obligations of giving, receiving, and repaying, are present in aspects of almost all societies. The process of gift giving strengthens cooperation, competitiveness, and antagonism. It reveals itself in religious, legal, moral, economic, aesthetic, morphological, and mythological aspects of life.3

As Gregory states, for the industrial capitalist economies, gifts are nothing but presents or things given, and "that is all that needs to be said on the matter." Ironically for economists, gifts have value and consequently have implications for commodity exchange.4 He goes on to review studies about gift giving from an anthropological view, studies focusing on tribal communities of various American Indians, cultures from New Guinea and Melanesia, and even ancient Roman, Hindu, and Germanic societies:

The key to understanding gift giving is apprehension of the fact that things in tribal economics are produced by non-alienated labor. This creates a special bond between a producer and his/her product, a bond that is broken in a capitalistic society based on alienated wage-labor.5

Ingold, in "Introduction to Social Life," echoes many of the things summarized by Gregory when he states that industrialization is concerned exclusively with the dynamics of commodity production:

Clearly in non-industrial societies, where these conditions do not obtain, the significance of work will be very different. For one thing, people retain control over their own capacity to work and over other productive means, and their activities are carried on in the context of their relationships with kin and community. Indeed their work may have the strengthening or regeneration of these relationships as its principle objective.6

In short, the exchange of gifts forges relationships between partners and emphasizes qualitative as opposed to quantitative terms. The producer of the product (or service) takes a personal interest in production, and when the product is given away as a gift it is difficult to quantify the value of the item. Therefore, along with the product or service, less tangible elements—such as obligations, promises, respect, and interpersonal relationships—are exchanged.

As I read Raymond and others I continually saw similarities between librarianship and gift cultures, and therefore similarities between librarianship and open source software development. While the summaries outlined above do not necessarily mention the "abundance" alluded to by Raymond, the existence of abundance is more than mere speculation.
Potlatch, "a ceremonial feast of the American Indians of the northwest coast marked by the host's lavish distribution of gifts or sometimes destruction of property to demonstrate wealth and generosity with the expectation of eventual reciprocation," is an excellent example.7 Libraries have an abundance of data and information. (I won't go into whether or not they have an abundance of knowledge or wisdom of the ages. That is another essay.) Libraries do not exchange this data and information for money; you don't have to have your credit card ready as you leave the door. Libraries don't accept checks. Instead the exchange is much less tangible. First of all, based on my experience, most librarians simply take pride in their ability to collect, organize, and disseminate data and information in an effective manner. They are curious. They enjoy learning things for learning's sake. It is a sort of Platonic end in itself. Librarians, generally speaking, just like what they do, and they certainly aren't in it for the money. You won't get rich by becoming a librarian.

Information is not free. It requires time and energy to create, collect, and share, but when an information exchange does take place, it is usually intangible, not monetary, in nature. Information is intangible. It is difficult to assign it a monetary value, especially in a digital environment where it can be duplicated effortlessly:

An exchange process is a process whereby two or more individuals (or groups) exchange goods or services for items of value. In Library Land, one of these individuals is almost always a librarian. The other individuals include tax payers, students, faculty, or in the case of special libraries, fellow employees. The items of value are information and information services exchanged for a perception of worth—a rating valuing the services rendered. This perception of worth, a highly intangible and difficult thing to measure, is something the user of library services "pays," not to libraries and librarians, but to administrators and decision-makers. Ultimately, these payments
Just as Jefferson's informed public is necessary for democracy, open source software is necessary for the improvement of computer applications. Second, human interactions are a necessary part of the mixture in both librarianship and open source devel- opment. Open source development requires people skills by source code maintainers. It requires an under- standing of the problem the computer application is intended to solve, since the maintainer must be able to "patch" the software, both to add functionality and to repair bugs. This, in turn, requires interactions both with other developers and with users who request repairs or enhancements. Similarly, librarians understand that information-seeking behavior is a human process. While databases and many "digital libraries" house infor- mation, these collections are really "data stores" and are only manifested as information after the assignment of value is given to the data and interre- lations between data are created. Third, it has been stated that open source development will remove the necessity for programmers. Yet Raymond posits that no such thing will happen. If anything, there will be an increased need for programmers. Similarly, many librarians feared the advent of the Web because they believed their jobs would be in jeop- ardy. Ironically, librarianship is flow- ering under new rubrics such as information architects and knowl- edge managers. It has also been brought to my attention by Kevin Clarke (kevin_clarke@unc.edu) that both institutions use peer-review: Your cultural take (gift culture) on "open source" is interesting. I've been mostly thinking in material terms but you are right, I think, in your assessment. One thing you didn't mention is that, like academic librarians, open source folks participate in a peer-review type process. Index to Advertisers All of this is happening because of an information economy. It sure is an exciting time to be a librarian, especially a librarian who can build relational databases and program on a Unix computer. Acknowledgements Thank you to Art Rhyno (arhyno@ server.uwindsor.ca) who encouraged me to post the original version of this text.-Eric Lease Morgan, North Carolina State University, Raleigh, North Carolina References 1. The Cathedral & the Bazaar: Musings on Linux and Open Source by an Accidental Revolutionary, 99. 2. M. Mauss, The Gift: Forms and Functions of Exchange in Archaic Societies (New York: Norton, 1967). 3. S. Lukes, "Mauss, Marcel," in International Encyclopedia of the Social Sciences, D. L. Sills, ed. (New York: Macmillian), vol 10, 80. 4. C. A. Gregory, "Gifts," in The New Pa/grave: A Dictionary of Eeconomics, J. Eatwell and others, eds. (New York: Stockton Pr., 1987), vol. 4, 524. 5. Ibid. 6. T. Ingold, "Introduction to Social Life," in Companion Encyclopedia of Anthropology, T. Ingold, ed (New York: Routledge, 1984), 747. 7. The Merriam-Webster Online Dic- tionary, http://search.eb.com/ cgi-bin/ dictionary?va=potlatch 8. E. L. Morgan, "Marketing Future Libraries." Accessed Apr. 27, 2000, www.lib.ncsu.edu/ staff/ morgan/ cil/ marketing. Info USA Library Technologies, Inc. LITA MIT Press cover 2 cover 3 58, 69, cover 4 95 BOOK REVIEWS 107 10086 ---- September_ITAL_yelton_final President’s Message: 50 Years Andromeda Yelton INFORMATION TECHNOLOGIES AND LIBRARIES | SEPTEMBER 2017 1 Fifty years. LITA was voted into existence (as ISAD, the Information Science and Automation Division) in Detroit at Midwinter 1966. 
Therefore we have just completed our first fifty years, a fact celebrated (thanks to our 50th Anniversary Task Force) with a slide show and cake at Annual in Chicago. It's truly humbling to take office upon this milestone. Looking back, some of the true giants of library technology have held this office. In 1971-72, Jesse Shera, who in his wide-ranging career challenged librarians to think deeply about the epistemological and sociological dimensions of librarianship; ALA makes several awards in his name today. In 1973-74 and again in 1974-75, Frederick Kilgour, the founding director of OCLC, who also has an eponymous award. In 1975-76, Henriette Avram, the mother of MARC herself.

Moreover, thanks to the work of countless LITA volunteers, much of this history is available open access. I strongly recommend reading http://www.ala.org/lita/about/history/ for an overview of the remarkable people and key issues across our history. You can also read papers by Avram and Kilgour, among many others, in the archives of this very publication.

In fact, reading the ITAL archives is deeply engaging. It turns out library technology has changed a bit in 50 years! (I trust that isn't a shock to you.) The first articles (in what was then the Journal of Library Automation) are all about instituting first-time computer systems to automate traditional library functions such as acquisitions, cataloging, and finance. The following passage caught my eye: "A functioning technical processing system in a two-year community college library utilizes a model 2201 Friden Flexowriter with punch card control and tab card reading units, an IBM 026 Key Punch, and an IBM 1440 computer, with two tape and two disc drives, to produce all acquisitions and catalog files based primarily on a single typing at the time of initiating an order" ("An Integrated Computer Based Technical Processing System in a Small College Library," Jack W. Scott; https://doi.org/10.6017/ital.v1i3.2931).

How many of us are still using punch cards today? And, indeed, how many of us are automating libraries for the first time? The topics discussed among LITA members today are far more wide-ranging: user experience, privacy, accessibility. They're more likely to be about assessing and improving existing systems than creating new ones, and more likely to center on patron-facing technologies.

Andromeda Yelton (andromeda.yelton@gmail.com) is LITA President 2017-18 and owner/consultant of Small Beautiful Useful LLC.

And yet, with a few substitutions — say, "Raspberry Pi" for "Friden Flexowriter" — the blockquote above would not be out of place today. Then as now, LITA members were doing something exciting, yet deeply practical, that cleverly repurposes new technology to make library experiences better for both patrons and staff. Our job descriptions have changed enormously in fifty years; in fact, the LITA Board charged a task force to develop LITA member personas, so that we can better understand whom we serve, and work to align our publications, online education, conference programming, and committee work toward your needs. (You can see an overview of the task force's stellar work on LITAblog: http://litablog.org/2017/03/who-are-lita-members-lita-personas/.) At the same time, the spirit of pragmatic creativity that runs throughout the first issues of the Journal of Library Automation continues to animate LITA members today.
I'm looking forward to seeing where we go in our next fifty years.

10087 ---- September_ITAL_varnum_final Editorial Board Thoughts: Content and Functionality: Know When to Buy 'Em, Know When to Code 'Em (with apologies to Kenny Rogers)
Kenneth J. Varnum

INFORMATION TECHNOLOGY AND LIBRARIES | SEPTEMBER 2017

Kenneth J. Varnum (varnum@umich.edu), a member of the ITAL Editorial Board, is Senior Program Manager for Discovery, Delivery, and Library Analytics at the University of Michigan Library, Ann Arbor, MI.

We in library technology live in interesting times, though not those of the apocryphal curse. No, these are interesting times in the best possible way. Where once there was a paucity of choice in interfaces and content, we have arrived at a time when a range of competing and valid choices exists for just about any particular technology need. Data and functionality of actual utility to libraries are increasingly available not just through proprietary interfaces, but also through APIs (Application Programming Interfaces) that are ready to be consumed by locally developed applications. This has expanded the opportunity for libraries to respond more thoughtfully and strategically to local needs and circumstances than ever before. Libraries are faced with an actual, rather than hypothetical, choice between building or buying fundamental user interfaces and systems.

As the internet has evolved, and coding has become more central to the skillset of many libraries, the capability of libraries to seriously consider building their own interfaces has grown. How does a technologically capable library make the decision to buy a complete system or build its own interface to existing data? The process can be guided by a range of criteria that help define the library's need for a locally managed solution. We'll start by discussing the technological capabilities needed to take on almost any development project, then define three criteria, and finally discuss the circumstances in which a build solution might be appropriate. The goal is to outline a process for deciding when it makes more sense to buy both the interface and the content, to build one or the other locally, or to build both.

Criterion 0: What are the short- and long-term technological capabilities of the library?

Clearly, the first point of consideration is whether the institution has the capacity to manage application development and user research. The short-term answer may be no, but the long-term answer -- one based on the library's strategic direction -- may be that these skills are needed to meet the library's goals or strategic vision. One project may not be enough to tip the scales, but if the library is continually deciding if the immediate project under discussion is the one to change the balance, then perhaps the answer is that it's time to invest in new skillsets and capabilities.

There are actually several skillsets needed to undertake development projects. Individuals with coding skills are needed to adapt existing open-source software to the library's needs -- it is a rare open-source project that does exactly what a library needs it to do, with connectors to all the same data sources and library management tools already perfectly configured by somebody else -- but that is not sufficient.
A library also needs people with user interface and user research skills to ensure that the application meets at least the critical needs of its own user community, and does so with language and cues that match user expectations. Even if there is not a permanent capability on the library's staff, development can take place with contract services. If this is the option selected, a library would do well to make sure that staff are sufficiently trained to make minor updates to interfaces and applications, or that a longer-term arrangement is made for ongoing maintenance and updates.

Criterion 1: What is the need to customize interactions to local situations?

Most, but not all, applications offer opportunities to match interface features and functionality with local user needs. The more interactive and core to the library's service model the tool is, the more likely the tool is to benefit from customization. For example, a proxy server -- technology that allows an authenticated user to access licensed content as if she were in the physical library or within a campus on a defined network -- has little or no user interface. There is little need to customize the tool to meet user needs, beyond ensuring the list of online resources and URLs subject to being proxied is up to date. There really aren't any particularly useful APIs to consume and reproduce elsewhere, and there are easier ways to build an A-Z list of licensed content than harvesting the proxy server's configuration lists.

In contrast, the link resolver -- technology that takes a citation formatted according to the OpenURL standard and returns a list of appropriate full-text destinations to which the library has licensed access -- may well be worth bringing in house. Some vendors offer their software to be run locally, while others provide API access to the metadata. At my institution, we used the APIs Serials Solutions makes available for its 360 Link product to build our own interface using the open-source Umlaut software. (See https://mgetit.lib.umich.edu/.) Why go to the trouble of recreating an interface? For several reasons, some of which (understanding user behaviors and maintaining control over user data to the extent practical) I'll touch on in the following two sections. The main reason centered on providing a user interface consistent with the rest of our web presence, offering integrations with our document delivery service, a way to contact our online chat service, and a way to report problem links directly to the library when the full-text links provided by the system do not work. While these features are generally available through vendor interfaces, the user experience is hard to make consistent with other services we offer.
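To make the link-resolver case concrete: the OpenURL 1.0 standard (Z39.88-2004) encodes a citation as key/encoded-value pairs, and a local front end passes such a request along to whatever resolver API sits behind it. The sketch below builds one of these requests in Python. The resolver base URL and all citation values are invented for illustration; it does not reproduce the 360 Link API or Umlaut, only the standard OpenURL query format they traffic in.

from urllib.parse import urlencode

# Hypothetical resolver address; a library would substitute its own
# (or its vendor's) link-resolver endpoint here.
RESOLVER_BASE = "https://resolver.example.edu/openurl"

def build_openurl(citation):
    """Build an OpenURL 1.0 key/encoded-value query string for a
    journal article citation."""
    params = {
        "url_ver": "Z39.88-2004",
        "ctx_ver": "Z39.88-2004",
        "rft_val_fmt": "info:ofi/fmt:kev:mtx:journal",
        "rft.genre": "article",
        "rft.atitle": citation["article_title"],
        "rft.jtitle": citation["journal_title"],
        "rft.issn": citation["issn"],
        "rft.volume": citation["volume"],
        "rft.spage": citation["start_page"],
        "rft.date": citation["year"],
    }
    return RESOLVER_BASE + "?" + urlencode(params)

# Invented sample citation, for illustration only.
print(build_openurl({
    "article_title": "An Example Article",
    "journal_title": "An Example Journal",
    "issn": "1234-5679",
    "volume": "26",
    "start_page": "225",
    "year": "1992",
}))

A locally built interface sits between the patron and this request, which is what makes it possible to wrap the resolver's response in the library's own look, chat links, and problem-reporting forms.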
Criterion 2: What are the needs for integration with other systems from different providers?

Integrations can run in two directions: from the system under consideration to existing library or campus/community tools, and from those environmental tools to the library. When thinking about the buy-or-build decision, understanding the scope of these integrations up front is important. If all of the tools or services that need to consume information from or provide information to your system rely on well-defined standards that are broadly implemented, this criterion may be a wash; there may not be an inherent advantage to building or buying based on data exchange. If, however, the other systems are themselves tricky to work with, relying on inputs or providing outputs in a non-standard or idiosyncratic way, this situation may swing the pendulum toward building the system yourself so you can manage those integrations directly.

For example, many course management systems on academic campuses can consume and provide data using the LTI (Learning Tools Interoperability) standard for data exchange. Many traditional library applications do as well, so if a library using an LTI-compliant system needs to provide course reserves reading lists to the course management system, this is a ready-made way to make that information available. At the other extreme, bringing registrar's data into a library catalog -- to know who is in what courses to provide those patrons with an appropriate reference librarian contact for a particular subject, or access to a reading list through a course reserves system -- may only be possible through customized applications to read non-standard data. In this case, to provide the desired level of service to the campus, the library may need to build local applications.

Criterion 3: Who manages confidentiality or privacy of user interactions?

A final, and increasingly significant, criterion to consider is where the library believes responsibility for patron data and information-seeking behavior to reside. Notwithstanding contractual or licensing obligations taken on by library vendors, the risk of inadvertent exposure or intentional sharing of user interactions is always present. One advantage of building local systems to interact with vendor systems (link resolvers, discovery platforms, etc.) is that the vendor does not have access to the end-user's IP address or any other personally identifying information. The vendor only sees a request coming from the library's application; all requests are equal and undifferentiated. Of course, once users access the target item they are seeking (an online journal, database, etc.), that particular vendor's site has access to that information. For libraries concerned about user privacy, the risk of exposure is somewhat mitigated by managing the discovery or access layer in-house -- and deciding to maintain a level of user information that suits that particular library's comfort level -- and potentially minimizing the single point of failure for breaches. At the same time, such a decision puts more responsibility on the library or its parent information technology organization to protect data from exposure. Some libraries feel they can handle this responsibility -- either by careful protection of the data, or by not collecting and storing it in the first place -- in a way that library vendors cannot.
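The "equal and undifferentiated requests" point can be shown in a few lines. In this sketch the vendor endpoint is hypothetical and the code illustrates only the pattern, not any vendor's actual API: because the library's own server issues the HTTP call, the vendor sees the library server's address and nothing about the patron.

from urllib.parse import urlencode
from urllib.request import urlopen

# Hypothetical endpoint standing in for any licensed API the library
# queries on a patron's behalf.
VENDOR_API = "https://api.vendor.example.com/search"

def mediated_search(query):
    """Issue the vendor request from the library's server.

    The patron's IP address and identity never travel to the vendor;
    every request arrives looking the same, from the same host."""
    url = VENDOR_API + "?" + urlencode({"q": query})
    with urlopen(url, timeout=10) as response:
        return response.read()

The trade-off named above follows directly from this design: whatever the library's application logs about who asked for what now lives on the library's side, so the library, not the vendor, decides how much of it to keep.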
Concluding Thoughts

Making the buy-or-build decision is not straightforward; the criteria described here are not the only ones a library might wish to consider, but they are common ones with the greatest ramifications. Putting the decision process into a framework can help a library make consistent decisions over time, enabling it to focus on the projects and systems that are most important to the library and its community (a campus, a town, or a company).

10113 ---- From Dreamweaver to Drupal: A University Library Website Case Study
Jesi Buell and Mark Sandford

INFORMATION TECHNOLOGY AND LIBRARIES | JUNE 2018

Jesi Buell (jbuell@colgate.edu) is Instruction and Design and Web Librarian and Mark Sandford (msandford@colgate.edu) is Systems Librarian at Colgate University, Hamilton, New York.

ABSTRACT

In 2016, Colgate University Libraries began converting their static HTML website to the Drupal platform. This article outlines the process librarians used to complete this project using only in-house resources and minimal funding. For libraries and similar institutions considering the move to a content management system, this case study can provide a starting point and highlight important issues.

INTRODUCTION

The literature available on website design and usability is predominantly focused on business or marketing websites. What separates library websites from other informational or commercial websites is the complexity of the information architecture—they contain both intricate informational and transactional functions. Website managers need to maintain congruity between many interrelated but disparate tools in a singular interface and navigational system. Libraries are also often challenged with finding individuals who possess the appropriate skills to build and maintain a secure, accessible, attractive, and easy-to-use website. In contrast to libraries, commercial companies employ a team of designers, developers, content managers, and specialists to triage internal and external issues. They can also spend months or years perfecting a website and, of course, all these factors have great costs associated with them. Given that many commercial websites need a team of highly skilled workers with copious time and funding, how can librarians be expected to give their patrons experiences similar to sites like Google? This case study will outline how a small team of librarians completely overhauled their fragmented, Dreamweaver-based website, moving to a more secure, organized, and appealing open-source platform with Drupal on a tight timeline and with very few financial consequences. It includes a timeline of major milestones in the appendix.

GOALS AND OBJECTIVES

The first necessity for restructuring the Colgate University Libraries' website was building a team that had the skills and knowledge necessary to perform this task. The website overhaul was spearheaded by Jesi Buell, instructional design and web librarian, and Mark Sandford, systems librarian. Buell has a user experience (UX) design and editing background while Sandford has systems, cataloging, and server experience. They were advised by Web Development Committee (WDC) members Cindy Li, associate director of library technology and digital initiatives, and Debbie Krahmer, digital learning and media librarian. Together, the group understood trends in digital librarianship, the needs of the Libraries' patrons, as well as website and catalog design and maintenance. The first thing the WDC did was outline its goals and objectives, and this documented the weaknesses the group wanted to address with a new website. The WDC identified four main improvements Colgate Libraries needed to make to the website:

Improve Design

Colgate Libraries' old website suffered from varied design and language use across pages and various tools (LibGuides, catalog, etc.).
This led to an inconsistent and often frustrating user experience and detracted from the user's sense of a single, cohesive website. The WDC also wanted to improve and update the aesthetic quality of the website. While many of these changes could have been made with an overhaul of the existing site, the WDC would have still needed to address the underlying cause. Responsibility for content was decentralized, and content creation relied too heavily on technical expertise with Dreamweaver. Further, the ad hoc nature of the content—the product of years of "fitting in" content without a holistic approach—meant that changes to visual style could not be accomplished by changing a single CSS file. There were far too many exceptions to make changes simply.

Improve Usability

The WDC needed to make sure all the webpages were responsive and accessible. A restructuring of layout and information architecture (IA) was also necessary to improve findability of resources. On the old site, some content was hidden behind several layers of links. With no platform to ensure or enforce accessibility standards, website managers had to trust that all content creators were conscious of best practices or, failing that, pages had to be re-edited to improve accessibility.

Improve Content Creation and Governance

A common source of library staff frustration was the authoring experience using Dreamweaver. There was no way to track when a webpage was changed or see who had made those changes. Situations occurred where content was deleted or changed in error, and no one else knew until a patron discovered a mistake. Staff could also mistakenly push out outdated versions of pages. It was not an ideal situation, and it was impossible for an individual (the web librarian) to monitor hundreds of pieces of content for daily changes to check for accuracy. The only other option would have been to narrow access to only those on the WDC, but that would mean everyone had to wait for the web librarian to push content live, which would also be frustrating. Beyond the security and workflow issues, many of the library staff felt uncomfortable adding or editing content because Dreamweaver requires some coding knowledge (HTML, CSS, JavaScript). Therefore, the group wanted to install a content management system (CMS) that provided a WYSIWYG (What You See Is What You Get) content editor so that no coding knowledge would be needed.

Unite Disparate Sites (Website, Blog, and Database List) under One Updated URL on a Single Secure Server

Colgate Libraries' website functionality suffered from what Marshall Breeding describes as "a fragmented user experience."1 The Libraries' website's main address was http://exlibris.colgate.edu. However, different tools lived under other URLs—one for a blog, another for the database list, yet another still for the mobile site librarians had to maintain because the main website was not responsive. Additionally, some portions of the website had been set up on other servers because of various limitations in the Windows .NET environment and in-house skills. This was further complicated by the fact that most specialized interactivity or visual components had to be created from scratch by existing staff. The Libraries' blog was on an externally hosted WordPress site, and the database A–Z list was on a custom-coded PHP page. A unified domain would make usage statistics easier to track and analyze.
Additionally, it would eliminate the need for multiple credentials for the various external sites. Custom code, be it in PHP, .NET, or any other language, also needs to be regularly updated as new security vulnerabilities arise.2 Moving to a well-maintained CMS would help alleviate that burden.

By establishing goals and objectives, the WDC had identified that it wanted a CMS to help with better governance, easier maintenance, and ways to disperse web maintenance responsibilities across library faculty. It was important to choose a CMS platform that offered a WYSIWYG editor so that content authoring did not require coding knowledge. Additionally, the group wanted to update the site's aesthetic and navigational designs. The WDC also decided that this was the optimal time to introduce a discovery layer (since all these changes would be one entirely new experience for Colgate users) rather than make smaller, continual changes that would require users to keep readjusting how they used the website. The backend complexity of updating both the website platform and implementing a discovery layer required abundant and detailed planning. However, while there was a lot of overlap in the preparatory work for implementing the discovery layer as well as the CMS, this article will focus primarily on the CMS.

PLANNING

After the WDC had detailed goals and objectives, and the proposal to update the Libraries' website platform was accepted by library faculty, the group had to take several steps to plan the implementation. The first steps in planning dealt with analysis.

Content Analysis

The web librarian conducted a content analysis of the existing website. Using Microsoft Excel to document the pages and the Omni Group's OmniGraffle to organize the spreadsheet into a diagram, she cataloged each page and the navigation that connected that page to other pages (a small sketch of how that walk can be automated appears after the peer-analysis discussion below). This can be extremely laborious but was necessary because some content was inherited from past employees over the course of a decade, and no one knew exactly what content was live on the website. This visual representation allowed content creators to see redundancy in both content and navigation. It also made it easy for them to identify old content and combine or reorder pages.

Needs Analysis

The WDC wanted to make sure it considered more than the content creators' needs. The group surveyed Colgate faculty, staff, and students to learn what they would like to see improved or changed. The web librarian conducted several UX studies with both students and faculty, and this elucidated several key areas in need of improvement.

Peer Analysis

Peer analysis involves thoroughly investigating peer institutions' websites to analyze how they organize both their content and their site navigation. It also gives insight into what other services and tools they provide. It is important to choose institutions similar in size and academic focus. Colgate University is a small, liberal arts institution that serves only an undergraduate population, so the Libraries would not seek to emulate a large university that serves graduate populations or distance learners. Peer analysis is an excellent opportunity to see where a website is not measuring up to other websites as well as to borrow ideas from peers to customize for your specific patrons.
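As referenced above, the mechanical core of a content inventory is a walk of the site: fetch a page, record the links it carries, and follow the ones on the same host. A minimal sketch using only Python's standard library might look like the following; the starting URL, page limit, and output structure are illustrative, and the real inventory also captured judgments no script can make, such as which content was outdated or redundant.

from html.parser import HTMLParser
from urllib.parse import urljoin, urlparse
from urllib.request import urlopen

class LinkCollector(HTMLParser):
    """Collect the href of every <a> tag on a page."""
    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)

def inventory(start_url, limit=200):
    """Walk same-host pages from start_url, recording each page and
    the pages it links to: the raw material for the spreadsheet and
    site diagram described above."""
    host = urlparse(start_url).netloc
    to_visit, seen, site_map = [start_url], set(), {}
    while to_visit and len(seen) < limit:
        url = to_visit.pop()
        if url in seen:
            continue
        seen.add(url)
        try:
            with urlopen(url, timeout=10) as resp:
                html = resp.read().decode("utf-8", errors="replace")
        except OSError:
            continue  # skip pages that fail to load
        parser = LinkCollector()
        parser.feed(html)
        outlinks = {urljoin(url, href) for href in parser.links}
        site_map[url] = sorted(outlinks)
        to_visit.extend(u for u in outlinks if urlparse(u).netloc == host)
    return site_map

The resulting page-to-links mapping can be exported to a spreadsheet row per page, which is essentially the structure the diagramming step consumed.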
Evaluating Platforms

Now that the group knew what the Libraries had and what they wanted from the web presence, it was time to evaluate the available options. This involved evaluating CMS products and discovery layer platforms. The WDC researched different CMSs and listed positives and negatives. Ultimately, the group determined that Drupal best satisfied the majority of Colgate's identified needs. A separate committee was formed to evaluate the major discovery-layer services with the understanding that any option could be integrated into the main website as a search box.

Budgeting

As free, open-source software, Drupal does not require a subscription or licensing fee. Campus IT provided a virtual server for the website at no cost to the Libraries. Budgeting was organized by the associate director of library technology and digital initiatives and the university librarian. Money was set aside in case a consultant or developer was needed, but the web and systems librarians were able to execute the conversion from Dreamweaver to Drupal without external support. If future development support is needed for specific projects, it can be budgeted for and purchased as needed.

The last step was creating a timeline defining achievable goals, ownership (who oversees completing the goal and who needs to be involved with the work), and date of completion.

TIMELINE

The timeline was outlined as follows:

October 2015–January 2016

Halfway through the Fall 2015 semester, the WDC began to create a proposal for changes to be made to the website. This proposal would be submitted to the university librarian for consideration by December 1. In the meantime, the web librarian completed a content inventory, peer analysis, and UX studies. She also gathered faculty and staff feedback on the current website through suggestion-box commentary, one-on-one interviews, online questionnaires, and anecdotal stories. By the deadline for the proposal, this additional information was condensed and presented to the university librarian.

After incorporating suggested changes made by the university librarian, the WDC was able to present both the proposal and results from various studies to the library faculty on January 4, 2016. At the end of the meeting, the faculty voted to move forward and adopt the proposed changes.

February 2016

February was spent meeting with stakeholders, both internal and external to the Libraries, to gather concerns, necessary content, and ideas for improvements. The WDC members shared the responsibility of running these meetings. All members from the following departments were interviewed: Research and Instruction, Borrowing Services, Acquisitions, Library Administration, Cataloging, Government Documents, Information Literacy, Special Collections and University Archives, and the Science Library. Together, the WDC also met with members from IT and Communications.

It was vital that these sessions identify several components. First, what content was important to retain on the new site, and why? The act of justification made stakeholders evaluate whether the information was necessary and useful to the Libraries' users. The WDC also asked the stakeholders to identify changes they wanted to see made to the website. The answers ranged from minor aesthetic tweaks to major navigational overhauls. Last, it was important to understand how specific changes might impact workflows and functionality for tools outside Colgate Libraries' own website.
For example, the WDC had to update information with the Communications department so that the Libraries' website would be findable on the university's app. All the answers the WDC received were compiled into a report, and the web librarian used this information to inform design decisions moving forward.

March 2016

While the associate director of library technology and digital initiatives coordinated demos from discovery-layer vendors, the WDC also met to choose the final template from three options designed by the web librarian. The web and systems librarians also met to create a list of developers in case assistance was needed in the development of the Drupal site. The WDC team researched potential developers and inquired about their pricing.

The web librarian began to create wireframe templates of the different types of pages and page components (homepage, hours blocks, blogs, forms, etc.). She also began transferring existing content from the old website to the new website. This process, in addition to the development of new content identified by stakeholders, was to be completed by mid-summer. Meanwhile, the systems librarian began to consolidate the external sites under Drupal to the extent possible. While LibGuides lives externally to Drupal and maintains its own URL that the Libraries' website links out to, he was able to bring the database A–Z list, blog, and analytics into the Drupal platform. This entailed setting up new content types in Drupal to accommodate various functional requirements for the A–Z list and assist in creating pages to search for and display database information.

April–May 2016

Drupal allows for various models of permissions and authentication. By default, accounts can be created within the Drupal system and roles and permissions assigned to individuals as needed. The LDAP (Lightweight Directory Access Protocol) module allowed us to tie authentication to university accounts and includes the ability to tie Drupal permissions to Active Directory roles and groups. Connecting Drupal to the university LDAP server required the assistance of IT Infrastructure staff but was straightforward. IT staff provided the connection information for the Drupal module's configuration and created a resource account for the Drupal module to use to connect to the LDAP service. As currently implemented, the LDAP module simply verifies credentials and, if a local Drupal account does not exist, creates one for the user. Permissions for staff are added to accounts after account creation as needed as a part of the onboarding process.
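The verify-then-provision flow just described is handled for Drupal by the LDAP module itself, in PHP. Purely as an illustration of the pattern, here is a Python sketch using the third-party ldap3 package; the host, DN template, and account store are all invented for illustration and do not reflect Colgate's actual configuration.

from ldap3 import Server, Connection  # third-party package, an assumption

# Invented connection details, for illustration only.
LDAP_HOST = "ldaps://ldap.example.edu"
USER_DN_TEMPLATE = "uid={username},ou=people,dc=example,dc=edu"

local_accounts = {}  # stands in for the CMS's own account table

def login(username, password):
    """Verify credentials against the directory; on success, create a
    local account if one does not already exist, mirroring the module
    behavior described above."""
    server = Server(LDAP_HOST)
    conn = Connection(server,
                      user=USER_DN_TEMPLATE.format(username=username),
                      password=password)
    if not conn.bind():  # bad credentials: the bind simply fails
        return False
    conn.unbind()
    if username not in local_accounts:
        local_accounts[username] = {"roles": []}  # permissions added later
    return True

The design point is that the directory remains the single source of truth for passwords, while roles and permissions live entirely on the CMS side and can be granted after the account first appears.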
Permissions in Drupal can be highly granular. Since one of the goals of the migration to Drupal was to simplify maintenance of the website, the WDC decided to begin with a relatively simple, permissive approach. Currently, all library staff can edit any page. Because of Drupal's ability to track and revert changes easily, undoing a problematic edit is a simple procedure, and because all changes are tied to an individual login, problems can be addressed through training as needed. The WDC discussed a more fragmented approach that tied editing privileges to specific parts of the site but decided against it. The WDC team felt it was better to begin with the presumption of trustworthiness, expecting staff to only make changes to pages they were personally responsible for. Additionally, trying to divide the site into logical pieces, then accounting for the inevitable exceptions, would be complicated and time-consuming. The WDC reserved the right to begin restricting permissions in the future, but thus far this has proven unnecessary.

July–August 2016

As the Libraries ramped up to the official launch, it was crucial to educate the library faculty and staff so they could become independent back-end content creators. Both the web and systems librarians held multiple training sessions for the Libraries' employees so that everyone felt comfortable both editing and generating content. The associate director of library technology and digital initiatives drafted a campus-wide email announcing the new website and discovery layer at this point. It was sent out a month in advance of the official launch.

The new website launched in two parts. The soft launch occurred on August 1, 2016. The web and systems librarians set up a link to the new website on the old site so that users could choose between getting acclimated to the new website or using the tool they were used to in the frantic weeks leading up to the beginning of the semester. August 15, 2016, was the official launch. At this point, the http://exlibris.colgate.edu Dreamweaver-based website was retired, and IT redirected all traffic heading to the old URL to the new Drupal-based website at http://cul.colgate.edu. Because Drupal's URL structure and information architecture differed from the old website's, the WDC decided that mapping every page on the old site to the new one would be too time consuming. While it was acknowledged that this might cause some disruption (as it would break existing links), it seemed necessary for keeping the project moving forward. Library staff updated all the external links they could, and the Google search operator "inurl" allowed us to identify other sites outside the Libraries' control that pointed to the old website. The WDC reached out to the maintainers of those few sites as appropriate. The biggest risk the Libraries took by not redirecting all URLs to the correct content was the potential to disrupt faculty who had bookmarked content or had direct URLs in course materials. However, the WDC team received very few complaints about the new site, and most users agreed that the improvements to the site far outweighed any temporary inconveniences caused by it. If nothing else, the simplified architecture made finding content easier, so direct links and bookmarks became far less important than they once were.
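Campus IT handled the actual redirect, most likely at the web-server level. Purely as a sketch of the blanket host-to-host redirect described above, the whole idea fits in a few lines of generic Python WSGI; the new base URL comes from the article, and everything else is an assumption.

NEW_BASE = "http://cul.colgate.edu"

def redirect_app(environ, start_response):
    """Send every request for the old host to the new one with a
    permanent redirect, keeping the requested path so any page that
    happens to exist under the same path still resolves."""
    location = NEW_BASE + environ.get("PATH_INFO", "/")
    if environ.get("QUERY_STRING"):
        location += "?" + environ["QUERY_STRING"]
    start_response("301 Moved Permanently", [("Location", location)])
    return [b""]

if __name__ == "__main__":
    from wsgiref.simple_server import make_server
    make_server("", 8000, redirect_app).serve_forever()

A host-level redirect like this trades precision for effort: old deep links land on the new site (often on its error page) rather than on a hand-mapped equivalent, which is exactly the trade-off the WDC accepted.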
The response from incoming and returning students and faculty to the updated look and improved usability of the Libraries’ digital content was overwhelmingly positive. Following best design practices, in January 2017 more UX testing was conducted with student and teaching faculty participants to gauge their reactions to the new website.3 Users overwhelmingly found the new website to be both more aesthetically pleasing and usable than the old website.

On the back end, the Libraries’ content is now more secure, responsive, and accessible because the Libraries are using a CMS. Library faculty and staff have been able to add or remove content that they are responsible for, while the website maintains a consistent look and feel across all pages. Governance has improved substantially, as library staff have been able to easily and quickly contribute to the website’s content without administrative delays. As the team moves forward, the WDC plans to investigate advanced Drupal tools, implement an intranet, and better leverage Google Analytics. As with all library endeavors, improvement requires continued effort and attention.

APPENDIX: DETAILED TIMELINE

1. October 2015
   a. Began discussion with WDC to create proposal for website changes (web librarian)
2. November–December 2015
   a. Complete content inventory (web librarian)
   b. Complete peer analysis (web librarian)
   c. Complete UX studies (web librarian)
   d. Gather faculty and staff feedback on current website (web librarian)
3. December 1, 2015
   a. Submit proposal to change from Dreamweaver to Drupal to university librarian for consideration and approval (web librarian)
4. January 4, 2016
   a. Submit revised proposal to library faculty for consideration and approval (web librarian)
5. January 2016
   a. Set up test Drupal site (systems librarian)
6. February 2016
   a. Complete meetings with departments to gather feedback on concerns, content, and ideas for improvements (library department meetings were split among WDC members)
7. March 2016
   a. Demo PRIMO, Ex Libris, and Summon for library faculty and staff consideration (associate director of library technology and digital initiatives)
   b. From three options, choose template for our website (web librarian—approval by the WDC and then the library faculty)
   c. Create list of developers in case we need assistance (web librarian and systems librarian)
   d. Create wireframe templates for homepage (web librarian)
   e. Begin transferring content from old website to new website and create new content with other stakeholders—to be completed by mid-summer (web librarian)
   f. Begin consolidating multifarious external sites under Drupal as much as possible (systems librarian)
8. April 2016
   a. Get Drupal working with LDAP (systems librarian)
   b. Agree on permissions and roles for back-end users (systems librarian—with approval by WDC)
   c. Agree on discovery layer choice (associate director of library technology and digital initiatives)
   d. Meet with outside stakeholders—Communications, IT, administration
9. May 2016
   a. Integrate discovery layer search (systems librarian)
10. July 2016
    a. Provide training for library faculty and staff as back-end content creators (web librarian)
    b. Prepare campus-wide email to announce new website and discovery layer with our new URL (associate director of library technology and digital initiatives and web librarian)
11. August 1, 2016
    a. Set up a link on our old site (http://exlibris.colgate.edu) so for two weeks users could choose between using the old interface or getting acclimated to the new website before the Fall semester started (systems librarian)
12. August 15, 2016
    a. OFFICIAL LAUNCH—We retire our http://exlibris.colgate.edu Dreamweaver-based website and redirect all traffic headed to our old URL to our new Drupal-based website at http://cul.colgate.edu (systems librarian)
13. September–October 2016
    a. Update and get approval from library faculty for a new web style guide and governance guide (web librarian)
14. January 2017
    a. Conduct UX studies of students and faculty to see how people are using both the new website and the new discovery layer; gather feedback and ideas for improvement (web librarian)

BIBLIOGRAPHY

Breeding, Marshall. “Smarter Libraries through Technology: Strategies for Creating a Unified Web Presence.” Smart Libraries Newsletter 36, no. 11 (November 2016): 1–2. General OneFile (accessed August 3, 2017). http://go.galegroup.com/ps/i.do?p=ITOF&sw=w&v=2.1&it=r&id=GALE%7CA471553487.

Naudi, Tamara. “Nearly All Websites Have Serious Security Vulnerabilities—New Research Shows.” Database and Network Journal 45, no. 4 (2015): 25. General OneFile (accessed August 3, 2017). http://bi.galegroup.com/essentials/article/GALE%7CA427422281.

Raward, Roslyn. “Academic Library Website Design Principles: Development of a Checklist.” Australian Academic & Research Libraries 32, no. 2 (2001): 123–36. http://dx.doi.org/10.1080/00048623.2001.10755151.

1 Marshall Breeding, “Smarter Libraries through Technology: Strategies for Creating a Unified Web Presence,” Smart Libraries Newsletter 36, no. 11 (November 2016): 1–2. General OneFile.

2 Tamara Naudi, “Nearly All Websites Have Serious Security Vulnerabilities—New Research Shows,” Database and Network Journal 45, no. 4 (2015): 25. General OneFile.

3 Roslyn Raward, “Academic Library Website Design Principles: Development of a Checklist,” Australian Academic & Research Libraries 32, no. 2 (2001): 123–36. http://dx.doi.org/10.1080/00048623.2001.10755151.

10146 ---- Metadata Provenance and Vulnerability

Timothy Robert Hart and Denise de Vries

Timothy Robert Hart (tim.hart@flinders.edu.au) is a PhD researcher and Denise de Vries (denise.devries@flinders.edu.au) is Lecturer of Computer Science, College of Science and Engineering, Flinders University, Adelaide, Australia.

ABSTRACT

The preservation of digital objects has become an urgent task in recent years as it has been realised that digital media have a short life span. The pace of technological change makes accessing these media increasingly difficult. Digital preservation is primarily accomplished by two main methods, migration and emulation. Migration has been proven to be a lossy method for many types of digital objects.
Emulation is much more complex; however, it allows preserved digital objects to be rendered in their original format, which is especially important for complex types such as those comprising multiple dynamic files. Both methods rely on good metadata to maintain change history or to construct an accurate representation of the required system environment. In this paper, we present our findings that show the vulnerability of metadata and how easily they can be lost and corrupted by everyday use. Furthermore, this paper aspires to raise awareness and to emphasise the necessity of caution and expertise when handling digital data by highlighting the importance of provenance metadata.

INTRODUCTION

UNESCO recognised digital heritage in its “Charter on the Preservation of Digital Heritage,” adopted in 2003, stating, “The digital heritage consists of unique resources of human knowledge and expression. It embraces cultural, educational, scientific and administrative resources, as well as technical, legal, medical and other kinds of information created digitally, or converted into digital form from existing analogue resources. Where resources are ‘born digital’, there is no other format but the digital object.”1

Born-digital objects are at risk of degradation, corruption, loss of data, and becoming inaccessible. We combat this through digital preservation to ensure they remain accessible and useable. The two main approaches to preservation are migration and emulation. Migration involves migrating digital objects to a different, currently supported file type. Emulation involves replicating a digital environment in which the digital object can be accessed in its original format. Both methods have advantages and disadvantages. Migration is the more common method because it is simpler than emulation, and its risks are often overlooked. These risks include potential data loss or change, whose effects are permanent. Emulation is complex, but it offers the better means to access preserved objects, especially complex file types comprising multiple dynamic files that must be constructed correctly. Emulation also allows users to handle digital objects with the “look and feel” originally intended.2

Accurate and complete metadata are central to both migration and emulation; thus, they are the focus of this paper. Metadata are needed to record the migration history of a digital object and to record contextual information. They are also necessary to accurately render digital objects in emulated environments. Emulated environments are designed around a digital object’s dependencies, which typically include, but are not limited to, drivers, software, and hardware.3 The metadata describe the attributes of the digital object, from which we can derive the type of system in which it can run (e.g., the operating system), the versions of any software dependencies, and other criteria that are crucial for accurate creation of an emulated environment.

While metadata are being used to support the preservation of digital objects, there is another, equally important role they should be playing. It is not enough to preserve the object so it can be accessed and used in the future. What of the history and provenance of the digital object? What about search and retrieval functionality within the archive or repository the digital object is held in?
One must consider how these preserved objects will be used in the future, and by whom. Preserving digital objects is difficult if adequate metadata are not present, especially if the item is outdated and no longer supported. Looking to the future, we should try to ensure metadata are processed correctly for the lifecycle of the digital object. This means care must be taken at the time of creation and curation of any digital objects because, although some metadata are typically generated automatically, many elements that will play a pivotal role later must be created manually. Digital objects also commonly go through many changes, which is something that must be captured, as the change history will reveal what has happened to the object over the course of its lifecycle. The changes may include how the object has been modified, migrations to different formats, and what software created or changed the object—all of which is considered when emulating an appropriate environment. Examples of these changes can be found in the case studies presented in this paper.

METADATA TYPES

The common and more widely used metadata types include, but are not restricted to, administrative, descriptive, structural, technical, transformative, and preservation metadata. Each metadata type describes a unique set of characteristics of digital objects. Administrative metadata include information on permissions as well as how and when an object was created. Transformative metadata include logs of events that have led to changes to a digital object.4 Structural metadata describe the internal structure of an object and any relationships between components. Technical metadata describe the digital object with attributes such as height, width, format, and other technical details.5 Preservation metadata support digital preservation by maintaining authenticity, identity, renderability, understandability, and viability. They are not bound to any one category, as they comprise multiple types of metadata, not including descriptive or contextual metadata. Unlike the common metadata types, however, preservation metadata are often ambiguous.6

In 2012, the developers of version 2.2 of the PREMIS Data Dictionary for Preservation Metadata saw descriptive metadata as less crucial for preserving digital objects; however, they did state that they were important for discovery and decision making.7 While version 2.2 allowed descriptive metadata to be handled externally through existing standards such as Dublin Core, the latest version (2017) of the dictionary allows “Intellectual Entities” to be created within PREMIS that can capture descriptive metadata.8 Thus, while digital preservation does not require all types of metadata, the absence of contextual metadata limits the future possibilities for the preserved object. Hart writes that because multimedia objects are dynamic and interactive, and often composed of multiple image, audio, video, and software files, descriptive metadata are increasingly important: they can be used to describe, organise, and package the files.9 It is also stressed that content description is of great importance because digital objects are not self-describing, which makes identifying semantic-level content difficult; without description metadata, context is lost.10 For example, without description metadata to provide context, an image’s subject information is lost, along with search and retrieval functionality.
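The cost of losing descriptive metadata is easier to see when one considers how little it takes to embed it. The sketch below writes a Dublin Core description into an image’s XMP block by calling ExifTool (the tool used in the case studies that follow) from Python; the file name and caption are hypothetical, and this tag choice is one common convention rather than a workflow prescribed by the case studies.

```python
# Sketch: embed a descriptive (Dublin Core) caption in an image's XMP
# metadata via ExifTool, then read it back. The file name and caption are
# hypothetical; assumes exiftool is installed and on the PATH.
import subprocess

image = "town_hall_opening.jpg"  # hypothetical file
caption = "Opening of the new town hall, attended by the mayor, 12 May 1998."

# Write an XMP Dublin Core description so the context travels with the file.
subprocess.run(["exiftool", f"-XMP-dc:Description={caption}", image], check=True)

# Read it back to confirm the description is now part of the object itself.
result = subprocess.run(
    ["exiftool", "-XMP-dc:Description", image],
    capture_output=True, text=True, check=True,
)
print(result.stdout)
```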
Without such descriptive information, verifying whether an object is the original, a copy, or a fabricated or fraudulent item is impossible in most cases.

Metadata Vulnerability—Case Studies

Digital objects that are currently being created often go through several modifications, making it difficult to identify the original or authentic copy of the object. Verifying and validating authenticity is important for preserving, conserving, and archiving objects. The Digital Preservation Coalition defines authenticity as follows:

The digital material is what it purports to be. In the case of electronic records, it refers to the trustworthiness of the electronic record as a record. In the case of “born digital” and digitised materials, it refers to the fact that whatever is being cited is the same as it was when it was first created unless the accompanying metadata indicates any changes. Confidence in the authenticity of digital materials over time is particularly crucial owing to the ease with which alterations can be made.11

Tests were undertaken to discover how vulnerable metadata can be in digital files that are subject to change, which can lead to loss, addition, and modification. The tests were conducted using the file types JPEG, PDF, and DOCX (Word 2007). The tests revealed what metadata can be extracted and what metadata could be present in the selected file types. Furthermore, they revealed how specific metadata can verify and validate the authenticity of a file such as an image. For each test, the metadata were extracted using ExifTool (http://owl.phy.queensu.ca/~phil/exiftool/). Alternative browser-based tools were tested and provided similar results; however, ExifTool was selected as the primary testing tool because it produced the best results and had the best functionality. Some of the files tested provided extensive sets of metadata that are too large to include, but subsets can be found in Hart (2015). Note that only subsets are included because some metadata were removed for privacy and relevance reasons. The process and method for each test was conducted in the following manner:

• Case study 1—JPEG
  o Original metadata extracted for comparison
  o Image copied, metadata extracted from copy and examined for changes
  o File uploaded to social media, downloaded from social media, extracted and examined against original
• Case study 2—JPEG (modified)
  o Original metadata extracted for comparison
  o Image opened and modified in photo editing software (Adobe Photoshop), metadata extracted from new version and examined against original
• Case study 3—PDF
  o Basic metadata extraction performed to establish what metadata are typically found in PDF files and what types of metadata could be possible
• Case study 4—DOCX
  o Original metadata extracted for comparison
  o File saved as PDF through Microsoft Word and metadata compared to original
  o File converted to PDF through Adobe Acrobat and metadata compared to original

Case Study 1

This case study investigated the everyday use of digital files, the first use being simply copying a file. It was revealed that copying a file creates an exact copy of the original, with no changes in metadata aside from the creation and modification time/date. Thus, the copy could not be distinguished from the original unless the original creation time/date was known. The second everyday use was uploading an image to Facebook.
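The extract-and-compare step at the heart of these tests can be sketched briefly: dump each file’s tags with ExifTool’s JSON output and diff the tag sets. This is an illustration of the method rather than the authors’ exact procedure, and the file names are hypothetical.

```python
# Sketch of the original-versus-copy comparison: extract each file's tags
# with ExifTool's JSON output and report what was lost or changed.
# File names are hypothetical; assumes exiftool is on the PATH.
import json
import subprocess

def extract_tags(path: str) -> dict:
    out = subprocess.run(
        ["exiftool", "-json", "-G1", path],  # -G1 prefixes tags with their group
        capture_output=True, text=True, check=True,
    )
    return json.loads(out.stdout)[0]  # one dict of tag -> value per file

original = extract_tags("original.jpg")
downloaded = extract_tags("downloaded_from_social_media.jpg")

lost = sorted(set(original) - set(downloaded))
changed = sorted(
    t for t in set(original) & set(downloaded) if original[t] != downloaded[t]
)
print(f"{len(lost)} tags lost, e.g. {lost[:5]}")
print(f"{len(changed)} tags changed, e.g. {changed[:5]}")
```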
The metadata-extraction tests revealed that the original file had approximately 265 metadata elements. (The approximation is caused by the ambiguity of certain elements that may be read as singular or multiple entries.) These elements included, but were not limited to, the following:

• dates
• technical metadata
• creator/author information
• color data
• image attributes
• creation-tool information
• camera data
• change
• software history

Many of the metadata elements had useful information for a range of situations. Even so, several metadata elements were missing that would require user input to create. Once the file had been uploaded to and then downloaded from social media, approximately 203 metadata elements were lost, including date, color, creation-tool information, camera data, change, and software history. It can be argued that removing some of this metadata helps keep user information private, but certain metadata should be retained, such as change and software history. These metadata make it easier to differentiate fabricated images from authentic images and to know which modifications have been made to a file. For preservation purposes, the missing metadata may be exactly what is needed to establish authenticity. This case study aims to make users aware of the significant risk of metadata loss when dealing with digital objects. If metadata are not identified and captured before the object is processed within a repository, the loss could be irreversible.

Case Study 2

The second case study revealed how the change and software history metadata can be used to easily identify when a file has been modified. In the test conducted, it was evident from visually comparing the images that changes had been made; however, modifications are not always obvious, as some changes can be subtle, such as moving an element in the image in a way that completely changes what the image is conveying. The following example displays the change history from the image used in case study 1, revealing how the metadata can easily identify modification:

• History Action—saved, saved, saved, saved, converted, derived, saved
• History When—the first save was at 2010:02:11 21:59:05 and the last at 2010:02:11 22:12:01, with each action having its own timestamp
• History Software Agent—Adobe Photoshop CS4 Windows for each action
• History Parameters—converted from TIFF to JPEG

Further testing was conducted with simple photo manipulation of an original image to see firsthand the issues described in the initial test. The image contained approximately 178 metadata elements, including the typical metadata found in the first case study. Once the image was processed and modified with Adobe Photoshop CS5, the metadata were no longer identical. The modified image had approximately 201 metadata elements. The new elements included Photoshop-specific data, change, and software history. However, extensive camera data were lost. It can be argued that the camera data are not important for digital preservation because their absence will not hinder the preservation process. However, once the file is preserved and those data are lost, important technical and descriptive information can never be regained. For example, consider a spectacular digital image that captures an important moment in history. If that image is preserved for twenty years, in that time cameras and perhaps photography itself will have advanced dramatically.
How digital images are captured and processed might by then be completely different and will most likely produce different results. Should someone wish to know how that preserved image was captured, they would need to know what camera was used, lens and shutter-speed data, lighting data, and other technical information. Preserving those metadata can be almost as important as preserving the file itself because each metadata element has importance and meaning to someone.

As most viewers of online media are aware, photos are often modified, especially on social media. This is often performed on “selfies,” pictures taken of oneself. These can be modified to make the person in the photo look better or to hide features they see as flawed. Small modifications, such as covering blemishes or improving the lighting, have little effect on the image’s context, but some modifications and manipulations can mislead people. These manipulated images often take the form of viral hoax images circulating around the web. For example, figure 1 displays how two images can be combined into a composite image that changes the context of the image.

Figure 1. Composite image. “Photo Tampering throughout History,” Fourandsix Technologies, 2003, http://pth.izitru.com/2003_04_00.html.

The two images side by side are original photos taken in Basra of a British soldier gesturing to Iraqi civilians to take cover. In the right image, the Iraqi man is holding a child and seeking help from the soldier; as you can see, this soldier does not interpret this as a hostile act. The image above is a composite of the two that changes the story. In this image, the soldier appears to be responding with hostility toward the man approaching. With basic photo manipulation, this soldier who is protecting innocent civilians is portrayed as holding them against their will. Images like this circulate through media of all types, and although the exchangeable image file format (EXIF) metadata may not identify what has been done to the image, they would eliminate any doubt that the image has been modified. Unfortunately, these data are not made available. Making users aware of this vulnerability may improve detection of file manipulation at the time of ingest, to better ensure only accurate and authentic material is being considered for preservation. Donations received by digital repositories such as libraries must be scrutinised by trained individuals. With this awareness and knowledge of metadata, they can perform their duties to a much higher standard.

Case Study 3

The PDF metadata extraction provided interesting results. Over a range of tests on academic research papers, the main metadata identified consisted of PDF version, author, creator, creation date, modification date, and XMP (Adobe Extensible Metadata Platform) data. These metadata were not present in every PDF tested; in fact, the majority of PDF files seemed to be lacking important metadata. The author and creator fields were generally listed as “administrator” or “user,” and bibliographic metadata were usually missing. However, PDF openly supports XMP embedding; therefore, bibliographic metadata could be embedded in the PDF. Through further testing, bibliographic metadata linked to the PDFs were found stored in online databases.
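The kind of basic inspection described here is easy to reproduce. The sketch below prints a PDF’s document-information fields using the Python pypdf library rather than ExifTool; the file name is hypothetical, and the fields shown are the ones discussed above.

```python
# Sketch of basic PDF metadata inspection with the pypdf library. The file
# name is hypothetical; many real PDFs leave these fields empty or filled
# with generic values such as "administrator" or "user".
from pypdf import PdfReader

reader = PdfReader("research_paper.pdf")  # hypothetical file
info = reader.metadata  # may be None if the PDF carries no document info

if info is None:
    print("No document-information metadata found.")
else:
    for field in ("/Author", "/Creator", "/Producer", "/CreationDate", "/ModDate"):
        print(f"{field}: {info.get(field)}")
```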
Bibliographic software such as EndNote and Zotero allow metadata extraction, which enables users to import PDF files and automatically generate the appropriate bibliographic metadata. For example, Zotero performs this extraction by first searching for a match for the PDF on Google Scholar. If this search does not return a match, Zotero uses the embedded Digital Object Identifier (DOI) to perform the match. This method is not consistent: it often fails to retrieve any data, and in rare cases it retrieves the wrong data, which leads to incorrect references. Given what we saw happen to metadata when a file is uploaded, as in case study 1, and the nature of a PDF’s journey through template selection, editing, and publishing, it is no surprise that metadata are lost or diluted along the way.

Case Study 4

The fourth case study, conducted on DOCX files, provided an extensive set of metadata, some of which are unique to this file type. Creating a new Word document via the File Explorer context menu and attempting to extract metadata resulted in an error, as there were no readable metadata to extract until the file was accessed and saved. Once the file had some user input and was saved, the metadata were created and could be extracted. Microsoft Office files are containers for XML files that hold information about the document, such as formatting data, user information, edit history, and information about the document’s page count, word count, and so on; a DOCX file can be pictured as a compressed directory. Using ExifTool on the DOCX file allowed retrieval of the metadata from all these hidden files. The metadata included creation, modification, and edit information, such as number of edits and total edit time. Every element within the document (text, images, tables, etc.) has its own metadata attached, which are crucial for preserving the format of the document.

The next step in the test involved converting the DOCX file into PDF using two methods: (1) converting the document via the “Publish” save option within Microsoft Word, and (2) right-clicking the document and selecting the option to convert to an Adobe PDF. The results of the two methods varied slightly. Method 1 stripped all the metadata from the document and generated only default PDF metadata consisting of system metadata (file size, date, time, permissions) and the PDF version, author details, and document details. Method 2 behaved the same way except that some XMP metadata were created. Both methods resulted in no informative metadata remaining, as the majority of the XMP elements were empty fields or contained generic values such as the computer name as the author. All formatting and metadata unique to Microsoft Word were lost. This case study is an enlightening example of what can happen to metadata when a file is changed from one format to another.

HUMAN INTERVENTION

The human element is a requirement in digital preservation, as certain metadata, such as descriptive and administrative metadata, can only be created by humans. In fact, as Hart notes, user input is needed to record the majority of digital preservation metadata.12 The process can be tedious, as described by Wheatley.13 One of the examples described follows the processes in a repository from ingest to access, beginning with the creation of metadata and the managerial tasks that are necessary.
These tasks include using extraction tools and automation where possible. Using frameworks to record changes to metadata is required, and in some cases metadata must be stored externally to their digital objects. This allows multiple objects of the same type to utilise a generic set of metadata and avoid redundant data. However, although using a generic metadata set is convenient, a large collection of digital objects could be affected if the metadata are lost or damaged. The human element increases the risk of error drastically because there are numerous steps to metadata creation. Misconduct is also possible. Therefore, the less digital preservation relies on humans (and the easier the tasks are that require human input), the better. This can only be achieved by automating most processes and by training people to ensure they handle their responsibilities accurately, consistently, and completely. Learning from the results of case studies like those described in this paper will better prepare users working with digital objects.

DISCUSSION

To achieve the most authentic, consistent, and complete digital preservation, institutions must revise their preservation workflows and processes. This entails ensuring the initial processes within workflows are correct before processing digital content. The content must come from a credible source and have its authenticity approved. Participation from the donor of the digital content might be beneficial if they can provide information and metadata about the content. This information could provide additional context for the content as well as identify its history (e.g., format migration or modification). This is not always possible, as the donor is not always the creator of the digital content. If the original source is no longer available, as much information as possible should be gathered from the donor about the acquisition of the content and any information regarding the original source.

This should be considered and carefully monitored throughout the lifecycle of digital content. Granted, if no changes are needed, devices such as write blockers can ensure this, as they restrict users and any systems from making unwanted changes or “writes.” However, changes are sometimes unavoidable and, even when they do not affect the content itself, can be detrimental. When changes are required, it is crucial to maintain the digital history by capturing all metadata added, removed, or modified during processing, commonly known as the “change history.”

Donor participation should be stipulated in a donor agreement, something that each institution offers to all donors, sometimes in the form of agreements reached through communication and often with a structured document. Donor-agreement policies differ for each institution: some are quite detailed, allowing donors to carefully stipulate their conditions, whereas others place most of the responsibility on the receiving institution. When dealing with sensitive or historic data of importance, policies should be in place to capture adequate data from the donor. When the content does not fall into this category, standard procedures, which should be present in all donor agreements and institution policies, can be followed. Institutions must also consider when to apply these steps, as some transactions between donor and institution can follow standard protocol; others are more complex, such as donations of content with diverse provenance issues.
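One small piece of the automation and change monitoring argued for above is fixity checking: recording a cryptographic checksum for each object so that any later change, wanted or not, is detectable. The following is a generic sketch using only Python’s standard library, with hypothetical paths; it illustrates the practice rather than a tool described by the authors.

```python
# Generic fixity-check sketch: record SHA-256 checksums at ingest, then
# re-run to detect any change, intended or not. Paths are hypothetical.
import hashlib
import json
from pathlib import Path

def sha256(path: Path) -> str:
    h = hashlib.sha256()
    with path.open("rb") as f:
        for chunk in iter(lambda: f.read(65536), b""):
            h.update(chunk)
    return h.hexdigest()

collection = Path("ingest")            # hypothetical directory of objects
manifest_path = Path("manifest.json")  # checksums recorded at ingest

if not manifest_path.exists():
    manifest = {str(p): sha256(p) for p in collection.rglob("*") if p.is_file()}
    manifest_path.write_text(json.dumps(manifest, indent=2))
else:
    manifest = json.loads(manifest_path.read_text())
    for name, recorded in manifest.items():
        if sha256(Path(name)) != recorded:
            # Any mismatch should be investigated and, if legitimate,
            # captured in the object's change history.
            print(f"CHANGED: {name}")
```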
CONCLUSION

We have presented four case studies that illustrate how vulnerable digital-object metadata are. These examples show that common methods of handling files can cause irretrievable loss of important information. We discovered significant loss of metadata when uploading photos to social media and when converting a file to another format. The digital footprint left behind by photo manipulation was also exposed. We shed light on the bibliographic-metadata generation of PDF files, how these metadata are obtained, and the surrounding issues. Action is needed to ensure proper metadata creation and preservation for born-digital objects. Librarians and archivists must place a greater emphasis on why digital objects are preserved as well as how and when users may need to access them. Therefore, all types of metadata must be captured to allow users from all disciplines to take advantage of historical data for many years to come. Given the rate of technological change, we must be prepared; observing first-hand the vulnerability of metadata is a step toward a safer future for our digital history.

REFERENCES

1 “Charter on the Preservation of Digital Heritage,” UNESCO, October 15, 2003, http://portal.unesco.org/en/ev.php-URL_ID=17721&URL_DO=DO_TOPIC&URL_SECTION=201.html.

2 K. Rechert et al., “bwFLA—A Functional Approach to Digital Preservation,” PIK—Praxis der Informationsverarbeitung und Kommunikation 35, no. 4 (2012): 259–67.

3 K. Rechert et al., “Design and Development of an Emulation-Driven Access System for Reading Rooms,” Archiving Conference 2014 (Society for Imaging Science and Technology, 2014), 126–31.

4 M. Phillips et al., “The NDSA Levels of Digital Preservation: Explanation and Uses,” Archiving Conference 2013 (Society for Imaging Science and Technology, 2013), 216–22.

5 “PREMIS: Preservation Metadata Maintenance Activity,” Library of Congress, accessed March 10, 2016, http://www.loc.gov/standards/premis/.

6 R. Gartner and B. Lavoie, Preservation Metadata, 2nd ed. (York, UK: Digital Preservation Coalition, 2013), 5–6.

7 PREMIS Editorial Committee, PREMIS Data Dictionary for Preservation Metadata, Version 2.2 (Washington, DC: Library of Congress, 2012), http://www.loc.gov/standards/premis/v2/premis-2-2.pdf.

8 PREMIS Editorial Committee, PREMIS Schema, Version 3.0 (Washington, DC: Library of Congress, 2015), http://www.loc.gov/standards/premis/v3/premis-3-0-final.pdf.

9 Timothy Hart, “Metadata Standard for Future Digital Preservation” (Honours thesis, Flinders University, Adelaide, Australia, 2015).

10 J. R. Smith and P. Schirling, “Metadata Standards Roundup,” IEEE MultiMedia 13, no. 2 (April–June 2006): 84–88.

11 “Glossary,” Digital Preservation Coalition, accessed August 5, 2016, http://handbook.dpconline.org/glossary.

12 Hart, “Metadata Standard for Future Digital Preservation.”

13 Paul Wheatley, “Institutional Repositories in the Context of Digital Preservation,” Microform & Digitization Review 33, no. 3 (2004): 135–46.
10160 ---- Academic Libraries on Social Media: Finding the Students and the Information They Want

Heather Howard, Sarah Huber, Lisa Carter, and Elizabeth Moore

Heather Howard (howar198@purdue.edu) is Assistant Professor of Library Science; Sarah Huber (huber47@purdue.edu) is Assistant Professor of Library Science; Lisa Carter (carte241@purdue.edu) is Library Assistant; and Elizabeth Moore (moore658@purdue.edu) is Library Assistant and Student Supervisor at Purdue University.

Librarians from Purdue University wanted to determine which social media platforms students use, which platforms they would like the library to use, and what content they would like to see from the library on each of these platforms. We conducted a survey at four of the nine campus libraries to determine student social media habits and preferences. Results show that students currently use Facebook, YouTube, and Snapchat more than other social media types; however, students responded that they would like to see the library on Facebook, Instagram, and Twitter. Students wanted nearly all types of content from the libraries on Facebook, Twitter, and Instagram, but they did not want to receive business news or content related to library resources on Snapchat. YouTube was seen as a resource for library service information. We intend to use this information to develop improved communication channels, a clear social media presence, and a cohesive message from all campus libraries.

INTRODUCTION

In his book Tell Everyone: Why We Share and Why It Matters, Alfred Hermida states, “People are not hooked on YouTube, Twitter or Facebook but on each other. Tools and services come and go; what is constant is our human urge to share.”1 Libraries are places of connection, where people connect with information, technologies, ideas, and each other. As such, libraries look for ways to increase this connection through communication. Social media is a key component of how students communicate with classmates, families, friends, and other external entities. It is essential for libraries to communicate with students regarding services, collections, events, library logistics, and more.

Purdue University is a large, land-grant university located in West Lafayette, Indiana, with an enrollment of more than forty thousand. The Purdue Libraries consist of nine libraries, presented collectively on the social media platforms Facebook and Twitter since 2009 and YouTube since 2012. Going forward, the Purdue Libraries want to ensure they establish a cohesive message and brand, communicated to students on the platforms they use and will engage with. The purpose of this study was to determine which social media platforms the students are currently using, which platforms they would like the library to use, and what content they would like to see from the libraries on each of these platforms.
LITERATURE REVIEW

Academic Libraries and Social Media

Academic libraries have been slow to accept social media as a venue for promoting their services or for academic purposes. A 2007 study of 126 academic librarians found that only 12 percent of those surveyed “identified academic potential or possible benefits” of Facebook, while 54 percent saw absolutely no value in social media.2 However, the mission of academic libraries has shifted in the last decade from being a repository of knowledge to being a conduit for information literacy; new roles include being a catalyst for on-campus collaboration and a facilitator for scholarly publication within contemporary academic librarianship.3 Academic librarians have responded to this change, with many now believing that “social media, which empowers libraries to connect with and engage its diverse stakeholder groups, has a vital role to play in moving academic libraries beyond their traditional borders and helping them engage new stakeholder groups.”4

Student Perceptions about Academic Libraries on Social Media

As the use of social media has grown with college-aged students, so has an increasing acceptance of academic libraries using social media to communicate. A Pew Research Center report from 2005 showed just 7 percent of eighteen- to twenty-nine-year-olds using social media. By 2016, 86 percent were using social media.5 In 2007 the OCLC asked 511 college students from six different countries to share their thoughts on libraries using social networking sites. This survey revealed that “most college students would be unlikely to participate in social networking services offered by a library,” with just 13 percent of students believing libraries have a place on social media.6 However, just two years later (in 2009), a shift was seen: students were open to connecting with academic libraries, as observed in a survey of 366 freshmen at Valparaiso University. When asked their thoughts on the library sending announcements and communications to them via Facebook or MySpace (a social media powerhouse at the time), 42.6 percent answered that they would be “more receptive to information received in this way than any other response.” A smaller group, 12.3 percent, responded more negatively to this approach. Students showed concern for their privacy and the level of professionalism, as a quote from a student illustrates: “Facebook is to stay in touch with friends or teachers from the past. Email is for announcements. Stick with that!!!”7

As students report becoming more open to academic libraries on social media, the question of whether they will engage through social media emerges. A recent study from Western Oregon University’s Hamersly Library asked this question with promising results. Forty percent of students said they were either “very likely” or “somewhat likely” to follow the library on Instagram and Twitter, as opposed to wanting communications sent to them directly through social media (for example, a Facebook message). Pinterest followed, with 33 percent of students saying they were either “very likely” or “somewhat likely” to follow the library using this platform.8 Throughout the literature, students have shown an interest in information about the libraries that is useful to them.
In another survey, given to undergraduate students from three information technology classes at Florida State University, one question examined the perceived importance to students of different library social media postings. The report showed students considered postings related to operations updates, study support, and events the most important.9 In the Hamersly study noted above, 78 percent and 87 percent of respondents said they were either “very interested” or “somewhat interested,” respectively, in every category relating to library resources presented in the survey, but “interesting/fun websites and memes” received the least interest from participants.10

The literature shows an increase in students being receptive to academic libraries on social media. Results vary campus to campus, and students are leery of libraries reaching out to them via social media, but they have an increasingly positive view of content posted that will help them use the library.

RESEARCH QUESTIONS

The aim of this project was to investigate the social media behaviors of Purdue University students as they relate to the libraries, and to develop evidence-based practices for managing the library’s social media accounts. The project focused on three research questions:

1. What social media platforms are students using?
2. What social media platforms do students want the library to use?
3. What kind of content do students want from the library on each of these platforms?

METHODS

We created the survey using the web-based Qualtrics survey software. It was distributed in electronic form only, and it was promoted to potential respondents via table tents in the libraries, bookmarks at the library desk, Facebook posts, and in-classroom promotion. Potential respondents were advised that the survey was anonymous and voluntary. The survey consisted of closed questions, though many questions contained an open-ended field for answers that did not fall into the provided choices. Inspiration for some of the options in our survey questions came from the Hamersly Library study, as we felt it did a good job capturing information about the social media usage of its patrons.11 Our survey asked what social media platforms students use, what they use them for, how often they visit the library, how likely they are to follow the library on social media, which platforms they want the library to have, and what content they would like from the library on each of those platforms. The social media platforms included were Facebook, Flickr, G+, Instagram, LinkedIn, Pinterest, Qzone, Renren, Snapchat, Tumblr, Twitter, YouTube, and Yik Yak.12 There were also open-ended spaces where participants could write in additional platforms.

The survey originally ran for three weeks in only the business library early in the spring 2017 semester, as its intended purpose was to inform how the business library would manage social media. After that survey was completed, we decided to replicate the survey in three additional libraries (humanities, social science, and education; engineering; and the main undergraduate libraries). This was done to expand the dataset and reach additional students in a variety of disciplines. These libraries were chosen because they were the libraries in which the authors work, with the hope of expanding to additional libraries in the future. The second survey also lasted for three weeks, starting in mid-April of the spring 2017 semester.
As a participation incentive, students who completed the initial survey and the second survey had an opportunity to enter a drawing for a $25 Visa gift card. The survey was advertised across four different campus libraries and promoted in several ways to reach different populations. Though the results are not from a random sample of the student population, they are broad enough that we intend to apply them to our entire student population.

RESULTS

Survey

The survey was completed by 128 students. An additional 13 students began the survey but did not complete it; we removed their results from the analysis. The breakdown of respondents was 10 percent freshmen (n = 13), 22 percent sophomores (n = 28), 27 percent juniors (n = 35), 20 percent seniors (n = 25), and 21 percent graduate or professional students (n = 27).

Library Usage

The students were asked how frequently they visit the library to determine if the survey was reaching a population of regular or infrequent library visitors. The results showed that the students who completed the survey were primarily frequent library users, with 93 percent (n = 119) visiting once a week or more.

Social Media Platforms

The students were asked to identify which social media platforms they used and how frequently they used them. The most popular social media platforms were determined by combining the number of students who said they used them daily or weekly. The top five were Facebook (n = 114, 88 percent), YouTube (n = 102, 79 percent), Snapchat (n = 90, 70 percent), Instagram (n = 85, 66 percent), and Twitter (n = 41, 32 percent). Full results are in table 1.

Table 1. Usage frequency by platform

Platform | Daily | Weekly | Monthly | < Once per Month | Never
Facebook | 94 (72.87%) | 20 (15.50%) | 5 (3.88%) | 5 (3.88%) | 4 (3.10%)
Flickr | 0 (0.00%) | 1 (0.78%) | 2 (1.55%) | 8 (6.20%) | 117 (90.70%)
G+ | 3 (2.33%) | 6 (4.65%) | 4 (3.10%) | 16 (12.40%) | 99 (76.74%)
Instagram | 68 (52.71%) | 17 (13.18%) | 5 (3.88%) | 11 (8.53%) | 27 (20.93%)
LinkedIn | 9 (6.98%) | 29 (22.48%) | 22 (17.05%) | 22 (17.05%) | 46 (35.66%)
Pinterest | 12 (9.30%) | 12 (9.30%) | 16 (12.40%) | 19 (14.73%) | 69 (53.49%)
Qzone | 0 (0.00%) | 0 (0.00%) | 0 (0.00%) | 4 (3.10%) | 124 (96.12%)
Renren | 0 (0.00%) | 0 (0.00%) | 1 (0.78%) | 3 (2.33%) | 124 (96.12%)
Snapchat | 84 (65.12%) | 6 (4.65%) | 6 (4.65%) | 7 (5.43%) | 25 (19.38%)
Tumblr | 7 (5.43%) | 2 (1.55%) | 7 (5.43%) | 11 (8.53%) | 101 (78.29%)
Twitter | 28 (21.71%) | 13 (10.08%) | 12 (9.30%) | 9 (6.98%) | 66 (51.16%)
YouTube | 58 (44.96%) | 44 (34.11%) | 15 (11.63%) | 4 (3.10%) | 7 (5.43%)
Yik Yak | 0 (0.00%) | 0 (0.00%) | 0 (0.00%) | 11 (8.53%) | 117 (90.70%)
Other: Email | 1 (0.78%) | 0 (0.00%) | 0 (0.00%) | 0 (0.00%) | 0 (0.00%)
Other: GroupMe | 3 (2.33%) | 1 (0.78%) | 0 (0.00%) | 0 (0.00%) | 0 (0.00%)
Other: Reddit | 2 (1.55%) | 2 (1.55%) | 0 (0.00%) | 0 (0.00%) | 0 (0.00%)
Other: Skype | 0 (0.00%) | 0 (0.00%) | 0 (0.00%) | 1 (0.78%) | 0 (0.00%)
Other: Vine | 0 (0.00%) | 1 (0.78%) | 0 (0.00%) | 0 (0.00%) | 0 (0.00%)
Other: WeChat | 3 (2.33%) | 0 (0.00%) | 0 (0.00%) | 0 (0.00%) | 0 (0.00%)
Other: Weibo | 1 (0.78%) | 0 (0.00%) | 0 (0.00%) | 0 (0.00%) | 0 (0.00%)
Other: WhatsApp | 1 (0.78%) | 0 (0.00%) | 0 (0.00%) | 0 (0.00%) | 0 (0.00%)

Social Media Activity

Next, students were asked how much time they spend on social media doing the following activities: watching videos, keeping in touch with friends/family, sharing photos, keeping in touch with classmates/professors, learning about campus events, doing research, getting news, or following public figures.
Table 2 shows that students overwhelmingly use social media daily or weekly to watch videos (94 percent, n = 120), keep in touch with family/friends (93 percent, n = 119), and get news (81 percent, n = 104). The least popular activities, those that students do less than once per month or never, were doing research (47 percent, n = 60) and following public figures (34 percent, n = 45).

Table 2. Social media activity

Activity | Daily | Weekly | Monthly | < Once per Month | Never
Watch videos | 85 (66.41%) | 35 (27.34%) | 1 (0.78%) | 4 (3.13%) | 3 (2.34%)
Keep in touch with friends/family | 89 (69.53%) | 30 (23.44%) | 6 (4.69%) | 2 (1.56%) | 1 (0.78%)
Share photos | 32 (25.00%) | 33 (25.78%) | 38 (29.69%) | 20 (15.63%) | 5 (3.91%)
Keep in touch with classmates/professors | 34 (26.56%) | 47 (36.72%) | 21 (16.41%) | 19 (14.84%) | 7 (5.47%)
Learn about campus events | 24 (18.75%) | 53 (41.41%) | 29 (22.66%) | 18 (14.06%) | 4 (3.13%)
Do research | 24 (18.75%) | 26 (20.31%) | 18 (14.06%) | 23 (17.97%) | 37 (28.91%)
Get news | 66 (51.56%) | 38 (29.69%) | 7 (5.47%) | 9 (7.03%) | 8 (6.25%)
Follow public figures | 34 (26.56%) | 30 (23.44%) | 20 (15.63%) | 19 (14.84%) | 24 (18.75%)
Other | 2 (1.56%) | 0 (0%) | 0 (0%) | 0 (0%) | 0 (0%)

Social Media and the Library

The students were asked how likely they are to follow the libraries on social media. The response to this was primarily positive, with 57 percent of respondents saying they are either extremely likely or somewhat likely to follow the library. One response to this question was inexplicably null, so n = 127 for this question. Figure 1 contains the full results.

Figure 1. Library social media follows. [Bar chart of responses to "How likely are you to follow the library on social media?": Extremely likely 12; Somewhat likely 66; Neither likely nor unlikely 23; Somewhat unlikely 16; Extremely unlikely 10.]

The students were asked which social media platforms they thought the library should be on. Five rose to the top of the results: Facebook (82 percent, n = 105), Instagram (55 percent, n = 70), Twitter (40 percent, n = 51), Snapchat (34 percent, n = 44), and YouTube (29 percent, n = 37). Full results can be seen in figure 2. After a student selected a platform they wanted the library to be on, logic built into the survey then directed them to an additional question that asked what content they would like to see from the library on that platform. Content included library logistics (hours, events, etc.), research techniques and tips, how to use library resources and services, library resource info (database instruction/tips, journal availability, etc.), business news, library news (e.g., if the library wins an award), campus-wide info/events, and interesting/fun websites and memes. For Facebook, students widely selected all types of content, with the most selections made for library logistics (n = 73) and the fewest made for business news (n = 33). For Instagram, students wanted all content except business news (n = 18). Snapchat was similar, except that along with business news (n = 8), students also were not interested in receiving content related to library resource information (n = 9). Twitter was similar to Facebook in that all content was widely selected.
YouTube had a focus on library services, with the three most-selected content options being research techniques and tips (n = 20), how to use library resources and services (n = 19), and library resource info (n = 16). Table 3 contains the full results.

Figure 2. Library social media presence. [Bar chart of responses to "What social media platform should the library be on?": Facebook 105; G+ 7; Instagram 70; LinkedIn 23; Pinterest 10; Qzone 1; Renren 1; Snapchat 44; Tumblr 5; Twitter 51; YouTube 37.]

Table 3. Library social media content by platform ("What type of content would you like to see from the library?")

Content Type | Facebook (n = 105) | G+ (n = 7) | Instagram (n = 70) | LinkedIn (n = 23) | Pinterest (n = 10) | Snapchat (n = 44) | Tumblr (n = 5) | Twitter (n = 51) | YouTube (n = 37)
Library logistics (hours, events, etc.) | 73 (69.52%) | 2 (28.57%) | 34 (48.57%) | 7 (30.43%) | 4 (40%) | 23 (52.27%) | 2 (40%) | 32 (62.75%) | 8 (21.62%)
Research techniques & tips | 52 (49.52%) | 3 (42.85%) | 28 (40%) | 13 (56.53%) | 7 (70%) | 19 (43.18%) | 3 (60%) | 27 (52.94%) | 20 (54.05%)
How to use library resources & services | 53 (50.48%) | 3 (42.85%) | 26 (37.14%) | 8 (34.78%) | 7 (70%) | 16 (36.36%) | 3 (60%) | 25 (49.02%) | 19 (51.35%)
Library resource info (database instruction/tips, journal availability, etc.) | 53 (50.48%) | 3 (42.85%) | 22 (31.42%) | 8 (34.78%) | 6 (60%) | 9 (20.45%) | 2 (40%) | 23 (45.10%) | 16 (43.24%)
Business news | 33 (31.43%) | 2 (28.57%) | 18 (25.71%) | 13 (56.52%) | 3 (30%) | 8 (18.18%) | 2 (40%) | 17 (33.33%) | 7 (18.92%)
Library news (e.g., if the library wins an award) | 49 (46.67%) | 3 (42.85%) | 37 (52.86%) | 12 (52.17%) | 5 (50%) | 19 (43.18%) | 3 (60%) | 24 (47.06%) | 7 (18.92%)
Campus-wide info/events | 73 (69.52%) | 3 (42.85%) | 42 (60%) | 5 (21.74%) | 5 (50%) | 26 (59.09%) | 2 (40%) | 35 (68.63%) | 13 (35.14%)
Interesting/fun websites & memes | 48 (45.71%) | 0 | 41 (58.57%) | 2 (8.70%) | 10 (100%) | 30 (68.18%) | 3 (60%) | 26 (50.98%) | 12 (32.43%)
Other | 1 (0.95%) | 0 | 2 (2.86%) | 0 | 1 (10%) | 2 (4.55%) | 0 | 2 (3.92%) | 1 (2.70%)

DISCUSSION

Historically, libraries have used social media as a marketing tool.13 With social media’s ever-increasing popularity with young adults, academic libraries have actively established a presence on several platforms.14 Our survey shows that our students follow this trend, using social media regularly and for a variety of activities. We were surprised that Facebook turned out to be the most widely used platform among our students, as much has been written in the last few years about teens and young adults leaving the platform.15 A November 2016 survey, however, found that 65 percent of teens said they used Facebook daily, a large increase from 59 percent in November 2014. Though Snapchat and Instagram are preferred, teens continue to use Facebook for its utility in scheduling events or keeping in touch regarding homework.16 Students do seem receptive to following the library on different platforms, and they report wanting primarily library-related content from us, including more in-depth content such as research techniques and database instruction.

LIMITATIONS AND FUTURE WORK

Findings from this study give insight into opportunities for libraries to reach university students through social media. We acknowledge that only limited generalizations can be made because of the way the survey was conducted.
Our internal recruitment methods led to a selection bias in our surveyed population, as advertisement of the survey took place either in the chosen libraries or on the Purdue Libraries’ existing Facebook page. Because of this, our sample consists primarily of students who visit the library or already follow the library on Facebook. We hope to alter this in future surveys by expanding our recruitment to other physical spaces across campus. In addition, we plan to add questions that first establish a better understanding of students’ opinions of libraries being on social media before asking what social media they would like to see libraries use. This would potentially avoid leading students to an answer. Further, we are concerned we took for granted students’ understanding of library resources; that is, we may have made distinctions librarians understand, but students may not. In future studies, we plan to rephrase, and possibly combine, questions in a way that will be clear to people less familiar with library resources and services. We believe confusion with these questions created contradictory responses. For example, “research help through social media” received a low response rate, but “information on research techniques and tips” received a much higher response rate. Additionally, a limitation of using a survey to collect behavior information is that respondents do not always report how they actually behave. Using methods such as focus groups, interviews, text mining, or usability studies could provide a more holistic view of student behavior. Duplication of this study on a yearly or semi-yearly basis across all libraries could help us see how social media preferences change over time and across a larger sample of our population. This study aimed to provide a broad view of a large university’s student body by surveying across different subject libraries. With the changes discussed, we think a revised survey could give us the detailed information we need to build a more effective social media strategy that reaches both library users and non-users.

CONCLUSION

This study improved our understanding of the social media usage and preferences of Purdue students. From these results, we intend to develop better communication channels, a clear social media presence, and a more cohesive message across the Purdue Libraries. Under the direction of our new director of strategic communication, a social media committee was formed with representatives from each of the libraries to contribute content for social media. The committee will consider expanding the Purdue Libraries’ social media presence to communication channels where students have said they are and would like us to be. As social media usage is ever-changing, we recommend repeated surveys such as this to better understand where on social media students want to see their libraries and what information they want to receive from them.

REFERENCES

1 Alfred Hermida, Tell Everyone: Why We Share and Why It Matters (Toronto: Doubleday Canada, 2014), 1.

2 Laurie Charnigo and Paula Barnett-Ellis, “Checking Out Facebook.com: The Impact of a Digital Trend on Academic Libraries,” Information Technology and Libraries 26, no. 1 (March 2007): 23–34, https://doi.org/10.6017/ital.v26i1.3286.
3 Stephen Bell, Lorcan Dempsey, and Barbara Fister, New Roles for the Road Ahead: Essays Commissioned for the ACRL's 75th Anniversary (Chicago: Association of College and Research Libraries, 2015).

4 Amanda Harrison et al., "Social Media Use in Academic Libraries: A Phenomenological Study," Journal of Academic Librarianship 43, no. 3 (May 1, 2017): 248-56, https://doi.org/10.1016/j.acalib.2017.02.014.

5 "Social Media Fact Sheet," Pew Research Center, January 12, 2017, http://www.pewinternet.org/fact-sheet/social-media/.

6 Online Computer Library Center, Sharing, Privacy and Trust in Our Networked World: A Report to the OCLC Membership (Dublin, Ohio: OCLC, 2007), https://eric.ed.gov/?id=ED532599.

7 Ruth Sara Connell, "Academic Libraries, Facebook and MySpace, and Student Outreach: A Survey of Student Opinion," Portal: Libraries and the Academy 9, no. 1 (January 8, 2009): 25-36, https://doi.org/10.1353/pla.0.0036.

8 Elizabeth Brookbank, "So Much Social Media, So Little Time: Using Student Feedback to Guide Academic Library Social Media Strategy," Journal of Electronic Resources Librarianship 27, no. 4 (2015): 232-47, https://doi.org/10.1080/1941126X.2015.1092344.

9 Besiki Stvilia and Leila Gibradze, "Examining Undergraduate Students' Priorities for Academic Library Services and Social Media Communication," Journal of Academic Librarianship 43, no. 3 (May 1, 2017): 257-62, https://doi.org/10.1016/j.acalib.2017.02.013.

10 Brookbank, "So Much Social Media, So Little Time."

11 Stvilia and Gibradze, "Examining Undergraduate Students' Priorities."

12 Qzone and Renren are Chinese social media platforms.

13 Curtis R. Rogers, "Social Media, Libraries, and Web 2.0: How American Libraries are Using New Tools for Public Relations and to Attract New Users," South Carolina State Library, May 22, 2009, http://dc.statelibrary.sc.gov/bitstream/handle/10827/6738/SCSL_Social_Media_Libraries_2009-5.pdf?sequence=1; Jakob Harnesk and Marie-Madeleine Salmon, "Social Media Usage in Libraries in Europe—Survey Findings," LinkedIn SlideShare slideshow presentation, August 10, 2010, https://www.slideshare.net/jhoussiere/social-media-usage-in-libraries-in-europe-survey-teaser.

14 "Social Media Fact Sheet."

15 Daniel Miller, "Facebook's so Uncool, but It's Morphing into a Different Beast," The Conversation, 2013, http://theconversation.com/facebooks-so-uncool-but-its-morphing-into-a-different-beast-21548; Ryan Bradley, "Understanding Facebook's Lost Generation of Teens," Fast Company, June 16, 2014, https://www.fastcompany.com/3031259/these-kids-today; Nico Lang, "Why Teens Are Leaving Facebook: It's 'Meaningless,'" Washington Post, February 21, 2015, https://www.washingtonpost.com/news/the-intersect/wp/2015/02/21/why-teens-are-leaving-facebook-its-meaningless/?utm_term=.1f9dd4903662.
16 Alison McCarthy, "Survey Finds US Teens Upped Daily Facebook Usage in 2016," eMarketer, January 28, 2017, https://www.emarketer.com/Article/Survey-Finds-US-Teens-Upped-Daily-Facebook-Usage-2016/1015053.

10170 ---- The Provision of Mobile Services in US Urban Libraries

Ya Jun Guo, Yan Quan Liu, and Arlene Bielefield

Ya Jun Guo (yadon0619@hotmail.com) is Associate Professor of Information and Library Science at Zhengzhou University of Aeronautics, China. Yan Quan Liu (liuy1@southernct.edu) is Professor of Information and Library Science at Southern Connecticut State University. Arlene Bielefield (bielefielda1@southernct.edu) is Professor in Information and Library Science at Southern Connecticut State University.

ABSTRACT

To determine the present situation regarding services provided to mobile users in US urban libraries, the authors surveyed 138 Urban Libraries Council members utilizing a combination of mobile visits, content analysis, and librarian interviews. The results show that nearly 95% of these libraries have at least one mobile website, mobile catalog, or mobile app. The libraries actively applied new approaches to meet each local community's remote-access needs via new technologies, including app download links, mobile reference services, scan ISBN, location navigation, and mobile printing. Mobile services that libraries provide today are timely, convenient, and universally applicable.

INTRODUCTION

The mobile internet has had a major impact on people's lives and on how information is found, located, and accessed. Today, library patrons are untethered from, and free of the limitations of, the desktop computer.1 The popularity of mobile devices has changed the relationship between libraries and patrons. Mobile technology allows libraries to have the kind of connectivity with their patrons that did not exist previously. Patrons no longer think that it is necessary for them to be physically in the library building to use library services, and they are eager to obtain 24/7 access to library resources anywhere using their mobile devices. Mobile patrons need mobile libraries to provide them with services.
In other words, "patrons want to have a library in their pocket."2 As a result, libraries around the world are exploring and developing mobile services.

According to the State of America's Libraries 2017 report by the American Library Association, the 50 US states, the District of Columbia, and outlying territories have 8,895 public library administrative units (as well as 7,641 branches and bookmobiles). The vital role public libraries play in their communities has also expanded.3 As part of the main role of public libraries, US urban libraries need to embrace the developmental trend of the mobile internet to better serve their communities. The provision of mobile services in US urban libraries is worthy of study and is of great significance as a model for how other public libraries plan and implement their mobile services.

LITERATURE REVIEW

Definition and Types of Mobile Devices and Mobile Services

As early as 1991, Mark Weiser proposed "ubiquitous computing," pointing out how people could obtain and handle information at any time, anywhere, and in any way.4 With this expectation, the possibilities of using personal digital assistants (PDAs) as mobile web browsers were researched in 1995.5 In combination with a wireless modem, library users are able to use PDAs to access information services whenever they are needed. Today, mobile devices are generally defined as units small enough to carry around in a pocket, falling into the categories of PDAs, mobile phones, and personal media players.6 For many researchers, laptops are not included in the definition of mobile devices. Although wireless laptops purportedly offer the opportunity to go "anywhere in the home," laptops are generally used in a small set of locations, rather than moving fluidly through the home; wireless laptops are portable, but not mobile.7 In contrast, Lippincott suggested that mobile devices should include laptops, netbooks, notebook computers, cell phones, audio players such as MP3 players, cameras, and other items.8 According to the "Mobile Strategy Report" by the California Digital Library, mobile phones, e-readers, MP3 players, tablets, gaming devices, and PDAs are common mobile devices.9 Each mobile device has its own characteristics and the potential to connect to the internet from anywhere with a Wi-Fi network, driving widespread use and thus the provision of library mobile services.

Mobile services are services libraries offer to patrons via their mobile devices. These services as described herein comprise two categories: traditional library services modified to be available via mobile devices and services created for mobile devices.10 Pope et al. listed several mobile services, including SMS or text-messaging services, the My Info Quest Project, digital collections, audiobooks, applications, and mobile-friendly websites.11 The California Digital Library pointed out that a growing number of university and public libraries are offering mobile services.
Libraries are creating mobile versions of library websites, using text messaging to communicate with patrons, developing mobile catalog searching, providing access to resources, and creating new tools and services, particularly for mobile devices.12 The most recognized mobile services in university libraries are mobile sites, mobile apps, mobile OPACs, mobile access to databases, text messaging services, QR codes, augmented reality, and e-books.13 Both academic and public libraries' use of Web 2.0 applications and services includes blogs, wikis, phone apps, QR codes, mash-ups, video or audio sharing, customized webpages, social media and social networking, and types of social tagging.14 This study focuses on the two most common mobile devices, mobile phones and tablets, and on the services provided to library patrons and local communities through mobile websites, mobile apps, and mobile catalogs.

Status of Mobile Services in US Libraries

Mobile devices present a new and exciting opportunity for libraries of all types to provide information to people of all ages on the go, wherever they are.15 It is generally observed that there is an increased use of mobile technology in the library environment. Librarians see their users increasingly using mobile phones instead of laptops and desktop computers to search the catalog, check the library's opening hours, and maintain contact with library staff.16

In an earlier investigation of 766 librarians, Spires found that there was very little demand for services for mobile devices as of August 2007. At that time, relatively few libraries (18%) purchased content specifically for wireless handheld device use, and very few libraries (15%) reformatted content for these devices.17 However, a survey of public libraries completed by the American Library Association between September and November 2011 indicated interesting changes: 15% of library websites were optimized for mobile devices, 12% of libraries used scanned codes (e.g., QR codes), and 7% of libraries had developed smartphone applications for access to library services; 36% of urban libraries had websites optimized for mobile devices, compared to 9% of rural libraries; 76% of libraries offered access to e-books; and 70% of libraries used social networking tools such as Facebook.18

Later studies revealed more significant changes. Ninety-nine Association of Research Libraries member libraries were surveyed in 2012 to identify how many had optimized at least some services for the mobile web. Apps were not investigated. The result showed that 83 libraries (84%) had a mobile website.19 A study in 2015 by Liu and Briggs showed that the top 100 university libraries in the United States offered one or more mobile services, with mobile websites, mobile access to the library catalog, mobile access to the library's databases, e-books, and text messaging services being the most common. QR codes and augmented reality were less common.20 Kim noted that "libraries are acknowledging that people expect to do just about everything on mobile devices and that more and more people are now using a mobile device as their primary access point for the Web."21 Although librarians may have previously underestimated what people wanted to do using mobile devices, there is a growing understanding of the potential of these access points.
RESEARCH DESIGN

Survey Samples

While a growing number of users tend to access information remotely, urban libraries, as the most popular public-sector institutions and community centers, are facing great challenges in addressing the growing need for mobile services. The Urban Libraries Council (ULC) (https://www.urbanlibraries.org), an authoritative source founded in 1971, is the premier membership association of North America's leading public library systems. ULC's member libraries are in communities throughout the United States and Canada, comprising a mix of institutions with varying revenue sources and governance structures, and serving communities with populations of differing sizes. ULC's website lists 145 US and Canadian urban libraries. Since this study focused only on US urban libraries, 138 libraries were chosen as the study targets, and all were examined.

Table 1. The survey and examples of survey results. For each content area, the options are listed first, followed by the options observed at two example libraries: Example No. 1 (Pima County Public Library) and Example No. 138 (Milwaukee Public Library).

Components of mobile websites
Options: 1 Account login; 2 Catalog search; 3 Contact us; 4 Downloadables; 5 Events; 6 Interlibrary loan; 7 Kids & teens; 8 Locations and hours; 9 Meeting room; 10 Recent arrivals; 11 Recommendations; 12 Social media; 13 Suggest a purchase; 14 Support
Example No. 1: 1, 2, 3, 4, 5, 7, 8, 9, 10, 12, 13, 14. Example No. 138: 1, 2, 3, 4, 5, 7, 8, 9, 12, 13, 14.

Components of mobile apps
Options: 1 Account login; 2 Barcode Wallet; 3 Bestsellers; 4 Catalog search; 5 Contact us; 6 Downloadables; 7 Events; 8 Full website; 9 Interlibrary loan; 10 Just ordered; 11 Kids & teens; 12 Locations and hours; 13 Meeting room; 14 My Bookshelf; 15 My library; 16 Pay fines; 17 Popular this week; 18 Recent arrivals; 19 Recommendations; 20 Scan ISBN; 21 Social media; 22 Suggest a purchase; 23 Support
Example No. 1: 1, 4, 5, 6, 7, 8, 12, 15, 18, 20, 21. Example No. 138: 1, 4, 5, 6, 7, 8, 12, 17, 20, 21.

Mobile reference services
Options: 1 Chat/IM; 2 Social media; 3 Text/SMS; 4 Web form
Example No. 1: --. Example No. 138: 1, 3, 4.

Social media
Options: 1 Blog; 2 Facebook; 3 Flickr; 4 Goodreads; 5 Google+; 6 Instagram; 7 LinkedIn; 8 Pinterest; 9 Tumblr; 10 Twitter; 11 YouTube
Example No. 1: 1, 2, 3, 6, 8, 10, 11. Example No. 138: 1, 2, 6, 8, 10.

Mobile reservation services
Options: 1 Reserve a computer; 2 Reserve a librarian; 3 Reserve a meeting room; 4 Reserve a museum pass; 5 Reserve a study room; 6 Reserve exhibit space
Example No. 1: --. Example No. 138: 3.

Mobile printing
Options: 1 Mobile printing; 2 No mobile/Wi-Fi printing; 3 Wi-Fi printing
Example No. 1: 3. Example No. 138: 2.

Apps or databases
Options: 1 Axis 360; 2 BiblioBoard; 3 BookFlix; 4 Brainfuse; 5 Career Transitions; 6 Cloud Library; 7 Driving-Tests.org; 8 EBSCOhost; 9 Flipster; 10 Freading; 11 Freegal; 12 Gale Virtual; 13 Hoopla; 14 InstantFlix; 15 LearningExpress; 16 Lynda.com; 17 Mango Languages; 18 MasterFILE; 19 Morningstar; 20 New York Times; 21 NoveList; 22 OneClick Digital; 23 Overdrive; 24 ReferenceUSA; 25 Safari; 26 TumbleBook; 27 Tutor.com; 28 World Book; 29 WorldCat; 30 Zinio
Example No. 1: 4, 11, 14, 22, 23, 26, 28, 30. Example No. 138: 4, 8, 11, 12, 13, 15, 17, 18, 19, 21, 23, 24, 30.

Survey Methods

As mobile services are offered basically via wireless systems and mobile devices, a combination of research methods, including mobile website visits, content analysis, and librarian interviews, was applied for data collection.
Specifically, librarian interviews were employed as a verification and supplemental process to ensure that survey data were accurate and exhaustive.

First, the authors utilized an iPhone, an Android mobile phone, and an iPad to access the websites of the 138 US urban libraries in the study sample to ascertain whether these libraries have mobile websites or mobile catalogs and whether the platforms operated properly. Then the authors checked whether these libraries have mobile apps that can be downloaded from the Apple app store or the Google Play store. This part of the survey was conducted from June 18 to June 24, 2017.

Next, the authors went through all the mobile websites and the mobile apps the libraries provide to check the mobile services offered. The authors used a specially designed survey to collect data about each library's mobile website and app (see table 1). The survey content analysis was conducted between June 25 and July 24, 2017, with the examination of each library's services taking approximately 30 minutes.

Finally, for those libraries that had no mobile websites or mobile apps found through the website visits, the authors made interview requests to staff librarians via their online reference services, such as live chat, web form, and email. An additional purpose of this step was to confirm the accuracy of the survey data collected from website visits. The interviews were conducted from July 22 to August 3, 2017.

RESULTS AND ANALYSIS

Results from the examination of mobile website visits, content analysis, and librarian interviews revealed what services US urban libraries provided as mobile services, how they were provided, and which were commonly provided.

How Many Libraries Provide Mobile Services?

Over 83% of US urban libraries have developed their own mobile websites (see figure 1) for the communities they serve. The mobile website is currently the most popular service platform for mobile users.

Figure 1. Types of mobile services provided by libraries: mobile website 83%; mobile app 59%; mobile catalog 22%.

Promisingly, each test of these websites through the authors' mobile devices, either smartphones or tablets, confirmed that all the study subjects could be accessed 100% of the time. These library websites, however, are not entirely built specially for mobile devices. While the majority of urban libraries have transformed their desktop websites into mobile sites with proper responsive design, about 17% are just smaller versions of their desktop websites (see figure 2). A responsive mobile website can react or change according to the needs of the users and the mobile device they're viewing it on to achieve a good layout and content display. Here, text and images change from a three-column to a single-column layout, and unnecessary images are hidden. The web address of a responsively designed mobile website is the same as the desktop website's. Responsive design is described as a long-term solution for addressing both designers' and users' needs.22

The survey found that 59% of libraries now have apps. Our analysis of the earliest app version records indicates that Los Angeles Public Library was the first to use an app, in August 2010. Mobile apps have advantages and disadvantages compared to mobile websites, and many libraries compared them and chose between the two.
Skokie (Illinois) Public Library, as of October 2015, is no longer supporting the library's mobile app because they claim the library's website offers a better mobile experience. They also offer easy access similar to that of a mobile app, with a message displayed to users: "Miss having an icon on your home screen? Bookmark the site to your home screen and you'll have an icon to take you directly to this site."

Figure 2. The smaller versions of the desktop website and the specially designed mobile website.

The proportion of libraries providing mobile catalog services is only 22%. Libraries can use multiple options to create one or more mobile service platforms. Nearly half (46%) of US urban libraries have both mobile websites and mobile apps. According to the survey, 95% of libraries have at least one mobile website, mobile catalog, or mobile app. A survey the authors conducted in April 2014 found that only 81% of the urban libraries had at least one mobile website, mobile catalog, or mobile app (see figure 3). Clearly, libraries are paying increasing attention to mobile services, and providing mobile services has become the unavoidable choice of libraries nowadays.

Figure 3. Changes in the proportion of libraries that provide mobile services from 2014 to 2017: in 2014, 81% had at least one mobile service and 19% had none; in 2017, 95% had at least one mobile service and 5% had none.

What Content do the Mobile Websites Offer?

Through mobile website visits and content analysis, it was found that some types of information are available at all libraries, including "Account login," "Events," "Locations and hours," "Contact us," and "Social media" (see figure 4).

Figure 4. Components of mobile websites: Account login, Events, Locations and hours, Contact us, and Social media 100%; Catalog search 99%; Support 96%; Downloadables 95%; Kids & teens 86%; Meeting room 74%; Interlibrary loan 62%; Suggest a purchase 56%; Recent arrivals 32%; Recommendations 26%.

The proportion of library mobile sites that offer "Support" and "Downloadables" is 96% and 95%, respectively. Among them, "Support" generally includes donations to the library foundation, donation of books and other materials, and providing volunteer services; "Downloadables" generally include e-books, e-magazines, and music. A total of 86% of the urban libraries set up "Kids" and "Teens" sections, providing specialized information services, such as storytime, games, events, book lists, homework help, volunteer information, and college information. A majority (62%) of libraries provide interlibrary loan information on mobile websites, but one library, Palo Alto (California) City Library, no longer offers the costly interlibrary loan service as of July 2011. More than half (56%) of the libraries set up a "Suggest a purchase" function and generally ask readers to provide title, author, publisher, year published, format, and other information in a web form. Some libraries display "Recommendations" (26%) on their mobile websites. Denver Public Library has a special column recommending books for children and teenagers and offers personalized reading suggestions: "Tell us what you like to read and we'll send you our recommendations in about a week."

Many mobile websites will display prompts for the libraries' mobile apps and link to the Apple app store or the Google Play store after automatically identifying the user's mobile phone operating system.
This is helpful for promoting the use of the libraries' apps, and it also provides great convenience for users.

What Content do the Mobile Apps Offer?

The content of mobile websites in libraries is basically the same, but the content of their mobile apps varies widely. The reason is that libraries' understandings of the functions an app should offer differ from one library to another. Some of these apps were designed by software vendors, such as Boopsie, SirsiDynix, and BiblioCommons, but some were designed by the libraries themselves, leading to the absence of a uniform standard or template for the app design. Survey results show that only "Account login" and "Catalog search" are available in all apps (see figure 5). "Locations and hours" accounts for a high proportion of apps at 96%. The "Locations" feature in many libraries' apps, with the help of GPS, helps users find their nearest library location.

Figure 5. Components of mobile apps: Account login and Catalog search 100%; Locations and hours 96%; Downloadables 89%; Contact us 85%; Events 77%; Scan ISBN 75%; Social media 68%; Full website 46%; Recent arrivals 27%; Bestsellers 24%; Recently reviewed 19%; Popular this week 18%; Just ordered 16%; My library 16%; My Bookshelf 10%; Pay fines 6%; Barcode wallet 5%; Kids & teens 3%.

About 85% of apps provide "Contact us." Click "Contact us" in Poudre River Public Library District and some other libraries' apps, and you can directly call the library or send a message via text or email. "Scan ISBN" is a unique feature of mobile apps, and 75% of apps provide this functionality. If a library user finds a book they need in a bookstore or elsewhere, they can scan the ISBN to see if that book is in the library's collection.

Apps designed by BiblioCommons all have "Bestsellers," "Recently Reviewed," "Just Ordered," and "My library" (see figure 6). In "My library," the "Checked Out" section contains red alerts for "Overdue," yellow alerts for "Due Soon," and "Total items." The "Holds" section contains "Ready for pickup," "Active holds," and "Paused holds." The "My Shelves" section contains "Completed," "In Progress," and "For Later." In this way, users can clearly see the details of the books they have borrowed and intend to borrow. Apps designed by Boopsie generally have "Popular this week" to tell users which books have been borrowed most recently.

Figure 6. An app designed by BiblioCommons.

Only 3% of apps have "Kids" and "Teens" sections, which differs greatly from the percentage of mobile websites that offer those sections (86%).

What Mobile Reference Services do Libraries Provide?

According to the survey, the most common way for US urban libraries to provide mobile reference service is a web form, which is available in 86% of surveyed libraries (see figure 7). Compared with "Call us," a web form has the advantage of being independent of the library's working hours.
Although users fill out and submit a web form, it is similar to email in that librarians generally respond to the user's email address; however, it does not require users to enter their own email system, as they only need to fill in the content required by the web form. Therefore, it is more convenient to use. The authors believe that providing only an email address is not mobile reference service. The survey found that 6% of libraries do not have mobile reference services.

Figure 7. Mobile reference services provided by libraries: web form 86%; chat/IM 43%; text/SMS 33%; social media 8%.

Currently, 43% of libraries offer chat and instant messaging (IM) services, which allow users to communicate with librarians instantly. For example, when Gwinnett County (Georgia) Public Library's mobile website is visited, an "Ask Us" dialog box appears in the upper right corner of the site, which allows visitors to chat with librarians. Outside of the library's work hours, the box displays "sorry, chat is offline but you can still get help" (see figure 8). The County of Los Angeles Public Library provides four options for IM: AIM, Google Talk, Yahoo! Messenger, and MSN Messenger.

Figure 8. "Ask Us" on Gwinnett County Public Library's mobile website.

All the Florida urban libraries surveyed offer reference services via the web form, chat, and text because an "Ask a Librarian" service administered by the Tampa Bay Library Consortium provides Florida residents with those mobile reference services. The survey shows that only 8% of the libraries provide social media reference service in "Ask a librarian." The social media used for reference service is either Facebook or Twitter. In fact, 100% of libraries have social media, and 100% of libraries have Facebook and Twitter, but most libraries do not use them to provide reference services.

What Social Media do the Libraries Use?

Survey results showed that 100% of mobile websites display links to their social media, usually in a prominent position on the front page of the website; 68% of apps have social media links. Facebook and Twitter are the social media leaders, and now all libraries' mobile websites have both (see figure 9). The survey conducted in 2014 showed that Facebook and Twitter had the highest occupancy rate, but only 61% of libraries offered Facebook and 53% offered Twitter. It is obvious that libraries have made great progress in the last three years in the application of social media.

Figure 9. Social media being used by libraries: Facebook 100%; Twitter 100%; Instagram 76%; YouTube 67%; blog 57%; Pinterest 49%; Flickr 41%; Tumblr 19%; LinkedIn 12%; Google+ 12%; Goodreads 9%.

Instagram and Pinterest are both photo-based social media, used by 76% and 49% of libraries, respectively. As the leading social media platform in the video field, YouTube is used by 67% of libraries.

What Mobile Reservation Services do Libraries Provide?

Mobile reservation services were found in 78% of all libraries' mobile services. A majority (62%) of the libraries allow online reservation of a meeting room via web form or other forms, and 14% allow reserving a study room (see figure 10). Some libraries only allow reserving a study or meeting room by phone.

Figure 10. Mobile reservation services provided by libraries: reserve a meeting room 62%; reserve a computer 20%; reserve a museum pass 15%; reserve a study room 14%; reserve a librarian 14%; reserve exhibit space 4%.
A few libraries provide instant online access to free and low-cost tickets to museums, science centers, zoos, theatres, and other fun local cultural venues with Discover & Go. A total of 14% of the libraries provide a "reserve a librarian" service, allowing patrons to reserve a free session with a reference librarian or subject specialist at the library. In addition, several libraries, such as Pasadena Public Library, allow reserving exhibit space.

How Many Libraries Provide Mobile Printing?

Mobile printing services allow patrons to print to a library printer from outside the library or from their mobile device. Patrons' print jobs are available for pickup at the library. Already, 43% of the libraries provide mobile printing service (see figure 11). It is expected that more libraries will provide this service. To print from a mobile device, patrons need to download an app that supports mobile printing. PrinterOn is the most commonly used app; it has been used by Oakland Public Library, San Mateo County (California) Libraries, and others. However, San Diego Public Library uses the Your Print Cloud print system, and Santa Clara County (California) Library uses Smart Alec. San Mateo County Libraries offers wireless printing from smartphones, tablets, and laptops at all of its locations, and its wireless printing includes mobile printing, web printing, and email printing.

In addition, 14% of libraries offer wireless printing services but do not provide mobile printing services. For example, Live Oak Public Libraries in Savannah, Georgia, states that printing from laptops (PC and Mac) is available in all branches, but they don't have apps that support printing from tablets or mobile phones.

Figure 11. The proportion of libraries that offer mobile printing: mobile printing 43%; wireless printing only 14%; no wireless/mobile printing 42%.

What Apps or Databases do Libraries Provide for Patrons?

Four main software programs found to be used to display e-books at the surveyed libraries are Overdrive (93%), Hoopla (64%), TumbleBook (61%), and Cloud Library (48%). For audiobooks, Overdrive (93%) and Hoopla (64%) are the most popular; OneClick Digital is used by 48%. Most libraries (74%) use Zinio for e-magazines, and 48% use the music software Freegal. Overdrive is the most common application in libraries (see table 2).

Table 2. The proportion of apps or databases being used in libraries (% of libraries providing each).

Overdrive: 93
NoveList: 79
ReferenceUSA: 74
Zinio: 74
LearningExpress: 69
Gale Virtual: 68
Hoopla: 64
Morningstar: 64
Mango Languages: 61
TumbleBook: 61
Lynda.com: 57
WorldCat: 51
Freegal: 48
OneClick Digital: 48
Cloud Library: 48
World Book: 46
New York Times: 44
MasterFILE: 43
EBSCOhost: 43
Flipster: 29
BookFlix: 28
Brainfuse: 22
Tutor.com: 17
Safari: 17
Driving-Tests.org: 16
BiblioBoard: 12
Career Transitions: 12
Axis 360: 11
InstantFlix: 10
Freading: 9

The libraries provide users with various types of databases.
Survey statistics show that the widely used databases include ReferenceUSA (business), Mango Languages (language learning), LearningExpress and Career Transitions (job and career), Lynda.com and Tutor.com (education), Morningstar (investment), World Book (encyclopedias), WorldCat (library resources worldwide), New York Times (newspaper articles), Driving-Tests.org (testing preparation), and Safari (technology).

CONCLUSION

This study shows that mobile services have become popular in US urban libraries as of summer 2017, with 95% offering one or more types of mobile service. Responsive mobile websites and mobile apps are the main platforms of current mobile services. US urban libraries are striving hard to meet local communities' remote-access needs via new technologies.

Compared with desktop websites, mobile websites and apps for mobile devices offer services that are more accessible, smarter, and more interactive for local users. Some mobile websites automatically prompt the user to install the libraries' apps; many libraries' apps offer the "Scan ISBN" function, making it convenient for the user to scan a book title at any time to see if it is in the library's collection; "Location" provides GPS positioning and navigation services for users; and "Contact us" can directly link to telephone, text, and email. Libraries are actively developing and adding more mobile services, such as mobile reservation services and mobile printing services. The development of mobile technology has provided the support for libraries to offer mobile services. A future world of users accessing services provided by the libraries at any time, anywhere, and in any way is getting closer and closer.

ACKNOWLEDGEMENTS

This work was supported by grant no. 14CTQ028 from the National Social Science Foundation of China.

REFERENCES

1 Jason Griffey, Mobile Technology and Libraries (New York: Neal-Schuman, 2010).

2 Meredith Farkas, "A Library in Your Pocket," American Libraries no. 41 (2010): 38.

3 American Library Association, "The State of America's Libraries 2017: A Report from the American Library Association," special report, American Libraries, April 2017, http://www.ala.org/news/sites/ala.org.news/files/content/State-of-Americas-Libraries-Report-2017.pdf.

4 Mark Weiser, "The Computer for the 21st Century," Scientific American 265, no. 3 (1991): 94-104.

5 Stefan Gessler and Andreas Kotulla, "PDAs as Mobile WWW Browsers," Computer Networks and ISDN Systems 28, no. 1-2 (1995): 53-59.

6 Georgina Parsons, "Information Provision for HE Distance Learners Using Mobile Devices," Electronic Library 28, no. 2 (2010): 231-44, https://doi.org/10.1108/02640471011033594.

7 Allison Woodruff et al., "Portable, but Not Mobile: A Study of Wireless Laptops in the Home," International Conference on Pervasive Computing 4480 (2007): 216-33, https://doi.org/10.1007/978-3-540-72037-9_13.

8 Joan K. Lippincott, "A Mobile Future for Academic Libraries," Reference Services Review 38, no. 2 (2010): 205-13.
9 Rachel Hu and Alison Meir, "Mobile Strategy Report," California Digital Library, August 18, 2010, https://confluence.ucop.edu/download/attachments/26476757/CDL+Mobile+Device+User+Research_final.pdf?version=1.

10 Yan Quan Liu and Sarah Briggs, "A Library in the Palm of Your Hand: Mobile Services in Top 100 University Libraries," Information Technology & Libraries 34, no. 2 (2015): 133-48, https://doi.org/10.6017/ital.v34i2.5650.

11 Kitty Pope et al., "Twenty-First Century Library MUST-HAVES: Mobile Library Services," Searcher 18, no. 3 (2010): 44-47.

12 Hu and Meir, "Mobile Strategy Report."

13 Liu and Briggs, "A Library in the Palm of Your Hand."

14 Kalah Rogers, "Academic and Public Libraries' Use of Web 2.0 Applications and Services in Mississippi," SLIS Connecting 4, no. 1 (2015), https://doi.org/10.18785/slis.0401.08.

15 Pope et al., "Twenty-First Century Library MUST-HAVES."

16 Lorraine Paterson and Low Boon, "Usability Inspection of Digital Libraries: A Case Study," Ariadne 63, no. 1 (2010): 11, https://doi.org/10.1007/s00799-003-0074-4. [website lists H. Rex Hartson, Priya Shivakumar, and Manuel A. Pérez-Quiñones as the authors]

17 Todd Spires, "Handheld Librarians: A Survey of Librarian and Library Patron Use of Wireless Handheld Devices," Internet Reference Services Quarterly 13, no. 4 (2008): 287-309, https://doi.org/10.1080/10875300802326327.

18 American Library Association, "Libraries Connect Communities 2011-2012," last modified June 2012, http://connect.ala.org/files/68293/2012.67B%20PLFTS%20Results.pdf.

19 Barry Trott and Rebecca Jackson, "Mobile Academic Libraries," Reference & User Services Quarterly 52, no. 3 (2013): 174-78.

20 Liu and Briggs, "A Library in the Palm of Your Hand."

21 Bohyun Kim, "The Present and Future of the Library Mobile Experience," Library Technology Reports 49, no. 6 (2013): 15-28.

22 Hannah Gascho Rempel and Laurie Bridges, "That Was Then, This Is Now: Replacing the Mobile-Optimized Site with Responsive Design," Information Technology & Libraries 32, no. 4 (2013): 8-24, https://doi.org/10.6017/ital.v32i4.4636.
10177 ---- Efficiently Processing and Storing Library Linked Data using Apache Spark and Parquet

Kumar Sharma, Ujjal Marjit, and Utpal Biswas

Kumar Sharma (kumar.asom@gmail.com) is Research Scholar, Department of Computer Science and Engineering; Ujjal Marjit (marjitujjal@gmail.com) is System-in-Charge, Center for Information Resource Management (CIRM); and Utpal Biswas (utpal01in@yahoo.com) is Professor, Department of Computer Science and Engineering, the University of Kalyani, India.

ABSTRACT

Resource Description Framework (RDF) is a commonly used data model in the Semantic Web environment. Libraries and various other communities have been using the RDF data model to store valuable data after it is extracted from traditional storage systems. However, because of the large volume of the data, processing and storing it is becoming a nightmare for traditional data-management tools. This challenge demands a scalable and distributed system that can manage data in parallel. In this article, a distributed solution is proposed for efficiently processing and storing the large volume of library linked data stored in traditional storage systems. Apache Spark is used for parallel processing of large data sets and a column-oriented schema is proposed for storing RDF data. The storage system is built on top of Hadoop Distributed File Systems (HDFS) and uses the Apache Parquet format to store data in a compressed form. The experimental evaluation showed that storage requirements were reduced significantly as compared to Jena TDB, Sesame, RDF/XML, and N-Triples file formats. SPARQL queries are processed using Spark SQL to query the compressed data. The experimental evaluation showed a good query response time, which significantly reduces as the number of worker nodes increases.

INTRODUCTION

More and more organizations, communities, and research-development centers are using Semantic Web technologies to represent data using RDF. Libraries have been trying to replace the cataloging system using a linked-data technique such as BIBFRAME.1 Much attention has been given to transitioning MARC cataloging data into RDF format.2 Data stored in various other formats such as relational databases, CSV, and HTML have already begun their journey toward the open-data movement.3 Libraries have participated in the evolution of Linked Open Data (LOD) to make data an essential part of the web.4

Various researchers have explored areas related to library data and linked data. In particular, transitioning legacy library data into linked data has dominated most of the research. Other areas include researching the impact of linked library data, investigating how privacy and security can be maintained, and exploring the potential effects of having open linked library data. Obviously, a linked-data approach for publishing data on the web brings many benefits to libraries. First, once isolated library data currently stored using traditional cataloging systems (MARC) becomes a part of the web, it can be shared, reused, and consumed by web users.5 This promotes the cross-domain sharing of knowledge hidden in the library data, opening the library as a rich source of information.
Online library users can share more information using linked library resources since every library resource is crawlable on the web via Uniform Resource Identifiers (URIs). Most importantly, library data benefits from linked-data technology's real advantages, such as interoperability, integration with other systems, data crosswalks, and smart federated search.6

Numerous approaches have evolved for making the vision of the Semantic Web a success. No doubt, they have succeeded in making the library a part of the web, but there remain issues related to library big data. The term big data refers to data or information that cannot be processed using traditional software systems.7 The volume of such data is so large that it requires advanced technologies for processing and storing the information. Libraries also have real concerns with large volumes of data during and after the transition to linked data. The main challenges are in processing and storage. During conversion from library data to RDF, the process can become stalled because of the large volumes of data. Once the data is successfully converted into RDF formats, there are storage issues. Finally, even if the data is somehow stored using common RDF triplestores, it is difficult to retrieve and filter. This is a challenging problem that every librarian must give attention to. Librarians should know the real nature of library big data, which causes problems in analyzing data and decision making. Librarians must also know the technologies that can resolve these issues.

The rate of data generation and the complexity of the data itself are constantly increasing. Traditional data-management tools are becoming incapable of managing the data. That is why the definition of big data has been characterized by five Vs: volume, velocity, variety, value, and veracity.8

• Volume is the amount of the data.
• Velocity is the data-generation rate (which is high in this case).
• Variety refers to the heterogeneous nature of the data.
• Value refers to the actual use of the data after the extraction.
• Veracity is the quality or trustworthiness of the data.

To handle the five Vs of big data, distributed technologies such as commodity hardware, parallel-processing frameworks, and optimized storage systems are needed. Commodity hardware reduces the cost of setting up a distributed environment and can be managed with very limited configurations. A parallel-processing system can process distributed data in parallel to reduce processing time. An optimized storage system is required to store the large volume of data, supporting scalability to accommodate more data on demand. To meet these library requirements and tackle the challenges posed by library big data, a distributed solution is proposed. This approach is based on Apache Hadoop, Apache Spark, and a column-oriented storage system to process large-size data and to store the processed data in a compressed form. Bibliographic RDF data from the British National Library and the National Library of Portugal have been used for this experiment. These bibliographic data are processed using Apache Spark and stored using the Apache Parquet format. The stored data can be queried using SPARQL, with Spark SQL used to execute the queries.
Given an existing RDF dataset, we designed a schema for storing RDF data using a column-oriented database. Using a column-oriented design with Apache Parquet and Spark SQL as the query processor, a distributed RDF storage system was implemented that can store any amount of RDF data by increasing the number of distributed nodes as needed.

LITERATURE REVIEW

While big data continues to rise, library data are still in traditional storage systems isolated from the web. To continue working with the web, libraries must redesign the way they format data and contribute toward the web of data. To serve library data to other communities, libraries must integrate their data with the web. Attempts to do this have been made by several researchers. The task of integration cannot be achieved by only librarians; rather, it requires a team of experts in the field of library and information technology. The advanced way for integrating resources is with linked-data technology, by assigning URIs to every piece of library data. With this goal, there exist various projects related to the convergence of library data and linked data. One of these, BIBFRAME, is an initiative to transition bibliographic resources into linked-data representation. BIBFRAME aims to replace traditional cataloging standards such as MARC and UNIMARC using the concept of publishing structured data on the web. MARC formats cannot be exchanged easily with nonlibrary systems. The MARC standard also suffers from inconsistencies, errors, and inability to express relationships between records and fields within the record. That is why mostly bibliographic resources stored in MARC standards are targeted for conversion.9 Other works include the open-data initiative from the British National Library, library catalog to linked open data conversion, exposing library data as linked data, and building a knowledge graph to reshape the library staff directory.10

Linked data is fully dependent on RDF. RDF reveals graph-like structures where resources are linked with one another. Thus, RDF can improve on MARC standards because of its strong ability to link related resources. This system of revealing everything as a graph helps in building a network of library resources and other data on the web. This also makes for fast search functionality. In addition, searching a topic or book could bring similar graphs from other library resources, leading to the creation of linked-data services.11 Such a service has been implemented by the German National Library to provide bibliographic and authority data in RDF format, by the Europeana Linked Open Data project with access to open metadata on millions of books and multimedia data, and by the Library of Congress Linked Data Service.12

There is less discussion of library big data. Though big data in general is in active research, the library domain has received much less attention than the broader concept of big data and its challenges. This could be because most librarians working with linked data are from nontechnical backgrounds. Now is the right time for libraries to give priority to adopting big data technologies to overcome challenges posed by big data. Wang et al. have discussed library big data issues and challenges.13 They considered whether library data belongs to the big data category. Obviously, library data belongs to big data since it fulfills some of the characteristics of big data, such as volume, variety, and velocity. Wang et al.
also raise some of libraries' challenges related to library big data, such as lacking teams of experts, inability to adopt big data due to budgetary issues, and technical challenges. Finally, they point out that to take advantage of the web's full potential, library data must be transformed into a format that can be accessible beyond the library using technologies like the Semantic Web and linked data.

The web has already started its work related to big-data challenges. Libraries need to transition their data into an advanced format with the ability to handle big-data issues. The main problems related to library big data happen at data transformation and storage. To store and retrieve large amounts of data, we need commodity hardware that can handle trillions of RDF triples, requiring terabytes or petabytes of disk space. As of now, there are Semantic Web frameworks such as Jena and Sesame to handle RDF data, but these frameworks are not scalable for large RDF graphs.14 Jena is a Java-based framework for building Semantic Web and linked-data applications. It is basically a Semantic Web programming framework that provides Java libraries for dealing with RDF data. Jena TDB is the component of Jena for storing and querying RDF data.15 It is designed to work in a single-node environment. Sesame is also a Semantic Web framework for processing, storing, and querying RDF data. Basically, Sesame is a web-based architecture for storing and querying RDF data as well as schema information.16

BACKGROUND

This section briefly describes the structure of RDF triples; Apache Spark, along with its features; the column-oriented database system; and Apache Parquet.

Structure of RDF Triples

RDF is a schema-less data model. This implies that the data is not fixed to a specific schema, so it does not need to conform to any predefined schema. Unlike in relational tables, where we define columns during schema definition and those columns must contain the required type of data, in RDF we can have any number of properties and data using any kind of vocabulary. We only need vocabulary terms to embed properties. The vocabulary is created using a domain ontology, which represents the schema. To describe library resources we need a library-domain ontology. For example, to define a book and its properties one can use the BookOnt ontology.17 BookOnt is a book-structure ontology designed for an optimized book search and retrieval process. However, it is not mandatory to use an existing ontology and all the properties defined under it. We can use terms from a newly created ontology or mixed ontologies with the required properties.

RDF represents resources in the form of subject, predicate, and object. The subject is the resource being described, identified by a URI. This subject can have any number of property-value pairs. This way of representing a resource is called knowledge representation, where everything is defined as knowledge in the form of entity-attribute-value (EAV). In RDF, the basic unit of information is a triple T, such that T = {Subject, Predicate, Object}. A collection of such triples stored on disk is called a triplestore, and the collection of RDF triples is called an RDF database. An RDF database is specially designed to store linked data to make the web more useful by interlinking data from different sources in a meaningful way. The real advantage of RDF is its support of a common data model.
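For example, a single bibliographic record might be expressed as the following triples in N-Triples syntax (a minimal sketch: the example.org URIs, the book and author identifiers, and the title value are hypothetical placeholders, while the title and creator properties are borrowed from the well-known Dublin Core vocabulary):

<http://example.org/book/b100> <http://www.w3.org/1999/02/22-rdf-syntax-ns#type> <http://example.org/ontology/Book> .
<http://example.org/book/b100> <http://purl.org/dc/elements/1.1/title> "A History of Printing" .
<http://example.org/book/b100> <http://purl.org/dc/elements/1.1/creator> <http://example.org/author/a7> .

Each line is one triple T = {Subject, Predicate, Object}, terminated by a period; the subject and predicate are always URIs, while the object may be either a URI or a literal value.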
RDF is the standard way for publishing meaningful data on the web, and this is backed by linked data. Linked data provides some rules about how data can be published on the web by following the RDF data model.18 With such a common data model, one can integrate data from any source by inserting new property-value pairs without altering the database schema. Another important purpose of RDF is to provide resources that are processable by software agents on the web. RDF triples are of two types: literal triples and linked triples. Literal triples consist of a URI-referenced subject and a literal object (scalar value) joined by a predicate. In linked triples, both the subject and the object consist of URIs linked by the predicate. This type of link is called an RDF link, which is the basis for interlinking resources.19

RDF data are queried using the SPARQL query language.20 SPARQL is a graph-matching query language and is used to retrieve triples from the triplestore. SPARQL queries are also called semantic queries. Like SQL queries, SPARQL also finds and retrieves the information stored in the triplestore. A SPARQL query is composed of five main components:21

• the prefix declaration part is used to abbreviate the URIs;
• the dataset definition is used to specify the RDF dataset from which the data is to be fetched;
• the result clause is used to specify what information is to be fetched, which can be SELECT, CONSTRUCT, DESCRIBE, or ASK;
• the query pattern is used to specify the search conditions; and
• the query modifiers are used to rearrange query results using ORDER BY, LIMIT, etc.

Hadoop and MapReduce

Hadoop is open-source software that supports distributed processing of large datasets on machine clusters.22 Two core components, Hadoop Distributed File System (HDFS) and MapReduce, make distributed storage and computation of processing jobs possible.23 HDFS is the storage component, whereas MapReduce is a distributed data-processing framework, the computational model of Hadoop based on Java. The MapReduce algorithm consists of two main tasks: map and reduce. The map task takes a set of data as input and produces another set of data with individual components in the form of key/value pairs or tuples. The output of the map task goes to the reduce task, which combines common key/value pairs into a smaller set of tuples.

HDFS and MapReduce are based on driver/worker architecture consisting of driver and worker nodes having different roles. An HDFS driver node is called the Name-Node, while the worker node is called the Data-Node. The Name-Node is responsible for managing names and data blocks. Data blocks are present in the Data-Nodes. Data-Nodes are distributed across each machine, responsible for actual data storage. Similarly, the MapReduce driver node is called the Job-Tracker and the worker node is called the Task-Tracker. The Job-Tracker is responsible for scheduling jobs on Task-Trackers. Task-Trackers again are distributed across each machine along with the Data-Nodes, responsible for processing map and reduce tasks as instructed by the Job-Tracker. The concept of Hadoop implies that the set of data to be processed is broken into smaller forms that can be processed individually and independently. This way, tasks can be assigned to multiple processors to process the data, and eventually it becomes easy to scale data processing over multiple computing nodes.
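To make the map and reduce tasks concrete, the following is a minimal word-count sketch written against the Hadoop MapReduce Java API; the class names and the whitespace-based tokenization are our own illustrative choices, and a complete job would also need a driver that configures the input and output paths.

import java.io.IOException;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.Reducer;

// Map task: emit a (word, 1) key/value pair for every word in an input line.
public class WordCountMapper extends Mapper<LongWritable, Text, Text, IntWritable> {
    private static final IntWritable ONE = new IntWritable(1);
    private final Text word = new Text();

    @Override
    protected void map(LongWritable key, Text value, Context context)
            throws IOException, InterruptedException {
        for (String token : value.toString().split("\\s+")) {
            if (!token.isEmpty()) {
                word.set(token);
                context.write(word, ONE); // intermediate pair sent to the reducers
            }
        }
    }
}

// Reduce task: combine all counts emitted for the same word into one total.
class WordCountReducer extends Reducer<Text, IntWritable, Text, IntWritable> {
    @Override
    protected void reduce(Text key, Iterable<IntWritable> values, Context context)
            throws IOException, InterruptedException {
        int sum = 0;
        for (IntWritable count : values) {
            sum += count.get();
        }
        context.write(key, new IntWritable(sum)); // final (word, total) pair
    }
}

Between the two phases, Hadoop shuffles the intermediate output so that all pairs sharing the same key arrive at the same reduce task.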
Once a MapReduce program is written, it can be scaled to run over thousands of machines in a cluster.

Spark and Resilient Distributed Datasets (RDD)

Apache Spark is an in-memory cluster-computing platform and a faster batch-processing framework than MapReduce. More importantly, it supports in-memory processing of tasks along with their data, so querying data is much faster than on disk-based engines. The core of Spark is the Resilient Distributed Dataset (RDD). An RDD is the fundamental data structure of Spark: a distributed collection of data that cannot be modified in place. Rather, modifying the data yields another immutable collection of data (another RDD), a process called RDD transformation; figure 1 depicts an example. The distributed processing and transformation of the data are managed through RDDs. RDDs are fault-tolerant, meaning that lost data is recoverable using the lineage graph of the RDDs.24 Spark constructs a directed acyclic graph (DAG) of the sequence of computations that need to be performed on the data. Spark's computing engine performs most computations in multistage memory; because of this multistage in-memory computation, it reads and writes data faster than the MapReduce paradigm.25 It aims at speed, ease of use, extensibility, and interactive analytics. Spark relies on concepts such as RDDs, the DAG, the Spark context, transformations, and actions. The Spark context is the execution environment in which RDDs and broadcast variables are created; it is also called the master of a Spark application, and it provides access to the cluster through a resource manager. Data transformation happens in a Spark application when data is loaded from a data store into RDDs and filter or map functions are applied to produce a new set of RDDs. While the set of computations forming the DAG is being built, no execution takes place; execution is deferred to the end, like a lazy loading process. Transformations are the recorded sequence of events, and an action, such as collecting extracted data or getting the count of words, is the final execution of the underlying logic.

Figure 1. RDD transformations.

The execution model of Spark is shown in figure 2. It is based on a driver/worker architecture consisting of driver and worker processes. The driver process creates the Spark context and schedules tasks on the available worker nodes. The master process must be started first; creating the worker nodes follows. The driver is responsible for converting a user's application into several tasks, which are distributed among the workers. The executors are the main components of every Spark application: they actually perform the data processing, reading and writing data to external sources and the storage system. The cluster manager is responsible for allocating and deallocating resources for the Spark job. Fundamentally, Spark is only a computation model; it is not concerned with the storage of data, which is a separate matter. It supports computation and data analytics in a distributed manner.
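The following minimal Java sketch illustrates RDD transformations, lazy evaluation, and actions; it is an illustration only, and the input path and filter condition are hypothetical.

import org.apache.spark.SparkConf;
import org.apache.spark.api.java.JavaRDD;
import org.apache.spark.api.java.JavaSparkContext;

public class RddExample {
    public static void main(String[] args) {
        SparkConf conf = new SparkConf().setAppName("RDDTransformations").setMaster("local[*]");
        JavaSparkContext sc = new JavaSparkContext(conf);
        // Transformations only record the lineage; nothing executes yet.
        JavaRDD<String> lines = sc.textFile("hdfs:///data/triples.nt");
        JavaRDD<String> bookLines = lines.filter(line -> line.contains("Book"));
        // count() is an action: it triggers execution of the whole DAG.
        long n = bookLines.count();
        System.out.println("Matching lines: " + n);
        sc.close();
    }
}

Note that each transformation (here textFile and filter) yields a new immutable RDD, and only the final action causes Spark to schedule stages and tasks on the workers.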
For distributed execution, the task is divided among the connected nodes so that every node can work at the same time; each node performs the desired operation and notifies the master upon completion of its task.

Figure 2. Execution model of Spark.

In MapReduce, read/write operations happen between disk and memory, making job computation slower than in Spark. RDDs resolve this by allowing fault-tolerant, distributed, in-memory computations. With RDDs, the first load of data is read from disk, and a write-to-disk operation may take place at the end, depending on the program; the operations between the first read and the last write happen in memory. Data in RDDs is lazily evaluated: during RDD transformations nothing is computed until an action is called on the final RDD, which triggers the job execution. The chain of RDD transformations creates dependencies between RDDs. Each dependency has a function for calculating its data and a pointer to its parent RDD. Spark divides the RDD dependencies into stages and tasks and sends them to the workers for execution. Hence an RDD does not actually hold the data; rather, it either loads data from disk or from another RDD and applies operations to the data to produce results. An important feature of RDDs is fault tolerance, by which any partitions lost to node failures can be retained and recomputed. RDDs have built-in methods for saving data into files. For example, when saveAsTextFile() is called on an RDD, its data is written to the specified text file line by line. There are numerous options for storing data in different formats, such as JSON, CSV, sequence files, and object files, and all these file formats can be saved directly into HDFS or an ordinary file system.

Spark SQL and Dataframe

Spark SQL is a query interface for processing structured data in SQL style on a distributed collection of data. It is used for querying structured data stored in HDFS (as with Hive) and Parquet. Spark SQL runs on top of Spark as a library and provides higher-level optimization. The Spark dataframe is an API (application programming interface) that can perform relational operations on RDDs and on external data sources such as Hive and Parquet. Like an RDD, a Spark dataframe is a collection of structured records that can be manipulated by Spark SQL, and it evaluates operations lazily in order to perform relational optimizations.26 A dataframe is created from an RDD together with schema information. For example, the Java code snippet below creates a dataframe using an RDD and a schema class called RDFTriple (the rdf-triple schema is discussed in the proposed approach).

// Convert raw records to N-Triples lines, then to RDFTriple objects.
JavaRDD<String> n_triples = marc_records.map(new TextToString());
JavaRDD<RDFTriple> rdf_triples = n_triples.map(new LinesToRDFFunction());
// Build a dataframe from the RDD using the RDFTriple bean as the schema.
Dataset<Row> dataframe = sparkSession.createDataFrame(rdf_triples, RDFTriple.class);
// Persist the dataframe in the columnar Parquet format.
dataframe.write().parquet("/full-path/RDFData.parquet");

The Spark dataframe uses memory wisely by saving data in off-heap memory, and it provides an optimized execution plan. Conceptually, a dataframe is equivalent to a relational table with richer optimization, and it supports SQL queries over its data; a dataframe is thus used for storing data in tables. Structured data from a Spark dataframe can be saved into the Parquet file format as shown in the code snippet above.
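Continuing that snippet, the Parquet file can be read back into a dataframe and queried with Spark SQL by registering a temporary table. The following is a sketch under the same assumptions as the article's snippet; the predicate URI shown is hypothetical.

import org.apache.spark.sql.Dataset;
import org.apache.spark.sql.Row;
import org.apache.spark.sql.SparkSession;

SparkSession sparkSession = SparkSession.builder()
    .appName("RDFQuery").master("local[*]").getOrCreate();
// Load the Parquet file written earlier back into a dataframe.
Dataset<Row> triples = sparkSession.read().parquet("/full-path/RDFData.parquet");
// Register an in-memory temporary table over the dataframe.
triples.createOrReplaceTempView("triples");
// Query the columnar data with ordinary SQL.
Dataset<Row> titles = sparkSession.sql(
    "SELECT subject, object FROM triples WHERE predicate = 'http://purl.org/dc/terms/title'");
titles.show(10);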
Column-Oriented Database

A database is a persistent collection of records, accessed via queries. The system that stores the data and processes queries to retrieve it is called a database system. Such systems use indexes, or iteration over the records, to find the required information stored in the database. Indexes are auxiliary, dictionary-like data structures that keep pointers to individual records. Indexing is efficient in some cases; however, it requires two lookup operations, which slows down access time. Data scanning, or iteration over each record, resolves a query by finding the exact location of the records, but it is inefficient when the data is very large. As the rate of data generation increases constantly, more and more data must be stored on disk. For such fast-growing data we need a system that can accommodate more data than traditional storage systems while keeping query-processing times low. When the data gets too large, both indexing and record scanning become costly at query time. A satisfying solution is the columnar storage system, which stores data by columns rather than by rows.27 A column-oriented database system stores the data of each column in a separate file on disk, which makes data access much quicker. Since each column is stored separately, any required data can be accessed directly instead of reading all the data; in effect, any column can be used as an index, making the layout auto-indexing. That is why the column-oriented representation is much faster than the row-oriented representation. In addition, the data is stored in compressed form, with each column compressed using its own scheme. In a column-oriented database, compression is always efficient because all the values in a column belong to the same data type. Hence column-oriented databases require less disk space, and they need no additional storage for indexes, since the data is stored within the indexes themselves. Consider an example of a database table named "Book" consisting of the columns "BookID," "Title," and "Price." Following a column-oriented approach, all the values for BookID are stored together under the "BookID" column, all the values for Title are stored together under the "Title" column, and so on, as shown in figure 3.

Figure 3. An example of an entity and its row and column representation.

Apache Parquet

Parquet is a top-level Apache project that stores data in a column-oriented fashion, highly compressed and densely packed on disk.28 It is a self-describing data format that embeds the schema within the data itself. It supports efficient compression and encoding schemes, which lowers data-storage costs and maximizes the effectiveness of querying the data. Parquet has added advantages, such as limiting I/O operations and storing data in compressed form using the Snappy method developed by Google and used in its production environment; it is thus designed especially for space and query efficiency. Snappy aims at compressing petabytes of data in a minimal amount of time and is aimed especially at big-data problems.29 Its compression rate is more than 250 MB/sec and its decompression rate more than 500 MB/sec, measured on a single core of a Core i7 processor in 64-bit mode.
It is even faster than the fastest mode of the zlib compression algorithm.30 Parquet is implemented using the column-striping and assembly algorithms, which are optimized for storing large data blocks.31 It supports nested data structures, in which the values of a given column are stored in contiguous memory locations.32 Apache Parquet is flexible and can work with many programming languages because it is implemented using Apache Thrift (https://thrift.apache.org/). A Parquet file is divided into row groups, with metadata at the end of the file. Each row group is divided into column values (or column chunks): column 1, column 2, and so on, as shown in figure 4. Each column value is divided into pages, and each page consists of the page header, repetition levels, definition levels, and values. The footer of the file contains various metadata, such as file metadata, column metadata, and page-header metadata. The metadata is required to locate and find the values, just like indexing.

Figure 4. Parquet file structure.

THE PROPOSED APPROACH

The proposed approach relies on Spark's core APIs (RDD, Spark SQL, and dataframes), which can operate on large datasets. An RDD is used to load the initial data from the input file, process the data, and transform it into the triple structure. A Spark dataframe is used to load the data from the RDD into the triple structure and write the transformed RDF data to a Parquet file. Spark SQL is used to fetch the data stored in the Parquet file.

Processing RDF Data

Processing RDF data from large RDF/XML files requires breaking each file into smaller components, because general data-processing systems run into memory problems with large files. At this stage the proposed approach processes data from an N-Triples file, so the individual RDF/XML files must in turn be converted into the N-Triples format. Whether an RDF/XML file is first broken into smaller components and then converted into N-Triples depends on the size of the input file: if it is no more than 500 MB, it is converted directly into the N-Triples format. Multiple RDF/XML files are converted into individual N-Triples files, which are then combined into one N-Triples file, as the proposed Spark application reads its input from a single file.

Schema to Store RDF Data

A simple RDF schema with three triple entities has been designed. This schema is an RDF triple view, which is the building block of the RDF storage schema proposed in this work. The RDF triple view is a simple Java class consisting of three attributes: subject, predicate, and object. Given an RDF dataset D consisting of a set of RDF triples T, in either RDF/XML or N-Triples format, the dataset is transformed into a format that can be processed by a Spark application. The dataset is further transformed into a line-based format in which each triple statement is placed on its own line, separated by a newline (\n) character. A line contains three components (subject, predicate, and object) separated by spaces, and each line is unique by the combined information of its subject, predicate, and object. Given an RDF triple structure Ti = (Si, Pi, Oi) with Ti ∈ T, an instance of the RDF triple view is created for each triple to hold its information.
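A minimal sketch of such a class is shown below. The article does not print its RDFTriple class, so this version only assumes the three attributes named in the text, plus the bean conventions (no-argument constructor and getters/setters) that Spark's createDataFrame(rdd, Class) call relies on.

import java.io.Serializable;

public class RDFTriple implements Serializable {
    private String subject;
    private String predicate;
    private String object;

    public RDFTriple() { }  // required so Spark can instantiate the bean

    public RDFTriple(String subject, String predicate, String object) {
        this.subject = subject;
        this.predicate = predicate;
        this.object = object;
    }

    public String getSubject() { return subject; }
    public void setSubject(String subject) { this.subject = subject; }
    public String getPredicate() { return predicate; }
    public void setPredicate(String predicate) { this.predicate = predicate; }
    public String getObject() { return object; }
    public void setObject(String object) { this.object = object; }
}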
The columnar schema organizes the triple information into three components, storing each component separately as subject, predicate, and object columns (figure 5).

Figure 5. RDF triple view.

RDF Storage

We store the RDF data based on the RDF triple view, which is the main schema for storing data in the triple representation. No indexing or additional information related to the subject, predicate, or object needs to be stored on disk. Since any number of temporary dataframe tables can be held in memory, join operations can be performed over these tables to filter the data. In the absence of expensive indexes and additional triple information, the storage area is reduced significantly. Beyond this, the compression technique used in Apache Parquet saves far more space than storage in other triplestores. Figure 6 illustrates the data-storing process.

Figure 6. Data-storing process in HDFS.

The collection of triple instances is loaded into an RDD and, at the end, into a Spark dataframe. Spark dataframes are equivalent to RDBMS tables and support both structured and unstructured data formats. Using a single schema, multiple dataframes can be created and registered as temporary tables in memory, on top of which high-level SQL queries can be executed. The idea of using multiple dataframes with a single schema is motivated by the desire to avoid joins and indexing. In the final step, the Spark dataframe is saved to HDFS files in the Parquet format. From the Parquet file, the data can be loaded back into dataframes in memory and queried using Spark SQL.

Fetching Data from Storage

Given an RDF dataset D, a SPARQL query Q, and a columnar schema S, we use S to translate Q into a query Q' that is executed on top of S. Here, the answer of query Q' on top of S is equal to the answer of Q on top of D. Query mappings M are used to transform SPARQL queries into Spark SQL queries. For querying, the data is first loaded into a Spark dataframe from the Parquet files. To query data using SPARQL, queries must follow basic graph patterns (BGPs). A BGP is a set of triple patterns similar to an RDF triple (S, P, O), where any of S, P, and O can be query variables or literals; it is used to match a triple pattern against an RDF graph, a process called binding between query variables and RDF terms. The statements listed under the WHERE clause form the BGP, consisting of the query patterns. For example, the query "SELECT ?name ?mbox WHERE {?x foaf:name ?name . ?x foaf:mbox ?mbox .}" has two query patterns, and evaluating it requires one join. In general, one fewer join than the total number of query patterns is needed; that is, for n query patterns, n-1 joins are required to resolve the values (a sketch of such a translation appears at the end of this section). Figure 7 illustrates the process of query execution.

Figure 7. Process of query execution.

EVALUATION

To evaluate the proposed approach, we compare its storage size with file-based storage systems, namely N-Triples and RDF/XML files, and with standard triplestores such as Jena TDB and Sesame. The data-storing time is compared across Jena TDB, Sesame, and Parquet with one, two, and three worker nodes respectively.
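Before turning to the results, the query mapping M described above can be made concrete. For the two-pattern example query quoted earlier, a translation into Spark SQL needs one self-join of the triples table, matching the n-1 rule. This is an illustrative sketch only, and it assumes the predicates are stored in their prefixed form.

// SPARQL: SELECT ?name ?mbox WHERE { ?x foaf:name ?name . ?x foaf:mbox ?mbox . }
Dataset<Row> result = sparkSession.sql(
    "SELECT t1.object AS name, t2.object AS mbox " +
    "FROM triples t1 JOIN triples t2 ON t1.subject = t2.subject " +
    "WHERE t1.predicate = 'foaf:name' AND t2.predicate = 'foaf:mbox'");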
Finally, for the purposes of the experiment, selected SPARQL queries are tested over the RDF data stored in Parquet format in HDFS. The query performance is tested on the distributed system with one, two, and three worker nodes respectively. The following subsections show the results for each of these comparisons.

Datasets

For the evaluation we use two datasets. Dataset 1 contains bibliographic data from the National Library of Portugal (NLP) (http://opendata.bnportugal.pt/eng_linked_data.htm). From the NLP we chose the NLP Catalogue datasets in RDF/XML format. The datasets are freely available for reuse and contain metadata from the NLP Catalogue, the National Bibliographic Database, the Portuguese National Bibliography, and the National Digital Library. The datasets are available as linked data, produced in the context of the European Library. The size of the RDF/XML file is 6.46 GB, with more than 45 billion RDF triples.

Dataset 2 contains bibliographic data from the British National Library (https://www.bl.uk/bibliographic/download.html). From the British National Bibliography collection we chose the BNB LOD Books dataset. The datasets are publicly available and contain bibliographic records of different categories, such as books, locations, bibliographic resources, persons, organizations, and agents. The dataset is divided into sixty-seven files in RDF format; however, we combined them into one file in N-Triples format to meet the requirement of a single large input file. The combined file is 22.52 GB and contains more than 16 billion RDF resources in N-Triples format, making it suitable for the proposed approach. From this conversion we obtain more than 150 billion RDF triples.

Figure 8. Data storage time for different file formats.

Figure 9. Disk size for different file formats.

Disk Storage

Figure 8 shows the data-storing time using Sesame, Jena TDB, and Parquet for the two datasets. Data from the raw RDF files is stored in Jena TDB and Sesame; the files are processed individually, to avoid memory overflow, because the Jena and Sesame models cannot load the data from the large files all at once. To store the data in Parquet format, we run the program separately with different numbers of worker nodes. Figure 9 presents the total disk size required by each of these file formats and triplestores for the two datasets.

Query Performance

For testing, the SPARQL queries are converted manually at this stage. We run selected queries over the bibliographic RDF data stored in Parquet format in HDFS, on one, two, and three worker nodes respectively. The queries are listed below:

Q1) The first query fetches the count of RDF triples present in the storage.
Query: SELECT (COUNT(*) AS ?count) WHERE { ?s ?p ?o . }

Q2) The second query fetches the entire dataset in SPO form, i.e., in the N-Triples format.
Query: SELECT * WHERE { ?s ?p ?o . }

Q3) The third query fetches resources that are books with the subject "English language Composition and exercises."
Query: SELECT ?x WHERE { ?x rdf:type Bibo:Book . ?x DC:Subject "English language Composition and exercises" . }
Q4) The fourth query fetches resources that are books with the subject "English language Composition and exercises" and the creator "Palmer Frederick."
Query: SELECT ?x WHERE { ?x rdf:type Bibo:Book . ?x DC:Subject "English language Composition and exercises" . ?x DC:Creator "Palmer Frederick" . }

Q5) The fifth query fetches objects having the predicate DCTerms:isPartOf.
Query: SELECT ?name WHERE { ?s DCTerms:isPartOf ?name . }

Figure 10 shows the query response time for the above queries with different numbers of worker nodes for the two datasets. The queries are executed in the distributed environment, and increasing the number of worker nodes decreases the query response time.

Figure 10. Query response time with different numbers of worker nodes.

Query Comparison

To compare query response times, the proposed approach is tested with the first dataset mentioned above. At this stage, however, comparing the proposed approach with other distributed triple-storage systems requires further research, as well as more worker nodes and larger datasets suited to parallel processing in the distributed environment; with a small setup it would be hard to distinguish the performance of the individual approaches, as they may produce similar results. We therefore compare the proposed approach with the standard Jena TDB solution in a single-node environment. The following SPARQL queries, with the standard namespace URIs for the declared prefixes, are tested against dataset 1 (two URIs lost from the published text are marked <…>):

prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
prefix DC: <http://purl.org/dc/terms/>
prefix rdau: <http://rdaregistry.info/Elements/u/>
prefix foaf: <http://xmlns.com/foaf/0.1/>

Q1. SELECT (count(*) AS ?count) { ?s ?p ?o }
Q2. SELECT * { ?s ?p ?o }
Q3. SELECT ?x WHERE { ?x rdf:type DC:BibliographicResource . }
Q4. SELECT ?x WHERE { ?x rdf:type <…> . ?x rdau:P60339 'Time Out Lisboa' . }
Q5. SELECT ?s WHERE { ?s DC:isPartOf <…> . ?s foaf:page 'http://www.theeuropeanlibrary.org/tel4/record/3000115318515' . }

Figure 11. Query comparison.

We are interested in measuring the query response time of the above queries. First we test Jena TDB; we then test the proposed approach in a single-node environment. We execute the set of queries multiple times to record the average performance. As mentioned above, no indexing is used in the storage: RDF triples are stored as they appear in the N-Triples file. Even though the queries are executed without indexes, they still perform better than Jena TDB, as shown in figure 11.

Discussion

In this article we claim that Apache Spark and column-oriented databases can resolve library big-data issues. Especially when dealing with RDF data, Spark can perform far better than other approaches because of its in-memory processing ability. Concerning RDF data storage, a column-oriented database is suitable for storing large volumes of data because of its scalability, fast data loading, and highly efficient data compression and partitioning; a column-oriented database system also requires less disk, reducing the storage area. As evidence, we have shown the data-storage comparison and the performance of columnar storage for RDF data using the Parquet format in HDFS. As the results show, Apache Parquet takes much less disk space than the other storage systems, and the data-storing time is also much lower. We observed that the result of query 2 is the entire dataset stored in Parquet format; the size of this resultant dataset is 22.52 GB, which is the same as the original size.
The same dataset, when stored in the Parquet format, is reduced to 2.89 GB. This shows that Parquet is a highly optimized storage system that can reduce storage costs. We have shown the query response time for five different SPARQL queries on distributed nodes for two different datasets. We believe that with a better schema for storing RDF triples the proposed approach can be improved, and that with the technologies used here a fast and reliable triplestore can be designed.

CONCLUSION AND FUTURE WORK

Librarians all over the globe should give priority to integrating library data with the web to enable cross-domain sharing of library data. To do this, they must pay attention to current trends in big-data technologies. Because the rate of data generation is increasing in every domain, traditional data-processing and storage systems are becoming ineffective against the scale and complexity of the data. In this article we present a distributed solution for processing and storing a large volume of library linked data. From the experiment, we observe that processing a large volume of data takes significantly less time using the proposed approach, and that the storage space is reduced significantly compared to other storage systems. In the future we plan to optimize the current approach using advanced technologies such as GraphX, machine-learning tools, and other big-data technologies for even faster data processing, searching, and analyzing.

REFERENCES

1 Eric Miller et al., "Bibliographic Framework as a Web of Data: Linked Data Model and Supporting Services," Library of Congress, November 11, 2012, https://www.loc.gov/bibframe/pdf/marcld-report-11-21-2012.pdf.

2 Brighid M. Gonzales, "Linking Libraries to the Web: Linked Data and the Future of the Bibliographic Record," Information Technology and Libraries 33, no. 4 (2014): 10, https://doi.org/10.6017/ital.v33i4.5631; Myung-Ja K. Han et al., "Exposing Library Holdings Metadata in RDF Using Schema.org Semantics," in International Conference on Dublin Core and Metadata Applications DC-2015, São Paulo, Brazil, September 1–4, 2015, 41–49, http://dcevents.dublincore.org/IntConf/dc-2015/paper/view/328/363.

3 Franck Michel et al., "Translation of Relational and Non-relational Databases into RDF with xR2RML," in Proceedings of the 11th International Conference on Web Information Systems and Technologies, Lisbon, Portugal, 2015, 443–54, https://doi.org/10.5220/0005448304430454; Varish Mulwad, Tim Finin, and Anupam Joshi, "Automatically Generating Government Linked Data from Tables," Working Notes of AAAI Fall Symposium on Open Government Knowledge: AI Opportunities and Challenges 4, no. 3 (2011), https://ebiquity.umbc.edu/_file_directory_/papers/582.pdf; Matthew Rowe, "Data.dcs: Converting Legacy Data into Linked Data," LDOW 628 (2010), http://ceur-ws.org/Vol-628/ldow2010_paper01.pdf.

4 Virginia Schilling, "Transforming Library Metadata into Linked Library Data," Association for Library Collections and Technical Services, September 25, 2012, http://www.ala.org/alcts/resources/org/cat/research/linked-data.

5 Getaneh Alemu et al., "Linked Data for Libraries: Benefits of a Conceptual Shift from Library-Specific Record Structures to RDF-Based Data Models," New Library World 113, no. 11/12 (2012): 549–70, https://doi.org/10.1108/03074801211282920.
6 Lisa Goddard and Gillian Byrne, "The Strongest Link: Libraries and Linked Data," D-Lib Magazine 16, no. 11/12 (2010), https://doi.org/10.1045/november2010-byrne.

7 T. Nasser and R. S. Tariq, "Big Data Challenges," Journal of Computer Engineering & Information Technology 4, no. 3 (2015), https://doi.org/10.4172/2324-9307.1000133.

8 Alexandru Adrian Tole, "Big Data Challenges," Database Systems Journal 4, no. 3 (2013): 31–40, http://dbjournal.ro/archive/13/13_4.pdf.

9 Carol Jean Godby and Karen Smith-Yoshimura, "From Records to Things: Managing the Transition from Legacy Library Metadata to Linked Data," Bulletin of the Association for Information Science and Technology 43, no. 2 (2017): 18–23, https://doi.org/10.1002/bul2.2017.1720430209.

10 Corine Deliot, "Publishing the British National Bibliography as Linked Open Data," Catalogue & Index 174 (2014): 13–18, http://www.bl.uk/bibliographic/pdfs/publishing_bnb_as_lod.pdf; Gustavo Candela et al., "Migration of a Library Catalogue into RDA Linked Open Data," Semantic Web 9, no. 4 (2017): 481–91, https://doi.org/10.3233/sw-170274; Martin Malmsten, "Exposing Library Data as Linked Data," IFLA satellite preconference sponsored by the Information Technology Section: Emerging Trends, 2009, http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.181.860&rep=rep1&type=pdf; Keri Thompson and Joel Richard, "Moving Our Data to the Semantic Web: Leveraging a Content Management System to Create the Linked Open Library," Journal of Library Metadata 13, no. 2–3 (2013): 290–309, https://doi.org/10.1080/19386389.2013.828551; Jason A. Clark and Scott W. H. Young, "Linked Data is People: Building a Knowledge Graph to Reshape the Library Staff Directory," Code4lib Journal 36 (2017), http://journal.code4lib.org/articles/12320; Martin Malmsten, "Making a Library Catalogue Part of the Semantic Web," Humboldt University of Berlin, 2008, https://doi.org/10.18452/1260.

11 R. Hastings, "Linked Data in Libraries: Status and Future Direction," Computers in Libraries 35, no. 9 (2015): 12–28, http://www.infotoday.com/cilmag/nov15/Hastings--Linked-Data-in-Libraries.shtml.

12 Mirjam Keßler, "Linked Open Data of the German National Library," in ECO4r Workshop LOD of DNB, 2010; Antoine Isaac, Robina Clayphan, and Bernhard Haslhofer, "Europeana: Moving to Linked Open Data," Information Standards Quarterly 24, no. 2/3 (2012); Carol Jean Godby and Ray Denenberg, "Common Ground: Exploring Compatibilities between the Linked Data Models of the Library of Congress and OCLC," OCLC Online Computer Library Center, 2015, https://files.eric.ed.gov/fulltext/ED564824.pdf.

13 Chunning Wang et al., "Exposing Library Data with Big Data Technology: A Review," 2016 IEEE/ACIS 15th International Conference on Computer and Information Science (ICIS), 1–6, https://doi.org/10.1109/icis.2016.7550937.

14 B. McBride, "Jena: A Semantic Web Toolkit," IEEE Internet Computing 6, no.
6 (2002): 55–59, https://doi.org/10.1109/mic.2002.1067737; Jeen Broekstra, Arjohn Kampman, and Frank van Harmelen, "Sesame: A Generic Architecture for Storing and Querying RDF and RDF Schema," in International Semantic Web Conference, ed. J. Davies, D. Fensel, and F. van Harmelen (Berlin and Heidelberg: Springer, 2002), https://doi.org/10.1002/0470858060.ch5.

15 "Apache Jena—TDB," Apache Jena, accessed August 22, 2018, https://jena.apache.org/documentation/tdb/.

16 "Sesame (framework)," Everipedia, July 15, 2016, https://everipedia.org/wiki/Sesame_(framework)/.

17 Asim Ullah et al., "BookOnt: A Comprehensive Book Structural Ontology for Book Search and Retrieval," 2016 International Conference on Frontiers of Information Technology (FIT), 211–16, https://doi.org/10.1109/fit.2016.046.

18 Tom Heath and Christian Bizer, "Linked Data: Evolving the Web into a Global Data Space," Synthesis Lectures on the Semantic Web: Theory and Technology 1, no. 1 (2011): 1–136, https://doi.org/10.2200/s00334ed1v01y201102wbe001.

19 Christian Bizer et al., "Linked Data on the Web (LDOW2008)," in Proceedings of the 17th International Conference on World Wide Web (WWW 08), 2008, 1265–66, https://doi.org/10.1145/1367497.1367760.

20 Eric Prud'hommeaux and Andy Seaborne, "SPARQL Query Language for RDF," W3C Recommendation, January 15, 2008, https://www.w3.org/TR/rdf-sparql-query/.

21 Devin Gaffney, "How to Use SPARQL," Datagov Wiki RSS, last modified April 7, 2010, https://data-gov.tw.rpi.edu/wiki/How_to_use_SPARQL.

22 Tom White, Hadoop: The Definitive Guide (Sebastopol, CA: O'Reilly Media, 2012), https://www.isical.ac.in/~acmsc/WBDA2015/slides/hg/Oreilly.Hadoop.The.Definitive.Guide.3rd.Edition.Jan.2012.pdf.

23 Dhruba Borthakur, "The Hadoop Distributed File System: Architecture and Design," Hadoop Project Website, 2007, http://svn.apache.org/repos/asf/hadoop/common/tags/release-0.16.3/docs/hdfs_design.pdf; Seema Maitrey and C. K. Jha, "MapReduce: Simplified Data Analysis of Big Data," Procedia Computer Science 57 (2015): 563–71, https://doi.org/10.1016/j.procs.2015.07.392.

24 Michael Armbrust et al., "Spark SQL: Relational Data Processing in Spark," in Proceedings of the 2015 ACM SIGMOD International Conference on Management of Data (New York: ACM, 2015), 1383–94, https://doi.org/10.1145/2723372.2742797.

25 Abdul Ghaffar Shoro and Tariq Rahim Soomro, "Big Data Analysis: Apache Spark Perspective," Global Journal of Computer Science and Technology 15, no. 1 (2015), https://globaljournals.org/GJCST_Volume15/2-Big-Data-Analysis.pdf.
26 Salman Salloum et al., "Big Data Analytics on Apache Spark," International Journal of Data Science and Analytics 1, no. 3–4 (2016): 145–64, https://doi.org/10.1007/s41060-016-0027-9.

27 Daniel J. Abadi, Samuel R. Madden, and Nabil Hachem, "Column-Stores vs. Row-Stores: How Different Are They Really?," in Proceedings of the 2008 ACM SIGMOD International Conference on Management of Data (New York: ACM, 2008), 967–80, https://doi.org/10.1145/1376616.1376712.

28 Deepak Vohra, "Apache Parquet," in Practical Hadoop Ecosystem (Berkeley, CA: Apress, 2016), 325–35, https://doi.org/10.1007/978-1-4842-2199-0_8.

29 "Google/Snappy," GitHub, January 4, 2018, https://github.com/google/snappy.

30 Jean-loup Gailly and Mark Adler, "Zlib Compression Library," 2004, https://www.repository.cam.ac.uk/bitstream/handle/1810/3486/rfc1951.txt?sequence=4.

31 Sergey Melnik et al., "Dremel: Interactive Analysis of Web-Scale Datasets," Proceedings of the VLDB Endowment 3, no. 1–2 (2010): 330–39, https://doi.org/10.14778/1920841.1920886.

32 Marcel Kornacker et al., "Impala: A Modern, Open-Source SQL Engine for Hadoop," in Proceedings of the 7th Biennial Conference on Innovative Data Systems Research, Asilomar, California, January 4–7, 2015, http://www.inf.ufpr.br/eduardo/ensino/ci763/papers/CIDR15_Paper28.pdf.

A Systematic Approach Towards Web Preservation

Muzammil Khan and Arif Ur Rahman

Muzammil Khan (muzammilkhan86@gmail.com) is Assistant Professor, Department of Computer and Software Technology, University of Swat. Arif Ur Rahman (badwanpk@gmail.com) is Assistant Professor, Department of Computer Science, Bahria University Islamabad.
ABSTRACT

The main purpose of this article is to divide the web preservation process into small, explicable stages and to design a step-by-step web preservation process that leads to a well-organized web archive. A number of research articles on web preservation projects and web archives were studied, and from them a step-by-step systematic approach for web preservation was designed. The proposed comprehensive web preservation process describes and combines the strengths of the different techniques observed during the study for preserving digital web contents in a digital web archive. For each web preservation step, different approaches and possible implementation techniques are identified that can be adopted in digital archiving. The potential value of the proposed model is to guide archivists, related personnel, and organizations in effectively preserving their intellectual digital contents for future use. Moreover, the model can help to initiate a web preservation process and to create a well-organized web archive in which the archived web contents can be managed efficiently. A section briefly describes the implementation of the proposed approach in a digital news stories preservation framework for archiving news published online by different sources.

INTRODUCTION

The amount of information generated by institutions is increasing with the passage of time. One of the mediums that carries this information is the World Wide Web (WWW). The WWW has become a tool to share information quickly with everyone regardless of physical location. The number of web pages is vast: Google and Bing each index approximately 4.8 billion.1 Though the WWW is a rapidly growing source of information, it is fragile in nature. According to the available statistics, 80 percent of pages become unavailable after one year, and 13 percent of links (mostly web references) in scholarly articles are broken after 27 months.2 Moreover, 11 percent of posts and comments on websites for various purposes are lost within a year. According to another study, conducted on 10 million web pages collected from the Internet Archive in 2001, the average survival rate of web pages is 1,132.1 days, with a standard deviation of 903.5 days; 90.6 percent of those web pages are inaccessible today.3 This fragility causes valuable scholarly, cultural, and scientific information to vanish and become inaccessible to future generations. In recent years it has been realized that the lifespan of digital objects is very short, and rapid technological changes make it even more difficult to access these objects. Therefore, there is a need to preserve the information available on the WWW. Digital preservation is performed using the primary methods of emulation and migration, in which emulation provides the preserved digital objects in their original format while migration provides the objects in a different format.4 In the last two decades, a number of institutions worldwide, such as national and international libraries, universities, and companies, started to preserve their web resources (resources found at a web server, i.e., web contents and web structure). The first web archive, the Internet Archive, was initiated in 1996 by Brewster Kahle; it holds more than 30 petabytes of data, including 279 billion web pages, 11 million books and texts, and 8 million other digital objects such as audio, video, and image files.
More than seventy web archive initiatives have been started in 33 countries since 1996, which shows the importance of web preservation projects and of preserving web contents. This information era encourages librarians, archivists, and researchers to preserve the information available online for upcoming generations. While digital resources may not replace information available in physical form, the digital versions of these information resources improve access to the available information.5 There are different aspects to the preservation process and web archiving, e.g., the ingestion of digital objects into the archive during the preservation process, the objects' formats and storage, archival management, administrative issues, access to and security of the archive, and preservation planning. These aspects need to be understood for effective web preservation and help in addressing the challenges that occur during the preservation process. The Reference Model for an Open Archival Information System (OAIS) is an attempt to provide a high-level framework for the development and comparison of digital archives. In web preservation, a challenging task is to identify the starting point of the preservation process and to complete that step effectively so that the subsequent activities can proceed. The complicated nature of the web and the complex structure of web contents make preservation of web content even more difficult. The OAIS reference model helps in achieving the goals of a preservation task in a step-by-step manner: the stakeholders (producer, management, and consumer) are identified, and the packages that need to be processed, i.e., the submission information package (SIP), the archival information package (AIP), and the dissemination information package (DIP), are clearly defined.6

This study aims to design a step-by-step systematic approach for web preservation that helps in understanding the challenges of preservation and archival activities, especially those that relate to digital information objects at the various steps of the preservation process. The systematic approach can make it easier to analyze, design, implement, and evaluate an archive with clarity and with different options for an effective preservation process and archival development. An effective preservation process is one that leads to a well-organized, easily managed web archive and meets the requirements of the designated community. The approach may also help to address the challenges and risks that confront archivists and analysts during preservation activities.

STEP-BY-STEP SYSTEMATIC APPROACH

Digital preservation is "the set of processes and activities that ensure long-term, sustained storage of, access to and interpretation of digital information."7 The growth and decline rates of WWW content and the importance of the information presented on the web make it a key candidate for preservation. Web preservation confronts a number of challenges due to the web's complex structure, the variety of available formats, and the types of information (purposes) it serves. The overall layout of the web varies from domain to domain based on the type of information and its presentation. Websites can be categorized on the basis of two things: first, the type of information (i.e., the web
Examples include educational, personal, news, e-commerce, and social networking websites, which vary a lot in their contents and structure. The variations in the overall layout make it difficult to preserve different web contents in a single web archive. The web preservation activities are summarized in figure 1. The following sections explain the web preservation activities and possible implementation in proposed systematic approach. Defining the Scope of the Web Archive The WWW provides an opportunity to share information using various services, such as blogs, social networking websites, e-commerce, wikis, and e-libraries. These websites provide information on a variety of topics and address different communities based on their interest and needs. There are many differences in the way the information is handled and presented on the WWW. In addition, the overall layout of the web changes from one domain to another domain.8 Therefore, it is not practically feasible to develop a single system to preserve all types of websites for the long term. So, before starting to preserve the web, one (the archivist) should define the scope of the web to be archived. The archive will be either a site-centric, topic-centric, or domain- centric archive.9 Site-centric Archive A site-centric archive focuses on a particular website for preservation. These types of archives are mostly initiated by the website creator or owner. The site-centric web archives allow access to the old versions of the website. Topic-centric Archive Topic-centric archives are created to preserve information on a particular topic published on the web for future use. For scientific verification, researchers need to refer to the available information while it is difficult to ensure access to these contents due to the ephemeral nature of the web. A number of topic-centric archive projects have been performed including the Archipol archive of Dutch political websites,10 the Digital Archive for Chinese Studies (DACHS) archive2,11 Minerva by the Library of Congress,12 and the French Elections Web archive for archiving the websites related to the French elections.13 Domain-centric Archive The word “domain” refers to a location, network, or web extension. A domain-centric archive covers websites published with a specific domain name DNS, using either a top-level domain (TLD), e.g., .com, .edu, or .org, or a second-level domain (SLD), e.g., .edu.pk or .edu.fr. An advantage of domain-centric archiving is that it can be created by automatically detecting specific websites. Several projects have a domain-centric scope, e.g., the Portuguese Web Archive (PWA) national websites,14 the Kulturarw, a Swedish Royal Library web archive collection of.se and .com domain websites,15 and the UK Government Web Archive collection of UK government websites, e.g., .gov.uk domain websites. Understanding the Web Structure After defining the scope of the intended web archive, the archivist will have a better understanding of the interest and expected queries of the intended community based on the resources available or the information provided by the selected domain. The focus in this step is to understand the type of information (contents) provided by the selected domain and how the information has been presented. The web can be understood by two dimensions. The first SYSTEMATIC APPROACH TOWARDS WEB PRESERVATION | KHAN AND UR RAHMAN 74 https://doi.org/10.6017/ital.v38i1.10181 Figure 1. Systematic Approach for Web Preservation Process. 
considers the web as a medium that communicates contents using various protocols, i.e., HTTP; the second considers the web as a content container, which not only holds the contents but also presents them to viewers, e.g., through the underlying technology used to display them.16 The preservation team should understand such parameters as the technical issues, future technologies, and the expected inclusion of other related content.

Identify the Web Resources

The archivist should understand the contents of the selected domain and their representation, e.g., blogs, social networking websites, institutional websites, educational institutional websites, newspaper websites, or entertainment websites. All of these websites provide different information and address individual communities that have distinct information needs. A web page is the combination of two things: web contents and web structure.17 The resources that can be preserved are as follows.

Web Contents

Web contents, or web information, can be divided into the following categories:

• Textual Contents (Plain Text): The textual information that appears on a web page. It does not include links, behaviors, or presentation stylesheets.
• Visual Contents (Images): The visual form of information, or material complementary to the information provided in textual form.
• Multimedia Contents: Another form of information, mainly audio and video; it may also include animation, or text as part of a video, or a combination of text, audio, and video.

Web Structure

Web structure can be divided into the following categories:

• Appearance (Web Layout or Presentation): The overall layout or presentation of a web page. The look and feel of a web page (the representation of the contents) is important and is maintained with different technologies, e.g., HTML or stylesheets.
• Behavior (Code Navigations): Link navigation, which can be within a website or to other websites and external document links, as well as dynamic and animated features such as live feeds, comments, tagging, or bookmarking.

Identify Designated Community

The archivist should identify the designated community of the intended web archive and carefully analyze its functional requirements and expected queries. The designated community means the potential users: those who may access the archived web contents for different purposes, e.g., accessing old information that is no longer available in normal circumstances, referring to an old news article that was not properly bookmarked, or retrieving relevant news articles published long ago.

Prioritize the Web Resources

After a comprehensive assessment of the resources of the selected domain and identification of the potential users' requirements and expected queries, the archivist should prioritize the web resources. The complexity of web resources and their representation complicates the digital preservation process. Generally, it may be undesirable or unviable to preserve all web resources; therefore, it is worthwhile to designate which web resources will be preserved.
Priority should be assigned on the basis of two things: first, the potential reuse of the resource, and second, the frequency with which the resource will be accessed. Resources with no value or little value, or those managed elsewhere, can be excluded. For prioritizing resources, the MoSCoW method can be applied.18 The acronym MoSCoW can be elaborated as follows:

M - MUST have: the resource must be preserved as part of the archive. For example, in the Digital News Story Archive (DNSA), the textual news story must be preserved, because the preservation emphasis is on the textual news story.19 Online news contains textual news stories; many stories have associated images, and a fraction have associated audio-video content.

S - SHOULD have: the resource should be preserved if at all possible. Almost all news stories have associated images, and a few have associated audio and video that complement them; these should be preserved as part of the news story in the web archive.

C - COULD have: the resource could be preserved if it does not affect anything else, i.e., it is nice to have. In DNSA, the layout of the newspaper website could be part of the preservation process if it does not affect anything else, e.g., storage capacity or system efficiency.

W - WON'T have: the resource will not be included. Archiving multiple versions of the layout or structure of the online newspaper is not worthwhile, so they would not be preserved.

Prioritizing the resources in this way is very important in web preservation planning because it avoids wasting time and energy, and it is the best way to handle users' requirements and fulfill their expected queries.

How to Capture the Resource(s)

The selection of a feasible capturing technique depends on, first, the resources to be captured and, second, the frequency of the capturing task. There are three web resource capturing techniques: by browser, by web crawler, and by authoring system. Each capturing technique has associated advantages and disadvantages.7

Web Capturing Using Browsers

The intended web content can be captured using browsers after a web page is rendered, when the HTTP transaction occurs. This technique is also referred to as a snapshot or post-rendering technique. The method captures those things that are visible to users; the behavior and other attributes remain invisible. Capturing only static contents is one of the disadvantages of capturing the web by browser, and this approach generally preserves contents in the form of images. It is best for well-organized websites, and commercial tools are available for it. The following are well-known tools for capturing the web using browsers.

WebCapture (https://web-capture.net/) is a free online web-capturing service. It is a fast web page snapshot tool that can grab web pages in seven different formats, i.e., the JPEG, TIFF, PNG, and BMP image formats, PDF, SVG, and postscript files of high quality. It also allows downloading the chosen format in a ZIP file, and it is suitable for long vertical web pages, with no distortion of the layout.

A.nnotate (http://a.nnotate.com/) is an online annotating web snapshot tool for keeping track of information gathered from the web efficiently and easily.
It allows adding tags and notes to a snapshot and building a personal index of web pages as a document index. The annotation feature can be used for multiple purposes, for example, compiling an annotated library of objects for an organization, sharing commented web pages, or comparing products.

SnagIt (https://www.techsmith.com/screen-capture.html) is a well-known snapshot tool for capturing screens, with built-in advanced image editing and screen recording. SnagIt is a commercial, advanced screen capture tool that can capture web pages with images, linked files, source code, and the URL of the web page.

Acrobat WebCapture (File > Create > PDF from Web Page...) creates a tagged PDF file from the web page a user visits, while the Adobe PDF toolbar is used for an entire website.20

The capture-by-browser technique has the following advantages:

• The archivist can capture only the displayed contents, which is an advantage when only the displayed contents need to be preserved.
• It is a relatively simple technique for well-organized websites.
• Commercial tools exist for web capturing using browsers.

Its disadvantages are the following:

• Capturing only the displayed contents is a disadvantage if the focus is not solely on the displayed contents.
• It results in frozen contents and treats contents as if they were publications.
• It loses the web structure, such as the appearance, behavior, and other attributes of the web page.

Web Capturing Using an Authoring System/Server

The authoring-system capturing technique harvests the web directly from the website's hosting server. All the contents, e.g., textual information, images, and source code, are collected from the source web server. The authoring system allows the archivist to preserve different versions of the website. It depends on the infrastructure of the content management system and is not a good choice for external resources; it is best for a server one owns, and it works well for limited internal purposes.

The Web Curator Tool (http://webcurator.sourceforge.net/), PANDAS (an old British Library harvesting tool), and NetarchiveSuite (https://sbforge.org/display/NAS/NetarchiveSuite) are known tools used for planning and scheduling web harvesting. They can be used by non-technical personnel both to select and to harvest web content under a selection policy. These web archiving tools were developed in a collaboration between the National Library of New Zealand and the British Library and are used for the UK Web Archive (http://www.ariadne.ac.uk/issue50/beresford/). The tools can interface with web crawlers such as Heritrix (https://sourceforge.net/projects/archive-crawler/). Authoring systems are also referred to as workflow systems or curatorial tools.

The authoring system has the following advantages:

• It is best for web harvesting, capturing everything available.
• It is easy to perform if you have proper access permission or you own the server or system from which the resources are captured.
• It works well for short- to medium-term resources and is feasible for internal access within an organization.

The disadvantages of web capturing using the authoring system are:

• It captures all available raw information, not only presentations.
• It may be too reliant on the authoring infrastructure or the content management system.
• It is not feasible for long-term resources or for external access from outside the organization.

Web Capturing Using Web Crawlers

Web crawlers are perhaps the most widely used technique for capturing web content in a systematic and automated manner.21 Crawler development requires expertise and experience with different tools, i.e., the strengths and weaknesses of the underlying technologies and the viability of a tool in a specific scenario. The main advantage of crawlers is that they extract embedded content. Heritrix, HTTrack, Wget, and DeepArc are common examples of web crawlers.

Heritrix (https://github.com/internetarchive/heritrix3/wiki) is an open-source, freely available web crawler written in Java and developed by the Internet Archive. Heritrix is one of the most widely used extensible, web-scale crawlers in web preservation projects. Initially developed for purpose-specific crawling of particular websites, it is now a resourceful, customizable web crawler for archiving the web.

HTTrack (https://www.httrack.com/) is a freely available, configurable website-copying utility. HTTrack crawls HTML, images, and other files from a server to a local directory and allows offline viewing of the website. It downloads a complete website from the web server to a local computer system and makes it available for offline viewing with all related link structure intact, so that browsing feels like using the live site. It also updates archived websites from the server and resumes interrupted extractions. HTTrack is available for both Windows and Linux/Unix operating systems.

Wget (http://www.gnu.org/software/wget/) is a freely available, non-interactive command-line tool that can easily be combined with other technologies and scripts. It can capture files from the web using the widely used FTP, FTPS, HTTP, and HTTPS protocols, and supports cookies as well. It also updates archived websites and resumes interrupted extractions. Wget is available for both Microsoft Windows and Unix operating systems.

The advantages of web crawling:
• It is the most widely used capturing technique.
• It can capture specific content or everything.
• It avoids some access issues, such as link rewriting and whether embedded external content is served from the archive or the live web.

Disadvantages associated with web crawling:
• Much work is required, as well as tool or development expertise and experience.
• Getting the crawl scope right is difficult: sometimes the crawler does not capture everything it should, and sometimes it captures too much content.

Web Content Selection Policy

In the previous steps, the web resources were identified and prioritized based on the requirements and expected queries of the designated community, and a feasible capturing technique was identified based on capturing frequency. Now the content needs to be prepared and filtered for selection, and a feasible selection approach chosen based on the content. A web content selection policy helps determine and clarify which web content must be captured, based on the priorities, purpose, and scope already defined.22 The selection policy decision comprises a description of the context, the intended users, the access mechanisms, and the expected uses of the archive. The selection policy may comprise the selection process and the selection approach.
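To make this concrete, a selection policy, together with the MoSCoW priorities assigned earlier, can be recorded as a small machine-readable structure that crawl jobs consult. This is an illustrative sketch only; the field names and values are hypothetical and are not taken from the cited guidelines:

```python
# Hypothetical selection policy for a news-focused web archive (illustrative only).
selection_policy = {
    "context": "national digital news archive",
    "intended_users": ["researchers", "journalists", "students"],
    "access_mechanism": "full-text search over archived stories",
    "capture_technique": "web crawler",
    "capture_frequency": "daily",
    "priorities": {                      # MoSCoW prioritization per resource type
        "news_text": "MUST",
        "images": "SHOULD",
        "audio_video": "COULD",
        "site_layout_versions": "WONT",
    },
}

def should_capture(resource_type: str) -> bool:
    """Exclude only resources explicitly marked WON'T have."""
    return selection_policy["priorities"].get(resource_type, "COULD") != "WONT"

assert should_capture("news_text") and not should_capture("site_layout_versions")
```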
The selection process can be divided into subtasks that, in combination, provide a reasonably qualitative selection of web content: preparation, discovery, and filtering, as shown in figure 2.

Figure 2. Selection Process.

The main objective of the preparation phase is to determine the targeted information space, the capture technique, capturing tools, extension categorization, granularity level, and the frequency of archiving activity. Domain experts are the personnel best placed to help with preparation, regardless of the scope of the web archive; they may be archivists, researchers, or librarians, supported by any other authentic reference, i.e., a document or a research article. The tools defined in the preparation phase help discover the intended information in the discovery phase, which draws on the following four categories of sources:

1. Hubs may be global or topical directories, collections of sites, or even a single web page with essential links related to a particular subject or topic.
2. Search engines can facilitate discovery by defining a precise query or a set of alternative queries related to a topic. Using specialized search engines can significantly improve the discovery of related information.
3. Crawlers can be used to extract web content such as textual information, images, audio, video, and links. Moreover, the overall layout of a web page, or a whole website, can be extracted in a well-defined, systematic manner.
4. External sources may be non-web sources of any kind, such as printed material or mailing lists, which can be monitored by the selection team.

The main objective of the discovery phase is to determine the sources of information to be stored in the archive. This determination can be achieved in two ways, corresponding to two discovery methods: exogenous and endogenous. In the first, a manually created entry-point list determines the entry points (usually links) for crawling; the collection is crawled manually and the list is updated during the crawl. Exogenous discovery is used in manual selection and relies mostly on exploiting an entry-point list built from hubs, search engines, and non-web documents. In the second, an automatically created entry-point list determines the entry points by extracting links automatically, yielding an updated list at every pass of the crawl. Endogenous discovery is used in automatic selection and relies on link extraction by crawlers exploring the entry-point list (a minimal sketch appears at the end of this subsection).

The main objective of the filtering phase is to refine and narrow the discovered web content (the discovery space). Filtering is important for collecting more specific web content and removing unwanted or duplicated content. Usually an automatic filtering method is used for preservation; manual filtering is useful when robots or automatic tools cannot interpret the web content. The discovery and filtering phases can be combined practically or logically. Several evaluation axes can be used for the selection policy (e.g., quality, subject, genre, and publisher). The literature describes several known techniques for selecting web content, and the selection approach can be either automatic or manual.
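Before turning to the selection approaches, here is the minimal sketch of endogenous discovery promised above. It assumes the third-party requests and beautifulsoup4 packages and simply grows an entry-point list by extracting links from each page it fetches:

```python
from urllib.parse import urljoin

import requests
from bs4 import BeautifulSoup

def discover(entry_points, limit=100):
    """Endogenous discovery: expand an entry-point list by automatic link extraction."""
    seen, frontier = set(), list(entry_points)
    while frontier and len(seen) < limit:
        url = frontier.pop(0)
        if url in seen:
            continue
        seen.add(url)
        try:
            page = requests.get(url, timeout=10)
        except requests.RequestException:
            continue  # unreachable pages are simply skipped
        for anchor in BeautifulSoup(page.text, "html.parser").find_all("a", href=True):
            frontier.append(urljoin(url, anchor["href"]))  # list updated during the crawl
    return seen

print(discover(["https://example.com/"], limit=5))
```

In practice, a production crawler such as Heritrix adds politeness rules, scope filters, and deduplication on top of this basic loop.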
Manual content selection is very rare because it is labor intensive: it requires automatic tools to find the content and then manual review of that collection to identify the subset that should be captured. Automatic selection policies are frequently used in web preservation projects for building collections, especially for web archives.23 The choice of collection approach depends on the frequency with which the web content is to be preserved in the archive. There are four different selection approaches for web content collection.

Unselective Approach

The unselective approach implies collecting everything possible: the whole website and its related domains and subdomains are downloaded to the archive. It is also referred to as automatic harvesting or selection, bulk selection, and domain selection.24 In this automatic approach, a web crawler usually performs the collection, for example, collecting the websites of a domain (i.e., .edu, meaning all educational-institution websites, at the domain level) or collecting all possible content and pages from a website by extracting the embedded links (harvesting at the website level). A section of the data preservation community believes that, technically, it is a relatively cheap, quick collection approach that yields a comprehensive picture of the web as a whole. In contrast, its significant drawbacks are that it generates huge amounts of unsorted, duplicated, and potentially useless data, consuming too many resources.

The Swedish Royal Library's project Kulturarw3, one of the first projects to adopt this approach, harvests websites at the domain level, collecting websites from the .se domain, i.e., websites physically located in Sweden.25 National web archive initiatives usually adopt the unselective approach, most notably NEDLIB, a Helsinki University Library harvester, and AOLA, an Austrian Online Archive.26

Selective Approach

The selective approach was adopted by the National Library of Australia (NLA) in the PANDAS project in 1997. In this approach, a website is included for archiving based on certain predefined strategies and on the access and information provided by the archive. The Library of Congress's project Minerva and the British Library's project "Britain on the Web" are other known projects that have adopted the selective approach. According to the NLA, selected websites are archived based on NLA guidelines after negotiation with the owners.27 The inclusion decision can be taken at one of the following levels:

• Website level: which websites should be included from a selected domain, e.g., archiving all educational websites from the top-level domain ".pk".
• Web page level: which web pages should be included from a selected website, e.g., archiving the homepages of all educational websites.
• Web content level: which types of web content should be preserved, e.g., archiving all the images from the homepages of educational websites.

A selective approach is best if the number of websites to be archived is very large, or if the archiving process targets the entire WWW and the scope must be narrowed by identifying the resources in which the archivists are most interested. This approach makes implicit or explicit assumptions about the web content that is not to be selected for preservation. It may be very helpful to initiate a pilot preservation project, which identifies: What is possible?
What can be managed? In addition, some tangible results may be obtained easily and quickly in order to broaden the scope of the project later. The selective approach may be based on predefined criteria or on an event.

A criteria-based selective approach involves selecting web resources based on various predefined sets of criteria. The NLA's guidance characterizes the criteria-based selective approach as the "most narrowly defined method" and describes it as "thematic selection." Simple or complex content-selection criteria can be defined, depending on the overall goal of preservation: for example, all resources owned by an organization; all resources of one genre, e.g., all programming blogs; resources contributing to a common subject; resources addressing a specific community within an institution, e.g., students or staff; all publications belonging to an individual organization or group of organizations; or all resources that may benefit external users or an external user community, e.g., historians or alumni.

An event-based selective approach involves selecting web resources or websites associated with time-based events. The archivists may focus on websites that address important national or international events, e.g., disasters, elections, or the football World Cup. Event-based websites have two characteristics: (1) very frequent updates, and (2) content that is lost after a short time, e.g., a few weeks or months. Institutional examples include the start and end of a term or academic year, the duration of an activity such as a research project, or the appointment or departure of a new senior official.

Deposit Approach

In the deposit collection approach, the administrator or owner of the website submits an information package that includes a copy of the website and the related files reachable through its hyperlinks. The archival information package is suitable for small collections of a few websites, or the owner of the website can initiate the preservation project; e.g., a company can initiate a project to preserve its own website. The deposit collection approach was adopted by the National Archives and Records Administration (NARA) for the collection of US federal agency websites in 2001 and by Die Deutsche Bibliothek (DDB, http://deposit.ddb.de/) for the collection of dissertations and some online publications. New digital initiatives depend heavily on administrator or owner support and should provide an easy way to deposit new content into the repository; in MacEwan University's institutional repository, for example, the librarians leading the project tried to offer an easy and effective way for depositors to submit their archival content.28

Combined Approach

There are advantages and disadvantages associated with each collection approach, and there is ongoing debate about which approach is best in a given situation (the deposit approach, for example, presupposes an inexpensive agreement with the depositors). The emphasis here is on combining the automatic-harvesting and selective approaches, which are cheaper than the other selection approaches because few staff are required to cope with the technological challenges. This initiative was taken by the Bibliothèque Nationale de France (BnF) in 2006.
The BnF automatically crawls information about updated web pages, stores it in an XML-based "site delta," and uses page relevancy and importance, similar to how Google ranks pages, to evaluate individual pages.29 The BnF used a selective approach for the deep web (that is, web pages or websites that are behind a password or are otherwise not generally accessible to search engines), referred to as "deposit track."

Metadata Identification

Cataloging is required to discover a specific item in a digital collection: an identifier or set of identifiers is required to retrieve a digital record from a digital repository or archive. For digital documents, this catalog, registration, or identifier is referred to as metadata.30 Metadata are structured information about resources that make it possible to describe, locate (discover or place), manage, easily retrieve (access), and use digital information resources. Metadata are often referred to as "data about data" or "information about information," but it may be more helpful and informative to describe these data as "descriptive and technical documentation."31 Metadata can be divided into the following three categories:

1. Descriptive metadata describes a resource for discovery and identification purposes. It may consist of elements for a document such as title, author(s), abstract, and keywords.
2. Structural metadata describes how compound objects are put together, for example, how sections are ordered to form chapters.
3. Administrative metadata imparts information to facilitate resource management, such as when and how a file was created, who can access the file, its type, and other technical information. Administrative metadata is classified into two types: (1) rights management metadata, which addresses intellectual property rights, and (2) preservation metadata, which contains the information needed to archive and preserve a resource.32

With new information technologies, digital repositories, especially web-based repositories, have grown rapidly over the last two decades. This growth has prompted the digital library community to devise metadata strategies for managing the immense amount of data stored in digital libraries.33 Metadata play a vital role in the long-term preservation of digital objects, and it is important to identify the metadata that will help retrieve a specific object from the archive after preservation. According to Duff et al., "the right metadata is the key to preserving digital objects."34

Hundreds of metadata standards have been developed over the years for different user environments, disciplines, and purposes; many of them are in their second, third, or nth edition.35 Digital preservation and archiving require metadata standards to trace digital objects and ensure access to them. Several of the common standards are briefly discussed below.

The Dublin Core Metadata Initiative (DCMI, http://dublincore.org/) was initiated at the second World Wide Web conference in 1994 and was standardized as ANSI/NISO Z39.85 in 2001 and ISO 15836 in 2003.36 The main purpose of the DCMI was to define an element set for representing web resources; initially thirteen core elements were defined, later increased to a fifteen-element set.
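As an illustration of the fifteen-element set, a minimal Dublin Core record can be generated with Python's standard library; the element values here are invented for the example, and only a handful of the fifteen elements are shown:

```python
import xml.etree.ElementTree as ET

DC = "http://purl.org/dc/elements/1.1/"  # Dublin Core element set namespace
ET.register_namespace("dc", DC)

record = ET.Element("metadata")
for element, value in [
    ("title", "Flood Hits Coastal Towns"),          # illustrative values
    ("creator", "Example News Service"),
    ("date", "2019-01-31"),
    ("type", "Text"),
    ("identifier", "https://example.com/news/123"),
]:
    ET.SubElement(record, f"{{{DC}}}{element}").text = value

print(ET.tostring(record, encoding="unicode"))
```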
The elements are optional and repeatable, may appear in any order, and are expressed in XML.37

The Metadata Encoding and Transmission Standard (METS, http://www.loc.gov/standards/mets/) is an XML metadata standard intended to represent information about complex digital objects. METS evolved from the early Making of America II ("MOA2") project in 2001, supported by the Library of Congress and sponsored by the Digital Library Federation (DLF), and was registered with the National Information Standards Organization (NISO) in 2004. A METS document contains seven major sections, each covering a different aspect of metadata.38

The Metadata Object Description Schema (MODS, http://www.loc.gov/standards/mods/) was initiated by the MARC21 maintenance agency at the Library of Congress in 2002. MODS elements are richer than DCMI, simpler than the MARC21 bibliographic format, and expressed in XML.39 MODS identifies the broadest facets or features of an object and presents nineteen high-level optional elements.40

The Visual Resources Association Core (VRA Core, http://www.loc.gov/standards/vracore/) was developed in 1996, and the current version, 4.0, was released in 2007. The VRA Core is a standard widely used by art libraries and archives for such objects as paintings, drawings, sculpture, architecture, and photographs, as well as books and decorative and performance art.41 The VRA Core contains nineteen elements and nine sub-elements.42

Preservation Metadata Implementation Strategies (PREMIS, http://www.loc.gov/standards/premis/), developed in 2005 and sponsored by the Online Computer Library Center (OCLC) and the Research Libraries Group (RLG), includes a data dictionary and accompanying information about metadata. PREMIS defines a set of five interactive core semantic units, or entities, and an XML schema for supporting digital preservation activities. It is concerned not with discovery and access but with common preservation metadata; for descriptive metadata, other standards (Dublin Core, METS, or MODS) need to be used. The PREMIS data model contains intellectual entities (content that can be described as a unit, e.g., books, articles, databases), objects (discrete units of information in digital form, which can be files, bitstreams, or any representation), agents (people, organizations, or software), events (actions that involve an object and an agent known to the system), and rights (assertions of rights and permissions).43

It is indisputable that good metadata improves access to the digital objects in a repository; the creation and selection of appropriate metadata therefore make a web archive accessible to its users. Structural metadata helps manage the archival collection internally, as well as the related services, but may not always help discover the primary source of the digital object.44 Many semi-automated metadata generation tools now exist, and their use is crucial for the future, considering the complexity of the operation and the cost of manual metadata creation.45

Archival Format

Web archive initiatives select websites for archiving based on the relevance of the content and the intended audience of the archived information.
The size of web archives varies significantly depending on their scope and the type of content they preserve, e.g., web pages, PDF documents, images, or audio and video files.46 To preserve this content, a web archive uses storage formats that carry metadata and applies data compression techniques. The Internet Archive defined the ARC format (http://archive.org/web/researcher/ArcFileFormat.php), which was later used as a de facto standard. In 2009, the International Organization for Standardization (ISO) established the WARC format (https://goo.gl/0RBWSN) as an official standard for web archiving. Approximately 54 percent of web archive initiatives applied the ARC and WARC formats for archiving. Using standard formats helps archivists facilitate the creation of collaborative tools, such as search engines and UI utilities, to manipulate the archived data efficiently.47

Information Dissemination Mechanisms

A well-defined preservation process can lead to a well-organized web archive that is easy to maintain and from which a specific digital object can easily be retrieved using information dissemination techniques. Poor search results are one of the main problems in the information dissemination of web archives: users expend excessive time retrieving the documents or information that would satisfy their queries. Archivists are more concerned with "ofness" (what collections are made up of), although archive users are concerned with "aboutness" (what collections are about).48 To use the full potential of web archives, a usable interface is needed to help the user search the archive for a specific digital object.

Full-text and keyword search are the dominant ways of searching an unstructured information repository, as online search engines make evident, and the sophistication of search results against user queries depends on the ranking tools.49 Access tools and techniques are receiving researchers' attention; approximately 82 percent of European web archives concentrate on such tools, which makes these archives easily accessible.50 The Lucene full-text search engine and its extension NutchWAX are widely used in web archiving. Moreover, by combining semantic descriptions that already exist in, or are implicit within, descriptive metadata, reasoning-based or semantic searching of the archival collection can open novel possibilities for retrieving and browsing archival content.51 Even in the current era of digital archives, mobile services are being adopted in digital libraries; access to e-books, library databases, catalogs, and text messaging are common mobile services offered in university libraries.52

In a massive repository, a user query retrieves millions of documents, which makes it difficult for users to identify the most relevant information. To overcome this problem, a ranking model estimates the relevance of results to the user's query using specified criteria and sorts the results so that the most relevant appear at the top.53 A number of ranking models exist in the literature: conventional ranking models, e.g., TF-IDF and BM25F; temporal ranking models, e.g., PageRank; and learning-to-rank models, e.g., L2R (a toy illustration follows below). The findings of the systematic approach for web preservation are used to automate the process of digital news-story preservation.
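The toy illustration promised above shows conventional TF-IDF ranking with cosine similarity. It assumes the scikit-learn package, the documents and query are invented, and it is not the search stack of any archive discussed here:

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

documents = [
    "flood damages coastal towns after heavy rain",
    "election results announced for the national assembly",
    "heavy rain expected across the coast this weekend",
]
query = ["coastal rain flooding"]

vectorizer = TfidfVectorizer()
doc_vectors = vectorizer.fit_transform(documents)
query_vector = vectorizer.transform(query)

# Rank documents by cosine similarity to the query, most relevant first.
scores = cosine_similarity(query_vector, doc_vectors)[0]
for score, doc in sorted(zip(scores, documents), reverse=True):
    print(f"{score:.3f}  {doc}")
```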
The steps of the proposed model were carefully adopted to develop a tool that can add contextual information to the stories being preserved.

DIGITAL NEWS STORIES PRESERVATION FRAMEWORK

The advancement of web technologies and the maturation of the internet attract news readers to access news online, provided by multiple sources, and to obtain the desired information comprehensively. The amount of news published online has grown rapidly, and for an individual it is cumbersome to browse all online sources for relevant news articles. News generation in the digital environment is no longer a periodic process with a fixed, single output such as a printed newspaper; news is generated instantly and updated online in a continuous fashion. For various reasons, such as the short lifespan of digital information and the speed at which information is generated, it has become vital to preserve digital news for the long term. Digital preservation includes the various actions taken to ensure that digital information remains accessible and usable for as long as it is considered important.54 Libraries and archives carefully digitize newspapers, which are regarded as a good source for knowing history, and many approaches have been developed to preserve digital information for the long term.

The lifespan of news stories published online varies from one newspaper to another, i.e., from one day to a month. Although a newspaper may be backed up and archived by the news publisher or by national archives, it will be difficult in the future to access particular information published in various newspapers about the same news story. The issues become even more complicated if a story is to be tracked through an archive of many newspapers, which requires different access technologies.

The Digital News Story Preservation (DNSP) framework was introduced to preserve digital news articles published online from multiple sources.55 The DNSP framework is planned around the proposed step-by-step systematic approach for web preservation in order to develop a well-organized web archive. Initially, the main objectives defined for the DNSP framework are:

• To initiate a well-organized, national-level digital news archive of multiple news sources.
• To normalize news articles during preservation to a common format for future use.
• To extract explicit and implicit metadata, which will help in ingesting stories into the archive and browsing the archive in the future.
• To introduce content-based similarity measures to link digital news articles during preservation.

The Digital News Story Extractor (DNSE) is a tool developed to facilitate the extraction of news stories from online newspapers and their migration to a normalized format for preservation. The normalized format also includes a step that adds metadata to the Digital News Stories Archive (DNSA) for future use.56 To make news articles preserved from multiple sources accessible, mechanisms need to be adopted for linking the archived digital news articles. An effective term-based approach, the Common Ratio Measure for Stories (CRMS), was introduced to link similar news articles in the DNSA during the preservation process.57 The approach was empirically analyzed, and its results were compared to obtain conclusive arguments (a generic term-overlap sketch follows below).
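For illustration only, the sketch below links stories whose term-overlap ratio exceeds a fixed threshold. This is not the authors' CRMS definition, which, along with its empirically derived threshold, is specified in the cited work; the tokenization and cut-off value here are invented:

```python
def term_overlap(story_a: str, story_b: str) -> float:
    """Share of distinct terms two stories have in common (Jaccard-style, illustrative)."""
    terms_a, terms_b = set(story_a.lower().split()), set(story_b.lower().split())
    if not terms_a or not terms_b:
        return 0.0
    return len(terms_a & terms_b) / len(terms_a | terms_b)

THRESHOLD = 0.2  # hypothetical cut-off, not the value derived in the cited work

def link_stories(stories: list[str]) -> list[tuple[int, int]]:
    """Return index pairs of stories considered similar enough to link."""
    return [
        (i, j)
        for i in range(len(stories))
        for j in range(i + 1, len(stories))
        if term_overlap(stories[i], stories[j]) >= THRESHOLD
    ]

print(link_stories([
    "floods hit the coastal towns overnight",
    "overnight floods damage coastal towns",
    "parliament passes the annual budget",
]))
```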
The initial results, computed automatically using the common ratio measure for stories, are encouraging and were compared with similarity judgments of news articles made by humans. The results were generalized by defining a threshold value based on multiple experimental results using the proposed approach.

Currently, work is ongoing to extend the scope of the DNSA to two languages, Urdu and English, along with content-based similarity measures to link news articles published across Urdu and English. Moreover, research is underway to develop tools that exploit the linkage created among stories during the preservation process for search and retrieval tasks.

SUMMARY

Effective strategic planning is critical in creating web archives; hence, it requires a well-understood and well-planned preservation process. The process should result in a well-organized web archive that includes not only the content to be preserved but also the contextual information required to interpret the content. The study attempts to answer many questions that guide archivists and related personnel, such as: How should the web preservation process be led effectively? How should the preservation process be initiated? How should one proceed through the different steps? What techniques may help to create a well-organized web archive? How can the archived information be used to its greatest potential? To answer these questions, the study resulted in an appropriate step-by-step process for web preservation and a well-organized web archive. The targeted goal of each step is identified by researching the existing approaches that can be adopted, and the possible techniques for those approaches are discussed in detail for each step.

REFERENCES

1. "World Wide Web Size," The size of the World Wide Web, visited on Jan 31, 2019, http://www.worldwidewebsize.com/.
2. Brian F. Lavoie, "The Open Archival Information System Reference Model: Introductory Guide," Microform & Imaging Review 33, no. 2 (2004): 68-81; Alexandros Ntoulas, Junghoo Cho, and Christopher Olston, "What's New on the Web? The Evolution of the Web from a Search Engine Perspective," in Proceedings of the 13th International Conference on World Wide Web (New York, NY: ACM, 2004), 1-12.
3. Teru Agata et al., "Life Span of Web Pages: A Survey of 10 Million Pages Collected in 2001," IEEE/ACM Joint Conference on Digital Libraries (IEEE, 2014), 463-64, https://doi.org/10.1109/JCDL.2014.6970226.
4. Timothy Robert Hart and Denise de Vries, "Metadata Provenance and Vulnerability," Information Technology and Libraries 36, no. 4 (Dec. 2017): 24-33, https://doi.org/10.6017/ital.v36i4.10146.
5. Claire Warwick et al., "Library and Information Resources and Users of Digital Resources in the Humanities," Program 42, no. 1 (2008): 5-27, https://doi.org/10.1108/00330330810851555.
6. Lavoie, "Open Archival Information System Reference Model."
7. Susan Farrell, K. Ashley, and R. Davis, "A Guide to Web Preservation," Practical Advice for Web and Records Managers Based on Best Practices from the JISC-Funded PoWR Project (2010), https://jiscpowr.jiscinvolve.org/wp/files/2010/06/Guide-2010-final.pdf.
8. Lavoie, "Open Archival Information System Reference Model"; Farrell, Ashley, and Davis, "Guide to Web Preservation."
9. Peter Lyman, "Archiving the World Wide Web," Washington, Library of Congress (2002), https://www.clir.org/pubs/reports/pub106/web/.
10. Diomidis Spinellis, "The Decay and Failures of Web References," Communications of the ACM 46, no. 1 (2003): 71-77, https://dl.acm.org/citation.cfm?doid=602421.602422.
11. Digital Archive for Chinese Studies (DACHS), https://www.zo.uni-heidelberg.de/boa/digital_resources/dachs/index_en.html, visited on Jan 31, 2019.
12. Julien Masanès, "Web Archiving Methods and Approaches: A Comparative Study," Library Trends 54, no. 1 (2005): 72-90, https://doi.org/10.1353/lib.2006.0005.
13. Hanno Lecher, "Small Scale Academic Web Archiving: DACHS," in Web Archiving (Berlin/Heidelberg: Springer, 2006), 213-25, https://doi.org/10.1007/978-3-540-46332-0_10.
14. Daniel Gomes et al., "Introducing the Portuguese Web Archive Initiative," in 8th International Web Archiving Workshop (Berlin/Heidelberg: Springer, 2009).
15. Gerrit Voerman et al., "Archiving the Web: Political Party Web Sites in the Netherlands," European Political Science 2, no. 1 (2002): 68-75, https://doi.org/10.1057/eps.2002.51.
16. Sonja Gabriel, "Public Sector Records Management: A Practical Guide," Records Management Journal 18, no. 2 (2008), https://doi.org/10.1108/00242530810911914.
17. Farrell, Ashley, and Davis, "Guide to Web Preservation."
18. Jung-ran Park and Andrew Brenza, "Evaluation of Semi-Automatic Metadata Generation Tools: A Survey of the Current State of the Art," Information Technology and Libraries 34, no. 3 (Sept. 2015): 22-42, https://doi.org/10.6017/ital.v34i3.5889.
19. Muzammil Khan and Arif Ur Rahman, "Digital News Story Preservation Framework," in Digital Libraries: Providing Quality Information: 17th International Conference on Asia-Pacific Digital Libraries, ICADL 2015, Seoul, Korea, December 9-12, 2015, Proceedings, vol. 9469 (Springer, 2015), 350-52, https://doi.org/10.1007/978-3-319-27974-9; Muzammil Khan, "Using Text Processing Techniques for Linking News Stories for Digital Preservation," PhD thesis, Faculty of Computer Science, Preston University Kohat, Islamabad Campus, HEC Pakistan, 2018.
20. Dennis Dimick, "Adobe Acrobat Captures the Web," Washington Apple Pi Journal (1999): 23-25.
21. Trupti Udapure, Ravindra D. Kale, and Rajesh C. Dharmik, "Study of Web Crawler and Its Different Types," IOSR Journal of Computer Engineering (IOSR-JCE) 16, no. 1 (2014): 01-05, https://doi.org/10.9790/0661-16160105.
22. Dora Biblarz et al., "Guidelines for a Collection Development Policy Using the Conspectus Model," International Federation of Library Associations and Institutions, Section on Acquisition and Collection Development (2001).
23. Farrell, Ashley, and Davis, "Guide to Web Preservation"; E. Pinsent et al., "PoWR: The Preservation of Web Resources Handbook," http://jisc.ac.uk/publications/programmerelated/2008/powrhandbook.aspx (2010); Michael Day, "Preserving the Fabric of Our Lives: A Survey of Web Preservation Initiatives," Lecture Notes in Computer Science (Berlin/Heidelberg: Springer, 2003): 461-72, https://doi.org/10.1007/978-3-540-45175-4_42.
24. Pinsent et al., "PoWR"; Day, "Preserving the Fabric."
25. Allan Arvidson, "The Royal Swedish Web Archive: A Complete Collection of Web Pages," International Preservation News (2001): 10-12.
26. Andreas Rauber, Andreas Aschenbrenner, and Oliver Witvoet, "Austrian Online Archive Processing: Analyzing Archives of the World Wide Web," Research and Advanced Technology for Digital Libraries: ECDL 2002,
Lecture Notes in Computer Science, vol. 2458 (Berlin/Heidelberg: Springer, 2002), 16-31, https://doi.org/10.1007/3-540-45747-X_2.
27. William Arms, "Collecting and Preserving the Web: The Minerva Prototype," RLG DigiNews 5, no. 2 (2001).
28. Sonya Betz and Robyn Hall, "Self-Archiving with Ease in an Institutional Repository: Micro Interactions and the User Experience," Information Technology and Libraries 34, no. 3 (Sept. 2015): 43-58, https://doi.org/10.6017/ital.v34i3.5900.
29. Serge Abiteboul et al., "A First Experience in Archiving the French Web," in International Conference on Theory and Practice of Digital Libraries (Berlin/Heidelberg: Springer, 2002), 1-15, https://doi.org/10.1007/3-540-45747-X_1; Sergey Brin and Lawrence Page, "Reprint of: The Anatomy of a Large-Scale Hypertextual Web Search Engine," Computer Networks 56, no. 18 (2012): 3825-33, https://doi.org/10.1016/j.comnet.2012.10.007.
30. Masanès, "Web Archiving."
31. NISO Press, "Understanding Metadata," National Information Standards (2004), http://www.niso.org/publications/understanding-metadata.
32. Ibid.
33. Jane Greenberg, "Understanding Metadata and Metadata Schemes," Cataloging & Classification Quarterly 40, no. 3-4 (2009): 17-36, https://doi.org/10.1300/J104v40n03_02.
34. Michael Day, "Preservation Metadata Initiatives: Practicality, Sustainability, and Interoperability," Archivschule Marburg (2004): 91-117.
35. Jenn Riley, Glossary of Metadata Standards (2010).
36. Corey Harper, "Dublin Core Metadata Initiative: Beyond the Element Set," Information Standards Quarterly 22, no. 1 (2010): 20-31.
37. Jane Greenberg, "Dublin Core: History, Key Concepts, and Evolving Context (Part One)," slide presentation, DC-2010 International Conference on Dublin Core and Metadata Applications, Pittsburgh, PA (2010).
38. Morgan V. Cundiff, "An Introduction to the Metadata Encoding and Transmission Standard (METS)," Library Hi Tech 22, no. 1 (2004): 52-64, https://doi.org/10.1108/07378830410524495; Leta Negandhi, "Metadata Encoding and Transmission Standard (METS)," in Texas Conference on Digital Libraries, TCDL-2012 (2012).
39. Sally H. McCallum, "An Introduction to the Metadata Object Description Schema (MODS)," Library Hi Tech 22, no. 1 (2004): 82-88, https://doi.org/10.1108/07378830410524521.
40. R. Gartner, "MODS: Metadata Object Description Schema," JISC Techwatch Report TSW (2003): 03-06, www.loc.gov/standards/mods/.
41. VRA Core, "An Introduction of VRA Core," http://www.loc.gov/standards/vracore/VRA Core4 Intro.pdf, created Oct 2014.
42. VRA Core, "VRA Core Element Outline," http://www.loc.gov/standards/vracore/VRA Core4 Outline.pdf, created Feb 2007.
43. Priscilla Caplan, "Understanding PREMIS," Washington, DC: Library of Congress (2009), https://www.loc.gov/standards/premis/understanding-premis.pdf; J. Relay, "An Introduction to PREMIS," Singapore iPRES Tutorial (2011), http://www.loc.gov/standards/premis/premistutorial iPRES2011 singapore.pdf.
44. Jennifer Schaffner, "The Metadata Is the Interface: Better Description for Better Discovery of Archives and Special Collections, Synthesized from User Studies," Making Archival and Special Collections More Accessible (2015), 85.
45. Joao Miranda and Daniel Gomes, "Trends in Web Characteristics," in Web Congress, 2009, LA-WEB'09, Latin American (IEEE, 2009), 146-53, https://doi.org/10.1109/LA-WEB.2009.28.
46. Daniel Gomes, João Miranda, and Miguel Costa, "A Survey on Web Archiving Initiatives," Research and Advanced Technology for Digital Libraries (2011): 408-20, https://doi.org/10.1007/978-3-642-24469-8_41.
47. Ibid.
48. Schaffner, "Metadata Is the Interface."
49. Miguel Costa and Mário J. Silva, "Evaluating Web Archive Search Systems," in International Conference on Web Information Systems Engineering (Berlin/Heidelberg: Springer, 2012), 440-54, https://doi.org/10.1007/978-3-642-35063-4_32.
50. Foundation, I., "Web Archiving in Europe," technical report, CommerceNet Labs (2010).
51. Georgia Solomou and Dimitrios Koutsomitropoulos, "Towards an Evaluation of Semantic Searching in Digital Repositories: A DSpace Case-Study," Program 49, no. 1 (2015): 63-90, https://doi.org/10.1108/PROG-07-2013-0037.
52. Yan Quan Liu and Sarah Briggs, "A Library in the Palm of Your Hand: Mobile Services in Top 100 University Libraries," Information Technology and Libraries 34, no. 2 (June 2015): 133, https://doi.org/10.6017/ital.v34i2.5650.
53. Ricardo Baeza-Yates and Berthier Ribeiro-Neto, Modern Information Retrieval (New York: ACM Press, 1999), 463.
54. Daniel Burda and Frank Teuteberg, "Sustaining Accessibility of Information through Digital Preservation: A Literature Review," Journal of Information Science 39, no. 4 (2013): 442-58, https://doi.org/10.1177/0165551513480107.
55. Muzammil Khan et al., "Normalizing Digital News-Stories for Preservation," in Digital Information Management (ICDIM), 2016 Eleventh International Conference on (IEEE, 2016), 85-90, https://doi.org/10.1109/ICDIM.2016.7829785.
56. Khan et al., "Normalizing Digital News."
57. Muzammil Khan, Arif Ur Rahman, and M. Daud Awan, "Term-Based Approach for Linking Digital News Stories," in Italian Research Conference on Digital Libraries (Cham, Switzerland: Springer, 2018), 127-38, https://doi.org/10.1007/978-3-319-73165-0_13.

Primo New User Interface: Usability Testing and Local Customizations Implemented in Response

Blake Lee Galbreath, Corey Johnson, and Erin Hvizdak

Blake Lee Galbreath (blake.galbreath@wsu.edu) is Core Services Librarian, Corey Johnson (coreyj@wsu.edu) is Instruction and Assessment Librarian, and Erin Hvizdak (erin.hvizdak@wsu.edu) is Reference and Instruction Librarian, Washington State University.

ABSTRACT

Washington State University was the first library system of its 39-member consortium to migrate to the Primo New User Interface. Following this migration, we conducted a usability study in July 2017 to better understand how our users fared when the new user interface deviated significantly from the classic interface. From this study, we learned that users had little difficulty using basic and advanced search, signing into and out of Primo, and navigating their accounts. In other areas, where the difference between the two interfaces was more pronounced, study participants experienced more difficulty. Finally, we present customizations to the design of the interface implemented at Washington State University to help alleviate the observed issues.

INTRODUCTION

A July 2017 usability study by Washington State University (WSU) Libraries was the final segment of a six-month process for migrating to the new user interface of Ex Libris Primo, called Primo New UI. WSU Libraries assembled a working group in December 2016 to plan for the migration from the classic interface to Primo New UI; the group met biweekly through May 2017.
To start, the Primo New UI working group attempted to answer some baseline questions: What can and cannot be customized in the new interface? How, and according to what timeline, should we introduce the new interface to our library patrons? What methods could be used to assess the new interface? The working group customized the look and feel of the new interface to conform to WSU branding and then released a beta version of Primo New UI in March, leaving the older interface (Primo Classic) as the primary means of access to Primo but allowing users to enter and test the beta version. In early May (at the start of the Summer semester), the prominence of the old and new interfaces was reversed, making Primo New UI the default interface while leaving Primo Classic accessible. The older interface was removed from public access in mid-August, just prior to the start of the Fall semester. The public thus had the opportunity to work with the beta version from March to May and then two months of experience with the production release by the time the usability study took place in July 2017. The remainder of this paper focuses on the details of this usability study.

RESEARCH QUESTIONS

Primo New UI is the name given to the new front end of the Primo discovery layer, which was made available to customers in August 2016. According to Ex Libris, "Its design is based on user studies and feedback to address the different needs of different types of users."1 We were primarily interested in understanding the usability of the essential functionalities of Primo New UI, especially where the design of the new interface deviated significantly from the classic interface (taking local customizations into account). For example, we noted that the new interface introduced the following differences to the user (this ordinal list corresponds to the number labels in figure 1):

1. Basic Search tabs were expressed as drop-downs.
2. The Advanced Search link was less prominent than it had been with our customized shape and color in the classic interface.
3. Main Menu items were located in a separate area from the Sign In and My Account links.
4. The My Favorites and Help/Chat icons were located together and in a new section of the top navigation bar.
5. The Sign In and My Account links were hidden beneath a "Guest" label.
6. Facet values were no longer associated with checkboxes or underlined on hover.
7. Availability statuses were expressed through colored text.

Figure 1. Basic search screen in Primo New UI.

We also observed a fundamental change in the structure of the record in Primo New UI: the horizontally oriented and tabbed structure of the classic record (see figure 2) was converted to a vertically oriented, non-tabbed structure in the new interface (see figure 3). Additionally, the tabbed structure of the classic interface opened in a frame of the Brief Results area, while the same information is displayed on the Full Display page of the new interface. The options displayed in these areas are known as Get It and View It (although we locally branded our sections Availability and Request Options and Access Options, respectively).
Therefore, we were eager to see how this change in layout might affect a participant's ability to find Get It and View It information on the Full Display page. Taking the above observations into account, we formulated the following questions:

1. Will the participant be able to find and use the Basic Search functionality?
2. Will the participant be able to understand the availability information of the brief results?
3. Will the participant be able to find and use the Sign In and Sign Out features?
4. Will the participant be able to understand the behavior of the facets?
5. Will the participant be able to find and use the Actions Menu? (See the "Send to" boxed area in figure 3.)
6. Will the participant be able to navigate the Get It and View It areas of the Full Display page? (See the "Availability and Request Options" boxed area in figure 3.)
7. Will the participant be able to navigate the My Account area?
8. Will the participant be able to find and use the Help/Chat and My Favorites icons?
9. Will the participant be able to find and use the Advanced Search functionality?
10. Will the participant be able to find and use the Main Menu items? (See figure 1, number 3.)

Figure 2. Horizontally oriented and tabbed layout of Primo Classic.

LITERATURE REVIEW

The year 2012 witnessed a flurry of studies involving Primo Classic. Majors compared the experiences of users of the following discovery interfaces: Encore Synergy, Summon, WorldCat Local, Primo Central, and EBSCO Discovery Service. The study used undergraduate students enrolled at the University of Colorado and focused on common undergraduate searching activities. Each interface was tested by five or six participants, who also completed an exit survey. Observations specific to the Primo interface noted that users had difficulty finding and using existing features, such as email and e-shelf, and difficulty connecting their failed searches to interlibrary loan functionality.2

Figure 3. Vertically oriented and non-tabbed layout of Primo New UI.

Comeaux noted issues relating to terminology and the display of services during usability testing carried out at Tulane University. Twenty people, including undergraduates, graduates, and faculty members, participated in this study, which tested five typical information-seeking scenarios. The study found several problems related to terminology: for example, participants did not fully understand the meaning of the Expand My Results functionality,3 nor did they understand that the display text "No full-text" could be used to order an item via interlibrary loan.4 The study also concluded that the mixed presentation of differing resource types (e.g., books, articles, reviews) was confusing for patrons attempting known-item searches.5

Jarrett documented a usability study conducted at Flinders University Library. The aims of the study were to determine user perceptions of the usability of the discovery layer, the relevance of the information retrieved, and users' experiences of this search interface compared to other interfaces.6 The usability portion of the study scored the participants' completion of tasks in the Primo discovery layer as difficult, confusing, neutral, or straightforward. Scores indicated that participants had difficulty determining different editions of a book, locating a local thesis, and placing an item on hold.
The investigators also observed that students had issues signing into Primo and distinguishing between journals and journal articles.7

Nichols et al. conducted a usability test of a newly implemented Primo instance at the University of Vermont Libraries in 2012. Their research questions were designed to assess Primo's design, functionality, and layout.8 The majority of the participants were undergraduate students. As in Comeaux's study, confusion occurred when participants had to find specific or relevant records within longer sets of results.9 Nichols et al. also noticed that test subjects had difficulty navigating and finding information in the Primo tabbed structure, and, like Jarrett, they noted that participants had difficulty distinguishing between journals and articles.10 As in Majors's study, participants had difficulty finding certain Primo functionality, such as email, the e-Shelf, and the feature to open items in a new window.11 The investigators concluded that these tools were difficult to find because they were buried too deep in the interface.

The University of Kansas Libraries conducted two usability studies of Primo. The first took place during the 2012-13 academic year and involved 27 participants, including undergraduate, graduate, and professional students, who performed four to five main tasks in two separate sessions. Similar to other studies, participants experienced great difficulty using the Save to E-shelf and Email Citation tools.12 Kliewer et al. conducted the second usability study in 2016, focusing primarily on student satisfaction with the Primo discovery tool. Thirty undergraduates participated in this study, which collected both qualitative and quantitative data. In contrast to most usability studies of discovery services, this study allowed participants to explore Primo with open-ended searches to more closely mimic natural searching strategies. Results indicated that the participants preferred Basic Search to Advanced Search, used facets (but not enough to maximize their searching potential), rarely moved beyond the first page of search results, and experienced difficulties using the link resolver. In response to the latter, a Primo working group clarified language on the link resolver page to better differentiate between links to articles and links to journals.13

Brett, Lierman, and Turner conducted a usability study at the University of Houston Libraries focusing primarily on undergraduate students. Users were able to complete the assigned tasks, but the majority did not do so in the most efficient manner; that is, the participants did not take full advantage of Primo functionality such as facets, holds, and recalls. Additionally, some participants exhibited difficulty distinguishing among the terms journals, journal articles, and newspaper articles. Another difficulty was knowing what further steps to take once an item had been found in the results list: for example, participants had trouble locating stacks guides, finding request features, and using call numbers. The researchers concluded that many of the issues witnessed in this usability study could be mitigated via library instruction.14

Usability testing of Primo New UI has recently begun to gain a foothold in academic libraries.
In addition to conducting usability testing of Primo Classic in April 2015 (five participants, five to six tasks), researchers at Boston University carried out both pre- and post-launch testing of the new interface in December 2016 and April 2017, respectively. Pre-launch testing with five student participants identified issues with "labelling, locating links to online services, availability statement links in full results, [and] My Favorites."15 After fixes were completed, post-launch testing with four students (two infrequent users, two frequent) found that they were able to easily complete tasks, use filters, save results, and find links to online resources. Usage statistics for the new interface, compared to the classic, also showed increased use of facets after the fixes, as well as an increase in the use of some features and a decrease in the use of others, indicating which features warranted further examination.16

California State University (CSU) libraries conducted usability studies of Primo New UI with 24 participants (undergraduate students, graduate students, and faculty) across five CSU campuses. Five standard tasks were required: find a specific book, find a specific film, find a peer-reviewed journal article, find an item in the CSU network not owned locally, and find a newspaper article. Each campus added questions based on local needs. Participants were overwhelmingly positive about the interface's look and feel, ease of use, and speed. The success rate for each task varied across the campuses, with participants having greater success on simple tasks, such as finding a specific or known item, and mixed results on more difficult tasks, including using scopes, understanding icons and elements of the FRBR record, and using facets. Steps were taken to relabel and rearrange the scopes and facets so that they were more meaningful to users, and the FRBR icons were replaced. The authors concluded that Primo is an ideal solution for incorporating both global changes and local preferences because of its customizability.17

University of Washington Libraries conducted usability studies of both the classic and new Primo interfaces. The Primo New UI study observed 12 participants. Each 60-minute session included an orientation, pre- and post-tests, tasks, and follow-up questions. Difficulties were noted with terminology, the site logo, the inability to select multiple facets, unclear navigation, volume requesting, Advanced Search logic, the pin location in item details, and the date facet. A/B testing with 12 participants (from both the New and Classic UI studies) revealed the need to fix the Sign-In prompt for My Favorites, enable libraries to add custom actions to the Actions Menu, add a sort option for favorites in the new interface, add the ability to rearrange elements on a single item page, and add Zotero support. Overall, participants preferred the new interface. Generally, participants easily completed basic tasks, such as known-item searches, searches for course reserves, and open searches, but had more difficulty with article subject searching, audio/visual subject searching, and print-volume searching; for student participants this pattern was consistent from the classic to the new interface.18

METHOD

We conducted a diagnostic usability evaluation of Primo New UI with eight participants, whom we recruited from the WSU faculty, staff, and student populations.
In the end, we received a skewed distribution among the categories: three members of staff and five students (two undergraduate and three graduate students). The initial composition of the participants comprised a greater number of undergraduate students, but substitution created the final makeup. All the study participants had some exposure to Primo Classic in the past. We recruited participants by hanging flyers around the libraries of our Pullman campus and the adjoining student commons area, and we offered participants $15 in exchange for their time, which we advertised as a maximum of one hour.

The usability test was designed by a team of three library staff, one from Systems (IT) and two from Research Services (reference/instruction). Two of us were present at each session, one to read the tasks aloud and the other to document the session. We used Camtasia to record each session so that we could return to it later to verify our notes or other specifics of the session. We stored the recordings on a secured share of the internal library drive. We received an Institutional Review Board Certificate of Exemption (IRB #16190) to conduct this study.

The usability test comprised eleven tasks (see appendix A) designed to test the research questions described above. The tasks were drafted in consultation with the Ex Libris set of recommendations for conducting Primo usability testing.19 Each investigator drew their own conclusions as to the participants' successes and failures. We then met as a group to form a consensus regarding task success and failure (see appendix B), to discuss the patterns that emerged, and to formulate remedies to problems we perceived as hindering student success.

RESULTS

For each of the ten research questions below, consult appendix B for details on the associated tasks and how each participant approached and completed them.

Task set(s) related to research question 1: Will the participant be able to find and use the Basic Search functionality?

This was one of the easier tasks for the participants to complete. Some participants did not follow the task literally to find their favorite book or movie, but rather searched for an item or topic of interest to them. All the participants completed this task successfully.

Task set(s) related to research question 2: Will the participant be able to understand the availability information of the brief results?

The majority of the participants understood that the availability text and its color represented important access information. However, there were instances where the color of the availability status conflicted with its text, which led at least one participant to evaluate the availability of a resource incorrectly.

Task set(s) related to research question 3: Will the participant be able to find and use the Sign In and Sign Out features?

All the participants completed this task successfully. Participants used multiple methods to sign in: the Guest link in the top navigation bar, the Sign In link in the ellipsis Main Menu item, and the Get It Sign In link on the Full Display page. All participants signed out via the User link in the top navigation bar.

Task set(s) related to research question 4: Will the participant be able to understand the behavior of the facets?

Almost all of the participants were able to select the Articles facet without issue.
One person, however, misunderstood the include behavior of the facets. Instead of using the include behavior, this participant used the exclude behavior to remove all facets other than the Articles facet. Only two participants attempted to use the Print Books facet to complete the task, “From the list of results, find a print book that you would need to order from another library.” Instead, the other 75 percent simply scanned the list of results to find the same information. Five of the eight participants attempted to find the Peer-Reviewed facet when completing the task to choose any peer-reviewed article from a results list: three were successful, while one selected the Newspaper Articles facet and another selected the Reviews facet.

Task set(s) related to research question 5: Will the participant be able to find and use the Actions Menu?

The tasks related to the Actions Menu (copy a citation and email a record) were some of the most difficult for the participants: two were successful, three had some difficulty, and three were unsuccessful. Of those who experienced difficulty, one seemed not to understand the task fully; this participant found and copied the citation, but then spent additional time looking for a “clipboard.” The other two participants were both distracted by competing areas of interest: the Citations section of the Full Display and the section headings of the Full Display. Of those who were unsuccessful, one encountered a technical issue that Ex Libris needed to resolve (the functionality to expand the list of action items failed), one did not seem to understand what a citation was when they found it, and another could not find the email functionality. This last participant continued searching in the ellipsis area of the Main Menu, in the My Account area, and in the facets, but ultimately never found the Email icon in the scrolling section of the Actions Menu.

Task set(s) related to research question 6: Will the participant be able to navigate the Get It and View It areas of the Full Display page?

Three participants experienced substantial difficulty in completing this set of tasks. These participants were distracted by the styled Show Libraries and Stack Chart buttons on the Full Display page, which competed for attention with the requesting options.

Task set(s) related to research question 7: Will the participant be able to navigate the My Account area?

All of the participants completed this task successfully. Four participants located the back-arrow icon to exit the My Account area, while the other four participants used alternate methods: the library logo, the New Search button, or signing out of Primo.

Task set(s) related to research question 8: Will the participant be able to find and use the Help/Chat and My Favorites icons?

Participants encountered very little difficulty in finding a way to procure help and chat with a librarian, with one exception. Participant 2 immediately navigated to and opened our Help/Chat icon, but then moved away from this service because it opened in a new tab. This same participant, along with three others, had a more difficult time finding and deciding to use the Pin this Item icon than did the three participants who completed the same task with ease. The remaining participant failed to complete this task because they could not find the My Favorites area of Primo.
Task set(s) related to research question 9: Will the participant be able to find and use the Advanced Search functionality?

One participant had more trouble finding the Advanced Search functionality than the other seven. Another experienced a technical difficulty, in which the Primo screen froze during the experiment, and we had to begin the task anew. The remaining six participants easily finished the tasks.

Task set(s) related to research question 10: Will the participant be able to find and use the Main Menu items?

The majority of the participants completed this task with ease, navigating to the Databases link in the Main Menu items. One participant, however, was confused by the term database but was able to succeed once we provided a brief definition of the term. The remaining two participants were also confused by the term and instead entered general search terms into the Primo search bar. These two participants failed to find the list of databases.

DISCUSSION

Study participants completed four of our task sets with relative ease: using Basic Search (see research question 1 above), signing into and out of Primo (see research question 3 above), navigating their My Account area (see research question 7 above), and using Advanced Search (see research question 9 above). There was one exception: one participant experienced minor trouble finding the Advanced Search link, checking first among the drop-down options on our Basic Search page. Subsequent to and unrelated to this study, WSU elected to eliminate the first set of drop-down options from our Primo landing page. Further testing might tell us whether this reduction in the number of drop-down options has made the Advanced Search link more prominent for users. Also, the ease with which participants were able to use items located underneath the “Guest” label contradicted our expectations. We predicted that this opacity would cause users issues, but it did not seem to deter them. From this, we concluded that the placement of the sign-in options in the upper right corner is sufficient to maintain continuity.

Participants encountered a moderate degree of difficulty completing two task sets: determining availability statuses and navigating the Get It area of the Full Display page. Concerning availability, participants were quick to understand that statuses such as “Check holdings” indicated that the item was not available. The participants were also quick to notice that green availability statuses implied access while non-green availability statuses implied non-access. However, per the design of the new interface, certain non-green links became green after opening the Full Display page of Primo. This was a significant deviation from the classic interface, where colors indicating availability status did not change. This design element misled one participant. Of note, we did not observe participants experiencing issues with the converted format of the Get It and View It areas (see figures 2 and 3) per se. However, we did notice that three of our participants were unnecessarily distracted by the Show Libraries link when trying to find resource sharing options because WSU had previously styled the Show Libraries links with color and shape. Therefore, our local branding in this area impeded usability and led us to rethink the hierarchy of actions on the Full Display page.
Consistent with comments reported by DeMars, our study participants remarked that the layout of the Full Display was cluttered and difficult to read.20 We therefore took steps to make this page more readable for the viewer.

Study participants displayed the greatest difficulty completing the remaining four task sets: selecting a Main Menu item, refining a search via the facets, using the Actions Menu, and navigating the My Favorites functionality. However, web design was not necessarily the culprit in all four areas. Three participants experienced difficulty finding the Databases link (a Main Menu item). After further discussion, it became apparent that this trouble related not to usability but to information literacy: they did not understand the term databases. Therefore, like Majors and Comeaux,21 we recognize the recurring issue of library jargon, and like Brett, Lierman, and Turner,22 we believe that this issue would best be mitigated via library instruction.

In agreement with the literature, two participants selected the incorrect facet because they had difficulty distinguishing among the terms articles, newspaper articles, reviews, and peer-reviewed.23 Further, one of these participants experienced even more difficulty because they did not understand the inherent functionality of the facet values. That is, this participant did not grasp that the facet value links performed an inclusion process by default. On the contrary, this person believed that they had to exclude all unwanted facet values to arrive at the wanted facet value. The change in facet behavior between the classic and new interfaces likely caused this confusion. In Primo Classic, WSU had installed a local customization that provided checkboxes and underlining upon hover for each facet value. The new interface provided neither of these clues to the user. Additionally, we observed, similar to Kliewer et al. and Brett, Lierman, and Turner, that participants oftentimes preferred to scan the results list rather than refine their search via faceting.24 This finding also matches a 2014 Ex Libris user study indicating that users are easily confused by too many interface options and thus tend to ignore them.25

Regarding the Actions Menu, the majority of the participants attempted to find the Email icon in the correct section of the Full Display page (i.e., the “Send To” section). However, because of a technical issue in the design of the new interface, the Email icon was not always present for the participant to find. For others, it was difficult to reach the icon even when it was present, as participants had to click the right arrow three to four times to navigate past all the citation manager icons. This observed difficulty in finding existing functionalities in Primo echoes that cited by Majors and Nichols et al.26 Participants also experienced significant difficulty distinguishing between the similarly named functionalities of the Citation icon and the Citations section of the Full Display page. As a result of this observed difficulty, we concluded that differentiating sections of the page with distinct naming conventions would be beneficial to users.
Like the results reported by Boston University, our study participants encountered significant issues when trying to save items into their My Favorites list.27 We noticed that participants had difficulty making connections between the icons named Keep this Item/Remove this Item and the My Favorites area. During testing, it was clear that many of the participants were drawn to the pin icon for the correctly anticipated functionality but were then confused that the tooltips did not include any language resembling “My Favorites.” From this last observation, we surmised that providing continuity in language between these icons and the My Favorites area would increase usability for our library patrons. Pepitone reported problems with the placement of the My Favorites pin icon,28 but we observed that placement was less of a problem than the actual terminology used to name the pin icon.

Beyond success and failure, a 2014 Ex Libris user study suggested that academic level and discipline play a key role in user behavior.29 However, we were unable to draw meaningful conclusions among user groups because of our small and homogeneous participant pool.

DECISIONS MADE IN RESPONSE TO USABILITY RESULTS

Declined to Change

Facets. Although one participant did not understand the inclusion mechanism of the facet values, we declined to investigate a customization in this area. According to the Primo August 2017 release notes, Ex Libris plans to make considerable changes to the faceting functionality.30 Therefore, we decided to wait until after this release to reassess whether customization was warranted.

Implemented a Change

Labels

Citations. We observed confusion between the Citation icon of the Actions Menu and the section of the Full Display page labeled “Citations.” To differentiate between the two items, we changed the Actions Menu icon text to “Cite This Item” (see figure 4) and the heading for the Citations section to “References Cited” (see figure 5).

Figure 4. Cite This Item icon of the Actions Menu.

Figure 5. References Cited section of the Full Display page.

My Favorites. There was a mismatch among the tooltip texts of the My Favorites icons. We changed the tooltip language for the “Keep this item” pin to read “Add to My Favorites” (see figure 6) and the tooltip language for the “Unpin this item” pin to read “Remove from My Favorites” (see figure 7).

Figure 6. Add to My Favorites language for My Favorites tooltip.

Figure 7. Remove from My Favorites language for My Favorites tooltip.

Availability Statuses. Per the design of the new interface, certain non-green links became green after opening the Full Display page of Primo New UI. We implemented CSS code to retain the non-green coloring of the availability statuses after opening the Full Display. In this case, “Check holdings” remains orange (see figure 8).

Figure 8. Availability status color of Brief Display, before and after opening the Full Display.

Link Removal

Full Display Page Headings. There was confusion as to the function of the headings on the Full Display page. These are anchor tags, but patrons clicked on them as if they were functional links. No patrons used the headings successfully. Therefore, we hid the headings section via CSS (see figure 9).

Figure 9. Removal of headings on Full Display page.
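For readers who want a sense of what overrides like these can look like, the sketch below shows the general pattern of the two CSS changes just described. The selector names are hypothetical stand-ins, not the actual class names from Primo New UI or from our customization package; anyone adapting this would need to inspect their own instance’s markup to find the real selectors.

/* A minimal sketch, assuming hypothetical selector names. */

/* Keep a non-green status (e.g., "Check holdings") orange on the
   Full Display page instead of letting it turn green. */
.full-view .availability-status.check-holdings {
    color: #e07b00; /* orange */
}

/* Hide the anchor-style section headings that patrons mistook
   for functional links. */
.full-view .section-heading-links {
    display: none;
}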
Links to Other Institutions. We observed participants attempting to use the links to other institutions to place resource sharing requests. Therefore, we removed the hyperlinking functionality of the links in the list via CSS (see figure 10).

Figure 10. Neutralization of links to other institutions.

Prioritized the Emphasis of Certain Functionalities

Request Options and Show Libraries Buttons. It is usually more important to be able to place a request than to find the names of other institutions that own an item. However, the Show Libraries button was originally styled with crimson coloring, which drew unwarranted attention, while the requesting links were not styled at all. Therefore, we added styling to the resource-sharing links and removed styling from the Show Libraries button via CSS (see figure 11).

Figure 11. Resource sharing link with crimson color, Show Libraries button with styling removed.

E-mail Icon. We observed that the E-mail icon of the Actions Menu was difficult to find. Therefore, we decreased the number of icons and moved the emailing functionality to the left side of the Actions Menu (see figure 12).

Figure 12. Email icon prioritized over Citation Manager icons.

Contrast and Separation

Full Display Page Sections. Participants noted that the information on the Full Display page tended to run together. To remedy this, we created higher contrast between the foreground and background of the page sections via CSS. We also styled the section titles and dividers with color, among other edits (see figure 13).

Figure 13. Separated sections of Full Display page (see figure 3 to compare to the New UI default Full Display page design).

CONCLUSION

While providing one of the first studies on Primo New UI, we acknowledge several limitations. Previous studies on Primo had larger study populations than this one, which had eight participants. However, we adhered to Nielsen’s findings that usability studies uncover most design deficiencies with five or more participants.31 Additionally, the scope of this study was limited to the usability of the desktop view. We recommend further studies that concentrate on accessibility compliance and that test the interface on mobile devices.

Regarding the study design, the question arose as to whether the participants’ difficulties reflected poor design functionality or a misunderstanding of library terminology (as noted by Majors and Comeaux).32 The researchers did not carry out pre-tests or an assessment of participants’ level of existing knowledge. This limitation is almost always unavoidable, however, as a task list will always risk not fitting the skills or knowledge of every participant. The limited use of some features also might have resulted from the study design. While not using the facets may indicate that participants were unaware of them, it could also be because they never had to scroll past the first few items to find the needed resource. Users might have felt a greater need to use the facets had we asked more difficult discovery tasks. The study also contained an investigative bias in that the researchers were part of the working group that developed the customized interface and then tested those customizations. This bias could have been reduced if the study had used researchers who were not part of the group that made the customizations. Despite these limitations, there are still key findings of note.
Tasks that participants completed with the greatest ease mapped to those that we assume they perform most often, which included basic searching for materials and accessing account information. Tasks beyond these basics proved to be more difficult. This raises the question of whether difficulties were really a function of the interface design or whether they reflected ongoing literacy issues. Therefore, it is crucial that designers work with public services and instruction librarians to identify areas where users might be well served by making certain functionalities more user-friendly and by creating educational and training opportunities to increase awareness of these functionalities.33 Bringing diverse perspectives into the study is also crucial so that researchers can discover and be more conscious of commonalities in design and literacy needs, particularly regarding advanced tasks.

APPENDIX A: USABILITY TASKS

Note: Search It is the local branding for Primo at Washington State University.

1) Please search for your favorite book or movie.
a) Is this item available for you to read or watch?
b) How do you know that this item is or isn’t available for you to read or watch?
2) Please sign in to Search It.
3) Please perform a search for “causes of world war II” (do not include quotation marks).
a) Limit your search results to Articles.
b) For any of the records in your search results list:
i) Find the citation for any item and copy it to the clipboard.
ii) Email this record to yourself.
4) Please perform a search for “actor’s choice monologues” (do not include quotation marks).
a) From the list of results, find a print book that you would need to order from another library.
5) Please perform a search for a print book with ISBN 0582493498.
a) This book is checked out. How would you get a copy of it?
b) Pretend that this book is NOT checked out. Please show us the information from this record that you would use to find this item on the shelves.
6) Please navigate to your library account (from within Search It).
a) Pretend that you have forgotten how many items you have checked out. Please show us how you would find out how many items you currently have checked out.
b) Exit your library account area.
7) Please navigate to Advanced Search.
a) Perform any search on this page.
8) Please show us where you would go to find help and/or chat with a librarian.
9) Please perform a search using the keywords “gender and media.”
a) Add any source to your My Favorites list. Then open My Favorites and click on the title of the source you just added.
b) Return to your list of results. Choose any peer-reviewed article that has the full text available. Click on the link that will access the full text.
10) Please find a database that might be of interest to you (e.g., JSTOR).
11) Please sign out of Search It and close your browser.

APPENDIX B: USABILITY RESULTS

Note: Search It is the local branding for Primo at Washington State University.

Research Question 1: Will the participant be able to find and use the Basic Search functionality?
Associated task(s): 1. Please search for your favorite book or movie.

Participant | Successful? | Commentary
1 | Yes | Searches for “the truman show” from the beginning.
2 | Yes | Searches for “pet sematary” from the beginning.
3 | Yes | Searches for “additive manufacturing” from the beginning.
4 | Yes | Signs in first, navigates to New Search, searches for “PZT sensor design.”
5 | Yes | Searches for “the notebook” from the beginning.
6 | Yes | Searches for “das leben der anderen” from the beginning.
7 | Yes | Searches for “Legally Blonde” from the beginning.
8 | Yes | Searches for “Jurassic Park” from the beginning.

Research Question 2: Will the participant be able to understand the availability information of the brief results?
Associated task(s): 1b. How do you know that this item is or isn’t available for you to read or watch? 4a. From the list of results, find a print book that you would need to order from another library.

Participant | Successful? | Commentary
1 | Yes | Differentiates between green and orange text; uses the “Check holdings” availability status. Clicks on “Availability and Request Option” heading and then clicks on the resource sharing link.
2 | Yes, with difficulty | Says that green “Check holdings” status indicates ability to read the book. Selects book with “Check holdings” status and locates resource sharing link.
3 | Yes, with difficulty | Unclear. Initially goes to a record with Online Access; redoes search, eventually locates resource sharing link.
4 | Yes | Says the record for the item reads “In place” and the availability indicator = 1. The record for the item reads “Check holdings.”
5 | Yes | Says that status is indicated by statement “Available at Holland/Terrell Libraries.” The record for the item reads “Check holdings.”
6 | Yes | Says that status is indicated by statement “Available at Holland/Terrell Libraries” and “Item in place.” Clicks on “Check holdings”; says that orange color denotes fact that we don’t have it.
7 | Yes | Hovers over “Check holdings” status, and then notes that “Availability” statement reads “did not match any physical resources.” The record for the item reads “Check holdings.”
8 | Yes | Says that status is indicated by statement “Available at Holland/Terrell Libraries.” Says the record for the item reads “Check holdings.”

Research Question 3: Will the participant be able to find and use the Sign In and Sign Out features?
Associated task(s): 2. Please sign in to Search It. 11. Please sign out of Search It and close your browser.

Participant | Successful? | Commentary
1 | Yes | Navigates to “Guest” link, signs in.
2 | Yes | Navigates to ellipsis, signs in. Navigates to “User” link, signs out.
3 | Yes | Navigates to “Guest” link, signs in. Navigates to “User” link, signs out.
4 | Yes | N/A—already signed in. Navigates to “User” link, signs out.
5 | Yes | Navigates to “Guest” link, signs in. Navigates to “User” link, signs out.
6 | Yes | Navigates to “Guest” link, signs in. Navigates to “User” link, signs out.
7 | Yes | Uses Sign In link from Full Display page. Navigates to “User” link, signs out.
8 | Yes | Navigates to “Guest” link, signs in. Navigates to “User” link, signs out.

Research Question 4: Will the participant be able to understand the behavior of the facets?
Associated task(s): 3a. Limit your search results to Articles. 4a. From the list of results, find a print book that you would need to order from another library. 9b. Return to your list of results. Choose any peer-reviewed article that has the full text available.

Participant | Successful? | Commentary
1 | Yes | Selects Articles facet. N/A—does not use facets (however, participant investigates the Library and Type facets, returns to results lists).
2 | Yes | Selects Articles facet. N/A—does not use facets.
3 | No | Uses “Exclude” property to remove everything but Articles. Uses “Exclude” property to remove everything but Print Books. Looks in facet Type for Articles; selects Newspaper Articles instead.
4 | Yes, with difficulty | Selects Articles facet. Selects Print Books facet. Selects Articles under Type facet, clicks on “Full-text available” status, selects Peer-reviewed Articles facet.
5 | No | Selects Articles facet. N/A—does not use facets. Screen freezes (technical issue) and participant is forced to redo search. N/A—does not use facets. When further prompted to find only peer-reviewed articles, participant searches pre-filter area and then selects Reviews facet.
6 | Yes | Selects Articles facet. Clicks on “Check holdings.” Participant hovers over “Online Access” text and then selects Peer-reviewed facet.
7 | Yes | Looks in drop-down scope, then moves to Articles facet. N/A—does not use facets. N/A—does not use facets.
8 | Yes | Hovers over Peer-Reviewed Articles facet, and then selects Articles facet. N/A—does not use facets. Selects Peer-reviewed facet.

Research Question 5: Will the participant be able to find and use the Actions Menu?
Associated task(s): 3.b.i. For any of the records in your search results list, find the citation for any item and copy it to the clipboard. 3.b.ii. For any of the records in your search results list, email this record to yourself.

Participant | Successful? | Commentary
1 | Yes | Briefly looks at Citation icon, scrolls to bottom of page and looks at Citations area, returns to Citation icon. Scrolls to bottom of page, returns to Actions area, scrolls with arrow to find Email icon, emails to self.
2 | No | Initially clicks on citation manager icon (EasyBib), then clicks on Citation icon and copies to clipboard. Could not find Email icon (technical issue with Search It), although further discussion reveals that participant expects to see email function within “Send To” heading.
3 | No | Opens Full Display page of item, scrolls to bottom of page. Clicks on the Citation icon but doesn’t see what they are looking for. Finds Email icon and emails to self.
4 | No | Opens Full Display page of item, clicks on the Citation icon, double-clicks to highlight citation. Could not find Email icon. Searches in ellipsis. Attempts the Keep This Item pin. Navigates to My Account. Searches in facets.
5 | Yes, with difficulty | Finds Citation icon, but then leaves the area via Citations heading and winds up at Web of Science homepage. Hovers over “cited in this” language. Finds the copy functionality. Attempts Send To heading twice, looks through Actions icons, scrolls to right, finds Email icon.
6 | Yes | Finds Citation icon, copies to clipboard. Scrolls down page, returns to Actions Menu, scrolls to Email icon, emails record to self.
7 | Yes, with difficulty | Copies citation from the Brief Result, and then spends some time trying to find “the clipboard.” Navigates to the Email icon.
8 | Yes, with difficulty | Scrolls to bottom of Full Display page, clicks on Citing This link, clicks on title to record, and then copies first 3 lines of record. Scrolls until finds Email icon, but then moves to Send To heading, and then back to Email icon, and sends.
Research Question 6: Will the participant be able to navigate the Get It and View It areas of the Full Display page?
Associated task(s): 5.a. This book is checked out. How would you get a copy of it? 5.b. Please show us the information from this record that you would use to find this item on the shelves. 9.b. Click on the link that will access the full text.

Participant | Successful? | Commentary
1 | Yes | Clicks on “Check holdings” availability status, clicks on Availability and Request Options heading, clicks on Request Summit Item link. Refers to call number in Alma iframe. Clicks “Full-text available” status, clicks database name.
2 | Yes | Opens record, locates resource sharing link. Refers to call number; opens stack chart to find call number. Clicks on title, clicks database name.
3 | Yes | Locates request option. Locates call number in record. Clicks “Full-text available” status, clicks database name.
4 | Yes, with difficulty | Clicks on Show Libraries button, then finds request option after searching page. Locates call number in record. Clicks “Full-text available” status but does not click on database name.
5 | Yes, with difficulty | Moves to Stack Chart button, then to Show Libraries button, and then to Availability and Request Options heading, clicks on Stack Chart, clicks on Show Libraries, moves into first library listed and back out, and finally to ILL link. Finds call number on Full Display page.
6 | Yes | Finds Request Summit option. Identifies call number and Stack Chart as means to find book. Clicks on database name.
7 | Yes, with difficulty | Looks at Status statement, scrolls to bottom of page, then Show Libraries button, then Request Summit option. Identifies call number and Stack Chart as means to find book. Attempts to use “Full-text available” link, then clicks on database name.
8 | Yes | Finds Summit Request option. Identifies call number and Stack Chart as means to find book. Attempts to use “Full-text available” link, then clicks on database name.

Research Question 7: Will the participant be able to navigate their My Account area?
Associated task(s): 6. Please navigate to your library account (from within Search It). 6a. Pretend that you have forgotten how many items you have checked out. Please show us how you would find out how many items you currently have checked out. 6b. Exit your library account area.

Participant | Successful? | Commentary
1 | Yes | Navigates to My Account from “User” link. Navigates to Loans tab. Uses back-arrow icon.
2 | Yes | Navigates to My Account from “User” link. Navigates to Loans tab. Uses back-arrow icon.
3 | Yes | Navigates to My Account from Main Menu ellipsis. Navigates to Loans. Uses back-arrow icon.
4 | Yes | Navigates to My Account from Main Menu ellipsis. Navigates to Loans. Uses back-arrow icon.
5 | Yes | Navigates to My Account from “User” link. Navigates to Loans. Signs out of Search It.
6 | Yes | Navigates to My Account from “User” link. Navigates to Loans. Uses Search It logo to exit.
7 | Yes | Navigates to My Account from “User” link. Navigates to Loans. Uses New Search button to exit.
8 | Yes | Navigates to My Account from “User” link. Navigates to Loans. Uses Search It logo to exit.

Research Question 8: Will the participant be able to find and use the Help/Chat and My Favorites icons?
Associated task(s):
8. Please show us where you would go to find help and/or chat with a librarian. 9.a. Add any source to your My Favorites list. Then, open My Favorites and click on the title of the source you just added.

Participant | Successful? | Commentary
1 | Yes, with difficulty | Navigates to Help/Chat icon. Navigates to Keep This Item pin, hesitates, navigates to ellipsis, returns to and clicks on pin. Moves to My Favorites via animation. Clicks on title.
2 | Yes, with difficulty | Initially navigates to Help/Chat icon, but thinks it is the wrong button because chat is not directly available within Search It. Navigates to Keep This Item pin, hesitates, looks around, selects pin. Moves to My Favorites via animation. Clicks on title.
3 | Yes, with difficulty | Navigates to Help/Chat icon. Navigates to ellipsis, Actions Menu, and Tags section. Finds Keep This Item pin.
4 | No | Navigates to Help/Chat icon. Navigates to ellipsis, Keep This Item pin, My Account, and facets. Quits search.
5 | Yes, with difficulty | Navigates to Help/Chat icon. Adds Keep This Item pin after investigating 12 other icons. Moves to My Favorites via animation. Clicks on title.
6 | Yes | Navigates to Help/Chat icon. Adds Keep This Item pin and moves to My Favorites via animation. Clicks on title.
7 | Yes | Navigates to Help/Chat icon. Checks Actions Menu, adds Keep This Item pin and moves to My Favorites via animation. Clicks on title.
8 | Yes | Navigates to Help/Chat icon. Adds Keep This Item pin and moves to My Favorites via animation. Clicks on title.

Research Question 9: Will the participant be able to find and use the Advanced Search functionality?
Associated task(s): 7. Please navigate to Advanced Search. 7a. Perform any search on this page.

Participant | Successful? | Commentary
1 | Yes | Navigates to Advanced Search. Performs search.
2 | Yes | Navigates to Advanced Search. Performs search.
3 | Yes, with difficulty | Navigates to Basic Search drop-down, then to New Search, then to Advanced Search. Has trouble inserting cursor into search box.
4 | Yes, with difficulty | Navigates to Advanced Search. Builds complex search, then Search It freezes and we have to restart the search tool.
5 | Yes | Navigates to Advanced Search. Performs search.
6 | Yes | Navigates to Advanced Search. Performs search.
7 | Yes | Navigates to Advanced Search. Performs search.
8 | Yes | Navigates to Advanced Search. Performs search.

Research Question 10: Will the participant be able to find and use the Main Menu items?
Associated task(s): 10. Please find a database that might be of interest to you (e.g., JSTOR).

Participant | Successful? | Commentary
1 | Yes | Navigates to “Databases” link of Main Menu.
2 | Yes | Navigates to “Databases” link of Main Menu.
3 | No | Types query “stretchable electronics” into search box, but unsure how to find a database in the results lists.
4 | No | Types query “reinforced concrete” into search box, but unsure how to find a database in the results lists.
5 | Yes, with difficulty | Is confused by term database. Enters “IEEE” in search box.
6 | Yes | Navigates to “Databases” link of Main Menu.
7 | Yes | Searches within drop-down scopes, then facets, then moves to “Databases” link of Main Menu.
8 | Yes | Navigates to “Databases” link of Main Menu.
1. “Frequently Asked Questions,” Ex Libris Knowledge Center, accessed August 28, 2017, https://knowledge.exlibrisgroup.com/Primo/Product_Documentation/050New_Primo_User_Interface/010Frequently_Asked_Questions.
2. Rice Majors, “Comparative User Experiences of Next-Generation Catalogue Interfaces,” Library Trends 61, no. 1 (2012): 186–207, https://doi.org/10.1353/lib.2012.0029.
3. David Comeaux, “Usability Testing of a Web-Scale Discovery System at an Academic Library,” College & Undergraduate Libraries 19, no. 2–4 (2012): 199, https://doi.org/10.1080/10691316.2012.695671.
4. Comeaux, “Usability Testing,” 202.
5. Comeaux, “Usability Testing,” 196–97.
6. Kylie Jarrett, “FindIt@Flinders: User Experiences of the Primo Discovery Search Solution,” Australian Academic & Research Libraries 43, no. 4 (2012): 280, https://doi.org/10.1080/00048623.2012.10722288.
7. Jarrett, “FindIt@Flinders,” 287.
8. Aaron Nichols et al., “Kicking the Tires: A Usability Study of the Primo Discovery Tool,” Journal of Web Librarianship 8, no. 2 (2014): 174, https://doi.org/10.1080/19322909.2014.903133.
9. Nichols et al., “Kicking the Tires,” 181.
10. Nichols et al., “Kicking the Tires,” 184.
11. Nichols et al., “Kicking the Tires,” 184–85.
12. Scott Hanrath and Miloche Kottman, “Use and Usability of a Discovery Tool in an Academic Library,” Journal of Web Librarianship 9, no. 1 (2015): 9, https://doi.org/10.1080/19322909.2014.983259.
13. Greta Kliewer et al., “Using Primo for Undergraduate Research: A Usability Study,” Library Hi Tech 34, no. 4 (2016): 576, https://doi.org/10.1108/lht-05-2016-0052.
14. Kelsey Brett, Ashley Lierman, and Cherie Turner, “Lessons Learned: A Primo Usability Study,” Information Technology & Libraries 35, no. 1 (2016): 21, https://doi.org/10.6017/ital.v35i1.8965.
15. Cece Cai, April Crockett, and Michael Ward, “Our Experience with Primo New UI,” Ex Libris Users of North America Conference 2017, accessed November 4, 2017, http://documents.el-una.org/1467/1/CaiCrockettWard_051017_445pm.pdf.
16. Cai, Crockett, and Ward, “Our Experience with Primo New UI.”
17. J. Michael DeMars, “Discovering Our Users: A Multi-Campus Usability Study of Primo” (paper presented at the International Federation of Library Associations and Institutions World Library and Information Conference 2017, Warsaw, Poland, August 14, 2017), 11, http://library.ifla.org/1810/1/S10-2017-demars-en.pdf.
18. Anne M. Pepitone, “A Tale of Two UIs: Usability Studies of Two Primo User Interfaces” (slideshow presentation, Primo Day 2017: Migrating to the New UI, June 12, 2017), https://www.orbiscascade.org/primo-day-2017-schedule/.
19. “Primo Usability Guidelines and Test Script,” Ex Libris Knowledge Center, accessed October 28, 2017, https://knowledge.exlibrisgroup.com/Primo/Product_Documentation/New_Primo_User_Interface/Primo_Usability_Guidelines_and_Test_Script.
20. DeMars, “Discovering Our Users,” 9.
21. Majors, “Comparative User Experiences,” 190; Comeaux, “Usability Testing,” 198–204.
22. Brett, Lierman, and Turner, “Lessons Learned,” 21.
23. Jarrett, “FindIt@Flinders,” 287; Nichols et al., “Kicking the Tires,” 184; Brett, Lierman, and Turner, “Lessons Learned,” 20–21.
24. Kliewer et al., “Using Primo for Undergraduate Research,” 571–72; Brett, Lierman, and Turner, “Lessons Learned,” 17.
25. Miri Botzer, “Delivering the Experience that Users Expect: Core Principles for Designing Library Discovery Services,” white paper, November 25, 2015, 10, http://docplayer.net/10248265-Delivering-the-experience-that-users-expect-core-principles-for-designing-library-discovery-services-miri-botzer-primo-product-manager-ex-libris.html.
26. Majors, “Comparative User Experiences,” 194; Nichols et al., “Kicking the Tires,” 184–85.
27. Cai, Crockett, and Ward, “Our Experience with Primo New UI,” 28–29.
28. Pepitone, “A Tale of Two UIs,” 29.
29. Botzer, “Delivering the Experience,” 4–5; Christine Stohn, “How Do Users Search and Discover? Findings from Ex Libris User Research,” Library Technology Guides, May 5, 2015, 7–8, https://librarytechnology.org/document/20650.
30. “Primo August 2017 Highlights,” Ex Libris Knowledge Center, accessed November 2, 2017, https://knowledge.exlibrisgroup.com/Primo/Product_Documentation/Highlights/027Primo_August_2017_Highlights.
31. Jakob Nielsen, “How Many Test Users in a Usability Study?,” Nielsen Norman Group, June 4, 2012, https://www.nngroup.com/articles/how-many-test-users/.
32. Majors, “Comparative User Experiences,” 190; Comeaux, “Usability Testing,” 200–204.
33. Brett, Lierman, and Turner, “Lessons Learned,” 21.
Managing In-Library Use Data: Putting a Web Geographic Information Systems Platform through Its Paces

Bruce Godfrey and Rick Stoddart

Bruce Godfrey (bgodfrey@uidaho.edu) is GIS Librarian and Rick Stoddart (rstoddart@uidaho.edu) is Education Librarian at the University of Idaho Library.

ABSTRACT

Web Geographic Information System (GIS) platforms have matured to a point where they offer attractive capabilities for collecting, analyzing, sharing, and visualizing in-library use data for space-assessment initiatives. As these platforms continue to evolve, it is reasonable to conclude that enhancements will not only offer librarians more opportunities to collect in-library use data to inform the use of physical space in their buildings, but will also provide opportunities to more easily share database schemas for defining learning spaces and the observations associated with those spaces. This article proposes using web GIS, as opposed to traditional desktop GIS, as an approach for collecting, managing, documenting, analyzing, visualizing, and sharing in-library use data, and it highlights the process of utilizing the Esri ArcGIS Online platform for a pilot project undertaken by an academic library for this purpose.

INTRODUCTION

A geographic information system (GIS) is a computer program for working with geographic data. A GIS is an ideal tool for capturing data about library learning spaces because such spaces can be described by a geographic area. The learning spaces might be small or large, irregularly shaped or symmetrical; either way, the shape can be described by a set of geographic coordinates. Tools for storing, managing, documenting, analyzing, and visualizing geographic data can all be found in a GIS.
The locations and shapes of geographic features (such as library learning spaces), as well as attributes of those features (such as the type of learning space), can be captured in a GIS.

The roots of GISs stretch back to the 1960s. Goodchild characterizes GISs’ advances in spatial analysis during the 1970s and the growth of GIS in the 1980s, coinciding with the proliferation and affordability of desktop computers.1 The enhancement of GIS software from desktop computer applications to online platforms has been underway for some time. The origins of web GIS can be traced back to the 1990s, but it is only since the mid-2000s that products have matured to the point where they can be viable alternatives to their desktop counterparts. Web GIS first appeared in 1993, when Xerox Corporation’s Palo Alto Research Center created an online map viewer.2 Their map viewer, running in a web browser, was the first demonstration of performing GIS tasks without GIS software installed on a local computer. Even though this early web-based GIS application had limited capabilities, the potential of performing GIS operations from computers anywhere and anytime was recognized. The possible capabilities of web GIS began to be more fully discussed in the mid-1990s.3 Web GIS software became available in earnest in 1996 as GIS companies began releasing commercial offerings.4 The first two decades of this century have seen web GIS explode in functionality and scope to become an integral part of most GISs.

In late 2012, Esri (Environmental Systems Research Institute) released a collaborative mapping platform named ArcGIS Online (https://www.arcgis.com/). Esri is a GIS software company that was founded in 1969, and its products are used by more than 7,000 colleges and universities across the globe.5 The collaborative platform enables users to create, manage, analyze, store, and share maps, applications, and data on the internet.

GIS software continues to evolve from desktop computer programs to specialized software applications (i.e., apps) that are part of a web-focused platform. This transformation is profoundly expanding the accessibility of the technology to a broader array of users. What was once a technology reserved for geographic information professionals because of its complexity and cost has now been streamlined and put in the hands of nonprofessionals who want to take advantage of its many possibilities. It is no longer confined to academic disciplines such as geographic information science and remote sensing science; instead, GIS has seen its use grow in the humanities and social sciences to the point where libraries are developing targeted services for these disciplines.6 Professionals can share their data more easily, and nonprofessionals can use those data to create information and knowledge more easily. This transformation bodes well for libraries because it lowers technological hurdles that might have precluded the technology’s use for space-assessment and other place-based initiatives in the past. Now that software-as-a-service (SaaS) mapping platforms such as Mango, GIS Cloud, and ArcGIS Online deliver capabilities over the internet, there is no server software for users to install and no licensing to configure.
Additionally, the training required for personnel to gather, utilize, and manage data has been greatly reduced compared to web GIS’s desktop predecessors. Academic libraries, and libraries in general, stand to gain from this evolution.

THE USE OF DESKTOP GIS FOR SPACE ASSESSMENT

The value of space-planning efforts in libraries and the observational methods employed to conduct such activities have been well articulated in library research. The use of desktop GIS as a tool for collecting in-library use data in academic libraries has been present for more than a decade. Bishop and Mandel show that libraries’ use of GIS falls into two broad categories, analyzing service area populations and facilities management, the latter of which encompasses “in-library use and occupancy of library study space.”7 Work related to the use of GIS to study library patron spaces is discussed below.

In the past twenty years, academic libraries have seen many transformations in their roles on college and university campuses. GIS technologies have helped document and respond to those transformations. Xia outlined the value of using GIS as a tool for space management in academic libraries more than a decade ago because of its “capacity for analyzing spatial data and interactive information.”8 In one study, Xia describes using Esri ArcView 3.x desktop software for library space management. ArcView was Esri’s first GIS software to have a graphical user interface; its predecessors had command-line interfaces. Xia mentions the use of the, at that time, emerging ArcGIS product, which went on to replace ArcView 3.x. GIS proved to be a valuable tool for Xia to track the spatial distribution of books in the library environment.9 Xia went on to measure and visualize the occupancy of study space using ArcView.10 Lastly, Xia used ArcView as an item-locating system within the physical space of the library.11

More recently, Mandel utilized MapWindow, an open-source desktop GIS originally developed at Idaho State University, for creating maps of fictional in-library use data.12 Mandel’s process demonstrated how a GIS could be utilized to visualize the use of library spaces for marketing materials and services as well as for graphically depicting a library’s value. Coyle argued for the use of GIS as a tool to analyze the interior space of the library, and specifically the library collection itself, without implementing a system with any specific GIS package.13 Given and Archibald detailed their use of visual traffic sweeps as an approach to collect and visualize in-library use data.14 Their workflow involved utilizing a Microsoft Excel spreadsheet to capture data and then importing the data into ArcGIS to query and visualize them. Therefore, GIS wasn’t used for data capture; it was used toward the end of the process to visualize these data.

While this body of work details the use of desktop GIS for working with in-library use data, collaborative web GIS platforms now offer opportunities to advance existing research in this arena by streamlining data-collection workflows, sharing database schemas, and enabling broader collaboration with peers, thereby potentially creating opportunities for new research.
Fusing the capabilities of these new platforms with traditional observational methods of gathering data on how people use library spaces extends the body of knowledge and offers interesting new opportunities for research, such as cross-institutional comparisons. It is critical for twenty-first-century academic libraries to collect such data to continue to evolve with the changing needs of digital-age campus research and culture.

UTILIZING A CLOUD-BASED PLATFORM FOR LEARNING SPACE ASSESSMENT

Discussed below is the approach employed for this pilot project to use web GIS to collect, manage, share, and visualize information about library learning spaces. This pilot project utilized the Esri ArcGIS Online platform and client applications accessing that platform (see figure 1). Collector for ArcGIS (http://doc.arcgis.com/en/collector), a ready-made app, was used for data collection. ArcGIS Desktop (http://desktop.arcgis.com) was used at the outset to create the initial database schema. A custom HTML/JavaScript web application was developed to better enable library administrators to visualize the data as a map, table, or chart. Prior to the implementation of this pilot project, the Circulation Department conducted floor sweeps for safety purposes (e.g., making sure certain doors were locked), but space-assessment data had never been gathered for the library.

Research Study Location

All observations were taken during fall 2016 and spring 2017 at the University of Idaho Library and the Gary Strong Curriculum Center. This article focuses on the implementation of the platform for use at the library. The first floor of the University of Idaho Library underwent a remodel during winter 2016. The remodel included new furniture and different configurations of areas better customized for learning and studying. Group study areas, booths, and brainstorming spaces figured prominently in the remodel. Additionally, expanded food and beverage options and open seating areas located near natural light provide a welcoming environment. Library hours were also expanded to 24 hours per day, 5 days a week. With these changes arose the desire to collect data digitally to learn about patrons’ use of these new locations. The goals of this research were to utilize these data to inform decision-making about future changes to the physical spaces in the library and to connect library learning spaces to campus learning outcomes.

Figure 1. Infrastructure for the pilot project.

Selecting the ArcGIS Online Platform

Using locally existing resources to implement this pilot project was a requirement. Funding was not available to purchase server software or hardware. Personnel time could be carved out of existing positions for this effort, but money was not available to hire additional personnel. The University of Idaho Library does not have a dedicated IT unit, so choices were limited. Purchasing business-intelligence software such as Tableau was cost prohibitive. An open-source tool such as Suma, developed by North Carolina State University Libraries, was not a practical option in this case because the system requirements did not align with the expertise of existing personnel.15 Fortunately, the ArcGIS Online platform was available for this research at no cost to the library, and existing personnel had experience using the platform.
The University of Idaho participates in, and contributes financially to, a State of Idaho higher education site license for Esri software. The software is then available to personnel across the institution for research, teaching, and, to a lesser extent at this time, administrative purposes. Since ArcGIS Online is a cloud platform, there is no server software to install and update and no server hardware to configure. Additionally, the University of Idaho GIS librarian was familiar with the capabilities of the platform and available to actively participate in this research. In short, researchers’ access to and existing expertise with the ArcGIS Online platform, coupled with the extensive capabilities of the platform itself, made it the best choice for this research.

Pilot Project Design

A public services librarian and the GIS librarian assumed leadership roles for the pilot project. The public services librarian led tasks associated with defining the learning (i.e., the data-collection) spaces, defining the data fields and domains for those spaces, and overseeing personnel responsible for collecting these data. The GIS librarian led tasks associated with creating the database schema, creating the geographic features representing the learning spaces, creating a web application to visualize the data, and managing content on the ArcGIS Online platform. Library personnel were responsible for collecting the data.

Gathering ancillary data

Having building floor plans in a digital format was helpful for data collectors to orient themselves in the space when looking at a map on a mobile device. Our research team was able to acquire georeferenced building floor plans for our institution from the Information Technology Services unit on campus. Each of the four floors of the library was published to ArcGIS Online as a Hosted Tile Layer to serve as a frame of reference for data collectors.

Managing content and users

ArcGIS Online provides the ability to create and define groups. Groups are collections of items that can be shared among named users. Individual user accounts were created for each project participant, along with a group containing the project items to be shared among those users. This approach allowed all data associated with the project to remain private, shared only among personnel participating in the project.

Database design

The primary knowledge product resulting from this research was a web application containing a two-dimensional map, tables, and charts. A geodatabase, which is an assemblage of geographic datasets, needed to be designed and created to provide data to the web application.16 Designing a geodatabase begins with defining the operational layers required to gather information.17 For this pilot project, one operational layer depicting individual learning spaces was required (see table 1).

Table 1. Description of the learning spaces layer
Layer | Learning spaces
Map use | Learning spaces define areas intended for a specific type of learning
Data source | Digitized using building floor plans as a frame of reference
Representation | Polygons

The learning spaces layer was used to store the geometry of the individual learning spaces. A table to store observations for each learning space was needed, as was a relationship between each individual space and the observations for that space (see figure 2). The relationship binds observations to their appropriate learning spaces and was defined so that one learning space can relate to many observations.
The relationship was defined to allow one learning space to relate to many observations for that space.

Figure 2. Data elements of the geodatabase.

Fields, analogous to columns in a spreadsheet, were defined for the learning spaces layer and the observations table to store descriptive information. For example, a friendly name was assigned to each learning space. Additionally, domains were defined to manage the valid values for specific fields. Domains were necessary for quality control and quality assurance: they enforce data integrity by enabling data collectors to pick items from lists rather than type item names, eliminating a potential source of data-collection errors. Field names, data types, field descriptions, and domains for this pilot project can be found in the appendix (an illustrative script for authoring such a schema follows the figures below).

Defining data-collection spaces

A template was created to define the information required to create each learning space feature. These features were created by digitizing them on a computer screen for each of the four floors of the library, using the building floor plans as a frame of reference. Ten learning spaces were defined for the first floor of the library and one each for floors 2, 3, and 4. A map for each floor was created and published to ArcGIS Online as a Hosted Feature Layer.18 Each map contained two layers: one for the floor plan and one for the learning spaces (figure 3). Library personnel used these maps to collect data.

Data collection

Data collection was accomplished using Collector for ArcGIS installed on mobile devices. This eliminated the need for any software-development costs for data collection. Collector for ArcGIS is a ready-made ArcGIS Online application designed to provide an easy-to-use interface for collecting location-based data. The software was installed on a variety of devices, including a Samsung Galaxy tablet, a Surface tablet, and an Apple iPad. The online collection mode was enabled during collection, so data were transferred in real time to ArcGIS Online. The software can collect data in an offline mode, but, because strong internet connections were available in both campus buildings, the online mode was utilized.

The collection workflow consisted of library personnel traversing the floors of the library, recording the number of users in each space and what those users were doing, and entering additional context comments if necessary. Library staff were encouraged to use their own expertise and observational cues (e.g., textbooks present) when recording data associated with patron activities in library spaces. The date, time, and name of the data collector were recorded automatically, an option available through the ArcGIS Online platform. The user interface for the software was friendly and intuitive and required minimal training (figure 4). A list was provided to select the type of use for the selected space. Data were accessible via ArcGIS Online immediately following collection.

Figure 3. First floor learning spaces of the University of Idaho Library overlaid on the building floor plan.

Figure 4. The Collector for ArcGIS user interface utilized for data collection.
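The scripts used to author this schema are not included in the article; as an illustration only, the following Python sketch shows how a comparable schema, with coded-value domains, a polygon feature class, an observations table, and a one-to-many relationship class, could be authored with Esri's ArcPy library. The paths and workspace names are hypothetical, and the GlobalID and SpaceID domain details from the appendix are abbreviated.

# Illustrative sketch only (not the authors' script): authoring a schema
# like the one in the appendix with ArcPy. Paths are hypothetical; the
# SpaceID coded-value domain would be created the same way as the others.
import os
import arcpy

gdb = arcpy.management.CreateFileGDB(r"C:\data", "space_assessment.gdb")[0]

# Coded-value domains let collectors pick from lists instead of typing.
arcpy.management.CreateDomain(gdb, "BuildingName", "Name of the building",
                              "SHORT", "CODED")
for code, name in [(0, "Library"), (1, "Education")]:
    arcpy.management.AddCodedValueToDomain(gdb, "BuildingName", code, name)

arcpy.management.CreateDomain(gdb, "TypeOfUsage", "Type of usage of the area",
                              "SHORT", "CODED")
usages = ["Browsing Stacks", "Individual studying", "Lounging",
          "Meeting / Group Study", "Service Point", "Using Library Computers"]
for code, name in enumerate(usages):
    arcpy.management.AddCodedValueToDomain(gdb, "TypeOfUsage", code, name)

# Polygon feature class holding the digitized learning spaces.
areas = arcpy.management.CreateFeatureclass(gdb, "space_assessment_areas",
                                            "POLYGON")[0]
arcpy.management.AddField(areas, "SpaceID", "TEXT", field_length=10)
arcpy.management.AddField(areas, "Floor", "TEXT", field_length=5)
arcpy.management.AddField(areas, "BldgName", "SHORT",
                          field_domain="BuildingName")

# Non-spatial table holding the repeated observations.
obs = arcpy.management.CreateTable(gdb, "space_assessment_data")[0]
arcpy.management.AddField(obs, "SpaceID", "TEXT", field_length=10)
arcpy.management.AddField(obs, "TYPE_OF_USAGE", "SHORT",
                          field_domain="TypeOfUsage")
arcpy.management.AddField(obs, "NUMBER_OF_USERS", "SHORT")
arcpy.management.AddField(obs, "COMMENTS", "TEXT", field_length=255)

# One learning space relates to many observations, joined on SpaceID.
arcpy.management.CreateRelationshipClass(
    areas, obs, os.path.join(gdb, "areas_to_observations"), "SIMPLE",
    "space_assessment_data", "space_assessment_areas",
    "NONE", "ONE_TO_MANY", "NONE", "SpaceID", "SpaceID")

A script like this is one plausible starting point; the pilot itself authored the schema interactively in ArcGIS Desktop, as described above.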
RESULTS OF USING WEB GIS

Web GIS, specifically ArcGIS Online, offered the functionality required for collecting and managing in-library use data. Additionally, the platform offers librarians supplementary opportunities for collaborative space-assessment projects. The platform proved useful for this pilot project; some of the advantages and limitations encountered are discussed below.

Advantage: Ease of Use Through Targeted Applications

Esri software has been used in academia for decades. While the early command-line versions and later desktop versions were the playground of those with GIS training, web GIS applications have a decidedly friendlier interface because applications on the platform can be customized for specific purposes. For example, applications with management functionality can be separated from applications intended for data gathering. The need for excessive functionality to be included in one interface is replaced with a more modular framework, resulting in user interfaces less complex than those seen in many desktop GIS programs.

While some personnel involved with this project had used Esri software for many years and were familiar with the capabilities of the ArcGIS Online platform, they had not used the platform for data collection prior to this project. Managing users and content for the project proved to be straightforward. It was made even easier when enterprise logins were configured, which allowed personnel to sign in using their institutional username and password. Authoring the database schema, creating the necessary maps, and publishing those maps as hosted services was not complicated for those with basic desktop and web GIS knowledge. Those responsible for collecting data needed little training with Collector for ArcGIS to begin data collection. Finally, librarians with no GIS background were able to export the data to a familiar format (comma-separated values) to begin analysis using software such as Excel. In short, authoring the database and map services remains best handled by those with GIS experience, but targeted application interfaces enable users without GIS experience to collect and work with data.

Advantage: Participation in Enterprise Architecture

Conducting library research on a platform that many faculty, students, and staff are beginning to use for research, learning, and administration places librarians within the same collaborative space as the communities they are serving. In the case of this research, our need for building floor plans presented opportunities to discuss enterprise GIS more broadly at our institution by sharing this information. Interaction took place between the Library, Facilities Services, and Information Technology Services, cultivating relationships around data sharing. Furthermore, integration of our enterprise security with the ArcGIS Online platform adds a level of legitimacy to geospatial data management efforts.

Advantage: Potential for Cross-Institutional Collaborative Projects

The potential for cross-institutional collaboration on library-space assessment and other projects should not be overlooked when using the ArcGIS Online platform. Such collaborations are all the more manageable because Esri software is used by more than 7,000 colleges and universities across the globe.
Even though cross-institutional collaboration was not a goal of this research, the opportunities for projects or programs of this nature became abundantly clear. Items created in ArcGIS Online can be shared between organizations. Simply sharing a library-space-assessment database schema with librarians at other institutions would allow them to quickly implement a similar project on the ArcGIS Online platform. This opens the door to new research opportunities. The functionality exists for one institution to host a database that personnel from multiple institutions could populate. A single dataset containing the learning spaces of multiple institutions, with multiple contributors, could be created, managed, and analyzed collaboratively. This could enable lower-resource libraries to participate in projects with larger institutions as economies of scale are realized. And it offers the ability to undertake projects across multiple institutions to explore broader space-assessment or other research questions.

Limitation: Updating Hosted Feature Service Schemas

The ability to author and edit schemas entirely in ArcGIS Online has not yet matured to the point where it matches the abilities of its desktop counterpart. Specifically, updating a published schema is currently difficult to accomplish in ArcGIS Online because a user-friendly interface does not exist. However, the task can be accomplished by editing the JavaScript Object Notation (JSON) definition of the hosted feature service (a scripted version of this workaround is sketched after the figure below). While this is a limitation for managers of the hosted feature service rather than for data collectors, it is anticipated that it will be addressed in future updates.

Limitation: User Interface for Standards-Based Metadata

Items created as part of the pilot project were documented using the metadata editor provided in ArcGIS Online. ArcGIS Online users can create and maintain geospatial standards-based metadata for content. However, the user interface for creating metadata based on either the ISO 19115 series or the Federal Geographic Data Committee (FGDC) Content Standard for Digital Geospatial Metadata (CSDGM) could be improved by reducing its complexity and allowing batch updating of specific elements. Item documentation for the platform focuses on creating and editing elements of ArcGIS-format metadata. It should be noted, and potentially added as a point of concern for librarians, that the ability to author and edit metadata based on the ISO and CSDGM standards was introduced three years after the initial release of ArcGIS Online.

Limitation: Visualizing Data in Related Tables

Visualizing the data collected for this project using ready-made applications in ArcGIS Online yielded unsatisfactory results. The primary limitation related to working with repeated measurements for the learning spaces. Ready-made applications like Web AppBuilder and Operations Dashboard have limited support for a user-friendly presentation of repeated learning-space observations. Therefore, a custom web application was developed by a University of Idaho student using the Esri JavaScript application programming interface (API). The application provides the ability to select a date range, a time scope (e.g., daytime, nighttime, all hours), a building, and a floor to visualize the data. The learning spaces are colored by the total number of users in a space on the basis of the parameters selected (see figure 5).

Figure 5. Map view of the space assessment dashboard application.
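Returning briefly to the hosted-schema limitation above: the JSON-editing workaround can also be scripted rather than done by hand. The sketch below is a rough, hypothetical illustration (not the authors' code) using the ArcGIS API for Python, which wraps the REST administrative endpoints behind hosted feature services; the credentials, item ID, and field definition are invented for the example.

# Hypothetical illustration of scripting the JSON workaround described
# above with the ArcGIS API for Python; not code from the pilot project.
from arcgis.gis import GIS
from arcgis.features import FeatureLayerCollection

gis = GIS("https://www.arcgis.com", "username", "password")   # placeholder login
item = gis.content.get("0123456789abcdef0123456789abcdef")    # placeholder item ID

flc = FeatureLayerCollection.fromitem(item)
layer = flc.layers[0]  # the learning-spaces layer; use flc.tables for related tables

# Append a field to the published schema without republishing the service.
layer.manager.add_to_definition({
    "fields": [{
        "name": "WEATHER",
        "type": "esriFieldTypeString",
        "alias": "Weather during sweep",
        "length": 50,
        "nullable": True,
    }]
})

Because this edits the live service definition, testing against a copy of the service first would be prudent.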
For each individual space, a chart and table can be displayed to gain further insight (see figures 6 and 7).

Figure 6. Chart view of the space assessment dashboard application.

Figure 7. Table view of the space assessment dashboard application.

Limitation: Data-Collection Software Issues

Using Collector for ArcGIS on devices running Windows 10 proved frustrating because of a documented bug with Collector. A "You are not connected to the internet" error would appear randomly, even when there was a valid internet connection. A workaround was implemented to circumvent the issue, but it was a source of frustration for data-collection staff. Offline data-collection mode was tried to see if it was a more favorable option; however, the date and time of data collection are not captured in offline mode, so that potential workflow was abandoned. No issues were encountered by data collectors who used the Samsung Galaxy (running the Android operating system) or an Apple iPad.

CONCLUSIONS

Web-based GIS platforms such as ArcGIS Online have evolved to the point where they offer the functionality required for collecting and managing in-library use data. The ArcGIS Online platform performed commendably for this pilot project. While ArcGIS Desktop was used to author the original database schema in this project, it is reasonable to conclude that it is only a matter of time until the functionality required to complete the entire workflow in the web-based platform is available. Using mobile and desktop devices outfitted with the Collector for ArcGIS application proved to be a practical way to collect real-time in-library use data. Managing project users and the items those users were able to access was straightforward. While the visualization tools for repeated-measurements data are currently limited in ArcGIS Online, the data are accessible as a web service, and the sky is the limit on custom web-application development (a brief query sketch follows these conclusions).

Looking ahead, adjusting schemas to capture height above and below ground level to take advantage of 3D data models and visualization is intriguing. Use of this model may be beneficial for space-assessment projects that seek to gather data more broadly across institutions. Finally, a noteworthy realization from this research is the potential for inter-institutional and cross-institutional collaboration on library space–assessment projects, or other projects for that matter. Librarians can begin embracing the web GIS movement alongside those in the communities they participate in and serve. Opportunities to create efficiencies are possible through the simple sharing of database schemas. Additionally, the ability for one institution to host a database enabling personnel at multiple institutions, or at multiple libraries at larger institutions, to contribute data is available and ready for further research.
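As an aside on the point above that the data are accessible as a web service: any tool that can issue HTTP requests can reuse the dashboard's data. The sketch below is a hedged illustration of tallying observed users per space with Python; the service URL and token are placeholders, and the production dashboard was a custom JavaScript application rather than this script.

# Hedged illustration: querying the hosted observations table through the
# standard ArcGIS REST "query" endpoint and tallying users per space.
# The service URL and token are placeholders, not the project's real service.
from collections import Counter
import requests

QUERY_URL = ("https://services.arcgis.com/EXAMPLE/arcgis/rest/services/"
             "space_assessment/FeatureServer/1/query")

params = {
    "where": "1=1",                    # all observations
    "outFields": "SpaceID,NUMBER_OF_USERS",
    "f": "json",
    "token": "<access token>",         # needed for a private service
}
features = requests.get(QUERY_URL, params=params).json().get("features", [])

users_per_space = Counter()
for feature in features:
    attributes = feature["attributes"]
    users_per_space[attributes["SpaceID"]] += attributes["NUMBER_OF_USERS"] or 0

for space, total in users_per_space.most_common():
    print(f"{space}: {total} users observed")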
APPENDIX: SCHEMAS FOR EACH OBJECT IN THE GEODATABASE USED FOR DATA COLLECTION

Building name table and associated domain values
DomainName: BuildingName
Description: Name of the building
FieldType: SmallInteger
Domain Type: CodedValue
Coded values (Code | Name):
0 | Library
1 | Education

Space identifier table and associated domain values
DomainName: SpaceID
Description: Identifier for the area
FieldType: String
Domain Type: CodedValue
Coded values (Code | Name):
1A | Group Study
1B | Café
1C | Landing
1D | Computer Lab
1E | Individual/Small Group Study
1F | MILL (134)
1G | Group Study (133)
1H | Group Study (132)
1I | Group Study (131)
1J | Classroom (120)
2A | 2nd floor
3A | 3rd floor
4A | 4th floor
3A_1 | IMTC Area 1
3B_1 | IMTC Area 2
3C_1 | IMTC Area 3
3D_1 | IMTC Area 4

Type of use table and associated domain values
DomainName: TypeOfUsage
Description: Type of usage of the area
FieldType: SmallInteger
Domain Type: CodedValue
Coded values (Code | Name):
0 | Browsing Stacks
1 | Individual studying
2 | Lounging
3 | Meeting / Group Study
4 | Service Point (Circulation / Reference / ITS Help)
5 | Using Library Computers

Space assessment areas feature class table (Field | DataType | Description | Domain):
GlobalID | GUID | Global Identifier | (none)
SpaceID | String | Space Identifier | SpaceID
Floor | String | Building Floor | (none)
BldgName | SmallInteger | Building Name | BuildingName

Space assessment areas observations table (Field | DataType | Description | Domain):
TYPE_OF_USAGE | SmallInteger | Type of Usage | TypeOfUsage
NUMBER_OF_USERS | SmallInteger | Number of Users | (none)
GlobalID | GUID | Global Identifier | (none)
SpaceID | String | Space Identifier | SpaceID
COMMENTS | String | General Comments | (none)

Space assessment areas feature class to observations relationship class
Cardinality: OneToMany
IsAttributed: FALSE
IsComposite: FALSE
ForwardPathLabel: space_assessment_data
BackwardPathLabel: space_assessment_areas
Description: Relationship between the space assessment areas and data collected
Origin Class Name: space_assessment_areas
Origin Primary Key: SpaceID
Origin Foreign Key: SpaceID

REFERENCES

1 Michael F. Goodchild, "Part 1. Spatial Analysts and GIS Practitioners," Journal of Geographical Systems 2, no. 1 (2000): 5–10, https://doi.org/10.1007/s101090050022.
2 Pinde Fu and Jiulin Sun, Web GIS: Principles and Applications (Redlands, CA: Esri, 2011), 7.
3 Suzana Dragićević, "The Potential of Web-based GIS," Journal of Geographical Systems 6, no. 2 (2004): 79–81, https://doi.org/10.1007/s10109-004-0133-4.
4 Fu and Sun, Web GIS, 9.
5 "Who We Are," Esri, accessed October 17, 2017, http://www.esri.com/about-esri#who-we-are.
6 Ningning Kong, Michael Fosmire, and Benjamin Dewayne Branch, "Developing Library GIS Services for Humanities and Social Science: An Action Research Approach," College & Research Libraries 78, no. 4 (2017): 413–27, https://doi.org/10.5860/crl.78.4.413.
7 Bradley Wade Bishop and Lauren H. Mandel, "Utilizing Geographic Information Systems (GIS) in Library Research," Library Hi Tech 28, no. 4 (2010): 543, https://doi.org/10.1108/07378831011096213.
8 Jingfeng Xia, "Library Space Management: A GIS Proposal," Library Hi Tech 22, no. 4 (2004): 375, https://doi.org/10.1108/07378830410570476.
9 Jingfeng Xia, "GIS in the Management of Library Pick-up Books," Library Hi Tech 22, no. 2 (2004): 209–16, https://doi.org/10.1108/07378830410543520.
10 Jingfeng Xia, "Visualizing Occupancy of Library Study Space with GIS Maps," New Library World 106, no. 5/6 (2005): 219–33, https://doi.org/10.1108/03074800510595832.
11 Jingfeng Xia, "Locating Library Items by GIS Technology," Collection Management 30, no. 1 (2005): 63–72, https://doi.org/10.1300/J105v30n01_07.
12 Lauren H. Mandel, "Geographic Information Systems: Tools for Displaying In-Library Use Data," Information Technology & Libraries 29, no. 1 (2010): 47–52, https://doi.org/10.6017/ital.v29i1.3158.
13 Andrew Coyle, "Interior Library GIS," Library Hi Tech 29, no. 3 (2011): 529–49, https://doi.org/10.1108/07378831111174468.
14 Lisa M. Given and Heather Archibald, "Visual Traffic Sweeps (VTS): A Research Method for Mapping User Activities in the Library Space," Library & Information Science Research 37, no. 2 (2015): 100–108, https://doi.org/10.1016/j.lisr.2015.02.005.
15 "Suma," North Carolina State University Libraries, accessed October 17, 2017, https://www.lib.ncsu.edu/projects/suma.
16 "What Is a Geodatabase?," Esri, accessed October 17, 2017, http://desktop.arcgis.com/en/arcmap/10.4/manage-data/geodatabases/what-is-a-geodatabase.htm.
17 "Geodatabase Design Steps," Esri, accessed October 17, 2017, http://desktop.arcgis.com/en/arcmap/10.4/manage-data/geodatabases/geodatabase-design-steps.htm.
18 "Hosted Layers," Esri, accessed October 17, 2017, http://doc.arcgis.com/en/arcgis-online/share-maps/hosted-web-layers.htm.
10230 ---- Accessible, Dynamic Web Content Using Instagram

Jaci Wilkinson

Jaci Wilkinson (jaci.wilkinson@umontana.edu) is Web Services Librarian at the University of Montana.

ABSTRACT

This is a case study in dynamic content creation using Instagram's Application Program Interface (API). An embedded feed of the Mansfield Library Archives and Special Collections' (ASC) most recent Instagram posts was created for their website's homepage. The process to harness Instagram's API highlighted competing interests: web services' desire to most efficiently manage content, ASC staff's investment in the latest social media trends, and everyone's institutional commitment to accessibility.

INTRODUCTION

The Mansfield Library Archives and Special Collections (ASC) at the University of Montana had a simple enough request. Their homepage had been static for years, and it was not possible to add more content creation to anyone's workload. However, they had a robust Instagram account with more than one thousand followers. Was there any way to synchronize workflows with an Instagram embed on the homepage?

The solution was more complicated than we thought. We developed an Instagram embed, but in the process grappled with some fundamental questions of technology in the library. How do we streamline the creation and sharing of ephemeral, dynamic content? How do we reconcile web accessibility standards with the innovative new platforms we want to incorporate on our websites?

Libraries have invested heavily in social media to improve their approachability, reduce library anxiety, and interact with their users. At the Mansfield Library, this investment has paid off for ASC. This unit was an early adopter of Instagram, a photo- and short-video-sharing application whose posts are visible to the public or to approved followers. The ASC Instagram account launched in January 2015, and staff quickly settled on the persona of "Banjo Cat" to share collection items and relevant history. Banjo Cat was inspired by a whimsical nineteenth-century photograph in ASC of a cat playing a banjo (see figure 1). ASC now has about 1,200 followers, including many other libraries, archives, and special collections. In fact, connecting to a wider community of similar institutions was a driving factor in creating an Instagram account.
The ASC staff member who updates the account said, "While we have lots of interactions with patrons on Facebook we have basically zero interactions with other institutions. Instagram is all about interacting with other institutions, sharing ideas for posts, commenting on posts. So by learning about this community and participating and interacting with it we are able to . . . learn about programs and ideas that we would probably not have access to otherwise."1

Figure 1. Banjo Cat by L. A. De Ribas. Mansfield Library Archives and Special Collections. 1880s.

But while ASC's social media thrived, its website was bereft of dynamic content. Given that the ASC homepage is the ninth most visited page on the library site, it felt like a wasted opportunity to let such a highly trafficked area lack engaging, current, and appealing content. It seemed only natural to harness the energy put into the ASC Instagram account and embed that same light-hearted, community-oriented, and collection-focused content on the ASC homepage.

LITERATURE REVIEW

Libraries are enthusiastic adopters of social media; one study even shows that as of 2013, 94 percent of academic libraries had a social media presence.2 A 2006 Library Journal article observed the following about MySpace, then a popular social media platform: "Given the popularity and reach of this powerful social network, libraries have a chance to be leaders on their college campuses and in the larger community by realizing the possibilities of using social networking sites like MySpace to bring their services to the public."3 This open-minded spirit and willingness to try new technology trends was shrewd. Pew Research reports that as of 2016, 69 percent of Americans use some type of social media.4 Social media use has grown more representative of the population: the percentage of older adults on at least one social media site continues to increase.5 For academic libraries, the pull of Facebook was immediately strong because of the initial requirement for users to have a .edu address. Academic libraries very early on attempted to connect with students about services, resources, and spaces using Facebook.6

Dynamic content is a gateway to building interest in and buy-in to an institution. In user experience literature, "user delight" is "a positive emotional affect that a user may have when interacting with a device or interface."7 In Walter's Hierarchy of User Needs, pleasure tops all other needs.8

Figure 2. Aaron Walter's Hierarchy of User Needs, from Therese Fessenden, "A Theory of User Delight: Why Usability is the Foundation for Delightful Experiences," Nielsen Norman Group, March 25, 2017, https://www.nngroup.com/articles/theory-user-delight/.

Using social media to engage users with special collections has its own niche. Special collections are typically housed in closed stacks and have no digital equivalent.
Often the materials housed in special collections are rare, fragile, exotic, beautiful, and unusual; a study of library blogs and social media found that those with higher aesthetic value received more visitors and more revisits.9 Social media "gives users an idea of what the collection offers while it promotes and potentially gains foot traffic."10 It has even been suggested that social media gives special collections the opportunity to stand in when digitization isn't possible: "Instead of digitizing a whole collection, librarians can highlight important parts of the collection with a snippet of its history."11 In creating UCLA's Powell Library Instagram account, librarian Danielle Salomon writes, "Special collections items and digital library images can be a treasure trove of social media content. One of our library's goals is to increase students' exposure to special collections items, so we draw heavily from these collections."12

Instagram is a relative newcomer to social media, but it has been consistently successful since its inception in 2010.13 As of 2016, 28 percent of Americans use Instagram, up from 11 percent in 2013.14 Facebook bought Instagram in 2012 and has since bolstered the application's success by making the two platforms easy to navigate and share between. After Vine, a short-video application, was shuttered in 2017, Instagram's ability to take and post short videos increased its value. Instagram is distinct in that it is mobile-dependent: it is difficult to run the application through a web browser, and only one device can operate an Instagram account.

Within the library community, Instagram's adoption has been strongest in academic libraries. This is tied to the high number of Instagram users who are college-age.15 Another reason libraries select Instagram is that it has more diverse users than other social media applications, specifically African Americans and Latinos.16 In a 2016 study, Instagram was the second-most-popular choice among college students at Western Oregon University when asked what social media application the library should use (Twitter came in first). The most popular use of Instagram in academic libraries is familiarizing students with services, resources, and spaces. Uses include first-year instruction activities to combat library anxiety and mini-contests that ask users to identify the subjects of posted photos.17 UCLA's Powell Library discovered students posting Instagram photos of its spaces, so it initially joined to repost those photos and interact with those users. Instagram makes a library seem approachable. Librarian Joanna Hare reflected on this discovery: "Instagram is really powerful in that respect because you can just snap a few photos [and] show what's going on . . . so that students don't view the Library as being intimidating."18 Approachability is augmented by delegating photography and posting tasks to library student employees.

Social media is less often seen as a way to help create dynamic content for a library's website. The exceptions to this trend have come from institutions with substantial technology resources. North Carolina State University created open-source software that adds photos posted by anyone on Instagram to a library photo collection when a certain hashtag is used.19 The University of Nebraska's Calvin T.
Ryan Library created an RSS feed that disseminates blog posts to Twitter, Facebook, and the library homepage. Posts from followed accounts in Twitter and Facebook are also a part of the resulting feed. The RSS feed requires use of a third-party tool called Dlvr.it (https://dlvrit.com/), which supports many other social media applications, but not Instagram.

A notable absence in the literature on social media use in libraries is any mention of accessibility concerns. The "Improving the Accessibility of Social Media for Public Service" toolkit developed by a group of US government offices is a useful resource that includes specific guidelines on making Instagram posts more accessible.20 The toolkit explains that "more and more organizations are using social media to conduct outreach, recruit job candidates and encourage workplace productivity. . . . But not all social media content is accessible to people with certain disabilities, which limits the reach and effectiveness of these platforms. And with 20% of the population estimated to have a disability, government agencies have an obligation to ensure that their messages, services and products are as inclusive as possible."21 Given the stated importance of social media in library literature, the lack of conversation about accessibility and social media is a barrier to inclusivity.

MANSFIELD LIBRARY ARCHIVES AND SPECIAL COLLECTIONS' INSTAGRAM FEED

Dynamic content was lacking from every part of the ASC website, but staff lacked both the time and the content-management-system knowledge to create web content. There was a drive to solve this problem because a new web services librarian had recently been hired. When the web services librarian learned of ASC's thriving Instagram presence, she pursued the possibility of including that content on the ASC website. She felt that, in addition to being more efficient, content creation should stay in-house given the highly specialized nature of ASC's collections, spaces, and resources. The ideal solution would allow ASC staff to create and manage an Instagram feed unassisted; the web services librarian sought the simplest possible solution for them.

Our content management system and Instagram's developer website were first consulted with the hope that one provided an automated embed or plugin. Our content management system, Cascade, could pull in content from Facebook and Twitter but not Instagram, and Instagram did not have an automated feed creator. After more research, we learned that third-party Instagram feed embeds are the only way to create an Instagram feed without using Instagram's API. The API was considered a last-resort option because we knew that ASC staff could not manage the code themselves.

The idea of using any third-party service was undesirable because of a lack of control, stability, and accessibility. If the service had technical issues or went out of business, it would be very noticeable given the visibility of ASC's homepage. In 2012, a student advocacy organization at the University of Montana filed a civil rights complaint with the US Department of Education focusing on disabled students' unequal access to electronic and information technologies. Since then, the Mansfield Library has been proactive in eliminating barriers to access.22 Given this history, we are wary of how accessible third-party applications are to someone using assistive technology, most likely a screen reader.
Juicer (https://www.juicer.io/), for example, is a freely available service for an Instagram feed, but in exchange it retains its branding prominently at the top of the feed. An example of Juicer in use can be found on the home page of the Baltimore Aquarium (http://aqua.org/). Tests of Juicer showed that it was not accessible to a screen reader. Finally, it didn't fit our need: Juicer curated posts from other users depending on hashtags and reposts, but we only wanted to feature our own content. The unpredictability of other accounts' posts ending up on the ASC homepage was not desirable.

Instagram's developer site did not make finding a solution easy. The page titled "Embedding" is about embedding individual posts on a webpage, not a whole feed.23 This content does not even link out to an explanation of how to embed a feed. The "Authentication" page is where the process begins because calling the API requires a token from an authenticated Instagram account user.24 A user is authenticated by creating a client ID and then receiving an access token. Another interesting roadblock is that the "Authentication" page provides no further information about using the access token to call the API. It took outside research to finally figure out the steps needed to make the API requests for ASC's feed.25 PHP code is used to call the API and copy the three most recent ASC Instagram posts to a local server file. (Using JavaScript to call the API is a poor choice because that code would make the account's access token public. If anyone sees this token, they can use it themselves to pull your feed using the Instagram API.) CSS replicates the look and feel of Instagram with white, minimalistic icons and a simple photo display that darkens and shows the beginning of the description when a user's mouse hovers over it. All code from this project is freely available in GitHub.26

There is a catch to this embedded feed process. The directions given through Instagram and by the online sources we used only took us to sandbox mode (in web development, sandbox refers to a restricted or test version of a final product). In sandbox, Instagram limits the number of requests to the API. Unfortunately, a request was made every time someone went to the ASC page. The initial feed stopped working in minutes because we did not realize what this limitation of sandbox mode meant. Another look at the Instagram developer site taught us that the only way to leave sandbox was to have our "app," as Instagram called it, reviewed.27 In other words, Instagram has only set up its API to be used for full application development (like Juicer). We decided not to leave sandbox mode because of uncertainty about what Instagram's review process would entail. If our app was rejected, would they force us to discontinue our work? The timeline for the approval process was also uncertain. Distrust and uncertainty, unfortunately, guided our decision-making at this stage. Instead of undergoing the review process, the PHP code was reconfigured to call the API only once a day. This made the feed less dynamic because it was not updating in real time. For our purposes this was not a problem; the ASC Instagram account is updated at most once or twice a week anyway. As a result, we are "scraping" ASC's Instagram account.
Although "crawling, scraping, and caching" are prohibited by Instagram's terms of use, other Instagram feeds in GitHub have similar workarounds and point out that a plugin/scraper "uses (the) same endpoint that Instagram is using in their own site, so it's arguable if the TOC [terms of use] can prohibit the use of openly available information."28

While figuring out how to work with the Instagram API, a major accessibility roadblock cropped up: there was no place for alt text, the descriptive information about an image that is used by assistive technologies for users with low vision. Besides taking or uploading a photo, the only other actions offered when creating a new post were to write a caption, tag people, or add a location. Only the caption allowed for a text string. Without alt text, not only is the Instagram feed unintelligible to a screen reader, but it disturbs a screen reader user's interaction with all other content on that page. An ASC staff member discovered a solution when she noticed a Joshua Tree National Park Instagram post with alt text at the bottom of the caption. Although initially put off by the "wordiness," we concluded this was the only logical way to move forward. The benefits of this format of alt text came into focus as we moved through the project: the ASC staff member was able to choose the desired alt text without any additional steps or skills, and we grew to relish the opportunity to explain to curious users what the #alttext hashtag meant and why it was important to us. PHP code isolates all text after #alttext and displays that as the alt text to a screen reader (an illustrative sketch follows at the end of this section).

Since the Instagram feed was implemented, it has been interesting to follow how the Instagram developer site has changed and grown. Although Facebook has owned Instagram for five years, the Instagram developer site is only now starting to link out to Facebook developer content. Most recently, the Instagram developer site has been advertising the Instagram Graph API for use by business accounts. This type of development is useless for us because we have a personal Instagram account, not a business account. And the function of the Instagram Graph API is focused on the internal user and analytics, not the end user and user experience. Even if the Instagram Graph API were available for personal accounts, it is worth asking whether this type of data collection would be of use to an organization that doesn't have the labor of a devoted marketing team.
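The production code for the feed is PHP and lives in the project's GitHub repository cited in the references; purely as an illustration of the two ideas described above (calling the API at most once a day, and splitting the caption on #alttext), a Python sketch might look like the following. The endpoint shown reflects the legacy Instagram API of that era, and the token placeholder and file names are hypothetical.

# Illustrative Python sketch of the caching and #alttext ideas described
# above; the project's actual implementation is PHP. The access token is
# a placeholder and the cache file name is hypothetical.
import json
import time
import urllib.request

FEED_URL = ("https://api.instagram.com/v1/users/self/media/recent/"
            "?access_token=<ACCESS_TOKEN>&count=3")
CACHE = "instagram_cache.json"
ONE_DAY = 24 * 60 * 60

def fetch_feed():
    """Call the API at most once a day; otherwise serve the cached copy."""
    try:
        with open(CACHE) as fh:
            cached = json.load(fh)
        if time.time() - cached["fetched"] < ONE_DAY:
            return cached["posts"]
    except (FileNotFoundError, KeyError, ValueError):
        pass  # no usable cache yet
    with urllib.request.urlopen(FEED_URL) as resp:
        posts = json.load(resp).get("data", [])
    with open(CACHE, "w") as fh:
        json.dump({"fetched": time.time(), "posts": posts}, fh)
    return posts

def split_caption(caption):
    """Everything after #alttext becomes the image's alt attribute."""
    text, _, alt = caption.partition("#alttext")
    return text.strip(), alt.strip()

A page request then renders each cached post as an image whose alt attribute comes from the text after #alttext, keeping the feed intelligible to screen readers.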
The ASC’s embedded homepage Instagram feed fits their needs, is accessible, and builds community around their unique collections. By providing all the code created in this project in GitHub,29 including the CSS we used, our hope is that institutions interested in this Instagram feed model could replicate it for their own purposes without extensive technical support. ACKNOWLEDGMENTS I am thankful for the expertise of Carlie Magill, Donna McCrea, and Wes Samson. Without them this project would not have been possible. REFERENCES 1 Carlie Magill, e-mail message to author, August 8, 2017. 2 Michael Sutherland, “RSS Feed 2.0” Code4Lib 31, January 28, 2016, http://journal.code4lib.org/articles/11299. 3 Beth Evans, “Your Space or MySpace?” Library Journal 131 (2006): 8–12. Library, Information Science & Technology Abstracts, EBSCOhost. 4 “Social Media Fact Sheet,” Pew Research Center, January 12, 2017, http://www.pewinternet.org/fact-sheet/social-media/. 5 Ibid. 6 Brian S. Mathews, “Do you Facebook?” C&RL News, May 2006, http://crln.acrl.org/index.php/crlnews/article/viewFile/7622/7622. 7 Therese Fessenden, “A Theory of User Delight: Why Usability is the Foundation for Delightful Experiences,” Nielsen Norman Group, March 25, 2017, https://www.nngroup.com/articles/theory-user-delight/. 8 Ibid. 9 Daryl Green, “Utilizing Social Media to Promote Special Collections: What Works and What Doesn’t” (paper, 78th IFLA General Conference and Assembly, Helsinki, Finland, June 2012), 11, https://www.ifla.org/past-wlic/2012/87-green-en.pdf. 10 Katrina Rink, “Displaying Special Collections Online,” Serials Librarian 73, no. 2 (2017): 1–9, https://doi.org/10.1080/0361526X.2017.1291462. 11 Ibid. http://journal.code4lib.org/articles/11299 http://www.pewinternet.org/fact-sheet/social-media/ http://crln.acrl.org/index.php/crlnews/article/viewFile/7622/7622 https://www.nngroup.com/articles/theory-user-delight/ https://www.ifla.org/past-wlic/2012/87-green-en.pdf https://doi.org/10.1080/0361526X.2017.1291462 ACCESSIBLE, DYNAMIC WEB CONTENT USING INSTAGRAM | WILKINSON 26 https://doi.org/10.6017/ital.v37i1.10230 12 Danielle Salomon, “Moving on from Facebook,” College & Research Libraries News 74, no. 8 (2013): 408–12, https://crln.acrl.org/index.php/crlnews/article/view/8991. 13 Sarah Perez, “The Rise of Instagram,” TechCrunch, April 24, 2012, https://techcrunch.com/2012/04/24/the-rise-of-instagram-tracking-the-apps-spread- worldwide/. 14 “Social Media Fact Sheet,” Pew Research Center, January 12, 2017, http://www.pewinternet.org/fact-sheet/social-media/. 15 Lauren Wallis, “#selfiesinthestacks: Sharing the Library with Instagram,” Internet Reference Services Quarterly 19, no. 3–4 (2014): 181–206, https://doi.org/10.1080/10875301.2014.983287. 16 Elizabeth Brookbank, “So Much Social Media, So Little Time: Using Student Feedback to Guide Academic Library Social Media Strategy ,” Journal of Electronic Resources Librarianship 27, no. 4 (2015): 232–47, https://doi.org/10.1080/1941126X.2015.1092344; Salomon, “Moving on from Facebook.” 17 Wallis,“#selfiesinthestacks”; Salomon, “Moving on from Facebook.” 18 Wendy Abbott et al., “An Instagram is Worth a Thousand Words: An Industry Panel and Audience Q&A,” Library Hi Tech News 30, no. 7 (2013): 1–6, https://doi.org/10.1108/LHTN- 08-2013-0047. 19 Salomon “Moving on from Facebook.” 20 “Federal Social Media Accessibility Toolkit Hackpad,” Digital Gov, accessed November 25, 2017, https://www.digitalgov.gov/resources/federal-social-media-accessibility-toolkit-hackpad/ . 21 Ibid. 
22 Donna E. McCrea, "Creating a More Accessible Environment for Our Users with Disabilities: Responding to an Office for Civil Rights Complaint," Archival Issues 38, no. 1 (2017): 7, https://scholarworks.umt.edu/ml_pubs/25/.
23 "Embedding," Instagram Developer, accessed November 25, 2017, https://www.instagram.com/developer/embedding/.
24 "Authentication," Instagram Developer, accessed November 25, 2017, https://www.instagram.com/developer/authentication/.
25 Pranay Deegoju, "Embedding Instagram Feed in Your Website," Logical Feed, December 25, 2015, https://www.logicalfeed.com/embedding-instagram-feed-in-your-website.
26 Wes Samson, "ws784512 instagram," GitHub, 2016, https://github.com/ws784512/instagram.
27 "Sandbox Mode," Instagram Developer, accessed November 25, 2017, https://www.instagram.com/developer/sandbox/.
28 "Terms of Use," Instagram, accessed November 25, 2017, https://help.instagram.com/478745558852511; and "image-hashtag-feed," Digitoimisto Dude Oy, accessed November 25, 2017, https://github.com/digitoimistodude/image-hashtag-feed.
29 Samson, "ws784512 instagram."

10237 ---- Letter from the Editor

Kenneth J. Varnum

I am excited to have been appointed Editor of Information Technology and Libraries as the journal enters its 50th year. Originally published as the Journal of Library Automation, ITAL has a long history of tracking the rapid-fire changes in technology as it relates to libraries. Much as it has over the past 50 years, technology will continue to change not just the way libraries offer services to their communities, but the way we conceptualize what it is we do. If past is prologue, I have no doubt the next decades will continue to amaze, probably in ways even the most adventurous trend-forecaster won't get quite right. In the context of the rapid change in how we do our work, what we do will remain the same: collecting, preserving, and providing access to the information and artifacts of our culture, whatever that may be.

I would like ITAL to grow and expand, while keeping its core essence the same. That core is high-quality, relevant, and informative articles, reviewed by our peers, and made available to the world.
But I think there is more we can do for LITA and the library technology profession by expanding the scope and impact of the journal: seeking and soliciting articles from a wider range of librarians, adding more case studies to the research articles that are at the journal's core, and responding more rapidly to the evolving technology landscape in front of us. To that end, I invite you to think broadly about researching, documenting, and describing the technology-related work you do so that others can learn about it. I welcome questions about how your project might fit into ITAL, and look forward to working with you.

I'd like to close by extending my thanks to Bob Gerrity, who served as ITAL's editor for the past 6 years and stewarded the journal's transition to an open access publication. I am grateful for his service to ITAL, LITA, and the profession.

Sincerely,
Kenneth J. Varnum
Editor
varnum@umich.edu

10238 ---- President's Message

Andromeda Yelton

Andromeda Yelton (andromeda.yelton@gmail.com) is LITA President 2017-18 and Senior Software Engineer, MIT Libraries, Cambridge, United States.

Before I dive into my column, I'd like to recognize and thank Bob Gerrity for his six years of service as ITAL's Editor in Chief. He oversaw our shift from a traditional print journal to a fully online one, recognized by Micah Vandegrift and Chealsye Bowley as having the strongest open-access policies of all LIS journals (http://www.inthelibrarywiththeleadpipe.org/2014/healthyself/). I'd like to further extend a welcome to Ken Varnum as our new Editor in Chief. Ken's distinguished record of LITA service includes stints on the ITAL Editorial Board and the LITA Board of Directors, so he knows the journal very well, and I am enthusiastic about its future under his lead.

I'm particularly curious to see what will be discussed in ITAL under Ken's leadership because I've just come back from two outstanding conferences that drove home the significance of the issues we wrestle with in library technology, and I'm looking forward to a third.

In early November, I attended LITA Forum in scenic Denver. The schedule was packed with sessions on intriguing topics – too many, of course, for me to attend them all – but two in particular stand out to me. In one, Sam Kome detailed how he's going about a privacy audit at the Claremont Colleges Library. He walked us through an extensive – and sometimes surprising – list of places personally identifiable information can lurk on library and campus systems, and talked through what his library absolutely needs (which is less than he'd thought, and far less than the library has been logging without thinking about it). In the other, Mary Catherine Lockmiller took a design thinking approach to serving transgender populations. She shared a fantastic, practical LibGuide (http://libguides.southmountaincc.edu/transgenderresources), but the part that stuck with me most is her statement that many trans people may never physically enter a library because public spaces are not safe spaces; for this population, our electronic services are our public services. As technologists, we create the point of first, and maybe only, contact.

A week later, I attended the inaugural Data for Black Lives conference (http://d4bl.org/) at the MIT Media Lab, steps from my office. This was – and I think everyone in the room felt it – something genuinely new.
From the galvanizing topic, to the sophisticated visual and auditory design, to the frisson of genius and creativity buzzing all around a room of artists, activists, professors, poets, data scientists, and software engineers, it was a remarkable experience for us all.

Those of you who heard Dr. Safiya Noble speak at Thomas Dowling's LITA President's program in 2016 are familiar with algorithmic bias. Numerous speakers discussed this at D4BL: the ways that racial disparities in underlying data sets can be replicated, magnified, and given a veneer of objective power when run through the black boxes that power predictive policing or risk assessment for bail hearings. Absent and messy data was a theme as well: in a moment that would make many librarians chuckle (and then wince) knowingly, a panel of music industry executives estimated that 40% of their metadata is wrong, thus making it impossible to credit and compensate artists appropriately.

And yet – in a memorable keynote – Dr. Ruha Benjamin called on us not only to collect data about black death, as she showed us an image of the ambulance bill sent to Tamir Rice's family, but to listen to our artists and poets as we use our data to imagine black life – this in front of an image of Wakanda. With our data and our creativity, what new worlds can we map?

Several of my MIT colleagues also attended D4BL, and as we discussed it afterward we started thinking about how these ideas can drive our own work. How does the imaginary world of Wakanda connect to the archival imaginary, and what worlds can we empower our own creators to imagine with what we collect and preserve? How can we use our data literacy and access to sometimes un-Googleable resources to help community groups collate data on important issues that are not tracked by our public institutions, such as police violence (https://mappingpoliceviolence.org/) or racial disparities in setting bail?

With these ideas swirling in my mind, I am looking forward with tremendous excitement to LITA Forum 2018. Building on the work of our Forum Assessment Task Force, we'll be doing a lot of things differently; in particular, aiming for lots of hands-on, interactive sessions. This will be a conference where, whether you're a presenter or an attendee, you'll be able to do things. And these last two conferences have driven home for me how very much there is to do in library technology. Our work to select, collect, preserve, clean, and provide access to data can indeed have enormous impact. Technology services are front-line services.

10239 ---- Editorial Board Thoughts: Reinvesting in Our Traditional Personnel Through Knowledge Sharing and Training

Mark Dehmlow

Mark Dehmlow (mdehmlow@nd.edu) is a member of the ITAL Editorial Board and Director of Library Information Technology, Hesburgh Library, University of Notre Dame, South Bend, Indiana.

Lately I have been giving a lot of thought to how those of us in technology positions can extend our impact throughout our organizations.
With finite budgets and time and relatively low personnel turnover, I have realized that the solution goes beyond merely finding ways that technology can optimize workflows through automation. I have been working in academic library technology for nearly 20 years, and when I began my career, virtually all areas of technology required specialized staff – from supporting general computer applications to managing the technical infrastructure that underlay our core systems. These days, technology is still a specialty, but the function of technicians has become more focused on providing infrastructure, and much of the general application support we used to provide has become ubiquitous, an expectation of almost all library positions. Managing email, creating specialized formulas for data analysis, navigating operating systems, even developing basic databases are now regular parts of library work. The trend of technological infusion will continue, but instead of general technical tasks, almost all new library positions will require deeper technical skills. This is due, in part, to the function of knowledge work becoming more specialized as libraries focus on the areas where they can create the most value, and those new domains require more technical expertise to be effective.

Perhaps the most striking example of this evolution is the transition of catalogers to metadata specialists. The days of working with a single metadata format (MARC) in a single, tabular interface (the catalog) are quickly slipping away, replaced by metadata structured in multiple complex schemes and expressed in formats like XML and JSON. Instead of acquiring data from OCLC, libraries need to work with web-based APIs to harvest metadata. And the tools for manipulation require basic programming skills in languages like Python, or working with open source applications that look little like the integrated library systems we are used to. Working with these tools can enable metadata experts to customize metadata at scale, but it requires new knowledge and even new ways of thinking about metadata and metadata manipulation.

Cataloging isn't the only position undergoing change in academic libraries, either. Acquisitions is pushing toward greater automation and patron-driven selection. The catalog is becoming more like a bookstore, and the discovery landscape includes a panoply of resources that are purchased only at the point a user clicks on a link to a resource. Acquisitions is also occurring at larger scale, requiring the ability to work with thousands of items in a batch, to select based on the qualities of what libraries want to make available, to analyze usage trends, and to load, update, and remove metadata from our discovery environment as quickly as possible. The tools to accomplish this are similar to those for metadata. Beyond technical services, we're beginning to see the role of the subject selector transition from building broad disciplinary collections toward a focus on curation
Technologically-driven change regularly outpaces generational personnel turnover in libraries, and given that technological change continues to grow exponentially, it is clear we need a flexible workforce and an organizational commitment to training and professional growth. While organizations are rewriting positions to include technical skills, we will always have a preponderance of staff who started their careers in libraries with depreciating skillsets. Merely directing staff to webinars, conferences, and self-driven development isn't enough. Multi-day workshops are great as long as there are opportunities to apply learning upon returning to work. To guarantee skill retention, sustained training needs to be directed towards the specific skills needed now and based in actual work, not just theoretical exercises. The challenge, then, becomes how to implement such a program and identify who can provide the necessary training. How can specialization be disseminated to non-specialists? Many libraries have some of the needed resources close at hand, even if staffing is thin and technical resources scarce. It requires thinking a bit pragmatically to reuse the resources libraries do have, and for technologists to evolve with demands as well, transitioning our roles from technology experts alone to a hybrid of practitioners, teachers, and enablers. Teaching is, itself, a specialty, and many IT professionals are unlikely to have developed that skillset. Most libraries, though, have staff who do have experience and expertise in training and pedagogy. Evolving toward in-sourced technology development will undoubtedly require IT staff to first learn effective teaching methods and basic curriculum development. They will need a framework to take a set of specific skills and build ad-hoc courses with medium-range learning objectives. Teaching can occur in the context of actual work scenarios so that learning is put to practical use as part of that training and skill retention is improved. Libraries can become labs for cross-training and knowledge sharing by leveraging our teachers and technologists in interdisciplinary partnerships and collaboration with a focus on internal growth, so that library organizations can meet continuously changing demands. Once staff have been trained in new technical areas, there is another opportunity for IT professionals to extend their impact: by dividing technology-driven projects into the parts that require deep technical work and the parts that require transferable technical skills. If technologists start looking at ways to implement technical solutions in componentized ways instead of as end-to-end solutions, they have the opportunity to empower newly trained staff to contribute in practical ways through building solution foundations and then delegating configurable application inputs. As an example – developing a full application stack requires considerable programming skill, but learning to create and update extensible stylesheets to transform XML-based metadata is a teachable skill. IT professionals could develop applications that take a configuration file and an XSL file as inputs, while staff with XSLT training modify the configuration to include parameters for connecting to APIs or loading XML. Trained staff could then modify the XSL to transform data to their specifications without having to pass the task back to the IT professional.
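A hedged illustration of that division of labor follows. The file names and configuration keys are hypothetical, and the lxml dependency is an assumption; the point is that the entire application an IT professional maintains can be a few lines, while trained staff edit only the configuration and the stylesheet.

import json
from lxml import etree  # third-party library, assumed installed

# Staff-editable inputs (hypothetical names): config.json names the XML
# source to load; transform.xsl holds the staff-maintained mapping rules.
with open("config.json") as f:
    config = json.load(f)  # e.g., {"source": "records.xml"}

source = etree.parse(config["source"])
stylesheet = etree.XSLT(etree.parse("transform.xsl"))

# Apply the staff-edited stylesheet and emit the transformed metadata.
print(str(stylesheet(source)))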
Moving toward more holistic technology capability in libraries will require all personnel to be committed to evolving to meet the emerging needs of our organizations – IT professionals included. For decades, technologists have been in the privileged position of having the necessary skills to advance the profession's digital future, but it will be important for technologists in libraries to integrate the many valuable skills other personnel can offer so that they also can evolve in ways that best support our organizations – leveraging foundational library skills to enhance overall organizational capacity to accomplish tasks that increasingly require technical expertise. I won't pretend it will be easy. It will require libraries to prioritize organizationally led training, even amidst the flurry of demands around us, but I think it is also critical to the future of the profession, and the old adage that winter pays for summer feels apropos here. Technologists will need to be open to incorporating foundational library skills, to collaborating with and learning from other library specialists, to thinking of their positions more broadly, and, for those who live in ivory towers (you know who you are), to eliminating the silos they've built so they can collaborate, cooperate, and engage. Technologists are an important part of library ecosystems with what we contribute operationally, but I think we can have a greater impact if we propagate our knowledge in an effort to increase the profession's overall technology capacity and become agents to support knowledge workers' future skill development. 10240 ---- Enhancing Visibility of Vendor Accessibility Documentation Samuel Kent Willis and Faye O'Reilly Samuel Kent Willis (samuel.willis@wichita.edu) is Assistant Professor and Technology Development Librarian and Faye O'Reilly (faye.oreilly@wichita.edu) is Assistant Professor and Digital Resources Librarian at Wichita State University. ABSTRACT With higher education increasingly being online or having online components, it is important to ensure that online materials are accessible for persons with print and other disabilities. Library-related research has focused on the need for academic libraries to have accessible websites, in part to reach patrons who are participating in distance-education programs. A key component of a library's website, however, is the materials it avails to patrons through vendor platforms outside the direct control of the library, making it more difficult to address accessibility concerns. Librarians must communicate the need for accessible digital files to vendors so they will prioritize it. In much the same way as contracted workers constructing a physical space for a federal or federally funded agency must follow ADA standards for accessibility, so software vendors should be required to design virtual spaces to be accessible. A main objective of this study was to determine a method of increasing the visibility of vendor accessibility documentation for the benefit of our users. It is important that we, as service providers for the public good, act as a bridge between vendors and the patrons we serve.
INTRODUCTION The World Wide Web was developed late in 1989 but reached the public sector the following year and quickly gained prominence.1 Around this same time (1990), the Americans with Disabilities Act (ADA) was also passed, so when it was written the role of the web had yet to take shape. Websites and online content, while not included specifically in the ADA, have been increasingly emphasized when institutions examine the accessibility of their resources for persons with disabilities. More recent legislation, as well as legal-settlement agreements (including with colleges and universities), has included—and even emphasized—the importance of accessible online content. Researchers have argued that in requiring facilities to be accessible, the ADA must include digital accessibility.2 With higher education increasingly being online or having online components, it is important to ensure that online materials are accessible for persons with print and other disabilities, many of whom may have received more extensive support in primary and secondary schools. Unless accessibility is pursued with purpose, the level of education and educational materials available for students with disabilities will be severely limited.3 LITERATURE REVIEW Legislation and Existing Guidelines Equal access to information for all patrons is a foundational goal of libraries. In higher education, accessible information and communications technology allows users of all abilities to focus on learning without undue burden.4 Colleges and universities are required by law to provide reasonable accommodations to allow an individual with a disability to participate fully in the programs and activities of the university. According to Title II of the ADA, discrimination on the basis of disability by any state or local government and its agencies is strictly prohibited.5 Section 504 of the Rehabilitation Act of 1973 also prohibits discrimination on the basis of disability by any program or activity receiving federal assistance.6 The Department of Education stated, "Public Educational institutions that are subject to Education's Section 504 regulations because they receive Federal financial assistance from us are also subject to the Title II regulations because they are public entities (e.g., school districts, State educational agencies, public institutions of vocational education and public colleges and universities)."7 This piece of legislation usually manifests itself in the physical learning space—wheelchair ramps, braille textbook options, interpreters, and more—but finds little application in the digital spaces of a university, especially in the library's online research presence. This is an alarming revelation; much higher learning today takes place in an online environment, and inaccessible library resources are a contributing factor to challenges in higher education faced by users with disabilities. To be considered accessible, a digital space, such as a website, online-learning management system, or research discovery layer, and any Word documents, PDFs, and multimedia presented therein, should be formatted in such a way that it is compatible with assistive technologies, such as screen-reading software. A website should also be navigable without a mouse, using visual or auditory cues.
Content on a website ought to be clearly and logically organized, with skip navigation links to bypass to the page's main content. Images should have alternative text descriptions, known as "alt text," that are brief and informative, describing the content and role of the image. Links should likewise have clear descriptions of the target page. These and similar considerations aim to help persons with impairments that may make reading a monitor or screen difficult.8 Digital spaces like a research database are considered electronic information technology (EIT). EIT is defined as "information technology and any equipment or interconnected system or subsystem of equipment that is used in the creation, conversion or duplication of data or information."9 Recently this terminology was converted to information and communications technology (ICT) by the final rule updating Section 508 in early 2017, but the essence of what it means remains unchanged.10 Legislation regarding digital accessibility exists, specifically Section 508 of the Rehabilitation Act of 1973, but only federal agencies and institutions receiving federal aid are required to abide by these statutes. Lawmakers considered technology a growing part of daily life in 1998 and amended the Rehabilitation Act with Section 508, requiring federal agencies to make their ICT accessible to people with disabilities.11 In 2017, these standards were updated with a final rule that modernized guidelines for accessibility of future ICT.12 Any research databases or other applications used by college and university libraries to facilitate online learning would be considered ICT and thereby subject to Section 508 requirements. It is evident that libraries not only have legal reasons to comply with Section 508 but ethical reasons as well, because making library collections and services universally available is a core value of the library community.13 In addition to legislation, the World Wide Web Accessibility Initiative (WAI) created the Web Content Accessibility Guidelines (WCAG) in 1999 in response to the growing need for web accessibility and to promote universal design. These standards, created for web-content creators and web-tool developers, are continually updated as new technologies and capabilities emerge—with version 2.0 being released in 2008—and apply specifically to web content and design. Many of these guidelines were absorbed by the 2017 refresh of Section 508 of the Rehabilitation Act of 1973.14 With fourteen guidelines assigned priority levels 1–3, WCAG 2.0 and subsequent revisions to date offer three levels of conformance with digital-accessibility guidelines: Level A, the most basic level, meaning all mandatory level 1 guidelines are met; Level AA, meaning priority levels 1 and 2 are met; and Level AAA, meaning priority levels 1–3 are met. These conformance levels are important because many ICT vendors will make their claims to conformance with WCAG standards by using provided WAI icons or using statements that refer to the level of conformance.15 WCAG 2.0 guidelines alone are not enough to determine fully if a website or other digital content is truly accessible.
Accessibility partly depends on an intuitive layout for a variety of users, which can only be achieved through usability testing.16 It is crucial that librarians understand what is required for a product or service to be considered accessible, and a firm grasp of WCAG 2.0 and its conformance levels will enrich a librarian's understanding of web accessibility and Section 508 regulations.17 A Voluntary Product Accessibility Template (VPAT) is a self-assessment document that vendors are required to complete only if they wish to sell their products to the federal government or any institution that chooses to require them. The quality of VPATs varies, but essentially they list Section 508 standards and for each specify whether they fully or partially support it, do not support it, or whether the standard is not applicable. There is then a space for the vendor to provide an explanation for limitations. Since these are voluntary self-assessments, these documents can sometimes be brief and incomplete, but even brief statements can be specific enough to verify the claims of support relatively easily. Because libraries are portals to online content, including e-books, e-journals, databases, streaming media, and more, which are provided largely by third-party vendors, libraries face unique struggles when attempting to comply with federal regulations. Notions of equality and equal access are inherent to libraries and important for the maintenance of a democratic society, which makes accessibility within libraries' digital content a concerning ethics issue.18 Having little control over how ICT is designed, libraries still must figure out how to address accessibility needs within third-party ICT. In 2012, the Association of Research Libraries (ARL) Joint Task Force on Services to Patrons with Print Disabilities encouraged libraries to require publishers to implement industry best practices, comply with legal requirements for accessibility, include language in publisher and vendor contracts to address accessibility, and request documentation like VPATs.19 The task force's report was vital in the creation and direction of this study. Existing Literature and Studies As library professionals, we may often make assumptions about the accessibility of a third-party resource when the reality is that greater importance is placed on the design of a product; accessibility components are either added as special features or included once the design work is completed.20 Tatomir and Durrance conducted a study on the compatibility of thirty-two library databases with a set of accessibility guidelines they called the Tatomir Accessibility Checklist.21 This list included checking the usability of these databases with a screen reader and refreshable braille display. They found that 44 percent of the databases were inaccessible, with an additional 28 percent being only "marginally accessible," based on their criteria. This suggests major problems exist within vendor database platforms.22 Building on this research, Western Kentucky University Libraries conducted a study on VPATs from vendors to determine how accessible seventeen of their databases were.23 The university libraries ran an accessibility scan on those databases and compared the results with the vendors' VPATs, finding that the templates from the vendors were accurate about 80 percent of the time.
Most of the vendors did not address the accessibility of Portable Document Format (PDF) files in their VPAT statements, though it was an important component of their services. Pertinent to this study, Western Kentucky's work looked for accessibility documentation on vendors' websites, and when none was found, contacted the vendors requesting this information. This study was unique for targeting vendor-supplied VPATs rather than only examining the databases themselves or tutorials from vendors. As mentioned previously, this was only done for the libraries' main database vendors. Mune and Agee published an article on the Ebooks Accessibility Project (EAP) funded by Affordable Learning Solutions at the California State University System. In this project, the researchers compared academic e-book platforms to e-reader platforms used for popular trade publications. They gathered data on the top sixteen library e-book vendors at San Jose State University based on patron usage and title and holdings counts. The results indicated that academic e-book platforms were less accessible than nonacademic platforms, largely because of hesitance in adopting the EPUB 3 format, which by default has superior navigation and document structure to PDF or HTML, common academic options.24 While this study focused solely on the accessibility of e-book materials, a method for contacting vendors used in the EAP study was adapted for the current study and applied at a larger scale. The EAP researchers attempted to locate the vendors' VPATs online, and they contacted the vendors at least twice to request a VPAT or other accessibility statement when none was located. It is noteworthy that of the sixteen vendors, all but one (94 percent) provided EAP with some form of accessibility documentation, though less than half (44 percent) had a VPAT available.25 Another study, by Joanne Oud, examined vendor-supplied database video tutorials. Half of the twenty-four vendors examined in Oud's study had tutorials in formats that were not accessible by keyboard or screen reader, largely because many of these tutorials were Flash-based.26 Shockwave Flash is neither accessible for persons with disabilities nor good for usability on modern browsers.27 Oud's findings suggest that tutorial content would be more widely accessible if it were placed on YouTube or another platform where transcripts and captions are available. While the focus of the study was different from our own, it was similar in that Oud examined the accessibility of vendor materials apart from the journals and collections. Also, Oud noted that to make use of vendor tutorials, the website on which they are housed must likewise be accessible and the videos easy to find, but this is often not the case.28 Other studies suggest that vendor websites and platforms often impede access to information. Vendor platforms often have inaccessible PDFs, or the links to the full-text options are not easily located. DeLancey's study also found that more than three-fourths of the vendors examined had images without alternative text and frames without titles, resulting in many users with visual impairments being left out of the content of these images and frames entirely.
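Checks of this kind are straightforward to automate. The following sketch is illustrative only and is not the instrument used in any of the studies cited here; it relies solely on Python's standard library, the target URL is hypothetical, and it flags just the two problems DeLancey reported: images without alt text and frames or iframes without titles.

import urllib.request
from html.parser import HTMLParser

class AccessibilityScan(HTMLParser):
    """Collect images lacking alt text and frames lacking titles."""
    def __init__(self):
        super().__init__()
        self.problems = []

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "img" and not attrs.get("alt"):
            self.problems.append(("img missing alt text", attrs.get("src")))
        elif tag in ("frame", "iframe") and not attrs.get("title"):
            self.problems.append((tag + " missing title", attrs.get("src")))

# Hypothetical target; substitute any vendor platform page.
url = "https://vendor.example.org/search"
page = urllib.request.urlopen(url).read().decode("utf-8", "replace")
scanner = AccessibilityScan()
scanner.feed(page)
for problem, src in scanner.problems:
    print(problem, src)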
Of particular note, however, was the finding that not one of the vendors in this study had all forms—buttons, search boxes, and other browser navigation tools—labeled correctly, leaving the sites difficult to navigate.29 Beyond whether the information itself is accessible, the question inevitably arises, can the desired information even be reached? One way or another, the content on these platforms must be accessible and easy to find. Part of the motivation behind the current study stems from what DeLancey put so well: "Only one vendor (out of seventeen), Project Muse, had a publicly available VPAT on their website, though 9 others supplied this documentation upon request in under a week."30 The first step in improving accessibility of resources for our patrons is to discuss accessibility with them—to determine how accessible information resources are today and identify areas of need. If a VPAT or, minimally, any form of an accessibility statement is not easily discoverable on a vendor's website—even if it is available upon request—users with disabilities as well as enabled users are not able to benefit from this information. Are the vendors making it a priority in this case? Additionally, since 41 percent of the vendors DeLancey examined had no VPAT at all, what can be done before and aside from reaching out to vendors and stressing the importance of accessibility and of making statements on accessibility easy to find? From legal responsibilities to the dismal reality of digital accessibility, the task of improving library service for patrons with disabilities is daunting, even with the empowering ethical drivers of the library value system. Ostergaard's "Strategies for Acquiring Accessible Electronic Information Sources" is a practical guide, informed by her own library's commitment to accessibility, that helps librarians develop an accessibility plan. Steps 3 and 4 of Ostergaard's strategies are particularly relevant to the current study. Step 3, "Communicating with Vendors," involves inquiring about the accessibility of electronic products in addition to asking about any future plans for accessibility of the product and requesting VPATs or other vendor-supplied accessibility documentation. Step 3 also recommends that librarians request that vendors meet WCAG 2.0 best practices and incorporate a clause in license agreements that clearly defines the accessibility of their products as further demonstration of dedication to accessibility. Such communication, it is hoped, would also lead to improved product development.31 Once vendors are contacted, Ostergaard outlines in step 4 the importance of documenting vendor communication regarding digital accessibility and further suggests assigning a person or team to review information received. Ostergaard's library changed the name of their acquisitions budget to "access budget," reallocating a portion of their budget to review existing subscriptions, purchase accessible replacements, or in some cases, convert materials to an accessible format. The documentation review allowed the library to make informed decisions about collections and service availability on behalf of library users, but no mention was made of involving users in this process. The article provided a letter template that encompassed the aforementioned concepts and a request for assessment documentation, such as VPATs and official statements of compliance.
The Ostergaard template served as a foundation for the language used in vendor communication for the current study, particularly the VPAT or other accessibility documentation request.32 There have been no studies that suggest a way to implement easily discoverable vendor accessibility documentation—even when said documentation is not readily available to the public on the vendors' sites. DeLancey suggested creating "an open repository for both vendor supplied documentation, and the results of any usability testing," but this was suggested for internal library use, not public dissemination.33 If this documentation is made more easily available, we can increase patron involvement in the discussion of accessibility of vendor-supplied library resources. RESEARCH METHODS Library-related research has focused on the need for academic libraries to have accessible websites, in part to reach patrons who are participating in distance-education programs.34 A key component of a library's website is the materials it avails to patrons from vendors, like databases and database aggregators. Since, however, these materials are accessed via vendor platforms, they are outside the direct control of the library, making it more difficult to address accessibility concerns. Some vendors have put forward significant effort in addressing accessibility needs. Some offer a built-in text-to-speech feature for HTML files or provide documents in a variety of formats, including TXT and MP3 files, thereby offering a format that works well with common screen-reading programs or providing a sound file directly. This is of particular benefit to patrons with print disabilities.35 Other vendors, such as Ebook Central (formerly Ebrary), have worked to eliminate their Flash dependencies. This is recognized as a positive step toward making vendor content usable for all. Streaming video and other nonprint-based library materials must also be accessible. A person with visual impairments may be able to hear the soundtrack of a video, but unless an accurate description is provided of what is being presented visually, he or she will miss out on information such as the names of those speaking. To complicate matters further, hearing-impaired users of these databases will not be privy to what is verbalized unless accurate captions and transcripts, or an interpreter, are made available for the videos. Captions and transcripts are sometimes made available but can easily be incomplete or incorrect. For example, Alexander Street Press provided closed captioning and transcripts for some collections but not others. Even when the captions or transcripts existed, as with a video we tested from Ethnographic Videos Online, they were of low quality, transcribing the word "object" as "old pics," "house" as "mess," and so forth. One vendor, Docuseek, had subtitles to translate from Spanish but no closed captioning or transcript available. Hearing-impaired users could not make full use of the video because the subtitles did not include all information presented in the soundtrack. (Transcripts can also be useful to visually impaired users using screen readers.) Films on Demand had better captions and transcripts but did not include all the words on the screen in the transcript, such as the title. Regardless of the medium, there are multiple ways to provide accessible versions, but they are seldom automatic.
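Whether a video declares any caption track at all can be tested mechanically, although caption quality of the kind described above still requires human review. A minimal sketch, again using only Python's standard library and a made-up markup sample, counts video elements that contain no captions or subtitles track:

from html.parser import HTMLParser

class CaptionCheck(HTMLParser):
    """Count <video> elements with no captions or subtitles track."""
    def __init__(self):
        super().__init__()
        self.in_video = False
        self.has_captions = False
        self.uncaptioned = 0

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "video":
            self.in_video, self.has_captions = True, False
        elif tag == "track" and self.in_video:
            if attrs.get("kind") in ("captions", "subtitles"):
                self.has_captions = True

    def handle_endtag(self, tag):
        if tag == "video":
            if not self.has_captions:
                self.uncaptioned += 1
            self.in_video = False

# Made-up sample markup; in practice, feed the fetched page source.
sample = '<video src="lecture.mp4"><track kind="chapters"></video>'
checker = CaptionCheck()
checker.feed(sample)
print(checker.uncaptioned, "video(s) without captions")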
Librarians must communicate the need for accessible digital files to vendors so they will prioritize it. As long as libraries—one of their main customer groups—accept their offerings whether or not they are accessible for persons with disabilities, vendors have no reason to put great effort into making these improvements. As Colker pointed out, commercial vendors are not required to comply with ADA regulations under Title II or Title III.36 Vendors may also face resource restrictions that hinder their ability to improve their platforms' accessibility.37 They are businesses, so it is natural that they would only commit a concerted effort to reformat and enhance their platforms and records if the benefits are expected to outweigh the costs; they must first be made aware of the issue and know that it is important to libraries and their patrons. In much the same way as contracted workers constructing a physical space for a federal or federally funded agency must follow ADA standards for accessibility, so software vendors should be required to design virtual spaces to be accessible. This comparison was made by the Department of Education more than twenty years ago and has the added benefit of greatly reducing the need for accommodation after the fact.38 According to Cardenes, "At a minimum, a public entity has a duty to solve barriers to information access that the public entity's purchasing choices create."39 Oswal stressed the importance of integrating the blind user experience into the development of databases from the beginning, as well as finding steps useful for guiding library users after the fact. Merely following the rules set out in federal regulations is not enough to provide exemplary service to library patrons. The patrons as well must be involved in the process to fully address accessibility needs.40 PROCESS AND FINDINGS The first objective of this study was to gain a better understanding of the accessibility of our library's vendor-provided digital resources through the review of vendor-provided accessibility documentation. The second objective of this study was to determine a method of increasing the visibility of accessibility documentation for the benefit of our users and to communicate to them our commitment to improving service to users with disabilities. With a digital collection consisting of 270 databases, more than 750,000 e-books and e-journals, and more than 12 million streaming media titles, it was difficult to identify an appropriate sample. We needed a collection that would best serve as an illustrative swatch of our library's digital holdings, and more importantly, a collection that would have the largest impact on our users. We also needed to establish a strategy for obtaining accessibility documentation regarding third-party content as well as create a delivery method for the VPATs and other documentation we discovered in the course of our study. Similar to other institutions, our library maintains a directory of the most used and most useful databases on the library's homepage in the form of the A–Z List (http://libresources.wichita.edu/az.php). Determinations of usefulness are based on input from our reference librarians, who connect with user needs directly, while use is measured by annual usage statistics compiled per standard library procedures.
Users can browse this directory by subject, search by title, and sort by database type (full-text, streaming media, etc.), and the A–Z List is a convenient place for users to begin their research. The directory also served as a convenient place to begin this study, as it presented us with a sample that not only reflected the needs and habits of our patrons but also offered an excellent and diverse list of vendors to work with. Beginning with a list of all subscribed databases (270 in 2016) exported directly from the A–Z List's backend, we sorted the list by vendor and determined that 74 vendors would be investigated. University materials indexed by the directory (i.e., institutional repository and LibGuides) were excluded from this study. As visibility of accessibility documentation is of concern to this study, our investigation began by visiting the database or vendor's site and conducting a web search to obtain any information about accessibility. We were looking for mentions of the following keywords: "Section 508" or "Section 504," "W3" or "WCAG," "VPAT," "ADA," and simply "accessibility." Some sites were intuitive: thirty-four vendors (45 percent) had statements that were found online. Examples of commonly used documentation, which for the purposes of this study will be referred to as accessibility statements, included "Accessibility Policy," "Section 508 Compliance," or "Accessibility Statement." Of those thirty-four vendors who posted accessibility documentation online, eleven provided a VPAT or a link thereto in their accessibility statements. If we could not find an accessibility statement on the site, vendors were contacted first via email requesting information and documentation regarding the accessibility of their product, using a form letter inspired by the Ostergaard template.41 This email address was either found online—likely the "Contact Us" or technical support email links—or originated in the list of vendors' contacts maintained in the library management system if another contact could not be found. If a response was not received within thirty days, the vendors were contacted a second time, a suggestion gleaned from Mune and Agee's work.42 After all vendors included in the study had been contacted, any who did not provide a VPAT were contacted a final time with a specific request for a VPAT. For vendors who responded that they could not provide a VPAT or other accessibility statement, we used a screenshot of their response as documentation. The form letter (see appendix A) used in the current study made it known to vendors that their response would be posted publicly for the benefit of our users. Twelve of the remaining vendors responded to our email inquiries with VPATs and seven vendors responded with other accessibility documentation. In total, eleven VPATs (15 percent) were found online and VPATs from twelve vendors (16 percent) were received in response to our emailed request. Twenty-three vendors (31 percent) had other accessibility documentation available online, while seven vendors (9 percent) provided other accessibility documentation in response to email inquiries. Eight vendors (11 percent) responded that they had no official statements or documentation to offer, and thirteen vendors (18 percent) did not respond (see figure 1). Figure 1. Results of vendor query for accessibility documentation.
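The site-scanning step of this workflow lends itself to partial automation. The sketch below is a simplified illustration rather than the exact procedure we followed: it fetches a single page from a hypothetical vendor URL, using only Python's standard library, and reports which of the study's keywords appear in it.

import urllib.request

# The keywords searched for on each vendor site in this study.
# Short keywords like "ADA" and "W3" will produce false positives
# and need human review.
KEYWORDS = ["Section 508", "Section 504", "W3", "WCAG", "VPAT",
            "ADA", "accessibility"]

def keywords_found(url):
    """Fetch a page and report which accessibility keywords it mentions."""
    page = urllib.request.urlopen(url).read().decode("utf-8", "replace")
    text = page.lower()
    return [kw for kw in KEYWORDS if kw.lower() in text]

# Hypothetical vendor URL; the study covered 74 vendors from the A-Z List.
print(keywords_found("https://vendor.example.org/"))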
With the documentation compiled, we needed to establish an appropriate delivery system that would make this accessibility information visible to library users and thereby further the accessibility efforts. Our collection cross-section, the A–Z List, was chosen, because of its prominence in our library's online research presence, as a suitable location to not only store but also convey this documentation to users. We created a clickable icon to be embedded into each database's entry in our A–Z List, which is built in LibGuides (a Springshare product). Clicking the icon takes the user to the vendor's statement page, directly to the VPAT, or to a page we created in LibGuides to store screen captures of vendor emails and VPATs we received as attachments. If a VPAT was available, we linked to it above any other documentation because VPATs present a more rigorous analysis of the accessibility of third-party-created ICT. LibGuides was determined to be a suitable place to house this documentation not only because it made the information easy for patrons to find, but also because Springshare has built LibGuides in an increasingly accessible manner and has documented its efforts using VPATs for each product (see appendix B). FURTHER STUDY It is expected that some of the information provided by the vendors is incomplete or inaccurate, despite their best efforts, so the information we provide to patrons from and about the vendors might at times lead them astray. We briefly examined the VPATs acquired through this project to inform our work moving forward and found errors in at least half of them. Some vendors claimed that skip navigation was available when none was found, while another would have benefitted from it but said it was "not applicable." Others were too brief to be useful, as no explanations were given for their claims. Building on this current research, we intend, in collaboration with patrons with disabilities, to further verify the accuracy of key statements made by vendors in their VPATs and other accessibility documentation. This analysis will give concrete feedback to vendors on how their sites could be further improved. As stated earlier, giving patrons access requires more than following a set of guidelines; it requires dialog to ensure their needs are fully met.43 It requires making not only the available documents accessible but also testing the platform used to retrieve the documents for accessibility. As one author put it so well, "A lack of technological access is a solvable problem, but only if it is made a priority."44 As vendors are not directly subject to enforcement of Section 508 and other statutes regarding accessibility of the products they provide to libraries, VPATs are truly voluntary. As such, the level of effort and detail of the product assessments are inconsistent and the accuracy of the documentation is questionable.
We intend to continue to be involved in the digital-accessibility initiative, in part through our analysis of our digital-library presence, utilizing user input and expanding users' role in improving the user experience. This will enable us to further improve our libraries' service to users with disabilities. If we, as library professionals and institutions, stand together and each say our part, vendors will realize this is an important issue to address. Also, it is important that we, as service providers for the public good, act as a bridge between these vendors—who at times do not avail good service information to their customers—and the patrons we serve. It may be a small step, but providing links to the VPATs and other accessibility statements from vendors right where the patrons need them is an important step in meeting the patrons where they are and showing them help is available. We can show patrons we care and will work with them to improve the now limited accessibility of not only scholarly information itself but even of the platforms in which it is housed. APPENDIX A: ACCESSIBILITY DOCUMENTATION REQUEST EMAIL TEMPLATE Subject Line: VPAT Request Thank you for the information you provided answering our inquiry regarding the accessibility of your electronic product. Wichita State University Libraries has set a goal of improving the accessibility of the electronic and information technology we provide to our patrons. In accordance with Section 504 of the Rehabilitation Act and Title II of the Americans with Disabilities Act, do you happen to have a Voluntary Product Accessibility Template (VPAT) available, or have you made plans to do further accessibility testing on your product? The VPAT documentation can be found on the U.S. Department of State Website: http://www.state.gov/m/irm/impact/126343.htm. APPENDIX B: VPAT AND OTHER ACCESSIBILITY DOCUMENTATION URLS USED IN THE DATABASES A–Z LIST (List current as of October 20, 2017. Library subscriptions may have changed. Vendors may have updated URLs or added additional documentation since October 20. Research on this project is ongoing. Please see http://libresources.wichita.edu/az.php for a current list of vendor accessibility documentation.)
Vendor: URL(s) or status
AAPG (American Association of Petroleum Geologists): No accessibility documentation available
ABC-CLIO: No response
ACLS (American Council of Learned Societies): http://www.humanitiesebook.org/about/for-librarians/#ada-compliance-and-accessibility
ACM (Association of Computing Machinery): https://www.acm.org/accessibility
ACS (American Chemical Society): https://www.acs.org/content/acs/en/accessibility-statement.html
Adam Matthew Digital: http://libresources.wichita.edu/c.php?g=583127&p=4026332
AIAA (American Institute of Aeronautics & Astronautics): http://libresources.wichita.edu/ld.php?content_id=32264954
Alexander Street Press: https://alexanderstreet.com/page/accessibility-statement
American Institute of Physics: http://www.scitation.org/faqs
American Mathematical Society: http://www.ams.org/about-us/VPAT-MathSciNet-2014-AMS.pdf
APA (American Psychological Association): http://www.apa.org/about/accessibility.aspx
ASM International: No response
ASME (American Society of Mechanical Engineers): No accessibility documentation available
ASTM: No accessibility documentation available
BioOne: http://www.bioone.org/page/resources/accessibility
Books 24x7: https://documentation.skillsoft.com/bkb/qrc/AssistiveQRC.pdf
Britannica: http://help.eb.com/bolae/Accessibility_Policy.htm
Business Expert Press: http://media2.proquest.com/documents/ebookcentral_vpat.pdf
Cabell's: No response
Cambridge Crystallographic Data Centre: https://www.ccdc.cam.ac.uk/termsandconditions/
Cambridge University Press: http://www.cambridge.org/about-us/accessibility/
CAS: No accessibility documentation available
CLCD (Children's Literature Comprehensive Database): No response
Conference Board: http://www.conferenceboard.ca/accessibility/resources.aspx?AspxAutoDetectCookieSupport=1
CQ Press: http://library.cqpress.com/cqresearcher/html/public/vpat.html
Credo Reference: https://credoreference.zendesk.com/hc/en-us/articles/201429069-Accessibility
dataZoa: http://libresources.wichita.edu/AccessibilityStatements/DataZoaVPAT
Docuseek2: https://docuseek2.wikispaces.com/Section+508+Compliance+Statement
EBSCO: https://www.ebscohost.com/government/full-508-accessibility
Ei Engineering Village: https://www.elsevier.com/solutions/engineering-village/features/accessibility
Elsevier: https://www.elsevier.com/solutions/sciencedirect/support/web-accessibility
Gale: https://support.gale.com/technical/618
Google: https://www.google.com/accessibility/initiatives-research.html
HathiTrust: https://www.hathitrust.org/accessibility
HeinOnline: https://www.wshein.com/accessibility/
IBISWorld: No response
IEEE: https://www.ieee.org/accessibility_statement.html
Infobase Learning: http://support.infobaselearning.com/index.php?/Tech_Support/Knowledgebase/Article/View/1318/0/ada-usability-statement
Infogroup: http://libresources.wichita.edu/c.php?g=583127&p=4286285
Institute of Physics: http://iopscience.iop.org/page/accessibility
InterDok: No response
JSTOR: https://about.jstor.org/accessibility/
Kanopy: https://help.kanopystreaming.com/hc/en-us/articles/210691557-What-is-Kanopy-s-position-on-accessibility-
LexisNexis: http://www.lexisnexis.com/gsa/76/accessible.asp
Library of Congress: https://www.congress.gov/accessibility
Mergent: No accessibility documentation available
National Academies Press: No response
National Library of Medicine: https://www.nlm.nih.gov/accessibility.html
Naxos: http://libresources.wichita.edu/c.php?g=583127&p=4287131
NCJRS: https://www.justice.gov/accessibility/accessibility-information
Newsbank: http://libresources.wichita.edu/c.php?g=583127&p=4457078
OCLC: https://www.oclc.org/en/policies/accessibility.html
Ovid: http://ovidsupport.custhelp.com/app/answers/detail/a_id/5909/~/is-the-ovid-interface-section-508-compliant%3F
Oxford University Press: https://global.oup.com/academic/accessibility/?cc=us&lang=en&
ProjectMUSE: https://muse.jhu.edu/accessibility
ProQuest: http://media2.proquest.com/documents/proquest_academic_vpat.pdf, http://media2.proquest.com/documents/ebookcentral_vpat.pdf
Readex: http://uniaccessig.org/lua/wp-content/uploads/2014/11/Readex.pdf
SAGE: https://us.sagepub.com/en-us/nam/accessibility-0
Salem Press: No response
SBRnet: No response
Springer: https://github.com/springernature/vpat/blob/master/springerlink.md
Standard & Poor's: No response
Swank: No accessibility documentation available (http://libresources.wichita.edu/AccessibilityStatements/SWANKaccessibility)
Taylor & Francis: http://libresources.wichita.edu/c.php?g=583127&p=4539268
Thomson Reuters: https://clarivate.com/wp-content/uploads/2018/02/PACR_WoS_5.27_Jan-2018_v1.0.pdf
US Department of Commerce: http://osec.doc.gov/Accessibility/Accessibliity_Statement.html
US Department of Education: https://www2.ed.gov/notices/accessibility/index.html
US Government Printing Office: https://www.gpo.gov/accessibility
University of Chicago: No accessibility documentation available
University of Michigan: https://www.press.umich.edu/about#accessibility
UpToDate: http://libresources.wichita.edu/c.php?g=583127&p=4691631
ValueLine: http://libresources.wichita.edu/AccessibilityStatements/ValueLineAccessibility
WRDS (Wharton Research Data Services): https://wrds-www.wharton.upenn.edu/pages/wrds-508-compliance/
Wiley: http://olabout.wiley.com/WileyCDA/Section/id-406157.html
REFERENCES
1 Neil Savage, "Weaving the Web," Communications of the ACM 60, no. 6 (June 2017): 22.
2 Ruth Colker, "The Americans with Disabilities Act is Outdated," Drake Law Review 63, no. 3 (2015): 799.
3 Colker, "The Americans with Disabilities Act," 817; Joanne Oud, "Accessibility of Vendor-Created Database Tutorials for People with Disabilities," Information Technology and Libraries 35, no. 4 (2016): 13–14.
4 Laura DeLancey and Kirsten Ostergaard, "Accessibility for Electronic Resources Librarians," Serials Librarian 71, no. 3–4 (2016): 181, https://doi.org/10.1080/0361526X.2016.1254134.
5 Americans with Disabilities Act of 1990, Pub. L. No. 101-336, 104 Stat. 327 (1990).
6 Rehabilitation Act of 1973, Pub. L. No. 93-112, 87 Stat. 355 (1973).
7 Discrimination on the Basis of Disability in Federally Assisted Programs and Activities, 77 Fed. Reg. 14,972 (March 14, 2012) (to be codified at 34 CFR pt. 104).
8 DeLancey and Ostergaard, "Accessibility for Electronic Resources," 180.
9 Architectural and Transportation Barriers Compliance Board, 65 Fed. Reg. 80,500, 80,524 (December 21, 2000) (to be codified at 36 CFR pt. 1194).
10 Architectural and Transportation Barriers Compliance Board, 82 Fed. Reg. 5,790 (January 19, 2017) (to be codified at 36 CFR pt. 1193-1194).
11 29 USC §794d, at 289 (2016).
12 Architectural and Transportation Barriers Compliance Board, 82 Fed. Reg. 5,790, 5,791 (January 19, 2017) (to be codified at 36 CFR pt. 1193-1194).
13 Paul T. Jaeger, "Section 508 Goes to the Library: Complying with Federal Legal Standards to Produce Accessible Electronic and Information Technology in Libraries," Information Technology and Disabilities 8, no. 2 (2002), http://link.galegroup.com/apps/doc/A207644357/AONE?u=9211haea&sid=AONE&xid=4c7f77da.
14 Architectural and Transportation Barriers Compliance Board, 82 Fed. Reg. 5,790, 5,791 (January 19, 2017) (to be codified at 36 CFR pt. 1193-1194).
15 Ben Caldwell et al., eds., "Web Content Accessibility Guidelines (WCAG) 2.0," last modified December 11, 2008, http://www.w3.org/TR/2008/REC-WCAG20-20081211/.
16 Laura DeLancey, "Assessing the Accuracy of Vendor-Supplied Accessibility Documentation," Library Hi Tech 33, no. 1 (2015): 108.
17 Kirsten Ostergaard, "Accessibility from Scratch: One Library's Journey to Prioritize the Accessibility of Electronic Information Resources," Serials Librarian 69, no. 2 (2015): 159, https://doi.org/10.1080/0361526X.2015.1069777.
18 Jaeger, "Section 508."
19 Mary Case et al., eds., "Report of the ARL Joint Task Force on Services to Patrons with Print Disabilities," Association of Research Libraries, November 2, 2012, p. 29, http://www.arl.org/storage/documents/publications/print-disabilities-tfreport02nov12.pdf.
20 DeLancey and Ostergaard, "Accessibility for Electronic Resources," 180.
21 Jennifer Tatomir and Joan C. Durrance, "Overcoming the Information Gap: Measuring the Accessibility of Library Databases to Adaptive Technology Users," Library Hi Tech 28, no. 4 (2010): 581, https://doi.org/10.1108/07378831011096240.
22 Tatomir and Durrance, "Overcoming the Information Gap," 584.
23 DeLancey, "Assessing the Accuracy," 104–5.
24 Christina Mune and Ann Agee, "Are E-books for Everyone? An Evaluation of Academic E-book Platforms' Accessibility Features," Journal of Electronic Resources Librarianship 28, no. 3 (2016): 172–75, https://doi.org/10.1080/1941126X.2016.1200927.
25 Mune and Agee, "Are E-books for Everyone?," 175.
26 Joanne Oud, "Accessibility of Vendor-Created Database Tutorials for People with Disabilities," Information Technology and Libraries 35, no. 4 (2016): 12, https://doi.org/10.6017/ital.v35i4.9469.
27 Mark Hachman, "Tested: How Flash Destroys Your Browser's Performance," PC World, August 7, 2015, https://www.pcworld.com/article/2960741/browsers/tested-how-flash-destroys-your-browsers-performance.html.
28 Oud, "Accessibility of Vendor-Created Database Tutorials," 12.
29 DeLancey, "Assessing the Accuracy," 106–7.
30 DeLancey, "Assessing the Accuracy," 105.
31 Kirsten Ostergaard, "Accessibility from Scratch: One Library's Journey to Prioritize the Accessibility of Electronic Information Resources," Serials Librarian 69, no. 2 (2015): 162–65, https://doi.org/10.1080/0361526X.2015.1069777.
32 Ostergaard, "Accessibility from Scratch," 164.
33 DeLancey, "Assessing the Accuracy," 111.
34 Cynthia Guyer and Michelle Uzeta, "Assistive Technology Obligations for Postsecondary Education Institutions," Journal of Access Services 6, no. 1/2 (2009): 29; Oud, "Accessibility of Vendor-Created Database Tutorials," 7.
35 Mune and Agee, "Are E-books for Everyone?," 173.
36 Colker, "The Americans with Disabilities Act," 792–93.
37 DeLancey, "Assessing the Accuracy," 107.
38 Colker, "The Americans with Disabilities Act," 814; Mune and Agee, "Are E-books for Everyone?," 182.
39 Adriana Cardenes to Dr. James Rosser, April 7, 1997, private collection, quoted in Colker, "The Americans with Disabilities Act is Outdated," 815.
40 Sushil K. Oswal, "Access to Digital Library Databases in Higher Education: Design Problems and Infrastructural Gaps," Work 48, no. 3 (2014): 316.
41 Ostergaard, "Accessibility from Scratch," 164.
42 Mune and Agee, "Are E-books for Everyone?," 175.
43 DeLancey, "Assessing the Accuracy," 108; Mune and Agee, "Are E-books for Everyone?," 181.
44 Colker, "The Americans with Disabilities Act," 817.
10308 ---- Library Space Information Model Based on GIS — A Case Study of Shanghai Jiao Tong University Yaqi Shen Yaqi Shen (yqshen@sjtu.edu.cn) is a librarian at Shanghai Jiao Tong University. ABSTRACT In this paper, a library-space information model (LSIM) based on a geographical information system (GIS) was built to visually show the bookshelf location of each book through the display interface of various terminals. Taking Shanghai Jiao Tong University Library as an example, both spatial information and attribute information were integrated into the model. In the spatial information, the reading-room layout, bookshelves, reference desks, and so on were constructed with different attributes. The bookshelf layer was the key attribute of the bookshelves, and each book was linked to one bookshelf layer. Through the bookshelf-layer field, a book in the query system can be connected with the bookshelf-layer information of the LSIM. With the help of this model, readers can search for books visually in the query system and find the books' positions accurately. It can also be used in the inquiry of special-collection resources. Additionally, librarians can use this model to analyze books' circulation status, and books with similar subjects that are frequently circulated can be recommended to readers. The library's permanent assets (chairs, tables, etc.) could be managed visually in the model. This paper used GIS as a tool to solve the problem of accurate positioning, simultaneously providing better services for readers and realizing visual management of books for librarians. INTRODUCTION Geographical information systems (GIS) are powerful tools that can edit, store, analyze, display, and manage geographical data. Early in 1992, several Association of Research Libraries (ARL) institutions, including the University of Georgia, Harvard University, North Carolina State University, and Southern Illinois University, launched the GIS Literacy Project and carried out an extensive survey about the possible applications of GIS in libraries.1 Since then, studies about the application of GIS in library research have attracted increasing attention.2 GIS is effective for library-planning efforts, such as investigating library-service areas, modeling the implications of the opening and closing of library services, informing initial location decisions, and so on.3 The University of Idaho Library adopted GIS to link variables such as age, race, income, and education from the 2000 US Census with the service-area maps of two proposed branch libraries. Based on the thematic maps created, demographic information about potential library users could be displayed. Most importantly, the maps were also helpful for improving library-service planning. Koontz et al.
from Florida State University investigated the reasons for public-library closures using GIS. The authors presented a methodology using GIS to describe libraries' geographic markets and to illustrate the effects of facility location, relocation, and permanent closure on potential users. Sin used GIS with inequality measures and multiple regressions to analyze statistics from the public-libraries survey and census-tract data, conducting a nationwide multivariate study of neighborhood-level variations and mapping public libraries' funding and service landscapes.

GIS can also provide strong support for library accessibility.4 In South Wales, United Kingdom, a case study offering a preliminary analysis of spatial variations in the accessibility of library services was carried out based on a GIS model. Park further measured public-library accessibility accurately and provided realistic analysis using GIS, including descriptive and statistical analyses and a road network–based distance measure. In another paper, Park went a step further to measure readers' travel time and distance while using the library.

In addition to library planning and accessibility, GIS can also be applied to managing the collections of an academic library, including both physical documents and digital databases.5 Solar and Radovan from the National and University Library of Slovenia explored the possibility of creating a virtual collection of diverse materials, such as maps and pictorial documents, using GIS. They connected spatial data with other pictorial elements, including views and portrait images with hyperlinks.6 Coyle from Rochester Public Library studied the implementation of GIS in the library collection; he believed that libraries that implemented GIS early on would have an intellectual advantage over those coming on board later.7 Sedighi conducted research on GIS as a decision-support system for analyzing geospatial data in the databases of an academic library. By using the analysis functions of the system, a range of features could be indicated; for example, the spatial relationships of data based on an educational course can be analyzed.8 Boda used a 3D virtual-library model to represent the most prominent and celebrated collection of classical antiquity in the Alexandria library.9

Beyond the applications mentioned above, some libraries have used GIS techniques to analyze reader behavior.10 Xia developed GIS into an analytical tool for examining the relationship between shelf height and the frequency of book use, revealing that readers tend to pull books off shelves that are easily reachable by human eyes and hands. Mandel used GIS to map the most popular routes that readers took when entering the library and, based on the seating-sweeps method, adopted maps to depict the use of tables and computers. The results of both Xia and Mandel provide information on reader behavior, so that book placement, entry routes, and facilities can be adjusted strategically.

Although much work has been done on applying GIS to libraries, there are few reports of visually showing the exact position of each book through the library-catalog display interface, which is of great importance to both readers and librarians.
Xia located library items with GIS and pointed out that updating the starting and ending call numbers for each shelf could be the most tedious work.11 Specifically, GIS alone cannot tell whether a book is out of its correct location or is being used by somebody else. Xia advised combining GIS with radio frequency identification (RFID), which together have the capability of tracing the location of each book. StackMap, a library-mapping tool providing a collection-mapping product for librarians, was being used at the Hampton Library.12

The Shanghai Jiao Tong University Library built an interface that uses GIS to identify the specific location of each book in the catalog. A GIS model that includes spatial and attribute information was constructed. The connection of GIS, RFID, and the OPAC is discussed in detail below, and the relationship between the bookshelves and patron behavior was studied in depth. It is hoped that this GIS model will bring convenient services for readers and efficient management for librarians.

METHODOLOGY

Background
In 1984, the Shanghai Jiao Tong University circulation system was built on barcode-reader technology. The first automated library-management system (LMS), the MINISIS and IMAGE Library Integrated System, was implemented in 1988. In 1993, the second LMS, the UNIFY online multiuser system, was implemented. In 1994, an Open Public Access Catalogue (OPAC) system was built based on the UNILS, allowing readers to query the library bibliographic record through the computer. In 1998, the third automated LMS, a client/server–based tool, was built on the Horizon LMS. In 2008, we launched the Aleph integrated library system (ILS). In the same year, Primo, a resource discovery and access system, was introduced. In 2009, the Our Explore interface was built on the Primo system, providing resource-retrieval and access services.13 RFID technology was introduced in 2014, and readers can now borrow or return books through self-service machines.

Users can find a book via the OPAC or Our Explore system on the Shanghai Jiao Tong University Library homepage (http://www.lib.sjtu.edu.cn/index.php?m=content&c=index&lang=en), a screenshot of which is shown in figure 1. Book information can be found through these systems, but the exact position of a book cannot be displayed. At the library reference desk, the question readers ask most frequently is where they can find a certain book. The Chinese Library Classification (CLC) system is used to organize the collections at Shanghai Jiao Tong University. The librarians are very familiar with the classification; however, it is hard for inexperienced users to understand, even if they have been trained. Although static maps can guide patrons to the books, patrons sometimes still have difficulty finding them. If readers could get the exact bookshelf location of a book through the OPAC or Our Explore system, the user experience would improve significantly and much of the time readers spend finding books would be saved. Therefore, it is necessary to introduce GIS to the library with the aim of visually showing the position of each book. Furthermore, library managers need to plan the budget at the end of every year, and the arrangement of different subjects should be considered in that planning.
Although collection-usage data from the ILS provides a reference for this planning, a library-space information model (LSIM) brings new insight.

Software
There are many kinds of GIS software in this research field, including commercial products such as ArcGIS, MapInfo, and MapGIS as well as free and open-source software (FOSS) solutions. FOSS offers the broader context of the open-source software movement and its developments in GIS.14 However, no single FOSS package can match all the functionality that ArcGIS offers for creating thematic maps, and ArcGIS's spatial-analysis and data-processing functions are more powerful. The software used in this study was the ArcGIS 10.3 trial version.

Figure 1. OPAC and Our Explore in the Shanghai Jiao Tong University Library homepage.

Methods
There are two modules in the LSIM, spatial information and attribute information, as shown in figure 2. Spatial information, including the building position, the reading-room layout, and bookshelf information, is converted to shapefile format. Remote-sensing information is used to set the geographic location of the library. These elements are constructed with different attributes, and 2D-attribute and 3D-multipatch data are stored in the geodatabase. ArcMap and ArcScene are used to generate the 2D and 3D maps and to analyze reader behavior.

We connect the spatial information with data from the OPAC, Our Explore, and RFID. The query fields (which we call "general information") in the OPAC are title, author, keyword, call number, ISSN, ISBN, system number, barcode, collection location, and publisher. In the Our Explore system, readers can not only search the general information but also refine the search results by specific fields, such as topic, author, collection location, published date, and CLC. The functions of book reserving and renewing are also supported by these two systems. RFID was introduced to the Shanghai Jiao Tong library to allow self-service, and its fields include collection location, subject, ISSN, ISBN, barcode, and so on. Barcode is the common field in all three systems and is used to connect them.

In the RFID system, each shelf in the bookshelves has a unique identifier. The Shanghai Jiao Tong University Library uses the first-book location method to manage books in the RFID system: the first book on each shelf is recorded as a distinct bookshelf location, and the books on one shelf are assigned to the same bookshelf location. The books are ordered and arranged according to call number. A book's current status can be obtained in the RFID system by shelf inventory; books that are borrowed by patrons or are not on the right shelf are recorded in the RFID system.

The key attribute information in the LSIM is the bookshelf layer, which is used to describe a book's position. The bookshelf-layer field is connected with the RFID data. Taking the bookshelf layer of RFID as the attribute field, the position of a book can be located by the bookshelf layer in the LSIM. Compared to Xia's approach, it is easier to get the bookshelf-layer information from RFID in the LSIM.11

Figure 2. Research flowchart.

The connection of the OPAC, RFID, and LSIM is shown in figure 3.
When the reader locates a book in the OPAC or Our Explore, the barcode is shown in the system. The bookshelf layer in the RFID system can be retrieved through the barcode immediately. The map of the reading room has been embedded in the OPAC, and the coordinates of the book (x, y, height) can be shown through the bookshelf layer. An index of each bookshelf's coordinates is created in the OPAC, the RFID system, and the LSIM. A map-presentation field is built in the OPAC, with the search interface supported by ArcMap and ArcScene. The field's content is a URL link that varies with the bookshelf. In short, when a reader searches for a book, the related bookshelf coordinates are highlighted on the map. Through the bookshelf-layer field, the book information in the query system can be connected with that of the LSIM, and faculty and students can search for books in the query system visually. As shown in figure 2, spatial information and attribute information are connected in the LSIM. Furthermore, an LSIM based on GIS is built to provide better services for readers and enhance librarians' visual management.

Figure 3. The connection of the OPAC, RFID system, and LSIM.

Figure 4. Finding a book in the Our Explore system.

Figure 5a. The visual position of the book with the call number R318-53/3 (2D).

Figure 5b. The visual position of the book with the call number R318-53/3 (3D).

DISCUSSION

Providing Services for Readers by LSIM

Visual Query in the Reading Room
When a book about biological medicine is required, it can be found by searching the keyword "biological medicine" in Our Explore. As shown in figure 4, a book titled Amalgamation within Evolution can be found with the CLC call number R318-53/3, and readers can locate the book by call number in the corresponding reading room. If the LSIM is applied, however, the search results include not only the text information about the book's location but also a visual map. First, the book's barcode (32832872) is identified and passed to the bookshelf layer. The bookshelf layer (A4R042C04) is then found in the LSIM, and the book's spatial position can be shown on a visual map. Figures 5a and 5b show the 2D and 3D visual positions of the book with the call number R318-53/3; the two views can be switched in the system. The red arrow marks the book's position. Based on the visual position, readers can find the book more conveniently.

The reading rooms in the Shanghai Jiao Tong University Library are organized by subject, and in each reading room books with related categories are shelved together. Figures 5a and 5b show the layout of one reading room. The books in the large CLC classes, i.e., O, P, Q, R, and S, were studied as an example in this paper. The red triangles represent chairs and the light green rectangles represent desks. Shelves are alphabetically labeled. The reference desk, office area, group study room, storehouse, inquiry machines, printers, and stairs are also shown.
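The chain from request to map can be thought of as two lookups: the barcode resolves to a bookshelf-layer code in the RFID data, and that code resolves to stored shelf coordinates in the LSIM. The SQL below is a minimal sketch of that chain, not code from the production system; the table names (RfidItems, BookshelfLayers) and coordinate columns are illustrative assumptions, since the article does not publish its schema.

-- Sketch only: resolve a requested barcode to its shelf coordinates.
-- Table and column names are assumed for illustration, not the actual schema.
SELECT r.Barcode,
       r.BookshelfLayer,         -- e.g., 'A4R042C04' from the RFID system
       l.X, l.Y, l.Height        -- shelf coordinates stored with the LSIM features
FROM RfidItems r
INNER JOIN BookshelfLayers l ON l.BookshelfLayer = r.BookshelfLayer
WHERE r.Barcode = '32832872';    -- the barcode surfaced by the OPAC or Our Explore

The returned coordinates are what the map interface highlights with the red arrow in figures 5a and 5b.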
Special Collections in Different Reading Rooms
In the Shanghai Jiao Tong University Library, there are many special collections, such as contract documents, Tsung-Dao Lee's manuscripts, alumni theses, and important findings of research teams. Because of their rarity, these special collections do not circulate and can only be read in the reading rooms. Furthermore, these collections are located in different branch libraries. The geographical information of these resources can be input into the model. Scholars can use the LSIM to obtain the exact positions of these resources, go directly to the related area, and quickly find these special items.

Library Analysis and Management

Book-Borrowing Situation Analysis
Using GIS, it is also possible to show how often books circulate based on their physical location. As shown in figure 6, each rectangle represents a shelf in the reading room, and books with the same topic are placed on the same shelf. The number labeled on each shelf represents the average borrowing frequency of the books on that shelf, and different colors indicate different frequencies, on a scale of five to one hundred. The CLC classes O, P, and Q appearing on the right of the shelves represent mathematical sciences and chemistry, astronomy and geosciences, and bioscience, respectively.

Figure 6. Average borrowing frequency of the books on each shelf in one reading room.

Based on analysis of the relationship between borrowing frequency and subject category, the hot spots of the professional fields can be found and shown. In turn, books related to the hot spots can be recommended to readers. Taking class O as an example, the shelf with the highest borrowing frequency (100) is in row 9, column 2; according to the query system, the theme of the books on this shelf is high polymer chemistry. Books with high borrowing frequency can be highlighted both on the bookshelf and in the query system. If frequently borrowed books on remote shelves meet the school's discipline-development policy, purchases of these books will be increased. Books on subjects with higher borrowing frequency on the taller or lower shelves will also be considered, and vice versa.
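The shelf-level summary mapped in figure 6 amounts to a simple aggregation over circulation records. The sketch below assumes a Circulation table with one row per loan event alongside the RFID item table from the earlier sketch; as before, the names are illustrative assumptions rather than the library's actual schema.

-- Sketch only: average borrowing frequency per shelf, as mapped in figure 6.
-- Circulation is an assumed table recording one row per loan event.
SELECT r.BookshelfLayer,
       COUNT(c.LoanID) * 1.0 / COUNT(DISTINCT r.Barcode) AS AvgBorrowsPerBook
FROM RfidItems r
LEFT JOIN Circulation c ON c.Barcode = r.Barcode
GROUP BY r.BookshelfLayer
ORDER BY AvgBorrowsPerBook DESC;

Joining the result back to the shelf features in the geodatabase then drives the color classification shown on the map.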
Permanent-Assets Management
Permanent assets such as chairs, desks, shelves, inquiry machines, and printers can be managed in this model. Information about permanent assets (such as their status and spatial position) was input into the model, as shown in figures 5a and 5b. Librarians can find the visual positions of permanent assets at any time, and readers can conveniently find the inquiry machines or printers to search for books and print documents.

FUTURE DIRECTIONS
The LSIM has been tested in only one reading room and is still experimental. The model will be expanded to the whole library, providing visual information about library books and materials. In the process of using the model, the potential of GIS in the library will be exploited to provide better services for readers and managers.

CONCLUSION
Based on readers' need to locate books in the library, the LSIM was built to visually show the exact bookshelf layer of each book. Spatial and attribute information are combined in the model. Using the model, readers can search for books and find their positions, and many special collections located in different branches can be easily found. The GIS model not only brings convenience to readers but also supports the library's analysis and management. Librarians can analyze books' circulation history based on the relationship between borrowing frequency and subject category. Books with higher borrowing frequency, and ones related to them, can be recommended to readers, and purchases of frequently borrowed books shelved in remote, taller, or lower places can be increased based on this analysis. Permanent assets can also be managed, and librarians can conveniently find the status and spatial position of the inquiry machines, printers, and so on. In short, the application of GIS in the library brings a visual insight into the library, providing a better reader experience and better library management.

ACKNOWLEDGEMENTS
I thank Guo Jing, Chen Jiayi, and Huang Qinling, Shanghai Jiao Tong University Library, for their advice on the structure of this article and the grammar of the written English. I also thank Liu Min and Peng Xia, East China Normal University, for their help in the model building. Research was funded by the "Fundamental Research Funds for the Central Universities" (grant 17JCYA13), Shanghai Jiao Tong University.

ENDNOTES
1 D. Kevin Davie, James Fox, and Barbara Preece, The ARL Geographic Information Systems Literacy Project, SPEC Kit 238 and SPEC Flyer 238 (Washington, DC: Association of Research Libraries, 1999).
2 B. W. Bishop and L. H. Mandel, “Utilizing Geographic Information Systems (GIS) in Library Research,” Library Hi Tech 4, no. 4 (2010): 536–47.
3 Karen Hertel and Nancy Sprague, “GIS and Census Data: Tools for Library Planning,” Library Hi Tech 25, no. 2 (2007): 246–59, https://doi.org/10.1108/07378830710755009; Christie M. Koontz, Dean K. Jue, and Bradley Wade Bishop, “Public Library Facility Closure: An Investigation of Reasons for Closure and Effects on Geographic Market Areas,” Library Information Science Research 31, no. 2 (2009): 84–91, https://doi.org/10.1016/j.lisr.2008.12.002; Sei-Ching Joanna Sin, “Neighborhood Disparities in Access to Information Resources: Measuring and Mapping U.S. Public Libraries’ Funding and Service Landscapes,” Library Information Science Research 33, no. 1 (2011): 41–53, https://doi.org/10.1016/j.lisr.2010.06.002.
4 Gary Higgs, Mitch Langford, and Richard Fry, “Investigating Variations in the Provision of Digital Services in Public Libraries Using Network-Based GIS Models,” Library and Information Science Research 35, no. 1 (2013): 24–32, https://doi.org/10.1016/j.lisr.2012.09.002; Sung Jae Park, “Measuring Public Library Accessibility: A Case Study Using GIS,” Library and Information Science Research 34, no. 1 (2012): 13–21, https://doi.org/10.1016/j.lisr.2011.07.007; Sung Jae Park, “Measuring Travel Time and Distance in Library Use,” Library Hi Tech 30, no. 1 (2012): 151–69, https://doi.org/10.1108/07378831211213274.
5 Wang Xuemei et al., “Applications and Researches of Geographic Information System Technologies in Bibliometrics,” Earth Science Informatics 7, no. 3 (2014): 147–52, https://doi.org/10.1007/s12145-013-0132-4.
6 Renata Solar and Dalibor Radovan, “Use of GIS for Presentation of the Map and Pictorial Collection of the National and University Library of Slovenia,” Information Technology and Libraries 24, no. 4 (2005): 196–200, https://doi.org/10.6017/ital.v24i4.3385.
7 Andrew Coyle, “Interior Library GIS,” Library Hi Tech 29, no. 3 (2011): 529–49, https://doi.org/10.1108/07378831111174468.
8 Mehri Sedighi, “Application of Geographic Information System (GIS) in Analyzing Geospatial Information of Academic Library Databases,” Electronic Library 30, no. 3 (2012): 367–76, https://doi.org/10.1108/02640471211241645.
9 István Boda et al., “A 3D Virtual Library Model: Representing Verbal and Multimedia Content in Three Dimensional Space,” Qualitative and Quantitative Methods in Libraries 4, no. 4 (2017): 891–901.
10 Xia Jingfeng, “Using GIS to Measure In-Library Book-Use Behavior,” Information Technology and Libraries 23, no. 4 (2004): 184–91, https://doi.org/10.6017/ital.v23i4.9663; Lauren H. Mandel, “Toward an Understanding of Library Patron Wayfinding: Observing Patrons’ Entry Routes in a Public Library,” Library and Information Science Research 32, no. 2 (2010): 116–30, https://doi.org/10.1016/j.lisr.2009.12.004; Lauren H. Mandel, “Geographic Information Systems: Tools for Displaying In-Library Use Data,” Information Technology and Libraries 29, no. 1 (2010): 47–52, https://doi.org/10.6017/ital.v29i1.3158.
11 Xia Jingfeng, “Locating Library Items by GIS Technology,” Collection Management 30, no. 1 (2005): 63–72, https://doi.org/10.1300/J105v30n01_07.
12 Matt Enis, “Technology: Capira Adds StackMap,” Library Journal 139, no. 13 (2014): 17.
13 Chen Jin, The History of Shanghai Jiao Tong University Library (Shanghai: Shanghai Jiao Tong University Press, 2013).
14 Francis P. Donnelly, “Evaluating Open Source GIS for Libraries,” Library Hi Tech 28, no. 1 (2010): 131–51, https://doi.org/10.1108/07378831011026742.

10338 ---- Editorial Board Thoughts: Halfway Home: User Centered Design and Library Websites

Mark Cyzyk

INFORMATION TECHNOLOGY AND LIBRARIES | MARCH 2018 4

Mark Cyzyk (mcyzyk@jhu.edu), a member of LITA and the ITAL editorial board, is the Scholarly Communication Architect in The Sheridan Libraries, The Johns Hopkins University, Baltimore, Maryland.

Our Library Website has now gone through two major redesigns in the past five or so years. In both cases, a User Centered Design approach was used to plan the site. In contrast to the Single Person Vision and Design by Committee approaches, User Centered Design focuses on the empirical study and eliciting of the needs of users.
Great attention is paid to studying them, listening to them, and exposing their needs as expressed. In both of our cases, the overall design, functionality, and content of the new site was then focused exclusively on the results of such study. If a proposed design element, a bit of functionality, or a chunk of content did not appear as an expressly desired feature for our users, it was considered clutter and did not make it onto the site. Both iterations of our Website redesign were strictly governed by this principle.

But User Centered Design has blind spots.

First, it may well be that what you take to be your comprehensive user base is not as comprehensive as you think. In my library, our primary users are our faculty and student researchers, so great attention was paid to them. This makes sense insofar as we are an academic library within a major research university. Faculty and student researchers will always be our primary user group. But they are not our comprehensive user group. We have staff, administrators, visitors, members of our Board of Trustees, members of our Friends, outside members of the profession, etc. — and they are all important constituencies in their own ways.

Second, unless your sample size of users is large enough to be statistically valid, you are merely playing a game of three blind men and the elephant. Each user individually will be expressing his or her own experience and perceived needs based on that experience, and yet none of them, even taken as a group, will be reporting on the whole beast. While personal testimony definitely counts as evidence, it also frequently and insidiously results in blind spots that would otherwise be exposed through having a statistically valid sample of study participants.

Third, and perhaps most importantly, User Centered Design discounts the expertise of librarians. Nobody knows a library's users and patrons as well as librarians. Knowing their users, eliciting their needs, is part of what librarians as one of the "helping professions" do; it is a central tenet of librarianship. There is no substitute for experience and the expertise that follows from it. In the art world, this is connoisseurship. Somehow, the art historian just knows that what is before him is not a genuine Rembrandt. The empirical evidence may ineluctably lead to a different conclusion — yet there remains something missing, something the connoisseur cannot fully elucidate. Similarly, in the medical world the radiologist somehow just knows that the subtle gradations on his screen indicate one type of malady and not another. Interestingly, in the poultry industry there is something called a "chicken sexer": a person who quickly and accurately sorts baby chicks by sex. Training for this vocation largely employs what the philosophers call "ostensive definition": "This one is male; that one is female." The differences are so small as to be imperceptible. And yet, experienced chicken sexers can accurately sort chicks at an astonishing rate. They just know through experience. Such is the nature of tacit knowledge.

In the case of our most recent Website redesign, none of our users expressed any interest whatsoever, for example, in including floor maps as part of the new site. We were assured a demand for floor maps on the site was "not a thing." So floor maps were initially excluded from the site.
This was met with a slow crescendo of grumbling from the librarians, and rightly so. Librarians, and the graduate students at our Information Desk, know through long experience that researchers of varying types find floor maps of the building to be useful. That's why we've handed out paper copies for years. The fact that this need was missed through our focus on User Centered Design points to a blind spot in that process. Valuable experience and the expertise that follows from it should not be dismissed or otherwise diminished through dogmatic adherence to the core principle of User Centered Design.

... And yet, don't get me wrong: Insofar as it's the empirical study of select user groups and their expressed concerns and needs, User Centered Design as a design technique and foundational principle is crucially important and useful. It gets us halfway home.

10339 ---- Information Technology and Libraries at 50: The 1960s in Review

Mark Cyzyk

INFORMATION TECHNOLOGY AND LIBRARIES | MARCH 2018 6

Mark Cyzyk (mcyzyk@jhu.edu), a member of LITA and the ITAL editorial board, is the Scholarly Communication Architect in The Sheridan Libraries, The Johns Hopkins University, Baltimore, Maryland.

In the quarter century since graduating from library school, I have now and then run into someone who had what I consider to be a highly inaccurate and unintuitive view of librarians and information technology. Seemingly, in their view, librarians are at worst Luddites and at best technological neophytes. Not so! In my view, librarians have always been at worst technological power users and at best true IT innovators. One has only to scan the first issues of ITAL, or The Journal of Library Automation as it was then called, to put such debate to rest.

March 1968 saw the first issue of the first volume of The Journal of Library Automation published. The first article of that inaugural issue sets the scene: "Computer Based Acquisitions System at Texas A&I University" by Ned C. Morris. Here we find librarians not only employing computing technology to streamline library operations (using an IBM 1620 with 40K RAM) but, as the article points out, building this new system for computerizing acquisitions as an adjunct to the systems they already had in place at Texas A&I for circulation and serials management. This first article in the first issue of the first volume indicates that we've dipped a toe into a stream that was already swiftly flowing.

The other bookend of that first issue, "The Development and Administration of Automated Systems in Academic Libraries" by Harvard's Richard de Gennaro, goes meta and takes a comprehensive look at how automated library systems were already being created and the various system development and implementation rubrics under which such development occurred. Much in this article should resonate with current readers of ITAL. I knew immediately that this article was going to be a good read when I encountered, in the very first paragraph:

Development, administration, and operations are all bound up together and are in most cases carried on by the same staff. This situation will change in time, but it seems safe to assume that automated library systems will continue to be characterized by instability and change for the next several years.

I'd say that was a safe assumption.

The second and final volume of the 1960s contains gems as well. The entirety of Volume 2, Issue 2 that year was devoted to "USA Standard for a Format for Bibliographic Information Interchange on Magnetic Tape," a.k.a. MARC II.
Is it possible for something to be dry, yet fascinating?

Some titles of this second volume point to the wide range of technological projects underway in the library world in 1969:

• "An Automated Music Programmer (MUSPROG)" by David F. Harrison and Randolph J. Herber
• "A Fast Algorithm for Automatic Classification" by R. T. Dattola
• "Simon Fraser University Computer Produced Map Catalogue" by Brian Phillips and Gary Rogers
• "Management Planning for Library Systems Development" by Fred L. Bellomy
• "Performance of Ruecking's Word-compression Method When Applied to Machine Retrieval from a Library Catalog" by Ben-Ami Lipetz, Peter Stangl, and Kathryn F. Taylor

And this is only in the first two volumes. As this current 2018 volume of ITAL proceeds, we'll be surveying the morphing information technology and libraries landscape through ITAL articles of the seventies, eighties, and nineties. I think you will see what I mean when I say that librarians have always been at worst technological power users, at best true IT innovators.

10357 ---- PAL: Toward a Recommendation System for Manuscripts

Scott Ziegler and Richard Shrake

INFORMATION TECHNOLOGY AND LIBRARIES | SEPTEMBER 2018 84

Scott Ziegler (sziegler1@lsu.edu) is Head of Digital Programs and Services, Louisiana State University Libraries. Prior to this position, Ziegler was the Head of Digital Scholarship and Technology, American Philosophical Society. Richard Shrake (shraker13@gmail.com) is a Library Technology Consultant based in Burlington, Vermont.

ABSTRACT
Book-recommendation systems are increasingly common, from Amazon to public library interfaces. However, for archives and special collections, such automated assistance has been rare. This is partly due to the complexity of descriptions (finding aids describing whole collections) and partly due to the complexity of the collections themselves (what is this collection about and how is it related to another collection?). The American Philosophical Society Library is using circulation data collected through the collection-management software package, Aeon, to automate recommendations. In our system, which we're calling PAL (People Also Liked), recommendations are offered in two ways: based on interests ("You're interested in X, other people interested in X looked at these collections") and on specific requests ("You've looked at Y, other people who looked at Y also looked at these collections"). This article will discuss the development of PAL and plans for the system. We will also discuss ongoing concerns and issues, how patron privacy is protected, and the possibility of generalizing beyond any specific software solution.

INTRODUCTION
The American Philosophical Society Library (APS) is an independent research library in Philadelphia. Founded in 1743, the library houses a wide variety of material in early American history, history of science, and Native American linguistics. The majority of the library's holdings are manuscripts, with a large amount of audio material, maps, and graphics, nearly all of which are described in finding aids created using Encoded Archival Description (EAD) standards. Like similar institutions, the APS has long struggled to find new ways to help library users discover material relevant to their research.
In addition to traditional in-person, email, and phone reference, the APS has spent years creating search and browse interfaces, subject guides, and web exhibitions to promote the collections.1 As part of these ongoing efforts to connect users with collections, the APS is working on an automated recommendation system to reuse circulation data gathered through Aeon. Developed by Atlas Systems, Aeon is a "request and workflow management software specifically designed for special collections libraries and archives," and it enables the APS to gather statistics on both the use of our manuscript collections and on aspects of the library's users.2 The automated recommendation system, which we're calling PAL, for "People Also Liked," is an ongoing effort. This article presents a snapshot of current work.

LITERATURE REVIEW
The benefits of recommendations in library OPACs have long been recognized. Writing in 2008 about the library recommendation system BibTip, itself started in the early 2000s, Mönnich and Spiering observe that "library services are well suited for the adoption of recommendation systems, especially services that support the user in search of literature in the catalog." By 2011, OCLC Research and the Information School at the University of Sheffield had begun exploring a recommendation system for OCLC's WorldCat.3

Recommendations for library OPACs commonly fall into one of two categories, content-based or collaborative filtering. Content-based recommendations pair specific users to library items based on the metadata of the item and what is known about the user. For example, if a user indicates in some way that they enjoy mystery novels, items identified as mystery novels might be recommended to them. Collaborative filtering combines users in some way and creates recommendations for one user based on the preferences of another user.

There can be a dark side to recommendations. The algorithms that determine which users are similar, and thus which recommendations to make, are not often understood. Writing about algorithms in library discovery systems broadly, Reidsma points out that "in librarianship over the past few decades, the profession has had to grapple with the perception that computers are better at finding relevant information then people."4 The algorithms that are doing the finding, however, often carry the same hidden biases that their programmers have. Reidsma encourages a broader understanding of algorithms in general and a deeper understanding of recommendation algorithms in particular.

The history of recommendation systems in libraries has informed the ongoing development of PAL. We use both the content-based and the collaborative-filtering approaches to offering recommendations to users. For the purposes of communicating them to nontechnical patrons, we refer to them as "interest-based" and "request-based," respectively. Furthermore, we are cautious about the role algorithms play in determining which recommendations users see. Our help text reinforces the continued importance of working directly with in-house experts, and we promote PAL as one tool among the many offered by the library.

We are not aware of any literature on the development of recommendation tools for archives or special-collections libraries. The nature of the material held in these institutions presents special challenges.
For example, unlike book collections, many manuscript and archival collections are described in aggregate: one description might refer to many letters. These issues are discussed in detail below.

PUTTING DATA TO USE: RECOMMENDATIONS BASED ON INTERESTS AND REQUESTS
The use of Aeon allows the APS to gather and store data, including both data that users supply through the registration form and data concerning which collections are requested. PAL uses both types of data to create recommendations.

Interest-Based Recommendations
The first type of recommendation uses self-identified research-interest data that researchers supply when creating an Aeon account. When registering, a user has the option to select from a list of sixty-four topics grouped into seven broad categories (figure 1). The APS selected these interests based on suggestions from researchers as well as categories common in the field of academic history. Upon signing in, a registered user sees a list of links (figure 2); each link leads to a full-page view of collection recommendations (figure 3). These recommendations follow the model, "You're interested in X, other people interested in X looked at these collections."

Request-Based Recommendations
Using the circulation data that Aeon collects, we are able to automate recommendations in PAL based on request information. Upon clicking a request link in a finding aid, the user is presented with a list of recommendations on the sidebar in Aeon (figure 4). Each link opens the finding aid for the collection listed.

Figure 1. List of interests a user sees when registering for the first time. A user can also revisit this list to modify their choices at any point by following links through the Aeon interface. The selected interests generate recommendations.

Figure 2. List of links appearing on the right-hand sidebar, based on interests that users select.

Figure 3. Recommended collections, based on interest, showing collection name (with a link to the finding aid), call number, number of requests, and number of users who have requested from the collections. The user sees this list after clicking an option from the sidebar, as shown in figure 2.

Figure 4. Request-based recommendation links appearing on the right-hand sidebar after a patron requests an item from a finding aid.

THE PROCESS
Currently, the data that drives these two functions is obtained from a semidynamic process via daily, automated SQL query exports. Usernames are employed to tie together requests and interests but are subsequently purged from the data before the results are presented to users and staff. This section explains the process in detail and presents code snippets where available. All code is available on GitHub.5

Interest-Based Recommendations
For interest-based recommendations, we employ two queries. The first query pulls every collection requested by a user for each topic for which that user has expressed an interest. The second aggregates the data for every user in the system. The following queries get data from the Microsoft SQL database, via a Microsoft Access intermediary, that Aeon uses to store data.
Because of the number of interest options in the registration form, and the character length of some of them ("Early America - Colonial History," for example), we encode the interests in shortened form: "Early America - Colonial History" becomes "EA-ColHist" so as not to run into character limits in the database. This section explores each of the queries in more detail and provides example code.

The first query gathers research topics for all users who are not staff (user status is 'Researcher') and who have chosen at least one research topic ('ResearchTopics' is not null). The data is exported into an XML file that we call "aeonMssReq."

SELECT AeonData.dbo.Users.ResearchTopics,
       AeonData.dbo.Transactions.CallNumber,
       AeonData.dbo.Transactions.Location
FROM AeonData.dbo.Transactions
INNER JOIN AeonData.dbo.Users
        ON (AeonData.dbo.Users.UserName = AeonData.dbo.Transactions.Username)
       AND (AeonData.dbo.Transactions.Username = AeonData.dbo.Users.UserName)
WHERE (((AeonData.dbo.Users.ResearchTopics) Is Not Null)
  AND ((AeonData.dbo.Transactions.CallNumber) Like 'mss%'
       Or (AeonData.dbo.Transactions.CallNumber) Like 'aps.%')
  AND ((AeonData.dbo.Users.Status)='Researcher'))
FOR XML RAW ('aeonMssReq'), ROOT ('dataroot'), ELEMENTS;

The second query combines the data for all users and exports an XML file, "aeonMssUsers."

SELECT DISTINCT AeonData.dbo.Users.ResearchTopics,
       AeonData.dbo.Transactions.CallNumber,
       AeonData.dbo.Transactions.Location,
       AeonData.dbo.Transactions.Username
FROM AeonData.dbo.Transactions
INNER JOIN AeonData.dbo.Users
        ON (AeonData.dbo.Users.UserName = AeonData.dbo.Transactions.Username)
       AND (AeonData.dbo.Transactions.Username = AeonData.dbo.Users.UserName)
WHERE (((AeonData.dbo.Users.ResearchTopics) Is Not Null)
  AND ((AeonData.dbo.Transactions.CallNumber) Like 'mss%'
       Or (AeonData.dbo.Transactions.CallNumber) Like 'aps.%')
  AND ((AeonData.dbo.Users.Status)='Researcher'))
FOR XML RAW ('aeonMssUsers'), ROOT ('dataroot'), ELEMENTS;

Each query produces an XML file. These files are parsed using XSL stylesheets into subsets for each research interest; the stylesheets also generate counts of the users requesting a collection and of the total requests for a collection by users sharing an interest. The stylesheet for the topic "Early America - Colonial History," for example, pulls the matching records from the XML file "aeonMssReq," and the process is repeated for each interest. The data from the queries, as modified by XSLT, is presented as HTML that we insert into Aeon templates. This HTML includes the collection name (linked to the finding aid), call number, number of requests, and number of users in a table. See figure 3 for how this appears to the user.
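The aggregation these stylesheets perform, counting total requests and distinct users for each collection within one topic, could also be expressed directly against the same tables. The following query is a hedged sketch of that logic rather than the authors' published code; it assumes the encoded topic value can be matched against the ResearchTopics field with a LIKE filter.

-- Sketch only: per-collection request and user counts for one encoded topic.
SELECT t.CallNumber,
       COUNT(*) AS Requests,
       COUNT(DISTINCT t.Username) AS Users
FROM AeonData.dbo.Transactions t
INNER JOIN AeonData.dbo.Users u ON u.UserName = t.Username
WHERE u.Status = 'Researcher'
  AND u.ResearchTopics LIKE '%EA-ColHist%'
  AND (t.CallNumber LIKE 'mss%' OR t.CallNumber LIKE 'aps.%')
GROUP BY t.CallNumber
ORDER BY Requests DESC;

Because the username appears only inside the aggregates, the resulting counts can be published without exposing who made the requests.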
The following shows how the XSL is wrapped in HTML:

<p>The collections most frequently requested from researchers who expressed an interest in <xsl:value-of select="…"/> are listed below with links to each collection's finding aid and the number of times each collection has been requested.</p>
<table>
  <tr>
    <th>Collection</th>
    <th>Call Number</th>
    <th># of Requests</th>
    <th># of Users</th>
  </tr>
  <xsl:apply-templates select="…"/>
</table>
To ensure a user only sees the links that match the interests they have selected, we use JavaScript to determine the expressed interests of the current user and display the corresponding links to the HTML pages in a sidebar. This approach works well, but we must account for two quirks. The first is that many interests in the database do not conform to the current list of options, because many users predate our current registration form and wrote in free-form interests. Second, Aeon stores the research information as an array rather than in a separate table, so we must account for the fact that the Aeon database contains an array of values that includes both controlled and uncontrolled vocabulary.

First, we set the array as a variable so we can look for a value that matches our controlled vocabulary, and separate the array into individual values for manipulation:

// Use var message to check for presence of controlled list of topics
var message = "<#USER field='ResearchTopics'>";
// Use var values to separate topics that are collected in one string
var values = "<#USER field='ResearchTopics'>".split(",");

We also create variables holding the HTML fragments that wrap each generated link once we have extracted our research topics:

// Fragments assembled as: open + topic code + middle + topic label + close
var open = "<a href='";
var middle = ".html'>";
var close = "</a><br/>";

Next we set a conditional to determine if one of our controlled vocabulary terms appears in the array:

//Determine if user has an interest topic from the controlled list
if ((message.indexOf("EA-ColHis") > -1) ||
    (message.indexOf("EA-AmRev") > -1) ||
    (message.indexOf("EA-EarlyNat") > -1) ||
    (message.indexOf("EA-Antebellum") > -1) ||
    …

If the array contains a value from our controlled vocabulary, we generate a link and translate our internal code back into a human-friendly research topic ("EA-ColHist," for example, becomes once again "Early America - Colonial History"):

for (var i = 0; i < values.length; ++i) {
  if (values[i]=="EA-ColHis"){
    document.getElementById("topic").innerHTML += (open + values[i] + middle + "Early America-Colonial History" + close);}
  else if (values[i]=="EA-AmRev"){
    document.getElementById("topic").innerHTML += (open + values[i] + middle + "Early America-American Revolution" + close);}
  else if (values[i]=="EA-EarlyNat"){
    document.getElementById("topic").innerHTML += (open + values[i] + middle + "Early America-Early National" + close);}
  else if (values[i]=="EA-Antebellum"){
    document.getElementById("topic").innerHTML += (open + values[i] + middle + "Early America-Antebellum" + close);}
  …

See figure 2 for how this appears to the user. Users only see the links that correspond to their stated interests. If the array does not contain a value from our controlled vocabulary, we display the research-topic interests associated with the user account, note that we don't currently have a recommendation, and provide a link to update the research topics for the account.

else {
  document.getElementById("notopic").innerHTML =
    "<p>You expressed interest in:</p>" +
    "<p><#USER field='ResearchTopics'></p>" +
    "<p>We are unable to provide a specific collection recommendation for you. " +
    "Please visit our User Profile page to select from our list of research topics.</p>";
}

Request-Based Recommendations
In addition to interest-based recommendations, PAL supplies recommendations based on past requests a user has made. This section details how these recommendations are generated.

Aeon allows users to request materials directly from a finding aid (see figure 6). To generate our request-based recommendations, we employ a query that lists the call number and user of every request in the system and exports the results to an XML file called "aeonLikeCollections."

SELECT subquery.CallNumber, subquery.Username,
       IIf(Right(subquery.trimLocation,1)='.',
           Left(subquery.trimLocation, Len(subquery.trimLocation)-1),
           subquery.trimLocation) AS finallocation
FROM (
  SELECT DISTINCT AeonData.dbo.Transactions.CallNumber,
         AeonData.dbo.Transactions.Username,
         IIf(CHARINDEX(':',[Location])>0,
             Left([Location], CHARINDEX(':',[Location])-1),
             [Location]) AS trimLocation
  FROM AeonData.dbo.Transactions
  INNER JOIN AeonData.dbo.Users
          ON (AeonData.dbo.Users.UserName = AeonData.dbo.Transactions.Username)
         AND (AeonData.dbo.Transactions.Username = AeonData.dbo.Users.UserName)
  WHERE (((AeonData.dbo.Transactions.CallNumber) Like 'mss%'
          Or (AeonData.dbo.Transactions.CallNumber) Like 'aps.%')
    AND ((AeonData.dbo.Transactions.Location) Is Not Null)
    AND ((AeonData.dbo.Users.Status)='Researcher'))
) subquery
ORDER BY subquery.CallNumber
FOR XML RAW ('aeonLikeCollections'), ROOT ('dataroot'), ELEMENTS;

We then process the "aeonLikeCollections" file through a series of XSLT stylesheets, creating lists of every other collection that every user of the current collection has requested. The stylesheets first remove collections that have only been requested once, then count the number of times each collection has been requested, sort on the collection name and username, and re-sort to combine groups of requested collections with the users who have requested each collection.

We then create a new XML file that is organized by our collection groupings. The following snippet shows the shape of a populated XML file generated by the XSLT stylesheets, with a call number, a collection label, and a request count for each grouping (element names here are representative):

<collection>
  <callNumber>Mss.497.3.B63c</callNumber>
  <label>Mss.497.3.B63c - American Council of Learned Societies …</label>
  <count>94</count>
</collection>
<collection>
  <callNumber>Mss.Ms.Coll.200</callNumber>
  <label>Mss.Ms.Coll.200 - Miscellaneous Manuscripts Collection …</label>
  <count>92</count>
</collection>

We use JavaScript to determine the call number of the user's current request and display the list of other collections that users who have requested the current collection have also requested. See figure 4 for how these links appear to the user.

All of the exports and processing are handled automatically through a daily scheduled task. The only personally identifiable data contained in these processes are usernames, which are used for counting purposes; they are removed from the final products through the XSLT processing on an internal administrative server, are never stored in the Aeon web directory, and are never available for other library users or staff to see.
POTENTIAL PITFALLS AND WHAT TO DO ABOUT THEM
PAL allows us to see new things about our users, and we hope that our users are able to see new collections in the library. However, there are potential pitfalls to the way we've been working on this project. We're calling the two biggest pitfalls the "bias toward well-described collections" and the "problem of aboutness."

The Bias toward Well-Described Collections
The bias toward well-described collections is best understood by examining how the APS integrates Aeon into our finding aids. We offer request links at every available level of description: collection, series, folder, and item. If a patron spends all day in our reading room and looks at the entirety of an item-level collection, they could have made between twenty and one hundred individual requests from that collection. For our statistics, each request will be counted as a use of that collection. Figure 6 shows a collection described at the item level; each item can be individually requested, giving the impression that this collection is very heavily used even if it is only one patron doing all the requesting.

Figure 6. Finding aid of a collection described at the item level. A patron making their way through this collection could make as many as one hundred individual requests.

For collections described at the collection level, however, the patron has only one link to click to see the entire collection. To PAL, it looks like that collection was only used once, as shown in figure 7. A patron sitting all day in our reading room looking at a collection with little description might use the collection more heavily than a patron clicking select items in a well-described collection. However, when we review the numbers, all we see is that the well-described collections get more clicks.

Figure 7. Screenshot of a finding aid with only collection-level description. This collection has only one request link, the "Special Request" link at the top right. A patron looking through the entirety of this collection will only log a single request from the point of view of our statistics.

The Problem of Aboutness
When we speak of the problem of aboutness, we draw attention to the fact that manuscript collections can be about many different things. One researcher might come to a collection for one reason, another researcher for another reason. A good example at the APS Library is the William Parker Foulke Papers.6 This collection contains approximately three thousand items and represents a wide variety of the interests of the eponymous Mr. Foulke. He discovered the first full dinosaur skeleton, promoted prison reform, worked toward abolition, and championed arctic exploration. A patron looking at this collection could be interested in any of these topics, or others. PAL, however, isn't able to account for these nuances. If a researcher interested in prison reform requests items from the Foulke Papers, they'll see the same suggestions as a researcher who came to the collection for arctic exploration.

What to Do about This
Identifying these pitfalls is a good first step to avoiding them, but it's only a first step. There are technical solutions, and we'll continue to explore them. For example, the bias toward well-described collections is mitigated by showing both the number of requests and the number of users who have requested from a collection (see figure 3). We hope that by presenting both numbers, we move a little toward overcoming this bias. However, we're also interested in the nontechnical approaches to these issues.
As mentioned in the introduction, the APS relies heavily on traditional reference service, both remote and in-house. Nontechnical solutions acknowledge the shortcomings of any constructed solution and inject a healthy amount of humility into our work. Additionally, the subject guides, search tools, and web exhibitions all form an ecosystem of discovery and access to supplement PAL.

FUTURE STEPS

Using Data Outside of Aeon
We have begun exploring options for using the recommendation data outside of Aeon. One early prototype surfaces a link in our primary search interface. For example, searching for the William Parker Foulke Papers shows a link to what people who requested from this collection also looked at. See figures 8 and 9.

Generalizing for Other Repositories
There are many ways to integrate the use of Aeon with EAD finding aids. The systems that the APS has developed to collect data for automated recommendations take advantage of our infrastructure. We'd like for other repositories to be able to use PAL, and it is our hope that an institution using Aeon in a different way will help us generalize this system.

Generalizing beyond Aeon
PAL is currently configured to pull data out of the Microsoft SQL database used by Aeon. However, all the manipulation is done outside of Aeon and is therefore generalizable to data collected in other ways. Because archives and special collections have long held statistics in different types of systems, we hope to be able to generalize beyond the Aeon use case if there is any interest from other repositories.

Integrating PAL into Aeon
Conversations with Atlas staff about PAL have been positive, and there is interest in building many of the features into future releases of Aeon. As of this writing, an open UserVoice forum topic is taking votes and comments about this integration.7

Figure 8. A link in the search returns that leads to recommendations based on a finding aid search. Clicking on the link "PAL Recommendations: Patrons who used Henry Howard Houston, II Papers also used these collections" will open an HTML page with a list of links to finding aids.

Figure 9. HTML link of recommended finding aids based on search.

CONCLUSION
The APS is trying to add to the already robust options for users to find relevant manuscript collections. In addition to traditional reference, web exhibitions, and online search and browse tools, we have started reusing circulation data and self-identified user interests to automate recommendations. This new system fits within the ecosystem of tools we already supply. This is a snapshot of where the PAL recommendation project is as of this writing, and we hope to work with other special collections libraries and archives to continue to grow the tool. If you are interested, we hope you reach out.

ENDNOTES
1 "Subject Guides and Bibliographies," American Philosophical Society, accessed February 27, 2018, https://amphilsoc.org/library/guides; "Exhibitions," American Philosophical Society, accessed February 27, 2018, https://amphilsoc.org/library/exhibit; "Galleries," American Philosophical Society, accessed February 27, 2018, https://diglib.amphilsoc.org/galleries.
2 "Aeon," Atlas Systems, accessed February 27, 2018, https://www.atlas-sys.com/aeon/.
3 Michael Mönnich and Marcus Spiering, “Adding Value to the Library Catalog by Implementing a Recommendation System,” D-Lib Magazine 14, no. 5/6 (2008), https://doi.org/10.1045/may2008-monnich.

4 Matthew Reidsma, “Algorithmic Bias in Library Discovery Systems,” Matthew Reidsma (blog), March 11, 2016, https://matthew.reidsrow.com/articles/173.

5 “AmericanPhilosophicalSociety/PAL,” American Philosophical Society, last modified September 11, 2017, https://github.com/AmericanPhilosophicalSociety/PAL.

6 “William Parker Foulke Papers, 1840–1865,” American Philosophical Society, accessed February 27, 2018, https://search.amphilsoc.org/collections/view?docId=ead/Mss.B.F826-ead.xml.

7 “Recommendation System to Suggest Items to Researchers Based on Users with the Same Research Topic,” Atlas Systems, accessed February 27, 2018, https://uservoice.atlas-sys.com/forums/568075-aeon-ideas/suggestions/18893335-recommendation-system-to-suggest-items-to-research.

10386 ---- President’s Message

Andromeda Yelton

Andromeda Yelton (andromeda.yelton@gmail.com) is LITA President 2017-18 and Senior Software Engineer, MIT Libraries, Cambridge, Massachusetts.

In my last President’s Message, I talked about change — ITAL’s transition to new leadership — and imagination — Wakanda and the archival imaginary. Today change and imagination are on my mind again as LITA contemplates a new path forward: potentially becoming a new combined division with ALCTS and LLAMA.

As you may have already seen on LITAblog (http://litablog.org/2018/02/lita-alcts-and-llama-document-on-small-division-collaboration/), the three divisional leadership teams have been envisioning this possibility, and all three division Boards discussed it at Midwinter. While the idea sprang out of our shared challenges with financial stability, in discussing it we’ve realized how much opportunity we have to be stronger together. For instance, we’ve heard for years that you, LITA members, want more of a leadership training pathway, and more ways to stay involved with your LITA home as you move into management; alignment with LLAMA automatically opens up all kinds of possibilities.
They have an agile divisional structure with their communities of practice and an outstanding set of leadership competencies. And anyone involved with library technology knows that we live and die by metadata, but we aren’t all experts in it; joining forces with ALCTS creates a natural home for people no matter where they are (or where they’re going) on the technology/metadata continuum. ALCTS also runs far more online education than LITA, as well as a virtual conference.

Meanwhile, of course, LITA has a lot to offer to LLAMA and ALCTS. You already know how rewarding the networking is, and how great the depth of expertise on technology topics. We also bring strong publications (like this very journal), marquee conference programs (like Top Tech Trends and the Imagineering panel), and a face-to-face conference. (Speaking of which, please pitch a session (http://bit.ly/2GpGXdf) for the 2018 LITA Forum!)

I want to emphasize that no decisions have been made yet. The outcome of our three Board discussions was that we all feel there is enough merit to this proposal to explore it further, but none of us are formally committed to this direction. Furthermore, it is not practically or procedurally possible to make a change of this magnitude until at least 2019. In the meantime, we expect there will be numerous working groups to determine if and how this all could work, as well as open forums for the membership of all three divisions to express hopes, concerns, and ideas.

Personally, my highest priority is to ensure that you, the members, continue to have a divisional home: one that gives you learning opportunities and a place for professional camaraderie, and that is on solid financial footing so it can continue to be here for you in the long term.

So, I’m excited about the possibilities that a superhero teamup affords, but I’m even more excited to hear from you. Do you find this prospect thrilling, scary, both? Do you think we should absolutely go this way, or definitely not, or maybe but with caveats and questions? Please tell me what you think. You can submit anonymous feedback and questions at https://bit.ly/litamergefeedback. I will periodically collate and answer these questions on LITAblog. You can also reach out to me personally any time (andromeda.yelton@gmail.com).

10388 ---- Letter from the Editor

Kenneth J. Varnum

This issue marks 50 years of Information Technology and Libraries. The scope and ever-accelerating pace of technological change over the five decades since Journal of Library Automation was launched in 1968 mirrors what the world at large has experienced. From “automating” existing services and functions a half century ago, libraries are now using technology to rethink, recreate, and reinvent services — often in areas that were once simply the realm of science fiction.

In an attempt to put today’s technology landscape in context, ITAL will publish a series of essays this year, each focusing on the highlights of a decade. In this issue, editorial board member Mark Cyzyk talks about selected articles from the first two volumes of the journal.
In the remaining issues this year, we’ll tackle the 1970s, 1980s, 1990s, and 2000s. The journal itself, now as ever before, focuses on the present and the near future, so we will hold off recapitulating the current decade until our centennial celebration in 2068.

As we look back over the journal’s history, the editorial board is also looking to the future. We want to make sure that we know for whom we are publishing these articles, and to make sure that the journal is as relevant to today’s (and tomorrow’s) readership as it has been for those who have brought us to the present. To that end, we invite anyone who is reading this issue to take this brief survey (https://umich.qualtrics.com/jfe/form/SV_6hafly0cYJpBK4J) — tell us a little about how you came to ITAL today, how you’re connected with library technology, and what you’d like to see in the journal. It won’t take much of your time (no more than 5 minutes) and will help us understand the context in which we are working.

There’s another opportunity for you to help shape the future of the journal. Due to a number of terms being up at the end of June 2018, we have at least five openings on the editorial board to fill. If you are passionate about libraries and technology, enjoy working with authors to shape their articles, and want to help set out today’s scholarly record for tomorrow’s technologists, submit a statement of interest at https://goo.gl/forms/5GbqOuuSeOlXrFx52. We seek to have an editorial board that represents the diversity of library technology practitioners, and particularly invite individuals from non-academic libraries and underrepresented demographic groups to apply.

Sincerely,
Kenneth J. Varnum
Editor
March 2018

10405 ---- Application Level Security in a Public Library: A Case Study

Richard Thomchick and Tonia San Nicolas-Rocca

Richard Thomchick (richardt@vmware.com) is MLIS, San José State University. Tonia San Nicolas-Rocca (tonia.sannicolas-rocca@sjsu.edu) is Assistant Professor in the School of Information at San José State University.

ABSTRACT

Libraries have historically made great efforts to ensure the confidentiality of patron personally identifiable information (PII), but the rapid, widespread adoption of information technology and the internet have given rise to new privacy and security challenges. Hypertext Transport Protocol Secure (HTTPS) is a form of Hypertext Transport Protocol (HTTP) that enables secure communication over the public internet and provides a deterministic way to guarantee data confidentiality so that attackers cannot eavesdrop on communications. HTTPS has been used to protect sensitive information exchanges, but security exploits such as passive and active attacks have exposed the need to implement HTTPS in a more rigorous and pervasive manner. This report is intended to shed light on the state of HTTPS implementation in libraries, and to suggest ways in which libraries can evaluate and improve application security so that they can better protect the confidentiality of PII about library patrons.

INTRODUCTION

Patron privacy is fundamental to the practice of librarianship in the United States (U.S.).
Libraries have historically made great efforts to ensure the confidentiality of personally identifiable information (PII), but the rapid, widespread adoption of information technology and the internet have given rise to new privacy and security challenges. The USA PATRIOT Act, the rollback of the Federal Communications Commission rules prohibiting internet service providers from selling customer browsing histories without the customer’s permission, along with electronic surveillance efforts by the National Security Agency (NSA) and other government agencies, have further intensified privacy concerns about sensitive information that is transmitted over the public internet when patrons interact with electronic library resources through online systems such as an online public access catalog (OPAC).1

Hypertext Transport Protocol Secure (HTTPS) is a form of Hypertext Transport Protocol (HTTP) that enables secure communication over the public internet and provides a deterministic way to guarantee data confidentiality so that attackers cannot eavesdrop on communications. HTTPS has been used to protect sensitive information exchanges (i.e., e-commerce transactions, user authentication, etc.). In practice, however, security exploits such as man-in-the-middle attacks have demonstrated the relative ease with which an attacker can transparently eavesdrop on or hijack HTTP traffic by targeting gaps in HTTPS implementation. There is little or no evidence in the literature that libraries are aware of the associated vulnerabilities, threats, or risks, or that researchers have evaluated the use of HTTPS in library web applications. This report is intended to shed light on the state of HTTPS implementation in libraries, and to suggest ways in which libraries can evaluate and improve application security so that they can better protect the confidentiality of PII about library patrons. The structure of this paper is as follows. First, we review the literature on privacy as it pertains to librarianship and cybersecurity. We then describe the testing and research methods used to evaluate HTTPS implementation. A discussion on the results of the findings is presented. Finally, we explain the limitations and suggest future research directions.

LITERATURE REVIEW

The research begins with a survey of the literature on the topic of confidentiality as it pertains to patron privacy; the impact of information technology on libraries; and the use of HTTPS as a security control to protect the confidentiality of patron data when it is transmitted over the public internet. While there is ample literature on the topic of patron privacy, there appears to be a lack of empirical studies that measure the use of HTTPS to protect the privacy of data transmitted to and from patrons when they use library web applications.2

The Primal Importance of Patron Privacy

Patron privacy has long been one of the most important principles of the library profession in the U.S. As early as 1939, the Code of Ethics for Librarians explicitly stated, “It is the librarian’s obligation to treat as confidential any private information obtained through contact with library patrons.”3 The concept of privacy as applied to personal and circulation data in library records began to appear in the library literature not long after the passage of the U.S.
Privacy Act of 1974.4 Today, the American Library Association (ALA) regards privacy as “fundamental to the ethics and practice of librarianship,” and has formally adopted a policy regarding the confidentiality of personally identifiable information (PII) about library users, which asserts, “confidentiality exists when a library is in possession of personally identifiable information about users and keeps that information private on their behalf.”5 This policy affirms language from the ALA Code of Ethics, and states that “confidentiality extends to information sought or received and resources consulted, borrowed, acquired or transmitted including database search records, reference questions and interviews, circulation records, interlibrary loan records, information about materials downloaded or placed on ‘hold’ or ‘reserve,’ and other personally identifiable information about uses of library materials, programs, facilities, or services.”6 With the advent of new technologies used in libraries to support information discovery, more challenges arise to protect patron privacy.7

The Impact of Information Technology on Patron Privacy

Researchers have studied the impact of information technology on patron privacy for several decades. Early research by Harter and Machovec discussed the data privacy challenges arising from the use of automated systems in the library, and the associated ethical considerations for librarians who create, view, modify, and use patron records.8 Fouty addressed issues regarding the privacy of patron data contained in library databases, arguing that online patron records provide more information about individual library users, more quickly, than traditional paper-based files.9 Agnew and Miller presented a hypothetical case involving the transmission of an obscene email from a library computer, and an ensuing FBI inquiry, as a method of examining privacy issues that arise from patron internet use at the library.10 In addition, Merry pointed to the potential for violations of patron privacy brought about by tracking of personal information attached to electronic text supplied by publishers.11

The consensus from the literature, as articulated by Fifarek, is that technology has given rise to new privacy challenges, and that the adoption of technology in the library has outpaced efforts to maintain patron privacy.12 This sentiment was echoed and amplified by John Berry, former ALA president, who commented that there are “deeper issues that arise from the impact of converting information to digitized, online formats” and critiqued the library profession for having “not built protections for such fundamental rights as those to free expression, privacy, and freedom.”13 ALA affirmed these findings and validated much of the prevailing research in a report from the Library Information Technology Association, which concluded, “User records have also expanded beyond the standard lists of library cardholders and circulation records as libraries begin to use electronic communication methods such as electronic mail for reference services, and as they provide access to computer, web and printing use.”14

In more recent years, library systems have made increasing use of network communication protocols such as HTTP, and the focus of the literature has shifted towards internet technologies in response to the growth of trends such as cloud computing and Web 2.0.
Mavodza characterizes the relevance of cloud computing as “unavoidable” and expounds on the ways in which Software-as-a-Service (SaaS), Platform as a Service (PaaS), Infrastructure as a Service (IaaS), and other cloud computing models “bring to the forefront considerations about . . . information security [and] privacy . . . that the librarian has to be knowledgeable about.”15 Levy and Bérard caution that next-generation library systems and web-based solutions are “a breakthrough but need careful scrutiny” of security, privacy, and related issues such as data provenance (i.e., where the information is physically stored, which can potentially affect security and privacy compliance requirements).16

Protecting Patron Privacy in the “Library 2.0” Era

“Library 2.0” is an approach to librarianship that emphasizes engagement and multidirectional interaction with library patrons. Although this model is “broader than just online communication and collaboration” and “encompasses both physical and virtual spaces,” there can be no doubt that “Library 2.0 is rooted in the global Web 2.0 discussion,” and that libraries have made increasing use of Web 2.0 technologies to engage patrons.17 The Library 2.0 model disrupts many traditional practices for protecting privacy, such as limited tracking of user activity, short-term data retention policies, and anonymous browsing of physical materials. Instead, as Zimmer states, “the norms of Web 2.0 promote the open sharing of information—often personal information—and the design of many Library 2.0 services capitalize on access to patron information and might require additional tracking, collection, and aggregation of patron activities.”18 As ALA cautioned in their study on privacy and confidentiality, “Libraries that provide materials over websites controlled by the library must determine the appropriate use of any data describing user activity logged or gathered by the web server software.”19 The dilemma facing libraries in the Library 2.0 era, then, is how to appropriately leverage user information while maintaining patron privacy.
Many library systems require users to validate their identity through the use of a username, password, PIN code, or another unique identifier for access to their library circulation records and other personal information.20 However, several studies suggest the authentication process itself spawns a trail of personally identifiable information about library patrons that must be kept confidential.21 There is discussion in the literature about the value of using HTTPS and SSL certificates to protect patron privacy and build a high level of trust with users, and general awareness about the importance of encrypting communications that involve sensitive information, such as “payment for fines and fees via the OPAC” or when “patrons are required to enter personal details such as addresses, phone numbers, usernames, and/or passwords.”22 However, as Breeding observed, many OPACs and other library automation software products “don't use SSL by default, even when processing these personalization features.”23 These observations call library privacy practices into question, and are concerning since “hackers have identified library ILSs as vulnerable, especially when libraries do not enforce strict system security protocols.”24

One of the challenges facing libraries is the perception that “a library's basic website and online catalog functions don't need enhanced security.”25 As a matter of fact, one of the most common complaints against HTTPS implementation in libraries has been: “we don’t serve any sensitive information.”26 These beliefs may be based on the historical practice of using HTTPS selectively to secure “sensitive” information and operations such as user authentication. But in recent years, it has become clear that selective HTTPS implementation is not an adequate defense. The Electronic Frontier Foundation (EFF) cautions, “Some site operators provide only the login page over HTTPS, on the theory that only the user’s password is sensitive. These sites’ users are vulnerable to passive and active attacks.”27 Passive attacks do not alter systems or data. During a passive attack, a hacker will attempt to listen in on communications over a network. Eavesdropping is an example of a passive attack.28 Active attacks alter systems or data. During this type of attack, a hacker will attempt to break into a system to make changes to transmitted or stored data, or introduce data into the system. Examples of active attacks include man-in-the-middle, impersonation, and session hijacking.29

HTTP Exploits

Web servers typically generate unique session token IDs for authenticated users and transmit them to the browser, where they are cached in the form of cookies.
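To make the cookie mechanics concrete, here is a minimal sketch, using Python’s standard library, of how a session cookie’s security attributes are set; the cookie name and token value are purely illustrative, not taken from any system examined in this study.

# Hedged sketch: setting security flags on a session cookie with the
# Python standard library. Name and value are illustrative only.
from http.cookies import SimpleCookie

cookie = SimpleCookie()
cookie["JSESSIONID"] = "abc123"          # hypothetical session token
cookie["JSESSIONID"]["secure"] = True    # transmit over HTTPS only
cookie["JSESSIONID"]["httponly"] = True  # hide from client-side scripts

# Prints a Set-Cookie header carrying the HttpOnly and Secure flags;
# without "secure", the token would also travel over plain HTTP.
print(cookie.output())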
Session hijacking is a type of attack that “compromises the session token by stealing or predicting a valid session token to gain unauthorized access to the web server,” often by using a network sniffer to capture a valid session ID that can be used to gain access to the server.30 Session hijacking is not a new problem, but the release of the Firesheep attack kit in 2010 increased awareness about the inherent insecurity of HTTP and the need for persistent HTTPS.31 In the wake of Firesheep’s release and several major security breaches, Senator Charles Schumer, in a letter to Yahoo!, Twitter, and Amazon, characterized HTTP as a “welcome mat for would-be hackers” and urged the technology industry to implement better security as quickly as possible.32 These and other events prompted several major site operators, including Google, Facebook, PayPal, and Twitter, to switch from partial to pervasive HTTPS. Today these sites transmit virtually all web application traffic over HTTPS. Security researchers from these companies, as well as from several standards organizations such as the Electronic Frontier Foundation (EFF), Internet Engineering Task Force (IETF), and Open Web Application Security Project, have shared their experiences and recommendations to help other website operators implement HTTPS effectively.33 These include encrypting the entire session, avoiding mixed content, configuring cookies correctly, using valid SSL certificates, and enabling HSTS to enforce HTTPS.

TESTING TECHNIQUES USED TO EVALUATE HTTPS IMPLEMENTATION

There is little or no evidence in the literature that libraries are aware of the associated vulnerabilities, threats, or risks, or that researchers have evaluated the use of HTTPS in library web applications. However, there are many methods that libraries can use to evaluate HTTPS and SSL/TLS implementation, including automated software tools and heuristic evaluations. These methods can be combined for deeper analysis.

Automated Software Tools

Among the most widely used automated analysis software tools is SSL Server Test from Qualys SSL Labs. This online service “performs a deep analysis of the configuration of any SSL web server on the public internet” and provides a visual summary as well as detailed information about authentication (certification and certificate chains) and configuration (protocols, key strength, cipher suites, and protocol details).34 Users can optionally post the results to a central “board” that acts as a clearinghouse for identifying “insecure” and “trusted” sites. Another popular tool is SSLScan, a command-line application that, as the name implies, quickly “queries SSL services, such as HTTPS, in order to determine the ciphers that are supported.”35 However, these tools are limited in that they only report specific types of data and do not provide a holistic view of HTTPS implementation.

Heuristic Evaluations

In addition to automated software tools, librarians can also use heuristic evaluations to manually inspect the gray areas of HTTPS implementation, either to validate the results of automated software or to examine aspects not included in the functionality of these tools. One example is HTTPSNow, a service that lets users report and view information about how websites use HTTPS. HTTPSNow enables this activity by providing heuristics that non-technical audiences can use to derive a relatively accurate assessment of HTTPS deployment on any particular website or application. Several of these checks can also be scripted, as in the sketch below.
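The following sketch, which assumes a hypothetical hostname, uses Python’s standard library to automate two of the simplest checks: whether plain HTTP is redirected to HTTPS, and whether an HSTS header is served.

# Hedged sketch of two automatable HTTPS checks; the hostname is a
# hypothetical placeholder, not a site evaluated in this study.
import urllib.request

HOST = "library.example.org"

# Check 1: does plain HTTP redirect to HTTPS? urlopen follows redirects,
# so the final URL reveals whether the server upgraded the connection.
resp = urllib.request.urlopen(f"http://{HOST}/")
print("HTTP redirects to HTTPS:", resp.geturl().startswith("https://"))

# Check 2: does the HTTPS response carry a Strict-Transport-Security
# (HSTS) header instructing browsers to use HTTPS on future visits?
resp = urllib.request.urlopen(f"https://{HOST}/")
print("HSTS header:", resp.headers.get("Strict-Transport-Security", "absent"))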
The HTTPSNow project documentation includes descriptions of, and guidance for identifying, HTTP-related vulnerabilities such as use of HTTP during authenticated user sessions, presence of mixed content (instances in which some content on a webpage is transmitted via HTTPS while other content elements are transmitted via HTTP), insecure cookie configurations, and use of invalid SSL certificates.

RESEARCH METHODOLOGY

A combination of heuristic and automated methods was used to evaluate HTTPS implementation in a public library web application, to determine how many security vulnerabilities exist in the application, and to assess the potential privacy risks to the library’s patrons.

Research Location

This research project was conducted at a public library in the western US that we will call West Coast Public Library (WCPL). This library was established in 1908 and employs ninety staff and approximately forty volunteers. In addition, it has approximately 91,000 cardholders. As part of its operations, WCPL runs a public-facing website and an integrated library system (ILS) that includes an OPAC with personalization for authenticated users.

Test

To conduct the test, a valid WCPL library patron account was created and used to authenticate one of the authors for access to account information and personalized features of WCPL’s OPAC. Next, the Google Chrome web browser was used to visit WCPL’s public-facing website. A valid patron name, library card number, and eight-digit PIN were then used to gain access to online account information. Several tasks were performed to evaluate HTTPS usage. A sample search query for the keyword “recipes” was performed in the OPAC while logged in. The description pages for two of the resources listed in the search engine result page (one printed resource and one electronic resource) were clicked on and viewed. The electronic resource was added to the online account’s “book cart” and the book cart page was viewed. During these activities, HTTPSNow heuristics were applied to individual webpages and to the user session as a whole. The web browser’s URL address window was inspected to determine whether some or all pages were transmitted via HTTP or HTTPS. The URL icon in the browser’s address bar was clicked on to view a list of the cookies that the application set in the browser. Each cookie was inspected for the text “Send for: Encrypted connections only,” which indicates that the cookie is secure. Individual webpages were checked for the presence of mixed (encrypted and unencrypted) content. Information about individual SSL certificates was inspected to determine their validity and encryption key length. All domain and subdomain names encountered during these activities were documented. The Google Chrome web browser was then used to access the Qualys SSL Server Test tool. Each domain name encountered was submitted. Test results were then examined to determine whether any authentication or configuration flaws exist in WCPL’s web applications. The same assessment can also be driven programmatically, as sketched below.
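The sketch below shows one way such a submission might be automated, assuming the v3 “analyze” endpoint of the Qualys SSL Labs assessment API; the hostnames are hypothetical placeholders rather than the domains tested in this study.

# Hedged sketch: polling the Qualys SSL Labs assessment API instead of
# the web form. Assumes the v3 "analyze" endpoint; hostnames are
# hypothetical placeholders.
import json
import time
import urllib.request

API = "https://api.ssllabs.com/api/v3/analyze"
HOSTS = ["catalog.library.example.org", "ill.library.example.org"]

for host in HOSTS:
    while True:
        with urllib.request.urlopen(f"{API}?host={host}") as resp:
            report = json.load(resp)
        if report.get("status") in ("READY", "ERROR"):
            break
        time.sleep(30)  # assessments take minutes; poll politely

    # Each endpoint (IP address) receives its own letter grade.
    for endpoint in report.get("endpoints", []):
        print(host, endpoint.get("ipAddress"), endpoint.get("grade"))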
RESULTS AND DISCUSSION

Given the recommendations suggested by several organizations (e.g., EFF, IETF, OWASP), we evaluated WCPL’s web application to determine how many security vulnerabilities exist in the application and to assess the potential privacy risks to the library’s patrons. The results of the tests, as discussed below, suggest that WCPL’s web application possesses a number of vulnerabilities that could potentially be exploited by attackers and compromise the confidentiality of PII about library patrons. This is not surprising given the lack of research on HTTPS implementation, as well as the general consensus in the literature that technology adoption has outpaced efforts to maintain patron privacy. Based on the results of these tests, WCPL’s website and ILS span several domains. Some of these domains appear to be operated by WCPL, while others appear to be part of a hosted environment operated by the ILS vendor. Based on this information, it is reasonable to conclude that WCPL’s ILS utilizes a “hybrid cloud” model. In addition, seemingly random use of HTTPS was observed in the OPAC interface during the testing process. This is discussed in the following sections.

Use of HTTP During Authenticated User Sessions

Library patrons use WCPL’s website and OPAC to access and search for books and other material available through the library. Given the results of the tests, WCPL does not use HTTPS pervasively across its entire web application. During the test, we found that WCPL’s website is transmitted via HTTP by default. This was the case even after manually entering the URL with an “https” prefix, which resulted in a redirect to the unencrypted “http” page. We continued to test WCPL’s website and OPAC by performing a query using the search bar located on the patron account page. We found that WCPL’s OPAC transmits some pages over HTTP and others over HTTPS. For example, when a search query is performed in the search bar located on the patron account page, the search engine results page is sometimes served over HTTPS, and sometimes over HTTP (see figure 1). This behavior is not limited to specific pages; rather, it appears to be random. This security flaw leaves library patrons vulnerable to passive and active attacks that exploit gaps in HTTPS implementation, allowing an attacker to eavesdrop on or hijack a user session and gain access to private information.

Figure 1. Results of the Library’s use of HTTPS.

Presence of Mixed Content

When a library patron visits a webpage served over HTTPS, the connection with the web server is encrypted and therefore safeguarded from attack. If an HTTPS webpage includes content retrieved via HTTP, the webpage is only partially encrypted, leaving the unencrypted content vulnerable to attackers. Analysis of WCPL’s website did not reveal any explicit use of mixed content on the public-facing portion of the site. Test results, however, detected unencrypted content sources on some pages of the library’s online catalog. This, unfortunately, puts patron privacy at risk: when an HTTPS webpage loads content such as an image, iframe, or font over HTTP, attackers can intercept the HTTP resources. This compromises the security of what is perceived to be a secure site by enabling an attacker to exploit an insecure CSS file or JavaScript function, leading to disclosure of sensitive data, malicious website redirects, man-in-the-middle attacks, phishing, and other active attacks.36

Insecure Cookie Management

Cookies are small text files, sent from a web server and stored on user computers via web browsers. Cookies can be divided into two categories: session and persistent. Persistent cookies are stored on the user’s hard drive until they are erased or expire.
Unlike persistent cookies, session cookies are stored in memory and erased once the user closes their browser. Provided that computer settings allow for it, cookies are created when a user visits a website. Cookies can be configured so that they are transmitted over encrypted connections only, and can be used to remember login credentials and information previously entered into forms, such as name, mailing address, email address, and the like. Cookies can also be used to monitor the number of times a user visits a website, the pages a user visits, and the amount of time spent on a webpage.

The results of the tests suggest that WCPL’s cookie policies are inconsistent. We found two types of cookies present. Within one domain, the web application uses a JSESSION cookie that is configured to send for “secure connections only.” This indicates that the session ID cookie is encrypted during transmission. Another domain uses an ASP.NET session ID that is configured to send for any connection, which means the session ID could be transmitted in an unencrypted format. Cookies transmitted in an unencrypted format could be intercepted by an attacker in order to eavesdrop on or hijack user sessions. This leaves user privacy vulnerable given the type of information contained within cookies.

Flawed Encryption Protocol Support

Transport Layer Security (TLS) is a protocol designed to provide secure communication over the web. Websites using TLS therefore provide a secure communication path between their web servers and web browsers, preventing eavesdropping, hijacking, and other passive and active attacks. This study employed the SSL Server Test from Qualys SSL Labs to perform an analysis of WCPL’s web applications. Results of the Qualys test (see figure 2) indicate that the site does not support TLS 1.2, which means the server may be vulnerable to passive and active attacks, thereby providing hackers with access to data passed between the web server and a web browser accessing the server. In addition, the application’s server platform supports SSL 2.0, which is insecure because it is subject to a number of passive and active attacks leading to loss of confidentiality, privacy, and integrity.

Figure 2. Qualys Scanning Service Results.

The vulnerabilities discovered during the testing process may be a result of uncoordinated security. This is concerning because it is a by-product of the cloud computing approach used to operate WCPL’s ILS. While libraries may have acclimated to the challenge of coordinating security measures across a distributed application, they now face the added complexity of coordinating security measures with their vendors, who themselves may also utilize additional cloud-based offerings from third parties. As cloud technology adoption increases and cloud-based infrastructures become more complex and distributed, attackers will likely attempt to find and exploit systems with inconsistent or uneven security measures, and libraries will need to work closely with information technology vendors to ensure tight coordination of security measures. Protocol support of this kind can also be probed directly from a script, as sketched below.
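The following sketch uses Python’s ssl module to test, version by version, which TLS protocols a server will negotiate; the hostname is a hypothetical placeholder. Note that obsolete protocols such as SSL 2.0 cannot be offered by modern Python/OpenSSL builds at all, and depending on the local OpenSSL configuration even TLS 1.0/1.1 handshakes may be refused on the client side, so dedicated scanners remain necessary for the oldest protocols.

# Hedged sketch: probe which TLS versions a server will negotiate.
# The hostname is hypothetical. Depending on the local OpenSSL build,
# very old protocol versions may be refused by the client itself.
import socket
import ssl

HOST = "catalog.library.example.org"

for version in (ssl.TLSVersion.TLSv1, ssl.TLSVersion.TLSv1_1,
                ssl.TLSVersion.TLSv1_2, ssl.TLSVersion.TLSv1_3):
    ctx = ssl.create_default_context()
    ctx.minimum_version = version
    ctx.maximum_version = version  # pin the handshake to one version
    try:
        with socket.create_connection((HOST, 443), timeout=5) as sock:
            with ctx.wrap_socket(sock, server_hostname=HOST) as tls:
                print(f"{version.name}: supported ({tls.version()})")
    except (ssl.SSLError, OSError):
        print(f"{version.name}: not negotiated")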
Unencrypted communication using HTTP affects the privacy, security, and integrity of patron data. Passive attacks such as eavesdropping and active attacks such as hijacking, man-in-the-middle, and phishing can reveal patron login credentials, search history, identity, and other sensitive information that, according to ALA, should be kept private and confidential. Given the results of the testing done in this study, it is clear that WCPL needs to revisit and strengthen its web application security measures by following the recommendations of organizations within the security community: using HTTPS pervasively across the entire web application, avoiding mixed content, configuring cookies for encrypted communication only, using valid SSL certificates, and enabling HSTS to enforce HTTPS. Implementing these improvements will mitigate attacks by strengthening the integrity of WCPL’s web applications, which in turn will help protect the privacy and confidentiality of library patrons.

LIMITATIONS AND FUTURE RESEARCH

This research was performed at a public library in the western U.S. Therefore, future research is needed to study the implementation of HTTPS to increase patron privacy at other public libraries, libraries in other parts of the U.S., and libraries in other countries. It would also be valuable to conduct similar research at libraries of different types, including academic, law, medical, and other types of special libraries. SSL Server Test from Qualys SSL Labs and HTTPSNow were used to evaluate the use of HTTPS at WCPL. The use of other evaluation techniques may generate different results. While a major limitation of this study is the evaluation of a single public library and its implementation of HTTPS to ensure patron privacy, a next phase of research should further investigate the policies in place that are used to safeguard patron privacy. These include security education, training, and awareness programs, as well as access controls. Furthermore, Library 2.0 and cloud computing are fundamental to libraries, but create risks that could impact the ability to keep patron PII safeguarded. As such, future research should evaluate the impact Library 2.0 and cloud computing applications have on maintaining the confidentiality of patron information.

CONCLUSION

The library profession has long been a staunch defender of privacy rights, and the literature reviewed indicates strong awareness and concern about the rapid pace of information technology and its impact on the confidentiality of personally identifiable information about library patrons. Much work has been done to educate librarians and patrons about the risks facing them and the measures they can take to protect themselves. However, the research and experimentation presented in this report strongly suggest that there is a need for WCPL and other libraries to reassess and strengthen their HTTPS implementations. HTTPS is not a panacea for mitigating web application risks, but it can help libraries give patrons the assurance of knowing they take security and privacy seriously, and that reasonable steps are being taken to protect them.
Finally, this report concludes that further research on library application security should be conducted to assess the overall state of application security in public, academic, and special libraries, with the long-term objective of enabling ALA and other professional institutions to develop policies and best practices to guide the secure adoption of Library 2.0 and cloud computing technologies within a socially connected world.

REFERENCES

1 Jon Brodkin, “President Trump Delivers Final Blow to Web Browsing Privacy Rules,” Ars Technica (April 3, 2017), https://arstechnica.com/tech-policy/2017/04/trumps-signature-makes-it-official-isp-privacy-rules-are-dead/.

2 Shayna Pekala, “Privacy and User Experience in 21st Century Library Discovery,” Information Technology and Libraries 36, no. 2 (2017): 48–58, https://doi.org/10.6017/ital.v36i2.9817.

3 American Library Association, “History of the Code of Ethics: 1939 Code of Ethics for Librarians,” accessed May 11, 2018, http://www.ala.org/Template.cfm?Section=History1&Template=/ContentManagement/ContentDisplay.cfm&ContentID=8875.

4 Joyce Crooks, “Civil Liberties, Libraries, and Computers,” Library Journal 101, no. 3 (1976): 482–87; Stephen Harter and Charles C. Busha, “Libraries and Privacy Legislation,” Library Journal 101, no. 3 (1976): 475–81; Kathleen G. Fouty, “Online Patron Records and Privacy: Service vs. Security,” Journal of Academic Librarianship 19, no. 5 (1993): 289–93, https://doi.org/10.1016/0099-1333(93)90024-Y.

5 “Code of Ethics of the American Library Association,” American Library Association, amended January 22, 2008, http://www.ala.org/advocacy/proethics/codeofethics/codeethics; “Privacy: An Interpretation of the Library Bill of Rights,” American Library Association, amended July 1, 2014, http://www.ala.org/advocacy/intfreedom/librarybill/interpretations/privacy.

6 American Library Association, “Privacy: An Interpretation of the Library Bill of Rights,” amended July 1, 2014, http://www.ala.org/advocacy/intfreedom/librarybill/interpretations/privacy.

7 Pekala, “Privacy and User,” pp. 48–58.

8 Harter and Busha, “Libraries and Privacy Legislation,” pp. 475–81; George S. Machovec, “Data Security and Privacy in the Age of Automated Library Systems,” Information Intelligence, Online Libraries, and Microcomputers 6, no. 1 (1988).

9 Fouty, “Online Patron Records and Privacy,” pp. 289–93.

10 Grace J. Agnew and Rex Miller, “How do you Manage?,” Library Journal 121, no. 2 (1996): 54.

11 Lois K. Merry, “Hey, Look Who Took This Out!—Privacy in the Electronic Library,” Journal of Interlibrary Loan, Document Delivery & Information Supply 6, no. 4 (1996): 35–44, https://doi.org/10.1300/J110V06N04_04.
12 Aimee Fifarek, “Technology and Privacy in the Academic Library,” Online Information Review 26, no. 6 (2002): 366–74, https://doi.org/10.1108/14684520210452691.

13 John N. Berry III, “Digital Democracy: Not Yet!,” Library Journal 125, no. 1 (2000): 6.

14 American Library Association, “Appendix—Privacy and Confidentiality in the Electronic Environment,” September 28, 2006, http://www.ala.org/lita/involve/taskforces/dissolved/privacy/appendix.

15 Judith Mavodza, “The Impact of Cloud Computing on the Future of Academic Library Practices and Services,” New Library World 114, no. 3/4 (2012): 132–41, https://doi.org/10.1108/03074801311304041.

16 Richard Levy, “Library in the Cloud with Diamonds: A Critical Evaluation of the Future of Library Management Systems,” Library Hi Tech News 30, no. 3 (2013): 9–13, https://doi.org/10.1108/LHTN-11-2012-0071; Raymond Bérard, “Next Generation Library Systems: New Opportunities and Threats,” Bibliothek, Forschung und Praxis 37, no. 1 (2013): 52–58, https://doi.org/10.1515/bfp-2013-0008.

17 Michael Stephens, “The Hyperlinked Library: a TTW White Paper,” accessed May 13, 2018, http://tametheweb.com/2011/02/21/hyperlinkedlibrary2011/; Michael Zimmer, “Patron Privacy in the ‘2.0’ Era,” Journal of Information Ethics 22, no. 1 (2013): 44–59, https://doi.org/10.3172/JIE.22.1.44.

18 Zimmer, “Patron Privacy in the ‘2.0’ Era,” p. 44.

19 “The American Library Association’s Task Force on Privacy and Confidentiality in the Electronic Environment,” American Library Association, final report July 7, 2000, http://www.ala.org/lita/about/taskforces/dissolved/privacy.

20 Library Information Technology Association (LITA), accessed May 11, 2018, http://www.ala.org/lita/.

21 Library Information Technology Association (LITA), accessed May 11, 2018, http://www.ala.org/lita/; Pam Dixon, “Ethical Issues Implicit in Library Authentication and Access Management: Risks and Best Practices,” Journal of Library Administration 47, no. 3 (2008): 141–62, https://doi.org/10.1080/01930820802186480; Eric P. Delozier, “Anonymity and Authenticity in the Cloud: Issues and Applications,” OCLC Systems and Services: International Digital Library Perspectives 29, no. 2 (2012): 65–77, https://doi.org/10.1108/10650751311319278.

22 Marshall Breeding, “Building Trust through Secure Web Sites,” Computers in Libraries 25, no. 6 (2006), p. 24.

23 Breeding, “Building Trust,” p. 25.
24 Barbara Swatt Engstrom et al., “Evaluating Patron Privacy on Your ILS: How to Protect the Confidentiality of Your Patron Information,” AALL Spectrum 10, no. 6 (2006): 4–19.

25 Breeding, “Building Trust,” p. 26.

26 TJ Lamana, “The State of HTTPS in Libraries,” Intellectual Freedom Blog, the Office for Intellectual Freedom of the American Library Association (2017), https://www.oif.ala.org/oif/?p=11883.

27 Chris Palmer and Yan Zhu, “How to Deploy HTTPS Correctly,” Electronic Frontier Foundation, updated February 9, 2017, https://www.eff.org/https-everywhere/deploying-https.

28 Computer Security Resource Center, “Glossary,” National Institute of Standards and Technology, accessed May 12, 2018, https://csrc.nist.gov/Glossary/?term=491#AlphaIndexDiv.

29 Computer Security Resource Center, “Glossary,” National Institute of Standards and Technology, accessed May 12, 2018, https://csrc.nist.gov/Glossary/?term=2817.

30 Open Web Application Security Project, “Session Hijacking Attack,” last modified August 14, 2014, https://www.owasp.org/index.php/Session_hijacking_attack; Open Web Application Security Project, “Session Management Cheat Sheet,” last modified September 11, 2017, https://www.owasp.org/index.php/Session_Management_Cheat_Sheet.

31 Eric Butler, “Firesheep,” (2010), http://codebutler.com/firesheep/; Audrey Watters, “Zuckerberg's Page Hacked, Now Facebook To Offer ‘Always On’ HTTPS,” accessed May 16, 2018, https://readwrite.com/2011/01/26/zuckerbergs_facebook_page_hacked_and_now_facebook/.

32 Info Security Magazine, “Senator Schumer: Current Internet Security ‘Welcome Mat for Would-be Hackers,’” (March 2, 2011), http://www.infosecurity-magazine.com/view/16328/senator-schumer-current-internet-security-welcome-mat-for-wouldbe-hackers/.

33 Palmer and Zhu, “How to Deploy HTTPS Correctly”; Internet Engineering Task Force, “Recommendations for Secure Use of Transport Layer Security (TLS) and Datagram Transport Layer Security (DTLS),” (May 2015), https://tools.ietf.org/html/bcp195; Open Web Application Security Project, “Session Management Cheat Sheet,” last modified September 11, 2017, https://www.owasp.org/index.php/Session_Management_Cheat_Sheet.

34 Qualys SSL Labs, “SSL/TLS Deployment Best Practices,” accessed May 18, 2018, https://www.ssllabs.com/projects/best-practices/.

35 SourceForge, “SSLScan—Fast SSL Scanner,” last updated April 24, 2013, http://sourceforge.net/projects/sslscan/.
36 Palmer and Zhu, “How to Deploy HTTPS Correctly.”

10407 ---- Letter to the Editor

Ann Kucera

Dear Editorial Board,

Regarding “Halfway Home: User Centered Design and Library Websites” (https://doi.org/10.6017/ital.v37i1.10338) in the March 2018 issue of Information Technology and Libraries (ITAL), I thought there were some interesting points. I think, however, that your assertion that User Centered Design automatically eliminates anything from a website that your main user group did not expressly ask for is faulty.

When someone brings up the fact that User Centered Design is not statistically significant, I interpret that as a misunderstanding of what User Centered Design is. Our academic library websites are not research projects, so why would we gather statistically significant information about them? Our academic library websites are (or should be) helpful to students and faculty and constantly changing to meet their needs. If librarians perpetuate a misunderstanding of User Centered Design, my fear is that the misunderstanding could lead to stagnation and a refusal to change our technology and user interfaces in a rapidly changing environment, doing our patrons and ourselves a disservice.

User Centered Design is a set of tools to help us gather information about users and their needs. The information gathered informs the design but does not dictate the design, and needs to be part of an iterative process. The web design team at your institution demonstrated User Centered Design when they added floor maps back into the website after a group of users pointed out that their absence was causing problems for the main users at your institution. While valuable experience from librarians and other staff is critical to take into account, it is sometimes difficult to determine which pieces of the puzzle provide comfort to those who work at the library and which pieces assist students in their studies. I applaud your willingness to “clear the slate” and reduce the amount of information you were maintaining on your website.
I’m guessing you may have removed dozens of links from your website. You only mentioned adding one category of information back into the design. I would say your User Centered Design process is working quite well.

Ann Kucera
Systems Librarian
Central Michigan University

10437 ---- The Benefits of Enterprise Architecture for Library Technology Management: An Exploratory Case Study

Sam Searle

Sam Searle (samantha.searle@griffith.edu.au) is Manager, Library Technology Services, Griffith University, Brisbane, Australia.

ABSTRACT

This case study describes how librarians and enterprise architects at an Australian university worked together to document key components of the Library’s “as-is” enterprise architecture (EA). The article covers the rationale for conducting this activity, how the work was scoped, the processes used, and the outputs delivered. The author discusses the short-term benefits of undertaking this work, with practical examples of how outputs from this process are being used to better plan future library system replacements, upgrades, and enhancements. Longer-term benefits may also accrue in the future as the results of this architecture work inform the Library’s IT planning and strategic procurement. This article has implications for practice for library technology specialists, as it validates views from other practitioners on the benefits for libraries in adopting enterprise architecture methods and for librarians in working alongside enterprise architects within their organizations.

INTRODUCTION

Griffith University is a large comprehensive university with multiple campuses located across the South East Queensland region in Australia. Library and information technology operations are highly converged and from 1989 to 2017 were offered within a single Division of Information Services. Scalable, sustainable, and cost-effective IT is seen as a key strategic enabler of the University’s core business in education and research. “Information Management and Integration” and “Foundation Technology” are two of four key areas outlined in the Griffith Digital Strategy 2020, which highlights enterprise-wide decision-making and proactive moves to take advantage of As-a-Service models for delivering applications.1

From late 2016 through to early 2018, Library and Learning Services (“the Library”) and IT Architecture and Strategy (ITAS) worked iteratively to document key components of the Library’s “as-is” enterprise architecture (EA). Around fifty staff members have participated in the process at different points. The process has been very positive for all involved and has led to a number of benefits for the library in terms of improved planning, decision-making, and strategic communication.

As Manager, Library Technology Services, the author was well placed to act as a participant-as-observer with the objective of sharing these experiences with other library practitioners. The author actively participated in the processes described here and has been able to informally discuss the benefits of this work with the architects and some of the library staff members who were most involved.

LITERATURE REVIEW

Enterprise architecture (EA) emerged over twenty years ago and is now a well-established IT discipline.
As in other disciplines such as project management and change management, there are a number of best-practice frameworks in common use, including The Open Group Architecture Framework (TOGAF).2 A global federation of member professional associations has been in place since 2011, with aims including the formalization of standards and promotion of the value of EA.3 Educational qualifications, certifications, and professional development pathways for enterprise architects are available within universities and the private training sector.

According to the international higher education technology association EDUCAUSE, EA is relatively new within universities but is growing in importance. As a set of practices, “EA provides an overarching strategic and design perspective on IT activities, clarifying how systems, services, and data flows work together in support of business processes and institutional mission.”4 Yet despite this growing interest in our parent organizations, individual academic libraries applying EA principles and methods are notably absent from the scholarly literature and library practitioner information-sharing channels.

The fullest account to date of the experience and impacts of enterprise architecture practice in a library context is a case study from the Canada Institute for Scientific and Technical Information (CISTI). At the time of the case study’s writing in 2008, CISTI was already well underway in its adoption of EA methods in an effort to address the challenges of “legacy, isolated, duplicated, and ineffective information systems” and to “reduce complexity, to encourage and enable collaborations, and, finally, to rein in the beast of technology.”5 The author of this case study concludes that while getting started in EA was complex and resource-intensive, this was more than justified at CISTI by the improvements in technology capability, strategic planning, and services to library users.

Broader whole-of-government agendas are a driver for EA adoption in non-university research libraries. The National Library of Finland’s EA efforts were guided by a National Information Society Policy and the EA architecture design method for the Finnish government.6 A 2009 review of the IT infrastructure at the U.S. Library of Congress (LC) argued LC was lagging behind other federal agencies in adoption of government-recommended EA frameworks. The impact of this included: inadequate linking of IT to the LC mission; potential system interoperability problems; difficulties assessing and managing the impact of changes; poor management of IT security; and technical risk due to non-adherence to industry standards and lack of future planning.7 A follow-up review in 2015 noted that LC had since developed an architecture, but that it had still fallen short by not gathering data from management and validating the work with stakeholders.8

There is little discussion in the literature about the EA process as a collaborative effort. In their 2016 discussion of emerging roles for librarians, Parker and McKay proposed EA as a new area for librarians themselves to consider moving into, rather than as a source of productive partnerships.9 They argued that there are many similarities in the skillsets and practices of enterprise architects and information professionals (in particular, systems librarians and corporate information managers).
Areas of crossover identified included: managing risks, for example those related to intellectual property and data retention; structured and standardized approaches to (meta)data and information; technical skills such as systems analysis, database design, and vendor management; and understanding and application of information standards and internal information flows. While not a research library, State Archives and Records NSW has, within a broader information management context, promoted the benefits to records managers of working with enterprise architects, including improved program visibility, strategic assistance with business case development, and the embedding of recordkeeping requirements within the organization’s overall enterprise architecture.10

GETTING STARTED: CONTEXT AND PLANNING

Library Technology Services Context

In 2015–16, awareness of enterprise and solution architecture expanded significantly within Griffith University’s Library Technology Services (LTS) team. In 2015, some members of the team participated in activities led by external consultants to document Griffith’s overall enterprise architecture at a high level. In 2016, the author became a member of the University’s Solution Architecture Board (SAB). LTS submitted several smaller solution architectures to this group for discussion and approval, and team members found this process useful in identifying alternative ways to do things that we may not otherwise have considered.

As a small team looking after a portfolio of high-use applications, LTS was seeking to align itself as much as possible with university-wide IT governance and strategy. These broader approaches included aggressively seeking to move services to cloud hosting, standardizing methods for transferring data between systems, complying with emerging requirements for greater IT security, and participating in large-scale disaster recovery planning exercises.

The author also needed to improve communication with senior IT stakeholders. There was little understanding outside of the Library of the scale and complexity involved in delivering online library services to a community of over 50,000 people. In a resource-scarce environment, it was increasingly important to make business cases not just in formal project documents but also opportunistically in less formal situations (the “elevator pitch”).

Existing systems were definitely hindering the Library in making progress toward an improved online student experience and more efficient usage of staff resources. A complex ecosystem of more than a dozen library applications had developed over time. The Library had selected these at different times based on requirements for specific library functions rather than alignment with an overall architectural strategy. Our situation mirrored that described at CISTI: “a complex and ‘siloed’ legacy infrastructure with significant vendor lock-in” combined with “reactionary” projects that “extended or redesigned [existing infrastructure] to meet purported needs, without consideration for the complexity that was being added to overcomplicated systems.”11 Complex data flows between local systems and third-party providers that were critical to library services were not always well documented. While LTS staff members were extremely experienced, much of their knowledge was tacit.
As in many libraries, staff could be observed sharing knowledge in informal, organic ways focused on the tasks at hand, but less effort was spent on capturing that knowledge systematically. Building a more explicit shared understanding of the Library’s application portfolio would help address risks associated with staff succession. Improved internal documentation would also address emerging requirements for team members to both develop their own understanding in new areas (upskilling) and become more flexible in taking up broader roles and responsibilities across the team (cross-skilling).

There was also a sense that the time was right to take stock and evaluate the current state of affairs before embarking on any major changes. The team was supporting several applications, including the library management system and the interlibrary loans system, that were end-of-life. We needed to make decisions, and these needed to not only address our current issues but also provide a firm platform for the future. It was in this context that in 2016 Library Technology Services approached the Information Technology Architecture and Solutions group for assistance.

Information Technology Architecture and Solutions Context

In 2014, Griffith University embarked on a new approach to enterprise architecture. The Chief Technology Officer was given a mandate by the senior leadership of the University to ensure that IT architecture was managed within an architecture governance framework, and the Information Services EA team was tasked with developing and maintaining an EA and providing services to support the development of solution architectures for projects and operational activities. Two new boards were established to provide governance: the Information and Technology Architecture Board (ITAB) would control architectural standards and business technology roadmaps, while the Solution Architecture Board (SAB) would “support the development and implementation of solution architecture that is effective, sustainable and consistent with architectural standards and approaches.” Project teams and operational areas were explicitly given responsibility to engage with these boards when undertaking the procurement and implementation of IT systems. Sets of architectural, information, and integration principles were developed, which promoted integration mechanisms that minimized business impact and were future-proof, loosely coupled, reusable, and shared.12

Our enterprise architects saw their primary role as maximizing the value of the University’s total investment in IT by promoting standards and frameworks that could potentially improve consistency and reduce duplication across the whole organization. In order to do this, they would need to work with and through other business units. From the architects’ perspective, a collaboration with the Library offered an opportunity to exercise skillsets and frameworks that were in place but still relatively new. Griffith was still maturing in this area and attempting to move from the hiring of consultants as the norm to building more internal capability. Working with the Library would be a good learning experience for a junior architect, who was on a temporary work placement from another part of Information Services as a professional development opportunity.
She could build her skills in a friendly environment before embarking on other engagements with potentially less open client groups.

Determining Scope in a Statement of Architecture Work

Once the two teams had decided that the process could have benefits on both sides, the next step was to jointly develop a Statement of Architecture Work outlining what the process would include and how we would work together. A formal document was eventually endorsed at the Director level, but prior to that the librarians and the architects had a number of useful informal conversations in which we discussed our expectations, as well as the amount of time that we could reasonably contribute to the process. In developing the Statement of Work, the two teams agreed to focus on the current “as-is” environment and on assessment of the maturity of the applications already in use (see figure 1). This would help us immediately with developing business cases and roadmaps, without necessarily committing either team to the much greater effort required to identify an ideal “to-be” (i.e., future) state to work towards.

Figure 1. Overview of the Architecture Statement of Work. Full-size version available at https://doi.org/10.6084/m9.figshare.6667427.

The Open Group Architecture Framework (TOGAF) supports the development of enterprise architectures through four subdomains: Business Architecture, Data Architecture, Application Architecture, and Technology Architecture.13 The work that we decided to pursue maps to two of these areas: Data Architecture, which “describes the structure of an organization’s logical and physical data assets and data management resources;” and Application Architecture, which “provides a blueprint for the individual applications to be deployed, their interactions, and their relationships to the core business processes of the organization.”

ENTERPRISE ARCHITECTURE PROCESS AND OUTPUTS

Once the Architecture Statement of Work had been agreed on, the two teams embarked on the process of working together over an extended period. While the elapsed time from approval of the Statement of Work through to endorsement of the architecture outputs by the Solution Architecture Board was approximately fourteen months, the bulk of the work was undertaken within the first six months. Following an intense period of information gathering involving large numbers of staff, a smaller subset of people then worked iteratively to refine the outputs for final approval. Several times architecture activities had to be placed on hold in favor of essential ongoing operational work and higher-priority projects, such as a major upgrade of the institutional repository. The process involved four main activities, which are described in more detail in the following sections.

Data Asset and Application Inventory

The first activity consisted of a series of three workshops to review information held about library systems in the EA management system, Orbus Software’s iServer.
This is the tool used by the Griffith EA team to develop and store architectural models and to produce artifacts such as architecture diagrams (in Microsoft Visio format) and documentation (in Microsoft Word, Excel, and PowerPoint formats).14 The architects guided a group of librarians who use and support library systems through a process of mapping the types of data held against an existing list of enterprise data entities. In this context, a data entity is a grouping of data elements that is discrete and meaningful within a particular business context. For library staff, meaningful data entities included all the data relating to a Person, to items and metadata within a library Collection, and to particular business processes such as Purchasing. We also identified the systems into which data were entered (System of Entry), the systems that were considered the “source of truth” (System of Record), and the systems that made use of data downstream from those systems of record (Reference Systems).

The main output of this process was a workbook (figure 2) showing a range of relationships: between systems and data entities; between internal systems; and between internal systems and external systems. The first two columns in the worksheet contain a list of all the data entities and sub-entities stored in library systems (as expressed in the enterprise architecture). Along the top of the worksheet is a list of all the products in our portfolio along with a range of systems they are integrated with. Each of the orange arrows in this spreadsheet represents the flow of data from one system to another. The workbook in this raw form is definitely messy, and the data within it is not really meant to be widely consumed in this format. The workbook’s main role is as the data source for the application communication diagram that is described in a later section.

Figure 2. Part of the data asset and application inventory worksheet. Full-size version available at https://doi.org/10.6084/m9.figshare.6667430.

As a result of this data asset inventory, the management system used by our architects now contains a far more comprehensive and up-to-date view of the Library’s architectural components than before:

• The data entities better reflect library content. For example, while iServer already had a Collection Item data entity, we were able to add new data entity subtypes for Bibliographic Records, Authority Records, and Holdings Records.

• Library systems are now captured in ways that make more sense to us. Workshopping with the architects led to the breakdown of several applications into more granular architectural components. For example, the library management system is now represented not just as a single system, but rather as a set of interconnected modules that support different business functions, such as cataloguing and circulation. Similarly, our reading lists solution was broken down into its two main components: one for managing reading lists and one for managing digitized content. This granularity has enabled us to build a clearer picture of how systems (and modules within systems) interface with each other.

• The wide range of technical interfaces we have with third parties, such as publishers and other libraries, is now explicitly expressed. Feedback from the architects suggested that the Library was very unusual compared to other parts of the organization in terms of the number of critical external systems and services that we use as part of our service provision. Previously iServer did not contain a full picture of these critical services, including:

o the web-based purchasing tools that we use to interact with publishers, such as EBSCO’s GOBI;15

o the Library Links program that we use to provide easier access to scholarly content via Google Scholar;16 and

o various harvesting processes that enable us to share metadata with content aggregators, such as the National Library of Australia’s Trove service and the Australian National Data Service’s Research Data Australia portal.17
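To make the roles captured in this inventory concrete, the fragment below sketches one possible in-memory representation of the System of Entry / System of Record / Reference System classification. This is an illustrative sketch only, not Griffith’s iServer model: the entity names, the system names, and the rule that each entity has exactly one System of Record are assumptions made for the purpose of the example.

```python
# An illustrative representation of the data asset inventory idea: for each
# data entity, record which systems play which architectural role. All names
# and role assignments below are examples, not Griffith's actual inventory.
from enum import Enum

class Role(Enum):
    SYSTEM_OF_ENTRY = "data is entered here"
    SYSTEM_OF_RECORD = "authoritative source of truth"
    REFERENCE_SYSTEM = "consumes data downstream"

inventory = {
    "Bibliographic Records": {
        "Cataloguing module": Role.SYSTEM_OF_ENTRY,
        "Library management system": Role.SYSTEM_OF_RECORD,
        "Discovery layer": Role.REFERENCE_SYSTEM,
    },
    "Person": {
        "PeopleSoft HR": Role.SYSTEM_OF_RECORD,
        "Library management system": Role.REFERENCE_SYSTEM,
    },
}

# One sanity check such a model enables: each entity should have exactly one
# authoritative source of truth (an assumed design rule, for illustration).
for entity, systems in inventory.items():
    records = [s for s, role in systems.items() if role is Role.SYSTEM_OF_RECORD]
    assert len(records) == 1, f"{entity} must have exactly one System of Record"
    print(f"{entity}: system of record is {records[0]}")
```

Even a toy structure like this makes the downstream uses described below (communication diagrams, visualizations) straightforward to generate mechanically rather than by hand.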
On the left in figure 4 is a view displaying the Business Criticality, Business Fit, and Technical Fit for an individual application (shown in pink) as compared to the overall portfolio (shown in blue). On the right is a graph showing scores for the range of measures covered by the survey. This particular product is doing well; technical and business fit are high in the graph on the left, and most measures are above average in the graph on the right.

Figure 5 shows the remaining two graphs for the same product. The graph on the left plots the scores for Business Criticality and Application Suitability (fitness for purpose) to produce a recommended technical strategy. The graph on the right plots the scores for Business Fit and Technical Fit to produce a recommended management strategy. In both graphs, it is possible to see how the specific application is performing (the red square) compared to the portfolio overall (the blue diamond). Placement within the quadrant with the green Optimise label is preferred, as in this case.

Figure 5. The remaining two graph types from the application maturity survey results, for a system [product name redacted] that is performing well. The specific system’s location is shown by the red square, while the blue diamond maps the average for all systems in the application portfolio. Full-size version available at https://doi.org/10.6084/m9.figshare.6667442.

Figures 6 and 7 present the same set of graphs for an end-of-life system. In figure 6 the graph on the left shows that the product is very business-critical but that its scores for Technical Fit and Business Fit (the lower corners of the pink triangle) are lower than the average across all applications (the lower corners of the blue triangle). The graph on the right shows that Supportability and the Time to Market for changes and enhancements (the least prominent “points” in the pink polygon) are below the portfolio average (shown in blue along the same axes), while scores for the other measures (criticality, standards compliance, information quality, and performance) were more in line with the portfolio average.

Figure 6. The first and second (of four) graphs for a system [product name redacted] that is end-of-life. Full-size version available at https://doi.org/10.6084/m9.figshare.6667478.

In figure 7, this application is placed well within the quadrant suggesting replacement.

Figure 7. The third and final graphs for a system [product name redacted] that is end-of-life. The placement of the red square within the Replace quadrant indicates that this product is a strong candidate for decommissioning. This is a marked difference from the portfolio as a whole (the blue diamond), which could be reviewed for possible implementation improvements. Full-size version available at https://doi.org/10.6084/m9.figshare.6667484.
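The quadrant logic that turns these scores into a recommended strategy is simple enough to sketch in code. The fragment below is a minimal illustration only, not Griffith’s actual tooling: the sample data and field names are invented, and the assignment of the four management strategies to quadrants around a midpoint of 3.0 on the five-point scale is an assumption inferred from the graphs described above.

```python
# A minimal sketch of the quadrant logic, not Griffith's actual tooling.
# Scores are on the survey's one-to-five scale; the sample data, field names,
# and the 3.0 midpoint used to split the quadrants are assumptions.
from statistics import mean

def management_strategy(business_fit: float, technical_fit: float,
                        midpoint: float = 3.0) -> str:
    """Map averaged Business Fit vs. Technical Fit onto a strategy quadrant."""
    if business_fit >= midpoint and technical_fit >= midpoint:
        return "Optimise"                # performing well on both axes
    if business_fit < midpoint and technical_fit >= midpoint:
        return "Implementation Review"   # sound technology, poor business fit
    if business_fit >= midpoint and technical_fit < midpoint:
        return "Technology Refresh"      # well used, but the technology is ageing
    return "Replace"                     # weak on both axes

# One row per survey response, grouped by application (names are invented).
responses = {
    "Legacy interlibrary loans system": [
        {"business_fit": 2, "technical_fit": 2},
        {"business_fit": 3, "technical_fit": 1},
    ],
    "Cloud-native reading lists tool": [
        {"business_fit": 4, "technical_fit": 5},
        {"business_fit": 5, "technical_fit": 4},
    ],
}

for app, rows in responses.items():
    bf = mean(r["business_fit"] for r in rows)
    tf = mean(r["technical_fit"] for r in rows)
    print(f"{app}: business fit {bf:.1f}, technical fit {tf:.1f} "
          f"-> {management_strategy(bf, tf)}")
```

The value of the real exercise lay less in this mechanical step than in the relative comparison it enabled across the whole portfolio.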
The graphs are also useful for highlighting anomalies. Figure 8 shows a product that is assessed as better-than-average in the portfolio on most measures. However, the survey results quite clearly show that information quality is a major issue for it.

Figure 8. Graph from the application maturity survey showing a specific area of concern (data quality) for an otherwise well-performing application [product name redacted]. Full-size version available at https://doi.org/10.6084/m9.figshare.6667487.

This type of finding will help Library Technology Services to target our continuous improvement efforts and work through our relationships with user groups and vendors to get a better result.

Application Communication Diagram

The third major activity was the production of an application communication diagram (see figure 9). This is a visual representation of all of the information that was collated through the workshops using the workbook described above.

Figure 9. Application communication diagram [simplified view]. Full-size version available at https://doi.org/10.6084/m9.figshare.6667490.

The diagram includes a number of things to note.

• Key applications that make up the library ecosystem. An example of this is the large blue box on the top left. This represents the Intota product suite from ProQuest, which contains multiple components, including our link resolver, discovery layer, and electronic resource manager.

• Physical technology. Self-checkout machines appear as the small green box mid-right.

• Other internal systems that connect to library system components. Examples appear throughout and include: corporate systems, such as PeopleSoft for human resources and finances; identity management systems, such as the metadirectory and Ping Federate; the learning management system Blackboard; and research systems, including the research information management system and the researcher profiles system.

• External systems that connect to our systems. These are mostly gathered into the large grey box at bottom right.

• Actors who access the systems. These include administrators, staff, students, and the general public. Actors are identified using a small person icon.

• Interfaces between components. Each line in the diagram represents a unique connection into another system or interface. Captions on these lines indicate the nature of the connection, e.g., manual data entry, Z39.50 search, export scripts, and lookup lists.

The production of this diagram has been an iterative process that has taken place over a long time period. The number of components involved is quite large, so it is worth noting that the version presented here has actually been simplified. The architects’ tools can present information in different ways, and this particular “view” was chosen to balance the need for detail and accuracy with the need to communicate meaningfully with a variety of stakeholders.

Production of Interactive Visualizations

In the fourth and final work package, the data entity and application inventory spreadsheet was used as a data source for an interactive visualization (see figure 10). A member of the architecture team converted the workbook (see figure 2) from Microsoft Excel .xls into a .csv file. He developed a PHP script to query the file and return a JSON object based on the parameters that were passed.
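The general shape of that transformation is easy to show. The fragment below is a re-imagining in Python of the workbook-to-JSON step (the original was a PHP script); the column names are assumptions rather than the real workbook layout shown in figure 2, and the nodes/links output is the JSON shape that D3.js force layouts conventionally consume.

```python
# A re-imagining in Python of the workbook-to-JSON step (the original was a
# PHP script). The column names below are assumptions, not the real workbook
# layout; each row becomes a link between a data-entity node and an
# application node in a nodes/links structure suitable for a D3 force graph.
import csv
import json

nodes, links, index = [], [], {}

def node_id(name: str, kind: str) -> int:
    """Register a node once and return its index for use in links."""
    if name not in index:
        index[name] = len(nodes)
        nodes.append({"name": name, "kind": kind})  # kind: "entity" or "application"
    return index[name]

with open("inventory.csv", newline="") as f:
    # Assumed columns: entity, system, relationship
    for row in csv.DictReader(f):
        source = node_id(row["entity"], "entity")
        target = node_id(row["system"], "application")
        links.append({"source": source, "target": target,
                      "label": row["relationship"]})

with open("graph.json", "w") as f:
    json.dump({"nodes": nodes, "links": links}, f, indent=2)
```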
The Data-Driven Documents JavaScript library (D3.js) was used to produce a force graph that uses shapes, colors, and lines to present the spreadsheet information visually and interactively.18 This tool enables navigation through the Library’s network of data entities (shown as orange squares) and applications (shown as blue dots). In the example being displayed, the data entity “Bibliographic records—MARC” has been selected. It is possible to see, both in the visualization and in the popup box on the left, how MARC records are captured, stored, and used across our entire ecosystem of applications. This visualization was very much an experiment, and its long-term value is something we are still discussing. In the short term, other outputs have proven to be more useful for planning purposes.

Figure 10. Interactive visualization of library architecture, showing relationships between a single data subentity (Bibliographic records—MARC) and various applications. Full-size version available at https://doi.org/10.6084/m9.figshare.6667493.

DISCUSSION

The process described above was not without its challenges, including establishing a common language. Enterprise architecture and libraries are both fertile breeding grounds for jargon and acronyms. There was also a disconnect in our understandings of who our users were, with the architects tending to concentrate on internal users, while the librarians were keen to include the perspectives of the academic staff and students who make up our core client base. These were minor challenges, however, and the experience of working with the enterprise architects was overall an interesting and positive one for the Library.

Our collaboration validated McKay and Parker’s view that there is much crossover in the skillsets and mindsets of librarians and enterprise architects.19 Both groups tended to work in systematic and analytical ways, which was helpful in removing some of the more emotive aspects that might have arisen through a more judgmental “assessment” process. The enterprise architects’ job was to promote conformance with standards that are aspirational in many respects for the Library. However, the collaborative nature of the process and the immediate usefulness of its outputs helped us to approach this as an opportunity to improve our internal practices as well as the services that we offer to library customers. The architects observed in return that library staff were very open-minded about the process; this had not necessarily always been their experience with other groups in the University. One reason for this may have been LTS’s efforts to communicate early with other library staff. Before embarking on this work, we sent emails and provided verbal updates to all participants and their supervisors. These communications were clear about both the time commitment needed for workshops and surveys and the benefits we hoped to achieve.

Short-Term Impacts in the Library Domain

The level of awareness and understanding in Library Technology Services about EA concepts and methods is much higher than it was previously. Our capacity to self-identify architectural issues is better as a result, and this is enabling us to be proactive rather than reactive.
A recent example of this is a request from our Solution Architecture Board (SAB) to seek an exemption from our Information and Technology Architecture Board (ITAB) for our proposed use of the NISO Circulation Interchange Protocol (NCIP) to support interlibrary loan. While NCIP is a NISO standard that is widely used in libraries, it is not one of the integration mechanisms incorporated into the architecture standards. As a result of this request, we plan to develop a document for these IT governance groups covering all the library-specific data transfer protocols that we use: not just NCIP, but also Z39.50, the Open Archives Initiative Protocol for Metadata Harvesting (OAI-PMH), the EDIFACT standard for transferring purchasing information, and possibly others. It is in our interests to educate these important governance groups about integration methods commonly used in the library environment, since these are not well understood outside of our team.

The baseline as-is application architecture diagram gives us a much better grasp of the complexity we are faced with. Understanding this complexity is a prerequisite to controlling it. The diagram, and the process worked through to populate it, makes it easier to identify manual processes that should be automated and integrations that might be done more efficiently or effectively. For example, like most libraries, we still have many scheduled batch processes that we could potentially replace in the future with web services providing real-time updates.

The iServer platform is now an important source of data to support our decision-making, both in arriving at broad recommendations for replacing, reimplementing, or optimizing our systems and in highlighting specific areas of concern. Importantly, the process produced relative results, so that we can see across our application portfolio which systems are underperforming compared to others. This makes it easier to determine where the team should be putting its efforts and highlights areas where firmer approaches to vendor management could be applied. A practical example of this was our decision in late 2017 to review (and ultimately unbundle and replace) an e-journal statistics module that was underperforming compared to other modules within the same suite.

The outputs from this process are also helping Library Technology Services communicate, both within our own team and with other stakeholders. The results of the application maturity assessment were included as part of a business case seeking project funding to upgrade our library management system and replace our interlibrary loans system. That funding bid was successful. While it is possible that the business case would have been approved regardless, a recommendation from the architects that the system needed to be replaced was likely more persuasive than the same recommendation coming solely from a library perspective. In our organizational context, enterprise architects are trusted by very senior executives; they are perceived as neutral and objective, and the processes that they use are understood to be systematic and data-driven.

Longer-Term Impacts in an Enterprise Context

There are a number of longer-term impacts that may arise from this work. Seeing the Library’s applications in a broader enterprise context is likely to lead to more questioning of the status quo and to a desire to investigate new ways of doing things.
In large organizations like universities, available enterprise systems can offer better functionality and more standardized ways of operating than library systems. Financial systems are an obvious example, as are business intelligence tools. The canned and custom reports and dashboards within library systems meet a narrow set of requirements, but they do not fare well for increasingly complex analytics when compared to enterprise data warehousing, emerging “data lake” technologies for less structured data, and sophisticated reporting tools.

An enterprise approach also highlights where the same process is being carried out across different systems. For example, OAI-PMH harvesting is a feature of multiple systems at Griffith. Traditionally each system provides its own feeds: our data repository, publications repository, and researcher profile system all provide OAI-PMH harvesting endpoints for sending metadata to different aggregators. An alternative solution to explore could be to harvest all publications data from multiple systems into our corporate data warehouse (particularly if this evolved to provide more linked data functionality) and provide a single OAI-PMH endpoint that could then be managed as a single service.

The EA process has further raised our already high level of concern with the current library systems market. There has been a move in recent years towards larger, highly integrated “black box” solutions. While there have been some moves towards openness, for example through the development of APIs, these are often rhetorical rather than practical. The pricing structures for products mean that we continue to pay for functionality that would not be required if we could integrate library applications with non-library enterprise tools in smarter ways. At Griffith, the products that scored most highly in our maturity assessment in terms of business and technical fit were the less expensive, lightweight, browser-based, cloud-native tools designed to do one or two things really well. This suggests that strategies around a more loosely coupled microservices approach, such as that being developed through the FOLIO open source library software initiative, will be worth exploring in future.20

CONCLUSION

There are few documented examples of librarians working closely with enterprise architects in higher education or elsewhere. The goal of this case study is to encourage other librarians to learn more about architects’ work practices and to seek opportunities to apply EA methods in the library systems space, for the benefit not just of the library but of the organization as a whole.

As a single-institution case study, the applicability of this work may be limited in other environments. Griffith has a long tradition of highly converged library and IT operations; other organizations may face more structural barriers to entry if the library and IT areas are not as naturally cooperative. A further obvious limitation relates to resourcing. The author of the CISTI case study cautions that getting started in EA can be complex and resource-intensive. Few libraries are likely to be in the position of CISTI in having dedicated library architects, so working with others will be required. In many universities, work of this nature is outsourced to specialist consultants because of a lack of in-house expertise.
At Griffith University, we conducted this exercise entirely with in-house staff. A downside of this was that, despite our best efforts at the scoping stage, competing priorities in both areas meant that this work took far longer than we expected. In theory, external consultants could have guided the Library through similar activities to produce similar outputs, and probably in a shorter timeframe. However, we would observe that the process has been just as important as the outputs; the knowledge, skills, and relationships that have been built will continue into the future. At CISTI, investments in EA were assessed by the library as justified by the improvements in technology capability, strategic planning, and services to library users. The Griffith experience validates this perspective.

It is also important to note that EA work can and should be done in an iterative way. Our experience suggests that some outputs can be delivered earlier than others and that useful insights can be gleaned even from drafts. Our local “ecosystem” of library applications, enterprise applications, and integrations between these different components must respond to changes in technologies; legal and regulatory frameworks; institutional policies and procedures; and other factors. It is therefore unrealistic to expect outputs from a process like this to remain current for long. Assuming that the Library’s data and application architecture will always be a work in progress, it will continue to be worth the effort involved to build and maintain positive working relationships with the enterprise architects, who now have a deeper understanding of who we are and what we do.

ACKNOWLEDGEMENTS

Thank you to Anna Pegg, Associate IT Architect; Jolyon Suthers, Senior Enterprise Architect; Colin Morris, Solution Consultant; the Library Technology Services team; all our Library and Learning Services colleagues who participated in this initiative; and Joanna Richardson, Library Strategy Advisor, for support and feedback during the writing of this article. This work was previously presented at THETA (The Higher Education Technology Agenda) 2017, Auckland, New Zealand.

REFERENCES

1 Griffith University, “Griffith Digital Strategy 2020,” 2016, https://www.griffith.edu.au/__data/assets/pdf_file/0026/365561/griffithuniversity-digital-strategy.pdf.

2 The Open Group, “TOGAF®, an Open Group Standard,” accessed June 4, 2018, http://www.opengroup.org/subjectareas/enterprise/togaf.

3 Federation of Enterprise Architecture Professional Associations, “A Common Perspective on Enterprise Architecture,” 2013, http://feapo.org/wp-content/uploads/2013/11/Common-Perspectives-on-Enterprise-Architecture-v15.pdf.

4 Judith Pirani, “Manage Today’s IT Complexities with an Enterprise Architecture Practice,” EDUCAUSE Review, February 16, 2017, https://er.educause.edu/blogs/2017/2/manage-todays-it-complexities-with-an-enterprise-architecture-practice.

5 Stephen Kevin Anthony, “Implementing Service Oriented Architecture at the Canada Institute for Scientific and Technical Information,” The Serials Librarian 55, no. 1–2 (July 3, 2008): 235–53, https://doi.org/10.1080/03615260801970907.

6 Kristiina Hormia-Poutanen, “The Finnish National Digital Library: National Library of Finland Developing a National Infrastructure in Collaboration with Libraries, Archives and Museums,” accessed March 24, 2018, http://travesia.mcu.es/portalnb/jspui/bitstream/10421/6683/1/fndl.pdf.
7 Karl W. Schornagel, “Information Technology Strategic Planning: A Well-Developed Framework Essential to Support the Library’s Current and Future IT Needs, Report No. 2008-PA-105,” May 2, 2009, https://web.archive.org/web/20090502092325/https://www.loc.gov/about/oig/reports/2009/Final%20IT%20Strategic%20Planning%20Report%20Mar%202009.pdf.

8 Joel Willemssen, “Information Technology: Library of Congress Needs to Implement Recommendations to Address Management Weaknesses,” December 2, 2015, https://www.gao.gov/assets/680/673955.pdf.

9 Rebecca Parker and Dana McKay, “It’s the End of the World as We Know It . . . or Is It? Looking Beyond the New Librarianship Paradigm,” in Marketing and Outreach for the Academic Library, ed. Bradford Lee Eden (Lanham, MD: Rowman and Littlefield, 2016): 81–106.

10 New South Wales State Archives and Records Authority, “Recordkeeping in Brief 59—An Introduction to Enterprise Architecture for Records Managers,” 2011, https://web.archive.org/web/20120502184420/https://www.records.nsw.gov.au/recordkeeping/government-recordkeeping-manual/guidance/recordkeeping-in-brief/recordkeeping-in-brief-59-an-introduction-to-enterprise-architecture-for-records-managers.

11 Anthony, “Implementing Service Oriented Architecture,” 236–37.

12 Jolyon Suthers, “Information and Technology Architecture,” 2016, accessed April 6, 2018, https://www.caudit.edu.au/system/files/Media%20library/Resources%20and%20Files/Communities/Enterprise%20Architecture/EA2016%20Joylon%20Suthers%20CAUDIT%20EA%20Symposium%202016%20-%20IT%20Architecture%20v2_0.pdf.
13 The Open Group, “TOGAF® 9.1,” 2011, http://pubs.opengroup.org/architecture/togaf9-doc/arch/index.html: Part 1, Introduction, Section 2: Core Concepts.

14 Orbus Software, “iServer for Enterprise Architecture,” accessed March 26, 2018, https://www.orbussoftware.com/enterprise-architecture/capabilities/.

15 EBSCO, “GOBI®,” accessed June 5, 2018, https://gobi.ebsco.com/gobi.

16 Google Scholar, “Google Scholar Support for Libraries,” accessed June 5, 2018, https://scholar.google.com/intl/en/scholar/libraries.html.

17 National Library of Australia, “Trove,” accessed June 5, 2018, https://trove.nla.gov.au/; Australian National Data Service, “Research Data Australia,” accessed June 5, 2018, https://researchdata.ands.org.au/.

18 Mike Bostock, “D3.js—Data-Driven Documents,” accessed April 3, 2018, https://d3js.org/.

19 Parker and McKay, “It’s the End of the World,” 88.

20 Marshall Breeding, “Five Key Technology Trends for 2018,” Computers in Libraries 37, no. 10 (December 2017), http://www.infotoday.com/cilmag/dec17/Breeding--Five-Key-Technology-Trends-for-2018.shtml.

10432 ---- An Overview of the Current State of Linked and Open Data in Cataloging

Irfan Ullah, Shah Khusro, Asim Ullah, and Muhammad Naeem

Irfan Ullah (cs.irfan@uop.edu.pk) is a doctoral candidate, Shah Khusro (khusro@uop.edu.pk) is Professor, Asim Ullah (asimullah@uop.edu.pk) is a doctoral student, and Muhammad Naeem (mnaeem@uop.edu.pk) is Assistant Professor, at the Department of Computer Science, University of Peshawar.

ABSTRACT

Linked Open Data (LOD) is a core Semantic Web technology that makes the knowledge and information spaces of different knowledge domains manageable, reusable, shareable, exchangeable, and interoperable.
The LOD approach achieves this through the provision of services for describing, indexing, organizing, and retrieving knowledge artifacts and making them available for quick consumption and publication. This is also aligned with the role and objective of traditional library cataloging. Owing to this link, major libraries of the world are transferring their bibliographic metadata to the LOD landscape. Some developments in this direction include the replacement of the Anglo-American Cataloguing Rules, 2nd Edition (AACR2) by Resource Description and Access (RDA) and the trend towards wider adoption of BIBFRAME 2.0. An interesting and related development in this respect is the discussion among knowledge resource managers and the library community on the possibility of enriching bibliographic metadata with socially curated or user-generated content. The popularity of Linked Open Data and its benefit to librarians and knowledge management professionals warrant a comprehensive survey of the subject. Although several reviews and survey articles on the application of Linked Data principles to cataloging have appeared in the literature, a generic yet holistic review of the current state of Linked and Open Data in cataloging is missing. To fill the gap, the authors have collected recent literature (2014–18) on the current state of Linked Open Data in cataloging to identify research trends, challenges, and opportunities in this area and, in addition, to understand the potential of socially curated metadata in cataloging, mainly in the realm of the Web of Data. To the best of the authors’ knowledge, this review article is the first of its kind that holistically treats the subject of cataloging in the Linked and Open Data environment. Some of the findings of the review are: Linked and Open Data is becoming the mainstream trend in library cataloging, especially in the major libraries and research projects of the world; with the emergence of Linked Open Vocabularies (LOV), bibliographic metadata is becoming more meaningful and reusable; and, finally, enriching bibliographic metadata with user-generated content is gaining momentum. Conclusions drawn from the study include the need to focus on the quality of catalogued knowledge and to reduce the barriers to the publication and consumption of such knowledge, and the need for the library community to learn from the successful adoption of LOD in other application domains and to contribute collaboratively to the global-scale activity of cataloging.

INTRODUCTION

With the emergence of the Semantic Web and Linked Open Data (LOD), libraries have been able to make their bibliographic data publishable and consumable on the web, resulting in increased understanding and utility both for humans and machines.1 Additionally, the use of the Linked Data principles of LOD has allowed connecting related data on the web.2 Traditional catalogs, as collections of metadata about library content, have served the same purpose for a long time.3 It is, therefore, natural to establish a link between the two technologies and exploit the capabilities of LOD to enhance the power of cataloging services.
In this regard, significant milestones have been achieved, including the use of Linked and Open Data principles for publishing and linking library catalogs, BIBFRAME, and the Europeana Data Model (EDM).4 However, the potential of Linked and Open Data for building more efficient libraries, and the challenges involved in that direction, are mostly unknown due to the lack of a holistic view of the relationship between cataloging and the LOD initiative and of the advances made in both areas. Likewise, the possibility of enriching bibliographic metadata with user-generated content such as ratings, tags, and reviews to facilitate the search for known items as well as exploratory search has not received much attention.5 Some studies of preliminary extent have, however, appeared in the literature, an overview of which is presented in the following paragraphs.

Several survey and review articles have contributed to different aspects of cataloging in the LOD environment. Hallo et al. investigated how Linked Data is used in digital libraries, how the major libraries of the world have implemented it, and how they benefit from it, focusing on the selected ontologies and vocabularies.6 They identified several specific challenges to applying Linked Data to digital libraries. More specifically, they reviewed Linked Data applications in digital libraries by analyzing research publications, published from 2012 to 2016, concerning the major national libraries (those obtaining five stars by following the Linked Data principles).7 Tallerås statistically examined the quality of linked bibliographic data published by major libraries, including those of Spain, France, the United Kingdom, and Germany.8 Yoose and Perkins presented a brief survey of LOD uses under different projects in different domains, including libraries, archives, and museums.9 By exploring the current advances in the Semantic Web, Robert identified the potential roles of libraries in publishing and consuming bibliographic data and institutional research output as Linked and Open Data on the web.10 Gardašević presented a detailed overview of the Semantic Web and Linked Open Data from the perspective of library data management and their applicability within the library domain to provide a more open and integrated catalog for improved search, resource discovery, and access.11 Thomas, Pierre-Yves, and Bernard presented a review of Linked Open Vocabularies (LOV), in which they analyzed the health of LOV from the requirements perspective of its stakeholders, its current progress, and its uses in LOD applications, and proposed best practices and guidelines for promoting the LOV ecosystem.12 They uncovered the social and technical aspects of this ecosystem and identified the requirements for the long-term preservation of LOV data.
Vandenbussche et al. highlighted the features, components, significance, and applications of LOV and identified the ways in which LOV supports ontology and vocabulary engineering in the publication, reuse, and data quality of LOD.13 Tosaka and Park performed a detailed literature review of RDA (2005–11) and identified its fundamental differences from AACR2, its relationship with metadata standards, and its impact on metadata encoding standards, users, practitioners, and the training required.14 Sprochi presented the current progress in RDA, FRBR (Functional Requirements for Bibliographic Records), and BIBFRAME to predict the future of library metadata, the skills and knowledge required to handle it, and the directions in which the library community is heading.15 Gonzales identified the limitations of MARC21 and the benefits of and challenges in adopting the BIBFRAME framework.16 Taniguchi assessed BIBFRAME 2.0 for the exchange and sharing of metadata created in different ways for different bibliographic resources.17 He discussed BIBFRAME 1.0 from an RDA point of view.18 He also examined BIBFRAME 2.0 from the perspective of RDA to uncover issues in mapping RDA to BIBFRAME, including RDA expressions in BIBFRAME, mapping RDA elements to BIBFRAME properties, and converting MARC21 metadata records to BIBFRAME metadata.19 Fayyaz, Ullah, and Khusro reported on the current state of LOD and identified several prominent issues, challenges, and research opportunities.20 Ullah, Khusro, and Ullah reviewed and evaluated different approaches to the bibliographic classification of digital collections.21

Looking at the above survey and review articles, one may observe that each targets a specific aspect of cataloging from the perspective of LOD. A holistic analysis and a complete picture of the current state of cataloging in transitioning to the LOD ecosystem are missing. This paper adds to the body of knowledge by filling this gap in the literature. More specifically, it attempts to answer the following research questions (RQs):

RQ01: How are Linked Open Data (LOD) and Linked Open Vocabularies (LOV) transforming the digital landscape of library catalogs?

RQ02: What are the prominent issues, challenges, and research opportunities in publishing and consuming bibliographic metadata as Linked and Open Data?

RQ03: What is the possible impact of extending bibliographic metadata with user-generated content and making it visible on the LOD cloud?

The first section of this paper answers RQ01 by discussing the potential role of LOD and LOV in making library catalogs visible and reusable on the web. The second section answers RQ02 by identifying some of the prominent issues, challenges, and research opportunities in publishing, linking, and consuming library catalogs as Linked Data. It also identifies specific issues in RDA and BIBFRAME from an LOD perspective and highlights the quality of LOD-based cataloging. The third section answers RQ03 by reviewing the state-of-the-art literature on socially curated metadata and its role in cataloging. The last section concludes the paper and is followed by the references cited in this article.

THE ROLE OF LINKED OPEN DATA AND VOCABULARIES IN CATALOGING

Catalogers, librarians, and information science professionals have always been busy defining the sets of rules, guidelines, and standards needed to record metadata about knowledge artifacts accurately, precisely, and efficiently. AACR2 is among the most widely used sets of rules and guidelines for cataloging.
However, AACR2 has several issues with the nature of authorship, the relationships between bibliographic metadata, the categorization of format-specific resources, and the description of new data types.22 While attempting to produce a revised version, AACR3, the cataloging community concluded that a new framework should instead be developed under the name of RDA.23 Based on the FRBR conceptual models, RDA is a “flexible and extendible bibliographic framework” that supports data sharing and interoperability and is compatible with MARC21 and AACR2.24 According to the RDA Toolkit, RDA describes digital and non-digital resources by taking advantage of the flexibilities and efficiencies of modern information storage and retrieval technologies, while at the same time remaining backward-compatible with legacy technologies used in conventional resource discovery and access applications.25 It is aligned with the conceptual models of authority and bibliographic metadata (FRBR, FRAD [Functional Requirements for Authority Data], and FRSAD [Functional Requirements for Subject Authority Data]) of the International Federation of Library Associations and Institutions (IFLA).26 RDA accommodates all types of content and media in digital environments with improved bibliographic control in the realm of Linked and Open Data; however, its responsiveness to user requirements needs further research.27

The discussion of cataloging rules and guidelines would be incomplete without the metadata encoding standards and formats that give practical shape to these rules in the form of library catalogs. The most common encoding formats include Dublin Core (DC) and MARC21. Dublin Core (http://lov.okfn.org/dataset/lov/vocabs/dce) is a general-purpose metadata encoding scheme and vocabulary of fifteen properties with “broad, generic, and usable terms” for resource description in natural language. It is advantageous in that it presents relatively low barriers to repository construction; however, it lacks standards for indexing subjects consistently as well as the uniform semantic basis necessary for an enhanced search experience.28 The lack of a uniform semantic basis is due to the individual interpretations and exploitations of DC metadata by libraries, which in turn originate from its different and independent implementations at the element level.29 MARC21 is the most common machine-processable metadata encoding format for bibliographic metadata.
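As a taste of how MARC data relates to simpler schemes such as Dublin Core, the sketch below crosswalks a single MARC-style record into DC elements. The record is invented and only a handful of tags are mapped; the tag-to-element choices follow common crosswalk practice (e.g., field 245 to title), but the data structure and field selection are illustrative assumptions rather than a complete mapping.

```python
# A minimal, illustrative MARC-to-Dublin-Core crosswalk. The record below is
# invented and only a handful of tags are mapped; a production crosswalk
# (e.g., the Library of Congress MARC-to-DC mapping) covers far more fields.
marc_record = {
    "100": {"a": "Ranganathan, S. R."},                       # main entry, personal name
    "245": {"a": "Prolegomena to library classification"},   # title statement
    "260": {"b": "Madras Library Association", "c": "1937"},  # publication details
    "650": {"a": "Classification -- Books"},                  # subject added entry
}

# tag -> (Dublin Core element, subfield to take); an assumed, partial mapping
CROSSWALK = {
    "100": ("creator", "a"),
    "245": ("title", "a"),
    "260": ("date", "c"),
    "650": ("subject", "a"),
}

def marc_to_dc(record: dict) -> dict:
    """Return a simple Dublin Core dictionary for one MARC-style record."""
    dc = {}
    for tag, (element, subfield) in CROSSWALK.items():
        field = record.get(tag, {})
        if subfield in field:
            dc.setdefault(element, []).append(field[subfield])
    return dc

print(marc_to_dc(marc_record))
# e.g. {'creator': ['Ranganathan, S. R.'], 'title': [...], 'date': ['1937'], ...}
```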
MARC21 is the most common machine-processable metadata encoding format for bibliographic metadata. It can be mapped to several formats including DC, MARC/XML (http://www.loc.gov/standards/marcxml/), MODS (http://www.loc.gov/standards/mods), MADS (http://www.loc.gov/standards/mads), and other metadata standards.30 However, MARC21 has several limitations: only library software and librarians understand it; it is semantically inexpressive and isolated from the structure of the web; and it lacks the expressive semantic connections needed to relate different data elements in a single catalog record.31 Despite these limitations, the MARC metadata encoding format remains vital for resource discovery, especially within the library environment, and therefore ways must be found to make library collections visible outside libraries and available through the major web search engines.32 One such effort comes from the Library of Congress (http://catalog.loc.gov/), which introduced a new bibliographic metadata framework, BIBFRAME 2.0, intended eventually to replace MARC21 and to let the Semantic Web and Linked Open Data interlink bibliographic metadata from different libraries. Other metadata encoding schemas and frameworks include Schema.org, EDM, and the International Committee for Documentation's Conceptual Reference Model (CIDOC-CRM).33

Today, bibliographic metadata records are available on the web in several forms: MARC21 records, Online Public Access Catalogs (OPACs), bibliographic descriptions from online catalogs (e.g., the Library of Congress), online cooperative catalogs (e.g., OCLC's WorldCat [https://www.oclc.org/en/worldcat.html]), social collaborative cataloging applications (e.g., LibraryThing [https://www.librarything.com]), digital libraries (e.g., the IEEE Xplore digital library [https://ieeexplore.ieee.org/Xplore/home.jsp] and the ACM digital library [https://dl.acm.org]), book search engines such as Google Books, and commercial databases such as Amazon.com. Most of these cataloging web applications use either MARC or other legacy standards as their metadata encoding and representation schemes. However, the majority of these applications are either considering or transitioning to the emerging cataloging rules, frameworks, and encoding schemes so that the bibliographic descriptions of their holdings can be made visible and reusable as Linked and Open Data on the web, in the broader interests of libraries, publishers, and end users.

The presence of high-quality reusable vocabularies makes the consumption of Linked Data more meaningful, which is made possible by Linked Open Vocabularies (LOV) that bring value-added extensions to the Web of Data.34 The following two subsections attempt to answer RQ01 by highlighting how LOD and LOV are transforming the current digital landscape of cataloging.

Linked and Open Data

The Semantic Web and Linked Open Data have enabled libraries to publish and make visible their bibliographic data on the web, which increases the understanding and consumption of this metadata by both humans and machines.35 LOD connects and relates bibliographic metadata on the web using Linked Data principles.36 Publishing, linking, and consuming bibliographic metadata as Linked and Open Data brings several benefits.
These include improvements in data visibility, linkage with different online services, interoperability through a universal LOD platform, and credibility due to user annotations.37 Other benefits include the semantic modeling of entities related to bibliographic resources; ease in transforming topics into SKOS; ease of using linked library data in other services; better data visualization according to user requirements; linking and querying Linked Data from multiple sources; and improved usability of library Linked Data in other domains and knowledge areas.38 Different users, including scientists, students, citizens, and other stakeholders of library data, can benefit from adopting LOD in libraries.39 Linked Data has the potential to make bibliographic metadata visible, reusable, shareable, and exchangeable on the web with greater semantic interoperability among consuming applications. Several major projects, including BIBFRAME, LODLAM (Linked Open Data in Libraries, Archives and Museums [http://lodlam.net]), and LD4L (Linked Data for Libraries [https://www.ld4l.org]), are in progress and advocate for this potential.40 Similarly, Library Linked Data (LLD) comprises LOD-based bibliographic datasets, available in MODS and MARC21, that can be used to make search systems more sophisticated and can be combined with LOV datasets to integrate applications requiring library and subject-domain datasets.41

Bianchini and Guerrini report on the current changes in the library and cataloging domains from the point of view of Ranganathan's trinity (library, books, staff), which holds that changes in one element of this trinity undoubtedly affect the others.42 They found that several factors involving readers, collections, and services influence this trinity and emphasize the need for change:

• Readers have moved from libraries to the web and want to save time, but they also expect many capabilities, including searching and navigating the full text of resources by following links. They want resources connected to similar and related resources, and concepts interlinked so that they can perform exploratory searches and find serendipitous results that fulfill their information needs.

• Collections are undergoing several changes, from their production to their dissemination, and from search and navigation to the representation and presentation of content. The ways users access them and catalogers describe them are changing. Their management is moving beyond the boundaries of individual libraries to the open and broader landscape of the Open Access context and exposure to the LOD environment.

• Services are moving from bibliographic data silos to the Semantic Web. This entails moving the bibliographic model to the more connected and linked data model and environment of the Semantic Web. Data is moving from bibliographic database management systems to large LOD graphs, where millions of MARC records are reused and converted to new encoding formats that are backward compatible with MARC21, RDA, and others, and that provide opportunities to be exploited fully in the Linked and Open Data environment.

Thinking along this direction, new cataloging rules and guidelines, such as RDA, are making us part of a growing global cataloging activity. Therefore, catalogers should take a keen interest in, and avail themselves of, the opportunities that lie in Linked and Open Data for cataloging.
Otherwise, they (as a service) might be forgotten or removed from the trinity, i.e., from collections and readers.43

Several major libraries have been actively working to make their bibliographic metadata visible and reusable on the web. The Library of Congress, through its Linked Data Service (http://id.loc.gov), enables humans and machines to access its authority data programmatically.44 It exposes and interconnects data on the web through dereferenceable Uniform Resource Identifiers (URIs).45 Its scope includes providing access to commonly used LOC standards and vocabularies (controlled vocabularies and data values) for the authorities and controlled vocabularies that LOC currently supports.46 According to the LOC, the Linked Data Service brings several benefits to users: accessing data at no cost; granular access to individual data values; downloading controlled vocabularies and their data values in numerous formats; linking to LOC data values within the user's own metadata using Linked Data principles; a simple RESTful API with a clear license and usage policy for each vocabulary; access to data across LOC divisions through a unified endpoint; and visualizing relationships between concepts and values.47
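As a minimal illustration of this service, the sketch below (Python with the requests library) dereferences a Library of Congress authority URI and asks for an RDF serialization via content negotiation. The specific name-authority URI is illustrative, and the Accept header assumes the content negotiation that the service documents.

import requests

# An illustrative id.loc.gov name-authority URI; each authority or
# vocabulary entry has its own dereferenceable URI.
uri = "http://id.loc.gov/authorities/names/n78095332"

# Content negotiation: ask the server for RDF rather than HTML.
# (Per its documentation, the service also offers JSON and N-Triples.)
response = requests.get(uri,
                        headers={"Accept": "application/rdf+xml"},
                        allow_redirects=True, timeout=30)
response.raise_for_status()

print(response.headers.get("Content-Type"))
print(response.text[:500])  # first few hundred characters of the RDF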
However, to fully exploit the potential of LOD, LOC is mainly focusing on its BIBFRAME initiative.48 BIBFRAME is not only a replacement for the current MARC21 metadata encoding format; it is a new way of thinking about how the large amount of available bibliographic metadata can be shared, reused, and made available as Linked and Open Data.49 The BIBFRAME 2.0 model (https://www.loc.gov/bibframe/docs/bibframe2-model.html) organizes information into works (the conceptual essence of a resource), instances (the material embodiments of a work, such as a particular print or electronic form), and items (actual copies of an instance). BIBFRAME 2.0 further elaborates the roles of the persons involved in a specific work as agents, and what the work is about as subjects and events.50 According to Taniguchi, BIBFRAME 2.0 takes the bibliographic metadata standards to Linked and Open Data with a model and vocabulary that make cataloging more useful both inside and outside the library community.51 To achieve this goal, it needs to fulfill two primary requirements: (1) accepting and representing metadata created with RDA, replacing MARC21 as the vehicle for creating, exchanging, and sharing RDA metadata; and (2) accommodating descriptive metadata for bibliographic resources created by libraries, cultural heritage communities, and users for wide exchange and sharing. BIBFRAME 2.0 should also comply with the Linked Data principles, including the use of RDF and URIs.

In addition to the Library of Congress, OCLC, through its Linked Data Research program, has been actively involved in research on transforming and publishing its bibliographic metadata as Linked Data.52 Under this program, OCLC aims to provide a technical platform for the management and publication of its RDF datasets at a commercial scale. It models the key bibliographic entities, including work and person, and populates them with legacy and MARC-based metadata. It extends these models to efficiently describe the contents of digital collections, art objects, and institutional repositories, which are not well described in MARC. It improves the bibliographic description of works and their translations. It manages the transition from MARC and other legacy encoding formats to Linked Data and develops prototypes for the native consumption of Linked Data to improve resource description and discovery. Finally, it organizes teaching and training events.53 Since 2012, OCLC has been publishing bibliographic data as Linked Data in three major LOD datasets: OCLC Persons, WorldCat Works, and WorldCat.org.54 Inspired by Google Research, OCLC has been working on a Knowledge Vault pipeline to harvest, extract, normalize, weigh, and synthesize knowledge from bibliographic records, authority files, and the web, generating Linked Data triples that improve end users' exploration and discovery experience.55 WorldCat.org publishes its bibliographic metadata as Linked Data by extracting a rich set of entities, including persons, works, places, events, concepts, and organizations, to enable several web services and functionalities for resource discovery and access.56 It uses Schema.org (http://schema.org) as the base ontology, which can be extended with different ontologies and vocabularies to model WorldCat bibliographic data for publication and consumption as Linked Data.57

Tennant presents a simple example of how this works. Suppose we want to represent the fact "William Shakespeare is the author of Hamlet" as Linked Data.58 To do this, the important entities should be extracted along with their semantics (relationships) and represented in a format that is both machine-processable and human-readable. Using Schema.org, the Virtual International Authority File (VIAF.org), and WorldCat.org, the sentence can be represented as a Linked Data triple, as shown in figure 1, based on Tennant.59

The Digital Bibliography & Library Project (DBLP) is an online computer science bibliography that provides bibliographic information about major publications in computer science, with the goal of providing free access to high-quality bibliographic metadata and links to the electronic versions of these publications.60 As of October 2018, it has indexed more than 4.3 million publications from more than 2.1 million authors, covering more than 40,000 journal volumes, 38,000 conference/workshop proceedings, and more than 80,000 monographs.61 Its dataset is available as LOD, which allows faceted search and faceted navigation to matching publications. It uses GrowBag graphs to create topic facets and uses the DBLP++ dataset (an enhanced version of DBLP) along with additional data extracted from related webpages.62 A MySQL database stores the DBLP++ dataset, which is accessible in several ways: (1) downloading the database dump; (2) using its web services; (3) using a D2R server to access it as RDF; and (4) downloading the RDF dump available in N3 serialization.63

The above discussions of LOC, OCLC, and DBLP make it clear that LOD can potentially transform the cataloging landscape of libraries by making bibliographic metadata visible and reusable on the web. However, this potential can only be exploited to its fullest if relevant vocabularies are provided to make the Linked Data more meaningful. LOV fulfills this demand for relevant and standard vocabularies, as discussed in the next subsection.

Figure 1. An Example of Publishing a Sample Fact as Linked Data (Based on Tennant64).
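The figure's triple can also be reproduced in code. The sketch below (Python with rdflib) expresses the same fact with a Schema.org property; the VIAF and WorldCat identifiers are illustrative stand-ins for those in Tennant's example rather than verified record numbers.

from rdflib import Graph, Namespace, URIRef

SCHEMA = Namespace("http://schema.org/")

# Things, not strings: both subject and object are dereferenceable URIs.
# Illustrative identifiers standing in for those in Tennant's example.
hamlet = URIRef("http://www.worldcat.org/oclc/53967396")   # a WorldCat record for Hamlet (illustrative)
shakespeare = URIRef("http://viaf.org/viaf/96994048")      # a VIAF cluster for Shakespeare (illustrative)

g = Graph()
g.bind("schema", SCHEMA)

# "William Shakespeare is the author of Hamlet" as a single triple.
g.add((hamlet, SCHEMA.author, shakespeare))

print(g.serialize(format="turtle"))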
Linked Open Vocabularies

Linked Open Vocabularies (LOV) is a "high-quality catalog of reusable vocabularies to describe Linked and Open Data."65 It assists publishers in choosing appropriate vocabularies to efficiently describe the semantics (classes, properties, and data types) of the data to be published as Linked and Open Data.66 LOV interconnects vocabularies, provides version control, matches the property types of values against queries to improve term scores, and offers a range of data-access methods including APIs, a SPARQL endpoint, and a data dump. The aim is to make the reuse of well-documented vocabularies possible in the LOD environment.67 The LOV portal brings value-added extensions to the Web of Data, which is evident from its adoption in several state-of-the-art applications.68

The presence of a vocabulary makes the corresponding Linked Data meaningful; if the original vocabulary vanishes from the web, Linked Data applications that rely on it no longer function, because they cannot validate against the authoritative source. LOV systems prevent vocabularies from becoming unavailable by providing redundant or backup locations for them.69 The LOV catalog supports almost all types of search criteria, including search using metadata, ontologies, APIs, the RDF dump, and the SPARQL endpoint, enabling it to provide a range of services for the reuse of RDF vocabularies.70 Linked Data should be accompanied by its meaning to achieve its benefits, which is possible using vocabularies, especially RDF vocabularies, that are themselves published as Linked Data and linked with each other, forming an LOV ecosystem.71 Such an ecosystem defines the health and usability of Linked Data by making its meaningful interpretation possible.72 For an ontology or vocabulary to be included in the LOV catalog, it must be of an appropriate size with low-level and normalized constraints and represented in RDFS or the Web Ontology Language (OWL); it must allow the creation of instances and support documentation by permitting comments, labels, definitions, and descriptions to support end users.73 The ontology must have additional characteristics: it should be described in Semantic Web languages like OWL, published on the web with no limitations on its reuse, and support content negotiation using searchable content and namespace URIs.74

The LOV catalog offers four core functionalities that make it attractive for libraries: Aggregate provides access to vocabularies through a dump file or a SPARQL endpoint; Search finds classes and properties in a vocabulary or ontology; Stat displays descriptive statistics of LOV vocabularies; and Suggest enables the registration of new vocabularies.75
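To show what the Search functionality looks like in practice, here is a minimal sketch (Python with requests) that queries the LOV portal's public term-search API for classes matching a keyword. The endpoint path, parameters, and response fields are assumptions based on the API documented on the LOV site and may change.

import requests

# Assumed endpoint of the LOV term-search API (see the LOV portal's
# API documentation); the path and parameters may change over time.
LOV_TERM_SEARCH = "https://lov.linkeddata.es/dataset/lov/api/v2/term/search"

# Look for classes suitable for describing a "book".
params = {"q": "book", "type": "class"}
response = requests.get(LOV_TERM_SEARCH, params=params, timeout=30)
response.raise_for_status()

results = response.json()
for hit in results.get("results", []):
    # Field names follow the documented response format and may differ.
    print(hit.get("prefixedName"), hit.get("uri"))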
Radio and Hanrath uncovered concerns regarding transitioning to LOV, including how pre-existing terms can be mapped while accounting for potential semantic loss.76 They describe this transition in light of a case study at the University of Kansas institutional repository, which adopted OCLC's FAST vocabulary, and they analyze the outcomes and impact of exposing its data as Linked Data. In their view, a vocabulary that is universal in scope and detail can become "bloated" and may result in an aggregated list of uncontrolled terms; however, such a diverse system may be capable of accurately describing the contents of an institutional repository. In this regard, adopting a Linked Data vocabulary may serve to increase the overall quality of data by ensuring consistency, with greater exposure of the resources when published as LOD. Such a transition is not simple, though, and becomes complicated when the process involves reconciling legacy metadata, especially when dealing with issues of under- or misrepresentation.77 Publishers, commercial entities, and data providers such as universities are taking a keen interest and participating in consortia, and therefore the library community must contribute to, benefit from, and take this inevitable opportunity seriously.78 Considering the core role of libraries in connecting people to information, they should come forward to make their descriptive metadata collections available as Linked and Open Data for the benefit of the scholarly community on the web. It is time to move from strings (descriptive bibliographic records) to things (data items) that are connected in a more meaningful manner for consumption by both machines and humans.79

Despite the numerous benefits of LOV, some well-documented [and well-supported] vocabularies are "not published or no longer available."80 While focusing on the mappings between Schema.org and LOV, Nogales et al. argue that the LOV portal is limited because "some of the vocabularies are not available here."81 In other words, the LOV portal is growing but still at an infant stage, where much work is needed to bring in all, or at least the missing, well-documented and well-supported vocabularies. The true benefits of LOV can be exploited to the fullest only when such vocabularies are linked and made available for consumption and reuse by the broader audience and applications of the Web of Data.

CHALLENGES, ISSUES, AND RESEARCH OPPORTUNITIES

To answer RQ02, this section attempts to identify some of the prominent challenges and issues in publishing and consuming bibliographic metadata as Linked and Open Data. The sheer scale and diversity of cataloging frameworks, metadata encoding schemes, and standards make it difficult to approach cataloging effectively and efficiently. The quality of cataloging data is another dimension that needs proper attention.

The Multiplicity of Cataloging Rules and Standards

The importance and critical role of standards in cataloging are clear to everyone. With standards, it becomes possible to identify authors uniquely; link users to the intended and required resources; assess the value and usage of the services a library or information system provides; efficiently operate different transactions involving bibliographic metadata, link content, preserve metadata, and generate reports; and enable the transfer of notifications, data, and events across machines.82 The success of these standards rests on community-based efforts, their utility for persons and organizations, and their ease of adoption.83 However, we are living in a "jungle of standards" of massive scale and complexity.84 We are facing a flood of standards, schemas, protocols, and formats for dealing with bibliographic metadata.85
It is necessary to arrive at a uniform and widely accepted standard, schema, protocol, and format, which would make uniformity between bibliographic records possible and pave the way for record de-duplication on the web. Also, because of the exponential growth of the digital landscape of document collections and the emerging yet widely adopted Linked Data environment, it becomes necessary for librarians to be part of this global-scale activity of making their bibliographic data available as Linked and Open Data.86 Therefore, all these standards need re-envisioning and reconsideration as libraries transition from their current implementations to a more complex LOD-based environment.87

RDA is easy to use, user-centric, and retrieval-supportive, with a precise vocabulary.88 However, it has lengthier descriptions with many technical terms, is time-consuming, requires retraining, and suffers from a generation gap.89 RDA is the transition from AACR2 for producing metadata about knowledge artifacts, and it will be adaptive to the emerging data structures of Linked Data.90 Although librarians could play a vital role in making RDA successful, it is challenging to bring them onto the same page with publishers and vendors.91

While studying BIBFRAME 2.0 from the RDA point of view, Taniguchi observed that:

• BIBFRAME has no class correspondence with RDA; in particular, making a distinction between Work and Expression is challenging.

• Some RDA elements have no corresponding properties in BIBFRAME and therefore cannot be expressed in BIBFRAME. In other cases, BIBFRAME properties cannot be converted back to RDA elements because of the many-to-one and many-to-many mappings between them.

• The availability of multiple MARC21-to-BIBFRAME tools results in a variety of BIBFRAME metadata, which makes matching and merging it at later stages challenging.92

To understand whether BIBFRAME 2.0 is suitable as a metadata schema, Taniguchi examined its domain constraints on properties closely and devised four additional methods for implementing such constraints, i.e., for defining properties in BIBFRAME.93 Method 1 is the strictest; method 2 is the one BIBFRAME itself uses; and the remaining methods gradually loosen the constraints. Method 1 defines the domain of each property as Work or Instance only, following the approach of RDA. Method 2 defines properties using the multiclass structure (work-instance-item) for descriptive metadata. Method 3 introduces a new class, BibRes, to accommodate Work and Instance properties. Method 4 uses two classes, BibRes and Work, to represent a bibliographic resource. Method 5 leaves the domain of every property unspecified and uses rdf:type to indicate whether a resource belongs to Work or Instance. He observed that:

• The multiclass structure used in BIBFRAME (method 2) calls into question the consistency between this structure and the domain definitions of the properties.

• Where the quality of the metadata is the concern, especially matching among metadata converted from different sources, method 1 works better than method 2.

• Where metadata conversion from different sources is required, method 4 or 5 should be applied.94

Taniguchi concludes that BIBFRAME's domain constraint policy is unsuitable for a descriptive metadata schema intended for exchanging and sharing bibliographic resources and should therefore be reconsidered.95
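To make the contrast concrete, the following minimal sketch (Python with rdflib) declares an illustrative title property in two ways: with a strict rdfs:domain of Work, as in method 1, and with no domain at all, typing each resource explicitly with rdf:type, as in method 5. The property and class URIs are hypothetical stand-ins, not actual BIBFRAME terms.

from rdflib import Graph, Literal, Namespace, URIRef
from rdflib.namespace import RDF, RDFS

# Hypothetical namespace standing in for a BIBFRAME-like vocabulary.
EX = Namespace("http://example.org/bibframe-like/")

g = Graph()
g.bind("ex", EX)

# Method 1 (strictest): the property declares rdfs:domain, so any
# resource described with it is inferred to be a Work.
g.add((EX.workTitle, RDF.type, RDF.Property))
g.add((EX.workTitle, RDFS.domain, EX.Work))

# Method 5 (loosest): the property declares no domain; instead each
# resource carries an explicit rdf:type telling consumers what it is.
g.add((EX.title, RDF.type, RDF.Property))

resource = URIRef("http://example.org/resource/1")
g.add((resource, RDF.type, EX.Instance))   # explicit typing
g.add((resource, EX.title, Literal("An example title")))

print(g.serialize(format="turtle"))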
According to Sprochi, bibliographic metadata is passing through a significant transformation.96 FRBR, RDA, and BIBFRAME are the three major ongoing programs that will affect the recording, storage, retrieval, reuse, and sharing of bibliographic metadata. IFLA has focused on reconciling the FRBR, FRAD, and FRSAD models into one model, the FRBR Library Reference Model (FRBR-LRM [https://www.ifla.org/node/10280]), published in May 2016.97 Sprochi further notes that it is generally expected that, in adopting this new model, RDA will be changed and revised significantly. BIBFRAME will also receive substantial modifications to become compatible with FRBR-LRM and the resulting RDA rules.98 These initiatives, on the one hand, make bibliographic metadata visible on the web, but on the other hand introduce several changes and challenges for the library and information science community.99 To cope with the challenges of making bibliographic data visible, available, reusable, and shareable on the web, Sprochi argues that:100

• The library and information science community must think of bibliographic records in terms of data that is both human-readable and machine-understandable and that can be processed across different applications and databases with no format restrictions. This data must also support interoperability among vendors, publishers, users, and libraries, and should therefore be thought of beyond the notion that "only libraries create quality metadata" (as quoted in Coyle (2007) and cited by Sprochi101).

• A shared understanding of the Semantic Web, LOD, data formats, and other related technologies is necessary for the library and information science community to hold more meaningful and fruitful conversations with software developers, integrated library system (ILS) designers, and IT and Linked Data professionals. At least some basic knowledge of these technologies will enable the library community to participate actively in publishing, storing, visualizing, linking, and consuming bibliographic metadata as Linked and Open Data.

• The library community must show ILS vendors a strong commitment to "post-MARC" standards such as BIBFRAME or any other standard that supports the LOD environment. This way, we will be in a better position to exploit Linked Data and the Semantic Web to their fullest.

The library community must be ready to adopt LOD in cataloging. Transitioning from MARC to Linked Data needs collaborative effort and requires addressing several challenges.
These challenges include:

• committing to a single standard by getting all units in the library on board, so that the Big Data problem resulting from different institutions' use of multiple metadata standards can be mitigated;

• bringing individual experts, libraries, universities, and governments together to organize conferences, seminars, and workshops that bring Linked Data into the mainstream;

• translating the BIBFRAME vocabulary into other languages;

• involving different users and experts in the area; and

• obtaining funding from the public sector and other agencies to continue the journey towards Linked Data.102

In the current scenario of metadata practices, interoperability for the exchange of metadata varies across formats.103 The Semantic Web and LOD support different library models such as FRBRoo, EDM, and BIBFRAME. These conceptual models and frameworks suffer from interoperability issues, which make data integration difficult. Currently, several options are available for encoding bibliographic data in RDF (and as LOD), which further complicates interoperability and introduces inconsistency.104 Existing descriptive cataloging methodologies and the bibliographic ontology descriptions in cataloging and metadata standards set the stage for redesigning and developing better approaches to improved information retrieval and interoperability.105 Alongside the massive heaps of information on the web, the library community (especially digital libraries) has devised standards for metadata and bibliographic description to meet the interoperability requirements for this part of the web's data.106 Semantic Web technologies could be exploited to make information presentation, storage, and retrieval more user-friendly for digital libraries.107 To achieve such interoperability among resources, Castro proposed an architecture for semantic bibliographic description.108 Gardašević emphasizes employing information system engineers and developers to understand resource description, discovery, and access processes in libraries and then extending these practices by applying Linked Data principles.109 This way, bibliographic metadata will be more visible, reusable, and shareable on the web.
Godby, Wang, and Mixter stress collaborative efforts to bring a single, universal platform for cataloging rules, encoding schemas, and models to a higher level of maturity, which requires initiatives such as RDA, BIBFRAME, LD4L, and BIBFLOW (https://bibflow.library.ucdavis.edu/about).110 The massive volume of metadata available in MARC and other legacy formats makes data migration to BIBFRAME challenging.111 Although BIBFRAME challenges the conventional ground of cataloging, which aims to record tangible knowledge containers, it is still at an infant stage at both the theoretical and practical levels.112 For BIBFRAME to become more efficient, enhanced, and enriched, it needs the attention of the librarians and information science experts who will use it to encode their bibliographic metadata.113 Gonzales suggests that librarians must be willing to share metadata and upgrade their encoding standards to BIBFRAME; that they should train, learn, and upgrade their systems to use the BIBFRAME encoding scheme efficiently and research new ways of achieving interoperability between BIBFRAME and other legacy metadata standards; and that they should ensure the data security of patrons and mitigate the legal and copyright issues involved in making their resources visible as Linked and Open Data.114 Also, LOV must be exploited from the cataloging perspective by finding ways to create a single, flexible, adaptable, and representative vocabulary. Such a vocabulary would bring together the cataloging data of different libraries of the world and make it accessible and consumable as a single Library Linked Data, freeing us from the jungle of metadata vocabularies [and standards].

Publishing and Consuming Linked Bibliographic Metadata

According to the findings of one survey, there are several primary motives for publishing an institution's [meta]data as Linked Data. These include (in order from most to least frequent):115

• making data visible on the web;
• experimenting with and discovering the potential of publishing datasets as Linked Data;
• exposing local datasets to understand the nature of Linked Data;
• exploring the benefits of Linked Data for Search Engine Optimization (SEO);
• consuming and reusing Linked Data in future projects;
• increasing data reusability and interoperability;
• testing Schema.org and BIBFRAME;
• meeting the requirements of a project; and
• making available "stable, integrated, and normalized data about research activities of an institution."116

The survey also identified several reasons participants gave for consuming such data. These include (in order from most to least frequent):117

• improving the user experience;
• extending local data with other datasets;
• effectively managing internal metadata;
• improving the accuracy and scope of search results;
• trying to improve SEO for local resources;
• understanding the effect of aggregating data from multiple datasets; and
• experimenting with and discovering the potential of consuming Linked Datasets.

Publishing and consuming bibliographic data on the LOD cloud enables numerous applications.
Kalou et al. developed a semantic mashup combining Semantic Web technologies, RESTful services, and a content management system (CMS) to generate personalized book recommendations and publish them as Linked Data.118 It allows for expressive reasoning and efficient management of ontologies and has potential applications in libraries, cataloging services, and the ranking of book records and reviews. This application exemplifies how commercially [and socially] curated metadata can be used with bibliographic descriptions for an improved user experience in digital libraries using Linked Data principles. However, publishing and consuming bibliographic metadata as Linked and Open Data is not simple and requires addressing several prominent challenges and issues, which are identified in the following subsections along with some opportunities for further research.

Publishing Linked Bibliographic Metadata

The University of Illinois Library worked on publishing the MARC21 records of 30,000 digitized books as Linked Library Data by adding links, transforming the records to LOD-friendly semantics (MODS), and deploying them as RDF, with the objective of serving a wider community.119 In their view, using Semantic Web technologies, a book can be linked to related resources and multiple possible contexts, which is an opportunity for libraries to build innovative user-centered services for the dissemination and use of bibliographic metadata.120 The challenge here is to maximally utilize the existing book-related bibliographic and descriptive metadata in a manner that parallels services both inside and outside the library, while exploiting full-text search, Semantic Web technologies, standards, and LOD services to the fullest.121

In publishing national bibliographic information as free, open Linked Data, IFLA identifies several issues, including:122

• dealing with the negative financial impact on the revenue generated from traditional metadata services;
• the inability to offer consistent services due to the complexity of copyright and licensing frameworks;
• confusion in understanding the difference between the terms "open" and "free";
• remodeling library data as Library Linked Data;
• the limited persistence and sustainability of Linked Data resources;
• the steep learning curve in understanding and applying Linked Data practices to library data;
• making choices between sites to link to; and
• creating persistent URIs for library data objects.

From an analysis of the relevant literature, Hallo identified several issues in publishing bibliographic metadata as Linked and Open Data: difficulties in cataloging and migrating data to new conceptual models; the multiplicity of vocabularies for the same metadata; the lack of agreements to share data; the lack of experts and tools for transforming data; the lack of applications and indicators for its consumption; mapping issues; providing useful links to datasets; defining and controlling data ownership; and ensuring dataset quality.123 Libraries should adopt the Linked Data five-star model by publishing their data in emerging non-proprietary formats, linking to external resources and services, and actively participating in enriching and improving the quality of metadata to improve knowledge management and discovery.124
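As a rough illustration of this kind of publishing workflow, the sketch below (Python with rdflib) takes a handful of fields assumed to be already extracted from a MARC21 record and re-expresses them as RDF using Dublin Core terms, a deliberate simplification of the MODS-based pipeline described above; all identifiers and values are illustrative.

from rdflib import Graph, Literal, URIRef
from rdflib.namespace import DCTERMS

# Fields assumed to be already extracted from a MARC21 record
# (e.g., 245 $a for title, 100 $a for author, 260 $c for date).
marc_fields = {
    "title": "An Illustrated History of Printing",
    "author": "Jane Doe",               # illustrative value
    "date": "1897",
    "control_number": "ocm00000001",    # illustrative identifier
}

g = Graph()
g.bind("dcterms", DCTERMS)

# Mint a URI for the book; a real project would use a persistent scheme.
book = URIRef(f"http://example.org/lod/book/{marc_fields['control_number']}")

g.add((book, DCTERMS.title, Literal(marc_fields["title"])))
g.add((book, DCTERMS.creator, Literal(marc_fields["author"])))
g.add((book, DCTERMS.date, Literal(marc_fields["date"])))

# Adding links is what turns the record into Linked Data: here an
# illustrative link to a subject heading at id.loc.gov.
g.add((book, DCTERMS.subject,
       URIRef("http://id.loc.gov/authorities/subjects/sh85106852")))

print(g.serialize(format="turtle"))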
Cataloging has a bright future with more dataset providers: involving citizens and end users in metadata enrichment and annotation, making ranking and recommendation part of library cataloging services, and increasing the participation of the library community in the Semantic Web and Linked Data.125 Publishing Linked Data poses several issues, however. These include data cleanup issues, especially when dealing with legacy data; technical issues such as data ownership; the software maturity needed to keep Linked Data up to date; managing its colossal volume; providing IT support for data entry, annotation, and modeling; developing representative and widely applicable LOVs; and handling the steep learning curve in understanding and applying Linked Data principles.126 Bull and Quimby stress understanding how the library community is transitioning its cataloging methods, systems, standards, and integrations to LOD to make them visible on the web, and how it keeps backward compatibility with legacy bibliographic metadata.127 It is necessary for the LOD data model to maintain the underlying semantics of the existing models, schemas, and standards, yet innovate and renew old traditions; the quality of the conversion depends entirely on the ability of this new model to cope with heterogeneity conflicts and to maintain granularity and semantic attributes, and consequently to prevent loss of data and semantics.128 The new model should be semantically expressive enough to support meaningful and precise linking to other datasets. Viewed differently, these challenges are significant research opportunities that will enable us to be part of the Linked and Open Data community in a more profound manner.

Consuming Linked Bibliographic Metadata

Consuming Linked Data resources can be a daunting task and may involve resolving or mitigating several challenges. These include:129

• dealing with bulky or unavailable RDF dumps, the lack of authority control within RDF dumps, and data format variations;
• identifying terms' specificity levels during concept matching;
• the limited reusability of Library Linked Data due to a lack of contextual data;
• harmonizing classes and objects at the institution level;
• excessive handcrafting due to the scarcity of off-the-shelf visualization tools;
• manual mapping of vocabularies;
• matching, aligning, and disambiguating library data and Linked Data;
• the limited representation of several essential resources as Linked Data due to the non-availability of URIs;
• the lack of sufficiently representative semantics for bibliographic data;
• the time-consuming nature of understanding Linked Data's structure for reuse;
• the ambiguity of terms across languages; and
• the instability of endpoints and outdated datasets.

Syndication is required to make library data visible on the web. It is also necessary to understand how current applications, including web search engines, perceive and treat visibility, to what extent Schema.org matters, and what the nature of the Linked Data cloud is.130

An influential work may be translated into several languages, which results in multiple metadata records; some of these are complete, and others have missing details. Godby and Smith-Yoshimura suggest aggregating these multiple metadata records into a single record, which can be complete, can link the work to its different translations and translators, and is publishable (and consumable) as Linked Data.131
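A sketch of what such linking might look like (Python with rdflib), assuming Schema.org's workTranslation/translationOfWork property pair; the URIs are illustrative, and the modeling is an assumption in the spirit of the suggestion above rather than Godby and Smith-Yoshimura's actual implementation.

from rdflib import Graph, Literal, Namespace, URIRef

SCHEMA = Namespace("http://schema.org/")

g = Graph()
g.bind("schema", SCHEMA)

# Illustrative URIs for an original work and one of its translations.
original = URIRef("http://example.org/work/war-and-peace")
translation = URIRef("http://example.org/work/war-and-peace-en")

g.add((original, SCHEMA.name, Literal("Война и мир", lang="ru")))
g.add((translation, SCHEMA.name, Literal("War and Peace", lang="en")))

# The workTranslation / translationOfWork pair links the records in
# both directions, so a consumer starting from either record can
# discover the other, along with the translator.
g.add((original, SCHEMA.workTranslation, translation))
g.add((translation, SCHEMA.translationOfWork, original))
g.add((translation, SCHEMA.translator, Literal("Constance Garnett")))

print(g.serialize(format="turtle"))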
However, such an aggregation demands a great deal of human effort to make these records visible and consumable as Linked Data. This effort also includes describing all types of objects that libraries currently collect and manage, translating research findings into best practices, and establishing policies for using URIs in MARC and other types of records.132 To achieve the long-term goal of making metadata consumable as Linked Data, libraries as well as individual researchers should align their research with the work of major players such as OCLC, LOC, and IFLA and follow their best practices.133

The issues in LOV need immediate attention to make LOD more useful. These issues include the following:134

• LOV publishes only a subset of RDF vocabularies, with no inclusion of value vocabularies such as SKOS thesauri;
• it provides little or no support for vocabulary authors;
• it relies on third parties for information about vocabulary usage in published datasets;
• it has insufficient support for multilingualism;
• it should support multi-term vocabulary search, which ontology designers need in order to understand and employ the complex relationships among concepts;
• it should support vocabulary matching, vocabulary checking, and multilingualism, allowing users to search and browse vocabularies in their native languages; this also improves the quality of a vocabulary through translation, which allows the community to evaluate and collaborate; and
• efforts are required to improve, and make possible, the long-term preservation of vocabularies.

LOD emerged to change the design and development of metadata, which has implications for controlled vocabularies, especially the Person/Agent vocabularies that are fundamental to data linkage but suffer from problems of metadata maintenance and verification.135
Therefore, practical data management and the metadata-to-triples transition should be studied in detail to make wider adoption of LOD possible.136 To come out of the lab environment and make LOD practically useful, controlled vocabularies must be cleaned and their cost reduced.137 Achieving this is challenging, however, and requires answering how knowledge artifacts can be uniquely identified and labeled across digital collections and what the standard practices for using them should be.138

Linked Data is still new to libraries.139 Technological complexity, the perceived risk of adopting new technology, and systemic, political, and economic limitations are some of the barriers to its use in libraries.140 However, libraries can potentially overcome these barriers by learning from the use of Linked Data in other domains, including Google's Knowledge Graph and Facebook's Open Graph.141 Graph interfaces could be developed to link author, publisher, and book-related information, which in turn can be linked to other open and freely available datasets.142 It is time for library and information science professionals to move past the old, document-centric approach to bibliographic metadata and adopt a more data-centric mindset, enabling more meaningful consumption of bibliographic metadata by both users and machines.143

Quality of Linked Bibliographic Metadata

The use of cataloging data defines its quality.144 Quality is essential for the discovery, usage, provenance, currency, authentication, and administration of metadata.145 Cataloging data, or bibliographic metadata, is considered fit for use based on its accuracy, completeness, logical consistency, provenance, coherence, timeliness, conformance, and accessibility.146 Data is commonly assessed for quality with respect to specific application scenarios and use cases; sometimes even low-quality data can be useful for a particular application, as long as its quality meets that application's requirements.147 Several factors determine the quality of data, including availability, accuracy, believability, completeness, conciseness, consistency, objectivity, relevance, understandability, timeliness, and verifiability.148

The quality of Linked Data is of two types: the inherent quality of the Linked Data itself, and quality relating to its infrastructural aspects. The former can be further divided into aspects including domain, metadata, the RDF model, links among data items, and vocabulary. The infrastructural aspects include the server that hosts the Linked Data, Linked Data fragments, and file servers.149 This typology introduces issues of its own: the issues related to inherent quality include "linking, vocabulary usage and the provision of administrative metadata."150 The infrastructural aspect introduces issues related to naming conventions, which include avoiding blank nodes and using HTTP URIs, linking through owl:sameAs links, describing data by reusing existing terms, and dereferencing.151
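The naming and dereferencing conventions just listed can be checked mechanically. The following minimal sketch (Python with rdflib and requests) loads a graph from an assumed local file, counts blank nodes, and tests whether each HTTP URI used as a subject or object actually dereferences; it is an illustrative checker, not an established tool.

import requests
from rdflib import BNode, Graph, URIRef

g = Graph()
g.parse("catalog.ttl", format="turtle")  # an assumed local data file

uris, blank_nodes = set(), 0
for s, p, o in g:
    for node in (s, o):
        if isinstance(node, BNode):
            blank_nodes += 1            # blank nodes hinder linking
        elif isinstance(node, URIRef) and str(node).startswith("http"):
            uris.add(str(node))

print(f"blank nodes found: {blank_nodes}")

for uri in sorted(uris):
    try:
        # HEAD keeps the check lightweight; follow redirects because
        # many Linked Data services use 303 redirects for real-world things.
        r = requests.head(uri, allow_redirects=True, timeout=10)
        if r.status_code >= 400:
            print(f"not dereferenceable ({r.status_code}): {uri}")
    except requests.RequestException as exc:
        print(f"unreachable: {uri} ({exc})")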
Definitions of quality cataloging are mainly based on the experience and practices of the cataloging community.152 Cataloging quality falls into at least four basic categories: (1) the technical details of the bibliographic records, (2) the cataloging standards, (3) the cataloging process, and (4) the impact of cataloging on the user.153 The cataloging community focuses mainly on the quality of bibliographic metadata. However, it is not sufficient to consider only the accuracy, completeness, and standardization of bibliographic metadata; the information needs of users must also be considered.154 Van Kleeck et al. investigated issues in the quality management of electronic resource metadata, assessing how well it supports the user tasks of finding, selecting, and accessing library holdings, and identifying the potential for increasing efficiencies in acquisition and cataloging workflows.155 They evaluated the quality of existing bibliographic records, mostly provided by their vendors, compared them with those of OCLC, and found that the latter better support users in resource discovery and access.156 From the management perspective, the complexity and volume of bibliographic metadata and the method of ingesting it into the catalog argue for selecting the highest-quality records.157 From the perspective of digital repositories, the absence of well-defined theoretical and operational definitions of metadata quality, interoperability, and consistency is among the issues affecting metadata quality.158 The National Information Standards Organization (NISO) identifies several issues in creating metadata.159 These include inadequate knowledge about cataloging in both manual and automatic environments, leading to inaccurate data entry; inconsistency of subject vocabularies; limitations of resource discovery; and the need for standardized approaches to structuring metadata.160

The poor quality of Linked Data can make it much more difficult to use.161 Datasets are created at the data level, resulting in significant variance in perspectives and underlying data models.162 This also leads to errors in triplification, syntax, and data; misleading owl:sameAs links; and the low availability of SPARQL endpoints.163 Library catalogs, because of their low quality, often fail to communicate clear and correct information to users.164 The reasons for such low quality include the inability to produce catalogs free from faults and duplicates, as well as the weak standards and policies that drive cataloging practices.165 Although rich collections of bibliographic metadata are available, they are rich in terms of heaps of cataloging data rather than quality, with almost no bibliographic control.166
These errors in, and the low quality of, bibliographic metadata are the result of misunderstanding the aims and functions of bibliographic metadata and of adopting "unwise" cataloging standards and policies.167 Still, some high-quality cataloging efforts with well-maintained cataloging records do exist, where the only quality warrant is to correctly understand the subject matter of the artifact and to communicate effectively between librarians and experts in the corresponding domain.168 The demand for such high-quality, well-managed catalogs has increased on the web. Although people are more accustomed to web search engines, quality catalogs will attract not only libraries but general web users as well (when published and consumed as Linked Data).169 The community must work together on metadata with publishers and vendors to approach cataloging from the user perspective, refine the skillset, and produce quality metadata.170 As library and information science professionals, we should not only be users of the standards; we must actively participate in, and contribute to, their development and improvement so that we can effectively and efficiently connect our data with the rest of the world.171 Such collaboration is required not only from librarians and vendors but also from users, to develop an efficient cataloging environment and more usable bibliographic metadata; this is discussed in the next section.

LINKING THE SOCIALLY CURATED METADATA

This section addresses RQ03 by reviewing the state-of-the-art literature from multiple related domains, including library science, information science, information retrieval, and the Semantic Web. The subsection below discusses the importance and possible impact of making socially curated metadata part of the bibliographic, professionally curated metadata. The following subsection highlights why librarians should adopt social collaborative cataloging approaches, working with other stakeholders to make their bibliographic data available and visible as Linked and Open Data, and what the possible impact is of fusing user-generated content with professional metadata and making it available as Linked and Open Data.

The Socially Curated Metadata Matters in Cataloging

Conventional libraries have clear and well-established classification and cataloging schemes, but these are as challenging to learn, understand, and apply as they are slow and painful to consume.172 Using computers to retrieve bibliographic records resulted in the massive use of copy cataloging.173 However, adopting this practice is challenging because these records are inconsistent; incomplete; less visible, granular, and discoverable; unable to integrate metadata and content into the corresponding records; difficult to preserve in new and usable formats for consumption by users and machines; and unsupportive of integrating user-generated content into the cataloging records.174 The University of Illinois Library, through its VuFind service, offers extra features to enhance the search and exploration experience of end users by providing a book's cover image, table of contents, abstracts, reviews, comments, and user tags.175 Users can contribute content such as tags, reviews, and comments, and can recommend books to friends.
However, it is necessary to research whether this user-generated content should be integrated into, or preserved alongside, the bibliographic records.176

In their book, Alemu and Stevens mention several advantages of making user-generated content part of library catalogs.177 These include (i) enhancing the functionality of professionally curated metadata by making information objects findable and discoverable; (ii) removing the limitations posed by the sufficiency and necessity principles of professionally curated metadata; (iii) bringing users closer to the library by "pro-actively engaging" them in rating, tagging, reviewing, and so on, provided that users are also involved in managing and controlling metadata entries; and (iv) producing a "wisdom of the crowd" from this massively growing socially curated metadata that would benefit all stakeholders. However, this combination can only be utilized optimally if we can semantically and contextually link it to internal and external resources; if the resulting metadata is openly accessible, shared, and reused; and if users are supported in easily adding metadata and made part of quality control by being able to report spamming activity to the metadata experts.178

LibraryThing for Libraries (LTFL) makes a library catalog more informative and interactive by enhancing the OPAC, providing access to professional and social metadata, and enabling users to search, browse, and discover library holdings in a more engaging way (https://www.librarything.com/forlibraries). It is one of the practical examples of enriching library catalogs with user-generated content. This trend of merging social and professional metadata innovates library cataloging by dissolving the borders between the "social sphere" and library resources.179 Social media have expanded the library into social spaces, exploiting tags and tag clouds as navigational tools and enriching bibliographic descriptions by integrating user-generated content.180 This bridges the communication gaps between the library and its users, with users actively participating in resource description, discovery, and access.181
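A small sketch of what this enrichment might look like in RDF (Python with rdflib): user tags are attached to a catalog record alongside a professionally assigned heading, so both kinds of metadata travel together. The URIs are illustrative, and schema:keywords is simply one plausible property for free-form tags, not a prescribed modeling.

from rdflib import Graph, Literal, Namespace, URIRef
from rdflib.namespace import DCTERMS

SCHEMA = Namespace("http://schema.org/")

g = Graph()
g.bind("schema", SCHEMA)
g.bind("dcterms", DCTERMS)

record = URIRef("http://example.org/catalog/record/42")  # illustrative

# Professionally curated metadata: a controlled subject heading
# (illustrative id.loc.gov URI).
g.add((record, DCTERMS.subject,
       URIRef("http://id.loc.gov/authorities/subjects/sh85009003")))

# Socially curated metadata: free-form user tags, modeled here as
# schema:keywords literals.
for tag in ["space opera", "favorites", "book club pick"]:
    g.add((record, SCHEMA.keywords, Literal(tag)))

print(g.serialize(format="turtle"))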
The potential role of socially curated metadata in resource description, discovery, and access is also evident from the long-tail Social Book Search research under the Initiative for the Evaluation of XML Retrieval (INEX), where both professionally curated bibliographic metadata and user-generated social metadata are exploited for retrieval and recommendation to support both known-item and exploratory search.182 Experiments with an Amazon/LibraryThing dataset of 2.8 million book records, containing both professional and social metadata, conclude that enriching the professional metadata with social metadata, especially tags, significantly improves search and recommendation.183 Koolen likewise observed that social metadata, especially tags and reviews, significantly improves search performance, as professionally curated metadata is "often too limited" to describe books resourcefully.184 Users add socially curated metadata with the intention of making resources re-findable on a future visit; that is, they add metadata such as tags to help themselves and similar users in resource discovery and access, and they thereby form a community around the resource.185 Clements found user tags (social tagging) beneficial for librarians while browsing and exploring library catalogs.186 To some librarians, tags are complementary to controlled vocabularies; however, training issues and a lack of awareness of social tagging functionality in cataloging interfaces limit their perceived benefit.187

The Socially Curated Metadata as Linked Data

Metadata is socially constructed.188 It shapes, and is shaped by, the context in which it is developed and applied, and it demands community-driven approaches in which data is looked at from a holistic point of view rather than as discrete (individual) semantic units.189 The library world is adopting a collaborative social aspect of cataloging that will take place among authors, repository managers, libraries, e-collection consortia, publishers, and vendors.190 Librarians should improve their cataloging skills in line with advances in technology to expose and make visible their bibliographic metadata as Linked and Open Data.191 Currently, linked library data is generated and used by library professionals. Socially constructed metadata will add value in retrieving knowledge artifacts with precision.192 The addition of socially constructed, community-driven metadata to current metadata structures, controlled vocabularies, and classification systems provides a holistic view of these structures, as it adds community-generated sense to professionally curated metadata structures.193 An example of the possibilities of making user-generated content part of cataloging and Linked Open Data is the semantic book mashup discussed above (see "Publishing and Consuming Linked Bibliographic Metadata"), which demonstrates how commercially [and socially] curated metadata can be retrieved and linked with bibliographic descriptions.194 Enumerating the possible applications of this mashup, its authors argue that book reviews from different websites could be aggregated using Linked Data principles by extending the Review class of BIBFRAME 2.0.195
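A minimal sketch of that idea (Python with rdflib): reviews harvested from different sites are attached to one bibliographic resource. The bf: namespace is BIBFRAME 2.0's, but the Review class and linking property are modeled here in an assumed extension namespace in the spirit of the mashup's proposal, not as established BIBFRAME vocabulary.

from rdflib import Graph, Literal, Namespace, URIRef
from rdflib.namespace import RDF

BF = Namespace("http://id.loc.gov/ontologies/bibframe/")
EX = Namespace("http://example.org/ext/")  # assumed extension namespace

g = Graph()
g.bind("bf", BF)
g.bind("ex", EX)

work = URIRef("http://example.org/work/neuromancer")  # illustrative
g.add((work, RDF.type, BF.Work))

# Reviews aggregated from different (illustrative) sites, each modeled
# as an instance of an assumed Review class in the extension namespace.
for i, (site, text) in enumerate([
    ("http://example.org/site-a", "A landmark of the genre."),
    ("http://example.org/site-b", "Dense but rewarding."),
]):
    review = URIRef(f"http://example.org/review/{i}")
    g.add((review, RDF.type, EX.Review))
    g.add((review, EX.reviewText, Literal(text)))
    g.add((review, EX.source, URIRef(site)))
    g.add((work, EX.hasReview, review))   # assumed linking property

print(g.serialize(format="turtle"))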
From the analysis of twenty-one in-depth interviews with LIS professionals, Alemu discovered four metadata principles, namely metadata enrichment, linkage, openness, and filtering.196 This analysis revealed that the absence of socially curated metadata is suboptimal for the potential of LOD in libraries.197 The analysis advocates a mixed-metadata approach, in which social metadata (tags, ratings, and reviews) augments bibliographic metadata by involving users proactively and by offering a social collaborative cataloging platform. The metadata principles should be reconceptualized, and Linked Data should be exploited to address the existing library metadata challenges. Therefore, current efforts in Linked Data should fully consider social metadata.198 Library catalogs should be enriched by mixing professional and social metadata, as well as semantically and contextually interlinked with internal and external information resources, to be optimally used in different application scenarios.199 To fully exploit this linkage, the duplication of metadata should be reduced. Metadata must be made openly accessible so that sharing, reuse, mixing, and matching become possible. The enriched metadata must be filtered per user requirements, using an interface that is flexible, personalized, contextual, and reconfigurable.200 The analysis suggests a "paradigm shift" in metadata's future: from simple to enriched; from disconnected, invisible, and locked to well-structured, machine-understandable, interconnected, visible, and more visualized metadata; and from a single OPAC interface to reconfigurable and adaptive metadata interfaces.201 By involving users in the metadata curation process, the mixed approach will bring diversity to metadata and make resources discoverable, usable, and user-centric on the wider and well-supported platform of Linked and Open Data.202

In conclusion, the fusion of socially curated metadata with standards-based professional metadata is essential from the perspective of the user-centric paradigm of cataloging; it has the potential to aid resource discovery and access and to open new opportunities for information scientists working in Linked and Open Data, as well as for catalogers who are transitioning to the Web of Data to make their metadata visible, reusable, and linkable to other resources on the web.
From the analysis of twenty-one in-depth interviews with LIS professionals, Alemu derived four metadata principles, namely metadata enrichment, linkage, openness, and filtering.196 This analysis revealed that the absence of socially curated metadata leaves the potential of LOD in libraries under-realized.197 Their analysis advocates a mixed-metadata approach, in which social metadata (tags, ratings, and reviews) augments bibliographic metadata by involving users proactively and by offering a social, collaborative cataloging platform. The metadata principles should be reconceptualized, and Linked Data should be exploited to address existing library metadata challenges. Therefore, current efforts in Linked Data should fully consider social metadata.198 Library catalogs should be enriched by mixing professional and social metadata, and semantically and contextually interlinked to internal and external information resources, so that they can be used optimally in different application scenarios.199 To fully exploit this linkage, duplication of metadata should be reduced, and metadata must be made openly accessible so that it can be shared, reused, mixed, and matched. The enriched metadata must be filtered per user requirements using an interface that is flexible, personalized, contextual, and reconfigurable.200 Their analysis suggests a “paradigm shift” in metadata’s future: from simple to enriched; from disconnected, invisible, and locked to well-structured, machine-understandable, interconnected, visible, and more visualized metadata; and from a single OPAC interface to reconfigurable and adaptive metadata interfaces.201 By involving users in the metadata curation process, the mixed approach will bring diversity to metadata and make resources discoverable, usable, and user-centric on the wider, well-supported platform of Linked and Open Data.202 In conclusion, fusing socially curated metadata with standards-based professional metadata is essential to the user-centric paradigm of cataloging. It has the potential to aid resource discovery and access and to open new opportunities for information scientists working in Linked and Open Data, as well as for catalogers who are transitioning to the Web of Data to make their metadata visible, reusable, and linkable to other resources on the web.

From the analysis and scholarly discussions of Alemu, Stevens, Farnel, and others, as well as from the initial experiments of Kalou et al.,203 it becomes apparent that applying Linked Data principles to library catalogs is promising and future-proof, pointing toward a more user-friendly search and exploration experience with efficient resource description, discovery, access, and recommendation.

CONCLUSIONS

In this paper, we presented a brief yet holistic review of the current state of Linked and Open Data in cataloging. The paper identified the potential of LOD and LOV for making bibliographic descriptions publishable, linkable, and consumable on the web. Several prominent challenges, issues, and future research avenues were identified and discussed. The potential role of socially curated metadata for enriching library catalogs and the collaborative, social aspect of cataloging were highlighted. Some of the notable points include the following:

• Publishing, linking, and consuming bibliographic metadata on the web using Linked Data principles brings several benefits for libraries.204 The library community should improve its skills for this paradigm shift and adopt best practices from other domains.205

• Standards have a key role in cataloging; however, we are living in a “jungle of metadata standards” of varying complexity and scale, which makes them difficult to select, apply, and work with.206 To be part of the global-scale activity of making bibliographic data available on the web as Linked and Open Data, these standards should be reconsidered and re-envisioned.207

• The quality of bibliographic metadata depends on several factors, including accuracy, completeness, logical consistency, provenance, coherence, timeliness, conformance, and accessibility.208 Achieving these characteristics is challenging for several reasons, including cataloging errors; limited bibliographic control; misunderstanding of the role of metadata; and “unwise” cataloging standards and policies.209 To ensure high quality and to make data visible and reusable as Linked Data, the library community should contribute to developing and refining these standards and policies.210

• Metadata is socially constructed and demands community-driven approaches and a social, collaborative style of cataloging that involves authors, repository managers, librarians, digital collection consortia, publishers, vendors, and users.211 This emerging trend is gradually dissolving the borders between the “social sphere” and library resources and bridging the communication gap between libraries and their users: end users contribute to bibliographic descriptions, resulting in a diversity of metadata and making it user-centric and usable.212

• Adopting a “mixed-metadata approach” that treats bibliographic metadata and user-generated content as complementary and essential to each other suggests a “paradigm shift” in metadata’s future: from simple to enriched; from human-readable data silos to machine-understandable, well-structured, and reusable data; from invisible and restricted to visible and open; and from a single OPAC to reconfigurable interfaces on the web.213

Several researchers, including those cited in this article, agree that professionally curated bibliographic metadata mostly supports known-item search and has little value for open-ended, exploratory search and browsing.
They believe that not only are the collaborative, social efforts of the cataloging community essential, but so is socially curated metadata, which can be used to enrich bibliographic metadata and support exploration and serendipity. This is evident not only from the wide usage of LibraryThing and its LTFL but also from the long-running INEX Social Book Search research, where both professionally curated bibliographic metadata and user-generated social metadata are exploited for retrieval and recommendation to support known-item as well as exploratory search.214 Therefore, this aspect should be considered in further research to make cataloging more useful for all stakeholders, including libraries, users, authors, and publishers, and for general consumption as Linked Data on the web.

The current trend of social, collaborative cataloging efforts is essential to fully exploit the potential of Linked Open Data. However, if we look closely, we find four groups (librarians; Linked Data experts; Information Retrieval (IR) and Interactive IR researchers; and users) each going its separate way with minimal collaboration and communication. More specifically, these groups are not benefiting from one another as much as they could, even though closer collaboration could open better possibilities for resource description, discovery, and access. For example, the library community should consider the findings of the INEX SBS track, which have demonstrated that professional and social metadata are essential to each other in facilitating end users in resource discovery and access and in supporting not only known-item search but also exploration and serendipity. The current practices of LibraryThing, LTFL, and the social web in general advocate user-centric cataloging, where users are not only consumers of bibliographic descriptions but also contributors to metadata enrichment. Linked Open Data experts have achieved significant milestones in other domains (e.g., e-government); they should also understand the cataloging and resource discovery and access practices of libraries in order to make bibliographic metadata not only visible as Linked Data on the web but also shareable, reusable, and beneficial to end users. A social, collaborative cataloging approach that actively involves the four groups mentioned above is essential to make bibliographic descriptions more useful, not only for the library community and users but also for consumption on the web as Linked and Open Data. Together we can, and we must.

REFERENCES

1 María Hallo et al., “Current State of Linked Data in Digital Libraries,” Journal of Information Science 42, no. 2 (2016): 117–27, https://doi.org/10.1177/0165551515594729.

2 Tim Berners-Lee, “Design Issues: Linked Data,” W3C, 2006, updated June 18, 2009, accessed November 09, 2018, https://www.w3.org/DesignIssues/LinkedData.html; Hallo, “Current State,” 117.

3 Yuji Tosaka and Jung-ran Park, “RDA: Resource Description & Access—A Survey of the Current State of the Art,” Journal of the American Society for Information Science and Technology 64, no. 4 (2013): 651–62, https://doi.org/10.1002/asi.22825.

4 Hallo, “Current State,” 118; Angela Kroeger, “The Road to BIBFRAME: The Evolution of the Idea of Bibliographic Transition into a Post-MARC Future,” Cataloging & Classification Quarterly 51 (2013): 873–90,
https://doi.org/10.1080/01639374.2013.823584; Martin Doerr et al., “The Europeana Data Model (EDM),” paper presented at the World Library and Information Congress: 76th IFLA General Conference and Assembly, Gothenburg, Sweden, August 10–15, 2010.

5 Getaneh Alemu and Brett Stevens, An Emergent Theory of Digital Library Metadata—Enrich then Filter, 1st ed. (Waltham, MA: Chandos Publishing, Elsevier, 2015).

6 Hallo, “Current State,” 118.

7 Berners-Lee, “Design Issues.”

8 Kim Tallerås, “Quality of Linked Bibliographic Data: The Models, Vocabularies, and Links of Data Sets Published by Four National Libraries,” Journal of Library Metadata 17, no. 2 (2017): 126–55, https://doi.org/10.1080/19386389.2017.1355166.

9 Becky Yoose and Jody Perkins, “The Linked Open Data Landscape in Libraries and Beyond,” Journal of Library Metadata 13, no. 2–3 (2013): 197–211, https://doi.org/10.1080/19386389.2013.826075.

10 Robert Fox, “From Strings to Things,” Digital Library Perspectives 32, no. 1 (2016): 2–6, https://doi.org/10.1108/DLP-10-2015-0020.

11 Stanislava Gardašević, “Semantic Web and Linked (Open) Data Possibilities and Prospects for Libraries,” INFOtheca—Journal of Informatics & Librarianship 14, no. 1 (2013): 26–36, http://infoteka.bg.ac.rs/pdf/Eng/2013-1/INFOTHECA_XIV_1_2014_26-36.pdf.

12 Thomas Baker, Pierre-Yves Vandenbussche, and Bernard Vatant, “Requirements for Vocabulary Preservation and Governance,” Library Hi Tech 31, no. 4 (2013): 657–68, https://doi.org/10.1108/LHT-03-2013-0027.

13 Pierre-Yves Vandenbussche et al., “Linked Open Vocabularies (LOV): A Gateway to Reusable Semantic Vocabularies on the Web,” Semantic Web 8, no. 3 (2017): 437–45, https://doi.org/10.3233/SW-160213.

14 Tosaka, “RDA,” 651, 652.

15 Amanda Sprochi, “Where Are We Headed? Resource Description and Access, Bibliographic Framework, and the Functional Requirements for Bibliographic Records Library Reference Model,” International Information & Library Review 48, no. 2 (2016): 129–36, https://doi.org/10.1080/10572317.2016.1176455.

16 Brighid M. Gonzales, “Linking Libraries to the Web: Linked Data and the Future of the Bibliographic Record,” Information Technology and Libraries 33, no. 4 (2014): 10, https://doi.org/10.6017/ital.v33i4.5631.

17 Shoichi Taniguchi, “Is BIBFRAME 2.0 a Suitable Schema for Exchanging and Sharing Diverse Descriptive Metadata about Bibliographic Resources?,” Cataloging & Classification Quarterly 56, no. 1 (2018): 40–61, https://doi.org/10.1080/01639374.2017.1382643.

18 Shoichi Taniguchi, “BIBFRAME and Its Issues: From the Viewpoint of RDA Metadata,” Journal of Information Processing and Management 58, no. 1 (2015): 20–27, https://doi.org/10.1241/johokanri.58.20.

19 Shoichi Taniguchi, “Examining BIBFRAME 2.0 from the Viewpoint of RDA Metadata Schema,” Cataloging & Classification Quarterly 55, no. 6 (2017): 387–412, https://doi.org/10.1080/01639374.2017.1322161.

20 Nosheen Fayyaz, Irfan Ullah, and Shah Khusro, “On the Current State of Linked Open Data: Issues, Challenges, and Future Directions,” International Journal on Semantic Web and Information Systems (IJSWIS) 14, no. 4 (2018): 110–28, https://doi.org/10.4018/IJSWIS.2018100106.
21 Asim Ullah, Shah Khusro, and Irfan Ullah, “Bibliographic Classification in the Digital Age: Current Trends & Future Directions,” Information Technology and Libraries 36, no. 3 (2017): 48–77, https://doi.org/10.6017/ital.v36i3.8930.

22 Tosaka, “RDA,” 659.

23 Tosaka, “RDA,” 651, 652, 659.

24 Tosaka, “RDA,” 653, 660.

25 The first author used the trial version of RDA Toolkit to report these facts about RDA (https://access.rdatoolkit.org). RDA Toolkit is co-published by the American Library Association (http://www.ala.org), the Canadian Federation of Library Associations (http://cfla-fcab.ca/en/home-page), and Facet Publishing (http://www.facetpublishing.co.uk).

26 IFLA, “IFLA Conceptual Models,” The International Federation of Library Associations and Institutions (IFLA), 2017, updated April 06, 2009, accessed November 12, 2018, https://www.ifla.org/node/2016.

27 Tosaka, “RDA,” 651, 652, 655.

28 Michael John Khoo et al., “Augmenting Dublin Core Digital Library Metadata with Dewey Decimal Classification,” Journal of Documentation 71, no. 5 (2015): 976–98, https://doi.org/10.1108/JD-07-2014-0103; Ulli Waltinger et al., “Hierarchical Classification of OAI Metadata Using the DDC Taxonomy,” in Advanced Language Technologies for Digital Libraries, ed. Raffaella Bernardi, Frederique Segond, and Ilya Zaihrayeu, Lecture Notes in Computer Science (LNCS) (Berlin, Heidelberg: Springer, 2011), 29–40; Aaron Krowne and Martin Halbert, “An Initial Evaluation of Automated Organization for Digital Library Browsing,” paper presented at the 5th ACM/IEEE-CS Joint Conference on Digital Libraries, Denver, CO, USA, June 7–11, 2005; Waltinger, “DDC Taxonomy,” 30.

29 Khoo, “Dublin Core,” 977, 984.

30 LOC, “MARC Standards: MARC21 Formats,” Library of Congress (LOC), 2013, updated March 14, 2013, accessed January 2, 2014, http://www.loc.gov/marc/marcdocz.html.

31 Philip E. Schreur, “Linked Data for Production and the Program for Cooperative Cataloging,” PCC Policy Committee Meeting, 2017, accessed May 18, 2018, https://www.loc.gov/aba/pcc/documents/Facil-Session-2017/PCC_and_LD4P.pdf.

32 Sarah Bull and Amanda Quimby, “A Renaissance in Library Metadata? The Importance of Community Collaboration in a Digital World,” Insights 29, no. 2 (2016): 146–53, http://doi.org/10.1629/uksg.302.

33 Philip E. Schreur, “Linked Data for Production,” PCC Policy Committee Meeting, 2015, accessed November 09, 2018, https://www.loc.gov/aba/pcc/documents/PCC-LD4P.docx.

34 Vandenbussche, “Linked Open Vocabularies,” 437, 438, 450.

35 Hallo, “Current State,” 120.

36 Hallo, “Current State,” 118.

37 Hallo, “Current State,” 120, 124.
38 Hallo, “Current State,” 120, 124.

39 Hallo, “Current State,” 124.

40 Bull, “Community Collaboration,” 147.

41 Sam Gyun Oh, Myongho Yi, and Wonghong Jang, “Deploying Linked Open Vocabulary (LOV) to Enhance Library Linked Data,” Journal of Information Science Theory and Practice 2, no. 2 (2015): 6–15, http://dx.doi.org/10.1633/JISTaP.2015.3.2.1.

42 Carlo Bianchini and Mauro Guerrini, “A Turning Point for Catalogs: Ranganathan’s Possible Point of View,” Cataloging & Classification Quarterly 53, no. 3–4 (2015): 341–51, http://doi.org/10.1080/01639374.2014.968273.

43 Bianchini, “Turning Point,” 350.

44 LOC, “Library of Congress Linked Data Service,” The Library of Congress, accessed March 24, 2018, http://id.loc.gov/about/.

45 LOC, “Linked Data Service.”

46 LOC, “Linked Data Service.”

47 LOC, “Linked Data Service.”

48 LOC, “Linked Data Service.”

49 Margaret E. Dull, “Moving Metadata Forward with BIBFRAME: An Interview with Rebecca Guenther,” Serials Review 42, no. 1 (2016): 65–69, https://doi.org/10.1080/00987913.2016.1141032.

50 LOC, “Overview of the BIBFRAME 2.0 Model,” Library of Congress, April 21, 2016, accessed November 09, 2018, https://www.loc.gov/bibframe/docs/bibframe2-model.html.

51 Taniguchi, “BIBFRAME 2.0,” 388; Taniguchi, “Suitable Schema,” 40.

52 OCLC, “OCLC Linked Data Research,” Online Computer Library Center (OCLC), 2016, https://www.oclc.org/research/themes/data-science/linkeddata.html.

53 OCLC, “Linked Data Research.”

54 Jeff Mister, “Turning Bibliographic Metadata into Actionable Knowledge,” Next Blog—OCLC, February 29, 2016, http://www.oclc.org/blog/main/turning-bibliographic-metadata-into-actionable-knowledge/.

55 Mister, “Turning Bibliographic Metadata.”

56 George Campbell, Karen Coombs, and Hank Sway, “OCLC Linked Data,” OCLC Developer Network, March 26, 2018, https://www.oclc.org/developer/develop/linked-data.en.html.

57 Campbell, “OCLC Linked Data.”

58 Roy Tennant, “Getting Started with Linked Data,” NEXT Blog—OCLC, February 8, 2016, http://www.oclc.org/blog/main/getting-started-with-linked-data-3/.

59 Tennant, “Linked Data.”

60 DBLP, “DBLP Computer Science Bibliography: Frequently Asked Questions,” Digital Bibliography & Library Project (DBLP), updated November 07, 2018, accessed November 08, 2018, http://dblp.uni-trier.de/faq/.
61 DBLP, “Frequently Asked Questions.”

62 Jörg Diederich, Wolf-Tilo Balke, and Uwe Thaden, “Demonstrating the Semantic GrowBag: Automatically Creating Topic Facets for FacetedDBLP,” paper presented at the 7th ACM/IEEE-CS Joint Conference on Digital Libraries, Vancouver, Canada, June 17–22, 2007.

63 Jörg Diederich, Wolf-Tilo Balke, and Uwe Thaden, “About FacetedDBLP,” 2018, accessed November 09, 2018, http://dblp.l3s.de/dblp++.php.

64 Tennant, “Linked Data.”

65 In this section, LOV catalog or portal refers to the LOV platform available at http://lov.okfn.org/dataset/lov/, whereas the abbreviation LOV, when used alone (without the term catalog/portal), refers to Linked Open Vocabularies in general; Vandenbussche, “Linked Open Vocabularies,” 437.

66 Vandenbussche, “Linked Open Vocabularies,” 443, 450.

67 Vandenbussche, “Linked Open Vocabularies,” 437.

68 Vandenbussche, “Linked Open Vocabularies,” 437, 438, 450.

69 Vandenbussche, “Linked Open Vocabularies,” 438.

70 Vandenbussche, “Linked Open Vocabularies,” 437, 438, 443–46.

71 Thomas Baker, Pierre-Yves Vandenbussche, and Bernard Vatant, “Requirements for Vocabulary Preservation and Governance,” Library Hi Tech 31, no. 4 (2013): 657–68, https://doi.org/10.1108/LHT-03-2013-0027.

72 Baker, “Vocabulary Preservation,” 658.

73 Oh, “Deploying,” 9.

74 Oh, “Deploying,” 9.

75 Oh, “Deploying,” 9, 10.

76 Erik Radio and Scott Hanrath, “Measuring the Impact and Effectiveness of Transitioning to a Linked Data Vocabulary,” Journal of Library Metadata 16, no. 2 (2016): 80–94, https://doi.org/10.1080/19386389.2016.1215734.

77 Radio, “Transitioning,” 81.

78 Fox, “Strings to Things,” 2.

79 Fox, “Strings to Things,” 2, 4, 6.

80 Vandenbussche, “Linked Open Vocabularies,” 438.

81 As of April 23, 2018, the Schema.org vocabulary is available at http://lov.okfn.org/dataset/lov/; Alberto Nogales et al., “Linking from Schema.org Microdata to the Web of Linked Data: An Empirical Assessment,” Computer Standards & Interfaces 45 (2016): 90–99, https://doi.org/10.1016/j.csi.2015.12.003.

82 Bull, “Community Collaboration,” 146.

83 Bull, “Community Collaboration,” 146.

84 Bull, “Community Collaboration,” 147.

85 Bull, “Community Collaboration,” 147.

86 Bull, “Community Collaboration,” 147, 148.

87 Schreur, “Linked Data for Production.”

88 Yhna Therese P. Santos, “Resource Description and Access in the Eyes of the Filipino Librarian: Perceived Advantages and Disadvantages,” Journal of Library Metadata 18, no. 1 (2017): 45–56, https://doi.org/10.1080/19386389.2017.1401869.

89 Santos, “Filipino Librarian,” 51–55.

90 Philomena W. Mwaniki, “Envisioning the Future Role of Librarians: Skills, Services and Information Resources,” Library Management 39, no. 1–2 (2018): 2–11, https://doi.org/10.1108/LM-01-2017-0001.

91 Mwaniki, “Envisioning the Future,” 7, 8.

92 Taniguchi, “BIBFRAME 2.0,” 410, 411.

93 Taniguchi, “Suitable Schema,” 52–58.

94 Taniguchi, “Suitable Schema,” 59, 60.

95 Taniguchi, “Suitable Schema,” 60.

96 Sprochi, “Where Are We Headed?,” 129, 134.
97 Sprochi, “Where Are We Headed?,” 129.

98 Sprochi, “Where Are We Headed?,” 134.

99 Sprochi, “Where Are We Headed?,” 134.

100 Sprochi, “Where Are We Headed?,” 134, 135.

101 Sprochi, “Where Are We Headed?,” 134.

102 Caitlin Tillman, Joseph Hafner, and Sharon Farnel, “Forming the Canadian Linked Data Initiative,” paper presented at the 37th International Association of Scientific and Technological University Libraries (IATUL 2016) Conference, Dalhousie University Libraries, Halifax, Nova Scotia, June 5–9, 2016.

103 Carol Jean Godby, Shenghui Wang, and Jeffrey K. Mixter, Library Linked Data in the Cloud: OCLC’s Experiments with New Models of Resource Description, vol. 5, Synthesis Lectures on the Semantic Web: Theory and Technology (San Rafael, CA: Morgan & Claypool Publishers, 2015), https://doi.org/10.2200/S00620ED1V01Y201412WBE012.

104 Sofia Zapounidou, Michalis Sfakakis, and Christos Papatheodorou, “Highlights of Library Data Models in the Era of Linked Open Data,” paper presented at the 7th Metadata and Semantics Research Conference (MTSR 2013), Thessaloniki, Greece, November 19–22, 2013; Timothy W. Cole et al., “Library MARC Records Into Linked Open Data: Challenges and Opportunities,” Journal of Library Metadata 13, no. 2–3 (2013): 163–96, https://doi.org/10.1080/19386389.2013.826074; Kim Tallerås, “From Many Records to One Graph: Heterogeneity Conflicts in the Linked Data Restructuring Cycle,” Information Research 18, no. 3 (2013), paper C18, accessed November 10, 2018.

105 Fabiano Ferreira de Castro, “Functional Requirements for Bibliographic Description in Digital Environments,” Transinformação 28, no. 2 (2016): 223–31, https://doi.org/10.1590/2318-08892016000200008.

106 Castro, “Functional Requirements,” 223, 224.

107 Castro, “Functional Requirements,” 224, 230.

108 Castro, “Functional Requirements,” 223, 228–30.

109 Gardašević, “Possibilities and Prospects,” 35.

110 Godby, OCLC’s Experiments, 112.

111 Gonzales, “The Future,” 17.

112 Karim Tharani, “Linked Data in Libraries: A Case Study of Harvesting and Sharing Bibliographic Metadata with BIBFRAME,” Information Technology and Libraries 34, no. 1 (2015): 5–15, https://doi.org/10.6017/ital.v34i1.5664.

113 Tharani, “Harvesting and Sharing,” 16.

114 Gonzales, “The Future,” 16.

115 Karen Smith-Yoshimura, “Analysis of International Linked Data Survey for Implementers,” D-Lib Magazine, July/August 2016.

116 Smith-Yoshimura, “Analysis.”

117 Smith-Yoshimura, “Analysis.”

118 Aikaterini K. Kalou, Dimitrios A. Koutsomitropoulos, and Georgia D. Solomou, “Combining the Best of Both Worlds: A Semantic Web Book Mashup as a Linked Data Service Over CMS Infrastructure,” Journal of Library Metadata 16, no. 3–4 (2016): 228–49, https://doi.org/10.1080/19386389.2016.1258897.

119 Cole, “MARC,” 163, 165, 175.

120 Cole, “MARC,” 163, 164, 191.

121 Cole, “MARC,” 164, 191.
122 IFLA, “Linked Open Data: Challenges Arising,” The International Federation of Library Associations and Institutions (IFLA), 2014, accessed March 03, 2018, https://www.ifla.org/book/export/html/8548.

123 Hallo, “Current State,” 124.

124 Hallo, “Current State,” 126.

125 Hallo, “Current State,” 124.

126 Karen Smith-Yoshimura, “Linked Data Survey Results 4–Why and What Institutions are Publishing (Updated),” Hanging Together: the OCLC Research blog, September 3, 2014, accessed November 12, 2018, https://hangingtogether.org/?p=4167.

127 Bull, “Community Collaboration,” 148.

128 Tallerås, “One Graph.”

129 Karen Smith-Yoshimura, “Linked Data Survey Results 3–Why and What Institutions are Consuming (Updated),” Hanging Together: the OCLC Research blog, September 1, 2014, accessed November 12, 2018, http://hangingtogether.org/?p=4155.

130 Godby, OCLC’s Experiments, 116.

131 Carol Jean Godby and Karen Smith-Yoshimura, “From Records to Things: Managing the Transition from Legacy Library Metadata to Linked Data,” Bulletin of the Association for Information Science and Technology 43, no. 2 (2017): 18–23, https://doi.org/10.1002/bul2.2017.1720430209.

132 Godby, “From Records to Things,” 23.

133 Godby, “From Records to Things,” 22.

134 Vandenbussche, “Linked Open Vocabularies,” 449, 450.

135 Silvia B. Southwick, Cory K. Lampert, and Richard Southwick, “Preparing Controlled Vocabularies for Linked Data: Benefits and Challenges,” Journal of Library Metadata 15, no. 3–4 (2015): 177–90, https://doi.org/10.1080/19386389.2015.1099983.

136 Southwick, “Controlled Vocabularies,” 177.

137 Southwick, “Controlled Vocabularies,” 189, 190.

138 Southwick, “Controlled Vocabularies,” 183.

139 Robin Hastings, “Feature: Linked Data in Libraries: Status and Future Direction,” Computers in Libraries (magazine article), 2015, http://www.infotoday.com/cilmag/nov15/Hastings--Linked-Data-in-Libraries.shtml.

140 Hastings, “Status and Future.”

141 Hastings, “Status and Future.”

142 Hastings, “Status and Future.”

143 Hastings, “Status and Future.”

144 Tallerås, “National Libraries,” 129 (quoting van Hooland 2009; Wang and Strong 1996).

145 Jung-ran Park, “Metadata Quality in Digital Repositories: A Survey of the Current State of the Art,” Cataloging & Classification Quarterly 47, no. 3–4 (2009): 213–28, https://doi.org/10.1080/01639370902737240.

146 Tallerås, “National Libraries,” 129 (quoting Bruce and Hillmann 2004).

147 Park, “Metadata Quality,” 213, 224; Tallerås, “National Libraries,” 129, 150.

148 Park, “Metadata Quality,” 213, 215, 218–21, 224, 225; Tallerås, “National Libraries,” 141.

149 Tallerås, “National Libraries,” 129.

150 Tallerås, “National Libraries,” 129.

151 Tallerås, “National Libraries,” 129.

152 Karen Snow, “Defining, Assessing, and Rethinking Quality Cataloging,” Cataloging & Classification Quarterly 55, no. 7–8 (2017): 438–55, https://doi.org/10.1080/01639374.2017.1350774.

153 Snow, “Quality Cataloging,” 445.

154 Snow, “Quality Cataloging,” 451, 452.
155 David Van Kleeck et al., “Managing Bibliographic Data Quality for Electronic Resources,” Cataloging & Classification Quarterly 55, no. 7–8 (2017): 560–77, https://doi.org/10.1080/01639374.2017.1350777.

156 Van Kleeck, “Data Quality,” 560, 575, 576.

157 Van Kleeck, “Data Quality,” 575.

158 Park, “Metadata Quality,” 214, 216–18, 225.

159 NISO, A Framework of Guidance for Building Good Digital Collections, ed. NISO Framework Advisory Group, 3rd ed. (Baltimore, MD: National Information Standards Organization, 2007), https://www.niso.org/sites/default/files/2017-08/framework3.pdf.

160 Park, “Metadata Quality,” 214, 215; NISO, Guidance; Jane Barton, Sarah Currier, and Jessie M. N. Hey, “Building Quality Assurance into Metadata Creation: An Analysis Based on the Learning Objects and E-Prints Communities of Practice,” paper presented at the International Conference on Dublin Core and Metadata Applications: Supporting Communities of Discourse and Practice—Metadata Research & Applications, Seattle, Washington, September 28–October 2, 2003.

161 Pascal Hitzler and Krzysztof Janowicz, “Linked Data, Big Data, and the 4th Paradigm,” Semantic Web 4, no. 3 (2013): 233–35, https://doi.org/10.3233/SW-130117.

162 Hitzler, “4th Paradigm,” 234.

163 Hitzler, “4th Paradigm,” 234.

164 Alberto Petrucciani, “Quality of Library Catalogs and Value of (Good) Catalogs,” Cataloging & Classification Quarterly 53, no. 3–4 (2015): 303–13, https://doi.org/10.1080/01639374.2014.1003669.

165 Petrucciani, “Quality,” 303, 305.

166 Petrucciani, “Quality,” 303, 309, 311.

167 Petrucciani, “Quality,” 303, 309.

168 Petrucciani, “Quality,” 309, 310.

169 Petrucciani, “Quality,” 310.

170 Bull, “Community Collaboration,” 147.

171 Bull, “Community Collaboration,” 148.

172 Myung-Ja Han, “New Discovery Services and Library Bibliographic Control,” Library Trends 61, no. 1 (2012): 162–72, https://doi.org/10.1353/lib.2012.0025.

173 Han, “Bibliographic Control,” 162.

174 Han, “Bibliographic Control,” 169–71.

175 Han, “Bibliographic Control,” 163.

176 Han, “Bibliographic Control,” 167–70.

177 Alemu, Emergent Theory, 29–33, 43–65.

178 Alemu, Emergent Theory, 29–65.

179 Lorri Mon, Social Media and Library Services, Synthesis Lectures on Information Concepts, Retrieval, and Services, ed. Gary Marchionini, vol. 40 (San Rafael, CA: Morgan & Claypool Publishers, 2015), https://doi.org/10.2200/S00634ED1V01Y201503ICR040.

180 Mon, Social Media, 50.

181 Mon, Social Media, 24.
182 Marijn Koolen et al., “Overview of the CLEF 2016 Social Book Search Lab,” paper presented at the 7th International Conference of the Cross-Language Evaluation Forum for European Languages, Évora, Portugal, September 5–8, 2016; Koolen et al., “Overview of the CLEF 2015 Social Book Search Lab,” paper presented at the 6th International Conference of the Cross-Language Evaluation Forum for European Languages, Toulouse, France, September 8–11, 2015; Patrice Bellot et al., “Overview of INEX 2014,” paper presented at the International Conference of the Cross-Language Evaluation Forum for European Languages, Sheffield, UK, September 15–18, 2014; Bellot et al., “Overview of INEX 2013,” paper presented at the International Conference of the Cross-Language Evaluation Forum for European Languages, Valencia, Spain, September 23–26, 2013.

183 Bo-Wen Zhang, Xu-Cheng Yin, and Fang Zhou, “A Generic Pseudo Relevance Feedback Framework with Heterogeneous Social Information,” Information Sciences 367–68 (2016): 909–26, https://doi.org/10.1016/j.ins.2016.07.004; Xu-Cheng Yin et al., “ISART: A Generic Framework for Searching Books with Social Information,” PLOS ONE 11, no. 2 (2016): e0148479, https://doi.org/10.1371/journal.pone.0148479; Faten Hamad and Bashar Al-Shboul, “Exploiting Social Media and Tagging for Social Book Search: Simple Query Methods for Retrieval Optimization,” in Social Media Shaping E-Publishing and Academia, ed. Nashrawan Taha et al., 107–17 (Cham: Springer International Publishing, 2017).

184 Marijn Koolen, “User Reviews in the Search Index? That’ll Never Work!,” paper presented at the 36th European Conference on IR Research (ECIR 2014), Amsterdam, The Netherlands, April 13–16, 2014.

185 Alemu, Emergent Theory, 29–33, 43–65.

186 Lucy Clements and Chern Li Liew, “Talking about Tags: An Exploratory Study of Librarians’ Perception and Use of Social Tagging in a Public Library,” The Electronic Library 34, no. 2 (2016): 289–301, https://doi.org/10.1108/EL-12-2014-0216.

187 Clements, “Talking about Tags,” 291, 297–99.

188 Sharon Farnel, “Understanding Community Appropriate Metadata through Bernstein’s Theory of Language Codes,” Journal of Library Metadata 17, no. 1 (2017): 5–18, https://doi.org/10.1080/19386389.2017.1285141.

189 Farnel, “Bernstein’s Theory,” 5, 6.

190 Mwaniki, “Envisioning the Future,” 8.

191 Mwaniki, “Envisioning the Future,” 8, 9.

192 Getaneh Alemu et al., “Toward an Emerging Principle of Linking Socially-Constructed Metadata,” Journal of Library Metadata 14, no. 2 (2014): 103–29, https://doi.org/10.1080/19386389.2014.914775.

193 Farnel, “Bernstein’s Theory,” 15–16.

194 Kalou, “Book Mashup.”

195 Kalou, “Book Mashup,” 242, 243.

196 Alemu, “Socially-Constructed Metadata,” 103, 107.

197 Alemu, “Socially-Constructed Metadata,” 103.

198 Alemu, “Socially-Constructed Metadata,” 103, 104, 120, 121.

199 Getaneh Alemu, “A Theory of Metadata Enriching and Filtering: Challenges and Opportunities to Implementation,” Qualitative and Quantitative Methods in Libraries 5, no. 2 (2017): 311–34, http://www.qqml-journal.net/index.php/qqml/article/view/343.

200 Alemu, “Metadata Enriching and Filtering,” 311.

201 Alemu, “Socially-Constructed Metadata,” 125.

202 Alemu, “Metadata Enriching and Filtering,” 319, 320.
203 Alemu, “Metadata Enriching and Filtering”; Alemu, Emergent Theory; Alemu, “Socially-Constructed Metadata”; Farnel, “Bernstein’s Theory”; Kalou, “Book Mashup.”

204 Hallo, “Current State,” 120.

205 Alemu, “Socially-Constructed Metadata,” 125; Hastings, “Status and Future.”

206 Bull, “Community Collaboration,” 147.

207 Bull, “Community Collaboration,” 152; Schreur, “Linked Data for Production.”

208 Tallerås, “National Libraries,” 129.

209 Petrucciani, “Quality,” 303, 309.

210 Bull, “Community Collaboration,” 147, 152.

211 Farnel, “Bernstein’s Theory,” 5, 6, 12, 13, 15, 16; Mwaniki, “Envisioning the Future,” 8.

212 Mon, Social Media, 3; Alemu, “Metadata Enriching and Filtering,” 320.

213 Alemu, “Socially-Constructed Metadata,” 125.

214 Koolen, “CLEF 2016”; Koolen, “CLEF 2015”; Bellot, “INEX 2014”; Bellot, “INEX 2013.”

10460 ---- Editorial Board Thoughts: Events in the Life of ITAL Sharon Farnel INFORMATION TECHNOLOGY AND LIBRARIES | JUNE 2018 4

Sharon Farnel (sharon.farnel@ualberta.ca) is Metadata Coordinator, University of Alberta Libraries.

At the end of June 2018, I will be ending my time on the ITAL Editorial Board. During my term I have had the opportunity to write several “From the Board” pieces and have very much enjoyed the freedom to explore a library technology topic of choice. This time around I would like to examine ITAL as seen through Crossref’s Event Data service.

Crossref launched its Event Data service in beta in 2017; production service was announced in late March of this year. Event Data is “an open data service that registers online activity (specifically, events) associated with Crossref metadata. Event Data will collect and store a record of any activity surrounding a research work from a defined set of web sources. The data will be made available as part of our metadata search service or via our Metadata API and normalised across a diverse set of sources. Data will be open, audit-able and replicable.”1 Using DOIs as a basis, Event Data captures information on discussions, citations, references, and other actions on Wikipedia, Twitter, and other services.

I thought it might be interesting to see what Crossref Event Data might say about ITAL. I used the Event Data API2 to pull event data using the prefix for all OJS journals hosted by Boston College (10.6017). I then used OpenRefine3 to filter out all non-ITAL records and then began further examining the data. The data was gathered on May 9, 2018. In total, 313 events were captured. Of these, 193 events were from Wikipedia, 110 from Twitter, and 5 each from The Lens (patent citations) and Wordpress blogs.
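The workflow just described (pull events by DOI prefix, then filter to ITAL) can be approximated in a few lines of Python. The sketch below is illustrative rather than a record of the author's actual steps: the mailto address is a placeholder, and the parameter and field names (obj-id.prefix, cursor, obj_id, source_id) follow the publicly documented Event Data API and should be checked against the current version of the service.

# Requires: pip install requests
import requests

API = "https://api.eventdata.crossref.org/v1/events"

def fetch_events(prefix, mailto, rows=1000):
    """Page through all Event Data events whose object DOI has this prefix."""
    cursor = None
    while True:
        params = {"obj-id.prefix": prefix, "mailto": mailto, "rows": rows}
        if cursor:
            params["cursor"] = cursor
        message = requests.get(API, params=params, timeout=60).json()["message"]
        yield from message["events"]
        cursor = message.get("next-cursor")
        if not cursor:
            break

# Pull everything under the Boston College OJS prefix, then keep only
# events whose object DOI points at an ITAL article -- the same
# filtering step performed with OpenRefine above.
events = [e for e in fetch_events("10.6017", "you@example.edu")
          if "/ital." in e.get("obj_id", "")]
print(len(events), "ITAL events,",
      sum(1 for e in events if e.get("source_id") == "wikipedia"),
      "of them from Wikipedia")

Each event record also carries the citing page as its subject and a timestamp, which is what makes per-source and per-article breakdowns like those below possible.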
The 313 events are associated with 38 ITAL articles, the earliest from 1973 (Volume 6, Number 1, from ITAL’s digitized archive) and the most recent from 2018 (Volume 37, Number 1). The greatest number of events (126) is associated with an article from Volume 25, Number 1 (2006) on RFID in libraries.4 The other articles are associated with varying numbers of discrete events, from one to 24.

Looking more closely at the events associated with the 2006 article on RFID, all 126 are references in Wikipedia. These represent references in the English- and Japanese-language Wikipedia articles on radio frequency identification. Other Wikipedia references come from articles on open access, FAST (Faceted Application of Subject Terminology), Library 2.0, Biblioteca 2.0, and others.

What about that article from 1973? It was written by J. J. Dimsdale and titled “File Structure for an On-Line Catalog of One Million Titles.” The abstract provides a tantalizing glimpse into the content: “A description is given of the file organization and design of an on-line catalog suitable for automation of a library of one million books. A method of virtual hash addressing allows rapid search of the indexes to the catalog file. Storage of textual material in a compressed form allows considerable reduction in storage costs.”5

There are only four events associated with this 1973 article, but interestingly all are from The Lens,6 a global patent search database. These are a set of related patents, by Mayers and Whiting, for data compression apparatus and methods.7

There are 110 events associated with Twitter, with tweets from 15 different users. The largest number of events, 21, begins with Aaron Tay,8 a librarian and blogger from Singapore Management University, tweeting about a 2016 ITAL article9 on user expectations of library discovery products, which was then retweeted 20 times. The two next most-tweeted articles (17 tweets/retweets each) discuss privacy and user experience in library discovery10 and “reference rot” in ETD (electronic theses and dissertations) repositories.11

What value can such a brief examination of this small set of data from a very new service provide to ITAL authors or the Editorial Board? It can certainly provide a glimpse of who might be accessing ITAL articles, and how, and perhaps provide some hints as to ways to increase the reach of the journal. This kind of data is not a replacement for download counts or bibliographic citation patterns, but it can complement them and add another layer to our understanding of the place of ITAL in the library technology community and beyond. As ITAL continues to thrive and as services like Event Data continue to improve, I look forward to seeing what story this data continues to tell!

REFERENCES

The Event Data used for this analysis can be found at https://bit.ly/2KgDJcM.

1 Madeleine Watson, “Event Data: Open for Your Interpretation,” Crossref Blog, February 25, 2016, https://www.crossref.org/blog/event-data-open-for-your-interpretation/.

2 Crossref, Event Data User Guide, https://www.eventdata.crossref.org/guide/.

3 OpenRefine, http://openrefine.org/.

4 Jay Singh, Navjit Brar, and Carmen Fong, “The State of RFID Applications in Libraries,” Information Technology and Libraries 25, no. 1 (2006), https://doi.org/10.6017/ital.v25i1.3326.

5 J. J.
Dimsdale, “File Structure for an On-Line Catalog of One Million Titles,” Information Technology and Libraries 6, no. 1 (1973), https://doi.org/10.6017/ital.v6i1.5760.

6 The Lens, https://www.lens.org/.

7 Clay Mayers and Douglas Whiting, Data Compression Apparatus and Method Using Matching String Searching and Huffman Encoding, US Patent 5532694, filed July 7, 1995, and issued July 2, 1996.

8 Aaron Tay, https://twitter.com/aarontay.

9 Irina Trapido, “Library Discovery Products: Discovering User Expectations through Failure Analysis,” Information Technology and Libraries 35, no. 3 (2016), https://doi.org/10.6017/ital.v35i3.9190.

10 Shayna Pekala, “Privacy and User Experience in 21st Century Library Discovery,” Information Technology and Libraries 36, no. 2 (2017), https://doi.org/10.6017/ital.v36i2.9817.

11 Mia Massicotte and Kathleen Botter, “Reference Rot in the Repository: A Case Study of Electronic Theses and Dissertations (ETDs) in an Academic Library,” Information Technology and Libraries 36, no. 1 (2017), https://doi.org/10.6017/ital.v36i1.9598.

10493 ---- President’s Message Andromeda Yelton INFORMATION TECHNOLOGY AND LIBRARIES | JUNE 2018 3 https://doi.org/10.6017/ital.v37i2.10493

Andromeda Yelton (andromeda.yelton@gmail.com) is LITA President 2017-18 and Senior Software Engineer, MIT Libraries, Cambridge, Massachusetts.

As I started planning this column, I looked back over my other columns for the year and discovered that they have a theme: the connection that runs from our past, through our present, and into our future.

In my first column, I talked about the first issues of ITAL: Henriette Avram founding MARC right here in these pages. Early LITA hackers cobbling together the technologies of their age to make streamlined, inventive library services — just as LITA members do today.

In my second column, I talked about conferences where we come together today — LITA Forum 2017 and 2018 — and encounter the issues of today — Data for Black Lives. I can close my eyes and I’m in Denver, chatting with long-time colleagues and first-time presenters ... or I’m at the MIT Media Lab, watching algorithmic opportunity and injustice spar with one another, while artists and poets point us toward the Wakandan imaginary.

And in my third column, I talked about the possibility of LITA, LLAMA, and ALCTS coming together to form a new division: a potential future. This possibility both knocked my world off its axis and let me see it in a new light; I didn’t imagine that I’d spend my presidency exploring the options for large-scale organizational transformation, and yet I can see how this route could not only address challenges all three divisions face, but also give us opportunities to be stronger together. I believe in this roadmap, but I also want us all to grapple with the question of identity. What’s peripheral, and what’s central, to who we are as library technologists? What’s ephemeral, and what endures? What’s the through line we can hold on to, across that past and present, and carry with us into the future?
Today, here in the present, I’m preparing to turn over my piece of that line to President-Elect Bohyun Kim. She has been unfailingly brilliant and diligent in the years I’ve known her, and I know she’ll ask insightful questions, advocate for all that’s best in LITA and its people, and get things done. But I’m also cognizant that it was never really my line; it was yours. I had the immense privilege of carrying it for a while, but as we hear every time we survey our members, the best part of LITA is the networking — it’s you. We will have many chances to discuss our through line in the months to come and I urge you to bring your voices to the table: ask your questions, tell us what matters, and depict your imaginaries.

10494 ---- Information Technology and Libraries at 50: The 1970s in Review Sandra Shores INFORMATION TECHNOLOGY AND LIBRARIES | JUNE 2018 7

Sandra Shores (sandra.shores@ualberta.ca), a member of LITA and the ITAL editorial board, is Senior IT & Facilities Officer, Learning Services, University of Alberta.

What a pleasure it has been to scan through a decade of articles, communications, and news from the ten volumes comprising the 1970s’ Journal of Library Automation, predecessor to Information Technology and Libraries. Despite the open-access availability of several of these volumes, I requested the entire run from our high-density storage library and delighted in the dusty covers between my hands and the musty smells wafting from the paper and ink. At the same time, I rued past practices at this university library of slicing out journal covers and front and back advertising in order to save a few cents in bindery costs. Such a loss of information about a journal’s history!

By the end of the 1970s, I was almost through my undergraduate degree and intended to pursue a library science degree after a gap year, not being able to imagine a more inspiring or satisfying place to build a career than an academic library. Even as an avid library user at that time, I realized that technology was having an increasing impact on operations but, until reviewing these ten years of the journal, did not grasp the tipping point reached. The decade began with tentative language around technology, including uncertainty about whether this phenomenon was best called mechanization or automation. That uncertainty extended to the naming of new things, resulting in words not yet joined, or if joined, then with hyphenation: data base or data-base, key word or key-word, on line or on-line, for example. In the early years, the profession began to imagine the coming together of disparate small library applications into what was referred to as a “total” system; the Integrated Library System, or ILS, had neither been imagined nor articulated. Early concerns in the decade focused on the rising costs of library services alongside the high price of computing. Cost-benefit analysis drove many decisions related to library automation. In a 1970 article conceptualizing an online catalog, Frederick Kilgour claimed that the “productivity of library workers is not continuously increasing as is that of workers in the general economy”1 and concluded that “mechanization, or more specifically, computerization, is the only avenue that extends toward the goal of economic viability.”2 Fortunately, the decade saw considerable advances in computer engineering coinciding with steadily decreasing costs in processing power, data storage, and networking. Nine years later, computer scientist G.
Salton imagined a much more viable future for libraries, “postulating a completely new library design where the shelf arrangement of books and journals would be replaced by a computerized store containing presumably the full text of all library items together with appropriate search methodologies.”3 While several decades would pass before the emergence of Project Gutenberg and other mass book-scanning projects, technology was sufficiently affordable for libraries by the end of the 1970s that the focus shifted to ways of working cooperatively to harness the powerful new opportunities. JoLA attracted articles on library networks, cataloguing cooperatives, union serial lists, robust circulation systems, early interlibrary loan systems, and commercial and not-for-profit database services. Many applications and systems grew out of projects begun at university and other libraries, but the decade also saw the early emergence of vendor solutions. By the final volume of the 1970s, Caryl and Stratton McAllister reported on DOBIS/LIBIS, “an integrated library system with strong authority file control that can be used directly by the library staff and its borrowers.”4 The ILS was born.

A few professional themes that still resonate today stand out from this glance into the past, the first being a shift from valuing ownership of materials to valuing cooperation and resource sharing. Technology combined with an increasing emphasis on standards of description and communication offered new possibilities for regional and national resource sharing, leading many in the profession to acknowledge the futility of trying to build comprehensive collections for their institutions. A number of articles highlight the impact of technology on library personnel, noting that the introduction of automation is disruptive to staff, leaving many feeling unprepared to succeed in their jobs. The roots of evidence-based decision making can be seen in a few articles, for example one describing how newly available data from first-generation circulation systems can inform the acquisition of additional copies of high-demand titles.5 The library user comes more into focus as the decade progresses, in studies about user satisfaction with mediated search and retrieval services, whether run in batch mode or online, and in concept papers imagining systems that support end-user searching. Other articles express concern over the costs of online searching and other computer services, especially as the costs of library materials continued to rise and journals consumed more of the collections budget. Finally, members of the profession understood early that data about use of library materials stored in computer systems needs protection. At the ALA Midwinter meeting in 1973, the Information Science and Automation Division (precursor to LITA) passed a motion urging ALA to develop data privacy policy, noting the “vulnerability of machine-readable files due to the large quantity of data processed.”6

William Mathews provides an excellent article to read in anticipation of library technologies in the 1980s. With considerable prescience in “Advances in Electronic Technologies” he takes the reader through microprocessors, high-performance computing, the pending home computer phenomenon, and new storage and processing technologies.7 By the end of the 1970s, the library world was on the cusp of a technology revolution!
In bringing this review to a close, I would be remiss not to draw attention to the illustrious professionals who assumed editorial roles in the early years of the journal. Frederick Kilgour, Henriette Avram, Verner Clapp, Pauline Atherton, and others not only had extraordinary careers but also set high standards for the journal and the association. Kudos to them for their establishment of JoLA!

1 Frederick G. Kilgour, “Concept of an On-line Computerized Library Catalog,” Journal of Library Automation 3, no. 1 (March 1970): 2, https://doi.org/10.6017/ital.v3i1.5123.

2 Kilgour, “Concept,” 3.

3 G. Salton, “Suggestions for Library Network Design,” Journal of Library Automation 12, no. 1 (March 1979): 39–40.

4 Caryl McAllister and A. Stratton McAllister, “DOBIS/LIBIS: An Integrated, On-Line Library Management System,” Journal of Library Automation 12, no. 4 (December 1979): 300.

5 Robert S. Grant, “Predicting the Need for Multiple Copies of Books,” Journal of Library Automation 4, no. 2 (June 1971): 64–71, https://doi.org/10.6017/ital.v4i2.5583.

6 “Highlight of Minutes: Information Science and Automation Division, Board of Directors Meeting,” Journal of Library Automation 6, no. 1 (March 1973): 57, https://ejournals.bc.edu/ojs/index.php/ital/article/view/5761/5140.

7 William D. Mathews, “Advances in Electronic Technologies,” Journal of Library Automation 11, no. 4 (December 1978): 299–307.

10571 ---- Letter from the Editor Kenneth J. Varnum INFORMATION TECHNOLOGY AND LIBRARIES | JUNE 2018 1

In this June 2018 issue, we continue our celebration of ITAL’s 50th year with a summary by Editorial Board member Sandra Shores of the articles published in the 1970s, the journal’s first full decade of publication. The 1970s are particularly pivotal in library technology, as the decade marks the introduction of the personal computer, as a hobbyist’s tool, to society. The web is still more than a decade away, but the seeds are being planted.

With this issue, we introduce a new look for the journal — thanks to the work of LITA’s Web Coordinating Committee, and in particular Kelly Sattler (also a member of the Editorial Board), Jingjing Wu, and Guy Cicinelli. The new design is much easier on the eyes and more legible, and sports a new graphic identity for ITAL.

BOARD TRANSITIONS

June marks the changing of the Editorial Board. A significant number of Board members’ terms expire this June 30, and I’d like to take this opportunity to thank those departing members for their years of service to Information Technology and Libraries, and the support they have offered me this year as I began as Editor. Each has ably and generously contributed to the journal’s growth over the last years, and I thank them for their service to the journal and to ITAL:

• Mark Cyzyk (Johns Hopkins University)
• Mark Dehmlow (Notre Dame University)
• Sharon Farnel (University of Alberta)
• Kelly Sattler (Michigan State University)
• Sandra Shores (University of Alberta)

These are big shoes to fill, but I am excited about the new members who have been appointed for two-year terms beginning July 1, 2018. In March, we extended a call for volunteers for two-year terms on the Editorial Board.
We received almost 50 applications, and ultimately added seven new members:

• Steven Bowers (Wayne State University)
• Kevin Ford (Art Institute of Chicago)
• Cinthya Ippoliti (Oklahoma State University)
• Ida Joiner (Independent Consultant)
• Breanne Kirsch (University of South Carolina Upstate)
• Michael Sauers (Do Space, Omaha, Nebraska)
• Laurie Willis (San Jose Public Library)

READERSHIP SURVEY SUMMARY

Over the past three months, we ran a survey of the ITAL readership to try to understand a bit more detail about who you are, collectively. The survey received 81 complete responses out of about 11,000 views of pages with the survey link on it. Here are some brief summary results:

• Nearly half (46%) of respondents have attended at least one LITA event (in-person or online).
• Three quarters (75%) of respondents are from academic libraries. Public, Special, and LIS programs make up an additional 20%.
• The majority (56%) are librarians, with the remaining spread across a number of other roles.
• Almost two thirds (63%) of respondents have never been LITA members, a quarter (25%) are current members, and the remainder are former members.
• About four fifths (81%) of responses came from the current issue (either the table of contents or individual articles).

AN INVITATION

What can you share with your library colleagues in relation to technology? If you have interesting research about technology in a library setting, or are looking for a venue to share your case study, get in touch with me at varnum@umich.edu.

Sincerely,
Kenneth J. Varnum, Editor
varnum@umich.edu
June 2018

10574 ---- Taking the Long Way Around: Improving the Display of HathiTrust Records in the Primo Discovery System Jason Alden Bengtson and Jason Coleman INFORMATION TECHNOLOGY AND LIBRARIES | MARCH 2019 27

Jason Bengtson (jbengtson@ksu.edu) is Head of IT Services for Kansas State University Libraries. Jason Coleman (coleman@ksu.edu) is Head of Library User Services for Kansas State University Libraries.

ABSTRACT

As with any shared format for serializing data, Primo’s PNX records have limits on the types of data that they pass along from the source records into the Primo tool. As a result of these limitations, PNX records do not currently have a provision for harvesting and transferring rights information about HathiTrust holdings that the Kansas State University (KSU) Library system indexes through Primo. This created a problem, since Primo was defaulting to indicate that all HathiTrust materials were available to KSU Libraries (K-State Libraries) patrons, when only a limited portion of them actually were. This disconnect was infuriating some library users and creating difficulties for the public services librarians. There was a library-wide discussion about removing HathiTrust holdings from Primo altogether, but it was decided that such a solution was an overreaction. As a consequence, the library IT department began a crash program to attempt to find a solution to the problem. The result was an application called hathiGenius.

INTRODUCTION

Many information professionals will be aware of Primo, the web scale discovery tool provided by Ex Libris.
Web scale discovery services are designed to provide indexing and search user experiences not only for the library's holdings (as with a traditional online public access catalog), but also for many of a library's licensed and open access holdings. Primo offers a variety of useful features for search and discovery, taking in data from manifold sources and serializing them into a common format for indexing within the tool. However, such applications are still relatively young, and the technologies powering them have not fully matured. The combination of this lack of maturity and deliberately closed architecture between vendors leads to several problems for the user. One of the most frustrating is errors in identifying full-text access availability.

As with any shared format for serializing data, Primo's PNX (Primo Normalized XML) records have limits on the types of data they pass from the source records into the Primo tool. As a result of these limitations, PNX records do not currently have a provision for harvesting and transferring rights information about HathiTrust holdings that the K-State Libraries system indexes through Primo. This created a problem in the K-State Libraries' implementation, since Primo was defaulting to indicate that all HathiTrust materials were available to K-State Libraries patrons, when only a limited portion of them actually were. This disconnect was infuriating some library users and creating difficulties for the public services librarians. There was a library-wide discussion about removing HathiTrust holdings from Primo altogether, but it was decided that such a solution was an overreaction. As a consequence, the library IT Services department began a crash program to attempt to find a solution to the problem.

HATHITRUST'S DIGITAL LIBRARY AS A COLLECTION IN PRIMO CENTRAL

HathiTrust was established in 2008 as a collaboration among several research libraries that were interested in preserving digital content. As of the beginning of March 2018, the collaborative's digital library contained more than sixteen million items, approximately 37 percent of which were in the public domain.1 Ex Libris' Primo Central Index (PCI), which serves as Primo's built-in index of articles from various database providers, includes metadata for the vast majority of the items in HathiTrust's digital library, providing inline frames within the original Primo user interface to directly display the full-text content of those items that the library has access to. Libraries subscribing to Primo choose whether or not to make these records available to their users. K-State Libraries, like many other Primo Central clients, elected to activate HathiTrust in its instance of Primo, which it has branded with the name Search It.

The unmodified version of Primo Central identified all records from HathiTrust's digital library as available online, regardless of the actual level of access provided to users. Users who discovered a record for an item from HathiTrust's digital library were presented with a conspicuous message indicating that full text was available and two links named "view it" and "details." An example of the appearance of these search results is shown in figure 1. After clicking the "view it" tab, the center window would display the item's homepage from HathiTrust's digital library inside an iframe.
Public domain items would display the title page of the item and present users with an interface containing numerous visual indicators that they were viewing an ebook (see figure 2 for an example). Items with copyright restrictions would display a message indicating that the item is not available online (see figure 3 for an example).

Figure 1. Two books from HathiTrust as they appeared in Search It prior to implementation of hathiGenius.

Figure 2. HathiTrust result for an item in the public domain.

Figure 3. HathiTrust's homepage for an item that is not in the public domain.

Despite the intentions evident in the design of the Primo interface, availability of HathiTrust records was not being accurately reflected in the list of returns. The size of the indices underlying web scale discovery systems and the number of configurations and settings that must be maintained locally introduce a variety of failure points that can intercede when patrons attempt to access subscribed resources.2 One of the failure points identified by Carter and Traill is inaccurate knowledgebase information. The scope of inaccurate information about HathiTrust items in the Primo Central Index constituted a particularly egregious example of this type of failure.

PATRON REACTION TO MISINFORMATION ABOUT ACCESS TO HATHITRUST

Between the time HathiTrust's digital library was activated in Search It and the time the hathiGenius application was installed, at least thirty patrons contacted K-State Libraries to ask why they were unable to access a book in HathiTrust when Search It had indicated that full text was available for the book. Many of these expressed frustration at frequently encountering this error (for an example, see figure 4).

1:08 26389957777759601093088133 I find it misleading that the Search It function often finds a book I am interested in, but sometimes says it is available online; however, oftentimes it takes me to the Hathi Trust webpage for the book where I am told it is NOT available online. Is this because our library has had to give up their subscription to this service?
1:08 me Hi!
1:09 me That is definitely frustrating - and we are trying to find a way to correct it.
1:10 me It does not have to do with our subscription, but rather the metadata we receive from HathiTrust and its compatibility (or rather, incompatibility) with Search It
1:11 26389957777759601093088133 Okay, so I guess I better ask for the book I am seeking (The Emperor's Mirror) through ILL.
1:11 me That'd probably be your best bet, but let me take a look - one moment
1:14 me Yes, ILL does look best. Please note that the ILL department will be closed after today until January
1:14 26389957777759601093088133 Got it. Thanks. I hope the Hathi Trust issue is resolved soon. (I have seen this problem all semester and finally got so frustrated to ask about it.)
1:15 26389957777759601093088133 Have a Happy holiday!
1:15 me You as well! And yes, I hope we can figure it out ASAP
1:15 me (it's frustrating for us, too!)
1:20 26389957777759601093088133 has left the conversation

Figure 4. Chat transcript revealing frustration with inaccurate information about availability of items in HathiTrust.
STAFF REACTION TO MISINFORMATION ABOUT ACCESS TO HATHITRUST

Reference staff at K-State Libraries use a ticketing system to report electronic resource access problems to a team of librarians who troubleshoot the underlying issues. Shortly after the HathiTrust library was activated in Search It, reference staff submitted several tickets about problems with access to items in that collection. Members of the troubleshooting team responded quickly and informed the reporting librarians that the problem was one beyond their control. This message was slow to reach the entirety of the reference staff and was not always understood as being applicable to the full range of access problems our patrons were experiencing. Samples and Healy note that this type of decentralization and reactive orientation is common in electronic resource troubleshooting.3 Like them, K-State Libraries recognized a need to develop best practices to obviate confusion. We also found ourselves pining for a tool such as that described by Collins and Murray that could automatically verify access for a large set of links.4 The extent of displeasure with the situation was so severe that some librarians stated they were loath to promote Search It to students since several million records were so conspicuously inaccurate.

TECHNICAL CHALLENGES

The K-State Libraries IT department wanted to fix the situation in order to provide accurate expectations to its users, but doing so presented severe technical challenges, the most significant of which stemmed from the lack of rights information in the PNX record in Primo. Without more accurate information on availability, user satisfaction seemed destined to remain low. Research into patron use of discovery layers predicted this unsurprising dissatisfaction. OCLC's 2009 research into what patrons want from discovery systems led the researchers to conclude that "a seamless, easy flow from discovery through delivery is critical to end users. This point may seem obvious, but it is important to remember that for many end users, without the delivery of something he or she wants or needs, discovery alone is a waste of time."5 A later usability study reported: "Some participants spent considerable time looking around for features they hoped or presumed existed that would support their path toward task completion."6

Additionally, the perceived need to customize discovery layers so that they reflect the needs of a particular research library is hardly new, or exclusive to K-State Libraries. The same issue was confronted by catalogers at East Carolina University, as well as catalogers at UNC Chapel Hill.7 Nonetheless, the challenge posed by discovery layers comes with opportunity, as James Madison University discovered when its EBSCO Discovery Service widget netted almost twice the usage of its previous library catalog widget, and as the University of Colorado discovered when it observed users attempting to use the discovery layer search box in "Google-like" ways that could potentially aid discovery layer creators (as well as library IT departments) both in design and in setting expectations.8

As previously noted, Primo's results display is driven by PNX records (see figure 5 for an example). The single most fundamental challenge was finding a way to get to holdings rights information despite that data not being present in the PNX records or, consequently, in the search results that showed up in the presentation layer.
There was no immediate option to create a solution that leveraged "server-side" resources, where the data itself resided and was transformed, since K-State Libraries subscribes to Primo as a hosted service and Ex Libris provided no direct server-side access to K-State Libraries. Some alternative way had to be found to locate the rights data for individual records and populate it into the Primo interface. Upon assessing the situation, the Assistant Director, IT (AD) decided that one potential approach would be to independently query the HathiTrust bibliographic Application Programming Interface (API) for rights information. This approach solved a number of fundamental problems, but also posed its own questions and challenges:

1. Some server-side component would still be needed for part of the query. Where would that live, and how could it be made to communicate with the JavaScript K-State Libraries had injected into its Primo instance?
2. How to best isolate HathiTrust object identifiers from Primo and then use them to launch an API query?
3. How to keep those responses appropriately "pinned" to their corresponding entries on the Primo page?
4. How would the HathiTrust bibliographic API perform under load from Search It queries?

Answering these questions would require significant research into the HathiTrust bibliographic API documentation, and extensive experimentation.

Figure 5. A portion of the PNX record for http://hdl.handle.net/2027/uc1.32106011231518 (the second item shown in figure 1).

BUILDING THE APPLICATION

Of these four questions, the first was easily the most challenging: where would the server-side component live and how would it work? The K-State Libraries IT Services department had, in the past, made a number of significant modifications to the appearance and functionality of the Primo application by adding JavaScript to the static HTML tiles used in the Primo interface. However, generally speaking, JavaScript cannot request data from outside the domain of the web document it occupies. Requesting data from an API across domains requires the mediation of a server-side appliance. The AD constructed one for this purpose using the PHP programming language. This script would serve as an intermediary between the JavaScript in Primo and the HathiTrust API. The appliance accepted data from the Primo JavaScript in the form of HTTP variables (encoded in the URL of the GET request to the PHP appliance), then used those values to query the HathiTrust API.

However, since this server-side appliance did not reside in the same domain as K-State Libraries' Primo instance, the problem of getting the returned API data from the PHP appliance to the JavaScript still remained. This problem was solved by treating the PHP appliance as a JavaScript file for purposes of the application. While JavaScript cannot load data from another domain, a web document may load actual JavaScript files from anywhere on the web. The hathiGenius appliance takes advantage of this fact by calling the PHP appliance programmatically as a JavaScript file, with a JavaScript Object Notation (JSON) version of the identifiers of any HathiTrust entries encoded as part of the URL used to call the file.
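This technique is essentially the JSONP ("JSON with padding") pattern. The following is a minimal illustrative sketch of such a client-side call, not the actual hathiGenius source: the endpoint URL, the data-htid attribute, the .EXLResult selector, and the hathiRights variable are hypothetical stand-ins, and only the EXLResultStatus* class name is taken from the article.

```javascript
// Illustrative JSONP-style sketch; names marked above are hypothetical.

// 1. Harvest each HathiTrust identifier (htid) from the results list, keyed to
//    the DOM id of the entry it came from, so responses stay "pinned" to entries.
var idMap = {};
document.querySelectorAll('.EXLResult[data-htid]').forEach(function (entry) {
  idMap[entry.id] = entry.getAttribute('data-htid');
});

// 2. Call the server-side appliance as though it were a JavaScript file.
//    <script> elements are exempt from the browser's same-origin policy, so
//    this works even though the appliance lives outside the Primo domain.
var script = document.createElement('script');
script.src = 'https://lib.example.edu/hathiGenius.php?ids=' +
  encodeURIComponent(JSON.stringify(idMap));

// 3. The appliance replies with JavaScript that assigns the rights data to a
//    single variable, e.g.: var hathiRights = {"entry42": {"usRightsString": "..."}}
//    Once that "file" has loaded and executed, rewrite each pinned entry.
script.onload = function () {
  Object.keys(window.hathiRights).forEach(function (domId) {
    var entry = document.getElementById(domId);
    if (!entry) return;
    if (!window.hathiRights[domId].usRightsString) {
      // The "softened" negative described below: ask the user to check View It.
      entry.classList.add('EXLResultStatusMaybeAvailable');
    }
  });
};
document.head.appendChild(script);
```

The appeal of the pattern is that it runs entirely in the presentation layer, requiring no server-side access to Primo itself.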
The PHP script runs the queries against the API and returns a JavaScript file consisting of a single variable containing the JSON data encoding the availability information for the HathiTrust entries as supplied from the bibliographic API, essentially appearing to the browser as a standard JavaScript file.

The second and third problems were intrinsically interrelated, and essentially boiled down to finding a unique identifier to use in an API query from the HathiTrust entries. The most effective way to handle these queries was to use the "htid" identifier, which was largely unique to HathiTrust entries, could be easily extracted from any entries that contained it, and would form the basis of the PHP script's request to the HathiTrust RESTful API to obtain rights information. In the process of harvesting the htid, hathiGenius also copies the id of the object in the webpage that serves as the entry in the list of Primo returns containing that htid. As the data is moved back and forth for processing, the htids, and later the resultant JSON data, remain paired to the object id for the entry in the list of returns. When hathiGenius receives the results of the API query, it can then easily rewrite those entries to reflect the rights data it obtained.

The fourth question has been fully answered with time. To this point, well over a year after hathiGenius was activated in production, Library IT has not observed any failure of the API to deliver the requested results in testing, and no issues to that effect have been reported by users. Log data indicates that, even under heavy load, the API is performing to expectations.

FURTHER MODIFICATIONS

Originally, the hathiGenius application supplied definitive states of available or unavailable for each entry. However, some experimentation showed this approach to be less than optimal. Since the bibliographic API cannot be queried by Kansas State University as a specific user, but rather was being queried for general access rights, the possibility still existed for false negatives in the future if Kansas State University's level of access to HathiTrust changed. The data returned from the API queries, when drilled down, consisted only of the usRightsString property from the API, which corresponded to open-access availability and did not account for any additional items that might become available to the library by license in the future. After the application had been active for a short time, to mitigate this potential issue, the "not available" state (consisting of an application of the "EXLResultStatusNotAvailable" class to the HathiTrust entry) was "softened" into an application of the "EXLResultStatusMaybeAvailable" class and verbiage asking users to check the "View It" tab for availability.

A few weeks after deployment, IT received a ticket indicating hathiGenius was failing to work properly. The source of the problem proved to be the detailed bibliographic pages for items in a search results list, which were linked from the search entries. These pages used a different class and object structure than the search results pages in Primo, requiring that an additional module be built into hathiGenius to account for them. Once the new module was added to the application and put into place, the problem was resolved. A second issue presented itself some weeks later, when a few false negatives were reported.
At first, the assistant director assumed that licensing had changed, creating a disparity between the access information from the usRightsString property and the library's actual holdings. However, upon investigation it was clear that hathiGenius was dropping some of the calls to the HathiTrust bibliographic API. The API itself was performing as expected under load, however, and the failure proved to be coming from an unexpected source. The PHP script used by hathiGenius to interface with the API was employing the cURL module, which, in turn, was using its own, less secure certificate to establish a Secure Socket Layer (SSL) connection to the HathiTrust server. Once the script was refactored to employ the simpler file_get_contents function, which relied upon the server's main SSL certificate, the problem was fully resolved.

hathiGenius also had a limited vulnerability to bad actors. While the internal script's destination hardwiring prevented hathiGenius from being used as a generic tool to anonymously query APIs, the library did encounter a situation in which a (probably inadvertently) malicious bot repeatedly pinged the script, causing it to use up system resources until it interrupted other services on the host machine. Modifications were added to the script to provide a simple check that requests originated from Primo. Additionally, restrictions were placed on the script so that excessive resource use would cause it to be intermittently deactivated. While not perfect solutions, these measures have prevented a repeat of the earlier incident.

K-State Libraries has recently finished work on its version of the new Primo User Interface (Primo New UI), which was moved into production this year. The new interface has a completely different client-side structure, requiring a very different approach to integrating hathiGenius.9

APPEARANCE OF HATHITRUST RESULTS IN PRIMO AFTER HATHIGENIUS

When the hathiGenius API does not find a usRights property, we configured Primo to display a yellow dot and the text "Please Check Availability with the View It Tab" (see figure 6 for an example). As noted earlier, we originally considered this preferable to displaying a red dot and the text "Not Available Online," because there might be instances in which the item is actually available in full view through HathiTrust despite the absence of usRights in the record.

Figure 6. Two books for which hathiGenius found no usRights in HathiTrust.

When the hathiGenius API finds usRights, we configured Primo to display a green dot and the text "Available Online" (see figure 7 for an example).

Figure 7. A book for which hathiGenius found usRights.

PATRON RESPONSE

Since the beginning of 2017, the reference staff at K-State Libraries have received no reports of patrons encountering situations in the original user interface in which Primo indicates that full text is available but HathiTrust is only providing a preview. However, a small number of patrons (at least four) expressed confusion at seeing a result in Primo and discovering that the full text was not available. Some of those patrons noted that they saw the text "Please Check Availability with the View It Tab" and inferred that this was meant to state that the full text was available. Others indicated that they never considered that we would include results for books that we do not own.
These responses add to the body of literature documenting user expectations that everything should be available in full text in an online library and that systems should be easy to use.10

INTERNAL RESPONSE

In order to gauge the feelings of K-State Libraries staff who regularly assist patrons with reference questions, the authors crafted a brief survey (included in appendix A). Respondents were asked to indicate whether they had noticed a positive change following implementation of hathiGenius, a negative change, or no change at all. They were also invited to share comments. The survey was distributed to thirty individuals. Twelve (40 percent) of those thirty responded to the survey.

The survey responses indicated a great deal of ambivalence by reference staff toward the change, with four individuals (33 percent) indicating they had not noticed a difference, and another four (33 percent) indicating that they had noticed a difference but that it had not improved the quality of search results. Only two (17 percent) of the respondents revealed that they had noticed an improvement in the quality of the search results. One (8 percent) respondent indicated that they felt that the HathiTrust results had gotten noticeably worse since the introduction of hathiGenius, although they did not elaborate on this in the survey question that invited further comment. The remaining respondent stated that they did not have an opinion.

Four comments were left by respondents, including one that indicated displeasure with the new, softer verbiage for HathiTrust "negatives," and one that claimed that the problem of false positives persisted, despite such feedback not being seen by the authors through any of the statistical modalities currently used for recording reference transactions. One user praised hathiGenius, while another related broad displeasure with the decision to include HathiTrust records in Search It. That individual claimed that almost none of the results from HathiTrust were available and stated that the hope engendered by the presence of the HathiTrust results and the corresponding suggestion to check the View It tab was always dashed, to the detriment of patron satisfaction.

THE NEW UI

As previously mentioned, in late 2018 K-State Libraries adopted the Primo New UI created by Ex Libris. This new user interface was built in Angular and changed many aspects of how hathiGenius had to be integrated into Primo. The K-State Libraries IT department completed a refactoring (reworking application code to change how an application works, but not what it does) of hathiGenius to integrate it with the new UI and released it into production in September 2018. As an interesting aside, the IT department did not initially prioritize the reintegration of hathiGenius, due to the ambivalence of the response to the application evidenced by the survey conducted for this paper. However, shortly after Search It was switched over to the new UI, complaints about the HathiTrust results again displaying inaccurate availability information began to come in to the IT department via both email and tickets from reference staff. As the stridence of the response increased, the project was reprioritized and the work completed.

FUTURE DIRECTIONS

As previously mentioned, hathiGenius currently uses the very rough "usRightsString" property value from the HathiTrust bibliographic API.
However, the API also delivers much more granular rights data for digital objects. A future version of the app may inspect these more granular rights codes and compare them to rights data from K-State Libraries in order to provide more definitive access determinations for HathiTrust results in Primo should the licensing of HathiTrust holdings change. Similarly, since the htid technically resolves only to the volume level, a future version may additionally harvest the HathiTrust record number, which appears to be extractable from the Primo entries.

Based on feedback from the survey, the "soft negative" verbiage used in hathiGenius was replaced with a firmer negative. This decision proved especially sagacious given that, once the early issues with certificates and communication with the HathiTrust bibliographic API were sorted out, the accuracy of the tool seemed to be fully satisfactory. Another problem with the "soft negative" was the fact that it asked users to click on the View It tab, when many users simply chose to ignore the tabs and links in the search results, instead clicking on the article title, as found in a usability study on Primo conducted by the University of Houston Libraries.11

It is also worth noting the one survey respondent who is apparently not seeing an improvement in HathiTrust accuracy. If the continued difficulties they have indicated can be documented and replicated, the IT department can examine those complaints to investigate where the tool may be failing.

DISCUSSION

One interesting feature of this experience is the seeming disconnect between library reference support staff and users in terms of the perception of the efficacy of the tool. This disconnect is all the more curious given the negative reaction displayed by reference support staff when hathiGenius became temporarily unavailable upon introduction of the Primo New UI. Part of this perceived disconnect may be a result of the fact that staff were given a survey instrument, while the reactions of users have been determined largely via null results (a lack of complaints to, or requests for assistance from, service point staff). However, given the dramatic drop in user complaints compared to the ambivalent reaction to the tool by most of the survey respondents, it appears that the staff had a much less enthusiastic response to the intervention than patrons. A few possibilities occur to the authors, including a general dislike for the discovery layer by reference librarians, a general disinclination toward a technological solution by some respondents, or the initial perception by at least part of the reference staff that the problem was not significant. As noted by Fagan et al., the pivot toward discovery layers has not been a comfortable one for many librarians.12 Until further research can be conducted on this, and on reactions to similar customization interventions, these possibilities remain speculation.

One particular feature of note with hathiGenius is the use of what one of the authors refers to as "sideways development" to solve problems that seem to be intractable within a proprietary, or open source, web-based tool. While not a new methodology in and of itself, the author has mainly encountered this type of design in ad hoc creations, rather than as a systematic approach to problem-solving.
Instead of relying upon the capabilities of Primo, this type of customization made its own query to a relevant API and blended that external data with the data available from Primo seamlessly within the application's presentation layer in order to facilitate a solution to a known problem. The solution created in this fashion was portable, and unaffected by most updates to Primo itself. Even the transition to the New UI required changes only to the "hooks" and timing used by the JavaScript, rather than any substantial rewrite of the core engines of the application. This methodology has been used repeatedly by K-State Libraries IT Services to solve problems where other interventions would have necessitated the creation of specialized modules or the rewriting of source code, both of which would be substantially affected by updates to the product itself, and both of which would have been difficult to improve or version without downtime for the affected product. Similar solutions have seen tools independently query an application's database in order to inject the data back into the application's presentation layer, bypassing the core functionality of the application.

CONCLUSION

Reactions at this point from users, and at least some library staff, have been positive. While not a perfect tool, hathiGenius has improved the user experience, removing a point of frustration and an area of disconnect between the library and its users. The application itself is fully replicable by other institutions (as is the general model of sideways development), allowing them to improve the utility of their Primo instances. As with many possible customizations to discovery layers, hathiGenius provides fertile ground for additional work, research, and refinement, as libraries struggle to find the most effective ways to implement discovery tools within their own environments. Beyond hathiGenius itself, the sideways development method provides a powerful tool for libraries to improve the tools they use by integrating additional functionality at the presentation layer. Tackling the problem of inaccurate full-text links in discovery layers is only one application of this approach, but it is an important one. As libraries continue to strive to improve the results and usability of their search offerings, the ability to add local customizations and improvements will be an essential feature for vendors to consider.

APPENDIX A. FEEDBACK SURVEY

Q1 In January 2017, the library began applying a tool (called hathiGenius) to the HathiTrust results in Primo in order to eliminate the problem of "false positives." In other words, Primo would report that all of the HathiTrust results it returned were available online as full text, when many were not. We would like your feedback about the impact of this change from your perspective.

Q2 Which of the following statements best describes your opinion about the impact of hathiGenius?

o I haven't noticed a difference.
o I feel that Search It's presentation of HathiTrust results has become noticeably better since hathiGenius was implemented.
o I feel that Search It's presentation of HathiTrust results has become noticeably worse since hathiGenius was implemented.
o I have noticed a difference, but I feel that Search It's presentation of HathiTrust results is about the same quality as it was before hathiGenius was implemented.
o No opinion.
Q3 Please share any comments you have about hathiGenius or any ideas you have for improving the display of HathiTrust's records in Search It.

REFERENCES

1 HathiTrust Digital Library, "Welcome to HathiTrust!" accessed March 4, 2018, https://www.hathitrust.org/about.

2 Sunshine Carter and Stacie Traill, "Essential Skills and Knowledge for Troubleshooting E-Resources Access Issues in a Web-Scale Discovery Environment," Journal of Electronic Resources Librarianship 29, no. 1 (2017): 7, https://doi.org/10.1080/1941126X.2017.1270096.

3 Jacquie Samples and Ciara Healy, "Making It Look Easy: Maintaining the Magic of Access," Serials Review 40, no. 2 (2014): 114, https://doi.org/10.1080/00987913.2014.929483.

4 Maria Collins and William T. Murray, "SEESAU: University of Georgia's Electronic Journal Verification System," Serials Review 35, no. 2 (2009): 80, https://doi.org/10.1080/00987913.2009.10765216.

5 Karen Calhoun, Diane Cellentani, and OCLC, eds., Online Catalogs: What Users and Librarians Want: An OCLC Report (Dublin, Ohio: OCLC, 2009): 20, https://www.oclc.org/content/dam/oclc/reports/onlinecatalogs/fullreport.pdf.

6 Rice Majors, "Comparative User Experiences of Next-Generation Catalogue Interfaces," Library Trends 61, no. 1 (Summer 2012): 191, https://scholarcommons.scu.edu/cgi/viewcontent.cgi?article=1132&context=library.

7 Marlena Barber, Christopher Holden, and Janet L. Mayo, "Customizing an Open Source Discovery Layer at East Carolina University Libraries: The Cataloger's Role in Developing a Replacement for a Traditional Online Catalog," Library Resources & Technical Services 60, no. 3 (July 2016): 184, https://journals.ala.org/index.php/lrts/article/view/6039; Benjamin Pennell and Jill Sexton, "Implementing a Real-Time Suggestion Service in a Library Discovery Layer," Code4Lib Journal, no. 10 (June 2010): 5, https://journal.code4lib.org/articles/3022.

8 Jody Condit Fagan et al., "Usability Test Results for a Discovery Tool in an Academic Library," Information Technology and Libraries 31, no. 1 (March 2012): 99, https://doi.org/10.6017/ital.v31i1.1855.

9 Dan Moore and Nathan Mealey, "Consortial-Based Customizations for New Primo UI," Code4Lib Journal, no. 34 (October 25, 2016), http://journal.code4lib.org/articles/11948.

10 Lesley M. Moyo, "Electronic Libraries and the Emergence of New Service Paradigms," The Electronic Library 22, no. 3 (2004): 221, https://www.emeraldinsight.com/doi/full/10.1108/02640470410541615.

11 Kelsey Brett, Ashley Lierman, and Cherie Turner, "Lessons Learned: A Primo Usability Study," Information Technology and Libraries 35, no. 1 (March 2016): 20, https://ejournals.bc.edu/ojs/index.php/ital/article/view/8965.

12 Fagan et al., "Usability Test Results for a Discovery Tool in an Academic Library," 84.

The Map as a Search Box: Using Linked Data to Create a Geographic Discovery System
Gabriel McKee

Gabriel McKee (gm95@nyu.edu) is Librarian for Collections and Services at the Institute for the Study of the Ancient World at New York University.

ABSTRACT

This article describes a bibliographic mapping project recently undertaken at the Library of the Institute for the Study of the Ancient World (ISAW). The MARC Advisory Committee recently approved an update to MARC that enables the use of dereferenceable Uniform Resource Identifiers (URIs) in MARC subfield $0.
The ISAW Library has taken advantage of MARC's new openness to URIs, using identifiers from the linked data gazetteer Pleiades in MARC records and using this metadata to create maps representing our library's holdings. By populating our MARC records with URIs from Pleiades, an online, linked open data (LOD) gazetteer of the ancient world, we are able to create maps of the geographic metadata in our library's catalog. This article describes the background, procedures, and potential future directions for this collection-mapping project.

INTRODUCTION

Since the concept of the Semantic Web was first articulated in 2001, libraries have faced the challenge of converting their vast stores of metadata into linked data.1 Though BIBFRAME, the planned replacement for the MARC (MAchine-Readable Cataloging) systems that most American libraries have been using since the 1970s, is based on linked-data principles, it is unlikely to be implemented widely for several years. As a result, many libraries have delayed creating linked data within the existing MARC framework. One reason for this delay has been the absence of a clear consensus in the cataloging community about the best method to incorporate Uniform Resource Identifiers (URIs), the key building block of linked data, into MARC records.2 But recent developments have added clarity to how URIs can be used in MARC, clearing a path for projects that draw on URIs in library metadata. This paper describes one such project undertaken by the Library of the Institute for the Study of the Ancient World (ISAW) that draws on URIs from the linked-data gazetteer Pleiades to create maps of items in the library's collection.

A BRIEF HISTORY OF URIS IN MARC

Over the last decade, the path to using URIs in MARC records has become clearer. This process began in 2007, when the Deutsche Nationalbibliothek submitted a proposal to expand the use of a particular MARC subfield, $0 (also called "dollar-zero" or "subfield zero"), to contain control numbers for related authority records in Main Entry, Subject Access, and Added Entry fields.3 The proposal, which was approved on July 13, 2007, called for these control numbers to be recorded with a particular syntax: "the MARC organization code (in parentheses) followed immediately by the number, e.g., (CaBVaU)2835210335."4 This MARC-specific syntax is usable within the MARC environment, but is not actionable for linked-data purposes. A dereferenceable URI—that is, an identifier beginning with "http://" that links directly to an online resource or a descriptive representation of a person, object, or concept—could be reconstructed from this syntax, but only with a significant amount of human intervention and a high likelihood of error.5 In 2010, following a proposal from the British Library, $0 was redefined to allow types of identifiers other than authority record numbers, in particular International Standard Name Identifiers (ISNI), using this same parenthetical-prefix syntax.6 That same year, the RDA/MARC Working Group issued a discussion paper proposing the use of URIs in $0, but no proposal regarding the matter was approved at that time.7 The 2010 redefinition made it possible to place URIs in $0, provided they were preceded by the parenthetical prefix "(uri)". However, this requirement of an added character string put MARC practice at odds with the typical practices of the Linked Data community.
Not only does the addition of a prefix create the need for additional parsing before the URI can be used, but the prefix is also redundant, since dereferenceable URIs are self-identifying. In 2015, the Program for Cooperative Cataloging (PCC) charged a task group with examining the challenges and opportunities for the use of URIs within a MARC environment.8 One of this group's first accomplishments was submitting a proposal to the MARC Advisory Committee to discontinue the requirement of the "(uri)" prefix on URIs.9 Though this change appears minor, it represents a significant step forward in the gradual process of converting MARC metadata to linked data. Linked data applications require dereferenceable URIs. The requirement of either converting an HTTP URI to a number string (as $0 required from 2007 to 2010) or prefixing it with a parenthetical prefix produced identifiers that did not meet the definition of dereferenceability. As Shieh and Reese explain, the MARC syntax in place prior to this redefinition was at odds with the practices used by Semantic Web services:

The use of qualifiers, rather than actionable URIs, requires those interested in utilizing library metadata to become domain experts and become familiar with the wide range of standards and vocabularies utilized within the library metadata space. The qualifiers force human interaction, whereas dereferenceable URIs are more intuitive for machines to process, to query services, to self-describe—a truly automated processing and a wholesome integration of Web services.10

Though it has been possible to use prefixed URIs in MARC for several years, few libraries have done so, in part because of this requirement for human intervention and in part because of the scarcity of use cases that justified their use. The removal of the prefix requirement brings MARC's use of URIs more into line with that of other Semantic Web services, and will reduce system fragility and enhance forward-compatibility with developing products, projects, and services. Though MARC library catalogs still struggle with external interoperability, the capability of inserting unaltered, dereferenceable URIs into MARC records is potentially transformative.11 The approval of the PCC Task Group on URI in MARC's 2016 proposal makes it easier to work with limited linked data applications directly within MARC, rather than waiting for the implementation of BIBFRAME. By inserting actionable URIs directly into MARC records, libraries can begin developing programs, tools, and projects that draw on these URIs for any number of data outcomes.

In the last two years, the ISAW Library has taken advantage of MARC's new openness to URIs to create one such outcome: a bibliographic mapping project that creates browseable maps of items held by the library. The ISAW Library holds approximately 50,000 volumes in its print collection, chiefly focusing on the archaeology, history, and philology of Asia, Europe, and North Africa from the beginning of agriculture through the dawn of the medieval period, with a focus on cultural interconnections and interdisciplinary approaches to antiquity. The Institute, founded in 2007, is affiliated with New York University (NYU), and its library holdings are cataloged within Bobcat, the NYU OPAC.
By populating our MARC records with URIs from Pleiades, an online, linked open data (LOD) gazetteer of the ancient world, the ISAW Library is able to create maps of the geographic metadata in our library's catalog. At the moment, this process is indirect and requires periodic human intervention, but we are working on ways of introducing greater automation, as well as expanding beyond small sets of data to a larger map encompassing as much of our library's holdings as it makes sense to represent geographically.

MAP-BASED SEARCHING FOR ANCIENT SUBJECTS

In the disciplines of history and archaeology, geography is of vital importance. Much of what we know about the past can be tied to particular locations: archaeological sites, ancient structures, and find-spots for caches of papyri and cuneiform tablets provide the spatial context for the cultures about which they inform us. But while geospatial data about antiquity can be extremely precise, the text-based searching that is the user's primary means of accessing library materials is much less so. Standards for geographic metadata focus on place names, which open the door for greater ambiguity, as Buckland et al. explain:

There is a basic distinction between place, a cultural concept, and space, a physical concept. Cultural discourse tends to be about places rather than spaces and, being cultural and linguistic, place names tend to be multiple, ambiguous, and unstable. Indeed, the places themselves are unstable. Cities expand, absorbing neighboring places, and countries change both names and boundaries.12

Nowhere is this instability of places and their names so clear as in the fields of ancient history and archaeology, which often require awareness of cultural changes in a single location throughout the longue durée. And yet researchers in these fields have had to rely on library search interfaces that depend entirely on toponyms for accessing research materials. Scholars in these disciplines, and many others besides, would be well served by a method of discovering research materials that relies not on keywords or controlled vocabularies, but on geographic location.

Library of Congress classification and subject cataloging tend to provide greater granularity for political developments in the modern era, presenting a challenge to students of ancient history. A scholar of the ancient Caucasus, for example, is likely to be interested in materials that are currently classified under the History classes for the historical region of Greater Armenia (DS161-199), the modern countries of Armenia (DK680), Azerbaijan (DK69X), Georgia (DK67X), Russia (DK5XX), Ukraine (DK508), and Turkey (DS51, DS155-156, and DR401-741); for pre- and proto-historic periods, materials may be classified in GN700-890; and texts in ancient languages of the Caucasus will fall into the PK8000-9000 range. Moreover, an effective catalog search may require familiarity with the romanization schemes for Georgian, Armenian, Russian, and Ukrainian. Materials on the ancient Caucasus fall into a dozen or more call number ranges, and there is no single term within the Library of Congress Subject Headings (LCSH) that connects them—but if their subjects were represented on a map, they would fall within a polygon only a few hundred miles long on each side. This geophysical collocation of materials from across many classes of knowledge can enable unexpected discoveries. As Bidney and Clair point out, "Organizing
As Bidney and Clair point out, “Organizing INFORMATION TECHNOLOGY AND LIBRARIES | MARCH 2019 43 information based on location is a powerful idea—it has the capacity to bring together information from diverse communities of practice that a research may never have considered . . . ‘Place’ is interdisciplinary.”13 With this in mind, the ISAW Library has set out to create an alternative method of accessing items in its collection: a browseable, map-based interface for the discovery of library materials. LITERATURE REVIEW Though geographic searching is undoubtedly useful for many different types of content, much of the work in using coordinate data and map-based representations of resources has centered on searching for printed maps and, more recently, geospatial datasets. In an article published in 2007, Buckland et al. issued a challenge to libraries to complement existing text-string toponymic terminology with coordinate data.14 Perhaps unsurprisingly, the most progress in meeting this challenge has been made in the area of cartographic collections. In a 2010 article, Bidney discussed the Library of Congress’s then-new requirement of coordinates in records describing maps, and explores the possibility of using this metadata to create a geographic search interface.15 A 2014 follow-up article by Bidney and Clair expanded this argument to include not just cartographic materials, but all library resources, challenging libraries to develop new interfaces to make use of geospatial data.16 The most advanced geospatial search interfaces have been developed for cartographic and geospatial data. For example, GeoBlacklight (http://geoblacklight.org) offers an excellent map-based interface, but it is intended primarily for cartographic and GIS data specifically, and not library resources more broadly. The mapFAST project described by Bennett et al. in 2011 pursues goals similar to our Pleiades- based discovery system.17 Using FAST (Faceted Application of Subject Terminology) headings, which are already present in many MARC records, this project creates a searchable map via the Google Maps API. Each FAST geographic heading creates a point on the map which, when clicked, brings the user to a precomposed search in the library catalog for the corresponding controlled subject heading. One limitation to the mapFAST model is the absence of geographic coordinates on many of the LC authority records from which FAST headings are derived: at the time that Bennett et al. described the project, coordinates were available for only 62.5 percent of FAST geographic headings; additional coordinates came from the Geonames database (http://www.geonames.org/).18 Moreover, the method of retrieving these coordinates is based on text string matching, which introduces the possibility of errors resulting from the lack of coordination between toponyms in FAST and Geonames. In exploring other mapping projects, we looked most closely at projects with a focus on the ancient world, including Pelagios (http://commons.pelagios.org), its geographic search tool Peripleo,19 and China Historical GIS (CHGIS, http://sites.fas.harvard.edu/~chgis). As described by Simon et al. 
in 2016, Pelagios offers a shared framework for researchers in classical history to explore geographic connections, and several applications of its data resemble our desired outcome.20 Similarly, Merrick Lex Berman's work with the API provided by China Historical GIS in connection with library metadata provided important guidelines and points of comparison.21 We also explored mapping projects outside of the context of antiquity, including MapHappy, the Biodiversity Heritage Library's map of LCSH headings, and the map interface developed for PhillyHistory.org.22

FIRST STEPS: METADATA

To develop a system for mapping the ISAW Library's collection, we began by working with smaller sets of metadata. Our initial collection map, which served as a proof of concept, represented the titles available in the Ancient World Digital Library (AWDL, http://dlib.nyu.edu/ancientworld), an online e-book reader created by the ISAW Library in collaboration with NYU's Digital Library Technical Services department. When we initially created this interface, called the AWDL Atlas, AWDL contained a small, manageable set of about one hundred titles. Working in a spreadsheet, we assigned geographic coordinates to each of these titles and mapped them using Google Fusion Tables (https://fusiontables.google.com). Fusion Tables, launched by Google in June 2009, is a cloud-based platform for data management that includes a number of visualization tools, including a mapping feature that builds on the infrastructure of Google Maps.23 The Fusion Tables map created for AWDL shows a pinpoint for each title in the e-book library; when clicked, each pinpoint gives basic bibliographic data about the title and a link to the e-book itself. One problem with this initial map was that it did little to show precision—a pinpoint representing a specific archaeological site in Iraq looks the same on the map as a pinpoint representing the entirety of Central Asia. Nevertheless, the basic functionality of the AWDL Atlas performed as desired, providing a geographic search interface for a concrete set of resources.

For our next collection map, we turned our attention to our monthly lists of new titles in our library's collection. At the end of each month, NYU's Library Systems team sends our library a MARC-XML report listing all of the items added to our library's collection that month. For several years now, we have been publishing this data on our library's website in human-readable HTML form and adding the titles to a library in the open-source citation management platform Zotero, allowing our users multiple pathways to discovering resources within our collection.24 Beginning in August 2016, we began creating monthly maps of these titles, using a variation of the workflow that we devised for the AWDL Atlas. To better represent the different levels of precision that each point represents, we implemented a color-coded range of four levels of precision, from site-specific archaeological publications to materials covering a broad, multi-country range, with a fifth category for cross-cultural materials and other works that can't be well represented in geographic form. (These items are grouped in the Mediterranean Sea on the monthly new titles maps, but in a full-collection map they would most likely be either excluded or represented by multiple points, as appropriate.) The initial New Titles maps took a significant amount of title-by-title work to create.
Coordinates and assessments of precision needed to be assigned for each title individually. We quickly began looking for ways to automate the process of geolocation, and soon settled on using data from Pleiades to increase the efficiency of creating each map.25 We set our sights on MARC field 651 (Subject Added Entry-Geographic Name) as the best place in a MARC record to put Pleiades data. As a subject access field, the 651 is structured to contain a searchable text string and can also include a $0 with a URI associated with that text string. However, under current cataloging guidelines, catalogers are not free to use any URI they choose in this field: the Library of Congress maintains a list of authorized sources for subject terms to be used in 651 and other subject-access fields.26 In August 2016, the ISAW Library submitted a proposal to the Library of Congress for Pleiades to be approved as a source of authoritative subject data and added to LC's list of Subject Heading and Term Source Codes. The proposal was approved the following month, and by early 2017 the LC-assigned code was approved for use in OCLC records. With this approval in place, we began incorporating Pleiades URIs in MARC records for items held by the ISAW Library. We used the names of Pleiades resources as subject terms in new 651 (Subject Added Entry-Geographic Name) fields, specifying Pleiades as the source of the subject term in subfield $2 and adding the Pleiades URI in a $0:

Figure 1. Fields from a MARC record showing an LCNAF geographic heading and the corresponding Pleiades heading, with URI in $0.

Figure 1 shows a detail from OCLC record #986242751, which describes a book containing texts from cuneiform tablets discovered at the Hittite capital city Hattusa. This detail shows both the LCNAF and Pleiades geographic headings assigned to this record. (In addition to providing a URI for the site, the Pleiades heading also enhances keyword searches: the 651 field is searchable in the NYU library catalog, thus providing keyword access to one of the city's ancient names.) The second 651 field contains a second indicator 7, indicating that the source of the subject term is specified in $2, where the LC-approved code "pleiades" is specified. This is followed by a $0 containing the URI for the Pleiades place resource describing Hattusa.

Our monthly reports of new titles now contain a field for Pleiades URIs. Currently, we are not querying Pleiades directly for coordinates, but rather are using the URI as a vertical-lookup term within a spreadsheet of each month's new titles, which is checked against a separate data file that matches Pleiades URIs to coordinate pairs.27 For places where no Pleiades heading is available, we have begun using URIs from the Getty Thesaurus of Geographic Names (TGN), MARC-syntax FAST identifiers, and unique LCNAF text strings, using the same vertical-lookup process to retrieve previously researched coordinate pairs for those places. Next, we retrieve coordinates for newly appearing Pleiades locations, research the locations of new non-Pleiades places, and add both to the local database of places used. Lastly, due to Fusion Tables' inability to display more than one item on a single coordinate pair, prior to uploading the map data to Fusion Tables we examine it for duplicated coordinate pairs, manually altering them to scatter these points to nearby locations.
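The vertical lookup and the scattering of duplicate points amount to a simple join followed by a jitter. A minimal sketch of the same logic in JavaScript follows; the field names and the placeholder URI are hypothetical illustrations, not our actual spreadsheet columns:

```javascript
// Illustrative sketch; field names and the placeholder URI are hypothetical.

// Local data file: geographic URI (Pleiades, TGN, FAST, or LCNAF) -> coordinates.
const coordinatesByUri = {
  'https://pleiades.stoa.org/places/000000': { lat: 40.0, lon: 34.6 }, // placeholder
};

// One row per new title, carrying the URI recorded in the MARC 651 $0.
const newTitles = [
  { title: 'An Example Site Report', uri: 'https://pleiades.stoa.org/places/000000' },
];

// The "vertical lookup": join titles to previously researched coordinate pairs.
const mapRows = newTitles
  .filter((row) => coordinatesByUri[row.uri])
  .map((row) => ({ ...row, ...coordinatesByUri[row.uri] }));

// Fusion Tables shows only one item per coordinate pair, so nudge duplicates
// to nearby points, automating the manual scattering described above.
const seen = new Set();
for (const row of mapRows) {
  let key = row.lat + ',' + row.lon;
  while (seen.has(key)) {
    row.lat += 0.02 * (Math.random() - 0.5); // roughly a kilometer of jitter
    row.lon += 0.02 * (Math.random() - 0.5);
    key = row.lat + ',' + row.lon;
  }
  seen.add(key);
}
```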
The overall amount of time spent on cleaning data and preparing each month's map has decreased from more than a full day's work in August 2016 to about two hours in January 2018.

Figure 2. A screenshot from the ISAW Library New Titles map for January 2018, showing an item-specific information window (http://isaw.nyu.edu/library/Find/NewTitles-2017-18/2018-jan).

CHALLENGES

In developing the ISAW Library's mapping program, we had to overcome several challenges. Early in the project, we needed to address the philosophical differences between how Pleiades and LCNAF think about places and toponyms. The concept of "place" in Pleiades is broad, and contains cities, structures, archaeological sites, kingdoms, provinces and other types of administrative divisions, roads, geological features, culturally defined regions, and ethnic groups: "the term ['place'] applies to any locus of human attention, material or intellectual, in a real-world geographic context."28 In functional terms, a "place" in Pleiades is a top-level resource containing one or more other types of data:

• One or more locations, consisting of either a precise point, an approximate rectangular polygon, or a precise polygon formed by multiple points;
• One or more names, in one or more ancient or modern languages;
• One or more connections to other Place resources, generally denoting a geospatial or political/administrative connection.

Locations, names, and connections contain further metadata, including chronological attestations and citations to data sources. No one of these components is a requirement—even locations are optional, as ancient texts contain references to cities and structures whose geospatial location is unknown.
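Each Pleiades place resource is also published in a machine-readable JSON serialization, reachable by appending /json to the place URI. The following is a minimal sketch of reading that serialization, assuming its title, names, and reprPoint ([longitude, latitude]) fields; the function name is our own illustration:

```javascript
// Minimal sketch (any environment with fetch). Assumes Pleiades' JSON
// serialization, reached by appending /json to a place URI, with its
// "title", "names", and "reprPoint" ([longitude, latitude]) fields.
async function lookupPleiadesPlace(placeUri) {
  const response = await fetch(placeUri + '/json');
  if (!response.ok) {
    throw new Error('Pleiades lookup failed: ' + response.status);
  }
  const place = await response.json();
  return {
    title: place.title,
    // Attested toponyms, ancient and modern, in romanized form.
    names: (place.names || []).map((name) => name.romanized),
    // reprPoint may be absent: some attested places have no known location.
    coordinates: place.reprPoint
      ? { lon: place.reprPoint[0], lat: place.reprPoint[1] }
      : null,
  };
}
```

In a batch workflow, a lookup like this could populate the local file of URI-to-coordinate pairs described above, replacing the manual research step for Pleiades places.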
In our project, this philosophical gap manifested as a difference between the primary and secondary importance of authorized text strings and URIs: in LCSH and LCNAF, the text string is primary and the URI secondary (where it is used at all); in Pleiades and many other linked-data sources, URIs are primary and text strings secondary. LCSH and LCNAF text strings are unique and can be considered a sort of identifier, but they do not have the machine-readable functionality of a URI. In Pleiades, the machine-readable URI is primary, and can be used to return coordinates, place names, and other human- or machine-readable data. The name of a Pleiades place resource can be construed as a "subject heading," but these text strings are not necessarily unique, and additional data from the Pleiades resource may be required for disambiguation by a human reader.30 Toponymic terminology—that is, human-readable text strings—is just one type of data that Pleiades contains, alongside geospatial data, temporal tags, and linkages between resources.

One example of a recent change in Pleiades data illustrates the fundamental difference in approach between authority control and URI management. Until recently, Pleiades contained two different place resources with the heading "Aegyptus" (https://pleiades.stoa.org/places/766 and https://pleiades.stoa.org/places/981503), both referring to the general region of Egypt. Both of these resources were recently updated, and the title text of both was changed: /places/766 was retitled "Aegyptus (Roman imperial province)" and /places/981503 became "Ancient Egypt (region)." The distinction illustrates the difficulty of assigning names to places over long spans of time: Egypt, as understood by pre-Ptolemaic inhabitants of the Nile region, had a different meaning than the administrative region established after Octavian's defeat of Marc Antony and Cleopatra—or, for that matter, than the Predynastic kingdoms of Upper and Lower Egypt, the Ottoman Eyalet of Misr, or the modern Republic of Egypt. Prior to this change in Pleiades, both URIs were applied to MARC records for items held by the ISAW Library, under the heading "Aegyptus." From a linked-data standpoint, there is no real problem here: the URIs still link to resources describing different historical places called "Egypt," including the coordinate data needed for ISAW's collection maps. From the standpoint of authority control, however, the subject term "Aegyptus" on these records is now "wrong," representing a deprecated term, and should be updated. Even here, though, a linked-data model has benefits that a text-string-based model lacks: although the two headings contain the same text string, their URIs distinguish them unambiguously, and the deprecated text strings can be replaced with a batch operation keyed on those URIs, as in the sketch below. Getting away from text-string-based thinking will represent a major philosophical challenge for libraries as we move toward a linked data model for library metadata, but the many benefits of linked data will make that shift worthwhile.
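A minimal Python sketch of such a URI-keyed batch update follows. The record structures are simplified stand-ins for 651 fields, not the ISAW Library's actual data; the two retitlings are those described above.

    # Map each Pleiades URI to its current preferred title.
    retitled = {
        "https://pleiades.stoa.org/places/766": "Aegyptus (Roman imperial province)",
        "https://pleiades.stoa.org/places/981503": "Ancient Egypt (region)",
    }

    # Simplified stand-ins for 651 fields: a text string ($a) plus a URI ($0).
    headings = [
        {"a": "Aegyptus", "0": "https://pleiades.stoa.org/places/766"},
        {"a": "Aegyptus", "0": "https://pleiades.stoa.org/places/981503"},
    ]

    # Because the URI disambiguates identically worded headings, both deprecated
    # text strings can be corrected in a single pass.
    for heading in headings:
        new_title = retitled.get(heading["0"])
        if new_title and heading["a"] != new_title:
            heading["a"] = new_title

    for heading in headings:
        print(heading["a"], "->", heading["0"])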
Google Fusion Tables represents a future hurdle that the ISAW Library's mapping project will need to clear. In December 2018, Google announced that the Fusion Tables project would be discontinued and that all embedded Fusion Tables visualizations would cease functioning on December 3, 2019.31 Fortunately, the ISAW Library has already begun developing an alternative solution that does not rely on the deprecated Fusion Tables application. The core methodology used in developing our maps will remain the same, however.

Lastly, the geographic breadth of our collection reveals the limitations of Pleiades as the sole data source for this project. At its inception, Pleiades focused on Greco-Roman antiquity, and though it has expanded over time, Central and East Asia—regions of central interest to the ISAW Library—are largely not covered. Because all contributions to Pleiades undergo peer review prior to being published online, Pleiades' editors are understandably reluctant to commit to expanding their coverage eastward until the editorial team includes experts in these geographic areas. However, though we began this project with Pleiades, there is no barrier to using other sources of geographic data, such as China Historical GIS, the Getty Thesaurus of Geographic Names (TGN, http://www.getty.edu/research/tools/vocabularies/tgn/index.html), GeoNames (http://www.geonames.org/), the World-Historical Gazetteer (http://whgazetteer.org/), or the Library of Congress's Linked Data Service (http://id.loc.gov/). The same procedures we've used with Pleiades can be applied to any reliable data source with consistently formatted data.

FUTURE DIRECTIONS

We have already begun to move away from the Google Fusion Tables model and are working to develop our own JavaScript-based map application using Mapbox (https://www.mapbox.com/) and Leaflet (https://leafletjs.com/). When completed, this updated mapping application will actively query a database of Pleiades headings for coordinates, further automating the process of map creation. We are looking into different methods of encoding and representing precision—for example, using points and polygons to represent sites and regions, respectively. The Leaflet map interface will also enable us to show multiple items for single locations, something Fusion Tables is unable to do, and will thus eliminate the need to manually deduplicate coordinate pairs.

To expand the number of records that contain Pleiades URIs, we are developing a crosswalk between existing LC geographic headings and Pleiades place resources. When completed, we will use this crosswalk to batch-update our older records with Pleiades data where appropriate. The crosswalk will contain URIs from both Pleiades and the LC Linked Data Service, and it will be provided to the Pleiades team so that Pleiades resources can incorporate LC metadata as well.

We are also exploring further user applications of map-based search. One function we hope to develop is a geographic notification service, allowing users to define polygonal areas of interest on the map. When a new point is added that falls within one of these polygons, the user will be notified of a new item of potential interest. Some user training will be required to ensure that users define their areas of interest in such a way that they will receive the results that interest them—for example, a user interested in the Roman Empire will likely be interested in titles about the Mediterranean region in general, and may need to draw a bounding box that encompasses the open sea as well as sites on land.
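The core of such a notification service is a point-in-polygon test. A minimal sketch in Python, assuming the Shapely geometry library; the area, titles, coordinates, and notification hook are all illustrative, not part of the ISAW system.

    from shapely.geometry import Point, Polygon

    # A user-drawn area of interest, as (longitude, latitude) vertices --
    # here, a rough illustrative box over the central Mediterranean.
    area_of_interest = Polygon([(5.0, 30.0), (30.0, 30.0), (30.0, 45.0), (5.0, 45.0)])

    # Newly mapped titles, each with the coordinate pair retrieved via its place URI.
    new_points = [
        ("Title about Rome", Point(12.49, 41.89)),
        ("Title about Bactria", Point(66.9, 36.7)),
    ]

    # Notify the user only for points that fall inside the drawn polygon.
    for title, point in new_points:
        if area_of_interest.contains(point):
            print(f"notify user: new item of potential interest: {title}")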
Defining areas of interest will also require thoughtfulness about where users are likely to look for points of interest, especially for empires and other historic entities that do not correspond to modern geopolitical boundaries (for example, the Byzantine Empire or Scythia).

Additionally, we hope to begin working with chronological as well as geospatial data, with the aim of adding a time slider to the library map. This would enable users to focus on particular periods of history as well as geographic regions—for example, users interested in Bronze Age Anatolia could limit results to that time period, so that they can browse the map without material from the Byzantine Empire "cluttering" their browsing experience.32 The online temporal gazetteer PeriodO (http://perio.do/) provides a rich data source to draw on, including URIs for individual chronological periods and beginning and end dates for each defined temporal term. Following a proposal submitted by the ISAW Library, PeriodO was approved by the Library of Congress as a source of subject terminology in September 2018, and its headings and URIs are now usable in MARC. However, though LCSH headings for geographic places are often quite good, the guidelines for chronological headings and subdivisions are often inadequate for describing ancient historical periods; thus a chronological slider, though highly desirable, would require substantial manual changes and additions to existing metadata.

The ISAW Library's collection mapping project has accomplished its initial goal of providing a geospatial interface for the discovery of materials in our library collection. As we expand our mapping project to incorporate more of our collection, we also hope that our model can prove useful to other institutions looking for practical applications of URIs in MARC, alternative discovery methods to text-based searching, or both.

REFERENCES AND NOTES

1 For a summary of this challenging problem, see Brighid M. Gonzales, "Linking Libraries to the Web: Linked Data and the Future of the Bibliographic Record," Information Technology & Libraries 33, no. 4 (Dec. 2014): 10–22, https://doi.org/10.6017/ital.v33i4.5631.

2 See, for example, Timothy W. Cole et al., "Library MARC Records into Linked Open Data: Challenges and Opportunities," Journal of Library Metadata 13, no. 2–3 (July 2013): 178, https://doi.org/10.1080/19386389.2013.826074.

3 Deutsche Nationalbibliothek, "MARC Proposal No. 2007-06: Changes for the German and Austrian Conversion to MARC 21," MARC Standards, May 25, 2007, https://www.loc.gov/marc/marbi/2007/2007-06.html.

4 Ibid.

5 For a detailed discussion of the importance of actionability in unique identifiers, see Jackie Shieh and Terry Reese, "The Importance of Identifiers in the New Web Environment and Using the Uniform Resource Identifier (URI) in Subfield Zero ($0): A Small Step That Is Actually a Big Step," Journal of Library Metadata 15, no. 3–4 (Oct. 2, 2015): 220–23, https://doi.org/10.1080/19386389.2015.1099981.

6 British Library, "MARC Proposal No. 2010-06: Encoding the International Standard Name Identifier (ISNI) in the MARC 21 Bibliographic and Authority Formats," MARC Standards, May 17, 2010, https://www.loc.gov/marc/marbi/2010/2010-06.html.

7 RDA/MARC Working Group, "MARC Discussion Paper No. 2010-DP02: Encoding URIs for Controlled Values in MARC Records," MARC Standards, Dec. 14, 2009, https://www.loc.gov/marc/marbi/2010/2010-dp02.html.
8 For a summary of this task group's work to date, see Jackie Shieh, "Reports from the Program for Cooperative Cataloging Task Groups on URIs in MARC & BIBFRAME," JLIS.it: Italian Journal of Library, Archives and Information Science = Rivista Italiana di Biblioteconomia, Archivistica e Scienza dell'Informazione 9, no. 1 (2018): 110–19, https://doi.org/10.4403/jlis.it-12429.

9 PCC Task Group on URI in MARC and The British Library, "MARC Discussion Paper No. 2016-DP18: Redefining Subfield $0 to Remove the Use of Parenthetical Prefix '(Uri)' in the MARC 21 Authority, Bibliographic, and Holdings Formats," MARC Standards, May 27, 2016, https://www.loc.gov/marc/mac/2016/2016-dp18.html; MARC Advisory Committee, "MAC Meeting Minutes" (ALA Annual Meeting, Orlando, FL, 2016), https://www.loc.gov/marc/mac/minutes/an-16.html. For a cumulative description of the scope of this task group's work, see PCC Task Group on URIs in MARC, "PCC Task Group on URIs in MARC: Year 2 Report to PoCo, October 2017" (Program for Cooperative Cataloging, Oct. 23, 2017), https://www.loc.gov/aba/pcc/documents/PoCo-2017/PCC_URI_TG_20171015_Report.pdf.

10 Shieh and Reese, "The Importance of Identifiers in the New Web Environment and Using the Uniform Resource Identifier (URI) in Subfield Zero ($0)," 221.

11 Shieh and Reese, "The Importance of Identifiers in the New Web Environment and Using the Uniform Resource Identifier (URI) in Subfield Zero ($0)"; for a discussion of a related problem (finding a place for a URI in MARC authority records), see Ioannis Papadakis, Konstantinos Kyprianos, and Michalis Stefanidakis, "Linked Data URIs and Libraries: The Story so Far," D-Lib Magazine 21, no. 5/6 (June 2015), https://doi.org/10.1045/may2015-papadakis.

12 Michael Buckland et al., "Geographic Search: Catalogs, Gazetteers, and Maps," College & Research Libraries 68, no. 5 (Sept. 2007): 376, https://doi.org/10.5860/crl.68.5.376.

13 Marcy Bidney and Kevin Clair, "Harnessing the Geospatial Semantic Web: Toward Place-Based Information Organization and Access," Cataloging & Classification Quarterly 52, no. 1 (2014): 70, https://doi.org/10.1080/01639374.2013.852038.

14 Buckland et al., "Geographic Search."

15 Marcy Bidney, "Can Geographic Coordinates in the Catalog Record Be Useful?," Journal of Map & Geography Libraries 6, no. 2 (July 13, 2010): 140–50, https://doi.org/10.1080/15420353.2010.492304.

16 Bidney and Clair, "Harnessing the Geospatial Semantic Web."

17 Rick Bennett et al., "MapFAST: A FAST Geographic Authorities Mashup with Google Maps," Code4Lib Journal, no. 14 (July 25, 2011): 1–9, http://journal.code4lib.org/articles/5645.

18 Bennett et al., 1.

19 Rainer Simon et al., "Peripleo: A Tool for Exploring Heterogeneous Data through the Dimensions of Space and Time," The Code4Lib Journal, no. 31 (Jan. 28, 2016), http://journal.code4lib.org/articles/11144.

20 Rainer Simon et al., "The Pleiades Gazetteer and the Pelagios Project," in Placing Names: Enriching and Integrating Gazetteers, ed. Merrick Lex Berman, Ruth Mostern, and Humphrey Southall, The Spatial Humanities (Bloomington: Indiana Univ. Pr., 2016), 97–109.

21 Merrick Lex Berman, "Linked Places in the Context of Library Metadata" (Nov. 10, 2016), https://sites.fas.harvard.edu/~chgis/work/docs/papers/HVD_LibraryLinkedDataGroup_LexBerman_20161110.pdf.
22 Lisa R. Johnston and Kristi L. Jensen, "MapHappy: A User-Centered Interface to Library Map Collections via a Google Maps 'Mashup,'" Journal of Map & Geography Libraries 5, no. 2 (July 2009): 114–30, https://doi.org/10.1080/15420350903001138; Chris Freel et al., "Geocoding LCSH in the Biodiversity Heritage Library," The Code4Lib Journal, no. 2 (Mar. 24, 2008), http://journal.code4lib.org/articles/52; Gina L. Nichols, "Merging Special Collections with GIS Technology to Enhance the User Experience," SLIS Student Research Journal 5, no. 2 (2015): 52–71, http://scholarworks.sjsu.edu/slissrj/vol5/iss2/5/.

23 Hector Gonzalez et al., "Google Fusion Tables: Data Management, Integration and Collaboration in the Cloud," in Proceedings of the 1st ACM Symposium on Cloud Computing (Indianapolis: ACM, 2010), 175–80, https://doi.org/10.1145/1807128.1807158.

24 The ISAW Library New Titles library is available at http://www.zotero.org/groups/290269.

25 Since our interest was in obtaining coordinate data, we determined that LCNAF and LCSH would not be appropriate to our needs. Although some MARC authority records include coordinate data, it is not present in all geographic headings. Moreover, where coordinate data is available in the authority file, it is not published in the RDF form of the records via the LC Linked Data Service (http://id.loc.gov/). Entries in the Getty Thesaurus of Geographic Names (TGN, http://www.getty.edu/research/tools/vocabularies/tgn/index.html) often include structured coordinate data, and in recent months we have begun using TGN URIs when a Pleiades URI is not available.

26 Library of Congress, Network Development & MARC Standards Office, "Subject Heading and Term Source Codes: Source Codes for Vocabularies, Rules, and Schemes," Library of Congress, Jan. 9, 2018, https://www.loc.gov/standards/sourcelist/subject.html.

27 It is worth noting that, since the URIs are not currently being queried in the preparation of the map, much of this work could have been accomplished with pre-URI identifiers from MARC data, or even unique text strings. One benefit of using URIs is ease of access to coordinate data, especially from Pleiades. Pleiades puts coordinates front and center in its display, and even features a one-click option to copy coordinates to the clipboard. Moreover, the entire Pleiades dataset is available for download, making the retrieval of coordinates automatable locally, reducing keystrokes even without active database querying. The primary benefit of using URIs instead of other forms of unique identifiers, however, is forward compatibility. This is of immediate importance, since we are developing an updated version of the map that will actively query Pleiades for coordinates. Future benefits of the presence of URIs also include links from Pleiades into the library catalog, based on records in which place URIs appear. If and when the entire catalog shifts to a linked-data model, the benefits of having these URIs present expand exponentially, as this metadata will then be available to all manner of outside sources.

28 Sean Gillies et al., "Conceptual Overview," Pleiades, Mar. 24, 2017, https://pleiades.stoa.org/help/conceptual-overview.

29 Library of Congress, "Subject Headings Manual (SHM)" (Library of Congress, 2014), H 690, https://www.loc.gov/aba/publications/FreeSHM/freeshm.html.
30 For example, Pleiades contains two Place resources with the identical name "Babylon": one the Mesopotamian city and capital of the region known as Babylonia (https://pleiades.stoa.org/places/893951); the other the site of the Muslim capital of Egypt, Al-Fusṭāṭ, known in late antiquity as Babylon (https://pleiades.stoa.org/places/727082).

31 Google, "Notice: Google Fusion Tables Turndown," Fusion Tables Help, Dec. 11, 2018, https://support.google.com/fusiontables/answer/9185417.

32 A method of chronological browsing was described in Vivien Petras, Ray R. Larson, and Michael Buckland, "Time Period Directories: A Metadata Infrastructure for Placing Events in Temporal and Geographic Context," in Proceedings of the 6th ACM/IEEE-CS Joint Conference on Digital Libraries (JCDL '06) (IEEE, 2006), 151–60, https://doi.org/10.1145/1141753.1141782.

Gaps in IT and Library Services at Small Academic Libraries in Canada

Jasmine Hoover

Jasmine Hoover (jasmine_hoover@cbu.ca) is Scholarly Resources Librarian, Cape Breton University, Sydney, Nova Scotia, Canada.

ABSTRACT

Modern academic libraries are hubs of technology, yet the gap between the library and IT is an issue at several small university libraries across Canada that can inhibit innovation and lead to a diminished student experience. This paper outlines the results of a survey of small (<5,000 FTE) universities in Canada, focusing on IT and the library when it comes to organizational structure, staffing, and location. It then discusses higher-level as well as smaller-scale solutions to this issue.

INTRODUCTION

Modern academic libraries are hubs of technology, yet existing staffing, organizational structures, physical proximity, and traditional ways of doing things in higher education have maintained a gap between the library and IT, which is an issue at several small university libraries across Canada. Libraries today are largely online, which means managing access to resources, using online tools for reference and research, designing websites, and more. The physical space in libraries is now a place to interact with new technologies and visualize data, and a place for research support, including open-access repositories, data management, and other digital research initiatives.1 These library functions often require a staffing complement with a level of specialization in information technology (IT). However, though the offerings of the library have changed drastically over the years, smaller university libraries have struggled to support the growing need for IT services.

Larger universities (over 5,000 FTE) have managed this influx of demand and usage of new technologies in libraries by having their own library IT services to manage software and technologies that support research, teaching, and learning. Many also offer student- and user-facing technical support with IT help desks within the library. Smaller universities (below 5,000 FTE) often do not have the resources to have their own IT department or staff, and find themselves unable to help researchers with modern digital scholarship, unable to support new systems and software, and not working as closely with IT as they would like or need. Also, the IT department is generally not responsible for this kind of work, as it falls outside of institution-wide software support.
This paper outlines the current status of IT and the library when it comes to organizational structure, physical location, and collaboration in small academic libraries across Canada. It then outlines strategies that can be used in smaller libraries to help bridge the gap, as well as recommendations for administrators when considering organizational changes to better serve a modern research atmosphere.

CURRENT STATUS AT SMALL CANADIAN UNIVERSITIES

The technologies behind modern library services are often complex, as libraries need to securely manage access to online resources (both on and off campus); support faculty as they research and teach using new software and technologies; and support new models for publishing that include open-access repositories, data management, open education resources, and more. Library staff deal with technology issues daily, with several non-IT library staff members troubleshooting and solving the various issues that arise. Library users run into all kinds of technical issues and reach out for help. In Nova Scotia, our library consortium offers Live Help, an online library chat service distributed across eleven academic institutions in Nova Scotia. Statistics kept on the types of questions asked through this service from January 2010 to March 2018 show that 26 percent of the more than 68,000 questions asked are technical in nature, with topics including difficulty accessing online resources, login troubles, and other technical issues.2

For this study, 18 of the 21 universities in Canada with an FTE greater than 1,000 and less than 5,000 were surveyed. Excluded were universities that were "sister" institutions of larger universities and used the same library system, as well as French-only universities. Twelve university libraries responded to an online survey that asked questions concerning organization and collaboration focused on IT, the library, and educational technology.

Results (see figure 1) show that organizational reporting structures in higher education vary when it comes to IT and the library. Fifty percent of the survey respondents reported that their IT department reports to the CEO/CFO or VP administration, 25 percent of IT departments report to a CIO, 17 percent report to a provost/VP academic, and 8 percent report to a VP finance.

Figure 1. Which of the following best describes how your IT organization reports? (Pie chart: 50% report to CEO/CFO/VP admin; 25% to CIO; 17% to Provost/VP academic; 8% to VP Finance.)

All of the libraries in this survey, on the other hand, report to a provost or VP academic. This makes sense, as libraries are generally considered academic while IT is usually associated with operations. However, there have been recent changes to some university library structures in Canada that might indicate new thinking when it comes to organizational structure and the relationship between these units. In 2018, it was announced that restructuring at Brandon University would remove the University Librarian position altogether (as well as the director of IT services) and place the library under a Chief Information Officer.
This would bring the library and IT under one reporting structure.3 In an opposite move, Mount Allison University recently proposed to eliminate the top librarian position and have an academic dean split responsibility between the library and their academic unit.4 After local outcry, this move was reversed, and the job ad is out for a head librarian. It is hard to say whether these are signals of upcoming change in the future of library reporting or a temporary solution in a time of budget restrictions. However, half of the survey respondents mentioned that there has been some recent reorganization, or planned reorganization, related to IT and the library at their institutions.

Only 33 percent of the small university libraries surveyed have their own IT department or staff. One of those libraries has an IT specialist who splits time between the library and the IT department. The other 67 percent have no IT department or staff in the library (see figure 2).

Figure 2. Does your library have its own IT department? (Pie chart: 25% yes, library employees; 8% yes, employed by IT services; 67% no.)

When asked, "Is there anything you would like changed about the current organization when it comes to IT and the library?," all of the libraries without in-library IT support mentioned a desire for either a position in the library responsible for IT, greater collaboration between IT and the library, or a specific person within the IT department whom they could contact regarding IT.

Student experience, including experience with technology, is important: a 2017 EDUCAUSE study outlines the importance of IT and of support for students when it comes to Wi-Fi and other technical matters.5 One recommendation from this report is to make IT help desks more visible and available. Not only is the library a convenient location, but as we have already seen, students are increasingly using technologies in the library and often run into issues. It therefore makes sense to have an IT help desk within the library, as the majority of larger university libraries in Canada already offer. When asked about IT help desks in the library, three of the responding university libraries (25 percent) have help desks staffed by IT services, one (8 percent) has a help desk staffed by library staff, and another (8 percent) has an after-hours help desk staffed by IT services. The remaining 59 percent have no IT help available in the library (see figure 3).

Figure 3. Does your library have an IT help desk? (Pie chart: 59% no; 25% yes, employed by IT services; 8% yes, employed by the library; 8% yes, employed by IT services after hours only.)

The physical location of the two units is also important. In this survey, 75 percent of respondents replied that the library and the IT department are in separate spaces, while 25 percent share a common space. Studies have shown that physical proximity in the workplace can lead to greater collaboration. An MIT study showed that physical proximity drives collaboration between researchers on university campuses.6 As one of the common themes in the survey was the desire for more collaboration, a physical change of location could have a great impact. When asked about changes people would like to see in the current organization of IT and the library, many mentioned a need for more collaboration due to interrelated responsibilities.
Common suggestions included library IT staff, an IT help desk in the library, or a specific person in IT whom library staff could contact directly for help or who had shared responsibilities between IT and the library. Another suggestion was a committee that would bring together members from both units to strengthen communication.

WHAT CAN BE DONE?

In the larger view, university administrations need to look for the outdated governance and organizational structures that are in place. As universities shift their goals and focus over time, they need to adapt structures and staffing accordingly. Chong and Tan describe IT governance as being of utmost importance, claiming there needs to be strategic alignment between IT and organizational strategies and objectives.7 Carraway describes universities with a high level of IT governance maturity and effectiveness as those where "IT initiatives are aligned with the institution's strategic priorities and prioritized among the university's portfolio projects."8 Effective IT governance, focused on collaboration and communication, is associated with greater integration of innovation into institutional processes. Also, IT governance was found to be more effective under a delegated model that empowers IT governance bodies than under a CIO-centric model. The majority of universities surveyed showed common governance structures for IT, with most operating as separate units reporting to a CFO/VP admin or similar. The inclusion of faculty, students, and business units in IT governance committees was associated with a stronger innovation culture.9 Stakeholder inclusion is an important characteristic of IT governance maturity. Students, as consumers of IT, and faculty should both have a seat at the table when it comes to IT governance. Carraway found that an increased level of student engagement in IT governance correlates with a high level of innovation culture.10 University administration should take a good look at how IT is governed, who has input, and how it is affecting the university's objectives.

The reporting structure of libraries has generally gone unchanged, with most respondents confirming that their library reports to an academic vice president. Budget constraints at two Canadian universities have affected library structure of late; however, there has been little research done on the ideal governance structure of libraries in higher education. Both IT and the library in smaller Canadian universities could consider governance committees that include students, faculty, and other stakeholders in order to be more innovative and effective.

IT is an interesting unit, in that the model in higher education has moved back and forth between three main structures: centralized, decentralized, and federated. Centralized, where a central hub runs IT services for the university, is the most common structure found at the surveyed universities. Decentralized, where IT services are spread throughout the organization, would automatically mean the library (and other units) had IT staff. A federated model would also lead to local library IT work being done by specific people who work for and out of a central IT office but are assigned to specific areas. Federated structures offer centralized control, with decentralized functions in faculties and units.
Chong and Tan believe that federated structures are more appropriate for a collaborative network, such as a university.11 Their study found that a federated structure, combined with coordinated communication, led to higher effectiveness. Nugroho maintains that decentralized organizations such as universities need to regularly review their IT governance structure, as both the technology and the organization itself change.12 He maintains that effective governance does not happen by coincidence, and that IT governance is not a static concept.

Library staffing also needs to change based on the needs of users and the goals of the organization. Some even suggest that libraries reorganize every few years to keep staff flexible, take advantage of new opportunities, and foster growth.13 In 2011, we saw Bell and Shank's work on the blended librarian, which advocated for librarianship with educational technology and instructional design skills.14 According to the 2015 ARL statistics, we continue to see nontraditional professional jobs increasing in the library; in 2015, the top three new-hire categories included two nontraditional categories: digital specialists and functional specialists.15 ACRL statistics from 2016 showed that over the previous five years, 61 percent of libraries repurposed or cross-trained staff to better support new technologies or services.16 We saw above that of the more than 68,000 research questions fielded by librarians across Nova Scotia since 2010, just over one quarter are technical in nature. Library administration at smaller universities, looking at these numbers, should respond by ensuring either that technical knowledge and skills are written into job ads, as these skills are increasingly in demand, or that staff are trained appropriately.

Physical location is also important. We've seen from the survey results that there is a lack of physical connectedness between the library and IT in smaller Canadian universities. Wineman et al. studied various organizations and their physical proximity. They state: "Social networks play important roles in structuring communication, collaboration, access to knowledge and knowledge transformation."17 They suggest that innovation is a process that occurs at the crossroads between social and physical space. Cramton points out that "maintaining mutual knowledge is a central problem of geographically dispersed collaboration."18 If it is not possible to change the organizational structure or governance to ensure more communication and knowledge sharing, a shared physical space such as an IT desk in the library is another way for library and IT staff to be in regular contact. A 2017 MIT study recommended that institutions keen to support the cross-disciplinary collaborative activity that is vital to research and practice may need to adopt "a new approach to socio-spatial organisation that may ultimately enrich the design and operation of places for knowledge creation."19 We could apply the same thinking to institutions interested in supporting collaborative activity between the library, IT, and newer-yet-related initiatives such as educational technology and digital research centers. Proximity to collaborators should be considered as one option to enhance outcomes and innovation between the library and IT.
Organizational structures and models, physical locations, and governance are all large-scale factors that should be considered when looking at the relationship between IT and the library. There are also smaller-scale practical ideas that can help; these are discussed below.

An important first step is to start the conversation. The author's institution has begun thinking about the gaps in our services and support for research, especially when it comes to support for the technologies needed for modern research and publication that are often housed in the library. Factors that have helped start this conversation include funding mandates related to open access and data management; new services or initiatives that researchers or units would like to start, which require IT and library specialization; and planning for a future in higher education that increasingly relies on up-to-date technologies to support research, publishing, and teaching. A conversation is beginning between researchers, administration, the library, and other stakeholders that will lead to a collaborative solution to some of these issues. It is important that there is interest and initiation from administration, but also that other stakeholders are involved from the outset. Many universities have developed new positions or new units, or worked these positions into IT or the library to fill this gap, but the solution needs to fit each institution and its goals.

Oftentimes, when there is no IT staff in the library, technical issues are managed by one or two technically minded staff members. Equipping frontline service providers may help alleviate some of this work by enabling many staff to solve common technical issues. At the author's institution, the librarian in charge of access has begun presenting common technical and access issues during a monthly reference meeting. The goal is for all staff who field questions from users to gain a basic understanding of how the library's systems work, what to do if they see issues, and whom they can contact. In libraries where there is not a strong IT presence, it is important to enable staff to be comfortable with the basic issues that will come up. This also ensures that there is not just one person who can answer common technical and access questions. If someone staffing the reference or circulation desk encounters users with these issues, they can explain why the issues are happening and what the library is going to do to help. The plan is to create a library technical manual out of these quick presentations that can act as a resource for all staff or as a training manual for new staff. At each of these presentations, a survey is administered; it has four questions and asks participants about their comfort level dealing with technical and access questions both before and after the presentation. One hundred percent of staff answered that after the presentation, they felt more comfortable when encountering the issues described. This is not a suitable replacement for the specialized IT skills needed in libraries; however, it can alleviate some of the pressure put on select people in smaller academic libraries. Library staff can, and do, actively work to learn new skills through formal training and professional development. We saw from the ACRL survey that many libraries are working to cross-train staff in order to keep up with technological demands.
Library administration should encourage staff to learn new skills and pursue educational opportunities; doing so can go a long way.

The benefit of having IT staff dedicated to the library is obvious, and libraries should continually push for this. Results of the survey showed that library staff would prefer to have a person to contact with issues specific to the library: issues can be dealt with promptly; IT personnel working in or assigned to the library will have an understanding of the systems involved; communication is easier, as there is a point person to contact; and the library has control over the products and services it offers. However, if that is not possible within the organization, a good system of communication is important. A timely system of contacting IT and resolving issues can go a long way. Chong and Tan maintain that a coordinated communication system is key for IT in an organization.20 A commonly used system for technical issues is the ticket system, where issues can be submitted by users and answered and tracked by IT. This is a very useful system for IT staff; however, users often cannot track their own ticket, see a timeline for completion, or know who is on the other end to contact with more information. It is a good idea to meet regularly with IT, formally or informally, to discuss issues, build relationships with colleagues, and get a better sense of how each unit works.

On the library end, it is important to keep statistics on technical issues sent to IT and the time elapsed before the issues are resolved. These statistics can be used to demonstrate the need for library-specific IT staff, encourage better communication between departments, or demonstrate a problem with the current way issues are communicated. Having statistics will help libraries if and when the time comes that new positions can be created. At the author's institution we use Springshare's LibAnswers software to track all technical issues, including those sent on to IT. This software records the dates, times, important details, and resolutions of technical issues, and exports useful statistics.

In smaller Canadian university libraries there is a growing need for IT support. However, little has been done by way of organizational structure, staffing, or physical proximity between these two units to allow universities to better serve their students and faculty. This paper outlined the current situation in several smaller university libraries in Canada and provided some high-level as well as local solutions to this problem.

APPENDIX A: IT, LIBRARY, AND EDUCATIONAL TECHNOLOGY ORGANIZATION

(* Required)

1. Institution Name *

2. Total Student Population

3. Which of the following best describes how your IT organization reports? Mark only one oval.
• Reports to CEO/CFO/VP admin
• Reports to CIO
• Reports to Provost/VP Academic
• Reports to Dean of Library/Head of Library
• Other:

4. Which of the following best describes how the Dean/Head of Library/University Librarian reports? Mark only one oval.
• Reports to the CEO/CFO/VP admin
• Reports to Provost/VP Academic
• Reports to University President
• Other:

5. Which of the following best describes IT's relationship to the library? Mark only one oval.
• IT and the library are not at all part of the same reporting structure
• IT is a part of the library reporting structure
• IT and the library report to the same person, but are separate departments
• Other:

6. Which of the following describes the physical location of IT and the library? Mark only one oval.
• Located in separate spaces
• Share a physical location
• Other:

7. Does your library have its own IT department? Mark only one oval.
• Yes, they are library employees
• Yes, they are employed by IT services and work in the library
• No
• Other:

8. Does your library have an IT help desk? Mark only one oval.
• Yes, they are library employees
• Yes, they are employed by IT services
• No
• Other:

9. Have there been any major reorganizations (that you are aware of) related to IT and library services in the last ten years?

10. Is there anything you would like changed about your current organization when it comes to IT and the library?

11. Who is in charge of Educational Technology/Academic Technology at your university? Mark only one oval.
• Library
• IT
• Educational Technology is a separate unit/office
• Educational Technology duties are split up among the library/IT/other
• Other:

12. Which of the following describes the physical location of Educational Technology? Mark only one oval.
• Ed Tech is located in or shares space with the library
• Ed Tech is located in or shares space with IT
• Ed Tech has its own space
• No Ed Tech unit
• Other:

13. What would you include as roles of an Educational Technology unit? Mark all that apply.
• Media design/production
• Research and development (testing technologies, emerging tech)
• Instructional design and development
• Faculty development
• Learning spaces
• Assessment (learning outcomes, course evaluations)
• Distance/online learning support
• Training on course software/technologies related to teaching and learning
• Managing classroom technologies
• Other:

14. Have there been any changes (that you know of) related to Educational Technology Services in the last ten years?

15. Is there anything you would like changed about your current organization when it comes to Educational Technology Services and the library?

16. May I use direct quotes in my research/publication? (No names or institutions will be attributed to a quote.) Mark only one oval.
• Yes
• No

REFERENCES

1 Tibor Koltay, "Are You Ready? Tasks and Roles for Academic Libraries in Supporting Research 2.0," New Library World 117, no. 1/2 (January 11, 2016): 94–104, https://doi.org/10.1108/NLW-09-2015-0062.

2 "Instant Messaging Service—Statistics Data Entry Page," Novanet, accessed June 5, 2018, https://util.library.dal.ca/livehelp/liveh3lp/admin/livehelp/chatentry.php.

3 "Brandon University Will Eliminate 15% of Senior Administration to Help Tackle Budget Cut," Brandon University, March 15, 2018, https://www.brandonu.ca/news/2018/03/15/brandon-university-will-eliminate-15-of-senior-administration-to-help-tackle-budget-cut/.

4 Joseph Tunney, "Mount A Proposal to Phase out Top Librarian Makes Students, Staff Want to Make Noise," CBC News, January 18, 2018, https://www.cbc.ca/news/canada/new-brunswick/mount-allison-university-librarian-1.4492297.
5 D. Christopher Brooks and Jeffrey Pomerantz, "ECAR Study of Undergraduate Students and Information Technology," EDUCAUSE, October 18, 2017, https://library.educause.edu/resources/2017/10/ecar-study-of-undergraduate-students-and-information-technology-2017.

6 Matthew Claudel et al., "An Exploration of Collaborative Scientific Production at MIT through Spatial Organization and Institutional Affiliation," PLOS ONE 12, no. 6 (2017), https://doi.org/10.1371/journal.pone.0179334.

7 Josephine Chong and Felix B. Tan, "IT Governance in Collaborative Networks: A Socio-Technical Perspective," Pacific Asia Journal of the Association for Information Systems 4, no. 2 (2012).

8 Deborah Louise Carraway, "Information Technology Governance Maturity and Technology Innovation in Higher Education: Factors in Effectiveness" (master's diss., The University of North Carolina at Greensboro, 2015), 113.

9 Ibid., 89.

10 Ibid.

11 Chong and Tan, "IT Governance in Collaborative Networks: A Socio-Technical Perspective," 44.

12 Heru Nugroho, "Conceptual Model of IT Governance for Higher Education Based on Cobit 5 Framework," Journal of Theoretical and Applied Information Technology 60, no. 2 (February 2014): 6.

13 Gillian S. Gremmels, "Staffing Trends in College and University Libraries," Reference Services Review 41, no. 2 (2013): 233–52, https://doi.org/10.1108/00907321311326165.

14 John D. Shank and Steven Bell, "Blended Librarianship," Reference & User Services Quarterly 51, no. 2 (Winter 2011): 105–10.

15 Stanley Wilder, "Hiring and Staffing Trends in ARL Libraries," Association of Research Libraries, October 2017, https://www.arl.org/storage/documents/publications/rli-2017-stanley-wilder-article2.pdf.

16 "New ACRL Publication: 2016 Academic Library Trends and Statistics," News and Press Center, July 20, 2017, http://www.ala.org/news/member-news/2017/07/new-acrl-publication-2016-academic-library-trends-and-statistics.

17 Jean Wineman et al., "Spatial Layout, Social Structure, and Innovation in Organizations," Environment and Planning B: Planning and Design 41, no. 6 (December 1, 2014): 1100–1112, https://doi.org/10.1068/b130074p.

18 Catherine Durnell Cramton, "The Mutual Knowledge Problem and Its Consequences for Dispersed Collaboration," Organization Science 12, no. 3 (May–June 2001): 346–71, https://doi.org/10.1287/orsc.12.3.346.10098.

19 Claudel et al., "An Exploration of Collaborative Scientific Production at MIT through Spatial Organization and Institutional Affiliation," 2.
20 Chong and Tan, "IT Governance in Collaborative Networks: A Socio-Technical Perspective," 44.

Business Intelligence in the Service of Libraries

Danijela Tešendić and Danijela Boberić Krstićev

Danijela Tešendić (tesendic@uns.ac.rs) is Associate Professor, University of Novi Sad. Danijela Boberić Krstićev (dboberic@uns.ac.rs) is Associate Professor, University of Novi Sad.

ABSTRACT

Business intelligence (BI) refers to the methodologies, analytical tools, and applications used for the analysis of business information. This article aims to illustrate an application of BI in libraries, as the reporting modules in library management systems are usually inadequate for comprehensive business analysis. The application of BI technology is presented as a case study of libraries using the BISIS library management system, undertaken in order to overcome the shortcomings of the existing reporting module. Both user requirements regarding reporting in BISIS and the existing transactional databases were analyzed during the development of a data warehouse model, and on the basis of that analysis, three data warehouse models are proposed. Examples of reports generated by an OLAP tool are also given. By building the data warehouse and using OLAP tools, users of BISIS can perform business analysis in a more user-friendly and interactive manner and are not limited to predefined types of reports. Librarians can easily generate customized reports tailored to the specific needs of the library.

INTRODUCTION

Organizations usually have a vast amount of data, and it increases on a daily basis. The success of an organization is directly related to its ability to provide relevant information in a timely manner. An organization must be able to transform raw data into valuable information that enables better decision-making.1 For this reason, it is impossible to imagine an organization without an efficient reporting module as part of its management information system.

If we put libraries in a business context, they are very similar to any other organization. A common characteristic is a high demand for a variety of statistical reports to support the business. A library management system uses a transactional database to store and process relevant data. This database is designed in accordance with the main functionalities of the system. Information used to make strategic decisions is usually obtained from historical and summarized data. However, the database model may have a complex structure and may not be suitable for performing analytical queries, which are often very complex and involve aggregations.
Execution of those queries may be a time-consuming and resource-intensive process that can decrease the performance, as well as the availability, of the system itself. Also, creating such queries can require advanced IT knowledge. These problems can be overcome by developing business intelligence systems.

Business intelligence (BI) refers to the methodologies, analytical tools, and applications used for the analysis of business information. BI gives business managers and analysts the ability to conduct appropriate analyses. By analyzing historical and current data, decision-makers gain valuable insights that enable them to make better, more-informed decisions. BI systems rely on a data warehouse as an information source. The data warehouse is a repository of data, usually structured to be available in a form ready for analytical processing activities.2 Business intelligence systems do not exist as ready-made solutions for every organization; they need to be built in accordance with the characteristics of each organization, using an appropriate methodology.

This article proposes a data warehouse architecture and the use of OLAP tools to support BI in libraries. The application of BI technology is illustrated through a case study of libraries using the BISIS library management system. The first step in the implementation of BI was the creation of a data warehouse model, considering both the data that exist in BISIS and the requirements regarding reporting. After the data warehouse model had been created, data were loaded into the data warehouse using OLAP tools. OLAP tools were also used for visualization of the data stored in the data warehouse.

REPORTING IN BISIS

The BISIS library management system has been under development since 1993 at the University of Novi Sad, Serbia. Currently, the BISIS community comprises over forty medium-sized libraries in Serbia.3 The primary modules of the BISIS system include cataloguing, reporting, circulation, OPAC, bibliographic data interchange, and administration. BISIS supports cataloguing according to the UNIMARC and MARC 21 formats, using an XML editor for bibliographic material processing.4 The BISIS search engine is implemented with the Lucene engine.5 BISIS supports the Z39.50 and SRU protocols for the search and retrieval of bibliographic records.6 Those protocols are also used in a BISIS service for searching and downloading electronic materials by the audio library system for visually impaired people.7 In addition, BISIS allows sharing of bibliographic records with the union catalogue of the University of Novi Sad.8 The circulation module features all standard activities for managing users: registration, charging, discharging, searching for users and publications, generating different kinds of reports, and sending user reminders.9

The reporting module of BISIS is implemented using the JasperReports tool.10 However, this module has some limitations, owing to the fact that BISIS works only with a transactional database and does not cope well with complex reports. First, in order to generate reports regarding library collections, it is necessary to process all bibliographic records stored in that transactional database. This activity significantly burdens the system and reduces its performance. To avoid this, reports are prepared in advance outside working hours, usually at night.
Consequently, those reports include only data collected before report generation. Creating reports in this manner greatly reduces system load and speeds up presentation of the reports, because they are already generated. However, some reports, such as those related to the financial aspects of the library (e.g., the number of new members and the balance at the end of the day), need to be created in real time. Because they execute in real time, those reports are inefficient and affect the performance of the entire system. The next limitation of the reporting module is that it offers only a set of predefined reports, and the creation of new reports requires additional development; in the current deployment it is not possible to add new reports without engaging software developers. An additional obstacle is the fact that the data for generating reports are obtained from two different data sources (described in more detail in the following sections). For example, a report on the number of borrowed books by UDC (Universal Decimal Classification) group requires data about the UDC groups from the XML documents and data about book borrowing from the relational database. Generating this kind of report cannot be done in a timely and efficient manner. Taking into account these shortcomings of the reporting module, it can be concluded that the application of business intelligence, primarily a data warehouse and OLAP tools, could improve analytical data processing in the libraries using BISIS.

RELATED WORK

One of the basic components of a business intelligence system is a data warehouse. A data warehouse is a centralized database that stores historical data. Those data are in principle unchangeable, and they are obtained by collecting and processing data from various data sources. Data warehouses are used as support for making business decisions.11 The data sources for a data warehouse can be diverse and may include transactional databases and different file formats. The process of integrating data from different data sources into a single database is called data warehousing. Data warehousing includes extracting, transforming, and loading (ETL) data into the data warehouse.12 The goal of data warehousing is to extract data useful for further analysis from the huge amount of data that is potentially available.

There are different approaches to modeling a data warehouse. These approaches can be classified into three paradigms according to the origin of the information requirements: (1) supply-driven, (2) demand-driven, and (3) hybrids of these. A supply-driven approach is based on the data that exist in the transactional database. These data are analyzed to determine which are the most relevant for making business decisions, that is, which data should be part of the data warehouse. Alternatively, a demand-driven approach is based on end-user requirements, which means that the data warehouse is modeled so that it can answer the questions asked by the users. The third, hybrid, approach combines the previous two in the process of data warehouse modeling and attempts to diminish their shortcomings: with a supply-driven approach, the data warehouse may not meet the requirements of the end users, while with a demand-driven approach there may be no data to fill the created data warehouse.
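To make the ETL step described above concrete, the following Python sketch shows a minimal extract-transform-load pass: circulation rows are pulled from a relational database, a language code is read from a simplified UNIMARC-style XML record, and a denormalized row is loaded into a warehouse table. The table layouts, file names, and the use of field 101 $a for language are illustrative assumptions, not the actual BISIS schemas.

    import sqlite3
    import xml.etree.ElementTree as ET

    # --- Extract: circulation data from the transactional database (layout assumed).
    src = sqlite3.connect("circulation.db")
    lendings = src.execute(
        "SELECT record_id, member_category, lending_date FROM lending"
    ).fetchall()

    def load_record_xml(record_id):
        # Hypothetical accessor: bibliographic records live in an XML store.
        with open(f"records/{record_id}.xml", encoding="utf-8") as f:
            return f.read()

    def language_of(record_xml):
        # Field 101 $a holds the language of the text in UNIMARC (assumed layout).
        root = ET.fromstring(record_xml)
        node = root.find(".//field[@name='101']/subfield[@name='a']")
        return node.text if node is not None else "und"

    # --- Transform and load: write denormalized rows into the warehouse.
    dw = sqlite3.connect("warehouse.db")
    dw.execute(
        """CREATE TABLE IF NOT EXISTS fact_lending (
               record_id INTEGER, member_category TEXT,
               lending_date TEXT, language TEXT)"""
    )
    for record_id, category, date in lendings:
        language = language_of(load_record_xml(record_id))
        dw.execute(
            "INSERT INTO fact_lending VALUES (?, ?, ?, ?)",
            (record_id, category, date, language),
        )
    dw.commit()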
RELATED WORK

One of the basic components of a business intelligence system is the data warehouse. A data warehouse is a centralized database that stores historical data. Those data are in principle unchangeable, and they are obtained by collecting and processing data from various data sources. Data warehouses are used as support for making business decisions.11 The data sources for a data warehouse can be diverse and may include transactional databases and different file formats. The process of integrating data from different data sources into a single database is called data warehousing. Data warehousing includes extracting, transforming, and loading (ETL) data into the data warehouse.12 The goal of data warehousing is to extract useful data for further analysis from the huge amount of data that is potentially available.

There are different approaches to modeling a data warehouse. These approaches can be classified into three paradigms according to the origin of the information requirements: (1) supply-driven, (2) demand-driven, and (3) hybrids of these. A supply-driven approach is based on the data that exist in the transactional database. These data are analyzed to determine which data are the most relevant for making business decisions, that is, which data should be part of the data warehouse. Alternatively, a demand-driven approach is based on the end-user requirements, which means that the data warehouse is modeled in such a way that it is possible to answer the questions asked by the users. The third, hybrid approach combines the previous two approaches in the process of data warehouse modeling and attempts to diminish their shortcomings: with a purely supply-driven approach, the data warehouse will probably not meet the requirements of the end users, while with a purely demand-driven approach there may be no data to fill the created data warehouse.

In an article published in 2009, Romero and Abelló gave an overall view of the research in the field of dimensional modeling of data warehouses.13 Various examples of implementation of data-warehouse solutions in libraries can be found in the literature. In 2014, Siguenza-Guzman et al. described the design of a knowledge-based decision support system based on data-warehouse techniques that assists library managers in making tactical decisions about the optimal use and leverage of their resources and services. When designing the data warehouse, the authors started from the requirements of the end users (a demand-driven approach) and extracted data from heterogeneous sources.14 A similar approach was used by Yang and Shieh, who started from the reports needed by public libraries in Taiwan and, through an iterative methodological approach, modeled a data warehouse that meets all their reporting requirements.15

Unlike the previously described articles, where a demand-driven approach was used, we applied a hybrid approach to modeling the data warehouse. We analyzed the data sources that exist in BISIS following a supply-driven approach, but we also analyzed user requirements to identify the facts and dimensions for the dimensional data warehouse model.

MODELING THE DATA WAREHOUSE

In order to implement a data warehouse solution, the first step is to design a data model suitable for analytical data processing. A data warehouse usually stores data in a relational database and organizes them in so-called dimensional models. Unlike standard relational database models, those models are denormalized and provide easier data visualization. Data can be presented as a cube with three, four, or n dimensions, and analyzing such data is more intuitive and user-friendly. The dimensional model contains the following concepts: dimensions, facts, and measures. Dimensions represent the parameters for data analysis, while facts represent business entities, business transactions, or events that can be used in analyzing business processes. The most commonly used model in dimensional modeling is the star model. After identifying the facts and dimensions, a dimensional model almost always resembles a star, with one central fact and several dimensions that surround it. Dimensions and facts are usually implemented as tables in the relational database. Dimension tables contain primary keys and other attributes. Fact tables contain numerical data as well as dimension table keys. The measure is a numerical attribute of the fact table and can be obtained by aggregating data by certain dimensions. (A minimal sketch of such a star layout is given after the list below.)

There are several approaches to modeling a data warehouse, and we followed a hybrid approach to design the dimensional models presented in this article. This implies that both the existing data sources and the user requirements were considered while designing the final data-warehouse models. That modeling process involved the following activities:

1. analysis of existing data sources in BISIS with identification of possible facts and dimensions,
2. analysis of user requirements regarding reporting,
3. refactoring of the facts and dimensions in accordance with the user requirements, and
4. design of dimensional models.
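To make the star layout concrete, here is a minimal sketch in Python with SQLite. The Item fact table and the Language and Location dimensions mirror the collection model described later in this article, but the columns and sample rows are invented for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Two simplified dimension tables (parameters for analysis).
conn.execute("CREATE TABLE Language (language_id INTEGER PRIMARY KEY, name TEXT)")
conn.execute("CREATE TABLE Location (location_id INTEGER PRIMARY KEY, department TEXT)")

# The central fact table: one row per item, holding the measure
# (the acquisition number) plus foreign keys to every dimension.
conn.execute("""
CREATE TABLE Item (
    acquisition_no TEXT,
    language_id    INTEGER REFERENCES Language(language_id),
    location_id    INTEGER REFERENCES Location(location_id)
)""")

conn.executemany("INSERT INTO Language VALUES (?, ?)", [(1, "Serbian"), (2, "English")])
conn.executemany("INSERT INTO Location VALUES (?, ?)", [(1, "Main"), (2, "Children")])
conn.executemany("INSERT INTO Item VALUES (?, ?, ?)",
                 [("A-001", 1, 1), ("A-002", 1, 2), ("A-003", 2, 1)])

# A measure is obtained by aggregating the fact over chosen dimensions.
for row in conn.execute("""
    SELECT l.name, COUNT(i.acquisition_no)
    FROM Item i JOIN Language l ON i.language_id = l.language_id
    GROUP BY l.name"""):
    print(row)   # e.g. ('English', 1) then ('Serbian', 2)
```

Because the fact is surrounded by flat, denormalized dimensions, any of the reports listed below reduces to one such join-and-group query.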
Analysis of Data Sources in BISIS

The first step in creating a data warehouse is an analysis of the existing data sources. The BISIS system uses two different data sources: bibliographic records are stored in XML documents, while circulation data, as well as holdings data regarding the items that are circulated, are stored in a relational database. In 2009, Tešendić et al. described the BISIS circulation database model.16 That model covers data about individual and corporate library members. Member data include personal information and membership fees, as well as information about a member’s borrowed and returned items.

Bibliographic data in BISIS are presented in the UNIMARC format. Dimić and Surla described the model for bibliographic records used in BISIS in 2009.17 A bibliographic record is modeled as a list of fields and subfields: a field contains a name, the values of the indicators, and a list of subfields, while a subfield contains a name and a value. The data described by that model are stored in XML documents, because the bibliographic record structure is not suitable for relational modeling; it is more in line with the document-oriented data storage approach.

Analysis of User Requirements

One of the essential functionalities of information systems, including library management systems, is to provide various statistical reports that should help the management of the library make better business decisions. User requirements related to analytical processing in BISIS can be grouped into several categories.

The first category consists of requirements regarding reports on the library collections. Examples of reports from this category are:

• number of publications per language for a certain period of time;
• number of publications by department;
• number of new publications for a certain period of time; and
• number of publications by UDC group.

The second category consists of requirements related to the circulation of library resources. Examples of such reports are:

• number of borrowed items by member category;
• number of borrowed items by language of publication;
• number of borrowed items by department;
• the most popular books; and
• the most avid readers for a certain period.

The third category consists of requirements related to reports on the financial elements of the library’s business. Some of these reports are:

• number of new members on a daily basis, with a financial balance;
• number of members by membership category and gender; and
• number of members per department.

Analyzing the user requirements, it became clear that the new data warehouse has to be created using data from both data sources. This means that appropriate transformations of the data from the relational database, as well as from the bibliographic record documents, need to be performed.
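As an illustration of the bibliographic side of such transformations, the sketch below flattens a record, modeled as a list of fields and subfields as described above, into the small set of attributes kept in the warehouse’s Publication dimension. The field and subfield codes (200/a for the title, 700/a for the author) follow UNIMARC conventions, but the in-memory record layout and the target attributes are simplified assumptions, not the actual BISIS structures.

```python
from dataclasses import dataclass
from typing import Optional

# A record as a list of fields; each field has a name, indicator values,
# and a list of (subfield name, value) pairs, mirroring the model above.
record = {
    "id": 42,
    "fields": [
        {"name": "200", "indicators": "1 ",
         "subfields": [("a", "Data Warehouse Toolkit")]},
        {"name": "700", "indicators": " 1",
         "subfields": [("a", "Kimball, Ralph")]},
    ],
}

def subfield(rec: dict, field_name: str, sub_name: str) -> Optional[str]:
    """Return the first value of the given field/subfield, or None."""
    for field in rec["fields"]:
        if field["name"] == field_name:
            for name, value in field["subfields"]:
                if name == sub_name:
                    return value
    return None

@dataclass
class PublicationRow:            # flat dimension row for the warehouse
    record_id: int
    title: Optional[str]
    author: Optional[str]

row = PublicationRow(record["id"],
                     subfield(record, "200", "a"),   # UNIMARC title proper
                     subfield(record, "700", "a"))   # UNIMARC author
print(row)
```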
DATA WAREHOUSE MODELS

Taking into account the reporting requirements as well as the data that exist in BISIS, appropriate dimensional models were designed. The proposed dimensional models were designed to meet all the needs for analytical processing, as well as to enable flexibility of the reporting process in BISIS. For each of the observed groups of reports, a dimensional model was created, as described below.

Model Describing Library Collection Data

A dimensional model of the BISIS data warehouse used for analytical processing of the library collection data is shown in figure 1. The data from this model are used to generate reports on the library collection. Examples of such reports are the accessions register, the number of items by UDC group, the number of items by department, etc. In generating all these reports, the acquisition number of an item plays the main role, and all reports are created either by counting the acquisition numbers or by displaying the acquisition numbers along with other data related to the item. Therefore, the acquisition number represents the measure in this dimensional model. The central table in the model is the Item table, which represents the fact table. This table contains the acquisition number and foreign keys to the dimension tables.

All other tables in the model are dimension tables. The Publication table is a dimension table containing bibliographic data from bibliographic records. Only data that are needed for reports are extracted from bibliographic records and stored in this table. Those data refer to the name of the author, the title of the publication, the publication’s ISBN and UDC number, the number of pages, keywords, and an identification number for the bibliographic record in the transactional database. The Acquisition table represents a dimension that describes the publication’s acquisition data, such as the retail price, the name of the supplier, and the invoice number. The Location table describes departments within the library where an item is stored. The Status, Publisher, Language, and UDC_group tables relate to information about the status of an item, the publisher, the language of the publication, and the UDC group to which an item belongs. The Date and Year tables represent the time dimensions. Data in the Date table are extracted from the date of an item’s acquisition, and data in the Year table are extracted from the publishing year.

Figure 1. Dimensional model describing library collection data.

Model Describing Library Circulation Data

A dimensional model of the BISIS data warehouse used for the analytical processing of library circulation data is shown in figure 2. Data from this model are used for generating statistical reports regarding usage of library resources. Examples of such reports are the number of borrowed publications according to different criteria (such as user categories, language of publication, departmental affiliation of the user who borrowed the publication, etc.). These data can answer questions about the most popular books or the readers with the highest number of borrowed books. Similar to the previous reporting group, the acquisition number of the borrowed item plays the main role in generating those reports. All reports from this group are created by counting acquisition numbers of borrowed items and displaying data related to those checkouts. Therefore, in this dimensional model, the acquisition number is the measure. The central table in the model is the Lending table, which represents the fact table. This table contains the acquisition number of the borrowed item and foreign keys to the dimension tables. All other tables in the model are dimension tables. The Publication, Publisher, Year, Acquisition, UDC_group, Status, and Language tables contain data from bibliographic records, and their content has already been explained. The Member, MembershipType, Category, Education, and Gender tables are dimension tables containing information about library users. These data are only a subset of the circulation data from the transactional database.
The Location table describes departments within the library where items are borrowed. The Date table represents the time dimension. The data in the Date table are derived from the date of borrowing and the date of discharge of an item.

Figure 2. Dimensional model describing library circulation data.

Model Describing Members’ Data

A dimensional model of the BISIS data warehouse used for the analytical processing of members’ data is shown in figure 3. Data from this model are used for generating statistical reports on library members, as well as for generating financial reports based on membership fees. Examples of such reports are the number of members according to different criteria (such as department of registration, member category, type of membership, gender, or education level). This report group also contains reports that include a financial balance (for example, a list of members with membership fees in a certain time period). The membership fee plays the main role in generating these reports. All reports from this group are generated by counting or displaying members who have paid a membership fee or by summarizing membership fees. Therefore, in this dimensional model, the membership fee is the measure. The main table in the model is the Membership table, which represents the fact table. It contains the membership fee, which is the measure, and foreign keys to the dimension tables.

All other tables in the model are dimension tables. The Member, MembershipType, Category, Education, and Gender tables are the dimension tables that contain information about library members; their content was previously described. The Location table describes departments within the library where user registration is performed. The Date table represents the time dimension. Data in the Date table are based on the registration date and the date of the membership expiration.

Figure 3. Dimensional model describing library members.

TRUE VALUE OF A DATA WAREHOUSE

In the previous sections we presented the data-warehouse models, but those models are unusable if they are not implemented and populated with the data needed for business analysis. Extracting, transforming, and loading (ETL) processes are responsible for reshaping the relevant data from the source systems into useful data to be stored in the data warehouse. ETL processes load data into a data warehouse, but the data warehouse is still only storage for those data. Real-time and interactive visualization of those data shows the true benefits of data warehouse implementation in various organizations, including libraries.

To load, as well as to analyze and visualize, large volumes of data in data warehouses, various OnLine Analytical Processing (OLAP) tools can be used.18 The usage of OLAP tools does not require a lot of programming knowledge in comparison to tools used for querying transactional databases. The interface of an OLAP tool should provide the user with a comfortable working environment in which to perform analytical operations and visualize query results without knowing programming techniques or the structure of the transactional database.
There are various OLAP tools available on the market.19 When choosing an OLAP tool to be used in an organization, there are several important criteria to consider: the duration of query execution, a user-oriented interface, the possibility of interactive reports, the price of the tool, automation of the ETL process, etc.20 The Pentaho BI system is one of the open-source OLAP tools that satisfies most of those criteria. Among its various features, Pentaho supports the creation of ETL processes, data analysis, and reporting.21 Implementation of ETL processes can be a challenging task, primarily because of the nature of the source systems. We used the Pentaho tool to transform data from BISIS into the data warehouse, as well as to visualize data and generate statistical reports.

ETL Processes Modeling

After creating a data-warehouse model, it is necessary to load data into the data warehouse. The first step in that process is to extract data from the data sources. Those data may not be in accordance with the newly created data-warehouse model, and appropriate transformations of the data may be needed before loading. Depending on the structure of the data sources, transformations can be implemented from scratch or by using dedicated OLAP tools. Both techniques were used in the development of our data warehouse. Transformations that required data from bibliographic records were implemented from scratch because of their complex data structure, while transformations that processed data from the relational database were implemented using the Pentaho Data Integration (PDI) tool. PDI is a graphical tool that enables designing and testing ETL processes without writing programming code. Figures 4 and 5 show an example of transformations created and executed by that tool. Those transformations were applied to load members’ data from the BISIS relational database into the data warehouse.

Figure 4. Transformations for loading members’ data.

Figure 5. The MembershipTransformation process.

An issue that may arise after the initial loading of a data warehouse relates to updating the data warehouse. In order to preserve the performance of the transactional database, updates of the data warehouse should not be performed in real time. In the case of library management systems, those updates can be performed outside of working hours, so the data in the data warehouse will be up to date on a daily basis. Such an update algorithm can be defined as an ETL process using OLAP tools, or it can be implemented from scratch.
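As a sketch of what such a from-scratch update could look like, the following Python fragment copies only the membership rows added to the source database since the previous run, using a stored high-water mark. The table layout and the etl_state bookkeeping are illustrative assumptions, not the actual BISIS or Pentaho implementation.

```python
import sqlite3

source = sqlite3.connect(":memory:")     # stands in for the BISIS database
warehouse = sqlite3.connect(":memory:")  # stands in for the data warehouse

source.execute("CREATE TABLE membership (id INTEGER, member_id INTEGER, fee REAL)")
source.executemany("INSERT INTO membership VALUES (?, ?, ?)",
                   [(1, 10, 5.0), (2, 11, 5.0), (3, 12, 7.5)])

warehouse.execute("CREATE TABLE Membership (id INTEGER, member_id INTEGER, fee REAL)")
warehouse.execute("CREATE TABLE etl_state (key TEXT, value INTEGER)")
warehouse.execute("INSERT INTO etl_state VALUES ('last_membership_id', 1)")

def nightly_update() -> None:
    """Copy only rows added to the source since the previous run."""
    last = warehouse.execute(
        "SELECT value FROM etl_state WHERE key = 'last_membership_id'"
    ).fetchone()[0]
    rows = source.execute(
        "SELECT id, member_id, fee FROM membership WHERE id > ?", (last,)
    ).fetchall()
    warehouse.executemany("INSERT INTO Membership VALUES (?, ?, ?)", rows)
    if rows:  # advance the high-water mark only if something was loaded
        warehouse.execute(
            "UPDATE etl_state SET value = ? WHERE key = 'last_membership_id'",
            (max(r[0] for r in rows),))
    warehouse.commit()

nightly_update()   # in practice, scheduled outside working hours
print(warehouse.execute("SELECT COUNT(*) FROM Membership").fetchone()[0])  # 2
```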
Data Visualization

The basic task of OLAP tools is to enable visualization of data stored in a data warehouse. OLAP tools use a multidimensional data representation, known as a cube, which allows a user to analyze data from different perspectives. OLAP cubes are built on the dimensional models of a data warehouse and consist of dimensions and measures. Dimensions form the cube structure, and each cell of the cube holds a measure. Measures are derived from the records in the fact table, and dimensions are derived from the dimension tables. OLAP tools allow a user to select a part of the OLAP cube by setting an appropriate query, and that part can be further analyzed along different dimensions. This analysis is performed by applying the common operations on the cube, which include slice and dice, drill down, roll up, and pivot.22 The results of operations on the cube can be visualized in the form of tables, charts, graphs, maps, etc.

The main advantage of OLAP tools is that end users can do their own analyses and reporting very efficiently. Users can extract and view data from different points of view on demand. OLAP tools are valuable because they provide an easy way to analyze data using various graphical wizards. By analyzing data interactively, users are provided with feedback that can define the direction of further analysis.

In order to visualize data from our data warehouse, we used the Pentaho OLAP tool. We used it to create the predefined reports identified during the analysis of user requirements, as well as some interactive reports using operations on the OLAP cube. Examples of generated reports are presented below to illustrate some features of the Pentaho OLAP tool.

The report shown in figure 6 was obtained with a dice operation on the cube. The dice operation selects two or more dimensions from a given cube and provides a new sub-cube. In this particular example, we selected three dimensions: gender, member category, and registration date. Additionally, we analyzed only the data from 2014 to 2018. The result of this operation is presented in the form of nested pie charts; however, other forms of visualization can be applied to the same data set very easily.

Figure 6. Example of dice operation performed on the OLAP cube.

In figure 7, a more complex report is presented. That report is obtained by performing a combination of roll-up and drill-down operations. The roll-up operation performs aggregation on a cube, reducing the dimensions. In our example, we aggregated the number of newly acquired publications for certain years, ignoring all dimensions except the date dimension. A user can then select a particular year, quarter, and month and analyze details of the publications purchased in that period, such as the title and author of the publication. This is an example of using the drill-down operation on the cube. The result of that operation is presented as a table, as shown in figure 7. This report is interactive, because the user can investigate the data in more detail by performing other operations on the cube from the toolbar of the report.

Figure 7. Example of roll-up and drill-down operations performed on the OLAP cube.
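The same cube operations can be imitated outside any OLAP tool. The following pandas sketch, over a toy lending table with invented data, shows a roll-up to years, a drill-down into one year’s months, a slice fixing one dimension value, and a pivot crossing two dimensions.

```python
import pandas as pd

# Toy cube source: one row per lending, with two dimensions and a date.
lending = pd.DataFrame({
    "date": pd.to_datetime(["2014-01-10", "2014-03-05", "2015-03-07",
                            "2015-03-21", "2015-11-02"]),
    "category": ["student", "student", "adult", "student", "adult"],
    "language": ["Serbian", "English", "Serbian", "Serbian", "English"],
})

# Roll-up: aggregate away everything except the year level of the date.
by_year = lending.groupby(lending["date"].dt.year).size()
print(by_year)            # 2014: 2, 2015: 3

# Drill-down: expand one year into its months.
y2015 = lending[lending["date"].dt.year == 2015]
print(y2015.groupby(y2015["date"].dt.month).size())   # month 3: 2, month 11: 1

# Slice: fix one dimension value, then analyze the rest.
print(lending[lending["language"] == "Serbian"]
      .groupby("category").size())                    # adult: 1, student: 2

# Pivot: cross two dimensions in one table.
print(pd.crosstab(lending["category"], lending["language"]))
```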
CONCLUSION

This article aims to illustrate an application of business intelligence in libraries, as the reporting modules in library management systems are usually inadequate for comprehensive business analysis. The development of a data warehouse, which is the base of any business intelligence system, as well as the usage of OLAP tools, are presented. Both the user requirements regarding reporting in BISIS and the already-existing transactional databases were analyzed during the development of the data-warehouse model. Based on that analysis, three data-warehouse models have been proposed. Examples of reports generated by an OLAP tool are also given. By building the data warehouse and using OLAP tools, users of BISIS can perform business analysis in a more user-friendly manner than before. Users are not limited to predefined types of reports: librarians can easily generate customized reports tailored to the specific needs of the library. In this way, librarians work in a more comfortable environment, performing analytical operations interactively and visualizing query results without additional programming knowledge. The article presents the usage of the Pentaho OLAP tool, but the proposed data-warehouse model is independent of the OLAP tool selected, and any other tool can be integrated with the proposed data warehouse.

REFERENCES

1 Ralph Stair and George Reynolds, Fundamentals of Information Systems (Cengage Learning, 2017).
2 Ramesh Sharda, Dursun Delen, and Efraim Turban, Business Intelligence, Analytics, and Data Science: A Managerial Perspective (Pearson, 2016).
3 “BISIS,” library management system BISIS, accessed July 8, 2019, http://www.bisis.rs/korisnici/.
4 Bojana Dimić and Dušan Surla, “XML Editor for UNIMARC and MARC 21 Cataloguing,” The Electronic Library 27, no. 3 (2009): 509-28, https://doi.org/10.1108/02640470910966934; Bojana Dimić Surla, “Eclipse Editor for MARC Records,” Information Technology and Libraries 31, no. 3 (2012): 65-75, https://doi.org/10.6017/ital.v31i3.2384; Bojana Dimić Surla, “Developing an Eclipse Editor for MARC Records using Xtext,” Software: Practice and Experience 43, no. 11 (2013): 1377-92, https://doi.org/10.1002/spe.2140.
5 Branko Milosavljević, Danijela Boberić, and Dušan Surla, “Retrieval of Bibliographic Records using Apache Lucene,” The Electronic Library 28, no. 4 (2010): 525-39, https://doi.org/10.1108/02640471011065355.
6 Danijela Boberić and Dušan Surla, “XML Editor for Search and Retrieval of Bibliographic Records in the Z39.50 Standard,” The Electronic Library 27, no. 3 (2009): 474-95; Danijela Boberić Krstićev, “Information Retrieval using a Middleware Approach,” Information Technology and Libraries 32, no. 1 (2013): 54-69, https://doi.org/10.6017/ital.v32i1.1941; Miroslav Zarić, Danijela Boberić Krstićev, and Dušan Surla, “Multitarget/Multiprotocol Client Application for Search and Retrieval of Bibliographic Records,” The Electronic Library 30, no. 3 (2012): 351-66, https://doi.org/10.1108/02640471211241636.
7 Danijela Tesendic and Danijela Boberic Krsticev, “Web Service for Connecting Visually Impaired People with Libraries,” Aslib Journal of Information Management 67, no. 2 (2015): 230-43, https://doi.org/10.1108/AJIM-11-2014-0149.
8 Danijela Boberić-Krstićev and Danijela Tešendić, “Mixed Approach in Creating a University Union Catalogue,” The Electronic Library 33, no. 6 (2015): 970-89, https://doi.org/10.1108/EL-02-2014-0026.
9 Danijela Tešendić, Branko Milosavljević, and Dušan Surla, “A Library Circulation System for City and Special Libraries,” The Electronic Library 27, no. 1 (2009): 162-86, https://doi.org/10.1108/02640470910934669; Branko Milosavljević and Danijela Tešendić, “Software Architecture of Distributed Client/Server Library Circulation System,” The Electronic Library 28, no. 2 (2010): 286-99, https://doi.org/10.1108/02640471011033648; Danijela Tešendić, “Data Model for Consortial Circulation in Libraries,” in Proceedings of the Fifth Balkan Conference in Informatics, Novi Sad, Serbia, September 16-20, 2012.
10 Danijela Boberic and Branko Milosavljevic, “Generating Library Material Reports in Software System BISIS,” in Proceedings of the 4th International Conference on Engineering Technologies ICET, 2009: 133-37.
11 William H. Inmon, Building the Data Warehouse (Indianapolis: John Wiley & Sons, 2005); Ralph Kimball, The Data Warehouse Toolkit: Practical Techniques for Building Dimensional Data Warehouses (New York: John Wiley & Sons, 1996).
12 Ralph Kimball and Joe Caserta, The Data Warehouse ETL Toolkit: Practical Techniques for Extracting, Cleaning, Conforming, and Delivering Data (Indianapolis: John Wiley & Sons, 2004).
13 Oscar Romero and Alberto Abelló, “A Survey of Multidimensional Modeling Methodologies,” International Journal of Data Warehousing and Mining (IJDWM) 5, no. 2 (2009): 1-23.
14 Lorena Siguenza Guzman, Victor Saquicela, and Dirk Cattrysse, “Design of an Integrated Decision Support System for Library Holistic Evaluation,” in Proceedings of IATUL Conferences (2014): 1-12.
15 Yi-Ting Yang and Jiann-Cherng Shieh, “Data Warehouse Applications in Libraries—The Development of Library Management Reports,” in Advanced Applied Informatics (IIAI-AAI), 2016 5th IIAI International Congress on Advanced Applied Informatics, 88-91. IEEE, 2016, https://doi.org/10.1109/IIAI-AAI.2016.129.
16 Tešendić, Milosavljević, and Surla, “A Library Circulation System,” 162-86.
17 Dimić and Surla, “XML Editor for UNIMARC,” 509-28.
18 Paulraj Ponniah, Data Warehousing Fundamentals for IT Professionals (Hoboken, NJ: John Wiley & Sons, 2011).
19 “Top 10 Best Analytical Processing (OLAP) Tools,” Software Testing Help, https://www.softwaretestinghelp.com/best-olap-tools/.
20 Rick Sherman, “How to Evaluate and Select the Right BI Tools,” https://searchbusinessanalytics.techtarget.com/buyersguide/A-buyers-guide-to-choosing-the-right-BI-analytics-tool.
21 Doug Moran, “Pentaho Community Wiki,” https://wiki.pentaho.com/.
22 Ponniah, Data Warehousing Fundamentals, 382-93.

Measuring Information System Project Success through a Software-Assisted Qualitative Content Analysis

Jin Xiu Guo

Jin Xiu Guo (jin.x.guo@stonybrook.edu) is Director of Collections and Resource Management, Frank Melville, Jr. Memorial Library, Stony Brook University.

ABSTRACT

Information System (IS)/IT project success is of growing interest in management due to its high impact on organizational change and effectiveness. Libraries have been adopting integrated library systems (ILS) to manage services and resources for years.
It is essential for librarians to understand the mechanics of IS project management in order to successfully bring technology innovation to the organization. This study develops a theoretical model for measuring IS project success and tests it in an ILS merger project through a software-assisted qualitative content analysis. The model addresses project success through three constructs: (1) project management process, (2) project outcomes, and (3) contextual factors. The results indicate that project management success alone cannot guarantee project success; project outputs and contextual factors also influence success through the leadership of the project manager throughout the lifecycle. The study not only confirms the proposed model in a post-project evaluation, but also shows that project assessment can reinforce organizational learning, increase the chance of achieving success, and maximize overall returns for an organization. The qualitative content analysis with NVivo 11 provides a new research method for project managers to self-assess IS/IT project success systematically and to learn from their experiences throughout the project lifecycle.

INTRODUCTION

Information Technology (IT) project success has drawn growing attention in the last two decades due to its high impact on organizational change. More and more companies pursue innovation through IS projects to gain business advantages. In the United Kingdom alone, 21 percent of the gross value added in manufacturing and construction comes from complex products and IS development projects. However, the implementation of IS projects has not been as successful as practitioners hoped. Nicholas and Hidding reported that only 35 percent of IT projects were completed on time and on budget and met the project requirements.1 The U.S. Office of Electronic Government and Information Technology (OEGIT) noted that only 25 percent of 1,400 projects reached the office’s goals and that more than $21 billion spent on IT projects was in jeopardy.2 In the European Union, about 20 to 30 percent of contracted IT/IS projects could not meet the stakeholders’ expectations, causing losses of £70 billion (about $99 billion).3 Although some IT projects are considered successful from the perspective of project management, project sponsors hardly recognize results leading to organizational effectiveness. It is critical for IT practitioners to explore new methods to articulate what IT project success is and then improve project performance.

Traditionally, the measurement of IT project success has focused on internal measures such as project time, cost, risk, and quality, which address project efficiency. In recent years, external measures, such as product satisfaction and organizational effectiveness, have gained more attention. Moreover, contextual factors such as top management support, project managers’ qualifications, system vendors, implementation consultants, and adaptation to change have shown critical effects on project success. The literature on post-project evaluation and on mergers of multiple information systems (IS) remains sparse. Notably, the consolidation of the information systems of different organizations creates additional challenges for the new organizations. Diverse cultures and leadership styles may create barriers for managers seeking to gain the trust of employees who used to work at a different institution.
Nevertheless, adaptation to change is necessary for all staff in the course of a merger. The need to address the impact of these factors on IS project success is increasing.

Libraries have adopted the ILS to manage services and resources for the last two decades. Next-generation systems, cloud-based library management systems, are now replacing existing ILSs. To improve the efficiency of higher education, consolidation of public universities or colleges remains a viable alternative. It is essential for librarians to understand the mechanism of IS project management in order to successfully bring technology innovation to the organization. This study fills the gap by examining IS project success factors and developing a model to measure IS project success. The model can help practitioners better understand IS project success and improve the chance of success. The author first provides a historical account of the definitions of project success and the measures adopted, and then applies the model in a post-project evaluation at an academic library.

THEORETICAL BACKGROUND

Researchers and practitioners have been investigating IT project success through both quantitative and qualitative studies to find out what makes a successful IT project and how a project manager can make better decisions to increase the chance of project success. This review examines how project success is defined and what criteria practitioners employ for measurement.

IT projects can be at different levels of complexity. For instance, an enterprise resource planning (ERP) implementation project is more complicated and requires more resources to deploy across organizational functions. This type of project may quickly overrun its budget and deadline. As a result, studies on ERP implementation success draw more attention. Cărstea believes that project success is the achievement of the targets that an organization has set, measured flexibly against time, cost, quality, final results obtained, resources, the degree of automation, and international standards. He suggests that project managers may analyze the discrepancies between the current and the new goals to self-evaluate progress.4 Although this method emphasizes project efficiency, the self-developed evaluation system has shown the potential for IT project managers to control the planning and organization of multiple IT projects within the organization.

Instead of studying the project management process alone, Tsai et al. incorporate system providers, implementation consultants, and the achievement level of project management into DeLone and McLean’s modified IS success model. They describe ERP project success as efficient deployment and the enhancement of organizational effectiveness. The success indicators include the accomplishment level of project management and the degree of improvement of IS performance. The metrics of project management are fulfilling the business implementation goal, top management support, budget, time, communication, and troubleshooting, while the system performance dimensions include achieving integration of systems for system quality, information quality, system use, user satisfaction, and individual and organizational impacts. The authors applied the research model in a quantitative study to test five hypotheses with SERVQUAL (service quality) instruments.
The results show that the services provided by system vendors and implementation consultants are correlated with project management, and project management in turn with system performance.5 It is worth mentioning that this measurement integrates project management into the IS success model and confirms the contribution of project management to ERP performance, which leads to the improvement of organizational effectiveness. Both studies indicate that IS project measures should comprise the dimensions of project management success and business goals.

With a similar interest in ERP, Young and Jordan investigate the impact of top management support (TMS) on ERP implementation success through descriptive case studies. The authors regard project success as the delivery of “expected benefits” and the achievement of “above average performance.” The findings reveal that TMS is the most important critical success factor (CSF) affecting IT project success, through the involvement of top management in project planning, result follow-ups, and the facilitation of management problems; however, project management success does not guarantee project success that results in organizational effectiveness.6

Researchers are also interested in different perspectives on IT project success. Irani believes IS project appraisal should incorporate investment evaluation into the project lifecycle. A project manager evaluates IS impacts before, during, and after the investment is secured to dynamically justify the investment and ensure the project is in alignment with the organizational strategy. The author also points out that post-project evaluation is lacking in current project management, so organizations lose a great learning opportunity to optimize their project management.7 Furthermore, Peslak inspects the relationship between IT project success and overall IT returns from the viewpoint of financial executives. The author defines IT project success as organizational success, in which staying abreast of technology and the ability to measure projects and balance managerial control over them positively affect IT project success, and project success in turn affects overall IT returns.8 Likewise, Lacerda and Ensslin develop a conceptual model from the standpoint of external consultants to assess software projects. The theoretical framework contains a hierarchical structure of value, analysis, and recommendation, in which they identify performance descriptors and analyze project values to improve the decision process in the course of the consultation.9 Through a series of interviews with IT project managers, Nicholas and Hidding discover that business goals, time for learning and reflection, and flexibility of the product are associated with project success.10 Additionally, researchers have made efforts to explain project outcomes to better understand project success.
Thomas and Fernández believe project success is defined differently at each company, but that the success criteria should encompass project management, the technical system, and business goals, underscoring business continuity, met business objectives, and delivery of benefits.11 Another study, performed by Kutsch, also shows that the achievement of the business purpose; benefit to the owner; the satisfaction of owners, users, and stakeholders; achieving prestated objectives; quality; cost and time; and satisfaction of the team are, in that order, significant variables affecting project outcomes.12 The study further attests that organizational effectiveness is an essential criterion of IT project success.

Interestingly, researchers also examine individual success indicators, such as quality and risk, to deepen their understanding of project success. Geraldi, Kutsch, and Turner think project quality has eight attributes: (1) a commitment to quality, (2) enabling capabilities, (3) completeness, (4) clarity, (5) integration, (6) adaptability, and (7) compliance, along with (8) value-adding and met requirements.13 Among them, enabling capabilities and adaptability are comparatively new. This discovery discloses that project quality is evaluated continually over the project lifecycle, which is consistent with Cărstea’s finding that project managers need to assess their projects regularly to maintain project control and safety and achieve project goals. Such practices create agility for software development projects and secure the resources needed for development.

Summary of Literature

The literature review indicates that it is necessary to define project performance criteria and outcomes to measure IS project success. IS project success is the achievement of the project management process and the project goals. When measuring an IS project, practitioners should also consider the impacts of contextual factors throughout the project lifecycle. System vendors, consultants’ services, management support, communication, adaptation to change, time for learning and reflection, product flexibility, and project complexity are environmental influences. It is essential for practitioners to create an opportunity for organizational learning and improve future project success through a post-project evaluation.

Figure 1. The relationship between Project Success and Organizational Effectiveness. (The figure depicts projects feeding business cases, business cases supporting business goals and objectives, and those goals contributing to organizational effectiveness.)

PROJECT SUCCESS MODEL

The purpose of this study is to develop a measurement of IS project success based on the findings of the literature review. The first step is therefore to define project success. Project success comprises project management success and the achievement of business goals. In previous studies, practitioners emphasized project management success but paid less attention to project outcomes, which leads to many unexplainable project failures. For example, some IT projects did not meet the business goals but conformed to the criteria of project management success. Such a project might be successful from the perspective of the project management process even though it failed to attain the project goals. The relationship between IS projects and organizational effectiveness is depicted in figure 1.
Each IS project makes at least one business case, and each business case contributes to at least one business objective. An IS project is successful if the project outcomes reach the business goals, resulting in organizational effectiveness.

The purpose of project performance criteria is to measure project progress throughout the lifecycle. Without standards, a project manager could lose control over the project and would most likely fail. The next step, then, is to identify the measures of project success. The indicators of project management success have been widely studied and tested; project scope, time, cost, quality, and risk are at the top of the metrics list. The literature review shows that researchers employ business continuity, achieving business objectives, delivery of benefits, and the perceived value of a project to measure project outcomes. It is noteworthy that contextual factors also impact project success; influences such as top management support (TMS), user involvement, system vendors, the project manager’s qualifications, communication, the complexity of the project, and adaptation to change need to be measured as well. Hence, the author proposes the measurement model shown in figure 2.

Figure 2. Model for Measuring IS Project Success.

Three constructs affect IS project success in this model. The project management process is a tool that helps a project manager attain success; project performance criteria are identified there to control quality and assess progress throughout the lifecycle. Project outcomes, on the other hand, entail the project goals that ensure ultimate project success. The contextual factors may contribute to success directly, or indirectly by influencing the project management process or the organizational environment, such as change management. Therefore, a project manager has to examine all three constructs when assessing project success. To demonstrate the application of the model, the author conducted a case study on an ILS merger project.

CASE STUDY: A POST-PROJECT EVALUATION

Background

In November 2013, the Board of Regents of the University System of Georgia announced the consolidation of Kennesaw State University (KSU) and Southern Polytechnic State University (SPSU). The merger of the two state university libraries was one of the main tasks and involved merging two integrated library systems (ILS). The project involved removing duplicate bibliographic and customer records between the two libraries and merging relational databases that contain financial, bibliographic, transactional, vendor, and customer data. The ILS provider, Ex Libris, and the two university libraries executed the merger with the support of GALILEO Interconnected Libraries (GIL) IT staff. The ILS merger implementation team comprised two IT experts from Ex Libris and fourteen ILS users from the two libraries across five functional units: acquisitions, cataloging, circulation and interlibrary loan, serials, and system administration. KSU/SPSU and Ex Libris each had a project manager, and the author was the KSU/SPSU project manager. The GIL Support team facilitated the implementation of the merger. The project goal was to operate the two libraries with a consolidated ILS by July 2015 without interrupting services to students, staff, and faculty. The project was completed within eighty-one days, and the consolidated university libraries were operating uniformly by the deadline. The team also won the 2015 Georgia Library Association Team Award for its success.
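Deduplicating records across two systems is essentially a matching problem. The sketch below is a deliberate simplification with invented data, not the merge logic Ex Libris actually used: it matches bibliographic records on a normalized ISBN and keeps the first occurrence of each.

```python
# Illustrative only: the real merge matched on richer criteria than ISBN.
def normalize_isbn(isbn: str) -> str:
    """Reduce an ISBN to bare digits (and X) for comparison."""
    return "".join(ch for ch in isbn.upper() if ch.isdigit() or ch == "X")

ksu_records = [{"isbn": "978-0-13-468599-1", "title": "Deep Learning"}]
spsu_records = [{"isbn": "9780134685991", "title": "Deep learning."},
                {"isbn": "0-596-52068-9", "title": "Unix in a Nutshell"}]

merged, seen = [], set()
for record in ksu_records + spsu_records:
    key = normalize_isbn(record["isbn"])
    if key not in seen:          # first occurrence wins
        seen.add(key)
        merged.append(record)

print(len(merged))  # 2 -- the duplicate SPSU record was dropped
```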
Methodology

The methodologies adopted in previous research include interviews and surveys. Both methods require collecting feedback from stakeholders during the post-project period, and it can be challenging to reach project stakeholders once the project is completed. However, the many written communications produced, including project documentation, emails, and reports, are invaluable data for project managers to assess project success.

Researchers have utilized software to assist content analysis in qualitative studies. Hoover and Koerber used NVivo to analyze data such as text, interview transcripts, photographs, and audio and video recordings by coding and retrieving to understand sophisticated relations among those data.14 Researchers have observed that computer-assisted qualitative data analysis (CAQDAS) has created new research practices and helped data analysis, research management, and theory development, to the point that CAQDAS has become a synonym for qualitative research.15 Balan’s team manually coded and categorized the dimensions identified in concept analysis, then employed concept mapping to present data relationships, an integration of qualitative and quantitative methods.16 The word tag cloud in NVivo is a technique for assessing the relevance of the gathered data to the research topic, while the treemap is a tool for extracting new themes, along with their relationships, from the study data.17 Hutchison et al. believed that CAQDAS could facilitate grounded theory investigation. The group utilized memos in NVivo to monitor emerging trends and justify the research purpose and theoretical sampling procedures. They also used the model-building tool to visualize their analytical observations.18 A study on content analysis of news articles indicated NVivo could assist qualitative research through data organization, idea management, querying data, and modeling. The research group also raised a concern about analytical reliability, because qualitative data analysis is a highly interpretive method; therefore, they suggested utilizing double coding and comparison of codes by different researchers to resolve this problem.19 Paulus’s team suggested researchers should write a description of the software to allow audiences unfamiliar with the tool not only to appreciate its role in the study, but also to understand how precisely the software enhances the potential of their analyses.20

In this case study, the author adopted NVivo 11 to conduct a content analysis to test the proposed model by measuring IS project success, a qualitative method for practitioners to assess a project with textual data in the post-project period.

Data Collection

The data gathered in this study include the email communications between the project manager and stakeholders, the reports of the University Consolidation Operational Work Group (OWG), and project committee reports. After reviewing all document data to ensure their relevancy to the research topic, the author imported 878 emails, twenty-five OWG reports, and sixty-three project committee reports into NVivo 11.

Content Analysis Process

The Software—NVivo 11. NVivo 11 is a software package that allows researchers to collaborate on and conduct qualitative studies.
Researchers can import various types of raw data, including social media, into NVivo 11 to store, manage, and share throughout the research process. However, initially learning and mastering the software can pose a difficult hurdle for researchers performing software-assisted qualitative research.

Data Preparation and Import. NVivo 11 can process documents (MS Word, PDF, or RTF), surveys, audio, video, and images. Researchers may import Outlook emails saved as .msg files into NVivo 11 directly. Note that emails imported into NVivo become PDFs, and any supported attachments are imported as well. In this study, the OWG and committee reports, in either MS Word or PDF, were imported into NVivo directly. To ensure the email content was relevant to the project, the author opened NVivo 11 and Outlook 2010 simultaneously and dragged each email into the Sources List View of NVivo 11 (see figure 3) after reviewing it.

Figure 3. Sources List View in NVivo 11.

Coding. Coding is a way of categorizing all references to a specific topic, theme, person, or other entity. The process of coding can help researchers identify patterns and theories in research data.21 In this study, the author adopted coding using queries to answer the following research questions:

• What is IS project success?
• What are the factors that affect IS project success?
• How do these factors influence IS project success?

Below are the steps of coding the source data (a rough scripted analogue of the word frequency query is sketched after figure 4):

• Run the word frequency query over all data sets using the criteria of the one hundred most frequent words with a minimum five-character length, including exact matches, stemmed words, and synonyms.
• Review the word list, remove irrelevant words, and re-run the query until the words are accurate and relevant to the research topic.
• Create the parent nodes (e.g., contextual factors, project management process, project outcomes) and child nodes (e.g., top management support, manager’s qualifications, project involvement) based on the proposed model, and then save the word frequency results in the respective nodes (see the coding in figure 4).
• Run the word frequency query with the same criteria within the context nodes (within each parent node).
• Review the word frequency results and save each new word as a new node.
• Review all node references and sources, merge relevant nodes, and remove irrelevant ones as needed.

Figure 4. Coding Using Queries.
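For readers without NVivo, the frequency query in the first step can be roughly approximated in a few lines of Python. Unlike NVivo’s query, this sketch (with invented snippets standing in for the imported corpus) handles neither stemming nor synonyms; it only counts exact words of a minimum length.

```python
import re
from collections import Counter

# Invented snippets standing in for the imported email corpus.
documents = [
    "Patron barcodes are missing from the circulation load.",
    "Vendor invoice lines must match the acquisition order.",
    "Circulation policy: patron fines carry over after the merge.",
]

def word_frequency(texts, min_length=5, top=100):
    """Count words of at least min_length characters, most frequent first."""
    words = []
    for text in texts:
        words += [w for w in re.findall(r"[a-z]+", text.lower())
                  if len(w) >= min_length]
    return Counter(words).most_common(top)

print(word_frequency(documents))
# [('patron', 2), ('circulation', 2), ...]
```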
FINDINGS AND DISCUSSION

An Overview of Content Analysis

Previous studies have shown that visualization tools provided by NVivo, such as models, charts, and treemaps, can be helpful for presenting the findings of qualitative studies.22 Therefore, the author used the model tool to gain a better understanding and overview of key themes in the ILS merger project. Since the number of emails is much larger than the number of reports, the author decided to display the themes of emails and reports separately. Figures 5 and 6 are the word treemaps for the emails and reports, respectively.

Figure 5. Email Tree Map.

Figure 6. Report Tree Map.

The treemap is a visualization of the results of word frequency queries. In figure 5, the concepts of patron, barcode, missing, fines, charge, circulation, and policy relate to library user transactional data, while order, vendor, complete, Wilson, lines, Taylor, and holding reflect procurement information. The themes of production, mapping, duplicate, matching, location, cataloging, and process stand for library resource data. Hence, the acquisition, bibliographic, patron, and transactional data are the primary content migrated to the new ILS. The names mentioned, such as Russell, Adriana Meryll, Trotter, and David, reveal the involvement of system and service providers and top management.

Figure 6 displays more details on library resource data, such as serials, codes, bibliographic, ebook, format, journal, and print. The user transactional data also appear. The subjects of production, implement, identify, training, mapping, match, finish, matrix, plan, procedure, campus, and urgent indicate the project management process. The term “accepted,” in contrast, shows one of the project outcomes. The treemaps shown above demonstrate that the project management process, the involvement of users and system providers, top management, and project outcomes are the main representations of project success, which implies that project success means succeeding in the project management process and project outcomes while engaging top management, system users, and providers. How do these factors come together to impact project success? The next step is to examine the relationships among these variables and their interactions.

Relationships among Constructs

To further analyze the concepts of contextual factors, project management process, and project outcomes, the author utilized the model tool to create project maps. Project maps are graphic representations of the data in a project, which help illustrate the relationships among constructs and answer the research questions of this study. The author inspected each construct node by creating project maps.

Figure 7. Project Management Process Map.

Figure 7 shows the relationships among the variables that affect the project management process. The child nodes of communication, project cost, quality, risk, time, and scope are the influencers of the project management process. Their respective child nodes, such as barcodes, missing, and deadline, are the results of coding the source data, and they support how the concepts of communication, cost, quality, risk, scope, and time each affect the project management process.

Figure 8. Contextual Factor Map.

Contextual factors have not been thoroughly discussed in previous project management practice. Figure 8 illustrates the results of coding the source data within this construct. The engagement of users and vendors and their feedback signify the variable of project involvement. The node of top management likewise confirms its parent node of top management support. Furthermore, Jin, as the project manager, is associated with the node of project manager’s qualifications. She could affect project success either directly or indirectly through contextual factors.

Figure 9. Project Outcomes Map.
Thus, business continuity, delivery of benefits, and project deliverables are the core factors to be considered when assessing project outcomes. Figures 7, 8, and 9 have demonstrated that the project would not be successful if the project management process was not executed appropriately, context factors MEASURING IS PROJECT SUCCESS | GUO 66 https://doi.org/10.6017/ital.v38i1.10603 were not fully met, or preferred project outcomes were not delivered. In other words, if one of three above project variables is not executed or delivered appropriately, the project could fail. The Role of Project Manager Although figures 7, 8, and 9 have signified the three constructs can affect project success, but do not tell how project management process, project outcomes, and contextual factors play together in this role. Consequently, the author hoped to identify the connections between project items and to see if there are gaps or isolated items unexplained by the proposed model. To create such project map in NVivo 11, the author chose emails as project items and added the issues associated with the project manager Jin to the map. Figure 10. Manager’s Project Map. This case study is to test the proposed model in a post-project assessment. The Manager’s Project Map in figure 10 has well self-explained this purpose. The project manager Jin led the project to success by influencing project management process, project outcomes, and contextual factors. The project success in this case includes the contribution to the consolidation of two state universities and maximization of library resources for the organization. The outcomes of the merger project are to deliver a consolidated ILS and to provide library services for the new university continuously. Figure 10 clearly indicates Jin managed business continuity and project deliverables INFORMATION TECHNOLOGY AND LIBRARIES | MARCH 2019 67 through downtime and load acceptance. Among contextual factors, the project manager executed project involvement through engaging system users and vendors and gathering user feedback. She also involved top management David in the project directly. Senior management empowered Jin to make decisions on the project. As a manager her qualifications enabled her to cope with the complexity of the project. The project documentation has verified the manager’s ability to govern the project. For instance, figure 11 is the project framework that the manager created according to the PMBOK (Project Management Body of Knowledge). Hence, a qualified project manager can directly make impacts on project success through contextual factors. Figure 11. KSU Library ILS Merger Project Management Framework. Meanwhile, the nodes of barcode, mappings, missing, patrons, and vendors confirm the manager’s role in project quality control. The coding of the deadline, cost-consolidation, communication, and risk control indicates the manager put her effort in project time, cost, and communication management and risk mitigation correspondingly. Figure 10 reveals the project manager is the core of the project team and makes significant impacts on project success by influencing project management process, contextual factors and project outcomes. 
A project manager must fully understand project outputs, have the ability to execute project plans in the business environment, and communicate with different stakeholders at the corresponding levels through various channels, since communication becomes challenging when a project involves more people from different sections of the business. People decode messages differently, and multiple communication chains can help stakeholders gain consistent and accurate information directly. For example, this project manager utilized formal reports, group discussions, training, and weekly coordination meetings to share information and seek feedback. The functional groups are the governance structure of the project. In the phase of test and production loads, the leaders of the functional groups communicated problems to the project manager more frequently to ensure that the manager resolved issues in collaboration with related stakeholders (e.g., Ex Libris) in a timely way. In the meantime, the project manager regularly communicated expectations to the responsible IT staff and verified the patron-data feeder during the test load, preventing additional waiting time when feeding the merged ILS with patron data and helping meet the deadlines of the campus IT projects. The manager mitigated risk by implementing the project plan thoughtfully throughout the project. It was the project manager who connected the three variables (project management process, project outcomes, and contextual factors) with project success.

CONCLUSIONS

Libraries have used the ILS to manage resources and services for decades. With the exponential growth of digital information, IS innovation has become one of the most effective drivers of library transformation. Therefore, it is crucial for libraries to manage IS/IT projects effectively to achieve organizational goals. This study develops a model of IS project success. The model employs three constructs, namely project management process, project outcomes, and contextual factors, to measure IS project success. Project management success cannot bring IS project success unless the project results achieve business goals and lead to improved organizational effectiveness. The project manager makes an important impact on project success by delivering project outcomes through implementing the project management process and making use of contextual factors throughout the project. The research methodology, software-assisted qualitative content analysis, can be an approach for library practitioners to develop or test a theoretical model. A post-project evaluation can create an excellent opportunity for organizational learning and can help managers manage talent better and improve the chances of project success in the future.

FUTURE RESEARCH

Libraries have moved into a new era full of new and disruptive technologies, which affect library services, operations, and decisions on a daily basis. IS projects will continue bringing innovations to library services and programs. A theoretical framework could provide librarians a methodology to manage IS projects successfully. Notably, the U.S. Senate has unanimously approved the Program Management Improvement and Accountability Act (PMIAA) to enhance project and program management practices and maximize efficiency in the federal government.23 Project management has become a must-have skill for today's library leaders.
There are many opportunities for managers to test the IS project success model through their practices. Future studies may combine quantitative and qualitative methods to assess and enhance the model further. Each institution has different goals and contextual indicators that the author has not mentioned in this study, and these factors might shift from minor to major or vice versa under different organizational cultures. Practitioners can also use NVivo to collaborate on double coding to increase analytical reliability. Software-assisted qualitative content analysis will help library leaders understand project management better and experiment with solutions to a complex information world.

ACKNOWLEDGEMENTS

This work would not have been possible without the support of the KSU Library System Administration and the team efforts of the KSU Voyager Consolidation Committee, GIL Support, and the Ex Libris team. I am grateful to all of those with whom I have had the privilege to work during this project.

REFERENCES

1 John Nicholas and Gezinus Hidding, "Management Principles Associated with IT Project Success," International Journal of Management and Information Systems 14, no. 5 (Nov. 2, 2010): 147-56, https://doi.org/10.19030/ijmis.v14i5.22.
2 Alan R. Peslak, "Information Technology Project Management and Project Success," International Journal of Information Technology Project Management 3, no. 3 (July 2012): 31-44, https://doi.org/10.4018/jitpm.2012070103.
3 Udechukwu Ojiako, Eric Johansen, and David Greenwood, "A Qualitative Re-construction of Project Measurement Criteria," Industrial Management & Data Systems 108, no. 3 (Mar. 2008): 405-17, https://doi.org/10.1108/02635570810858796.
4 Claudia-Georgeta Cârstea, "IT Project Management—Cost, Time and Quality," Economy Transdisciplinarity Cognition 17, no. 1 (Mar. 2014): 28-34, http://www.ugb.ro/etc/etc2014no1/07_Carstea_C..pdf.
5 Wen-Hsien Tsai et al., "An Empirical Investigation of the Impacts of Internal/External Facilitators on the Project Success of ERP: A Structural Equation Model," Decision Support Systems 50, no. 2 (Jan. 2011): 480-90, https://doi.org/10.1016/j.dss.2010.11.005.
6 Raymond Young and Ernest Jordan, "Top Management Support: Mantra or Necessity?" International Journal of Project Management 26, no. 7 (Oct. 2008): 713-25, https://doi.org/10.1016/j.ijproman.2008.06.001.
7 Z. Irani, "Investment Evaluation within Project Management: An Information Systems Perspective," The Journal of the Operational Research Society 61, no. 6 (June 2010): 917-28, https://doi.org/10.1057/jors.2010.10.
8 Peslak, "Information Technology Project Management and Project Success," 31-44.
9 Tadeau Oliveira de Lacerda, Leonardo Ensslin, and Sandra Rolim Ensslin, "A Performance Measurement View of IT Project Management," International Journal of Productivity and Performance Management 60, no. 2 (2011): 132-51, https://doi.org/10.1108/17410401111101476.
10 Nicholas and Hidding, "Management Principles," 153.
11 Graeme Thomas and Walter Fernández, "Success in IT Projects: A Matter of Definition?" International Journal of Project Management 26, no. 7 (Oct. 2008): 733-42, https://doi.org/10.1016/j.ijproman.2008.06.003.
12 Elmar Kutsch, "The Measurement of Performance in IT Projects," International Journal of Electronic Business 5, no. 4 (2007): 415, https://doi.org/10.1504/IJEB.2007.014786.
13 Joana G. Geraldi, Elmar Kutsch, and Neil Turner, "Towards a Conceptualisation of Quality in Information Technology Projects," International Journal of Project Management 29, no. 5 (July 2011): 557-67, https://doi.org/10.1016/j.ijproman.2010.06.004.
14 Ryan S. Hoover and Amy L. Koerber, "Using NVivo to Answer the Challenges of Qualitative Research in Professional Communication: Benefits and Best Practices Tutorial," IEEE Transactions on Professional Communication 54, no. 1 (Mar. 2011): 68-82, https://doi.org/10.1109/TPC.2009.2036896.
15 Erika Goble et al., "Habits of Mind and the Split-Mind Effect: When Computer-Assisted Qualitative Data Analysis Software Is Used in Phenomenological Research," Forum: Qualitative Social Research 13, no. 2 (May 2012): 1-22, https://doi.org/10.17169/fqs-13.2.1709.
16 Peter Balan et al., "Concept Mapping as a Methodical and Transparent Data Analysis Process," in Handbook of Qualitative Organizational Research (London: Routledge, 2015): 318-30, https://doi.org/10.4324/9781315849072.
17 Syed Zubair Haider and Muhammad Dilshad, "Higher Education and Global Development: A Cross Cultural Qualitative Study in Pakistan," Higher Education for the Future 2, no. 2 (July 2015): 175-93, https://doi.org/10.1177/2347631114558185.
18 Andrew John Hutchison, Lynne Halley Johnston, and Jeff David Breckon, "Using QSR-NVivo to Facilitate the Development of a Grounded Theory Project: An Account of a Worked Example," International Journal of Social Research Methodology 13, no. 4 (Oct. 2010): 283-302, https://doi.org/10.1080/13645570902996301.
19 Florian Kaefer, Juliet Roper, and Paresha Sinha, "A Software-Assisted Qualitative Content Analysis of News Articles: Example and Reflections," Forum: Qualitative Social Research 16, no. 2 (May 2015): 1-20, https://doi.org/10.17169/fqs-16.2.2123.
20 Trena Paulus et al., "The Discourse of QDAS: Reporting Practices of ATLAS.Ti and NVivo Users with Implications for Best Practices," International Journal of Social Research Methodology 20, no. 1 (Jan. 2017): 35-47, https://doi.org/10.1080/13645579.2015.1102454.
21 "About Coding," NVivo Help (Melbourne, Australia: QSR International, 2018), accessed Apr. 3, 2018, http://help-nv11.qsrinternational.com/desktop/concepts/about_coding.htm?rhsearch=coding&rhsyns=.
22 Paulus et al., "Discourse of QDAS," 41.
23 "U.S. Senate Unanimously Approves the Program Management Improvement and Accountability Act," Business Wire (Dec. 2016), accessed Nov. 10, 2017, http://www.businesswire.com/news/home/20161201006499/en/U.S.-Senate-Unanimously-Approves-Program-Management-Improvement.

10604 ---- The Open Access Citation Advantage: Does It Exist and What Does It Mean for Libraries? Colby Lewis

INFORMATION TECHNOLOGY AND LIBRARIES | SEPTEMBER 2018 50

Colby Lewis (colbyllewis@gmail.com), a second-year Master of Science in Information student at the University of Michigan School of Information, is winner of the 2018 LITA/Ex Libris Student Writing Award.

ABSTRACT

The last literature review of research on the existence of an open access citation advantage (OACA) was published in 2011 by Philip M. Davis and William H. Walters. This paper reexamines the conclusions reached by Davis and Walters by providing a critical review of OACA literature that has been published since 2011 and explores how increases in open access publication trends could serve as a leveraging tool for libraries against the high costs of journal subscriptions.
INTRODUCTION

Since 2001, when the term "open access" was first used in the context of scholarly literature, the debate over whether there is a citation advantage (CA) caused by making articles open access (OA) has plagued scholars and publishers alike.1 To date, there is still no conclusive answer to the question, or at least not one that the premier publishing companies have deemed worthy of acknowledging. There have been many empirical studies, but far fewer with randomized controls. The reasons for this range from issues of data access to the numerous potential "methodological pitfalls" or confounding variables that might skew the data in favor of one argument or another. The most recent literature review of articles that explored the existence (or lack thereof) of an open access citation advantage (OACA) was published in 2011 by Philip M. Davis and William H. Walters. In that review, Davis and Walters ultimately concluded that "while free access leads to greater readership, its overall impact on citations is still under investigation. The large access-citation effects found in many early studies appear to be artifacts of improper analysis and not the result of a causal relationship."2

This paper seeks to reexamine the conclusions reached by Davis and Walters in 2011 by providing a critical review of OACA literature that has been published since their 2011 literature review.3 This paper will examine the methods and conclusions that have provoked criticism of OACA studies and whether those criticisms are addressed in the studies themselves. I will begin by identifying some of the top confounders in OACA studies, in particular the potential for self-archiving bias. I will then examine articles from July 2011, when Davis and Walters published their findings, to July 2017. There will be a few exceptions to this time frame, but the studies cited in tables 1 and 2 are entirely from this period. In addition to reviewing OACA studies since Davis and Walters' March 2011 study, I will explore the implications of an OACA for the future of publishing and the role of librarians in the subscription process. As Antelman points out in her Association of College and Research Libraries conference paper, "Leveraging the Growth of Open Access in Library Collection Decision Making," it is the responsibility of libraries to use the newest data and technology available to them in the interest of best serving their patrons and advancing scholarship.4 In connecting OACA studies and the potential bargaining power an OACA could bring libraries, I assess the current roles that universities and university libraries play in promoting (or not) OA publications and the implications of an OACA for researchers, universities, and libraries, and I provide suggestions on how recent research could influence the present trajectory. I conclude by summarizing what my findings tell us about the existence (or lack thereof) of an OACA, and what these findings imply for the future of library journal subscriptions and the publish-or-perish model for tenure. Lastly, I will suggest some alternative metrics to citations that could be used by libraries in determining future journal subscriptions and general collection management.

SELF-ARCHIVING BIAS AND WHY IT DOESN'T MATTER

The idea of a self-archiving bias is based upon the concept that, if faced with a choice, authors will always opt to make their best work more widely available.
Effectively, when open access is not mandated, these articles may be specifically chosen to be made open access to increase readership and, hypothetically, citations.5 This biased selection method has the potential to confound the results of OACA studies because of the intuitive notion that an author's best work is much more likely to be cited than any of their other work. Its effect is amplified by making this work available OA, but it prevents studies in which articles were self-archived from being able to convincingly claim that the citation advantage these articles received was due to OA and not to their inherent quality and subsequent likelihood to be cited anyway. In a 2010 study, Gargouri et al. determined that articles by authors whose institutions mandated self-archiving (such as in an institutional repository [IR]) saw an OACA just as great for articles that were mandated to be OA as for articles that were self-selected to be OA.6 This by no means proves a causal relationship between OA and CA, but it does counter the notion that self-archived articles are an uncontrollable confounder that automatically compromises the legitimacy of OACA studies.7 Ottaviani affirms this conclusion in a 2016 study in which he writes, "In the long run better articles gain more citations than expected by being made OA, adding weight to the results reported by Gargouri et al."8 In short, claiming that articles self-selected for self-archiving irreparably confound OACA studies ignores the fact that these authors have accounted for the likelihood that articles of higher quality will inherently be cited more. As Gargouri et al. put it, "The OA advantage [to self-archived articles] is a quality advantage, rather than a quality bias" (italics in original).9

GOLD VERSUS GREEN AND THEIR EFFECT ON OACA ANALYSES

Many critics of OACA studies have argued that such studies do not distinguish between Gold OA, Green OA, and hybrid (subscription journals that offer the option for authors to opt in to Gold OA) journals in their sample pool, thus skewing the results of their studies. In fact, there are many acknowledged subcategories of OA, but for the purposes of this paper, I will primarily focus on Gold, Green, and hybrid OA. Figure 1, provided by Elsevier as a guide for their clients, distinguishes between Gold and Green OA.10 While the chart provided applies specifically to those looking to publish with Elsevier, it highlights the overarching differences between Gold OA and Green OA. A comprehensive list of OA journals is available through the Directory of Open Access Journals (DOAJ) website (https://doaj.org/).

Figure 1. Elsevier explains to potential clients their options for publishing OA with Elsevier and the differences between publishing with Gold OA versus Green OA.

The argument that not distinguishing between Gold OA and Green OA in OACA studies distorts study results primarily stems from the potential for skew in Green OA journals. Green OA journals allow authors to self-archive their articles after publication, but the articles are often not made fully OA until an embargo period has passed. This problem was addressed in a recent study conducted by Science-Metrix and 1science, who manually checked and coded approximately 8,100 top-level domains (TLDs).11 It is important to note that this study was made available as a white paper on the 1science website and has not been published in a peer-reviewed journal.
Additionally, 1science is a company built on providing OA solutions to libraries, which means they have a vested interest in proving the existence of an OACA. However, just as publishers such as Elsevier have a vested interest in a substantial OACA not existing, this should not prevent us from examining their data. For their study, 1science did not distinguish hybrid journals as a distinct journal category. Critics, such as the editorial director of journals policy for Oxford University Press, David Crotty, were quick to fixate on this lack of distinction as a means of discrediting the study.12 Employees of Elsevier were similarly inclined to criticize the study, declaring that it, "like many others [studies] on this topic, does not appear to be randomized and controlled."13 However, Archambault et al., acknowledging that their study "does not examine the overlap between green and gold," have provided an extremely comprehensive sample pool, examining 3,350,910 OA papers published between 2007 and 2009 in 12,000 journals.14 This paper examines the notion that "the advantage of OA is partly due to citations having a chance to arrive sooner . . . and concludes that the purported head start of OA papers is actually contrary to observed data."15

In a more recent study published in February 2018, Piwowar et al. examine the prevalence of OA and average relative citation (ARC) based on three sample groups of one hundred thousand articles each: "(1) all journal articles assigned a Crossref DOI, (2) recent journal articles indexed in Web of Science, and (3) articles viewed by users of Unpaywall, an open-source browser extension that lets users find OA articles using oaDOI."16 Unlike the 1science study, Piwowar et al. had a twofold purpose: to examine the prevalence of OA articles available on the web and whether an OACA exists based on their sample findings. I do not include their results in my literature review because of the dual focus of their study, although I do compare their results with those of Archambault et al. and analyze the implications of their findings.

BRONZE: NEITHER GOLD NOR GREEN

In their article, Piwowar et al. introduce a new category of OA publication: Bronze. If Gold OA refers to complete open access at the time of publication, and Green OA refers to articles published in a paywalled journal but ultimately made OA either after an embargo period or via an IR, Bronze OA refers to OA articles that somehow don't fit into either of these categories. Piwowar et al. define Bronze OA articles as "free to read on the publisher page, but without any clearly identifiable license."17 However, as Crotty points out in a Scholarly Kitchen article reflecting on the preprint version of Piwowar et al.'s article, "Bronze" already exists as an OA category, but has simply been called "public access."18 While coining "Bronze" as a new term for "public access" is helpful in connecting it to OA terms such as "Green" and "Gold," it is not quite the new phenomenon it is touted to be.

ARC AS AN INDICATION OF AN OACA

Both Archambault et al. and Piwowar et al. provide the ARC as a means of establishing a paper's impact on the larger research community.19 Within their ARC analyses, Archambault et al. distinguish between non-OA and OA, within which they differentiate between Gold and Green OA (figure 2). Piwowar et al. group papers by closed (non-OA) and OA, with the following OA subcategories: Bronze, hybrid, Gold, and Green OA (figure 3). An ARC of 1.0 is the expected number of citations an article will receive "based on documents published in the same year and [National Science Foundation (NSF)] specialty."20 Based on this standard, an article with an ARC above or below 1.0 has a citation impact that percentage above or below the expected citation impact of like articles. For example, an article with an ARC of 1.23 has received 23 percent more citations than expected for articles of similar content and quality. This scale can be incredibly useful in determining the presence of a citation advantage, and it can enable researchers to determine overall CA patterns.
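To make the arithmetic concrete, the following is a minimal Python sketch of one plausible reading of the ARC definition above: the expected citation count for an article is the mean citation count of its cohort (same publication year and NSF specialty), and ARC is the observed count divided by that expectation. The records here are invented for illustration; neither Archambault et al. nor Piwowar et al. published this code, and their actual normalization may differ in detail.

from collections import defaultdict

# Hypothetical records: (year, NSF specialty, citation count, access type).
articles = [
    (2012, "ecology", 14, "green"),
    (2012, "ecology", 6, "closed"),
    (2012, "ecology", 10, "gold"),
    (2013, "dentistry", 3, "closed"),
    (2013, "dentistry", 5, "hybrid"),
]

# Expected citations for a cohort = mean citations of all articles
# published in the same year and specialty.
cohorts = defaultdict(list)
for year, specialty, cites, _ in articles:
    cohorts[(year, specialty)].append(cites)
expected = {key: sum(vals) / len(vals) for key, vals in cohorts.items()}

# ARC = observed / expected; 1.0 means the article performed exactly as
# expected for its cohort, and 1.23 means 23 percent better.
def arc(year, specialty, cites):
    return cites / expected[(year, specialty)]

# Mean ARC by access type, the quantity reported in figures 2 and 3.
by_access = defaultdict(list)
for year, specialty, cites, access in articles:
    by_access[access].append(arc(year, specialty, cites))
for access, arcs in sorted(by_access.items()):
    print(f"{access}: mean ARC = {sum(arcs) / len(arcs):.2f}")

Computed this way, an access type whose mean ARC sits above 1.0 is outperforming its cohorts, which is the pattern both studies report for OA articles as a group.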
Figure 2. Research impact of paywalled (not OA) versus open access (OA) papers "computed by Science-Metrix and 1science using oaIndx and the Web of Science." Archambault et al., "Research Impact of Paywalled Versus Open Access Papers," white paper, Science-Metrix and 1science, 2016, http://www.1science.com/1numbr/.

Critics' fixation on the "randomized and controlled" nature of the 1science study ignores the fact that the authors do not claim causation. Rather, their findings suggest the existence of an OACA when comparing OA (in all forms) and non-OA (in any form) articles (see figure 2). The authors ultimately conclude that "in all these fields, fostering open access (without distinguishing between gold and green) is always a better research impact maximization strategy than relying on strictly paywalled papers."21 Unlike Archambault et al., Piwowar et al. found that Gold OA articles had a significantly lower ARC, and that the average ARC of all OA balances out to 1.18 because of the high ARCs of Bronze (1.22), hybrid (1.31), and Green (1.33). However, both studies found that non-OA (referred to by Piwowar et al. as "closed") articles had an ARC below 1.0, suggesting a definitive correlation between OA (without specifying type) and an increase in citations.

Figure 3. "Average relative citations of different access types of a random sample of World of Science (WoS) articles and review with a Digital Object Identifier (DOI) published between 2009 and 2015." Heather Piwowar et al., "The State of OA: A Large-Scale Analysis of the Prevalence and Impact of Open Access Articles," PeerJ, February 13, 2018, https://doi.org/10.7717/peerj.4375.

SIX YEARS AND WHAT HAS CHANGED IN OACA RESEARCH

Between July 2011 and the publication of Piwowar et al.'s work in February 2018, nine new OACA studies were published in peer-reviewed journals. Of these, five look at the OACA in only one field, such as cytology or dentistry. The other four are multidisciplinary studies, two of which are repository-specific and only use articles from Deep Blue and Academia.edu, respectively. This is important to note because of critics' earlier stated objections to the use of studies that are not randomized controlled studies. However, the Deep Blue study can still be considered a randomized controlled sample group because the authors are not self-selecting articles to upload to the repository as they are with Academia.edu.
Rather, articles were made accessible through Deep Blue "via blanket licensing agreements between the publishers and the [University of Michigan] library."22 Some of the field-specific studies use sample sizes that may not reflect a general OACA, but rather one only for that field, and in certain cases, only for a single journal.

FIELD-SPECIFIC STUDIES

Between July 2011 and July 2017, five field-specific studies were conducted to determine whether an OACA existed in those fields. I summarize the scope and conclusions of these studies in table 1. As the table shows, the article sample sizes varied vastly among studies, but that can likely be accounted for by considering the specific fields studied, since there are only five major cytopathology journals and nearly fifty major ecology journals. Piwowar et al. acknowledge this in their study, noting that the NSF assigns all science journals "exactly one 'discipline' (a high-level categorization) and exactly one 'specialty' (a finer-grained categorization)."23 The more deeply nested in an NSF discipline a subject is, the more specialized the field becomes and the fewer journals there are on the subject. This alone is reason not to extrapolate from the results of these studies and project their results onto the existence of an OACA across all fields.

Only two of these studies, those focused on an OACA in dentistry and ecology, can be considered truly randomized controlled studies. Both the cytopathology and marine ecology studies chose a specific set of journals from which to draw their entire sample pool. While the dentistry and ecology studies can be considered randomized controlled in nature, they still only reflect the occurrence (or lack thereof) of an OACA in those specific fields. It would be irresponsible to allow the results from studies in a single field of a single discipline to represent OACA trends across all disciplines. Therefore, it is surprising that Elsevier employees use the dentistry study to make such a claim. Hersh and Plume write, "Another recent study by Hua et al (2016) looking at citations of open access articles in dentistry found no evidence to suggest that open access articles receive significantly more citations than non-open access articles."24 The key phrase missing from the end of this analysis is in dentistry. One might question whether a claim about multidisciplinary OACA can effectively be extrapolated from a single-field analysis. The authors do, two sentences later, qualify their earlier statement by saying, "In dentistry at least, the type of article you publish seems to make a difference but not OA status."25 That is indeed what this study seems to show, and it is therefore a logical claim to make. Likewise, the three empirical studies in table 1 show that, for those respective fields, OA status does correlate to a citation advantage. In the case of the ecology study, the authors are confident enough in their randomized controlled methodology to claim causation.26 The ecology study is the most recently published OACA study, and its authors were able to learn from similar past studies about the necessary controls and potential confounders in OACA studies. With this knowledge, Tang et al. determined that:

By comparing OA and non-OA articles within hybrid journals, our estimate of the citation advantage of OA articles sets controls for many factors that could confound other comparisons.
Numerous studies have compared articles published in OA journals to those in non-OA journals, but such comparison between different journals could not rule out the impacts of potentially confounding factors such as publication time (speed) and quality and impact (rank) of the journal. These factors are effectively controlled with our focus on hybrid journals, thereby providing robust and general estimates of citation advantages on which to base publication decisions.27

SUMMARY OF KEY FIELD-SPECIFIC STUDIES

Clements 2017. Study design: Empirical. Content: 3 hybrid-OA marine ecology journals. Number of articles: all articles published in these journals between 2009 and 2012; specific number not provided. Controls: JIF; article type; self-citations. Results, interpretation, and conclusion: "On average, open access articles received more peer-citations than non-open access articles." OACA found.

Frisch et al. 2014. Study design: Empirical. Content: 5 cytopathology journals; 1 OA and 4 non-OA. Number of articles: 314 articles published between 2007 and 2011. Controls: JIF; author frequency; publisher neutrality. Results, interpretation, and conclusion: "Overall, the averages of both CPP and Q values were higher for OA Cytopathology Journal (CytoJournal) than traditional non-OA journals." OACA found.

Gaulé and Maystre 2011. Study design: Empirical. Content: 1 major biology journal. Number of articles: 4,388 articles published between 2004 and 2006. Controls: last author characteristics; article quality. Results, interpretation, and conclusion: "We find no evidence for a causal effect of open access on citations. However, a quantitatively small causal effect cannot be statistically ruled out." OACA not found.

Hua et al. 2016. Study design: Randomized controlled. Content: articles randomly selected from the PubMed database, not specific dentistry journals. Number of articles: 908 articles published in 2013. Controls: randomized article selection; exclusion of articles unrelated to dentistry; multi-database search to determine OA status. Results, interpretation, and conclusion: "In the present study, there was no evidence to support the existence of OA 'citation advantage', or the idea that OA increases the citation of citable articles." OACA not found.

Tang et al. 2017. Study design: Randomized controlled. Content: 46 hybrid-OA ecology journals. Number of articles: 3,534 articles published between 2009 and 2013. Controls: GNI of author country; randomized article pairing; article length. Results, interpretation, and conclusion: "Overall, OA articles received significantly more citations than non-OA articles, and the citation advantage averaged approximately one citation per article per year and increased cumulatively over time after publication." OACA found.

Table 1. Scope, Controls, and Results of Field-Specific OACA Studies Since 2011. Based on a chart in Stephan Mertens, "Open Access: Unlimited Web Based Literature Searching," Deutsches Ärzteblatt International 106, no. 43 (2009): 711. JIF, journal impact factor; CPP, citations per publication; Q, Q-value (see Frisch, Nora K., Romil Nathan, Yasin K. Ahmed, and Vinod B. Shidham, "Authors Attain Comparable or Slightly Higher Rates of Citation Publishing in an Open Access Journal (CytoJournal) Compared to Traditional Cytopathology Journals—A Five Year (2007–2011) Experience," CytoJournal 11, no. 10 (April 2014), https://doi.org/10.4103/1742-6413.131739, for the specific equation used).
SUMMARY OF KEY MULTIDISCIPLINARY STUDIES

McCabe and Snyder 2014. Study design: Empirical. Content: 100 journals in ecology, botany, and multidisciplinary science. Number of articles: all articles published in these journals between 1996 and 2005; specific number not provided. Controls: JIF; journal founding year. Results, interpretation, and conclusion: "We found that open access only provided a significant increase for those volumes made openly accessible via the narrow channel of their own websites rather than the broader PubMed Central platform." OACA found.

Niyazov et al. 2016. Study design: Empirical. Content: unspecified number of journals across 23 academic divisions. Number of articles: 31,216 articles published between 2009 and 2012. Controls: field; JIF; publication vs. upload date. Results, interpretation, and conclusion: "We find a substantial increase in citations associated with posting an article to Academia.edu. . . . We find that a typical article that is also posted to Academia.edu has 49% more citations than one that is only available elsewhere online through a non-Academia.edu venue." OACA found for Academia.edu.

Ottaviani 2016. Study design: Randomized controlled. Content: unspecified number of journals with blanket licensing agreements between the publishers and the University of Michigan Library. Number of articles: 93,745 articles published between 1990 and 2013. Controls: self-selection. Results, interpretation, and conclusion: "Even though effects found here are more modest than reported elsewhere, given the conservative treatments of the data and when viewed in conjunction with other OACA studies already done, the results lend support to the existence of a real, measurable, open access citation advantage with a lower bound of approximately 20%." OACA found.

Sotudeh et al. 2015. Study design: Empirical. Content: 633 APC-funded OA journals published by Springer and Elsevier. Number of articles: 995,508 articles published between 2007 and 2011. Controls: journals that adopted OA policies after 2007; journals with non–article-processing-charge OA policies. Results, interpretation, and conclusion: "The APC OA papers are, also, revealed to outperform the TA ones in their citation impacts in all the annual comparisons. This finding supports the previous results confirming the citation advantage of OA papers." OACA found.

Table 2. Scope, Controls, and Results of Multi-Disciplinary OACA Studies Since 2011. JIF, journal impact factor; APC, article processing charge; TA, toll access.

Based on the randomized controlled methodology that Tang et al. found hybrid journals to provide, it is possible that this study may serve as an ideal model for future larger OACA studies across multiple disciplines. However, more field-specific hybrid journal studies will have to be conducted before determining whether this model would be the most accurate method for measuring an OACA across multiple disciplines in a single study.

MULTIDISCIPLINARY STUDIES

The multidisciplinary OACA studies conducted since 2011 include a single randomized controlled study and three empirical studies (table 2). All these studies found an OACA; in the case of Niyazov et al., an OACA was found specifically for articles posted to Academia.edu. I included this study because it is an important contribution to the premise that a relationship exists between self-selection and OACA. Niyazov et al.
highlight this point in the section "Sources of Selection Bias in Academia.edu Citations," explaining that "even if Academia.edu users were not systematically different than non-users, there might be a systematic difference between the papers they choose to post and those they do not. As [many] . . . have hypothesized, users may be more likely to post their most promising, 'highest quality' articles to the site, and not post articles they believe will be of more limited interest."28 To underscore this point, I refer to Gargouri et al., who stated that "the OA advantage [to self-archived articles] is a quality advantage, rather than a quality bias" (italics in original).29 Again, it is unsurprising that articles of higher caliber are cited more and that making such articles more readily available increases the number of citations they would likely already receive. Similar to my conclusion in the field-specific study section, we simply need more randomized controlled studies, such as Ottaviani's, to determine the nature and extent of the relationship between OA and CA across multiple disciplines.

CONCLUSIONS

Critics of some of the most recent studies, specifically Archambault et al. and Ottaviani, have argued that authors of OACA studies are too quick to claim causation. While a claim of causation does indeed require strict adherence to statistical methodology and control of potential confounders, few of the authors I have examined actually claim causation. They recognize that the empirical nature of their studies is not enough to prove causation, but rather provides insight into the correlation between open access and a citation advantage. In all their conclusions, these authors acknowledge that further studies are needed to prove a causal relationship between OA and CA. The recent work published by Piwowar et al. provides a potential model for replication by other researchers, and Ottaviani offers a replicable method for other large research institutions with non-self-selecting institutional repositories. Alternatively, field-specific studies conducted in the style of Tang et al. across all fields would serve to provide a wider array of evidence for the occurrence of field-specific OACAs and therefore of a more widespread OACA.

Recent developments in OA search engines have created alternative routes to many of the same articles offered by subscriptions, but at a fraction (if any) of the cost. Antelman proposed that libraries use an OA-adjusted cost per download (OA-adj CPD), a metric that "subtracts the downloads that could be met by OA copies of articles within subscription journals," as a tool for negotiating the price of journal subscriptions.30 By calculating an OA-adj CPD, libraries could potentially leverage their ability to access journal articles through means other than traditional subscription bundles to save money and encourage OA publication. While Antelman suggests using OA-adj CPD as a leveraging tool when making deals with publishers for journal subscriptions, I suggest that libraries use the data-gathering methods of Piwowar et al. to determine whether enough articles from a specific journal can be found OA via Unpaywall. By using metrics such as those collected by Piwowar et al. through Unpaywall, the potential confounding variable of articles found through illegitimate means (such as SciHub) is alleviated. (A minimal sketch of how these two ideas combine appears below.)
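The following is a hedged Python sketch of that combination: Unpaywall's v2 REST endpoint is queried for each DOI in a journal's usage report (the endpoint asks callers to identify themselves with an email address and returns an is_oa flag), and Antelman's OA-adj CPD is then the subscription cost divided by only those downloads that could not have been met by an OA copy. The endpoint and the is_oa field are Unpaywall's documented API, but the DOIs, download counts, cost, and function names below are invented placeholders; this illustrates the arithmetic, not Antelman's or Piwowar et al.'s actual code.

import requests

def is_oa(doi: str, email: str) -> bool:
    # Ask Unpaywall whether any legal OA copy of this DOI is known.
    url = f"https://api.unpaywall.org/v2/{doi}"
    resp = requests.get(url, params={"email": email}, timeout=30)
    resp.raise_for_status()
    return bool(resp.json().get("is_oa"))

def oa_adjusted_cpd(cost: float, downloads: dict, oa_flags: dict) -> float:
    # OA-adj CPD: subtract downloads that an OA copy could have met,
    # then divide the subscription cost by the remaining downloads.
    paywalled = sum(n for doi, n in downloads.items() if not oa_flags.get(doi))
    return cost / paywalled

# Hypothetical one-journal usage report (placeholder DOIs and counts).
downloads = {"10.1000/example.1": 120, "10.1000/example.2": 80, "10.1000/example.3": 50}
# In practice the flags would come from the live API:
#   oa_flags = {doi: is_oa(doi, "you@library.example.edu") for doi in downloads}
oa_flags = {"10.1000/example.1": True, "10.1000/example.2": False, "10.1000/example.3": True}

print(f"OA-adj CPD: ${oa_adjusted_cpd(4000.0, downloads, oa_flags):.2f}")

Run against real COUNTER-style usage data, a journal whose OA-adj CPD climbs well above its nominal cost per download is one whose subscription a library can credibly renegotiate, which is exactly the leverage Antelman describes.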
Instead, Piwowar et al.'s metrics focus on tracking the percentage of material searched by library patrons that can be found OA through the Unpaywall browser extension. According to Unpaywall's "Libraries User Guide" page, libraries "can integrate Unpaywall into their SFX, 360 Link, or Primo link resolvers, so library users can read OA copies in cases where there's no subscription access. Over 1000 libraries worldwide are using this now."31 Ideally, scholars will also be more willing to publish papers OA, and institutions will be more supportive of providing the necessary costs for making publications OA. Though the publish-or-perish model still reigns in academia, there is great potential in encouraging tenured professors to publish OA by supplementing the costs through institutional grants and other incentives wrapped into a tenure agreement. Perhaps through this model, as Gargouri et al. have suggested, the longstanding publish-or-perish doctrine will give way to an era of "self-archive to flourish."32

BIBLIOGRAPHY

Antelman, Kristin. "Leveraging the Growth of Open Access in Library Collection Decision Making." ACRL 2017 Proceedings: At the Helm, Leading the Transformation, March 22–25, Baltimore, Maryland, ed. Dawn M. Mueller (Chicago: Association of College and Research Libraries, 2017), 411–22. http://www.ala.org/acrl/sites/ala.org.acrl/files/content/conferences/confsandpreconfs/2017/LeveragingtheGrowthofOpenAccess.pdf.
Archambault, Éric, Grégoire Côté, Brooke Struck, and Matthieu Voorons. "Research Impact of Paywalled Versus Open Access Papers." White paper, Science-Metrix and 1science, 2016. http://www.1science.com/1numbr/.
Calver, Michael C., and J. Stuart Bradley. "Patterns of Citations of Open Access and Non-Open Access Conservation Biology Journal Papers and Book Chapters." Conservation Biology 24, no. 3 (May 2010): 872-80. https://doi.org/10.1111/j.1523-1739.2010.01509.x.
Chua, S. K., Ahmad M. Qureshi, Vijay Krishnan, Dinker R. Pai, Laila B. Kamal, Sharmilla Gunasegaran, M. Z. Afzal, Lahri Ambawatta, J. Y. Gan, P. Y. Kew, et al. "The Impact Factor of an Open Access Journal Does Not Contribute to an Article's Citations" [version 1; referees: 2 approved]. F1000 Research 6 (2017): 208. https://doi.org/10.12688/f1000research.10892.1.
Clarivate Analytics. "InCites Journal Citation Reports." Dataset updated September 9, 2017. https://jcr.incites.thomsonreuters.com/.
Clements, Jeff C. "Open Access Articles Receive More Citations in Hybrid Marine Ecology Journals." FACETS 2 (January 2017): 1–14. https://doi.org/10.1139/facets-2016-0032.
Crotty, David. "Study Suggests Publisher Public Access Outpacing Open Access; Gold OA Decreases Citation Performance." Scholarly Kitchen, October 4, 2017. https://scholarlykitchen.sspnet.org/2017/10/04/study-suggests-publisher-public-access-outpacing-open-access-gold-oa-decreases-citation-performance/.
Crotty, David. "When Bad Science Wins, or 'I'll See It When I Believe It.'" Scholarly Kitchen, August 31, 2016. https://scholarlykitchen.sspnet.org/2016/08/31/when-bad-science-wins-or-ill-see-it-when-i-believe-it/.
Davis, Philip M. "Open Access, Readership, Citations: A Randomized Controlled Trial of Scientific Journal Publishing." FASEB Journal 25, no. 7 (July 2011): 2129–34. https://doi.org/10.1096/fj.11-183988.
Davis, Philip M., and William H. Walters. "The Impact of Free Access to the Scientific Literature: A Review of Recent Research." Journal of the Medical Library Association 99, no. 3 (July 2011): 208–17. https://doi.org/10.3163/1536-5050.99.3.008.
Elsevier. "Your Guide to Publishing Open Access with Elsevier." Amsterdam, Netherlands: Elsevier, 2015. https://www.elsevier.com/__data/assets/pdf_file/0020/181433/openaccessbooklet_May.pdf.
Evans, James A., and Jacob Reimer. "Open Access and Global Participation in Science." Science 323, no. 5917 (February 2009): 1025. https://doi.org/10.1126/science.1154562.
Eysenbach, Gunther. "Citation Advantage of Open Access Articles." PLoS Biology 4, no. 5 (May 2006): e157. https://doi.org/10.1371/journal.pbio.0040157.
Fisher, Tim. "Top-Level Domain (TLD)." Lifewire, July 30, 2017. https://www.lifewire.com/top-level-domain-tld-2626029.
Frisch, Nora K., Romil Nathan, Yasin K. Ahmed, and Vinod B. Shidham. "Authors Attain Comparable or Slightly Higher Rates of Citation Publishing in an Open Access Journal (CytoJournal) Compared to Traditional Cytopathology Journals—A Five Year (2007–2011) Experience." CytoJournal 11, no. 10 (April 2014). https://doi.org/10.4103/1742-6413.131739.
Gargouri, Yassine, Chawki Hajjem, Vincent Larivière, Yves Gingras, Les Carr, Tim Brody, and Stevan Harnad. "Self-Selected or Mandated, Open Access Increases Citation Impact for Higher Quality Research." PLoS ONE 5, no. 10 (October 2010). https://doi.org/10.1371/journal.pone.0013636.
Gaulé, Patrick, and Nicolas Maystre. "Getting Cited: Does Open Access Help?" Research Policy 40, no. 10 (December 2011): 1332–38. https://doi.org/10.1016/j.respol.2011.05.025.
Hajjem, Chawki, Stevan Harnad, and Yves Gingras. "Ten-Year Cross-Disciplinary Comparison of the Growth of Open Access and How it Increases Research Citation Impact." IEEE Data Engineering Bulletin 28, no. 4 (December 2005): 39-46.
Hall, Martin. "Green or Gold? Open Access After Finch." Insights 25, no. 3 (November 2012): 235–40. https://doi.org/10.1629/2048-7754.25.3.235.
Hersh, Gemma, and Andrew Plume. "Citation Metrics and Open Access: What Do We Know?" Elsevier Connect, September 14, 2016. https://www.elsevier.com/connect/citation-metrics-and-open-access-what-do-we-know.
Houghton, John, and Alma Swan. "Planting the Green Seeds for a Golden Harvest: Comments and Clarifications on 'Going for Gold.'" D-Lib Magazine 19, no. 1/2 (January/February 2013). https://doi.org/10.1045/january2013-houghton.
Hua, Fang, Heyuan Sun, Tanya Walsh, Helen Worthington, and Anne-Marie Glenny. "Open Access to Journal Articles in Dentistry: Prevalence and Citation." Journal of Dentistry 47 (April 2016): 41–48. https://doi.org/10.1016/j.jdent.2016.02.005.
Internet Corporation for Assigned Names and Numbers. "List of Top-Level Domains." Last updated September 13, 2018. https://www.icann.org/resources/pages/tlds-2012-02-25-en.
Jump, Paul. "Open Access Papers 'Gain More Traffic and Citations.'" Times Higher Education, July 30, 2014. https://www.timeshighereducation.com/home/open-access-papers-gain-more-traffic-and-citations/2014850.article.
McCabe, Mark J., and Christopher M. Snyder. "Identifying the Effect of Open Access on Citations Using a Panel of Science Journals." Economic Inquiry 52, no. 4 (October 2014): 1284–1300. https://doi.org/10.1111/ecin.12064.
McCabe, Mark J., and Christopher M. Snyder. "Does Online Availability Increase Citations? Theory and Evidence from a Panel of Economics and Business Journals." Review of Economics and Statistics 97, no. 1 (March 2015): 144–65. https://doi.org/10.1162/REST_a_00437.
Mertens, Stephan. "Open Access: Unlimited Web Based Literature Searching." Deutsches Ärzteblatt International 106, no. 43 (2009): 710–12. https://doi.org/10.3238/arztebl.2009.0710.
Moed, Hank. "Does Open Access Publishing Increase Citation or Download Rates?" Research Trends 28 (May 2012). https://www.researchtrends.com/issue28-may-2012/does-open-access-publishing-increase-citation-or-download-rates/.
Niyazov, Yuri, Carl Vogel, Richard Price, Ben Lund, David Judd, Adnan Akil, Michael Mortonson, Josh Schwartzman, and Max Shron. "Open Access Meets Discoverability: Citations to Articles Posted to Academia.edu." PLoS ONE 11, no. 2 (February 2016): e0148257. https://doi.org/10.1371/journal.pone.0148257.
Ottaviani, Jim. "The Post-Embargo Open Access Citation Advantage: It Exists (Probably), It's Modest (Usually), and the Rich Get Richer (of Course)." PLoS ONE 11, no. 8 (August 2016): e0159614. https://doi.org/10.1371/journal.pone.0159614.
Pinfield, Stephen, Jennifer Salter, and Peter A. Bath. "A 'Gold-Centric' Implementation of Open Access: Hybrid Journals, the 'Total Cost of Publication,' and Policy Development in the UK and Beyond." Journal of the Association for Information Science and Technology 68, no. 9 (September 2017): 2248–63. https://doi.org/10.1002/asi.23742.
Piwowar, Heather, Jason Priem, Vincent Larivière, Juan Pablo Alperin, Lisa Matthias, Bree Norlander, Ashley Farley, Jevin West, and Stefanie Haustein. "The State of OA: A Large-Scale Analysis of the Prevalence and Impact of Open Access Articles." PeerJ (February 13, 2018): 6:e4375. https://doi.org/10.7717/peerj.4375.
Research Information Network. "Nature Communications: Citation Analysis." Press release, 2014. https://www.nature.com/press_releases/ncomms-report2014.pdf.
Riera, M., and E. Aibar. "¿Favorece la publicación en abierto el impacto de los artículos científicos? Un estudio empírico en el ámbito de la medicina intensiva" [Does open access publishing increase the impact of scientific articles? An empirical study in the field of intensive care medicine]. Medicina Intensiva 37, no. 4 (May 2013): 232-40. http://doi.org/10.1016/j.medin.2012.04.002.
Sotudeh, Hajar, Zahra Ghasempour, and Maryam Yaghtin. "The Citation Advantage of Author-Pays Model: The Case of Springer and Elsevier OA Journals." Scientometrics 104 (June 2015): 581–608. https://doi.org/10.1007/s11192-015-1607-5.
Swan, Alma, and John Houghton. "Going for Gold? The Costs and Benefits of Gold Open Access for UK Research Institutions: Further Economic Modelling." Report to the UK Open Access Implementation Group, June 2012. http://wiki.lib.sun.ac.za/images/d/d3/Report-to-the-uk-open-access-implementation-group-final.pdf.
Tang, Min, James D. Bever, and Fei-Hai Yu. "Open Access Increases Citations of Papers in Ecology." Ecosphere 8, no. 7 (July 2017): 1–9. https://doi.org/10.1002/ecs2.1887.
Unpaywall. "Libraries User Guide." Accessed September 13, 2018. https://unpaywall.org/user-guides/libraries.
Wray, K. Brad. "No New Evidence for a Citation Benefit for Author-Pay Open Access Publications in the Social Sciences and Humanities." Scientometrics 106 (January 2016): 1031–35. https://doi.org/10.1007/s11192-016-1833-5.
ENDNOTES

1 Elsevier, "Your Guide to Publishing Open Access with Elsevier" (Amsterdam, Netherlands: Elsevier, 2015), 2, https://www.elsevier.com/__data/assets/pdf_file/0020/181433/openaccessbooklet_May.pdf.
2 Philip M. Davis and William H. Walters, "The Impact of Free Access to the Scientific Literature: A Review of Recent Research," Journal of the Medical Library Association 99, no. 3 (July 2011): 213, https://doi.org/10.3163/1536-5050.99.3.008.
3 Davis and Walters, "The Impact of Free Access," 208.
4 Kristin Antelman, "Leveraging the Growth of Open Access in Library Collection Decision Making," ACRL 2017 Proceedings: At the Helm, Leading the Transformation, March 22–25, Baltimore, Maryland, ed. Dawn M. Mueller (Chicago: Association of College and Research Libraries, 2017): 411, 413, http://www.ala.org/acrl/sites/ala.org.acrl/files/content/conferences/confsandpreconfs/2017/LeveragingtheGrowthofOpenAccess.pdf.
5 Research Information Network, "Nature Communications: Citation Analysis," press release, 2014, https://www.nature.com/press_releases/ncomms-report2014.pdf.
6 Gargouri et al., "Self-Selected or Mandated, Open Access Increases Citation Impact for Higher Quality Research," PLoS ONE 5, no. 10 (October 2010): 17, https://doi.org/10.1371/journal.pone.0013636.
7 David Crotty, "When Bad Science Wins, or 'I'll See It When I Believe It'," Scholarly Kitchen, August 31, 2016, https://scholarlykitchen.sspnet.org/2016/08/31/when-bad-science-wins-or-ill-see-it-when-i-believe-it/.
8 Jim Ottaviani, "The Post-Embargo Open Access Citation Advantage: It Exists (Probably), It's Modest (Usually), and the Rich Get Richer (of Course)," PLoS ONE 11, no. 8 (August 2016): 9, https://doi.org/10.1371/journal.pone.0159614.
9 Gargouri et al., "Self-Selected or Mandated," 18.
10 Elsevier, "Your Guide to Publishing," 2.
11 Top-Level Domain (TLD) refers to the last string of letters in an internet domain name (e.g., the TLD of www.google.com is .com). For more information on TLDs, see Tim Fisher, "Top-Level Domain (TLD)," Lifewire, July 30, 2017, https://www.lifewire.com/top-level-domain-tld-2626029. For a full list of TLDs, see "List of Top-Level Domains," Internet Corporation for Assigned Names and Numbers, last updated September 13, 2018, https://www.icann.org/resources/pages/tlds-2012-02-25-en.
12 Crotty, "When Bad Science Wins."
13 Hersh and Plume, "Citation Metrics and Open Access: What Do We Know?," Elsevier Connect, September 14, 2016, https://www.elsevier.com/connect/citation-metrics-and-open-access-what-do-we-know.
14 Archambault et al., "Research Impact of Paywalled Versus Open Access Papers," white paper, Science-Metrix and 1science, 2016, http://www.1science.com/1numbr/.
15 Archambault et al., "Research Impact."
16 Heather Piwowar et al., "The State of OA: A Large-Scale Analysis of the Prevalence and Impact of Open Access Articles," PeerJ, February 13, 2018, https://doi.org/10.7717/peerj.4375.
17 Piwowar et al., "The State of OA," 5.
18 David Crotty, "Study Suggests Publisher Public Access Outpacing Open Access; Gold OA Decreases Citation Performance," Scholarly Kitchen, October 4, 2017, https://scholarlykitchen.sspnet.org/2017/10/04/study-suggests-publisher-public-access-outpacing-open-access-gold-oa-decreases-citation-performance/.
19 Archambault et al., "Research Impact"; Piwowar et al., "The State of OA," 15.
20 Piwowar et al., "The State of OA," 9–10.
21 Archambault et al., "Research Impact."
22 Ottaviani, "The Post-Embargo Open Access Citation Advantage," 2.
23 Piwowar et al., "The State of OA," 9.
24 Hersh and Plume, "Citation Metrics and Open Access."
25 Hersh and Plume, "Citation Metrics and Open Access."
26 Tang et al., "Open Access Increases Citations of Papers in Ecology," Ecosphere 8, no. 7 (July 2017): 8, https://doi.org/10.1002/ecs2.1887.
27 Tang et al., "Open Access Increases Citations," 7. Tang et al. list the following as examples of the "numerous studies" quoted above, which I did not include in the quote for the purpose of brevity: (Antelman 2004, Hajjem et al. 2005, Eysenbach 2006, Evans and Reimer 2009, Calver and Bradley 2010, Riera and Aibar 2013, Clements 2017).
28 Yuri Niyazov et al., "Open Access Meets Discoverability: Citations to Articles Posted to Academia.edu," PLoS ONE 11, no. 2 (February 2016): e0148257, https://doi.org/10.1371/journal.pone.0148257.
29 Gargouri et al., "Self-Selected or Mandated," 18.
30 Antelman, "Leveraging the Growth," 414.
31 "Libraries User Guide," Unpaywall, accessed September 13, 2018, https://unpaywall.org/user-guides/libraries.
32 Gargouri et al., "Self-Selected or Mandated," 20.
10702 ---- The “Black Box”: How Students Use a Single Search Box to Search for Music Materials

Kirstin Dougan

INFORMATION TECHNOLOGY AND LIBRARIES | DECEMBER 2018
https://doi.org/10.6017/ital.v37i4.10702

Kirstin Dougan (dougan@illinois.edu) is Head, Music and Performing Arts Library, University of Illinois.

ABSTRACT

Given the inherent challenges music materials present to systems and searchers (formats, title forms and languages, and the presence of additional metadata such as work numbers and keys), it is reasonable that those searching for music develop distinctive search habits compared to patrons in other subject areas. This study uses transaction log analysis of the music and performing arts module of a library’s federated discovery tool to determine how patrons search for music materials. It also makes a top-level comparison of searches done using other broadly defined subject disciplines’ modules in the same discovery tool. It seeks to determine, to the extent possible, whether users in each group have different search behaviors in this search environment. The study also looks more closely at searches in the music module to identify other search characteristics such as type of search conducted, use of advanced search techniques, and any other patterns of search behavior.

INTRODUCTION

Music materials have inherent qualities that present difficulties to the library systems that describe them and to the searchers who wish to find them. This can be exemplified in three main areas: formats, titles, and relationships. First, printed music comes in multiple formats, such as full scores, vocal scores, study scores, and parts, and in multiple editions, such as facsimiles, scholarly editions, and performing editions (of varying caliber), with each format and edition serving a different purpose or need. Related to this, but less problematic, is the variety of sound recording formats available. Second, issues resulting from titling practices abound in music, ranging from frequent use of foreign terms, not just in descriptive titles (L’oiseau de feu = Zhar-ptitsa = The firebird = Feuervogel) but also in generic titles as translated by various publishers from different countries (symphony = sinfonie). Additionally, musical works often have only generic genre titles enhanced by key and work-number metadata, for example, Symphony no. 1 in C minor. Third, music materials present a relationship issue best described as “one-to-many”: musical works often have multiple sections or songs within them (an aria in an opera or a movement in a symphony), and a CD or a score anthology may contain multiple pieces of music. Given these three main challenges presented by music materials, it is possible that those searching for music develop distinctive search habits compared to patrons in other subject areas. This study uses transaction log analysis of the music and performing arts module of a library’s federated discovery tool to determine how patrons search for music materials.
It also makes a top-level comparison of searches done using other broadly defined subject disciplines’ modules in the same discovery tool. It seeks to determine, to the extent possible, whether users in each group have different search behaviors in this search environment. The study also looks more closely at searches in the music module to identify other search characteristics such as type of search conducted, use of advanced search techniques, and any other patterns of search behavior.

BACKGROUND

Since fall 2007 the University of Illinois Library has had Easy Search (ES), a locally developed search tool designed to aid users in finding results from multiple catalog, A&I, and reference targets quickly and simultaneously. There is a “main” ES on the library’s main gateway page that searches a variety of cross-disciplinary tools (see figure 1).

Figure 1. Gateway Easy Search.

On the gateway, users have the option of selecting one of the format tabs to narrow their search to books, articles, journals, or media. When the data for this study was gathered, the journals tab was not present. Starting in 2010, many of the subject and branch libraries in the University Library created their own ES modules with target resources specific to the disciplinary areas they serve. Search boxes for these ES subject modules are often displayed right on the branch library’s home page, but users can also select these subject module options from the dropdown in the main ES (see figure 2).

Figure 2. Gateway dropdown subject choices.

The MPAL ES interface as it appears on the MPAL homepage can be seen in figure 3; it was created in 2011.

Figure 3. MPAL Easy Search interface.

ES is a federated search tool and does not have a central index like most current discovery layer tools. Rather, it utilizes broadcast search technologies to target different tools and search them directly. While the Gateway ES now uses a “bento box” layout to display selected citations from each target, in the first iterations of the tool, and still today in the subject modules, users are simply presented with a list of hit counts in each of the target tools (see figures 4 and 5).

Figure 4. MPAL Easy Search display screen, part 1.

Figure 5. MPAL Easy Search results screen, part 2.

Not shown in the screen captures are the results from various newspaper indexes and reference sources such as Oxford Music Online and the International Encyclopedia of Dance.
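To make the broadcast model concrete, the following is a minimal Python sketch of that pattern. Everything in it is an illustrative assumption: the target names, the stub hit counts, and the coroutine structure are placeholders, not ES’s actual implementation, which is not described in this paper.

import asyncio

async def search_target(target, query):
    """Stand-in for one broadcast request; a real implementation would call
    the target's search API and parse its hit count from the response."""
    await asyncio.sleep(0.1)  # simulated network latency
    return target, 42         # placeholder hit count

async def broadcast(query, targets):
    """Send the same query to every target concurrently and collect per-target
    hit counts, mirroring the subject modules' list-of-hit-counts display."""
    results = await asyncio.gather(*(search_target(t, query) for t in targets))
    return dict(results)

print(asyncio.run(broadcast("ligeti requiem", ["Library catalog", "RILM", "Naxos"])))

The design point the sketch illustrates is that a broadcast tool has no index of its own: freshness comes free, but response time is bounded by the slowest target, which is one reason hit counts rather than merged result lists are shown.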
LITERATURE REVIEW

Many studies have examined patron search behavior using transaction log analysis and other methods over the past few decades. Since the appearance of Google in 1998, and its vast impact on individuals’ expectations and search behavior, recent studies have looked at user search behavior in tools that initially present a single search box. Additional studies have looked at discipline-specific searching behaviors.

General Search Studies and Single Search Boxes

Many advantages and disadvantages have been ascribed to tools with single search boxes (whether federated search tools or discovery layers), namely ease and convenience on the one hand, and the lack of precision possible in searching and an overwhelming number of results on the other. Two companion articles, Boyd et al. and Dyas-Correia et al., written ten years apart, attempted to visit and revisit these issues.1 Results and patron satisfaction can vary based on the size of the library and the number of resources accessed by these tools. These types of tools will never be able to search and display everything, and that problem is magnified by the number of resources a library has.

Holman, Porter, and Zimerman discovered in independent studies that undergraduates do not search very efficiently or effectively and find library tools difficult to use.2 Avery and Tracy also found this true for the ES tool under discussion in this study:

The generation of keywords by many students indicates they often struggled to identify alternative terminology that may have resulted in a more successful search. . . . Many students exhibited persistence in their searching, but the selection of search terms, sometimes compounded by spelling problems or problems in search string structure, likely did not yield the most relevant results.3

Asher, Duke, and Wilson state in their study comparing student search strategies and success across a variety of library search tools and Google that there were “strong patterns in the way students approached searches no matter which tool they used. . . . Students treated almost every search box like a Google search box, using simple keyword searches in 81.5 percent (679/829) of the searches observed.”4 Dempsey and Valenti note students’ infrequent use of limiters such as “peer-review” and “date” in EDS, the high non-use and misuse rates of quotation marks, relatively low instances of repeated uncorrected spelling errors, and variance patterns in keyword usage.5

Students like federated search tools and discovery layers because of their convenience and ease, as found in studies by Armstrong, Belliston, and Williams et al.6 This is reiterated in Asher et al.: “Despite the fact that they did not necessarily perform better on the research tasks in this study, students did prefer Google and Google-like discovery tools because they felt that they could get to full-text resources more quickly.”7 This one-box approach could hinder students, as described by Swanson and Green:

The search box became an obstacle in other questions where it should not have been used. In some cases, the search box was viewed as an all-encompassing search of the entire site. Several students searched for administrative information, research guides, and podcasts in this box.8

Lown et al. also found that users hope to access a vast range of information via a single search box: “One lesson is that library search is about more than articles and the catalog. About 23 percent of use of QuickSearch took place outside either the catalog or articles modules, indicating that NCSU Library users attempt to access a wide range of information from the single search box.”9

Search and Library Use in Different Disciplines

In their study comparing a discovery layer and subject-specific tools, Dahlen and Hanson found “subject-specific indexing and abstracting databases still play an important role for libraries that have adopted discovery layers.
Discovery layers and subject-specific indexing and abstracting databases have different strengths and can complement each other within the suite of library resources.”10 They also observed things iterated by previous authors, chiefly that “not all students prefer discovery tools” and “the tools that students prefer may not be those that give them the best results.”11 In addition, they found that default configuration matters in terms of students’ success in and preference for a given tool. Fu and Thomes found that creating smaller discipline-specific subsets in discovery tools was beneficial to searchers by reducing the number of results and increasing the results’ relevance.12

Few studies investigate how music students search for music materials. Dougan found in her observational study of music students’ search behaviors that they have difficulty forming good searches, misuse quotation marks and other search elements, and at times struggle with finding music materials.13 Mayer noted upper-class music students’ frustration with using library tools to find specific works of music, going so far as to state, “the music students agreed that both the discovery layer and the catalog are not effective for music-related searching, for any format.”14 Clark and Yeager found that students had an easier time searching for media items than music scores and frequently struggled with search strategy revisions.15

There is more research on the larger information needs of disciplines and on creating models for research behavior than on specific search processes or constructions.16 Whitmire, in her 2002 pre-Google article, found that students majoring in the social sciences were engaged in information-seeking behaviors at a higher rate than students majoring in engineering.17 Chrzastowski and Joseph surveyed graduate students at the University of Illinois at Urbana–Champaign and found that those in the life sciences, physical sciences, and engineering visited the libraries less often than students in other academic disciplines.18 Students in the arts and humanities used the library more often than students in other disciplines. Although Collins and Stone report that in prior studies of users across different disciplines arts and humanities users did not account for the biggest users of library materials, their survey found the opposite to be true.19 When looking at the various student populations in their study, musicians had the highest library usage in terms of items borrowed and almost the highest number of library visits. Music users in the study also showed high numbers of hours logged into the library e-resources and the highest number of e-resources accessed compared to others in their discipline group (but not as much as other disciplines). However, they showed a low number of PDF downloads and a low number of e-resources accessed frequently.

METHODOLOGY

This study conducted quantitative analysis of Easy Search (ES) data as a whole and from a selection of the subject modules, including the Music and Performing Arts Library (MPAL) module, using data from the period June 20, 2014, through June 16, 2015. Additional quantitative and qualitative analysis was conducted only on the MPAL ES transaction log data.
Data from the following subject modules were included in comparative analyses:

• Funk Agricultural, Consumer and Environmental Sciences Library (ACES) (http://www.library.illinois.edu/funkaces/)
• Grainger Engineering Library (http://search.grainger.illinois.edu/top/)
• History, Philosophy, and Newspaper Library (HPNL) (http://www.library.illinois.edu/hpnl/)
• Music and Performing Arts Library (MPAL) (http://www.library.illinois.edu/mpal/)
• Social Science, Health, and Education Library (SSHEL) (http://www.library.illinois.edu/sshel/)
• Undergraduate Library (UGL) (http://www.library.illinois.edu/ugl/)

Each of these libraries has a search box for ES on its home page that is customized to the search targets identified as best for those subject areas by the subject librarians in that library. Transaction log data on searches done in ES is continuously compiled in a SQL database, and queries were written to determine certain quantitative measures. Searches done in these various subject modules were isolated by a variable in the SQL data that indicates whether the search was done in the main Gateway ES, in the main Gateway ES but using one of the subject dropdown choices, or from the subject ES box directly on that library’s homepage. Searches in the six subject modules listed above and in the main ES were assessed for the average number of searches per session and the average words per search.

Further analysis of searches done in the MPAL ES module used 25,503 sessions conducted on MPAL public computers from March 21, 2014, to June 21, 2015, which is a slightly longer time span than used for the comparative analysis between subject ES modules described above. To make this more manageable, only every tenth session was considered, meaning 2,550 sessions were analyzed out of the full set of MPAL data. Searches were sorted by session ID number, which is assigned to each session when a new session is begun. This method kept all strings from one session together, whereas simply sorting by date and string ID did not, since multiple sessions can occur simultaneously. A session is a series of user actions (searches and click-throughs) from the same workstation in which there is less than a twenty-minute pause between actions. If there are user actions from the same workstation after a twenty-minute pause, a new session is established; therefore, there is the possibility that some of the sequential sessions were from the same user, but there is no easy way to determine that.

The MPAL data set was assessed using the following quantitative measures (a rough sketch of how several of these can be computed appears after the list):

1) Average number of searches per session and whether the session contained
   a) a single search
   b) multiple searches for the same thing (either repeated exactly or varied)
   c) multiple strings searching for multiple things
2) Average number of search terms per search
3) Type of search by index (title/author/keyword) or other advanced search
4) Use of Boolean operators, quotation marks, parentheses, etc.
5) Use of work or opus numbers or key indications
6) Search indicating format (score, CD, etc.)
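As a rough illustration of how several of these measures can be computed from a transaction log, the Python sketch below assumes a hypothetical CSV export with session_id and query columns; the study’s actual SQL schema and queries are not documented in this paper, and the regular expressions are only approximations of the features described above.

import csv
import re
from collections import defaultdict

BOOLEAN = re.compile(r"\b(AND|OR|NOT)\b")  # uppercase only, to catch deliberate use
WORK_NUM = re.compile(r"\b(op\.?\s*\d+|k\.?\s*\d+|bwv\s*\d+|no\.?\s*\d+)", re.IGNORECASE)
KEY_IND = re.compile(r"\bin\s+[a-g][\s-]?(sharp|flat|#|b)?\s*(major|minor)\b", re.IGNORECASE)
FORMAT_TERMS = ("score", "vocal score", "full score", "cd", "dvd", "parts")

def read_sessions(path):
    """Group logged searches into sessions keyed by session ID."""
    sessions = defaultdict(list)
    with open(path, newline="") as f:
        for row in csv.DictReader(f):
            sessions[row["session_id"]].append(row["query"].strip())
    return sessions

def sample_every_tenth(sessions):
    """Sort by session ID and keep every tenth session, as in the study."""
    ids = sorted(sessions)
    return {sid: sessions[sid] for sid in ids[::10]}

def measures(queries):
    """Rough per-session versions of measures 1-2 and 4-6 above."""
    return {
        "searches": len(queries),
        "avg_terms": sum(len(q.split()) for q in queries) / len(queries),
        "boolean": sum(bool(BOOLEAN.search(q)) for q in queries),
        "quotation_marks": sum('"' in q for q in queries),
        "work_or_opus_numbers": sum(bool(WORK_NUM.search(q)) for q in queries),
        "key_indications": sum(bool(KEY_IND.search(q)) for q in queries),
        "format_terms": sum(any(t in q.lower() for t in FORMAT_TERMS) for q in queries),
    }

sample = sample_every_tenth(read_sessions("mpal_easysearch_log.csv"))  # hypothetical file
for sid, queries in list(sample.items())[:5]:
    print(sid, measures(queries))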
FINDINGS

Comparing the data for searches done in the main ES to some of the subject modules (see table 1) shows that the UGL ES and the HPNL ES have the fewest average searches per session, and the MPAL ES has the third-highest average number of searches per session. The sciences tend to have higher average words-per-search values, while MPAL has the second-lowest average number of words per search. This is not surprising given that the sciences tend to use a lot of journal literature, and it is common for researchers to copy and paste such citations into ES, whereas in music, as we will see later, keyword searches tend to focus on combinations of the composer’s name and words from the work title, occasionally with other terms added.

Source                                        Sessions    Searches    Avg. searches/session    Avg. words/search
All ES searches                               599,482     1,340,159   2.121                    5.082 [20]
Gateway only
  Gateway everything tab                      382,040     757,862     1.9837                   5.255
  Gateway books tab                           71,007      136,724     1.9255                   4.048
  Gateway articles tab                        57,169      107,893     1.887                    6.35
  Gateway total                                           1,002,479
All subject modules
  Departmental searches (incl. those from
    Gateway dropdown)                         75,035      214,364     1.9288
  Searches done directly from subject
    library pages                                         144,283
Select subject modules [21]
  Agricultural, Consumer and Environmental
    Sciences Library                          2,732       5,221       1.911                    4.07
  Engineering Library                         32,018      68,146      2.128                    5.092
  History, Philosophy, and Newspaper
    Library                                   1,264       1,985       1.57                     3.09
  Music and Performing Arts Library (MPAL)    21,047      41,590      1.976                    3.375
  MPAL data from March 21, 2014, to
    June 21, 2015                             25,503      49,702      1.949                    3.349
  Social Science, Health, and Education
    Library                                   9,458       19,760      2.089                    4.906
  Undergraduate Library                       26,988      44,588      1.65                     3.909

Table 1. Comparative search data from June 20, 2014, to June 16, 2015 (unless otherwise noted). Bracketed numbers refer to endnotes 20 and 21.

Average Number (and Range) of Searches per Session

In looking at the searches done directly from the MPAL homepage and from the Gateway dropdown from March 21, 2014, through June 21, 2015, there were 25,503 sessions conducted in the MPAL ES that contained a total of 49,702 searches, resulting in an average of 1.949 searches per session. Of the 2,550 MPAL search sessions in the study sample, the majority (63.2 percent) consisted of one search.22 This means the patron conducted one search and then left ES, presumably clicking into the library catalog or another tool that is a target in ES to complete their research. Sessions consisting of two to four searches account for 31 percent of sessions, while sessions involving five to nine searches account for only 5 percent of total sessions, and only 32 sessions, or fewer than 1 percent, consist of ten or more searches (see table 2).

Searches per session    Number of sessions
1                       1,604
2                       476
3                       191
4                       116
5                       51
6                       29
7                       22
8                       12
9                       15
10                      6
11                      6
12                      7
13                      2
14                      3
16                      1
17                      2
18                      1
19                      2
20                      1
23                      1
30                      2
Total sessions          2,550

Table 2. Searches per session.

Sessions with multiple searches (n = 946) were evaluated to see whether patrons were searching multiple times for the same thing (either with the same term[s] or with different terms) or whether they were searching for different things. Five sessions that were clearly not music-related were removed from the sample. Each session was categorized as “same/exact,” “same/different,” or “different.” At times, sessions might include several searches for the same thing using altered strings, in addition to searches for other things. Those sessions were coded as “different.” For example:

crumb zodiac
crumb
georgy crumb
georgy cromb
korean music
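The categorization in this study was done by hand, but a rough automated approximation is possible. In the sketch below, which is illustrative only, the token-overlap and edit-distance thresholds are arbitrary choices, not values derived from the study.

from difflib import SequenceMatcher

def categorize_session(queries):
    """Approximate the hand-coding of multi-search sessions as 'same/exact',
    'same/different', or 'different'. Thresholds are illustrative guesses."""
    if len(set(queries)) == 1:
        return "same/exact"
    def looks_like_same_thing(a, b):
        ta, tb = set(a.lower().split()), set(b.lower().split())
        token_overlap = len(ta & tb) / max(len(ta | tb), 1)
        return token_overlap >= 0.5 or SequenceMatcher(None, a, b).ratio() >= 0.8
    if all(looks_like_same_thing(queries[0], q) for q in queries[1:]):
        return "same/different"
    return "different"

print(categorize_session(["schumann op.68", "schumann op. 68"]))             # same/different
print(categorize_session(["crumb zodiac", "georgy crumb", "korean music"]))  # different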
There were 478 multi-search sessions (50.6 percent) in which patrons searched for different things within their session, 391 sessions (41.3 percent) in which patrons looked for the same thing with differing search strings, and 71 (7.5 percent) in which patrons reiterated the exact same search in each attempt.

In the 71 sessions in which patrons used the same exact search multiple times, they averaged 2.25 searches. Those sessions tagged as “same/exact” provide an opportunity to try to determine why patrons repeat the same search. Common themes include using too broad a search, searching in the wrong place (a non-performing-arts-related search), or repeatedly typing in the wrong information (e.g., typos or other errors) and not realizing the mistake.

In the 391 sessions in which patrons spent their session searching for the same thing with different search strings, they did so with an average of 2.96 searches. Often the variation in the search string was a change in spelling or a minor change in the terms, but sometimes it involved the addition or subtraction of terms, such as starting with morley fitzewilliam virginalists and going to morley fitzewilliam. In another example, we see how music metadata can prove challenging for searchers to format, such as when a patron started with schumann op.68 (without the necessary space between op. and 68), then progressed to album for the young, and finally schumann album for the young.

In the 478 sessions in which patrons searched for completely different things within their session, they did so with an average of 4.08 searches per session. In many cases, although the searches were for different items, they were related in some way, either by genre, instrument, or some other element, such as in this example:

microjazz
color me jazz
jamey aebersold play-along
vandall jazz
jazz piano pieces

But sometimes the searches were for very different things:

debussy voiles
composition as problem mart humal
composition as problem
debussy ursatz

Average Number (and Range) of Terms per Search

In looking at the approximately 4,900 searches included in the sample of 2,550 MPAL sessions, without removing the small percentage of duplicate searches, two-term searches are the most common, followed by three-term searches—together accounting for more than half of the searches (55.3 percent). One- and four-term searches are the next most common, together accounting for 25.5 percent of searches (see table 3). In 2012, single-term searches in the regular ES accounted for almost 60 percent.23

Number of terms in search string    Instances    Percentage (%)
1                                   605          12.4
2                                   1,559        31.8
3                                   1,149        23.5
4                                   642          13.1
5                                   400          8.2
6                                   196          4.0
7                                   100          2.0
8-15                                216          4.5
16-57                               29           0.6

Table 3. Words per search string.

Longer search strings (8-15 terms) ranged from 74 examples down to ten examples each, while searches with 16 to 20 terms ranged from eight down to two examples each. The following term counts each had only one example in the logs: 25, 26, 31, 32, 36, and 57.

Single-Term Searches

Types of single-term searches can be broken down into several categories (see table 4). Over half (58.4 percent) were searches for personal names or part or all of a work title.
Some names and work titles are in fact so unique that a one-word search might well be successful (e.g., Beyoncé, Schwanengesang, Newsies, or Landowska). Over a fifth (22.2 percent) were classified as “Other/undetermined,” including publisher names, cities, or subject terms.

Type of one-word search             Number
Personal name                       260
Title                               93
Instrument/genre                    51
Tool/location/format                51
Call number/barcode/label number    15
Other/undetermined                  135

Table 4. One-term search types.

In the Tool/Location/Format category patrons searched for things such as albums, images, dissertation, RILM, WorldCat, JSTOR, and IMSLP. While RILM (Abstracts of Music Literature) and WorldCat can be found by a search in this tool, because they will match on journal or database titles to which we subscribe, a search for IMSLP [International Music Score Library Project] only brings back mentions of IMSLP in RILM, etc. MPAL links to IMSLP on its webpage, but neither IMSLP nor the library’s website is a target in ES. When patrons only searched for a format, as in a session where a patron first searched for performances, then albums, and then audio cd [sic], it is difficult to know whether the patron expected to be led to a tool that only searched or listed recordings, whether they wanted a list of all of our recordings, or if some other logic was occurring. Searchers also used this technique in multi-word searches, such as in the example george gershwin articles.

Single-term searches in the “Other/undetermined” category were a mix of subject terms like solfege, tuning, and spectralism. The patron could be trying to find materials related to these topics, examples of them (in the case of solfege), or definitions for them. They also included publisher or label terms such as rubank and puntamayo [sic], and even, on more than one occasion, URLs and DOIs.

Two-Term Searches and Names

The largest segment of the MPAL data (31.8 percent) comprises two-term searches. The examples show that often a musical work can be easily sought based on the composer’s name and a word from the title, especially in cases where it is a common title but adding the composer’s name makes it unique (e.g., Ligeti Requiem). Sometimes the patron only knows the work’s characteristics and not its proper name (e.g., Lakme duet). Patrons do attempt to search for topical material using only two words, which is likely not enough for a good topic search in most cases, as in the example mahler dying. Sometimes phonetic spellings are employed, such as woozy wick followed by woyzeck (which is both a play and a film with this spelling but could also potentially be a misspelling of Berg’s opera Wozzeck). Another example is image cartier followed by images quartier.

Personal names are frequently seen in two-term search strings. Occasional use of foreign versions of names is observed, e.g., georgy crumb. It is difficult to know if these are typos or an artifact of our high international student population. As with any search that contains only a name, it is impossible to know whether the searcher was looking for materials by that individual or information about them. Additionally, when current faculty names are searched, it is difficult to know whether patrons are looking for contact information for them or for scores or recordings by them. Also observed in name searches is the phenomenon of patrons repeating their search with a change in the order of names, such as bryan gilliam and then gilliam bryan. This occurs with other two-word searches as well, such as a change from introitus gubaidulina to gubaidulina introitus. Switching the order of the words in a search no longer makes a difference in most search tools (although in some catalogs, of course, it was once required to formulate an author search as Last Name, First Name). There is still the occasional use of a comma in LN, FN searches here.
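Detecting such reordered repeats in a log is straightforward; a minimal sketch follows, assuming that only case, commas, and word order need normalizing.

def same_search_reordered(a, b):
    """True when two searches differ only in word order, case, or commas,
    e.g., 'bryan gilliam' vs. 'gilliam bryan' or an 'LN, FN' form."""
    normalize = lambda s: sorted(s.lower().replace(",", " ").split())
    return normalize(a) == normalize(b)

print(same_search_reordered("bryan gilliam", "gilliam bryan"))                  # True
print(same_search_reordered("introitus gubaidulina", "gubaidulina introitus"))  # True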
Echoing the results of an earlier study that asked students what data points they used in searching, only occasionally did searches in this data set incorporate specific performers combined with a particular piece or composer (franck mutter) or a particular edition (idomeneo barenreiter).24 Sometimes names/titles were combined with format, such as a session in which a patron searched for Hedwig images and then Hedwig photo. Here it is hard to tell if they are looking for pictures of a fictional owl, images from productions of Hedwig and the Angry Inch, or something else. Names are also frequently combined with work numbers instead of title words, such as mozart k.395 and moscheles op.73. Search strings in the “Other/undetermined” category sometimes included what appears to be an author/date search, perhaps for an article, such as mccord 2006.

Long Search Strings

On the other end of the spectrum, the vast majority of the ten-plus-word string searches are for performing arts items, but some were in other subject areas. These long searches are often citations that have been copied and pasted, which can be discerned from the use of punctuation and capitalization, like “Welded in a single mass”: memory and community in London’s concert halls during the first world war.25 It is very common in general Gateway ES searches to see an entire citation pasted in,26 but less common in the MPAL module. Searches such as this are often truncated through iteration to make the search more generic (see table 5). Given Easy Search’s DOI search recognition function, the longest version of this search would have worked had the DOI been correct, but the correct DOI number lacks the “.2” at the end (see table 5, query 1). The middle three searches (queries 2-4) failed because none of the A&I services that include this citation use hess, j. for the author’s name; they instead use her full first name (Juliet). Other examples showed that even when patrons use the exact citation, their search might not be successful if the citation formatting did not match that of the database(s) in which the article was indexed.

Query #    Query string
1          hess, j. (2014). radical musicking: towards a pedagogy of social change. music education research, 16(3), 229-250. doi: 10.1080/14613808.2014.909397.2).
2          hess, j. (2014). radical musicking: towards a pedagogy of social change. music education research, 16(3), 229-250.
3          hess, j. (2014). radical musicking: towards a pedagogy of social change.
4          hess, j. radical musicking: towards a pedagogy of social change.
5          radical musicking:

Table 5. Search truncation.

In some instances, searches were long because the patron included additional information, as in this example: bernstein, leonard. arranger: jack mason. title: west side story-selections (for symphonic "full" orchestra piano-conductor score). edition/publisher: hal leonard corporation. It is hard to tell if this was a copy and paste from another source such as a publisher catalog, or if the patron was trying to be very precise. In any case, this search was not successful, but it would have been had the searcher omitted extraneous information such as the terms “arranger” and “edition/publisher.”
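DOI recognition of the kind mentioned above can be done with a simple pattern. The expression below is a common approximation; whether ES uses this exact pattern is unknown.

import re

DOI_PATTERN = re.compile(r"\b10\.\d{4,9}/\S+", re.IGNORECASE)

pasted = ("hess, j. (2014). radical musicking: towards a pedagogy of social change. "
          "music education research, 16(3), 229-250. doi: 10.1080/14613808.2014.909397")
match = DOI_PATTERN.search(pasted)
if match:
    print("https://doi.org/" + match.group(0))  # hand the pasted citation to the DOI resolver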
Type of Index Search—Title/Author/Keyword and Adding Subsets or Tools

Easy Search does have an advanced search function with indexes for title and author, although it is rarely used by patrons. Including repeated searches, searches done selecting the “Title” index numbered only 207, or fewer than 10 percent of the sample. Searches done selecting “Author” were even scarcer, at 141 (5.5 percent). The remaining ~2,300 searches in the sample were conducted using the default keyword search. Occasionally there was a misuse of index searching, such as:

ti: js bach english suite
ti: scarlatti sonatas
ti: haydn cello concerto D

In these examples, the composer’s name is included in a title index search. It is unclear whether searchers do not realize that they have selected something other than a keyword search, or whether people inherently think of the composer’s name as part of the title. Later in this paper the phenomenon of searches using possessive name forms is discussed, which may be associated.

Patrons have the option to start from the Main Library gateway and perform a search in ES, and in the advanced search screen they can choose other subject modules such as Arts and Humanities, Life Sciences, and so forth, and/or types of tools to cross-search (see figure 6). Patrons chose the music and performing arts tool subset in 161 sessions.

Figure 6. Easy Search advanced search screen.

The vast majority of the time (4,557 searches, or 93 percent), patrons chose to start from the MPAL ES on the MPAL homepage and do a basic search there, but 179 times patrons started from the MPAL ES and chose other subsets through the advanced search.27 Given our large music education program, logically, some patrons made tool choices that included the music subset and the education and/or social science subsets. But sometimes patrons chose every or almost every option available across multiple unrelated subject areas, which likely made for a very unwieldy result set.

Use of Boolean Operators, Quotation Marks, Parentheses, Truncation, Etc.

As in most search tools, there are several ways in ES to conduct more sophisticated searches. However, patrons do not employ these techniques often, in part because they don’t always have to. In most older catalogs (including our classic Voyager OPAC), searchers had to use Boolean terms in capital letters, whereas in VuFind and WorldCat Boolean AND is now implied between terms. In the 159 examples of Boolean logic in the searches, AND is the most common term used. Interestingly, some researchers used plus signs instead of AND (as they might in Google), not just between individual words but also between multi-word segments of the string (without employing quotation marks). However, the + sign, like AND, is ignored/implied by ES:

berg + warm die lufte
progressive studies for trumpet
progressive studies for trumpet + john miller
progressive studies for trumpet (john miller)
new orleans + bossa nova
johnny alf + brazil
dick farney + brazil
dick farney + booker pittman
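The net effect described here (plus signs, commas, and parentheses dropped, with AND implied between the remaining terms) can be mimicked with a small normalization step. This sketch assumes the ignored-character set from the behavior reported above, not from any ES documentation.

import re

IGNORED_CHARS = re.compile(r"[+,()]")

def effective_terms(query):
    """Terms the engine effectively sees: punctuation stripped, AND implied
    (OR and NOT are left alone, since patrons used them deliberately)."""
    return [t for t in IGNORED_CHARS.sub(" ", query).split() if t != "AND"]

print(effective_terms("new orleans + bossa nova"))  # ['new', 'orleans', 'bossa', 'nova']
print(effective_terms("progressive studies for trumpet (john miller)"))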
In some cases the use of Boolean did not seem intentional; that is, the term “and” appears as part of a common phrase (especially for instrument combinations), such as in webern violin and piano. Only a handful of the Boolean searches included examples of OR and NOT, which seemed to stem from a class assignment designed by a professor, as the search strings are all very similar. One set is below:

Machaut NOT Mass
Machaut OR Mass
Machaut AND Mass
Machaut Mass Notre Dame
Machaut mass
Machaut AND mass

Commas were sometimes seen to stand in for Boolean operators in a sense, or at least to separate search concepts, like the plus signs above, but they were not counted in the total uses of Boolean terms cited above. They are ignored by ES:

rachmaninoff, moment music
planet, holst
City noir, john adams
piazzolla, flute and marimba
Mussorgsky, Pictures at an Exhibition, Manfred Schandert

Searchers used quotation marks on occasion (n = 162) to keep phrases together, and parentheses were also used in this manner eight times (although they are ignored by ES), such as in these examples:

Preludes and fugues (Well-tempered clavier) cohen
Chaconne (from Partita in D minor, BWV 1004)

In some cases, searchers did not seem to grasp the function of quotation marks, as in the example “Snowforms” Raymond Murrey Schafer, which was also observed by Avery and Tracy.28

Truncation symbols can be another powerful tool in a searcher’s arsenal, but examples of their use in the transaction logs show that most searchers who attempt to use them do not understand them, as in the examples Doctor Atomic?, Boethius music,* and:

orchestra* history
history of the orchestra
orchest* history
orchestr* history
orchestra history
orchestral history

In fact, the current library catalog assists users by automatically applying truncation logic so that “symphony” returns results for “symphonies” and vice versa. It is doubtful that this is generally known among users, and it likely functions in a manner transparent to most of them.

Work Numbers and Key Indications

Searching by music metadata elements such as work or opus numbers and key designations has always proved challenging in online search environments, given that numbers and single letters can appear in other parts of the catalog record with different meanings (e.g., 1 part instead of symphony no. 1). Added to this is the difficulty of describing items that contain multiple works—the item’s title might be “Mozart’s Complete Symphonies” or “Beethoven Symphonies 1-6” without complete work details provided. Nevertheless, 134 searches had some form of work number included, and 36 searches included a key indication. Fantasie in f# minor presto georg philipp Telemann and concerto en ut mineur j.c. bach are further examples of why a work’s key is hard to search by, one because of the use of the French solfège syllable “ut” and one because it includes a sharp symbol (#).29 The difficulties this can cause often led searchers to try various permutations of their search:
mozart concerto g major sam franko;
mozart concerto k 283 sam franko;
scores;
mozart violin concerto g major;
mozart violin concerto g major sam franko;
mozart violin concerto;

sonata g major flute cpe bach
sonata g major flute bach
hamburger sonata flute cpe bach
hamburger sonata
hamburger sonata

It is counterintuitive to searchers that including specific details in their search string might not help, but that is in fact the case in many online catalogs. Searchers often run into the question of how or whether to include the work indicator (op., K., BWV, etc.), which can lead to a “misuse” of this extra data, such as in mozart k501 and Mahler Symphony No.9 (no spaces). Another observation is the use of what the author calls musicians’ shorthand. That is, those familiar with classical repertoire will know that examples such as Sibelius 1 and Mahler 5 are searches for symphonies even though they do not say so, but it will be harder, if not impossible, for the catalog to interpret that, leaving the searcher to sort through many extra results. In addition, there is the long-standing issue of whether to enter the number as “1,” “1st,” or “first” and whether the system can interpret these against the form of the number present in the catalog record.

Search by Format or Edition Type

In forty-seven examples searchers used format terms in their searches, including score, vocal score, full score, DVD, performance recordings, albums, and audio CD, as well as the following:

prokofiev romeo and juliet orchestra parts
orchestra excerpts prokofiev romeo and juliet viola
Tosca harp part
assassins cd
saxophone article

In fifteen examples searchers searched for edition types, including urtext, facsimile, critical edition, and complete works. In the latter case they occasionally used the word “complete” and the composer’s name, such as complete Schumann or complete Webern. Unfortunately, this approach will often not be successful, because even though the term “complete works” is used colloquially by musicians, the titles of such editions are often something else (and often in a foreign language, such as “opera omnia”).

Other Observations on Formulation of Searches

Searching by Call Numbers and Recording Label Numbers

While some catalogs allow call number searches, our current instance of VuFind does not have a call number index, and keyword searching for call numbers only works in some instances.30 But while call number searching does not work well in VuFind (e.g., it has to be done as a keyword search and not a call number index search as in Voyager), it still works in ES because ES searches by keywords. There were thirty-two examples of searches in MPAL’s ES where patrons used entire call numbers or the first part of a classification number to find related materials:

count basie biography
count basie ml 410
duke ellington ml 410
duke ellington bibiliography

It is also not unrealistic to think that patrons might want to search by a recording’s label number, since most catalogs provide search options for ISBNs and ISSNs for print materials. Searchers attempted this in a handful of searches like lpo-0014,31 7.24356E+11,32 and 777337-2.33 Unfortunately, this information is not usually reflected in MPAL’s catalog records.
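Stepping back to the work-number “misuse” above (schumann op.68, mozart k501, Mahler Symphony No.9), much of the problem is spacing and abbreviation variance that a search layer could normalize before matching. The sketch below is hypothetical: the indicator list is illustrative, and this is not a feature ES is known to have.

import re

WORK_INDICATOR = re.compile(r"\b(op|no|k|bwv|hob)\.?\s*(\d+)", re.IGNORECASE)

def normalize_work_numbers(query):
    """Regularize work indicators ('op.68', 'k501', 'No.9') to a single
    'abbrev. number' form so query and catalog-record forms can agree."""
    return WORK_INDICATOR.sub(lambda m: f"{m.group(1).lower()}. {m.group(2)}", query)

print(normalize_work_numbers("schumann op.68"))        # schumann op. 68
print(normalize_work_numbers("mozart k501"))           # mozart k. 501
print(normalize_work_numbers("Mahler Symphony No.9"))  # Mahler Symphony no. 9

Musicians’ shorthand like Sibelius 1, by contrast, cannot be fixed by normalization alone, since the word “symphony” is absent from the query entirely.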
Common Descriptions, Natural Language Queries, Genre Queries, and Context Words

As mentioned already with the examples Mahler 1 and complete works, patrons regularly search with terms and phrases that make sense to them or that are used colloquially when discussing music and sources, which may or may not be in the bibliographic record. Additional examples in the data set include:

handel messiah critical edition
rodelinda in italian
mamma mia! Book [for the text of a musical]
grove encyclopedia [the title of this is in fact “dictionary,” not “encyclopedia”]
mgg sachteil [the abbreviation for Musik in Geschichte und Gegenwart and the name for a section of it]
dance collection

The last example in the list is particularly intriguing—somewhat like the earlier search examples of performances and albums, one wonders if the patron hoped to find everything in that category and then be able to browse; however, it is hard to know what the searcher anticipated getting in return.

Sometimes natural language queries appear, often in an attempt to find a smaller part of a larger work, such as the slow movement of Brahms's First symphony, anonymous chant from vespers for christmas day, and Chaconne (from Partita in D minor, BWV 1004); or for things other than musical works, such as in Reviews of Stravinsky Article by Robert Craft. Another variation on natural language or colloquial searches is the use of the possessive form of composer names. Although not common (23 examples), patrons do this when searching for the composer and title of a work, e.g., verdi’s requiem. It seems unlikely that people do this when searching for books or other works, but musicians make works possessive to the composer, as in the example mendelssohn’s violin concerto, to differentiate between pieces with the same form/generic title. In rare cases searchers used the term “by,” such as Jeptha by Carrissimi.

Genre searches such as South Indian Vocal music and hindustani classical music show that people may want to search the way they might in Pandora or iTunes, although it is possible this person was looking for secondary materials and not recordings or scores:

pop
female pop
women pop
contemporary pop

Searchers also exhibit a desire to find things by genre and instrument or voice type, such as soprano arias [which is “high voice” in the LC subject heading], mozart satb sanctus, and baroque arias for medium voice. Other examples include marimba literature, organ literature, and organ techniques. Catalogs do not necessarily aid in these types of searches, even though they are natural constructions for users. Sometimes searchers add context words to their search like they would in Google, in a way that will not necessarily help them in the catalog, such as daniel read composer.
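The possessive and “by” constructions noted above are easy to strip before matching. A minimal sketch follows; the removal rules are assumptions about what is safe to drop, not an exhaustive treatment, and nothing here describes how ES itself behaves.

import re

def strip_colloquialisms(query):
    """Remove possessive 's and the filler word 'by', which patrons use
    ('verdi's requiem', 'Jeptha by Carrissimi') but catalogs rarely index."""
    q = re.sub(r"['\u2019]s\b", "", query)
    q = re.sub(r"\bby\b", " ", q, flags=re.IGNORECASE)
    return " ".join(q.split())

print(strip_colloquialisms("verdi's requiem"))       # verdi requiem
print(strip_colloquialisms("Jeptha by Carrissimi"))  # Jeptha Carrissimi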
DISCUSSION

Even given the difficulties of searching for music materials, MPAL patrons have embraced ES—its module has almost as many searches as the Undergraduate Library’s, which serves a much larger population. It also has twice as many searches as the Social Science, Health, and Education Library module, which also serves a much larger population than MPAL. One of the possible reasons for this is that MPAL was an early adopter of developing an ES subject module that could be searched from our homepage, which means our patrons have had longer to grow accustomed to using it.

MPAL has a lower average words-per-search ratio (3.375 or 3.349, depending on the data set) than most other ES modules, likely because there are more composer-plus-title keyword searches for musical works and not as many pasted article citation searches, which tend to be longer. This is supported by the comparison of the average number of words in searches done in the Gateway books tab (4.048) vs. the Gateway articles tab (6.35). In addition, although two- and three-word searches are most common, MPAL has a significant number of single-word searches (12.4 percent). Such searches can work in music when there are unique titles like Turandot and Treemonisha that are unlikely to appear for more than one composer or as terms in other disciplines. For this same reason, single- or even two-word searches are unlikely to be effective in most other disciplines. At around seven words per search a transition in search patterns occurs: strings of eight words or more are almost always some version of a title of a book, article, chapter, or dissertation, while strings of six words or fewer tend to be topical searches or combination composer/piece searches. Other transaction log studies of ES have shown that “title searching and results display—of journal titles, article titles, and book titles—is being heavily employed by users.”34 However, in music, where title alone may not be sufficient to identify and retrieve a musical work, searches with a combination of composer name and elements of the title and/or additional information will always be most prevalent.

Search Location Appropriateness and Context

Even though discovery layers and federated search tools help with minimizing the number of silos and places in which scholars need to search, there are still issues with patrons attempting to use the ES box to find things it is not designed to find.35 Searchers see a box and search, without always understanding the context. This can happen on multiple levels. The MPAL page clearly states that the MPAL ES box searches for arts-related things, but obviously patrons do not always see or comprehend this, even after they type in many queries that do not provide (good) results. This is likely related to the number of visitors to MPAL from other disciplines who do not realize that there are various differently scoped versions of ES. The following example could be a theatre set construction–related search, which would work only moderately well in our tool. Or it may have been conducted by an architecture or structural engineering student, who would have better luck using a different ES module:

light weigh [sic] structures in architecture
building research
the evolving design vocabulary of fabric structure
the engineering discipline of tent structures
building research jan/feb 1972:22

It would be ideal if the system were smart enough to make suggestions: “you appear to need architecture resources—if you are not finding what you need, might we suggest tool X, Y, or Z?” While ES does this to an extent when it can in the generic ES, it does not do so in the subject modules, and in reality it can only go so far. It raises the question of whether we are doing patrons a disservice by offering predefined subject modules. While this approach has some benefits for most users, it also creates different challenges for some.
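For illustration only, the “might we suggest tool X?” idea could begin life as a simple term-to-module lookup before anything more sophisticated is attempted. Everything in this sketch is invented: the vocabulary, the module names, and the feature itself, which the subject modules do not have.

SUBJECT_HINTS = {
    "architecture": "an architecture/art subject module",
    "tent structures": "an engineering subject module",
    "fabric structure": "an engineering subject module",
}

def suggest_other_modules(query):
    """Return (hypothetical) other subject modules whose vocabulary the
    query appears to match, for display alongside zero-result searches."""
    q = query.lower()
    return sorted({module for term, module in SUBJECT_HINTS.items() if term in q})

print(suggest_other_modules("the engineering discipline of tent structures"))
# ['an engineering subject module']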
MPAL’s ES does not target all available relevant online tools, and neither does the general ES, so interdisciplinary researchers still need to be cautious of silos, even well-intentioned ones created by librarians or traditional ones created by vendors. It is difficult to inform patrons of this in one-box search settings—they see the box and are eager to get started without first having to read a lengthy set of instructions.

Search location context is also important when patrons use ES to try to find things that are described or linked on our website and not in ES, such as any of our named special collections. Patrons also use ES to find tools such as Naxos, jstor, worldcat, and librarysource, some of which are targeted by ES and some of which are not. ES will at least provide a link to a tool, however (see figure 7).

Figure 7. Easy Search post-search suggestion.

These particular tools are all also linked from the MPAL website (in fact, Naxos is linked further down the home page from the ES box), and we also have a separate tool that enables one to search for databases and online journals by name. On some occasions, searchers used ES to look for help using library tools, such as in the following example:

rilm
retrieval rilm
using rilm

The library website, not the discovery layer, is a better tool for finding instructions, since help information is currently delivered via various LibGuides. However, this is not intuitive to patrons. On a related note, it is interesting to consider whether patrons searching for specific tools such as IMSLP expect to find results from non-library resources in our search layers, or if they simply do not differentiate in their minds between an open tool and a library subscription tool.

Patron Knowledge Level

Many of the observations of this study are related to known-item searching, since a large percentage of people looking for music materials are looking for specific pieces of music. Earlier studies show that it is difficult to search for something if you do not know what it is.36 This can be seen in examples like ombramaifu handel (should be Ombra mai fu) or the interworkings of tennis (which was followed by the correct inner game of tennis). Topical searches can be especially difficult in any subject when the patron does not quite know how to put what they want into words (or literally does not know the right words, especially in the case of our many patrons for whom English is not their first language):

qualtize musical tension
spell change click: kw:qualitize musical tension
quantize musical tension
quantitative musical tension
music motive similarity
surveying musical form through melodic-motivic similarities
a paradigmatic approach to extract the melodic structure of a musical piece
inding subsequences of melodies in musical pieces
spell change click: kw:finding subsequences melodies musical pieces
similarity measures for melodies
measures of musical tension
measuring musical tension

This echoes Head and Eisenberg’s 2009 findings and Dempsey and Valenti’s 2016 findings.37

Shortcomings of the Easy Search Tool

This study helped illuminate some shortcomings in ES. Sometimes the search formulation changes from ES to the target; for example, cramer preludes in ES becomes all(cramer preludes) [a bound phrase] in one target, resulting in many fewer results than if the search had been done in the native interface. Patrons may not realize this as they are searching. In another case there were no results for Danças folclóricas brasileiras e suas aplicações educativas, but removing the diacritics retrieves this title in our catalog, so it appears that diacritics do not function in ES (at least when VuFind is the target)—something that may not be apparent to searchers and hopefully can be addressed in the code.
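One common remedy for the diacritics problem is to fold accented characters on both the query and the indexed strings. The snippet below is a generic Unicode normalization sketch, not a description of how ES or VuFind would actually implement it.

import unicodedata

def fold_diacritics(text):
    """Decompose accented characters (NFD) and drop the combining marks so
    'Danças folclóricas' also matches 'Dancas folcloricas'."""
    decomposed = unicodedata.normalize("NFD", text)
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))

print(fold_diacritics("Danças folclóricas brasileiras e suas aplicações educativas"))
# Dancas folcloricas brasileiras e suas aplicacoes educativas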
Further Research

Additional analysis could be done on this data set, including assessing whether searches were for known items or topics, and more specifically whether they were for articles, books, scores, or recordings. However, in many cases it is difficult to tell if a patron is looking for a score, a recording, or information about a piece or composer. Other research on ES shows that over half of searches (just over 58 percent in 2015) in the main ES are for known items.38 This percentage is likely to be much higher in MPAL’s ES. With an enhanced data set it would also be possible to identify which target tools searchers are choosing most often.

CONCLUSION

While many patrons (and librarians) are eager for a tool that can truly search everything, we are not there yet. Some have tried to make music-specific interfaces for library catalogs, but this work is not widespread.39 Perhaps because music students are often searching for things other than articles, it would be better to have one tool that searches the catalog and streaming media tools and one that only searches article indexes. Some schools have taken this approach—configuring their discovery layer indexes to include article content but not the local catalog.

Several patron search behaviors observed in this data are not fully supported by library systems in all cases, but perhaps should be (e.g., use of + signs, and searching by record label numbers or by genre names/types of music/formats). In some cases this is an issue with the metadata standards in use, and in others it is about needing more flexible search options based on the metadata that we already have. Newcomer et al. discuss this in their article outlining music discovery requirements.40 Tools like Easy Search and discovery layers solve some problems for users but can create others. Dedicated library catalogs are still generally the best tools for finding scores and recordings in our physical (and some online) collections, but not all libraries offer that tool anymore, instead offering a discovery layer as the primary search tool. In those cases, serious consideration needs to be given to facets, the ability to limit by format, and especially the FRBRization of items, which is particularly problematic for music. Additionally, there is a continued need for targeted instruction for music library users, because not only are the tools used in libraries less than perfect, but the inherent challenges in searching for music because of its formats and titles are aggravated by musicians’ use of shorthand and colloquialisms to describe music materials.

ENDNOTES

1 John Boyd et al., “The One-Box Challenge: Providing a Federated Search that Benefits the Research Process,” Serials Review 32, no. 4 (December 2006): 247–54, https://doi.org/10.1016/j.serrev.2006.08.005; Sharon Dyas-Correia et al., “‘The One-Box Challenge: Providing a Federated Search That Benefits the Research Process’ Revisited,” Serials Review 41, no. 4 (October–December 2015): 250–56, https://doi.org/10.1080/00987913.2015.1095581.
2 Lucy Holman, “Millennial Students’ Mental Models of Search: Implications for Academic Librarians and Database Developers,” Journal of Academic Librarianship 37, no. 1 (January 2011): 19–27, https://doi.org/10.1016/j.acalib.2010.10.003; Brandi Porter, “Millennial Undergraduate Research Strategies in Web and Library Information Retrieval Systems,” Journal of Web Librarianship 5, no. 4 (July–December 2011): 267–85, https://doi.org/10.1080/19322909.2011.623538; Martin Zimerman, “Digital Natives, Searching Behavior, and the Library,” New Library World 113, nos. 3/4 (2012): 174–201, https://doi.org/10.1108/03074801211218552.

3 Susan Avery and Dan Tracy, “Using Transaction Log Analysis to Assess Student Search Behavior in the Library Instruction Classroom,” Reference Services Review 42, no. 2 (June 2014): 332, https://doi.org/10.1108/RSR-08-2013-0044.

4 Andrew Asher, Lynda M. Duke, and Suzanne Wilson, “Paths of Discovery: Comparing the Search Effectiveness of EBSCO Discovery Service, Summon, Google Scholar, and Conventional Library Resources,” College & Research Libraries 74, no. 5 (September 2013): 473, https://doi.org/10.5860/crl-374.

5 Megan Dempsey and Alyssa Valenti, “Student Use of Keywords and Limiters in Web-scale Discovery Searching,” Journal of Academic Librarianship 42, no. 3 (May 2016): 203, https://doi.org/10.1016/j.acalib.2016.03.002.

6 Annie R. Armstrong, “Student Perceptions of Federated Searching vs. Single Database Searching,” Reference Services Review 37, no. 3 (August 2009): 291–303, https://doi.org/10.1108/00907320910982785; C. Jeffrey Belliston, Jared L. Howland, and Brian C. Roberts, “Undergraduate Use of Federated Searching: A Survey of Preferences and Perceptions of Value-added Functionality,” College & Research Libraries 68, no. 6 (November 2007): 472–86, https://doi.org/10.5860/crl.68.6.472; Sarah D. Williams, Angela Bonnell, and Bruce Stoffel, “Student Feedback on Federated Search Use, Satisfaction, and Web Presence: Qualitative Findings of Focus Groups,” Reference and User Services Quarterly 49, no. 2 (Winter 2009): 131–39.

7 Asher et al., “Paths of Discovery,” 476.

8 Troy Swanson and Jeremy Green, “Why We Are Not Google: Lessons from a Library Web Site Usability Study,” Journal of Academic Librarianship 37, no. 3 (May 2011): 227, https://doi.org/10.1016/j.acalib.2011.02.014.

9 Cory Lown, Tito Sierra, and Josh Boyer, “How Users Search the Library from a Single Search Box,” College & Research Libraries 74, no. 3 (May 2013): 240, https://doi.org/10.5860/crl-321.

10 Sarah Dahlen and Kathlene Hanson, “Preference vs. Authority: A Comparison of Student Searching in a Subject-Specific Indexing and Abstracting Database and a Customized Discovery Layer,” College & Research Libraries 78, no. 7 (November 2017): 892, https://doi.org/10.5860/crl.78.7.878.

11 Ibid.

12 Li Fu and Cynthia Thomes, “Implementing Discipline-Specific Searches in EBSCO Discovery Service,” New Library World 115, nos. 3/4 (2014): 102–15, https://doi.org/10.1108/NLW-01-2014-0003.

13 Kirstin Dougan, “Finding the Right Notes: An Observational Study of Score and Recording Seeking Behaviors of Music Students,” Journal of Academic Librarianship 41, no.
14 Jennifer M. Mayer, "Serving the Needs of Performing Arts Students: A Case Study," Portal: Libraries & the Academy 15, no. 3 (July 2015): 416, https://doi.org/10.1353/pla.2015.0036.

15 Joe Clark and Kristin Yeager, "Seek and You Shall Find? An Observational Study of Music Students' Library Catalog Search Behavior," Journal of Academic Librarianship 44, no. 1 (January 2018): 105–12, https://doi.org/10.1016/j.acalib.2017.10.001.

16 Christine D. Brown, "Straddling the Humanities and Social Sciences: The Research Process of Music Scholars," Library & Information Science Research 24, no. 1 (March 2002): 73–94, https://doi.org/10.1016/S0740-8188(01)00105-0; Stephann Makri and Claire Warwick, "Information for Inspiration: Understanding Architects' Information Seeking and Use Behaviors to Inform Design," Journal of the American Society for Information Science & Technology 61, no. 9 (September 2010): 1745–70, https://doi.org/10.1002/asi.21338; Francesca Marini, "Archivists, Librarians, and Theatre Research," Archivaria 63 (2007): 7–33; Ann Medaille, "Creativity and Craft: The Information-Seeking Behavior of Theatre Artists," Journal of Documentation 66, no. 3 (May 2010): 327–47, https://doi.org/10.1108/00220411011038430; MaryBeth Meszaros, "A Theatre Scholar-Artist Prepares: Information Behavior of the Theatre Researcher," in Advances in Library Administration and Organization, vol. 29, ed. Delmus E. Williams and Janine Golden (Bingley, UK: Emerald Group Publishing Limited, 2010): 185–217; Bonnie Reed and Donald R. Tanner, "Information Needs and Library Services for the Fine Arts Faculty," Journal of Academic Librarianship 27, no. 3 (May 2001): 231, https://doi.org/10.1016/S0099-1333(01)00184-7; Shannon Robinson, "Artists as Scholars: The Research Behavior of Dance Faculty," College & Research Libraries 77, no. 6 (November 2016): 779–94, https://doi.org/10.5860/crl.77.6.779.

17 Ethelene Whitmire, "Disciplinary Differences and Undergraduates' Information-Seeking Behavior," Journal of the Association for Information Science and Technology 53 (June 2002): 631–38, https://doi.org/10.1002/asi.10123.

18 Tina Chrzastowski and Lura Joseph, "Surveying Graduate and Professional Students' Perspectives on Library Services, Facilities and Collections at the University of Illinois at Urbana-Champaign: Does Subject Discipline Continue to Influence Library Use?," Issues in Science & Technology Librarianship 45, no. 1 (Winter 2006), https://doi.org/10.5062/F4DZ068J.

19 Ellen Collins and Graham Stone, "Understanding Patterns of Library Use Among Undergraduate Students from Different Disciplines," Evidence Based Library and Information Practice 9 (September 2014): 51–67, https://doi.org/10.18438/B8930K.

20 This is up from the 4.33 average reported by Mischo in 2012 (164).

21 Including searches entered directly from the departmental webpage and via the Gateway ES dropdown choices.
22 In Mischo's 2012 analysis of Easy Search logs, 52 percent of sessions had one search string and 48 percent had two or more. By 2015, single-query sessions had risen to 57 percent (William Mischo et al., "The Bento Approach to Library Discovery: Web-Scale and Beyond," Internet Librarian International, October 21, 2015).

23 William H. Mischo et al., "User Search Activities within an Academic Library Gateway: Implications for Webscale Discovery Systems," in Planning and Implementing Resource Discovery Tools in Academic Libraries, ed. Mary Popp and Diane Dallis (Hershey, PA: IGI Global, 2012), 163.

24 Kirstin Dougan, "Information Seeking Behaviors of Music Students," Reference Services Review 40, no. 4 (November 2012): 563, https://doi.org/10.1108/00907321211277369.

25 Vanessa Williams, "'Welded in a Single Mass': Memory and Community in London's Concert Halls During the First World War," Journal of Musicological Research 33, nos. 1–3 (2014): 27–38.

26 Mischo, "User Search Activities," 162.

27 This echoes earlier research showing that most searchers use default settings and keyword searches.

28 Avery and Tracy, "Using Transaction Logs," 31.

29 Barbara D. Henigman and Richard Burbank, "Online Music Symbol Retrieval from the Access Angle," Information Technology & Libraries 14, no. 1 (March 1995): 5–16.

30 We still have to use our older Voyager OPAC or the staff side of Voyager to effectively search by call number until we get a newer version of VuFind.

31 Symphony no. 4 in E-flat ("Romantic") by Anton Bruckner; Klaus Tennstedt (conductor), London Philharmonic Orchestra (performer).

32 This is Mozart, "Clarinet Concerto in A, K. 622," Meyer/Berlin Philharmonic/Abbado, EMI Classics 57128; 7.24356E+11.

33 This is Reich: Sextet / Piano Phase / Eight Lines (Kevin Griffiths, The London Steve Reich Ensemble, Stephen Wallace) (CPO 777337-2).

34 Mischo, "User Search Activities," 169.

35 This reinforces what Lown et al. and Asher et al. found, as cited in the literature review above.

36 Kirstin Dougan, "Finding the Right Notes: An Observational Study of Score and Recording Seeking Behaviors of Music Students," Journal of Academic Librarianship 41, no. 1 (January 2015): 66.

37 Alison Head and Michael Eisenberg, "Finding Context: What Today's College Students Say About Conducting Research in the Digital Age," Progress Report (2009), http://projectinfolit.org/images/pdfs/pil_progressreport_2_2009.pdf; Dempsey and Valenti, "Student Use of Keywords and Limiters."

38 William H. Mischo et al., "The Bento Approach to Library Discovery: Web-Scale and Beyond," Internet Librarian International, October 21, 2015.

39 Anke Hofmann and Barbara Wiermann, "Customizing Music Discovery Services: Experiences at the Hochschule für Musik und Theater, Leipzig," Music Reference Services Quarterly 17, no. 2 (June 2014): 61–75, https://doi.org/10.1080/10588167.2014.904699; Bob Thomas, "Creating a Specialized Music Search Interface in a Traditional OPAC Environment," OCLC Systems & Services 27, no. 3 (August 2011): 248–56, https://doi.org/10.1108/10650751111164588.
40 Nara Newcomer et al., "Music Discovery Requirements: A Guide to Optimizing Interfaces," Notes 69, no. 3 (March 2013): 494–524, https://doi.org/10.1353/not.2013.0017.

10703 ---- 20180926 10703 editor President's Message: Rebuilding Our Identity, Together Bohyun Kim INFORMATION TECHNOLOGY AND LIBRARIES | SEPTEMBER 2018 2 Bohyun Kim (bohyun.kim.ois@gmail.com) is LITA President 2018-19 and Chief Technology Officer & Associate Professor, University of Rhode Island Libraries, Kingston, RI.

ITAL is the official journal of LITA (Library and Information Technology Association), and if you are a reader of ITAL, it is highly likely that you are a member of LITA and/or someone deeply interested in library technology. It is my pleasure to write this column to update all of you on the exciting discussion currently underway in LITA and two other ALA divisions, ALCTS (Association for Library Collections and Technical Services) and LLAMA (Library Leadership and Management Association).

As many of you know, LITA began discussing a potential merger with two other ALA divisions, ALCTS and LLAMA, last year.1 What initially prompted the discussion was the prospect of continuing budget deficits in all three divisions. But the resulting conversation has proved that financial viability is not the entire story of the change that we want to bring about. At the 2018 ALA Annual Conference in New Orleans, the three Boards of LITA, ALCTS, and LLAMA held a joint meeting open to members and non-members alike to solicit and share our collective thoughts, suggestions, concerns, and hopes about the potential three-division realignment. At this meeting, attended by approximately 75 people, participants expressed their support for creating a new division with the following key elements.

• Retain and build upon the best elements of each division.
• Embrace the breakdown of silos and positive risk-taking to better collaborate and move our profession forward.
• Build a strong culture of innovation, energy, and inspiration.
• Be more transparent, responsive, and agile, and less bureaucratic.
• Excel in diversity, equity, and inclusion.
• Support members in all stages of their careers, particularly those with the least means to travel for in-person participation.
• Provide member-driven interactions and good value for the membership fee.
These ideas have made it clear that members of all three divisions see the goal of realignment as something much more fundamental than financial sustainability. They have validated the shared belief among the LITA, ALCTS, and LLAMA Boards that the ultimate goal of realignment is to create a division that better serves and benefits members, not simply to recover the divisions' financial health.

While the criteria for the success of a new combined division received almost unanimous endorsement at the meeting, opinions about how to realize such success varied. There were understandable concerns associated with combining three small associations into one large one. For example, how will we reconcile the three distinctly different cultures of LITA, ALCTS, and LLAMA? How will the new association ensure that it is more transparent, responsive, and nimble than the individual divisions were prior to the merger? Could the larger size of the new division make it more difficult for small groups with special interests to get needed support for their programs? Many requested that the leadership of the three divisions provide a more specific vision and details.

As a group, the leaders of LITA, ALCTS, and LLAMA are committed to hashing out those details. With the aim of providing fuller information about what the new division would look like at the 2019 Midwinter Conference, we have already formed two working groups, one for finances and the other for communication, and are currently working to create two more, on operations and activities. These four teams will work closely with the current leadership of LITA, ALCTS, and LLAMA to prepare the most important information about the proposed new division, so that the boards and the members of the three divisions can review it and provide feedback for needed adjustments. Our goal is to present essential information that will allow the members to vote with confidence on the proposal to form one new division on the ALA ballot in the spring of 2019. If the membership vote passes, then we will take the proposal to the ALA Committee on Organization for finalization.

On this occasion, I would also like to bring to everyone's attention an inherent tension between two ideas that many of us hold as association members regarding realignment. One is that more member involvement in determining realignment-related details at an early stage is essential to the success of the new division. The other is that we can decide whether or not to support the new division only after the leadership first presents us with a clear, specific, and detailed picture of what the new division will look like. The problem is that we cannot have both at the same time. As members, if we want to be involved at an early stage of reorganization, we will have to accept that there will be no ready-made set of clear and specific details about the division waiting for us to simply say yes or no to. We will be required to work through our collective ideas to decide on those details ourselves. It will be a messy, iterative, and somewhat confusing process for all of us. There is no doubt that this will be hard work for both the LITA leadership and LITA members. But it is also an amazing opportunity.
Imagine a new division where (a) innovative ideas and projects are shared and tested through open conversation and collaboration among library professionals in a variety of functional areas such as systems and technology, metadata and cataloging, and management and administration; (b) frank and inspiring dialogues take place between front-line librarians and administrators about vexing issues and exciting challenges; and (c) new librarians learn the ropes, are supported throughout their careers as their responsibilities and areas of specialization change, are mentored to be future leaders, and get to develop the next generation of leaders as they themselves achieve their goals. Furthermore, I believe that the process of building this kind of new association from the ground up will be a truly rewarding experience. We had an opportunity to discuss and share our collective hope and vision for the new division at the joint meeting, and that vision is an inspiring one: a division that is member-driven, nimble and responsive, transparent and inclusive, and not afraid to take risks. Can we create a new association that breaks down our own silos and builds bridges for better communication and collaboration to move our profession forward?

My hope is that we can model and embody the change we want to see, starting with the reorganization process itself. If we want to build a new association that is inclusive, transparent, and nimble, we should be able to build such an association in precisely that manner: inclusively, transparently, and nimbly. If we are successful, our identity as members of this new division will be rebuilt around the very spirit and energy of continuing innovation, experimentation, and collaboration across the different functional silos of librarianship, rather than around what we have in our job titles.

Many LITA members and ITAL readers are leaders in their field and care deeply about the continued success and innovation of LITA and ITAL. I would like to invite all of you to participate in this effort of three-division realignment and to inform and lead our way together. While the boards of the three divisions are working on the proposal, there will be multiple calls for member participation. Keep your eye out for new updates posted in the ALA Connect community "ALCTS/LLAMA/LITA Alignment Discussion" at https://connect.ala.org/communities/community-home?CommunityKey=047c1c0e-17b9-45b6-a8f6-3c18dc0023f5. All information in this group site is viewable to the public. LITA, ALCTS, and LLAMA members can also join the group, post suggestions and feedback, and subscribe to updates. Where would you like LITA to be next year, and the year after? Let us take LITA there, together.

ENDNOTE

1 Andromeda Yelton, "President's Message," Information Technology and Libraries 37, no. 1 (March 19, 2018): 2–3, https://doi.org/10.6017/ital.v37i1.10386.

10714 ---- Is Creative Commons a Panacea for Managing Digital Humanities Intellectual Property Rights? Yi Ding INFORMATION TECHNOLOGY AND LIBRARIES | SEPTEMBER 2019 34 Yi Ding (yi.ding@csun.edu) is Online Instructional Design Librarian and Affordable Learning $olutions Co-coordinator, California State University, Northridge.

ABSTRACT

Digital humanities is an academic field that applies computational methods to explore topics and questions in the humanities.
Digital humanities projects, as a result, consist of a variety of creative works different from those in traditional humanities disciplines. Born to provide free, simple ways to grant permissions to creative works, Creative Commons (CC) licenses have become top options for many digital humanities scholars handling intellectual property rights in the US. However, there are limitations of using CC licenses that are sometimes unknown to scholars and academic librarians. By analyzing case studies and influential lawsuits about intellectual property rights in the digital age, this article advocates for a critical perspective on copyright education and provides academic librarians with specific recommendations for advising digital humanities scholars to use CC licenses with four limitations in mind: 1) the pitfall of a free license; 2) the risk of irrevocability; 3) the ambiguity of NonCommercial and NonDerivative licenses; and 4) the dilemma of ShareAlike and the open movement.

INTRODUCTION

Along with an increasing amount of digital scholarship, open access became a preferred, more affordable model for scholarly communication in the US.1 In particular, digital humanists envision a sharing culture in which digital content and tools can be widely distributed through open access licenses.2 Creative Commons (CC) licenses, with their promise to provide simple ways to grant permissions to creative works, became top options for many digital humanities scholars handling intellectual property rights in the US. However, Creative Commons is not a panacea for managing the intellectual property rights of digital scholarship. Digital humanities projects usually consist of complicated components, and their intellectual property rights involve various licenses and stakeholders. With misunderstandings of intellectual property and CC licenses, many scholars are not fully aware of the implications of using CC licenses, which cannot provide legal solutions to all intellectual property rights issues. The increasingly popular application and commercialization of digital humanities projects in the US further complicate the issue. Based on case studies and influential lawsuits involving the topic in the US, this article critically investigates the limitations of using CC licenses and recommends that academic librarians provide scholars with more sophisticated suggestions on using CC licenses as well as education on intellectual property rights in general.
LITERATURE REVIEW

Usually identified as rights experts, academic librarians are in a unique position to provide copyright education in the digital humanities field through consultation, instruction, and other means to faculty and students.3 Librarians sometimes position themselves as "reuse evangelists" who embrace the vision of Creative Commons by applying CC licenses as well as introducing CC licenses to the campus community through guides and webpages.4 Yet few discussions of the limitations of CC licenses have been brought up in the library community.5 Drawing from scholarly literature in the law field and primary sources including lawsuits, websites, magazine articles, and newspaper articles involving this topic, this article intends to bring a critical perspective into the copyright education that academic librarians provide by analyzing the four limitations of CC licenses in managing the intellectual property rights of digital humanities projects.

In the law community, scholars have examined the limitations of open licensing and Creative Commons. Katz elaborated on the mismatch between the vision of Creative Commons and its licensors as well as on how the incompatibility of CC licenses may result in potential detriment to the dissemination of knowledge.6 Scholars have since referred to Katz in extensive discussions of the limitations of CC licenses in different realms of copyrighted works. For example, Johnson investigated several limitations of CC licenses for entertainment media, including those with ShareAlike, NonCommercial, and NonDerivative terms.7 Lukoseviciene acknowledged the efficiency of CC licenses while pointing out their limitation in ensuring equity in a sharing culture.8 When discussing the problems of CC licenses in data sharing, Khayyat and Bannister echoed Katz's critique of the limitation of CC licenses in combining copyrighted works carrying different types of licenses.9

Scholars have also addressed problems related to intellectual property rights other than copyright when applying CC licenses. For example, Hietanen discussed the problems of license interpretation and concluded that although CC licenses are useful for "low value - high volume licensing," they fail to address some important intellectual property rights, including privacy and moral rights.10 Burger demonstrated how CC commercial licenses have encouraged publicity-right infringement in several cases.11 Nevertheless, none of the above scholars discussed the implication of the limitations of CC licenses for digital scholarship. To solve the problem of excessive open-source licenses, Gomulkiewicz suggested a license-selection "wizard" modeled on what Creative Commons offers, which demonstrates the limitation of CC licenses in managing the intellectual property rights of code, a common component of many digital humanities projects.12

This article does not aim to conduct a comprehensive assessment of the pitfalls of CC licenses in digital scholarship or to make legal recommendations for managing the intellectual property rights of digital humanities projects. Rather, it discusses four limitations of CC licenses that are usually overlooked but essential for academic librarians to educate patrons about in the digital humanities field.
With the development of the digital humanities field and more students involved in it, academic librarians should educate both faculty scholars and emerging scholars about the implications of applying CC licenses.13

FOUR LIMITATIONS OF CC LICENSES

Is Creative Commons Really Free?—The Pitfall of a Free License

One major reason that scholars and institutions are using CC licenses is the ease of applying them to creative works. The Directory of Open Access Journals (DOAJ), which is regarded as "both an important mode of discovery and marker of legitimacy within the world of open access publishing," now recommends CC licenses as a best practice.14 DOAJ explicitly encourages scholars to use Creative Commons' "simple and easy" license chooser tool. Indeed, the Creative Commons website provides scholars and institutions a very user-friendly way to select and apply a license to copyrightable works.15 Anyone can place a CC license on a work by copying and pasting from its website. However, this oversimplified process of handling intellectual property rights may mislead both copyright owners and users of copyrighted works to overlook the pitfalls of this free license, including unintentional infringements of copyright and other intellectual property rights.

More specifically, one prominent legal formality of CC licenses is that licensors do not need to pay to register with Creative Commons to apply a CC license. As indicated by the Creative Commons website, a CC license is legally valid as soon as a user applies it to any material the user has the legal right to license. Creative Commons also does not "require registration of the work with a national copyright agency."16 While copyright protection is automatic the moment a work is created and "fixed in a tangible form," there are various advantages to registering copyrighted works through the United States Copyright Office to establish a public record of the copyright claim.17 One foremost advantage of copyright registration is that copyright owners can file an infringement suit for works of U.S. origin in court. Filing a registration before or within five years of publishing a work will put the copyright owner in a stronger position in court to validate the copyright.18 Additionally, copyright registration enables one to be awarded statutory damages and attorney's fees and to gain protection against the importation of infringing copies.19

The emphasis on a free-to-use license, along with the lack of clarification of the functions of copyright registration on the website of Creative Commons, may not only mislead scholars into ignoring important legal formalities within copyright law, but also increase the abuse of original materials by stakeholders such as predatory publishers. One example is how the Integrated Study of Marine Mammals repackaged existing articles by taking advantage of the Creative Commons licenses used by PLOS ONE, which has been publishing articles on digital humanities.20

The oversimplified process of using CC licenses advocated by the Creative Commons website may also prevent licensors from double-checking or clarifying whether they have the legal right to license a work.
In 2013, Persephone Magazine, which had used an image carrying a Creative Commons license, was sued for $1,500 over that use. It turned out the photo did not belong to the person who uploaded it with a CC license, which led to 73 companies that had used it being sued. Persephone Magazine claimed that $1,500 was more than its entire advertising revenue for the year and that it had to ask its users to donate just to keep the site going.21 Therefore, scholars of digital humanities projects, which usually include different types of content such as artworks and photographs, should be wary of using CC-licensed images. Otherwise, a freely available license might end up costing a scholar unexpected money and energy. In the meantime, when deciding to put their projects under CC licenses or to publish their works in a journal that requires CC licenses, scholars should also be reminded to make accurate and clear copyright statements to prevent innocent infringements of other copyright owners' works. For example, a team of art historians who create an online map of architecture in ancient China are very likely to use and critique other people's images in digital projects under fair use. These digital humanists should cite image sources and clarify the scope of the CC license that they apply to their project.

It is understandable that, in order to promote an open, sharing culture, the application of a CC license is intentionally designed by Creative Commons to be simple and free so as to fulfill its mission. However, the misuse of a free license can lead to false licenses, more innocent infringements, and ultimately costs. Academic librarians should become aware of these pitfalls and provide more in-depth training on CC licenses to scholars, especially by collaborating with campus centers of digital humanities or language and literature faculty as well as other institutional research support departments, as suggested by Fraser and Arnhem.22

Is Creative Commons Really Safe?—The Risk of Irrevocability

Similar to the pitfall of inaccurate licenses, the irrevocability of CC licenses can also be problematic. A "revocable" license is one that can be terminated by the licensor at any time during the term of the license agreement. An "irrevocable" license, on the other hand, cannot be terminated if there is no breach. All CC licenses are irrevocable.23 Licenses and contracts usually have an effective date of termination, and even when they do not, most courts hold that simple, nonexclusive licenses with unspecified durations that are silent on revocability are revocable at will.24 As a result, the irrevocability of CC licenses can be easily overlooked by CC licensors. This means that while in traditional academic publishing and other means of disseminating research, scholarly, and creative output a scholar can revise the copyright agreement he or she has established with a publisher or a scholarly communication venue, thanks to the usually clear rules on termination dates and revocability, it is impossible to revoke a CC license. This discrepancy in revocability between traditional copyright agreements and CC licenses may put copyright owners at a disadvantage, especially because many of them apply noncommercial CC licenses.
Copyright experts have warned scholars to keep in mind that once a "nonexclusive license," which CC noncommercial licenses are, has been chosen to grant one's work, the scholar has lost potential opportunities to "license the same work on an exclusive basis," which is the case in the commercialization of a digital humanities work.25 We can understand this pitfall of the irrevocability of CC licenses through a case from late 2014. A plan by Yahoo to begin selling prints of images uploaded to Flickr was met with anger by users, even though Yahoo only used photos with Creative Commons licenses that explicitly allowed commercial uses. Although Yahoo's use of CC-licensed works was legal, users who had initially applied CC licenses allowing commercial use did not want the company making money from canvas prints of the photos they posted to Flickr.26 Had these copyright owners better understood the irrevocability of CC licenses, they might have chosen a different type of CC license with caution. Bill of Rights, a community of people advocating for protecting the intellectual property rights of artists, even called this kind of commercial use "abuse."27 Although most digital scholars, like those Flickr users, have a genuine interest in making their works available to as many people as possible, it can be hard to gauge their reactions to all the unforeseen outcomes of applying CC licenses to their works. Therefore, scholars need more institutional support and education to become aware of the irrevocability of CC licenses when managing the intellectual property rights of their digital scholarship projects.

This institutional awareness-building is especially important because of the lack of support from Creative Commons. Irrevocability is listed in the "Considerations before licensing" section on the website of Creative Commons. However, scholars may easily overlook the irrevocability of CC licenses for two reasons. First, the 6,500-plus-word "Considerations before licensing" section is not a mandatory step for licensors to go through; it is simply a clickable link from the "Choose a license" webpage of Creative Commons.28 Second, although every CC license consists of three layers, the lawyer-readable legal code, the human-readable deed, and the machine-readable code, the irrevocability of CC licenses can be easily buried in those texts when a layperson without any experience or training in CC licenses looks for the simplest way to promote and expose their works as much as possible.29
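To make the three layers concrete: the chooser's copy-and-paste snippet is HTML whose rel="license" attribute carries the machine-readable layer, while the visible sentence links to the human-readable deed. The small Python sketch below merely assembles such a snippet; the function and its simplified license label are this author's illustration, not Creative Commons tooling, though the deed URLs follow the real creativecommons.org pattern.

    BASE_URL = "https://creativecommons.org/licenses"

    def license_notice(code: str, version: str = "4.0") -> str:
        # The rel="license" attribute is what makes the notice machine-readable:
        # crawlers can discover the terms without parsing the surrounding prose.
        url = f"{BASE_URL}/{code.lower()}/{version}/"
        label = f"CC {code.upper()} {version}"  # simplified label, not the full legal name
        return f'This work is licensed under a <a rel="license" href="{url}">{label}</a> license.'

    # For example, for an Attribution-NonCommercial-ShareAlike project page:
    print(license_notice("by-nc-sa"))

Because the deed and the legal code live behind the linked URL rather than in the page itself, a licensor can apply a license, irrevocably, without ever having read the terms, which is precisely the risk described above.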
Some may suggest putting everything under a noncommercial license. However, this is not an option on some platforms and is even discouraged by some digital scholarship repositories. For example, the Open Access Scholarly Publishers Association strongly encourages the use of the CC-BY license wherever possible.30 The rationale behind the recommendation is the hope of making scientific findings available for innovation as well as making open-access journals sustainable with sufficient profit to operate. Driven by the same objectives, CC-BY has become the gold standard for OA publishing; the three largest OA publishers (BioMed Central, PLOS, and Hindawi) all use this license.31 In particular, the often multimedia and variable characteristics of digital humanities projects can expose them to even more infringement issues in the future.

One example of this is RomeLab, a project focusing on the recreation of the Roman Forum, whose website is made up of multiple separate components. The project's website is built with the Drupal content management system and is integrated with a 3D virtual-world component, through which users can access the RomeLab website and walk through the virtual space of Rome itself. RomeLab is currently under a Creative Commons Attribution-NonCommercial License. As a project funded by the Mellon Foundation, RomeLab is required to offer a "nonexclusive, royalty-free, worldwide, perpetual, irrevocable license to distribute" its data.32 However, it was never clear to the researchers creating the site how to release data that only work within Unity, the proprietary game engine they used to produce the virtual space, and, more importantly, all the 3D models and pictures. Simply putting the whole site under the Creative Commons Attribution-NonCommercial License does not automatically make its research data accessible to the public. In this case, the irrevocability of CC licenses further complicates the issue of CC licenses being oversimplified. Specifically, since the RomeLab website is also equipped with a chat feature and a multiplayer function allowing multiple users to interact with each other, the project has great potential to make a profit if repurposed as a teaching tool or even an educational game in the future. Whether or not the researchers of RomeLab manage to make their research data publicly available, CC licenses are not a panacea for handling the conflicting data-release expectations and intellectual property rights of the Unity engine and the Mellon Foundation. It is therefore recommended that digital scholars consider various data types and licensing options before exclusively applying irrevocable CC licenses to their creative works.
Moreover, if the creator of RomeLab wanted to produce a video introduction to the 3D world of the project, he should take into consideration the limitations of CC licenses before disseminating his work via platforms such as YouTube. In 2014, a user found out that somebody had taken his drone video of Burning Man 2013 and reposted it in its entirety to YouTube under the inaccurate and misleading title "Drone's Eye View of Burning Man 2014."33 When everyone was looking for the newest drone video of Burning Man in 2014, the video posted by this other person received millions of views, which earned them money from YouTube advertising. The reason the user could not sue this other person is that he had originally licensed his video under the CC BY license, which allows commercial use, and which unfortunately is YouTube's only CC license option.34 Had the original videographer better understood the irrevocability of CC licenses, he might have chosen a different platform to disseminate his video or at least utilized other ways to protect his copyright. Scholars would not want this kind of abuse of their original works and thus should be more cautious about the irrevocability of CC licenses.

Furthermore, YouTube and many other platforms that digital humanities scholars use to disseminate their research, scholarly, and creative work fail to provide effective functionality and incentives to fulfill CC's attribution requirement.35 The CC BY license stipulates, "If supplied, you must provide the name of the creator and attribution parties, a copyright notice, a license notice, a disclaimer notice, and a link to the material."36 To find this piece of information on YouTube, however, someone must go to a video's landing page and first click the "SHOW MORE" text in the description below the video. Although the CC Attribution license and its link are clearly displayed there, someone must then click a "View attributions" link to discover the original author's credit and the source video link. The difficulty of going through these steps may prevent an average YouTube user, or most potential licensees of a CC-licensed digital scholarly work, from learning who originally created any piece of content and whether what they are viewing was partially or wholly created by someone else.37 Since CC licenses only provide licensees with a very general requirement to attribute, licensees are allowed to attribute "in any reasonable manner."38 With the only limitation being "not in any way that suggests the licensor endorses you or your use," licensees are not incentivized to attribute the original work accurately and thus to help disseminate that work while crediting the copyright owner.39 While users can search for registered works on the official website of the United States Copyright Office, there is no way to conduct a comprehensive search for works under CC licenses. Creative Commons does not maintain a database of works distributed under CC licenses, and although there are search engines and websites for works under CC licenses, there is no way to conduct an exhaustive search.40 This can create hurdles for future licensees of a derivative work in accurately and clearly attributing the original work. One of the most important motivations for scholars to distribute their works under CC licenses is to gain more exposure. Due to all of the above limitations and others discussed in this paper, scholars should be more cautious about the irrevocability of CC licenses and the lack of an enforcement and support system to ensure that licensees accurately attribute the original work.
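Because the license only demands attribution "in any reasonable manner," practitioners often fall back on the informal TASL pattern (Title, Author, Source, License) when crediting a CC-licensed work. The helper below is a hypothetical illustration of what a platform could generate automatically; the function and the sample values are this author's assumptions, not part of any CC or YouTube tooling.

    def build_attribution(title: str, author: str, source: str,
                          license_name: str, license_url: str) -> str:
        # Assemble the four TASL elements into a single human-readable credit line.
        return (f'"{title}" by {author}, {source}, '
                f'licensed under {license_name} ({license_url}).')

    # Hypothetical example values:
    print(build_attribution(
        "Drone's Eye View of Burning Man 2013",
        "Original Videographer",
        "https://example.org/source-video",
        "CC BY 4.0",
        "https://creativecommons.org/licenses/by/4.0/",
    ))

A platform that surfaced such a credit line by default, rather than hiding it behind a "View attributions" link, would remove much of the friction described above.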
Is Creative Commons Really Clear?—The Ambiguity of NonCommercial and NonDerivative Licenses

NonCommercial License

In the legal code of a CC Attribution-NonCommercial-ShareAlike License, NonCommercial is defined as "not primarily intended for or directed towards commercial advantage or monetary compensation. For purposes of this Public License, the exchange of the Licensed Material for other material subject to Copyright and Similar Rights by digital file-sharing or similar means is NonCommercial provided there is no payment of monetary compensation in connection with the exchange."41 This seemingly clear statement can create confusion and problems in the real world. While a commercial use weighs against fair use, copyright law does not find infringement solely because a use is commercial. In fact, it is hard to determine that a use is totally noncommercial. In the case of Princeton University Press v. Michigan Document Services, Inc., Michigan Document Services (MDS) being a commercial copy shop weighed against a finding of fair use, but MDS's use being commercial was only one of the four factors in a fair-use analysis. In this case, the court held that MDS's commercial exploitation of the copyrighted works from Princeton University Press did not constitute fair use, although the court clarified that the educational use was "noncommercial in nature."42 There have been a number of cases in US copyright law where commercial uses have been ruled lawful fair use. By making commercial use a decisive factor in determining an illegal use, Creative Commons fails to specify real cases of commercial uses and thus oversimplifies the complicated copyright issues involving commercial uses that scholars should be aware of.

More specifically, many digital scholars nowadays post their articles and projects with noncommercially CC-licensed images on a website, the maintenance of which is seldom free. Similar to the case of Princeton University Press v. Michigan Document Services, Inc., the educational or scholarly use of those noncommercially licensed images should be considered "noncommercial in nature."43 However, if a digital humanist maintains a website that is subsidized partly by Google Ads or a company, the nature of the use of those noncommercially licensed images might be called into question, as in the case of Princeton University Press v. Michigan Document Services, Inc. Although in both situations the image is not "primarily intended for or directed towards commercial advantage or monetary compensation," the digital humanist may still increase the traffic of his site and thus profit from including those images on his site.44 The "different viewpoints and colliding interests" among commercial publishers, librarians, scholars, university administrators, and others may further complicate the already "ambiguous commercial nature of use" in fair use analysis that Creative Commons oversimplifies.45

The more recent case of Great Minds v. FedEx Office & Print Services, Inc. demonstrates this ambiguity of commercial use and one use of the CC NonCommercial license that is legal yet unexpected and unwanted by copyright owners. To be specific, Great Minds argued that FedEx should compensate it for the money the company made from copying materials that Great Minds distributed under a CC Attribution-NonCommercial-ShareAlike 4.0 license. In an amicus brief supporting FedEx Office, Creative Commons held that "entities using CC-licensed works must be free to act as entities do—including through employees and the contractors they engage in their service" and that otherwise "the value of the license would be significantly diminished."46 Creative Commons thereby demonstrated an interpretation of commercial use different from the ruling in Princeton University Press v. Michigan Document Services, in which the judge explicitly ruled the use to be commercial because the copying was performed on "a profit-making basis by a commercial enterprise" and clearly forbade the contract between this enterprise and a nonprofit organization to copy and distribute copyrighted content.47
In contrast, in the case of Great Minds v. FedEx Office & Print Services, Inc., the court held that Great Minds' nonexclusive public license, i.e., the CC Attribution-NonCommercial-ShareAlike 4.0 International Public License, "unambiguously permitted school districts to engage FedEx, for a fee, to reproduce" the copyrighted content.48 Scholars should therefore be wary of the complicated process and "several areas of uncertainty" surrounding Creative Commons, which can be easily overlooked when applying the "simple and easy" CC licenses.49 None of these interpretations of noncommercial uses by Creative Commons are specified in the generic License Deed. Compared to more customized licenses, which usually involve direct interactions between the licensor and the licensee, the free-of-charge CC licenses have a long way to go to protect both licensors and licensees from infringements and financial loss.

A study of noncommercial uses conducted by Creative Commons indicates that NonCommercial licenses account for "approximately two-thirds of all Creative Commons licenses associated with works available on the Internet."50 Kim confirmed this popularity of CC NonCommercial licenses, finding that "over 60 percent [of] Flickr users prohibit commercial use or derivative work."51 As Kim elaborated, and as the previous section of this paper on the irrevocability of CC licenses shows, both commercial and noncommercial CC licenses are "likely to be detrimental to potential professional careers" of copyright owners.52 Nevertheless, as Creative Commons itself states, it does not offer legal advice.53 When providing copyright education, academic librarians should therefore remind digital scholars to be careful in using both commercial and noncommercial content and in making their own content available for noncommercial purposes.

NonDerivative License

Similarly, scholars should be reminded to take a critical view of the NonDerivative terms of CC licenses. According to Title 17, Section 101 of the Copyright Act, a "derivative work" is a work based upon one or more preexisting works in which the preexisting work may be recast, transformed, or adapted.54 However, Creative Commons uses the phrase "Adapted Material" to define derivative work in the legal code for NonDerivative uses.55 For musical works, Creative Commons has an understanding of derivative works that differs from the Copyright Act's definition: "Adapted Material is always produced where the Licensed Material is synched in timed relation with a moving image."56 This means that while using an original soundtrack in a video is not a derivative work according to the Copyright Act, videos that use an ND-licensed song violate the terms of the CC license. Similar to the differences regarding revocability and commercial use between Creative Commons and the Copyright Act discussed earlier in this article, this different understanding of derivative works should be made known to scholars.
Specifically, when providing copyright education to scholars, academic librarians should make it clear that a NonDerivative license cannot alienate the fair use rights of users and that a NonCommercial NonDerivative license does not prevent companies from using a work in a parody.57 Some licensors of CC licenses may not share Creative Commons' vision of an open, sharing culture, as suggested by the prevalence of ND licenses.58 Therefore, instead of providing generic recommendations on using CC licenses, academic librarians should "balance the interests of information users and rights holders" by providing a more sophisticated and critical perspective when educating the scholarly community about the NonDerivative CC licenses.59

Is Creative Commons Really Sustainable?—The Dilemma of ShareAlike and Open Access

Incompatible ShareAlike Licenses

For many digital scholars, the ShareAlike term in CC licenses is intended to distribute their works more broadly and openly, since a licensee is required by Creative Commons to "distribute . . . [their contributions] . . . under the same license as the original."60 Nevertheless, incompatibility issues arise that prevent a more open distribution of works. For example, since the Creative Commons system offers two different ShareAlike licenses, a scholar cannot create a new derivative work combining two ShareAlike works whose respective licenses have different terms. It is the open and accessible nature of CC-licensed works that makes them ideal for scholars, including digital humanists, to collaborate on, but ironically the ShareAlike function can create the risk of "an intractable thicket" if incompatibilities between those licenses hinder future collaboration.61 Creative Commons does provide a series of compatible licenses, but only the same licenses with differences in CC versions are considered compatible.62
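The rule just described can be stated mechanically. The toy Python check below is illustrative only: the function, the license-code format, and the simplification that any two versions within one ShareAlike family combine freely are assumptions of this sketch (the actual compatibility rules published by Creative Commons are version-directional and more nuanced).

    SHAREALIKE_FAMILIES = {"BY-SA", "BY-NC-SA"}

    def family(license_code: str) -> str:
        # "BY-SA-4.0" -> "BY-SA": strip the trailing version number.
        return license_code.rsplit("-", 1)[0]

    def can_combine(a: str, b: str) -> bool:
        """Can material under licenses a and b be merged into one derivative work?
        A remix can carry only one ShareAlike obligation, so two different
        ShareAlike families can never be satisfied at once."""
        fam_a, fam_b = family(a), family(b)
        if fam_a in SHAREALIKE_FAMILIES or fam_b in SHAREALIKE_FAMILIES:
            return fam_a == fam_b
        return True

    assert can_combine("BY-SA-3.0", "BY-SA-4.0")         # same family, different versions
    assert not can_combine("BY-SA-4.0", "BY-NC-SA-4.0")  # the "intractable thicket"

The second assertion is the "intractable thicket" in miniature: material under BY-SA and BY-NC-SA can never satisfy both ShareAlike obligations in a single derivative work.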
Against Open Access?

These incompatibilities between certain CC licenses have been pointed out by copyright experts as limiting "the future production and distribution of creative works" and as being even "anti-public domain."63 In 2009, the cofounder and CEO of Creative Commons, Lawrence Lessig, pointed out the perils of openness in government in his article "Against Transparency."64 Echoing his argument that "whether and how new information is used to further public objectives depends upon its incorporation into complex chains of comprehension, action, and response," this paper advocates a critical perspective on CC licenses in digital scholarship. Apart from all the limitations of CC licenses discussed already, a more unsettling misuse of CC licenses is a failure to recognize other rights in a work beyond copyright. In 2011, the image of an underage girl, which had been placed on Flickr under a CC license, was used in an advertising campaign for mobile phone services.65 Although after the lawsuit Creative Commons added a term to the legal deed of the latest version (4.0) of every CC license explicitly stating that other rights, such as publicity, privacy, or moral rights, may limit how the material can be used, the case reveals the perils of openness.66 When providing copyright education, academic librarians should not only warn digital scholars of this limitation of CC licenses but also encourage them to include a statement of intellectual property rights, including privacy and other rights, on their digital scholarship websites to reduce abuses and innocent infringements.

CONCLUSION

Even though CC licenses are helpful for digital humanists seeking more exposure for their work, these licenses are still being improved. Creative Commons has pledged to the community to "clarify how the NC limitation works in the practical world."67 Yet when providing copyright consultation or partnering with digital humanities scholars, academic librarians should warn these scholars, as both licensors and licensees, about the sophisticated implications not only of the NonCommercial license but also of the other characteristics and limitations of CC licenses. Academic librarians should introduce digital scholars to a more critical view of CC licenses by collaborating with different campus stakeholders.68 While it is recommended that academic librarians suggest digital scholars place their creative works under a NonCommercial license, academic librarians should also educate them about the ambiguous definitions of commercial use as well as the possibility of commercial parody and other fair use situations. It is also recommended that academic librarians provide digital humanists with guidance on how to create intellectual property statements on their websites, which should address not only copyright but also privacy and other intellectual property rights.

Currently, a number of university libraries and nonprofit organizations, ranging from Duke University Library (http://library.duke.edu/) to the Library of Congress (https://www.flickr.com/photos/library_of_congress) and Wikipedia (https://en.wikipedia.org/wiki/Main_Page), use CC licenses for their entire sites.69 As CC license users, academic librarians should also be extremely careful when using CC-licensed pictures or music on the library's website. The safest approach is to only use ones that are in the public domain or that have been acquired by the library. Despite the use of free and simple CC licenses, academic libraries are recommended to include Terms of Use and Privacy sections on their websites to provide more detailed explanations of the function of CC licenses and intellectual property rights in general. The alignment between the visions of Creative Commons, digital humanities, and "higher education as a cultural and knowledge commons" puts academic librarians in a unique position to provide copyright education in the digital humanities field.70 Because of all the limitations of CC licenses, academic librarians should go beyond a simple endorsement of CC licenses and offer a more sophisticated and critical perspective when educating the scholarly community about CC licenses.
NOTES

1 Amanda Hornby and Leslie Bussert, "Digital Scholarship and Scholarly Communication," University of Washington Libraries, accessed November 30, 2016, https://www.uwb.edu/getattachment/tlc/faculty/teachingresources/newmedia.

2 Oya Y. Rieger, "Framing Digital Humanities: The Role of New Media in Humanities Scholarship," First Monday 15, no. 10 (October 11, 2010), http://firstmonday.org/ojs/index.php/fm/article/view/3198.

3 Elizabeth Joan Kelly, "Rights Instruction for Undergraduate Students: Needs, Trends, and Resources," College & Undergraduate Libraries 25, no. 1 (2018): 1–16, https://doi.org/10.1080/10691316.2016.1275910.

4 Daniel Hickey, "The Reuse Evangelist: Taking Ownership of Copyright Questions at Your Library," Reference & User Services Quarterly 51, no. 1 (2011): 9–11; "Research Guides: Image Resources: Creative Commons Images," Research Guides at UCLA Library, accessed April 28, 2019, https://guides.library.ucla.edu/c.php?g=180361&p=1185834; "Finding Public Domain & Creative Commons Media: Images," Harvard Library Research Guides, accessed April 28, 2019, https://guides.library.harvard.edu/c.php?g=310751&p=2072816. UCLA and Harvard are two good examples.

5 Lewin-Lane et al., "The Search for a Service Model of Copyright Best Practices in Academic Libraries," Journal of Copyright in Education and Librarianship 2, no. 2 (2018): 1–24. For example, this article's literature review of copyright education in academic libraries, conducted to identify best practices, does not discuss any limitations of CC licenses.

6 Zachary Katz, "Pitfalls of Open Licensing: An Analysis of Creative Commons Licensing," Idea: The Intellectual Property Law Review 46, no. 3 (2006): 391–413.

7 Eric E. Johnson, "Rethinking Sharing Licenses for Entertainment Media," Cardozo Arts & Entertainment Law Journal 26, no. 2 (2008): 391–440.

8 Aurelija Lukoseviciene, "Beyond the Creative Commons Framework of Production and Dissemination of Knowledge," http://dx.doi.org/10.2139/ssrn.1973967.

9 Mashael Khayyat and Frank Bannister, "Open Data Licensing: More than Meets the Eye," Information Polity: The International Journal of Government & Democracy in the Information Age 20, no. 4: 231–52, https://doi.org/10.3233/IP-150357.

10 Herkko Hietanen, "The Pursuit of Efficient Copyright Licensing: How Some Rights Reserved Attempts to Solve the Problems of All Rights Reserved," Lappeenranta University of Technology, 2008.
11 Christa Engel Pletcher Burger, "Are Publicity Rights Gone in a Flash?: Flickr, Creative Commons, and the Commercial Use of Personal Photographs," Florida State Business Review 8 (2009): 129, https://ssrn.com/abstract=1476347.

12 Robert W. Gomulkiewicz, "Open Source License Proliferation: Helpful Diversity or Hopeless Confusion?," Washington University Journal of Law & Policy 30 (2009): 261, Expanded Academic ASAP, accessed April 28, 2019, http://link.galegroup.com.libproxy.csun.edu/apps/doc/A208273638/EAIM?u=csunorthridge&sid=EAIM&xid=4bbf2442.

13 Jacob H. Rooksby, "A Fresh Look at Copyright on Campus," Missouri Law Review (Summer 2016): 769, General OneFile, accessed April 27, 2019, http://link.galegroup.com.libproxy.csun.edu/apps/doc/A485538679/ITOF?u=csunorthridge&sid=ITOF&xid=1f2822f3.

14 "eScholarship: Copyright & Legal Agreements," accessed December 1, 2016, http://escholarship.org/help_copyright.html#creative.

15 "Directory of Open Access Journals," DOAJ, accessed December 1, 2016, https://doaj.org.

16 "Frequently Asked Questions—Creative Commons," accessed December 7, 2016, https://creativecommons.org/faq/#do-i-need-to-register-with-creative-commons-before-i-obtain-a-license.

17 "Copyright in General," U.S. Copyright Office, accessed July 30, 2019, https://www.copyright.gov/help/faq/faq-general.html.

18 "Why Should I Register My Work If Copyright Protection Is Automatic?," Copyright Alliance, accessed July 28, 2019, https://copyrightalliance.org/ca_faq_post/copyright-protection-ata/.

19 "Copyright Basics," U.S. Copyright Office and Library of Congress, accessed November 30, 2016, https://www.copyright.gov/circs/circ01.pdf#page=7.

20 Phil Clapham, "Are Creative Commons Licenses Overly Permissive? The Case of a Predatory Publisher," BioScience (2018): 842–43, accessed April 20, 2019, https://doi.org/10.1093/biosci/biy098; Cornelius Puschmann and Marco Bastos, "How Digital Are the Digital Humanities? An Analysis of Two Scholarly Blogging Platforms," PLOS ONE 10, no. 2 (2015), accessed April 20, 2019, https://doi.org/10.1371/journal.pone.0115035.

21 "Why Your Blog Images Are a Ticking Time Bomb," Koozai.com, accessed December 2, 2016, https://www.koozai.com/blog/content-marketing-seo/blog-sued-for-images/.
White and Heather Gilbert eds., Laying the Foundation: Digital Humanities in Academic Libraries (West Lafayette: Purdue University Press, 2016), ProQuest Ebook Central. 23 “Considerations for Licensors and Licensees—Creative Commons,” accessed December 7, 2016, https://wiki.creativecommons.org/wiki/Considerations_for_licensors_and_licensees. 24 “The Terms ‘Revocable’ and ‘Irrevocable’ in License Agreements: Tips and Pitfalls,” accessed December 7, 2016, http://www.sidley.com/news/the-terms-revocable-and-irrevocable-in- license-agreements-tips-and-pitfalls-02-21-2013. 25 Mark Seeley and Lois Wasoff, “Legal Aspects and Copyright-15,” in Academic and Professional Publishing, edited by Robert Campbell, Ed Pentz, and Ian Borthwick (Cambridge, UK: Elsevier Ltd, 2012), 355-83. 26 Douglas MacMillan, “Fight Over Yahoo’s Use of Flickr Photos,” Wall Street Journal, November 25, 2014, sec. Tech, http://www.wsj.com/articles/fight-over-flickrs-use-of-photos-1416875564. 27 “Flickr Apologizes but What About CC Abuses by Others?,” accessed December 7, 2016, http://www.artists-bill-of-rights.org/news/campaign-news/flickr-apologizes-but-what- about-cc-abuses-by-others?/. 28 “The Terms ‘Revocable’ and ‘Irrevocable’ in License Agreements: Tips and Pitfalls,” accessed December 7, 2016, http://www.sidley.com/news/the-terms-revocable-and-irrevocable-in- license-agreements-tips-and-pitfalls-02-21-2013. 29 “Legal Code—Creative Commons,” accessed December 7, 2016, https://wiki.creativecommons.org/wiki/Legal_code. 30 “Why CC-BY?—OASPA,” accessed December 7, 2016, http://oaspa.org/why-cc-by/. 31 “Why CC-BY?—OASPA.” 32 “Intellectual Property Policy,” The Andrew W. Mellon Foundation, accessed July 28, 2019, https://mellon.org/grants/grantmaking-policies-and-guidelines/grantmaking- policies/intellectual-property-policy/. 33 “Why I’m Giving up on Creative Commons on YouTube,” Eddie.com, September 6, 2014, http://eddie.com/2014/09/05/why-im-giving-up-on-creative-commons-on-youtube/. 34 “Creative Commons—Attribution 4.0 International—CC BY 4.0,” accessed December 7, 2016, https://creativecommons.org/licenses/by/4.0/. 35 “Why I’m Giving up on Creative Commons on YouTube.” https://doi:10.1371/journal.pone.0115035. https://www.koozai.com/blog/content-marketing-seo/blog-sued-for-images/ https://wiki.creativecommons.org/wiki/Considerations_for_licensors_and_licensees http://www.sidley.com/news/the-terms-revocable-and-irrevocable-in-license-agreements-tips-and-pitfalls-02-21-2013 http://www.sidley.com/news/the-terms-revocable-and-irrevocable-in-license-agreements-tips-and-pitfalls-02-21-2013 http://www.wsj.com/articles/fight-over-flickrs-use-of-photos-1416875564. http://www.artists-bill-of-rights.org/news/campaign-news/flickr-apologizes-but-what-about-cc-abuses-by-others?/ http://www.artists-bill-of-rights.org/news/campaign-news/flickr-apologizes-but-what-about-cc-abuses-by-others?/ http://www.sidley.com/news/the-terms-revocable-and-irrevocable-in-license-agreements-tips-and-pitfalls-02-21-2013. http://www.sidley.com/news/the-terms-revocable-and-irrevocable-in-license-agreements-tips-and-pitfalls-02-21-2013. 
https://wiki.creativecommons.org/wiki/Legal_code http://oaspa.org/why-cc-by/ https://mellon.org/grants/grantmaking-policies-and-guidelines/grantmaking-policies/intellectual-property-policy/ https://mellon.org/grants/grantmaking-policies-and-guidelines/grantmaking-policies/intellectual-property-policy/ http://eddie.com/2014/09/05/why-im-giving-up-on-creative-commons-on-youtube/ https://creativecommons.org/licenses/by/4.0/ INFORMATION TECHNOLOGY AND LIBRARIES | SEPTEMBER 2019 46 36 “Creative Commons—Attribution 4.0 International—CC BY 4.0.” 37 “Why I’m Giving up on Creative Commons on YouTube.” 38 “Creative Commons—Attribution 4.0 International—CC BY 4.0.” 39 Ibid. 40 “CC Search,” accessed December 7, 2016, https://search.creativecommons.org/. 41 “Creative Commons—Attribution-NonCommercial-ShareAlike 4.0 International—CC BY-NC-SA 4.0,” accessed December 7, 2016, https://creativecommons.org/licenses/by-nc- sa/4.0/legalcode. 42 “U.S. Copyright Office Fair Use Index,” U.S. Copyright Office, accessed April 21, 2019, https://www.copyright.gov/fair-use/. 43 Ibid. 44 Ibid. 45 Jerry D Campbell, “Intellectual Property in a Networked World: Balancing Fair Use and Commercial Interests,” Library Acquisitions: Practice and Theory 19, no. 2 (1995): 179-84, https://doi:10.1016/0364-6408(95)00020-A; Igor Slabykh, “Ambiguous Commercial Nature of Use in Fair Use Analysis,” AIPLA Quarterly Journal 46, no. 3 (2018): 293-339. 46 “Defending Noncommercial Uses: Great Minds v Fedex Office,” Creative Commons, August 30, 2016, https://creativecommons.org/2016/08/30/defending-noncommercial-uses-great- minds-v-fedex-office/. 47 “Princeton University Press v. Michigan Document Services,” Bitlaw, accessed December 7, 2016, http://www.bitlaw.com/source/cases/copyright/pup.html#IIIA. 48 Justia, “Great Minds v. FedEx Office & Print Services, Inc,” Stanford Copyright and Fair Use Center, March 21, 2018, https://fairuse.stanford.edu/case/great-minds-v-fedex-office-print- services-inc/. 49 Minjeong Kim, “The Creative Commons and Copyright Protection in the Digital Era: Uses of Creative Commons Licenses,” Journal of Computer‐Mediated Communication 13, no. 1 (2007): 187-209, https://doi:10.1111/j.1083-6101.2007.00392.x; “Directory of Open Access Journals,” DOAJ, accessed December 1, 2016, https://doaj.org. 50 “FEATURE: Creative Commons: Copyright Tools for the 21st Century,” accessed December 7, 2016, http://www.infotoday.com/online/jan10/Gordon-Murnane.shtml. 51 “The Creative Commons and Copyright Protection in the Digital Era: Uses of Creative Commons Licenses.” 52 Ibid. https://search.creativecommons.org/ https://creativecommons.org/licenses/by-nc-sa/4.0/legalcode https://creativecommons.org/licenses/by-nc-sa/4.0/legalcode https://www.copyright.gov/fair-use/ https://csun-primo.hosted.exlibrisgroup.com/primo-explore/fulldisplay?docid=TN_sciversesciencedirect_elsevier0364-6408(95)00020-A&context=PC&vid=01CALS_UNO&search_scope=EVERYTHING&tab=everything&lang=en_US https://csun-primo.hosted.exlibrisgroup.com/primo-explore/fulldisplay?docid=TN_sciversesciencedirect_elsevier0364-6408(95)00020-A&context=PC&vid=01CALS_UNO&search_scope=EVERYTHING&tab=everything&lang=en_US https://doi:10.1016/0364-6408(95)00020-A. 
https://csun-primo.hosted.exlibrisgroup.com/primo-explore/fulldisplay?docid=TN_gale_ofa570516325&context=PC&vid=01CALS_UNO&search_scope=EVERYTHING&tab=everything&lang=en_US https://csun-primo.hosted.exlibrisgroup.com/primo-explore/fulldisplay?docid=TN_gale_ofa570516325&context=PC&vid=01CALS_UNO&search_scope=EVERYTHING&tab=everything&lang=en_US https://creativecommons.org/2016/08/30/defending-noncommercial-uses-great-minds-v-fedex-office/ https://creativecommons.org/2016/08/30/defending-noncommercial-uses-great-minds-v-fedex-office/ http://www.bitlaw.com/source/cases/copyright/pup.html#IIIA https://fairuse.stanford.edu/case/great-minds-v-fedex-office-print-services-inc/ https://fairuse.stanford.edu/case/great-minds-v-fedex-office-print-services-inc/ https://doi:10.1111/j.1083-6101.2007.00392.x https://doaj.org/ http://www.infotoday.com/online/jan10/Gordon-Murnane.shtml IS CREATIVE COMMONS A PANACEA FOR MANAGING DIGITAL HUMANITIES IP RIGHTS? | DING 47 https://doi.org/10.6017/ital.v38i3.10714 53 “Creative Commons—Attribution-ShareAlike 4.0 International—CC BY-SA 4.0,” accessed December 7, 2016, https://creativecommons.org/licenses/by-sa/4.0/legalcode#s6a. 54 “17 U.S. Code § 101—Definitions,” Legal Information Institute, accessed April 20, 2019, https://www.law.cornell.edu/uscode/text/17/101. 55 “Creative Commons—Attribution-NonCommercial-NoDerivatives 4.0 International—CC BY-NC- ND 4.0,” accessed December 7, 2016, https://creativecommons.org/licenses/by-nc- nd/4.0/legalcode. 56 “Creative Commons—Attribution-NonCommercial-NoDerivatives 4.0 International—CC BY-NC- ND 4.0.” 57 The famous Campbell v. Acuff-Rose Music case established that a commercial parody could qualify as fair use. 58 Katz, “Pitfalls of Open Licensing,” 411. 59 “Professional Ethics,” Tools, Publications & Resources, American Library Association, February 6, 2019, http://www.ala.org/tools/ethics. 60 “Creative Commons—Attribution-ShareAlike 4.0 International—CC BY-SA 4.0,” accessed December 7, 2016, https://creativecommons.org/licenses/by-sa/4.0/. 61 Molly Houweling, “The New Servitudes,” Georgetown Law Journal 96, no. 3 (2008): 885-950. 62 “Compatible Licenses,” Creative Commons, accessed December 7, 2016, https://creativecommons.org/share-your-work/licensing-considerations/compatible- licenses/. 63 Katz, “Pitfalls of Open Licensing,” 391; Susan Corbett, “Creative Commons Licences, the Copyright Regime and the Online Community: Is There a Fatal Disconnect?,” The Modern Law Review 74, no. 4 (2011): 506, http://www.jstor.org/stable/20869091. 64 Lawrence Lessig, “Against Transparency,” New Republic, October 8, 2009, https://newrepublic.com/article/70097/against-transparency. 65 “Creative Commons CEO Apologizes To Virgin Mobile—Stock Photography News, Analysis and Opinion,” accessed December 7, 2016, https://www.selling-stock.com/Article/creative- commons-ceo-apologizes-to-virgin-mob. 66 “Frequently Asked Questions,” Creative Commons, accessed July 30, 2019, https://creativecommons.org/faq/#how-are-publicity-privacy-and-personality-rights- affected-when-i-apply-a-cc-license. 67 “Defending Noncommercial Uses: Great Minds v Fedex Office,” Creative Commons, August 30, 2016, https://creativecommons.org/2016/08/30/defending-noncommercial-uses-great- minds-v-fedex-office/. 
10738 ---- 10738 20190318 galley
Determining Textbook Cost, Formats, and Licensing with Google Books API: A Case Study from an Open Textbook Project
Eamon Costello, Richard Bolger, Tiziana Soverino, and Mark Brown
INFORMATION TECHNOLOGY AND LIBRARIES | MARCH 2019 https://doi.org/10.6017/ital.v38i1.10738
Eamon Costello (eamon.costello@dcu.ie) is Assistant Professor, Open Education at Dublin City University. Richard Bolger (richard.bolger@dcu.ie) is Lecturer at Dublin City University. Tiziana Soverino (tiziana.soverino@dcu.ie) is Researcher at Dublin City University. Mark Brown (mark.brown@dcu.ie) is Full Professor of Digital Learning, Dublin City University.

ABSTRACT
The rising cost of textbooks for students has been highlighted as a major concern in higher education, particularly in the US and Canada. Less has been reported, however, about the cost of textbooks outside of North America, including in Europe. We address this gap in the knowledge through a case study of one Irish higher education institution, focusing on the cost, accessibility, and licensing of textbooks. We report here on an investigation of textbook prices drawing from an official college course catalog containing several thousand books.
We detail how we sought to determine metadata for these books, including the formats they are available in, whether they are in the public domain, and their retail prices. We explain how we used the Google Books API to determine textbook costs automatically, and we make our code and dataset publicly available.

INTRODUCTION
The cost of textbooks is a hot topic for higher education. It has been reported that by 2014 the average student spent $1,200 annually on textbooks.1 Another study claimed that between 2006 and 2016 the cost of college textbooks increased at over four times the rate of inflation.2 Despite this rise in textbook costs, a survey of more than 3,000 US faculty members ("the Babson Survey") found that almost every course (98 percent) mandated a textbook or related study resources.3

One response to the challenge of rising textbook costs is open textbooks. Open textbooks are a type of open educational resource (OER). OERs have been defined as "teaching, learning, and research resources that reside in the public domain or have been released under an intellectual property license that permits their free use and repurposing by others. Open educational resources include full courses, course materials, modules, textbooks, streaming videos, tests, software, and any other tools, materials, or techniques used to support access to knowledge."4 OERs stem from the principle that access to education is a human right and that, as such, education should be accessible to all.5 Hence an open textbook is made available under terms that grant legal rights to the public not only to use it, but also to adapt and redistribute it. Creative Commons licensing is the most prevalent and well-developed intellectual property licensing tool for this purpose.

Open textbook projects aimed at promoting the publishing and redistribution of open textbooks, in both digital and print formats, have been growing. For example, the BCcampus project in Canada began in 2012 with the aim of creating a collection of open textbooks aligned with the most popular subject areas in British Columbia.6 The project has shown strong growth, with over 230 open digital textbooks now available and more than forty institutions involved. A significant recent development in open textbooks occurred in March 2018, when the US Congress announced a $5 million investment in an open textbook initiative.7

In addition to helping change institutional culture and challenge attitudes toward traditional publishing models, one of the most oft-cited benefits of open textbooks is cost savings. According to the College Board's Survey of Colleges, the average annual cost to US undergraduate students in 2017 for textbooks and materials was estimated at $1,250.8 This figure is remarkably close to the aforementioned figure of $1,200 a year reported by Baglione and Sullivan. However, little is known about the monetary face value of books that students are expected to buy, beyond studies based on self-reported data. Students themselves in the US have attempted to at least open the debate in this area by highlighting book price disparities.9 Nonetheless, they reported on only a very small number of books, and the College Board, representing on-campus US textbook retailers, has disputed their results for this reason, claiming that the students were selective in the book prices they chose.
Hence this study seeks to address the gap that exists in knowledge about the true cost of textbooks in higher education. It forms part of a wider research project we are conducting on open textbooks in Ireland.10 Determining the cost of books is not straightforward, as books can be new, used, rental, or digital subscription. However, the cost of new books does set a baseline for the other forms, particularly rental and used books. Our aim here is hence to start with new books, by analyzing the costs of all the required and recommended textbooks of one higher education institution (HEI) in Ireland.

The overarching research question this study sought to address was: What is known about the currently assigned textbooks in an Irish university? The sub-questions were:

• RQ1: What is the extent of textbooks that are required reading?
• RQ2: What are the retail costs of textbooks?
• RQ3: Are textbooks available in digital or e-book form?
• RQ4: Are textbooks available in the public domain?

The next section outlines our methodology and how we sought to answer these questions.

METHODS
In this section we describe our approach, the dataset generated, and the methods we used to analyze the data. We identified a suitable data source comprising the official course catalog of an HEI in Ireland with more than ten thousand students. In the course catalog, faculty give required and recommended textbook details for all courses. This information is freely accessible on the website of the HEI; the course catalog is powered by a software system known as Akari (http://www.akarisoftware.com/). Akari is a proprietary software system used by several HEIs in and outside Ireland to create and manage academic course catalogs. The course team gained access to a download of all books recorded in the database of the course catalog (figure 1). In this catalog, fields are provided for lecturers to input information for students about books, such as title, International Standard Book Number (ISBN), author, and publisher. Following manual and automated data cleansing, 3,014 unique records of books were created. Due to the large number of books, at this stage we sought a programmatic solution for finding out more information about them.

Figure 1. Course Catalog Screenshot.

We initially thought that ISBNs might prove the best way to accurately reconcile records of books. However, many ISBNs were incomplete or mistyped. Moreover, many instructors simply did not enter an ISBN. Given the capacity for errors in the data—for instance, some lecturers simply entered "I will tell you in class" in the book title field—we required a tool that could handle fuzzy search queries, e.g., cases where a book title or author was misspelled. The tool we selected was the Google Books Application Programming Interface (API).11 This API provides an interface to the Google Books database of circa thirty million books. The service, like the main Google search engine, is forgiving of queries that are mistyped or misspelled. Hence, we constructed a query based on a combination of author name, book title, and publisher. Following experimentation, we determined that these three search terms together allowed us to find books with a high degree of accuracy whilst also accounting for possible spelling errors.

Figure 2. System Design.
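To make the query construction concrete, below is a minimal sketch of how such a combined request to the Google Books API might be issued. This illustration was written for this review and is not the project's published code; the intitle, inauthor, and inpublisher qualifiers are documented Google Books search operators, while the sample book values (notably the publisher) are purely illustrative.

    // A minimal sketch (not the authors' published code) of querying the
    // Google Books API with a combined title, author, and publisher search.
    // "intitle:", "inauthor:", and "inpublisher:" are documented search
    // qualifiers of the API; the sample book values are illustrative only.
    const https = require("https");

    function buildQueryUrl(book) {
      // Combine the three fields; the API itself tolerates minor misspellings.
      const q = [
        "intitle:" + book.title,
        "inauthor:" + book.author,
        "inpublisher:" + book.publisher,
      ].join(" ");
      return "https://www.googleapis.com/books/v1/volumes?q=" + encodeURIComponent(q);
    }

    const url = buildQueryUrl({
      title: "Psychiatric and Mental Health Nursing",
      author: "Barker",
      publisher: "Routledge", // illustrative value, not taken from the article
    });

    https.get(url, (res) => {
      let body = "";
      res.on("data", (chunk) => (body += chunk));
      res.on("end", () => {
        const results = JSON.parse(body);
        // Take the first (best-ranked) match as a simple reconciliation rule.
        const first = results.items && results.items[0];
        if (first) console.log(first.volumeInfo.title);
      });
    });

Taking the API's top-ranked result as the match, as in this sketch, is one plausible reconciliation rule when the catalog data are too noisy for exact ISBN lookups.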
We then wrote a custom JavaScript middleware program deployed on the Google Cloud Platform. This program parsed the file of book search queries, passed them to the Google Books API as search requests, and saved the results. The API returned results in JavaScript Object Notation (JSON) format. JSON is a modern web language for describing data. It is related to JavaScript and can be used to translate objects in the JavaScript programming language into textual strings. It is used as a replacement for XML, as it is arguably more human readable and considerably less verbose. We then imported this JSON into a MongoDB database to filter and clean the data, before finally exporting them to Excel for statistical analysis. MongoDB is a document store database that natively stores objects in the JSON format and allows for efficient querying of the data.

The Google Books API provides some key metadata on books beyond the usual author, publisher, ISBN, edition, pages, etc., as it gives prices for selected books. Google draws this information from its own e-book store, which contains over three million books, and from a network of resellers who sell print and digital versions of the books. In addition to price, Google Books also contains information on accessible versions of books, digital/e-pub versions, PDF versions, and whether the book is in the public domain.

We have published a release of this dataset and all of our code to the software repository GitHub. We then used the Zenodo platform to generate a digital object identifier (DOI) for the code.12 One of the functions of the Zenodo platform is to allow code to be properly cited and referenced. We published our code in this way for others interested in replicating this work in other contexts. In the next section we provide an analysis of the results of our queries.

RESULTS
After extracting and processing the data from the course catalog and Google platforms, we obtained 3,030 unique course names, and in these courses we found 15,414 books listed.

Required versus Recommended Reading
From the course catalog data, we found that 11,022 (71.5 percent) of the listed books were required readings and the remaining 4,392 (28.5 percent) were recommended. Upon cleaning and removing duplicates and missing data, we identified 3,014 books that could be queried using the Google Books API. Querying the API returned results for 2,940 books, i.e., it found 97 percent of the books, and only seventy-four books could not be found. The Google Books API returns information in JSON format. Figure 3 below shows an example of the JSON information returned for one book.

    {
      "volumeInfo": {
        "title": "Psychiatric and Mental Health Nursing",
        "authors": [ "Phil Barker" ],
        "industryIdentifiers": [
          { "type": "ISBN_13", "identifier": "9781498759588" },
          { "type": "ISBN_10", "identifier": "1498759580" }
        ],
        "imageLinks": {
          "smallThumbnail": "http://books.google.com/books/content?id=btSOCgAAQBAJ&printsec=frontcover&img=1&zoom=5&edge=curl&source=gbs_api"
        }
      },
      "saleInfo": {
        "isEbook": true,
        "retailPrice": { "amount": 62.39, "currencyCode": "USD" }
      },
      "accessInfo": {
        "publicDomain": false,
        "pdf": { "isAvailable": true }
      }
    }

Figure 3. Sample of book information returned by Google Books API.

Digital Formats and Public Domain License
Figure 4 shows the numbers of PDF (1,219) and e-book (1,016) versions of books reported to be available; 854 books were available in both PDF and e-book format. From the total of 2,940 individual books listed, their availability was as follows:

Figure 4. Availability of 2,940 books in digital formats and public domain license.

As per figure 4, only 0.18 percent (six) of the books had a version available in the public domain according to Google Books.
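Tallies like those in figure 4 can be derived from fields of the kind shown in figure 3. Below is a minimal sketch of such a tally, written for this review rather than taken from the authors' published code; in particular, reading the e-book count from the saleInfo.isEbook flag (rather than, say, an epub availability field) is an assumption made here for illustration.

    // Minimal sketch (not the authors' published code) of tallying digital
    // formats and public domain status from an array of saved Google Books
    // API volume records, using the fields shown in figure 3.
    // Assumption: the e-book count is read from saleInfo.isEbook; the
    // article does not state exactly which field was used.
    function tallyFormats(volumes) {
      const counts = { pdf: 0, ebook: 0, both: 0, publicDomain: 0 };
      for (const v of volumes) {
        const access = v.accessInfo || {};
        const sale = v.saleInfo || {};
        const hasPdf = !!(access.pdf && access.pdf.isAvailable);
        const isEbook = sale.isEbook === true;
        if (hasPdf) counts.pdf += 1;
        if (isEbook) counts.ebook += 1;
        if (hasPdf && isEbook) counts.both += 1;
        if (access.publicDomain === true) counts.publicDomain += 1;
      }
      return counts;
    }

    // Example: tallyFormats(savedVolumes) would yield counts such as
    // { pdf: 1219, ebook: 1016, both: 854, publicDomain: 6 } for this dataset.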
Cost Results
The Google Books API returned prices for only 596 (20 percent) of the books that we searched for. Within that sample, the cost ranged from $0.99 to over $452, as illustrated in figure 5. The median price of a book was $40, and the mean price was $56.67. As there are on average 3.96 books per course, this implies an average cost to students of $224.41 per course taken. As students take an average of 8.05 courses per year, this further implies a cost of $1,806.50 per student per year if they were to buy new versions of all the books.

Figure 5. Summary of Book Prices (n = 596).

DISCUSSION AND CONCLUSION
We have demonstrated that it is possible to programmatically search for and determine the prices of large numbers of books. We used this information to estimate the full economic cost of books, on average, to students in an Irish HEI. We are still actively developing this tool and encourage others to use and even contribute to the code, which we have published with the dataset. This proof-of-concept tool may allow stakeholders with an interest in book costs for students to quickly get real data on large numbers of books. Ultimately, we hope that this will help highlight the costs of many textbooks. Our findings also highlight relatively low levels of digital book availability. Very few books were found to be in the public domain.

A limitation of this research is that there are issues around the coverage of Google Books and its index policies and algorithms. In a literature review of research articles about Google Books in 2017, Fagan pointed out that the coverage of Google Books is "hit and miss."13 In 2017, Google Books included about thirty million books, though Google did not release specific details on its database, as emphasized by Fagan. It is known that the content includes digitized collections from over forty libraries, and that US and English-language books are overrepresented.14 Furthermore, Google Books only reports whether a book is in the public domain; it cannot tell us whether books are made available under open licenses such as Creative Commons. Accepting such caveats, however, we have found the Google Books API to be a very useful tool for answering questions about large numbers of books in a systematic way, and we hope that our findings can help others.

The prices that we derived in this study were for new books only.
However, new book prices provide a baseline for all other prices: the price of a used or loaned book will be relative to that of a new book, and library budgets will need to take account of new book prices.15 Further study is required to determine a more realistic figure for the cost of textbooks, and the next phase of our wider open textbook research project involves interviews and focus groups with students to better understand the lived reality of their relationship with textbooks.16

REFERENCES
1 Stephen L. Baglione and Kevin Sullivan, "Technology and Textbooks: The Future," American Journal of Distance Education 30, no. 3 (Aug. 2016): 145-55, https://doi.org/10.1080/08923647.2016.1186466.
2 Etan Senack and Robert Donoghue, "Covering the Cost: Why We Can No Longer Afford to Ignore High Textbook Prices," Report, The Student PIRGs (Feb. 2016), www.studentpirgs.org/textbooks.
3 Elaine Allen and Jeff Seaman, "Opening the Textbook: Educational Resources in U.S. Higher Education, 2015-16," Report, Babson Survey Research Group (July 2016), https://www.onlinelearningsurvey.com/reports/openingthetextbook2016.pdf.
4 William and Flora Hewlett Foundation (2019), http://www.hewlett.org/programs/education-program/open-educational-resources.
5 2012 Paris OER Declaration, http://www.unesco.org/new/fileadmin/MULTIMEDIA/HQ/CI/WPFD2009/English_Declaration.htm.
6 Mary Burgess, "The BC Open Textbook Project," in Open: The Philosophy and Practices That Are Revolutionizing Education and Science, Rajiv S. Jhangiani and Robert Biswas-Diener (eds.) (London: Ubiquity Press, 2017): 227–36.
7 Nicole Allen, "Congress Funds $5 Million Open Textbook Grant Program in 2018 Spending Bill," SPARC Open (Mar. 20, 2018), https://sparcopen.org/news/2018/open-textbooks-fy18/.
8 Jennifer Ma et al., "Trends in College Pricing," Report, The College Board (Oct. 2017), https://trends.collegeboard.org/sites/default/files/2017-trends-in-college-pricing_0.pdf.
9 Kaitlyn Vitez, "Open 101: An Action Plan for Affordable Textbooks," Report, Student PIRGs (Jan. 2018), https://studentpirgs.org/campaigns/sp/make-textbooks-affordable.
10 Mark Brown, Eamon Costello, and Mairéad Nic Giolla Mhichíl, "From Books to MOOCs and Back Again: An Irish Case Study of Open Digital Textbooks," in Exploring the Micro, Meso and Macro: Proceedings of the European Distance and E-Learning Network 2018 Annual Conference, Genova, 17-20 June, 2018 (Budapest: The European Distance and E-Learning Network): 206-14.
11 Google Books API (2018), https://developers.google.com/books/docs/v1/reference/volumes.
12 Eamon Costello and Richard Bolger, "Textbooks Authors, Publishers, Formats and Costs in Higher Education," BMC Research Notes 12, art. 56 (Jan. 2019), https://doi.org/10.1186/s13104-019-4099-1.
13 Jody Condit Fagan, "An Evidence-Based Review of Academic Web Search Engines, 2014-2016: Implications for Librarians' Practice and Research Agenda," Information Technology and Libraries 36, no. 2 (June 2017): 7-47, https://doi.org/10.6017/ital.v36i2.9718.
14 Ibid.
15 Anne Christie, John H. Pollitz, and Cheryl Middleton, "Student Strategies for Coping with Textbook Costs and the Role of Library Course Reserves," portal: Libraries and the Academy 9, no. 4 (Oct. 2009): 491-510, http://digital.library.wisc.edu/1793/38662.
16 Eamon Costello et al., "Textbook Costs and Accessibility: Could Open Textbooks Play a Role?," Proceedings of the 17th European Conference on eLearning (ECEL), vol. 17 (Athens, Greece: 2018): 99-106.

10746 ---- Editorial Board Thoughts Column
Getting to Yes: Stakeholder Buy-In for Implementing Emerging Technologies in Your Library
Ida Joiner
INFORMATION TECHNOLOGY AND LIBRARIES | SEPTEMBER 2018 https://doi.org/10.6017/ital.v37i3.10746
Ida A. Joiner (ida.joiner@gmail.com), a member of LITA and the ITAL editorial board, is the Librarian at the Universal Academy School in Irving, Texas. She is the author of "Emerging Library Technologies: It's Not Just for Geeks" (Elsevier, August 2018).

Have you ever wanted to implement new technologies in your library or resource center, such as drones, robotics, artificial intelligence, augmented/virtual/mixed reality, 3D printing, wearable technology, and others, and presented your suggestions to your stakeholders (board members, directors, managers, and other decision makers), only to be rejected based on "there isn't enough money in the budget," or "no one is going to use the technology," or "we like things the way that they are"? Then this column is for you.

I am very passionate about emerging technologies, how they are and will be used in libraries and resource centers, and how librarians will be able to assist those who will be affected by these technologies. I recently published a book introducing emerging technologies in libraries. I came up with suggestions on how doing your research — including the questions below and those on the accompanying checklist — will prepare you to meet with your stakeholders and improve the likelihood of your emerging technology proposal being approved.

1. Who are your stakeholders? Include them early in the process. Determine who your stakeholders are, what their areas of expertise are, and how they can support your emerging technology projects. The most critical piece of getting your stakeholders on board to support your technology initiatives is addressing the question "What's in it for them?" This will get their attention and improve your odds of getting to "yes" on your technology initiatives.

2. What are the costs? Research what your costs will be and create a budget. Find innovative ways to fund your initiatives by researching grants, forming strategic partnerships with others who might be interested in partnering with you, and locating other funding opportunities.

3. What are the risks? Identify any potential risks so that you are prepared to discuss how you will mitigate them when you meet with your stakeholders. Some potential risks that you might want to address are budget cost overruns; staffing issues, such as a key person resigning or going on maternity or sick leave; and the need for policies to deter patrons from trying to use the technology for criminal means.

4. What is the timeline and what are the key milestones? Establish the timeline for when you want or need to implement these technologies.
Have you planned for key milestones and possible delays, such as funds not being available? You need a detailed timeline, from your first kickoff meeting with your initiative's team, to your stakeholder meeting where you present your proposal, to getting signoff on the project.

5. What training will you offer? Perform a needs assessment to determine who will need to be trained, what training you will offer, what your training costs will be, and who will pay for them. Once you have all of this in place, you will select the trainer(s) and the training model (such as "train the trainer") that you will use.

6. How will you market your technology initiatives? Will you rely on social media? Will you collaborate with your marketing department to develop your message through press releases, websites, blogs, e-newsletters, flyers, and other media outlets? You will need to meet with your marketing and publications experts to plan how you will market your emerging technology initiatives, along with your costs and who will pay them.

7. Who is your audience and how can you engage them? This is one of the most important areas to address in the proposal you present to your stakeholders. Without our patrons, there is no library. You will need to determine who your audience is and how you can utilize the emerging technologies to assist them. Are they K-12 students, adults who will be displaced by these technologies, technology novices who want to learn more about these technologies, or university faculty and/or students who want to use the technology for their projects? You can address all of these potential audiences in your proposal to your stakeholders.

These are just a few tips on how to get stakeholder buy-in for implementing emerging technologies in your library. Feel free to share some of your own successes in getting stakeholders on board to implement emerging technologies in your library or resource center.

EMERGING TECHNOLOGY STAKEHOLDER BUY-IN QUESTIONNAIRE
I have included questions below that you should work through when you are considering getting your stakeholders on board to implement new emerging technologies in your library. If you address all of these, you have a very good chance of getting your stakeholders on board to support your initiatives.

1. What technologies do you want to implement in your library/resource center, and why do you want them?
2. Who are your stakeholders and what are their backgrounds?
3. Why should your stakeholders support your technology initiatives?
4. What is your budget for your new technology initiatives?
5. What training is needed to support these initiatives, who will provide the training, what are the costs, and who will pay for the training?
6. How will you market these technology initiatives, what are the costs, and who will pay for them?
7. Did you perform a cost-benefit analysis for these technology initiatives?
8. Are there legal fees? If so, what are they, and who will pay for them?
9. What are the risks?
10. What is the return on investment (ROI)?
11. What strategic partnerships can you establish?
12. What is your timeline for implementing these technology initiatives?

10747 ---- Letter from the Editor
Kenneth J. Varnum
INFORMATION TECHNOLOGY AND LIBRARIES | SEPTEMBER 2018 https://doi.org/10.6017/ital.v37i3.10747

This September 2018 issue of ITAL continues our celebration of the journal's 50th anniversary with a column by former Editorial Board member Mark Dehmlow, who highlights the technological changes beginning to stir the library world in the 1980s. The seeds of change planted in the 1970s are germinating, but the explosive growth of the 1990s is still a few years away.

In addition to peer-reviewed articles on recommender systems, big data processing and storage, finding vendor accessibility documentation, using GIS to find specific books on a shelf, and a recommender system for archival manuscripts, we are also publishing the student paper by this year's Ex Libris/LITA Student Writing Award winner, "The Open Access Citation Advantage: Does It Exist and What Does It Mean for Libraries?," by Colby Lewis at the University of Michigan School of Information. This insightful paper impressed the competition's judges (as ITAL's editor, I was one of them) and I am very pleased to include Ms. Lewis' work here.

This issue also marks my fourth as editor. With one year under my belt, I am finding a rhythm for the publication process and starting to see the increased flow of articles from outside traditional academic library spaces that I wrote about in December 2017. As always, if you have an idea for a potential ITAL article, please do get in touch. We on the editorial board look forward to working with you.

Sincerely,
Kenneth J. Varnum, Editor
varnum@umich.edu
September 2018

10749 ---- Information Technology and Libraries at 50: The 1980s in Review
Mark Dehmlow
INFORMATION TECHNOLOGY AND LIBRARIES | SEPTEMBER 2018 https://doi.org/10.6017/ital.v37i3.10749
Mark Dehmlow (mdehmlow@nd.edu) is Director, Library Information Technology at the Hesburgh Libraries, University of Notre Dame.

My view of library technology in the 1980s through the lens of the Journal of Library Automation (JOLA) and its successor Information Technology and Libraries (ITAL) is a bit skewed by my age. I am a Gen-Xer, and much of my professional perspective has been shaped by the last two decades in libraries. While I am cognizant of our technical past, my perspective is very much grounded in the technical present. In a way, I think that context made my experience reviewing the 1980s in JOLA and ITAL all the more fun.

The most pronounced event for the journal during the 1980s was the transition from the Journal of Library Automation to Information Technology and Libraries between 1981 and 1982. The rationale for this change is perhaps best captured through the context set in the guest editorial "Old Wine in New Bottles?" by Kenney in the first issue of ITAL: "Proliferating technologies, the trend toward integration of some of these technologies into new systems, and rapidly increasing adoption of technology-based systems of all types in libraries…."1 The article grounds us in the anxieties and challenges of a decade of accelerating change in technology. Libraries were evolving from implementing systems of "automation," a term that focuses more on processes, to broadening their view to "information technology," which is more of a discipline — an ecosystem made up of technology, processes, systems, standards, policies, etc.
In a way, the article acknowledges the departure of libraries from their adolescent technological past to their young adult present, for which the 80s would be the background.

Perhaps no other event of the decade is more technologically significant than the standardization of the internet. While the concept of networks, and of a network of networks, i.e. the internet, was conceived in the 1960s, it was the development of the TCP/IP network protocol that was the most consequential event, because it made it possible to interconnect computer systems using a common means of communication. While the internet wouldn't become ubiquitously popularized until the early 1990s with the emergence of the world wide web, the internet was active and alive well before that and, in its early state, was critical to the emergence and evolution of library technologies. From the first issue through the last of the 1980s, ITAL references the term "online" frequently. The "online" of the 80s, however, was largely text based, where systems were interconnected using lightweight terminals to navigate browse-and-search systems. It was not unlike a massive "choose your own adventure" book, skipping from menu to menu to find what you were looking for.

Throughout my review, I was happy to see a small but significant percentage of international articles that focused on character sets, automation, and collection comparisons in countries like Kuwait, Australia, China, and Israel. Diversity is a cornerstone for LITA and ALA, and the journal has continued this trend by encouraging the submission of articles from outside of the U.S.

The 1980s volumes of ITAL traversed a plethora of topics, ranging from measuring system performance (efficiency was important during a time when computing was relatively slow and expensive) to how to use library systems to provide data that can be used to make business decisions. Over the decade, there was a significant focus on library organizations coming to terms with new technology, e.g. the automation of circulation, acquisitions, and the MARC bibliographic record. There were several articles that discussed the complications, costs, and best practices of converting card-catalog metadata to electronic records, and several others that detailed large barcoding projects. The largest number of articles on a single topic focused on the automation and management of authority control in automated library systems. There were articles on the emergence of research databases, often delivered as applications on CD-ROMs which would then be installed on microcomputers. The term "microcomputer" was frequently used because the 80s saw the emergence of the personal computer in the work environment, a transformative step in enabling staff and patrons alike to access online library services and applications to support their research and work. Electronic mail was in its infancy and became a novel way to share information with end users across a campus. Several articles focused on the physical design of search terminals and optimizing the ergonomics of computers. There were also many articles about designing the best OPAC interface for users, ranging from how to present bibliographic records, to what information should be sent to printers, to early efforts to extend local catalogs with article-based metadata. Many of these topics have parallels today.
Instead of only analyzing the statistical usage data we can pull from our systems, libraries are now striving to develop predictive analytics, leveraging big data from across an assortment of institutions. I found the 1988 article "Investigating Computer Anxiety in an Academic Library," which examines staff resistance to technology and change, to be as apropos today as it was then.2 CD-ROMs have gone the way of the feathered and overly hair-sprayed coifs of the 80s and have largely been superseded by hard drives and solid-state flash media that can hold significantly more data and can transfer data more rapidly. The current decade of the 2010s has been dedicated to providing the optimal search experience for our end users as we have broadened our efforts to the discovery of all scholarly information, not just what is held in our collections. And of course, instead of adding a few article abstracting resources to our catalogs in an innovative but difficult-to-sustain manner, the commercial sector has created web-scale mega-indexes that are integrated with our catalogs and offer the promise of searching a predominant amount of the scholarly record.

There was a really interesting thread of articles over the decade that traced the evolution of the ILS in libraries. There were articles about how to develop automation systems for libraries, the various functions that could be automated — cataloging, circulation, acquisitions, etc. — and evaluation projects for commercial systems. If the 2000s was the era of consolidation, the early 1980s could easily represent the era of proliferation. The decade nicely traces the first two generations of library systems, starting with university-developed, database-backed automation systems and the migration of many of those systems to vendors. The Northwestern University-based NOTIS system was referenced a lot, and there were some mentions of OCLC's acquisition and distribution of the LS/2000 system. This part of our automation history is a palpable reminder that libraries have been innovative leaders in technology for decades, often developing systems ahead of the commercial industry in an effort to meet our evolving service portfolios. This early strategy for libraries mirrors the recent development of institutional repositories, Current Research Information Systems (CRISs), and faculty profiling systems like VIVO, which were developed before the commercial sector saw the feasibility of commercialization.

The cycle of selecting and implementing a new integrated library system is something that many organizations are faced with again. The only difference is that the commercial sector has entered into the development of the 4th or 5th generation of integrated library systems, many of which come with data services integrated, and most of which are implemented in the cloud. In addition to showing our technically rudimentary past, there were several articles over the decade that discussed especially innovative ideas or that anticipated future technologies. A 1983 article by Tamas Doszkocs, written long before the emergence of Google, is an early revelation that regular patrons struggle to use expert systems that require normalized and Boolean searching strategies. Not surprising is the conclusion that users lean organically toward natural language searching, but even then we were having the expert experience vs.
intuitive experience debate in the profession: "The development of alternative interfaces, specifically designed to facilitate direct end user interaction in information retrieval systems, is a relatively new phenomenon."3 The 1984 article "Packet Radio for Library Automation" is about eliminating the challenges of retrofitting buildings with cabling to connect LAN networks by using radio-based interfaces.4 Could this be an early precursor to WiFi? There is the 1985 article titled "Microcomputer Based Faculty-Profile" about using a local database management application on a PC to create an index of faculty publications and university publishing trends.5 This is nearly three decades before the popularization of the CRIS and the faculty profile system. In 1986, there is the article "Integrating Subject Pathfinders into a GEAC ILS: A MARC-Formatted Record Approach," which made me think about how library websites are structured and about the current trend of developing online research guides and making them discoverable in our websites as a research support tool.6 And finally, I was struck by the innovative approach in 1987's "Remote Interactive Online Support," wherein the authors wrote about using hardware to make simultaneous shell connections to a search interface so they could give live search guidance to researchers remotely.7 We take remote technical support for granted now, but in the late 80s this required several complicated steps to achieve.

The 80s were an exciting time for technology development and a decade rife with technical evolution. I think this quote from the article "1981 and Beyond: Visions and Decisions" by Fasana in the Journal of Library Automation best elucidates the deep connection between the past and the future: "Library managers are currently confronted with a dynamic environment in which they are attempting simultaneously to plan library services and systems for the future, and to control the rate and direction of change."8 This still holds true. Library managers are still planning services in a rapidly changing environment, except I like to think we have learned to live with change whose rate and direction we cannot control.

1 B. Kenney, "Guest Editorial: Old Wine in New Bottles?," Information Technology and Libraries 1, no. 1 (March 1982): 3.
2 MaryEllen Sievert, Rosie L. Albritton, Paula Roper, and Nina Clayton, "Investigating Computer Anxiety in an Academic Library," Information Technology and Libraries 7, no. 3 (September 1988): 243-252.
3 Tamas E. Doszkocs, "CITE NLM: Natural-Language Searching in an Online Catalog," Information Technology and Libraries 2, no. 4 (December 1983): 364.
4 Edwin B. Brownrigg, Clifford A. Lynch, and Rebecca Pepper, "Packet Radio for Library Automation," Information Technology and Libraries 3, no. 3 (September 1984): 229-244.
5 Vladimir T. Borovansky and George S. Machovec, "Microcomputer Based Faculty-Profile," Information Technology and Libraries 4, no. 4 (December 1985): 300-305.
6 William E. Jarvis and Victoria E. Dow, "Integrating Subject Pathfinders into a GEAC ILS: A MARC-Formatted Record Approach," Information Technology and Libraries 5, no. 3 (September 1986): 213-227.
7 S. F. Rossouw and C. van Rooyen, "Remote Interactive Online Support," Information Technology and Libraries 6, no. 4 (December 1987): 311-313.
8 Paul J. Fasana, "1981 and Beyond: Visions and Decisions," Journal of Library Automation 13, no. 2 (June 1980): 96.
10810 ---- Editorial Board Thoughts: Critical Technology
Cinthya Ippoliti
INFORMATION TECHNOLOGY AND LIBRARIES | DECEMBER 2018 https://doi.org/10.6017/ital.v37i4.10810
Cinthya Ippoliti (cinthya.ippoliti@ucdenver.edu) is University Librarian and Director, Auraria Library, University of Colorado.

Critical librarianship has brought many changes: libraries have examined their programs and services, created new positions dedicated to equity, inclusion, and diversity, and paved the way to challenge existing assumptions about our work and environment. Technology also exists in a space that is not neutral, as library systems and services reflect specific perspectives in their content and focus, as well as in how they are made accessible (or not). I would like to briefly examine how we can begin to think about these issues within academic libraries, and offer some additional readings for further reflection, for four technology-related areas: spaces, services/programming, systems, and engaging with our users.

TECHNOLOGY SPACES
We might assume that because we see students using our classrooms, makerspaces, and study areas, we have been successful in meeting the needs of a wide variety of users. To a large extent that may be true, but we should also be asking ourselves who does not feel welcome in such a space and, more importantly, why not? There are two facets to this question. The first involves the degree to which libraries strive to create a welcoming environment. Staff interactions, signage, hours, and institutional values are all part of a complex, broader environment that signals to users how these spaces function and how they are perceived by the organization. These same elements can also serve as deterrents, through choices in layout, policy, or other intangible aspects, so that they may in fact prevent individuals from entering these spaces in the first place. The second revolves around the notion that each technology-rich space conveys its level of friendliness and intended purpose through its physical presence. Ensuring that furniture, paint, and layout are compliant with ADA standards, and integrating these features with each other as opposed to setting them apart so that they are not considered "special" or "different," is one small and vital step in this direction. Maggie Beers and Teggin Summers cover these issues in an EDUCAUSE Review article and discuss how asking questions about the ways power structures are reinforced by having a "front" of the room, or by other configurations, can enrich planning and assessment efforts. Similarly, developing a plan so that new technology in areas such as makerspaces rotates as much as possible will help provide access for those who may not be able to utilize these resources outside of the library context, accommodating differing skill levels, interests, and learning styles. In addition, students may not always be present on campus due to family, job, or other life circumstances, and planning with the assumption that everyone who could benefit from using a particular space is in fact taking advantage of that benefit is problematic. One way around that is to ensure that each space is as flexible as possible and (ideally) can be reconfigured for quiet reflection or collaborative work, or transformed into a sensory space or other type of specialized environment. The reservation process should be available both online and manually (as not
The reservation process should be available both online and manually (as not CRITICAL TECHNOLOGY | IPPOLITI 6 https://doi.org/10.6017/ital.v37i4.10810 everyone may have access to a computer and/or the internet), hour limitations should have several counter options, and the space should be available as much of the time as possible when it is not in use for more a more formalized purpose. Any space usage assessments should also purposefully include non-users or perceived non-users and integrate questions about barriers to or about the space in their methodologies. Finally, ensuring that the right level of staffing to support both the intended, as well as perhaps the unintended, uses of the space and the activities that occur within it will help create a sense that not only the space itself is valued, but that the experiences occurring within it are even more important. This is not easy to accomplish, as it is difficult to predict exactly how a space will be used unless there are very strict confines placed around its configuration and accessibility. But assuming that most spaces in libraries are designed to be malleable and keeping in constant communication with users via some of the methods described above should help. TECHNOLOGY SERVICES AND PROGRAMMING Similarly, services and programs cannot be built around a one-size-fits-all model. This can prove to be quite challenging given the limited resources libraries face. Engagement and learning lie not only in access to tools, but in the very process of sharing knowledge and experiences — whether for academic growth, social action, or simply personal enjoyment. Matt Ratto, who coined the term “critical making,” defines it as the process “intended to highlight the interwoven material and conceptual work that making involves.” He argues that “critical making is dependent on open design technologies and processes that allow the distribution and sharing of technical work and its results.” Ratto makes the further point that this process also has the capacity of “unpacking the social and technical dimensions of information technologies.” This in turn allows for technology to become more than simply a cool resource, but rather a mechanism for democratizing this creative work of making and designing while dealing with its messy, political, and uncomfortable aspects which do not exist in vacuum outside of the tools themselves. An approach in this instance might involve taking technology outside of library spaces such as on campus or within the community, offering as much for free as possible, and capitalizing on programs such as Girls Who Code (https://girlswhocode.com/) and Grow with Google (https://grow.google/). Capturing how these resources are used in all of their possible permutations enables stories of individuals to shine through. The impact of these programs takes on a personal element through showcases, speaker events, and hackathons that are designed to bring the community together and engage in sharing of knowledge, perspectives, and conversations. In addition, this will hopefully shrink the barriers for those who don’t see themselves as having a role in these activities. INTEGRATED LIBRARY SYSTEMS I do not have a background in systems, but Simon Barron and Andrew Preater have written a great chapter unpacking the inherent power structures which manifest themselves in library systems such as the integrated library system (ILS), discovery interfaces, and the third-party resources we provide access to. 
They suggest taking action by thinking about user privacy and ensuring that the information libraries are able to view, gather, and store is used ethically, and that decisions about derivative services or actions are not made based on assumptions about gender identity, economic status, or other identifiers gleaned from access to these types of data. Openness is another area the authors explore, as they discuss how libraries can use open source software whenever possible in order to balance the field against profit-based licensing models. Barron and Preater also raise the concern, however, that while crowdsourcing is in theory a good way to include the community in developing ways to help itself, it still does not recognize the limited resources marginalized populations can dedicate to these efforts. Finally, they discuss how it is crucial for libraries to recognize and support the expertise needed in this arena in order to avoid overreliance on vendor systems that can prove alluring with out-of-the-box solutions but that compromise things like privacy, autonomy, and customization, which might otherwise benefit from equity, diversity, and inclusion-centered practices.

EQUITY-DRIVEN DESIGN

Engaging with users in developing shared solutions to challenges is an important aspect of the user experience and can help pave the way for deeper conversations. Taking a step back and making sure the assessment and design process itself is transparent for everyone is one of the first things that needs to be in place. I would like to point to the work of Gretchen Rossman and Sharon Rallis, who make a crucial distinction between user-centered design, in which the user seldom has a voice in what the final process or product looks like, and what they term "emancipatory design," in which participants are "collaboratively producing knowledge to improve their work and their lives." In addition, emancipatory design is one where "users are in charge; their power, their indigenous knowledge are more powerful and respected than those of the expert designer." This approach can therefore be a means of promoting equity, diversity, and inclusion in technology work in libraries by focusing on the users' voices as opposed to our own and working collaboratively to develop shared solutions to address their challenges.

A specific example of how this framework might be applied comes from the Stanford School of Design, which is famous for its course in design thinking. Stanford has recently taken that concept even further and integrated an equity focus into the first steps of the progression, where the designer not only identifies existing built-in biases but also raises questions such as who the users are, what equity challenges need to be addressed, who has institutional power, and how that power is manifested in the decisions that drive the organization. The Stanford model also provides specific methods focused on human values and developing relational trust to bookend the design thinking process: reflecting on the blind spots that were uncovered helps inform action items and next steps and ensures that the users are actively collaborating to develop the services and programs which in turn affect them. This version of the program is available at https://dschool.stanford.edu/resources/equity-centered-design-framework.
As a final thought, one idea to keep at the forefront in all of these areas is that of universal design, which is defined by the Center for Universal Design at NCSU as "the design of products and environments to be usable by all people, to the greatest extent possible, without the need for adaptation or specialized design." The first principle is that of equitable use, and it can be applied to many technology-related aspects, whether physical or virtual:

• Provide the same means of use for all users: identical whenever possible; equivalent when not
• Avoid segregating or stigmatizing any users
• Provisions for privacy, security, and safety should be equally available to all users
• Make the design appealing to all users

FURTHER READINGS

Barron, S. and Preater, A. J. "Critical Systems Librarianship." In The Politics of Theory and the Practice of Critical Librarianship (Sacramento: Litwin Books, 2018). https://repository.uwl.ac.uk/id/eprint/4512/1/2018-Barron-and-Preater-Critical-systems-librarianship.pdf.

Beers, M. and Summers, T. "Educational Equity and the Classroom: Designing Learning-Ready Spaces for All Students," EDUCAUSE Review, May 7, 2018. https://er.educause.edu/articles/2018/5/educational-equity-and-the-classroom-designing-learning-ready-spaces-for-all-students.

North Carolina State University Center for Universal Design. "Center for Universal Design." https://projects.ncsu.edu/design/cud/ (accessed November 25, 2018).

Ratto, M. "Critical Making," Open Design Now. http://opendesignnow.org/index.html%3Fp=434.html (accessed November 7, 2018).

Rossman, G. B., and Rallis, S. F. Learning in the Field: An Introduction to Qualitative Research (Thousand Oaks, CA: Sage, 1998).

10821 ----

Information Technology and Libraries at 50: The 1990s in Review
Steven K. Bowers

Steven K. Bowers (sbowers@wayne.edu) is Executive Director, Detroit Area Library Network (DALNET).

I played some computer games, stored on data cassette tapes, in the 1980s. That was entertaining, but I never imagined the greater hold that computers would have on the world by the mid-1990s. I can remember getting my first email account in 1993 and looking at information on rudimentary web pages in 1996. I remember my work shifting from an electric typewriter to a bulky personal computer with dial-up Internet access. Eventually, this new computing technology became a prevalent part of my everyday life. This shift to a computer-driven reality had a major effect on libraries too.
By the end of the 1990s, I was amazed to be doing research on a university library catalog system connected with other institutions of higher education throughout the region, wondering at the expanded access to, and reach of, information. In my mind, thanks to computers and the Internet, libraries were more connected at that time than they had ever been. As I prepared this review of what we were writing about in ITAL in the 1990s, I had some fond memories of the advent of personal computers in my daily life and in the libraries I had access to. As we take a look back, I think it is interesting to see what we were doing then and how it is connected to what we are still working on today.

Along with the eventual disruption that the Internet brought to libraries, computers and online access also greatly changed how libraries constructed our core research tools, especially the catalog. Prior to the 1990s, libraries had begun automation projects to move their catalogs to computer-based terminals, creating connections and access that were not previously possible with a card catalog. If we are still complaining about the design and function of the Online Public Access Catalog (OPAC) today, in the early 1990s we were discussing what its design and function should be, in a positive and optimistic way. In some ways it seems hard to recall the discussions of how to format data and display it to users. In other ways it seems like we are still having the same discussions, but our work has become more complex as we continue to restructure library data to become more open and accessible.

While we were contemplating the design of online library catalogs, libraries were also discussing the implementation of networking and other information technology infrastructures. Nevins and Learn examined the changes in hardware, software, and telecommunications at the time and predicted a more affordable cost model with distributed personal computers connected through networks, enhancing library automation cooperation.1 They expanded the discussion to include consideration of copyright and intellectual property, security, authorization, and a need for information literacy in the form of user navigation, all key to what we are doing today.

Beyond catalogs, there was the real adoption of the Internet itself. By the early 1990s there was growing enthusiasm for accessing and exploring the Internet.2 This created a need for libraries to learn about the Internet and instruct others on how to use it. As late as 1997, however, search engines were still being introduced and defined, and using the Internet or searching the World Wide Web was still a new concept that was not fully understood by many people. At their most basic, search engines were simply defined as indexing and abstracting databases for the web.3 It is interesting that library catalogs were developed separately from the development of search engines, and we are still trying to get our metadata out of our closed systems and open to the rest of the web.

In 1991, Kibirige examined the potential impact of this new connectivity on library automation. He posited that "One of the most significant change agents that will pervade all other trends is the establishment and regular use of high-speed, fiber optic communication highways."4 His article in ITAL provides a prescient overview of much of what has played out in technology, not just in libraries.
He noted the need for disconnected devices to become tools to access full-text information remotely.5 Perhaps most important, he noted the need for librarians to become experts in non-library technology, to keep pace with developments outside of the profession. This admonition is still important to keep in mind today. At the time, however, libraries were working on the basics of converting records from online bibliographic utility systems running on mainframes to a more useful format for access on a personal computer, let alone thinking about transforming library metadata into linked data that can be accessed by the rest of the Internet. So we keep moving forward.

Later in the decade, libraries began to think about the library catalog as a "one stop shop" for information. In 1997, Caswell wrote about new work to integrate local content, digital materials, and electronic resources, all into one search interface. Initially the discussion was more technical in nature, but Caswell provided an early concept for providing a single access point to all of the content that the library has, print and electronic, which was a step forward from just listing the books in the catalog.6 At the time we were still far away from our current concept of a full discovery system with access to millions of electronic resources that may well surpass the print collections of a library.

Eventually more discussion developed around the importance of user experience and usability for the design of catalogs and websites. Catalogs were examined in parallel with the structure of library metadata, and both were seen as important to the retrievability of library materials. Human-machine interaction was starting to be examined on the staff side of systems, and this examination would eventually extend to public interface usability as well. Outlining an agenda for redesigning online catalogs, Buckland summarized this new technological development work for libraries by noting that "Sooner or later we need to rethink and redesign what is done so that it is not a mechanization of paper but fully exploits the capabilities of the new technology."7 More exciting, by the end of the 1990s we were seeing usability studies for specific populations and those with accessibility difficulties. Systems were in wide enough use that libraries began to examine their usefulness to more audiences.

Beyond our systems, the technology of our actual collections was changing. New network connectivity combined with new hardware led to new formats for library resources, specifically digital and electronic resources. In 1992, Geraci and Langschied summarized these changes, stating that "what is new for the 1990s is the complication of a greater variety of electronic format, software, hardware, and network decisions to consider."8 They also expanded the conversation to include data in all forms, and data sets of various kinds, well beyond traditional library materials. This is an important evolution as libraries worked to shift their operations, identities, and curatorial practices. Geraci and Langschied defined data by type, including social data, scientific data, and humanities data. Most importantly, they called for libraries to provide access to this varied data to continue the role of libraries as providers of access to information, as they cautioned that information seekers were already beginning to bypass libraries and look for such information from other sources.
Libraries were beginning to lose ground as the gatekeepers of information and needed to shift to providing online access and open data themselves.

The early 1990s were an exciting time for preservation, as discussion was moving from converting materials to microforms to digitization. In 1990, Lesk compared the two formats and had hope for a promising digital future.9 Thank goodness he was on target about sharing resources and creating economical digital copies, even if he did not completely predict the eventual shift to reliance on electronic resources that many research libraries have now made. Lesk also noted the importance of text recognition, optical character recognition (OCR), and text formatting in ASCII. Others focused on digital file formats and the planning and execution of creating digital collections. Digitization practices were developing, and the need to formalize practice was becoming evident. The same year, Lynn outlined the relationship between digital resources and their original media, highlighting preservation, capture, storage, access, and distribution.10 By the late 1990s there were more targeted discussions about the benefits of digitizing resources to provide not only remote access, but access to archival materials specifically. In 1996, Alden provided a good primer on everything to consider when doing digitization projects within budget constraints.11

By the mid-1990s, Karen Hunter was excited to extol the promises of disseminating information electronically, calling the High Performance Computing and High Speed Networking Applications Act of 1993 "[a] formidable vision and goal. Real-time access to everything and a laser printer in every house. The 1990s equivalent to a chicken in every pot."12 Hunter's article is a good overview of where libraries were at working with electronic publications and online access in the early 1990s. Halcyon Enssle's piece on moving reserves to online access opened with a great summary of where much of library access was headed: "The virtual library, libraries without walls, the invisible user . . . these are some of the terms getting used to describe the library of the future . . . ."13 Eventually, by the end of the decade, we even learned to start tracking how our new online libraries were being used, applying our knowledge of print resource usage to our new online collections.

In 1995, Laverna Saunders had already developed a new definition of what a library was and of how the transformation of libraries from physical warehouses to providers of access to online content would affect workflows in libraries. As defined by Saunders, "the virtual library is a metaphor for the networked library, consisting of electronic and digital resources, both local and remote."14 Not a bad definition more than 20 years later. Saunders asked pertinent questions, such as which resources would be best in print vs. online, what print materials should be retained, and which resources and collections libraries should digitize themselves. The broader view provided was that these changes would affect not just collections but the entire operation of libraries. There would still be work to do in libraries, but changes in the work were necessary to address shifting technology and the composition of collections. By the end of the decade there was new work to assess the use of electronic resources, extend virtual reference services, and extend information literacy to technology instruction.

In 1998, Kopp wrote about the promising future of library collaborations.
Consortia were well established in prior decades, and they were seeing a resurgence. Kopp noted that just as consortia had been built around support for new shared utilities in the 1970s and 1980s, in the 1990s they were finding a new purpose in the networking of the Internet and the possibilities of greater connectivity and collaboration in the online environment.15 Beyond cataloging and automation technology, it is interesting to note that even in the new online environment that was forming in the 1990s, many consortia formed at the time to share print resources. This may have been a reaction to libraries shifting from complete print collections to online holdings that many may have felt were more ephemeral, or maybe money was being spent on new technological infrastructures and less on library materials. Resource sharing of print materials is still an important part of libraries working together to provide access to information, and since the time that Kopp wrote about consortia and growing networked collaborations, there has also been growing development in the sharing of electronic resources. A large part of the work of many consortia today revolves around purchasing electronic resources, but in the late 1990s libraries were just beginning to get into purchasing commercial electronic resources.16

There were many ITAL articles in the 1990s looking at the future of libraries and technology, and some specific articles dedicated to prognostication. In 1991, looking into the future, Kenneth E. Dowlin shared a vision for public libraries in 2001. He predicted that libraries would still exist, but it is noteworthy that at the time the future existence of libraries was questioned by many. Dowlin did predict change for libraries, including the confluence of new media formats, computing, and, yes, still books. He stated what time has now confirmed: "The public wants them all."17 He had lots of other interesting ideas as well; his article is worth a second look. Another fun take on the future was a special section on science fiction from 1994 considering future possibilities in information technology and access. In one piece, David Brin noted, "Nobody predicted that the home computer would displace the mega-machine and go on to replace the rifle over the fireplace as freedom's great emancipator, liberating common citizens as no other technology has since the invention of the plow."18 An interesting observation, even if the computer has now been replaced by phones in our pockets or other fantastic wearable technologies.

By the end of the 1990s, libraries had been greatly transformed by technology. Many libraries had automated, workflows continued to adjust in all areas of library work, and most libraries had at least partially incorporated use of the Internet along with providing computer access to library users. Some libraries were already moving through the change from print to electronic library resources. Specific web applications and websites were also being developed and used for and by libraries. These have since matured into smarter systems that can provide better access to our collections and smarter assessment of our resource usage, for both print and electronic materials. As a whole, the 1990s are an exciting time to review when looking at the intersection of information technology and libraries.
As information dissemination moved to an online environment, within and outside of the profession, the future existence of libraries began to be questioned. As we now know, libraries still play an important role in providing access to information.

NOTES

1. Kate Nevins and Larry L. Learn, "Linked Systems: Issues and Opportunities (Or Confronting a Brave New World)," Information Technology and Libraries 10, no. 2 (1991): 115.

2. Constance L. Foster, Cynthia Etkin, and Elaine E. Moore, "The Net Results: Enthusiasm for Exploring the Internet," Information Technology and Libraries 12, no. 4 (1993): 433-36.

3. Scott Nicholson, "Indexing and Abstracting on the World Wide Web: An Examination of Six Web Databases," Information Technology and Libraries 16, no. 2 (1997): 73-81.

4. Harry M. Kibirige, "Information Communication Highways in the 1990s: An Analysis of their Potential Impact on Library Automation," Information Technology and Libraries 10, no. 3 (1991): 172.

5. Kibirige, "Information Communication Highways in the 1990s," 175.

6. Jerry V. Caswell, "Building an Integrated User Interface to Electronic Resources," Information Technology and Libraries 16, no. 2 (1997): 63-72.

7. Michael K. Buckland, "Agenda for Online Catalog Designers," Information Technology and Libraries 11, no. 2 (1992): 162.

8. Diane Geraci and Linda Langschied, "Mainstreaming Data: Challenges to Libraries," Information Technology and Libraries 11, no. 1 (1992): 10.

9. Michael Lesk, "Image Formats for Preservation and Access," Information Technology and Libraries 9, no. 4 (1990): 300-308.

10. M. Stuart Lynn, "Digital Imagery, Preservation, and Access--Preservation and Access Technology: The Relationship between Digital and Other Media Conversion Processes: A Structured Glossary of Technical Terms," Information Technology and Libraries 9, no. 4 (1990): 309-36.

11. Susan Alden, "Digital Imaging on a Shoestring: A Primer for Librarians," Information Technology and Libraries 15, no. 4 (1996): 247-50.

12. Karen A. Hunter, "Issues and Experiments in Electronic Publishing and Dissemination," Information Technology and Libraries 13, no. 2 (1994): 127.

13. Halcyon R. Enssle, "Reserve on-Line: Bringing Reserve into the Electronic Age," Information Technology and Libraries 13, no. 3 (1994): 197.

14. Laverna M. Saunders, "Transforming Acquisitions to Support Virtual Libraries," Information Technology and Libraries 14, no. 1 (1995): 41.

15. James J. Kopp, "Library Consortia and Information Technology: The Past, the Present, the Promise," Information Technology and Libraries 17, no. 1 (1998): 7-12.

16. International Coalition of Library Consortia, "Guidelines for Statistical Measures of Usage of Web-Based Indexed, Abstracted, and Full Text Resources," Information Technology and Libraries 17, no. 4 (1998): 219-21; Charles T. Townley and Leigh Murray, "Use-Based Criteria for Selecting and Retaining Electronic Information: A Case Study," Information Technology and Libraries 18, no. 1 (1999): 32-39.
10822 ---- Articles No Need to Ask: Creating Permissionless Blockchains of Metadata Records Dejah Rubel INFORMATION TECHNOLOGY AND LIBRARIES | JUNE 2019 1 Dejah Rubel (rubeld@ferris.edu) is Metadata and Electronic Resource Management Librarian, Ferris State University. ABSTRACT This article will describe how permissionless metadata blockchains could be created to overcome two significant limitations in current cataloging practices: centralization and a lack of traceability. The process would start by creating public and private keys, which could be managed using digital wallet software. After creating a genesis block, nodes would submit either a new record or modifications to a single record for validation. Validation would rely on a Federated Byzantine Agreement consensus algorithm because it offers the most flexibility for institutions to select authoritative peers. Only the top tier nodes would be required to store a copy of the entire blockchain thereby allowing other institutions to decide whether they prefer to use the abridged version or the full version. INTRODUCTION Several libraries and library vendors are investigating how blockchain could improve activities such as scholarly publishing, content dissemination, and copyright enforcement. A few organizations, such as Katalysis, are creating prototypes or alpha versions of blockchain platforms and products.1 Although there has been some discussion about using blockchains for metadata creation and management, only one company appears to be designing such a product. Therefore, this article will describe how permissionless blockchains of metadata records could be created, managed, and stored to overcome current challenges with metadata creation and management. LIMITATIONS OF CURRENT PRACTICES Metadata standards, processes, and systems are changing to meet twenty-first century information needs and expectations. There are two significant limitations, however, to our current metadata creation and modification practices that have not been addressed: centralization and traceability. Although there are other sources for metadata records, including the Open Library Project, the largest and most comprehensive database with over 423 million records is provided by the Online Computer Library Center (OCLC).2 OCLC has developed into a highly centralized operation that requires member fees to maintain its infrastructure. OCLC also restricts some members from editing records contributed by other members. One example of these restrictions is the Program for Cooperative Cataloging (PCC). Although there is no membership fee for PCC, catalogers from participating libraries must receive additional training to ensure that their institution contributes high quality records.3 Requiring such training, however, limits opportunities for participation and can create bottlenecks when non-PCC institutions identify errors in a PCC record. Decentralization NO NEED TO ASK | RUBEL 2 https://doi.org/10.6017/ital.v38i2.10822 would help smaller, less-well-funded institutions overcome such barriers to creating and contributing their records and modifications to a central database. The other significant limitation to our current cataloging practices is the lack of traceability for metadata changes. OCLC tracks record creation and changes by adding an institution’s OCLC symbol to the 040 MARC field.4 However, this symbol only indicates which institution created or edited the record, not what specific changes they made. 
OCLC also records a creation date and a replacement date in each record, but a record may acquire multiple edits between those two dates. Recording the details of each change within a record would help future metadata editors understand who made certain changes and possibly why they were made. Capturing these details would also mitigate concerns about the potential for metadata deletion, because every datum would still be recorded even if it is no longer part of the active record.

INFORMATION SCIENCE BLOCKCHAIN RESEARCH

Many researchers and institutions are exploring blockchain for information science applications. Most of these applications can be categorized as either scholarly publishing, content dissemination and management, or metadata creation and management.

One of the most promising applications for blockchain is coordinating, endorsing, and incentivizing research and scholarly publishing activities. In "Blockchain for Research," Van Rossum of Digital Science describes benefits such as data colocation, community self-correction, failure analysis, and fraud prevention.5 Research activity support and endorsement would use an Academic Endorsement Points (AEP) currency to support work at any level, such as blog posts, data sets, peer reviews, etc. The amount credited to each scientist is based on the AEP received for their previous work. Therefore, highly endorsed researchers will have a greater impact on the community. One benefit of this system is that such endorsements would accrue faster than traditional citation metrics.6 One detriment to this system is its reliance on the opinions of more experienced scientists. The current peer review process assumes these experts would be the best to evaluate new research because they have the most knowledge. Breakthroughs often overturn the status quo, however, and consequently may be overlooked in an echo chamber of approved theories and approaches.

Micropayments using AEP could "also introduce a monetary reward scheme to researchers themselves," bypassing traditional publishers.7 Unfortunately, such rewards could become incentives to propagate unscientific or immoral research on topics like eugenics. In addition, research rewards might increase the influence of private parties or corporations to science and society's detriment. Blockchains might also reduce financial waste by "incentivizing research collaboration while discouraging solitary and siloed research."8 Smart contracts could also be enabled that automatically publish any article, fund research, or distribute micropayments based on the amount of endorsement points.9 To support these goals, Digital Science is working with Katalysis on the Blockchain for Peer Review project.
It is hard to tell exactly where they are in development, but as of this writing, it is probably between the pilot phase and the minimum viable product.10 The Decentralized Research Platform (DEIP) serves as another attempt "to create an ecosystem for research and scientific activities where the value of each research…will be assessed by an experts' community."11 The whitepaper authors note that the lack of negative findings and of unmediated or open access to research results and data often leads to scientists replicating the same research.12 They also state that 80 percent of publishers' proceeds are from university libraries, which spend up to 65 percent of their entire budget on journal and database subscriptions.13 This financial waste is surprising because universities are the primary source of published research. Therefore, DEIP's goals include research and resource distribution, expertise recognition, transparent grant processes, skill or knowledge tracking, preventing piracy, and ensuring publication regardless of the results.14

The second most propitious application of blockchain to information science is content dissemination and management.15 Blockchain is an excellent way to track copyright. Several blockchains have already been developed for photographers, artists, and musicians. Examples include Photochain, Copytrack, Binded, and dotBC.16 Micropayments for content support the implementation of different access models, which can provide an alternative to subscription-based models.17 Micropayments can also provide an affordable infrastructure for many content types and royalty payment structures. Blockchain could also authenticate primary sources and trace their provenance over time. This authentication would not only support archives, museums, and special collections, but it would also ensure law libraries can identify the most recent version of a law.18 Finally, blockchain could protect digital first sale rights, which are key to libraries being able to share such content.19 "While DRM of any sort is not desirable, if by using blockchain-driven DRM we trade for the ability to have recognized digital first sale rights, it may be a worthy bargain for libraries."20 To support such restrictions, another use for blockchain, developed by companies such as LibChain, is open, verifiable, and anonymous access management for library content.21

Another suitable application for blockchain is metadata creation and management.22 An open metadata archive, information ledger, or knowledgebase is very appealing because access to high-quality records often requires a subscription to OCLC.23 Some libraries cannot afford such subscriptions. Therefore, they must rely on records supplied by either a vendor or a government agency, like the Library of Congress. Unfortunately, as of this writing, there is little research on how these blockchains could be constructed at the scale of large databases like those of OCLC and the Library of Congress. In fact, the only such project is DEMCO's private, invitation-only beta.24 DEMCO does not provide any information regarding their new product, but to make its development profitable, it is most likely a private, permissioned blockchain.

CREATING PERMISSIONLESS BLOCKCHAINS FOR METADATA RECORDS

This section will describe how to create permissionless blockchains for metadata records, including grouping transactions, an appropriate consensus algorithm, and storage options.
Please note that these blockchains are intended to augment current metadata record creation and modification practices and standards, not supersede them. The author assumes that record creation and modification will still require content (RDA) and encoding (MARC) validation prior to blockchain submission. Validation in this section will refer solely to blockchain validation.

Generating and Managing Public and Private Keys

All distributed ledger participants will need a public key or address so that blocks of transactions can be sent to them, and a private key for digital signatures. One way to create these key pairs is to generate a seed, which can be a group of random words or passphrases. The SHA-256 algorithm can then be applied to this seed to create a private key.25 Next, a public key can be generated from that private key using an elliptic curve digital signature algorithm.26 For additional security, the public key can be hashed again using a different cryptographic hash function, such as RIPEMD160, or multiple hash functions, like Bitcoin does to create its addresses.27 These key pairs could be managed with digital wallet software. "A Bitcoin wallet is an organized collection of addresses and their corresponding private keys."28 Larger institutions, such as the Library of Congress, could have multiple key pairs, with each pair designated for the appropriate cataloging department based on genre, form, etc.

Creating a Genesis Block

Every blockchain must start with a "genesis block."29 For example, a personal name authority blockchain might start with William Shakespeare's record. A descriptive bibliographic blockchain might start with the King James Bible. This genesis block includes a block header, a recipient's public key or address, a transaction count, and a transaction list.30 Being the first block, the block header will not contain a hash of the previous block header. It will contain, however, a hash of all of the transactions within that block to verify that the transaction list has not been altered. The block header will also include a timestamp and possibly a difficulty level and nonce.31 Then the block header is hashed using the SHA-256 algorithm and encrypted with the creator's private key to produce a digital signature. This digital signature will be appended to the end of the block so validators can verify that the creator made the block by using the creator's public key.32 Finally, the recipient's public key or address, the transaction count, and the transaction list are appended to the block header.33

Block header
• Hash of previous block header
• Hash of all transactions in that block
• Timestamp
• Difficulty level (if applicable)
• Nonce (if applicable)

Block
• Recipient public key or address
• Transaction count
• Transaction list
• Digital signature

In her master of information security and intelligence thesis at Ferris State University, Amber Snow investigated the feasibility of using blockchain to add, edit, and validate changes to Woodbridge N. Ferris' authority record.34 As shown in figure 1, she began by creating a hash function using the SHA-256 algorithm to encrypt the previous hash, the timestamp, the block number, and the metadata record.
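To illustrate the kind of hash function Snow describes, here is a minimal Python sketch; the field names and the sample record string are illustrative assumptions, not her implementation.

```python
import hashlib
import json
import time

def compute_block_hash(previous_hash, timestamp, block_number, metadata):
    """Return the SHA-256 digest of the four block fields.

    Hashing is one-way: no key is involved, so the digest cannot be
    decrypted, only recomputed and compared.
    """
    payload = json.dumps(
        {
            "previousHash": previous_hash,
            "timestamp": timestamp,
            "blockNumber": block_number,
            "metadata": metadata,
        },
        sort_keys=True,
    )
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()

# A genesis block, as in Snow's prototype, uses a previous hash of zero.
genesis_hash = compute_block_hash("0", time.time(), 0, "<MARC authority record>")
```

Because each later block stores the digest of the block before it, altering any committed record would change every digest after it, which is what makes past edits traceable.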
"The returned encrypt value is significant because the returned data is the encrypted data that is being committed as [a] mined block transaction permanently to ledger."35 The ledger block, however, "contains the editor's name, the entire encrypted hash value, and the prior blocks [sic] hashed value."36

Figure 1. Creating a SHA-256 hash.

Next, as shown in figures 2 and 3, she created a genesis block with a prior hashed value of zero by ingesting Ferris' authority record as "a single line file that contains the indicator signposts for cataloging the record."37

Figure 2. Ingesting Woodbridge N. Ferris' authority record.38

Figure 3. Woodbridge N. Ferris' authority record as a genesis block. Note the previousHash value is zero.

Snow noted that "the understanding and interpretation of the MARC authority record's signposts is not inherently relevant for the blockchain data processing."39 To keep the scope narrow, she also avoided using public and private key pairs to exchange records between nodes. "The RI [Research Institution] blockchain does not necessarily require two users to agree…instead the RI blockchain is looking to commit and track single user edits to the record."40

Creating and Submitting New Blocks for Validation

Once a genesis block has been created and distributed, any node on the network can submit new blocks to the chain. For metadata records, new blocks should contain either new records or multiple modifications to the same record, with each field being treated as a transaction. When a second block is appended, the new block header will include the hash of the previous block header, a hash of all of the new transactions, a new timestamp, and possibly a new difficulty level and/or nonce. The block header will then be hashed using SHA-256 and encrypted with the submitter's private key to become a digital signature for that block. Finally, another recipient's public key or address, a new transaction count, and a new transaction list will be appended to the block header. Additional blocks can then be securely appended to the chain ad infinitum without losing any of the transactional details. If two validators approve the same block at the same time, then the fork where the next block is appended first becomes the valid chain while the other chain becomes orphaned.41

Although Snow's method does not include exchanging records using public keys or addresses, she was able to change a record, add it to the blockchain, and successfully commit those edits using the Proof of Work consensus algorithm.42 As shown in figure 4, after creating and submitting a genesis block as "tester 1," she added a modified version of Woodbridge N. Ferris' record as "tester 2." This version appended the string "testerchanged123" to Woodbridge N. Ferris' authority record. Then she validated or "mined" the second block to commit the changes.

Figure 4. Submitting and validating an edited record.

Figure 5 shows that the second block is chained to the genesis block because the "previousHash" value of the second block matches the "hash" of the genesis block. This link is what commits the block to the ledger. The appended string in the second block is at the end of the "metadata" variable.

Figure 5. The new authority record blockchain.

A more sophisticated method to append a second block would require key pairs, as the sketch below suggests.
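A hedged sketch of that key-pair step, using the third-party Python ecdsa package; the block header string and the key handling are illustrative assumptions, not a prescribed implementation.

```python
import hashlib
from ecdsa import BadSignatureError, SECP256k1, SigningKey

# Generate a private key on the curve Bitcoin uses, then derive the
# public (verifying) key that other nodes would hold.
private_key = SigningKey.generate(curve=SECP256k1)
public_key = private_key.get_verifying_key()

# Hash the block header, then sign the digest with the private key.
block_header = b"<previous header hash>|<transactions hash>|<timestamp>"
digest = hashlib.sha256(block_header).digest()
signature = private_key.sign(digest)

# Any validator holding the creator's public key can check the signature.
try:
    public_key.verify(signature, digest)
    print("Block signature verified")
except BadSignatureError:
    print("Block signature rejected")
```

Because the signature travels with the block, as described above, any node holding the submitter's public key can confirm who created or modified a record without trusting a central authority.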
As described previously, a block would include a recipient's public key or address, which would route the new and modified records to large, known institutions like the Library of Congress. Although every node on the network can see the records and all of the changes, large institutions with well-trained and authoritative catalogers may be the best repository for metadata records and could store a preservation or backup copy of the entire chain. They are also the most reliable for validating records for content accuracy and correct encoding.

Achieving Algorithmic Consensus

Once a block has been submitted for validation, the other nodes use a consensus algorithm to verify the validity of the block and its transactions. "Consensus mechanisms are ways to guarantee a mutual agreement on a data point and the state…of all data."43 The most well-known consensus algorithm is Bitcoin's Proof of Work, but the most suitable algorithm for permissionless metadata blockchains is a Federated Byzantine Agreement.

Proof of Work

Proof of Work (PoW) relies on a one-way cryptographic hash function to create a hash of the block header. This hash is easy to calculate, but it is very difficult to determine its components.44 To solve a block, nodes must compete to calculate the hash of the block header. To calculate the hash of a block header, a node must first separate it into its constituent components. The hash of the previous block header, the hash of all of the transactions in that block, the timestamp, and the difficulty target will always have the same inputs. The validator, however, changes the nonce or random value appended to the block header until the hash has been solved.45 In Bitcoin this process is called "mining" because every new block creates new bitcoins as a reward for the node that solved the block.46

Bitcoin also includes a mechanism to ensure the average number of blocks solved per hour remains constant. This mechanism is the difficulty target. "To compensate for increasing hardware speed and varying interest in running nodes over time, the proof-of-work difficulty is determined by a moving average targeting an average number of blocks per hour. If they're generated too fast, the difficulty increases."47 Adjusting the difficulty target within the block header keeps Bitcoin stable because its block rate is not determined by its popularity.48 In sum, validators are trying to find a nonce that generates a hash of the block header that is less than the predetermined difficulty target.

Unfortunately, Proof of Work requires immense and ever-increasing computational power to solve blocks, which poses a sustainability and environmental challenge. Bitcoin and other financial services may need to rely on Proof of Work because "the massive amounts of electricity required helps to secure the network. It disincentivizes hacking and tampering with transactions…"49 An attacker would need to control over 51 percent of the entire network to convince the other nodes that a faulty ledger is correct.50 Metadata blockchains would rely on public information and therefore would not need the same level of security as private financial, medical, or personally identifiable information. Unlike Bitcoin, metadata blockchains also would not need a difficulty target, because fluctuations in block production rates would not affect a metadata block's value the same way cryptocurrency inflation would.
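To make the nonce search concrete, here is a minimal Python sketch; the header string is a placeholder, and requiring a number of leading zero bits is a simplification of Bitcoin's actual difficulty-target encoding.

```python
import hashlib

def mine(header, difficulty_bits):
    """Increment the nonce until the header hash falls below the target.

    A digest below 2**(256 - difficulty_bits) is equivalent to the
    digest starting with difficulty_bits zero bits.
    """
    target = 2 ** (256 - difficulty_bits)
    nonce = 0
    while True:
        digest = hashlib.sha256(f"{header}{nonce}".encode()).hexdigest()
        if int(digest, 16) < target:
            return nonce, digest
        nonce += 1  # every failed guess is discarded work

# Even a toy difficulty of 20 bits takes about a million guesses on average.
nonce, digest = mine("<previous hash>|<transactions hash>|<timestamp>|", 20)
print(nonce, digest)
```

Each additional required zero bit doubles the expected number of guesses, which is exactly where the energy cost of Proof of Work comes from.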
Therefore, despite its incredible security, Proof of Work would be computationally excessive for metadata record blockchains.

Federated Byzantine Agreement

Byzantine Agreements are "the most traditional way to reach consensus. […] A Byzantine Agreement is reached when a certain minimum number of nodes (known as a quorum) agrees that the solution presented is correct, thereby validating a block and allowing its inclusion on the blockchain."51 Byzantine fault-tolerant (BFT) state machine replication protocols support consensus "despite participation by malicious (Byzantine) nodes."52 This support ensures consensus finality, which "mandates that a valid block…never be removed from the blockchain."53 In contrast, Proof of Work does not satisfy consensus finality because there is still the potential for temporary forking even if there are no malicious nodes.54 The "absence of consensus finality directly impacts the consensus latency of PoW blockchains as transactions need to be followed by several blocks to increase the probability that a transaction will not end up being pruned and removed from the blockchain."55 This latency increases as block size increases, which may also increase the number of forks and the possibility of attack.56 "With this in mind, limited performance is seemingly inherent to PoW blockchains and not an artifact of a particular implementation."57 BFT protocols, however, can sustain tens of thousands of transactions at nearly network latency levels.58

A BFT consensus algorithm is also superior to one based on Proof of Work because "users and smart contracts can have immediate confirmation of the final inclusion of a transaction into the blockchain."59 BFT consensus algorithms also decouple trust from resource ownership, allowing small organizations to oversee larger ones.60 To use BFT, however, every node must know and agree on the exact list of participating peer nodes. Ripple, a BFT protocol, tries to ameliorate this problem by publishing an initial membership list and allowing members to edit that list after implementation. Unfortunately, users are often reluctant to edit the membership list, thereby placing most of the network's power in the person or organization that maintains the list.61
This level would include the core metadata authorities, such as the Library of Congress or PCC members. Members of this tier would be able to validate any record. The second or middle tier nodes would depend on the top tier because, in this example, a middle tier node requires two top tier nodes to form a quorum slice. These middle tier nodes would be authoritative, known institutions, such as universities, that already rely on the core metadata authorities on the top tier to validate and distribute their records. Finally, a third tier, such as smaller institutions, would, in this example, rely on at least two middle tier nodes for their quorum slice. Figure 6. Tiered quorum example. NO NEED TO ASK | RUBEL 10 https://doi.org/10.6017/ital.v38i2.10822 Using an FBA protocol to validate a transaction requires each node to exchange two sets of messages. The first set of messages gathers validations and the second set of messages confirms those validations. “From each node’s perspective, the two rounds of messages divide agreement…into three phases: unknown, accepted, and confirmed.”67 The unknown status becomes an acceptance when the first validation succeeds. Acceptance is not sufficient for a node to act on that validation, however, because acceptance may be stuck in an indeterminate state or blocked for other nodes.68 The accepting node may also be corrupted and validate a transaction the network quorum rejects. Therefore, the confirmation validation “allows a node to vote for one statement and later accept a contradictory one.”69 Figure 7. Validation process of statement a for a single node v. FBA would lessen concerns about sharing a permissionless blockchain, but it can “only guarantee safety when nodes choose adequate quorum slices.”70 After discovery, Byzantine nodes should be excluded from quorum slices to prevent interference with validation. One example of such interference is tricking other nodes to validate a bad confirmation message. “In such a situation, nodes must disavow past votes, which they can only do by rejoining the system under new node names.”71 Theoretically, this recovery process could be automated to include “having other nodes recognize reincarnated nodes and automatically update their slices.”72 Therefore, the key limitation to using an FBA algorithm is continuity of participation. If too many nodes leave the network, reengineering consensus would require centralized coordination whereas Proof of Work algorithms could operate after losing many nodes without substantial human intervention.73 STORING THE BLOCKCHAIN Storing a large blockchain, such as Bitcoin, is a significant challenge. One method to facilitate that storage would be to rely on top tier nodes to retain a complete copy of the blockchain and allow smaller, lower tier nodes to retain an abridged version. In Bitcoin, these methods are known as full payment verification (FPV) and simplified payment verification (SPV). FPV requires a complete copy of the blockchain to “verify that bitcoins used in a transaction originated from a mined block by scanning backward, transaction by transaction, in the blockchain until their origin is found.”74 Unfortunately, as one might expect, FPV consumes many resources and can take a long time to initialize. For example, downloading Bitcoin’s blockchain can take several days. 
STORING THE BLOCKCHAIN

Storing a large blockchain, such as Bitcoin's, is a significant challenge. One method to facilitate that storage would be to rely on top tier nodes to retain a complete copy of the blockchain and allow smaller, lower tier nodes to retain an abridged version. In Bitcoin, these methods are known as full payment verification (FPV) and simplified payment verification (SPV). FPV requires a complete copy of the blockchain to "verify that bitcoins used in a transaction originated from a mined block by scanning backward, transaction by transaction, in the blockchain until their origin is found."74 Unfortunately, as one might expect, FPV consumes many resources and can take a long time to initialize. For example, downloading Bitcoin's blockchain can take several days. This long installation period is partly due to the size of the blockchain, but if Proof of Work is used as the consensus algorithm, then the new node must also connect to other full nodes "to determine whose blockchain has the greatest proof-of-work total (by definition, this is assumed to be the consensus blockchain)."75 Using FBA instead of Proof of Work would eliminate this time- and resource-consuming step.

In contrast, SPV only allows a node "to check that a transaction has been verified by miners and included in some block in the blockchain."76 A node does this by downloading the block headers of every block in the chain. In addition to retaining the hash of the previous block header, these headers also include root hashes derived from a Merkle Tree. A Merkle Tree is a method whereby "the spent transactions…can be discarded to save disk space."77 As shown in figure 8, combining transaction hashes for the entire block into a single root hash in the block header saves a considerable amount of storage capacity because the interior hashes can be eliminated or "pruned" off the Merkle Tree.

Figure 8. Using a Merkle Tree for storage.

As shown in figure 9, to verify that a transaction was included in a block, a node "obtains the Merkle branch linking the transaction to the block it's timestamped in."78 Although it cannot check the transaction directly, "by linking it to a place in the chain he can see that a network node has accepted it and blocks after it further confirm the network has accepted it."79

Figure 9. Verifying a transaction using a Merkle root hash.

Compared to FPV, SPV "requires only a fraction of the memory that's needed for the entire blockchain."80 This small amount of storage enables SPV ledgers to sync and become operational in less than an hour.81 SPV is limited, however, allowing nodes to manage only the addresses or public keys that they maintain, whereas FPV ledgers are able to query the entire network. Thus, an SPV ledger must rely "on its network peers to ensure its transactions are legit."82 Theoretically, an attacker could overpower the entire network and convince nodes using SPV to accept fraudulent transactions, but such an attack is very unlikely for metadata blockchains. For additional security, an SPV node could also "accept alerts from network nodes when they detect an invalid block, prompting the user's software to download the full block and alerted transactions to confirm the inconsistency."83 Adding such a feature to metadata blockchain software would eliminate the slight risk of it being contaminated by malicious actors. Thus, SPV offers the ability for smaller institutions to participate in creating and maintaining a metadata blockchain without requiring them to have the storage capacity for the entire blockchain.
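To make the Merkle mechanics behind SPV concrete, here is a minimal Python sketch; it duplicates the last hash on odd-sized levels, omits real transaction serialization, and uses single rather than Bitcoin's double SHA-256.

```python
import hashlib

def sha256(data):
    return hashlib.sha256(data).digest()

def merkle_root(leaf_hashes):
    """Fold a list of transaction hashes up to a single root hash."""
    level = list(leaf_hashes)
    while len(level) > 1:
        if len(level) % 2:  # duplicate the last hash on odd-sized levels
            level.append(level[-1])
        level = [sha256(level[i] + level[i + 1])
                 for i in range(0, len(level), 2)]
    return level[0]

# Four transactions hashed into leaves, then folded into one root that a
# lightweight node can keep in place of the full transaction list.
leaves = [sha256(tx) for tx in [b"tx1", b"tx2", b"tx3", b"tx4"]]
root = merkle_root(leaves)

# An SPV-style check: recompute the root from one transaction plus its
# Merkle branch (the sibling hashes along the path up to the root).
branch = [leaves[1], sha256(leaves[2] + leaves[3])]
running = sha256(leaves[0] + branch[0])
running = sha256(running + branch[1])
print(running == root)  # True
```

The branch grows only logarithmically with the number of transactions, which is why a header-only node can verify inclusion of a single record without storing the whole block.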
CONCLUSION AND FUTURE DIRECTIONS

This article described how permissionless metadata blockchains could be created to overcome two significant limitations in current cataloging practices: centralization and a lack of traceability. The process would start by creating public keys using a seed and the SHA-256 algorithm and private keys using an elliptic curve digital signature algorithm. After creating the genesis block, nodes would submit either a new record or modifications to a single record for validation. Validation would rely on a Federated Byzantine Agreement (FBA) consensus algorithm because it offers the most flexibility for institutions to select authoritative peers. Quorum slices would be chosen using a tiered system where the top tier institutions would be the core metadata authorities, such as the Library of Congress. Only the top tier nodes would be required to store a copy of the entire blockchain (FPV), thereby allowing other institutions to decide whether they prefer to use SPV or FPV.

Future directions for research could start with investigating whether this theoretical design will work. FBA has not been heavily promoted as an option for a consensus algorithm, but its quorum slices create trust between recognized authorities and smaller institutions. Another area of study could be whether there is a significant demand for metadata blockchains. Many institutions appear frustrated at the costs and limitations of working with a vendor, but they also view such relationships as necessary for metadata record creation and maintenance. A metadata blockchain would reduce such dependence, but some institutions may be leery of using open source software. Other institutions might be hesitant to adopt blockchain because they believe it is merely another "fad" or an unnecessary addition to metadata exchange systems. A third area for research could be a cost-benefit analysis for implementing metadata blockchains that weighs current vendor fees and labor costs against the potential storage and labor costs. Such an analysis may create a tipping point where long-term return on investment outweighs the short-term challenges.

ENDNOTES

1. "About the Project," Blockchain for Peer Review, Digital Science and Katalysis, accessed Nov. 29, 2018, https://www.blockchainpeerreview.org/about-the-project/.

2. "MARC Record Services," MARC Standards, Library of Congress, accessed Nov. 29, 2018, https://www.loc.gov/marc/marcrecsvrs.html; "Open Library Data," Open Library, Internet Archive, accessed Nov. 29, 2018, https://archive.org/details/ol_data; OCLC, 2017-2018 Annual Report.

3. "Join the PCC," Program for Cooperative Cataloging, Library of Congress, accessed Nov. 29, 2018, http://www.loc.gov/aba/pcc/join.html.

4. "040 Cataloging Source (NR)," OCLC Support & Training, OCLC, accessed Nov. 29, 2018, https://www.oclc.org/bibformats/en/0xx/040.html.

5. Dr. Joris Van Rossum, "Blockchain for Research," accessed Nov. 29, 2018, https://www.digital-science.com/resources/digital-research-reports/blockchain-for-research/.

6. Van Rossum, 11.

7. Van Rossum, 12.

8. Van Rossum, 12.

9. Van Rossum, 16.

10. Digital Science and Katalysis, "About the Project."

11. "Decentralized Research Platform," DEIP, accessed Nov. 29, 2018, https://deip.world/wp-content/uploads/2018/10/Deip-Whitepaper.pdf.

12. DEIP, 13.

13. DEIP, 14.

14. DEIP, 16.

15. Jason Griffey, "Blockchain for Libraries," Feb. 26, 2016, https://speakerdeck.com/griffey/blockchain-for-libraries.

16. "E-Services," Concensum, accessed Nov. 29, 2018, https://concensum.org/en/e-services; "About," Binded, accessed Nov. 29, 2018, https://binded.com/about; "FAQ," Dot Blockchain Media, accessed Nov. 29, 2018, http://dotblockchainmedia.com/.

17. Van Rossum, "Blockchain for Research," 10.

18. Debbie Ginsberg, "Law and the Blockchain," Blockchains for the Information Profession, Nov. 22, 2017, https://ischoolblogs.sjsu.edu/blockchains/law-and-the-blockchain-by-debbie-ginsberg/.
19 Griffey, "Blockchain for Libraries."
20 "Ways to Use Blockchain in Libraries," San José State University, accessed Nov. 29, 2018, https://ischoolblogs.sjsu.edu/blockchains/blockchains-applied/applications/.
21 "LibChain: Open, Verifiable, and Anonymous Access Management," LibChain, accessed Nov. 29, 2018, https://libchain.github.io/.
22 Griffey, "Blockchain for Libraries."
23 San José State University, "Ways to Use Blockchain in Libraries."
24 "Demco Software Blockchain," Demco, accessed Nov. 29, 2018, http://blockchain.demcosoftware.com/.
25 Jordan Baczuk, "How to Generate a Bitcoin Address—Step by Step," Coinmonks, accessed Nov. 29, 2018, https://medium.com/coinmonks/how-to-generate-a-bitcoin-address-step-by-step-9d7fcbf1ad0b.
26 "Elliptic Curve Digital Signature Algorithm," Bitcoin Wiki, accessed Nov. 29, 2018, https://en.bitcoin.it/wiki/Elliptic_Curve_Digital_Signature_Algorithm.
27 Conrad Barski and Chris Wilmer, Bitcoin for the Befuddled (San Francisco: No Starch Press, 2015), 139.
28 Barski and Wilmer, 12-13.
29 Barski and Wilmer, 11.
30 Barski and Wilmer, 172-73.
31 Barski and Wilmer, 172-73.
32 Satoshi Nakamoto, "Bitcoin: A Peer-to-Peer Electronic Cash System," accessed Nov. 29, 2018, https://bitcoin.org/bitcoin.pdf.
33 Barski and Wilmer, Bitcoin for the Befuddled, 170-72.
34 Amber Snow, "The Design and Implementation of Blockchain Technology in Academic Resource's Authoritative Metadata Records: Enhancing Validation and Accountability" (master's thesis, Ferris State University, 2018), 34.
35 Snow, 40.
36 Snow, 40.
37 Snow, 37, 40.
38 Snow, 42.
39 Snow, 37.
40 Snow, 39.
41 Barski and Wilmer, Bitcoin for the Befuddled, 23.
42 Snow, "The Design and Implementation of Blockchain Technology," 37.
43 "9 Types of Consensus Mechanisms You Didn't Know About," Daily Bit, accessed Nov. 29, 2018, https://medium.com/the-daily-bit/9-types-of-consensus-mechanisms-that-you-didnt-know-about-49ec365179da.
44 Barski and Wilmer, Bitcoin for the Befuddled, 138.
45 Barski and Wilmer, 171.
46 Barski and Wilmer, 138.
47 Nakamoto, "Bitcoin," 3.
48 Barski and Wilmer, Bitcoin for the Befuddled, 171.
49 Helen Zhao, "Bitcoin and blockchain consume an exorbitant amount of energy. These engineers are trying to change that," CNBC, Feb. 23, 2018, https://www.cnbc.com/2018/02/23/bitcoin-blockchain-consumes-a-lot-of-energy-engineers-changing-that.html.
50 Barski and Wilmer, Bitcoin for the Befuddled, 23.
51 Shaan Ray, "Federated Byzantine Agreement," Towards Data Science, accessed Nov. 29, 2018, https://towardsdatascience.com/federated-byzantine-agreement-24ec57bf36e0.
52 Marko Vukolić, "The Quest for Scalable Blockchain Fabric: Proof-of-Work vs. BFT Replication," IBM Research – Zurich, accessed Nov. 29, 2018, http://vukolic.com/iNetSec_2015.pdf.
53 Vukolić, "The Quest for Scalable Blockchain Fabric," [5].
54 Vukolić, [6].
55 Vukolić, [6].
56 Vukolić, [7].
57 Vukolić, [7].
58 Vukolić, [7].
59 Vukolić, [6].
60 David Mazières, "The Stellar Consensus Protocol: A Federated Model for Internet-level Consensus," Stellar Development Foundation, accessed Nov. 29, 2018, https://www.stellar.org/papers/stellar-consensus-protocol.pdf.
61 Mazières, 3.
62 Mazières, 1.
63 Mazières, 4.
64 Mazières, 4.
65 Mazières, 4.
66 Mazières, 5.
67 Mazières, 11.
68 Mazières, 11.
69 Mazières, 13.
70 Mazières, 28.
71 Mazières, 29.
72 Mazières, 29.
73 Mazières, 29.
74 Barski and Wilmer, Bitcoin for the Befuddled, 191.
75 Barski and Wilmer, 191.
76 Barski and Wilmer, 192.
77 Nakamoto, "Bitcoin," 4.
78 Nakamoto, 5.
79 Nakamoto, 5.
80 Barski and Wilmer, Bitcoin for the Befuddled, 192.
81 Barski and Wilmer, 193.
82 Barski and Wilmer, 193.
83 Nakamoto, "Bitcoin," 5.

10844 ---- 10844 20190318 galley

Library Services Navigation: Improving the Online User Experience
Brian Rennick

Brian Rennick (brian_rennick@byu.edu) is AUL for Library IT, Brigham Young University.

ABSTRACT
While the discoverability of traditional information resources is often the focus of library website design, there is also a need to help users find other services such as equipment, study rooms, and programs. A recent assessment of the Brigham Young University Library website identified nearly two hundred services. Many of these service descriptions were buried deep in the site, making them difficult to locate. This article will describe a web application that was developed to improve service discovery and to help ensure the accuracy and maintainability of service information on an academic library website.

INTRODUCTION
The Brigham Young University Library released a new version of its website in 2014. Multiple usability studies were conducted to inform the design of the new site. During these studies, the web designers observed that when users did not see what they were looking for on the homepage, they were likely to click on the "Services" link as the next best option. The word services appeared to be an effective catch-all term. Web designers asked themselves, "What is a library service?" They concluded that a library service could be anything public-facing that meets the needs of a user. Using this broad definition, services could include:

• Library materials—both digital and physical (e.g., books, DVDs)
• Material services (e.g., course reserve, interlibrary loan)
• Equipment and technology (e.g., computers, cameras, tripods)
• Help and guidance (e.g., research assistance, computer assistance)
• Locations (e.g., group study rooms, classrooms, help desks)
• Programs (e.g., Friends of the Library, lectures)

Because libraries offer so many diverse services, structuring a website to effectively promote them all brings many challenges. For instance, a common approach to presenting library services on a website is to have a menu that lists a few of the most popular or important services. The last menu item will normally be a link to a web page for "Other Services" that provides a more comprehensive service list. Such an all-inclusive listing of library services on a single web page can easily lead to information overload for users.

Where do services belong in a library website's information architecture? Determining the one correct path is not easy because there are multiple valid ways to organize services into web pages. Services could be arranged by department, service category, user group (undergraduates, graduates, faculty, visitors, alumni), or any number of other ways. An ideal system would allow users to follow the path that makes the most sense to them.

User expectations for a single (Google-like) search box add to the challenges for service listings.1 A single search box, also known as a metasearch system, web-scale discovery service, or federated search, combines search results from multiple library sources.
A study at the University of Colorado found that users expected to locate services by entering keywords into the single search box on the library's homepage.2 For example, users attempted to search for "interlibrary loan" and "chat with a librarian" using the single search box. It is unrealistic to expect all users to follow a specific series of links in order to find the one correct path to information about a service when they are accustomed to Google-style searching.

Even when a user manages to locate the correct web page where a service is described, the pertinent information can still be difficult to pinpoint when service descriptions are buried in paragraphs. Users need to be able to quickly perform a visual scan of a web page to locate service information. Kozak and Hartley suggest that "bulleted lists are easier to read, easier to search and easier to remember than continuous prose."3

The ongoing maintenance of service listings poses another significant challenge. For large academic libraries, up-to-date service information is difficult to maintain because it is typically scattered throughout a website. Each department may have its own set of web pages and service listings. Department pages created and maintained by different individuals end up with inconsistent design, organization, and voice. Services that are common to multiple departments will have duplicate listings with different descriptions. Maintenance of accurate information becomes an issue as services change; tracking down all of the references to a discontinued or modified service requires extensive searching of the website.

LITERATURE REVIEW
Studies and commentaries regarding the information architecture of academic library websites have been covered extensively in the literature.4 A few articles specifically address the way that library services are organized on websites.

Library services are a significant component of academic library website content. Clyde studied one hundred library websites from thirteen countries in order to compare common features and to determine some of the purposes for a library website.5 Purposes for the sites varied: some focused on providing information about the library and its services, while others functioned more like a portal, providing links to Internet resources. Cohen and Still developed a list of core content for academic library websites by examining pages from university and two-year college sites.6 They organized the content into categories: Library Information, Reference, Research, Instruction, and Functionalities. Liu surveyed ARL libraries to get an overview of the state of web page development.7 The subsequent SPEC Kit identifies services commonly found on academic library websites. Yang and Dalal studied a random sample of academic library websites to see which web-based reference services were offered and how they were presented.8 They also examined the differing terminology used to describe the services.

The choice of terminology used on library websites impacts the findability of services. Dewey compared academic websites from thirteen member libraries of a consortium to determine how findable service links were on the sites.9 The service links used in the evaluation covered "access, reference, information, and user education" categories. The study measured the number of clicks from the homepage that were required to find information about a service.
Dewey found inconsistent use of terminology used to describe library services from one site to another. Dewey posited that extensive use of library jargon could, in a sense, hide links from users. The overall conclusion was that the websites contained "too much information poorly placed." A study of an academic library website by McGillis and Toms also found that participants struggled with terminology when attempting to locate services.10 The website reflected "traditional library structures" instead of using categories that were meaningful to users.

The decision on where to place library services on a website is an important step in the design process. As part of their proposal to establish a benchmarking program for academic library websites, Hightower, Shih, and Tilghman created classifications for the web pages they studied.11 Library services were assigned to the "Directional" category instead of representing a separate category. Vaughan described a history of changes to an academic website that took place from 1996 to 2000.12 An interesting change was that, after multiple redesigns, the web designers combined two categories into a single "Library Services" category in order to simplify top-level navigation on the home page. Comeaux studied thirty-seven academic library websites to see how design elements evolved between 2012 and 2015.13 A portion of the study compiled terms used as navigation labels. The term "About" was the most common navigation label, followed by "Services" as the second most common. Use of the term "Services" as a main navigation label increased in popularity from 2012 to 2015.

Several researchers suggest organizing library services into web pages or portals that target different audiences. Gullikson et al. studied usability issues related to the information architecture of an academic website and discovered that study participants followed different paths in their attempts to locate service information on the site.14 Some users found items easily while others were unsuccessful. Menu labels were not universally understood. The researchers identified a need for multiple access points to information in order to accommodate different mental models. They suggested employing multiple information organizational schemes, such as categorizing links by function, frequency of use, and target user group. Adams and Cassner analyzed the websites of ARL libraries to see how services for distance education students and faculty were presented.15 They recommend strategies for helping distance students navigate the website, including maintaining a web page designed specifically for distance students that avoids jargon and clearly describes services. Detlor and Lewis envisioned academic library websites as "sophisticated guidance systems which support users across a wide spectrum of information seeking behaviors—from goal-directed search to wayward browsing."16 They reviewed ARL library websites to see which important features were present or absent. Their coding methodology was adopted by Gardner, Juricek, and Xu in their study of how library web pages can meet the needs of campus faculty.17 Liu proposed a conceptual model for an improved academic library website that would be organized into portals designed for specific user groups, such as undergraduates, faculty, or visitors.18 Some of the ARL websites studied by the researcher already implemented portals by user group.
A more recent approach for locating library services has been to include website search results when using the single search from the homepage. For example, the North Carolina State University Libraries website includes library-wide site search results when using the single search.19 The Wayne State University Libraries single search displays results from a university-wide site search.20

An influential report produced by Andrew Pace provides practical advice for designing library websites.21 In the report, Pace described the library services that should be included on a site and stressed that website design affects the discoverability and delivery of these services: "Whether requiring minimal maintenance or constant upkeep, the extensibility of the design and flexibility of a site's architecture ultimately saves the library time, money, hassle, and user frustration."22 The web application described in this article aims to achieve these goals in terms of service discoverability and website maintainability.

A SERVICES WEB APPLICATION
In an effort to tackle the challenges of services navigation and maintenance, the Brigham Young University Library developed a web application for organizing services that allows multiple routes to service information. The application, known internally as "Services," was built using Django, an open-source Python web framework. The application incorporates a comprehensive list of library services and a map of service relationships. Each service is assigned one or more categories, locations, and service areas within the application:

• Categories and Subcategories—broad groupings of services (e.g., research help, for faculty, printing and copying)
• Locations—physical or virtual places within the library where services can be found (e.g., help desks, rooms)
• Service Areas—library departments or other organizational units that offer services (e.g., Humanities, Special Collections)

Services can have multiple categories, locations, and service areas, and some service areas have multiple locations within the library (see figure 1). Service information can also include links to related services. These links facilitate the serendipitous discovery of additional services (see figure 2).

Service information is stored in a relational database that joins connected entities together. An HTML template is used to format service information from the database in order to generate web pages for each of the services. Maintaining the data in this manner ensures that changes made to service information in the database flow through to all of the associated web pages. Adding or modifying entries automatically triggers the generation of new HTML for only the impacted services. Generating static content by using triggers keeps the web pages up-to-date without the performance hit of real-time dynamic page generation. (A sketch of what this data model might look like follows the figures below.)

Figure 1. Sample map illustrating relationships between services (on the left side) and service area locations (on the right side).

Figure 2. Sample map of how related service web pages are linked.
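The article does not publish the application's schema, so the following Django models are a speculative sketch of the data model described above: services joined many-to-many to categories, locations, and service areas, with related-service links and a save-time trigger for regenerating static pages. All model, field, and helper names (including render_service_page) are hypothetical.

# Hypothetical Django models for a Services application like the one
# described above. Names are illustrative assumptions, not BYU's code.
from django.db import models
from django.db.models.signals import post_save
from django.dispatch import receiver

class Category(models.Model):
    name = models.CharField(max_length=100)   # e.g., "Research Help"

class Location(models.Model):
    name = models.CharField(max_length=100)   # e.g., a help desk or room

class ServiceArea(models.Model):
    name = models.CharField(max_length=100)   # a department or other unit
    locations = models.ManyToManyField(Location)  # an area can span locations

class Service(models.Model):
    title = models.CharField(max_length=200)
    description = models.TextField()
    # A service can belong to many categories, locations, and areas.
    categories = models.ManyToManyField(Category)
    locations = models.ManyToManyField(Location)
    service_areas = models.ManyToManyField(ServiceArea)
    related = models.ManyToManyField("self", blank=True)  # related-service links

def render_service_page(service):
    # Hypothetical helper: re-render the static HTML page for one service
    # from the shared template, as the article describes.
    pass

@receiver(post_save, sender=Service)
def regenerate_pages(sender, instance, **kwargs):
    # Trigger static regeneration only for the affected service, so pages
    # stay current without real-time dynamic page generation.
    render_service_page(instance)

The many-to-many joins mirror the article's point that a single service can surface through several navigation paths (category, location, or department) without duplicating its description.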
USER SCENARIOS
The following examples of navigation paths typify how the web application can help users locate services. In each case there are multiple alternative paths that could be followed to find the same information.

Scenario 1. A student is looking for a computer that has music notation software installed. Clicking the "Services" link on the library homepage leads to a summary of library services. The student clicks the "Public Computers" link found under the "Featured Services" heading and is presented with detailed information about the computers. In the bullet points listed in the "Overview" section there is a link to "See the list of software available on these computers." Following this link, the student is able to learn that the desired software is available in the library's Music and Dance Media Lab.

Scenario 2. While visiting a web page for the faculty delivery service, a professor notices a link to the category "For Faculty." Following the link leads to a page that highlights some of the library services provided exclusively to campus faculty. The professor clicks the link "Faculty Expedited Book Orders" and is taken to a web page that describes the service and provides an online form for requesting a book.

Scenario 3. A student would like to borrow a camera for a class project. Entering "digital cameras" into the main search box on the library homepage produces a link to "Digital Cameras (DSLR)" listed under the "Library Services" heading at the top of the search results. Following the link leads to a web page with information about the library's digital camera offerings. The web page provides links to related services, including the library's video production studio. The student decides to reserve the studio instead of checking out a camera.

ANATOMY OF A SERVICES WEB PAGE
Each Services web page is divided into sections to help users quickly find the type of information they seek. Each section represents an information module with a specific purpose and an identifying design; the sections are color coded and displayed in a consistent order on each page. This helps users to find the same kind of information in the same place on every service page. Major sections include:

• Title
• Description
• Keywords
• Hours
• Location
• Contact
• Overview
• Call to Action
• Frequently Asked Questions
• Additional Resources
• Related Services
• Categories
Some of these pages have been converted to a format that resembles the services layout in an effort to add cohesiveness to the library website. The department pages have sections similar to Services pages such as hours, location, contact information, and an overview with bullet points. The pages can automatically display links to all of the services available in the department. Because department pages are part of the Services application and are connected to services with a relational database, changes to service information remains in sync across the entire website. This helps alleviate the problem of out-of-date department web pages. SEARCHING FOR SERVICES Services can be located by submitting a query in a search box or by following links found on the main Services web page. The Services search engine matches words from the query with words found in a service name or associated tags. Each service is tagged with keywords, phrases, or synonyms to increase the likelihood of successful searching. Users may not be familiar with library jargon and will search for services using a variety of terms. It is impossible to name library services in a way that is understood by everyone, especially since academic library services target both students and faculty. A study on library services and user-centered language found that: “The choices of the graduate students did not always mirror those of the faculty. This highlights the inherent challenge of marketing services—the target audiences for the same service can have very different opinions and preferences.”23 Services can have multi-word phrases assigned in addition to individual keywords. For example, the data management service has the following synonyms assigned: data curation, data management plan, and DMP. New keywords and phrases can be identified by reviewing search queries in the system log files and by conducting usability studies. LIBRARY SERVICES NAVIGATION | RENNICK 22 https://doi.org/10.6017/ital.v38i1.10844 Figure 3. The interlibrary loan service web page. INFORMATION TECHNOLOGY AND LIBRARIES | MARCH 2019 23 In addition to using a search box on the Services web pages, users can search for services using the single search box on the library’s homepage. The single search box returns a link to matching services as part of search results when the search engine recognizes services keywords in a query. The Services application has an API that makes keywords and other service information available to the single search box application. Figure 4. Search for a service from the single search box on the library’s homepage. Figure 5. JSON results from the Services API. To facilitate browsing, services are organized into three groups on the Services web page: Featured Services, Categories, and Service Areas. The Featured Services group highlights the most commonly sought-after services. Categories are organized by the type of service or the target audience. The Service Areas group directs users to services available in library departments or units. The Services web page does not list every service but instead directs users to web pages based on categories or service areas that list individual services. The Services search feature can also include links to non-services. 
The Services search feature can also include links to non-services. For example, library policies are not services, yet users occasionally search for them on the Services page (the library website posts policy documents on the About page). In order to minimize user frustration with searching, links to non-services are included in search results so that users can be redirected to the desired pages.

To help with optimization for external search engines such as Google, each Services page has a user-friendly URL that clearly identifies the service. For example, the 3D printer service has the URL https://lib.byu.edu/services/3d-printers/. Each web page also includes the service name in an embedded HTML title tag.

CONCLUSION
Adopting a broad view of what represents a service has altered the library's approach to the information architecture of the website. The Services web application offers several innovations for improving library service discoverability and maintenance, including:

• Standardized organization of service information
• Attaching keywords/aliases to service descriptions
• An API for integration with the single search box on the homepage
• Links to related services
• Generation of web pages from a relational database

Usability tests were conducted throughout the development of the Services application. Follow-up assessments are planned for the future in order to verify that the application works as expected and to identify potential adjustments to the design. The Services application shows promise as an effective tool for facilitating the discovery of services and increasing the reliability and uniformity of service information.

ACKNOWLEDGEMENTS
The author gratefully acknowledges the contributions of Grant Zabriskie for the original concept and design of the Services application and Ben Crowder for the implementation.

REFERENCES
1 Cory Lown, Tito Sierra, and Josh Boyer, "How Users Search the Library from a Single Search Box," College & Research Libraries 74, no. 3 (May 2013): 227-41, https://doi.org/10.5860/crl-321.
2 Rice Majors, "Comparative User Experiences of Next-Generation Catalogue Interfaces," Library Trends 61, no. 1 (Summer 2012): 186-207, https://doi.org/10.1353/lib.2012.0029.
3 Marcin Kozak and James Hartley, "Writing the Conclusions: How Do Bullet-Points Help?" Journal of Information Science 37, no. 2 (Feb. 2011): 221-24, https://doi.org/10.1177/0165551511399333.
4 Barbara A. Blummer, "A Literature Review of Academic Library Web Page Studies," Journal of Web Librarianship 1, no. 1 (2007): 45-64, https://doi.org/10.1300/J502v01n01_04; Galina Letnikova, "Usability Testing of Academic Library Web Sites: A Selective Annotated Bibliography," Internet Reference Services Quarterly 8, no. 4 (2004): 53-68, https://doi.org/10.1300/J136v08n04_04.
5 Laurel A. Clyde, "The Library as Information Provider: The Home Page," The Electronic Library 14, no. 6 (Dec. 1996): 549-58, https://doi.org/10.1108/eb045522.
6 Laura B.
Cohen and Julie M. Still, "A Comparison of Research University and Two-Year College Library Web Sites: Content, Functionality, and Form," College & Research Libraries 60, no. 3 (1999): 275-89, https://doi.org/10.5860/crl.60.3.275.
7 Yaping Peter Liu, "Web Page Development and Management: A SPEC Kit," Association of Research Libraries (1999): https://hdl.handle.net/2027/mdp.39015042087232.
8 Sharon Q. Yang and Heather A. Dalal, "Delivering Virtual Reference Services on the Web: An Investigation into the Current Practice by Academic Libraries," Journal of Academic Librarianship 41, no. 1 (2015): 68-86, https://doi.org/10.1016/j.acalib.2014.10.003.
9 Barbara I. Dewey, "In Search of Services: Analyzing the Findability of Links on CIC University Libraries' Web Pages," Information Technology and Libraries 18, no. 4 (1999): 210-13, http://www.ala.org/sites/ala.org.acrl/files/content/conferences/pdf/dewey99.pdf.
10 Louise McGillis and Elaine G. Toms, "Usability of the Academic Library Web Site: Implications for Design," College & Research Libraries 62, no. 4 (July 2001): 355-67, https://doi.org/10.5860/crl.62.4.355.
11 Christy Hightower, Julie Shih, and Adam Tilghman, "Recommendations for Benchmarking Web Site Usage among Academic Libraries," College & Research Libraries 59, no. 1 (Jan. 1998): 61-79, https://crl.acrl.org/index.php/crl/article/viewFile/15182/16628.
12 Jason Vaughan, "Three Iterations of an Academic Library Web Site," Information Technology and Libraries 20, no. 2 (June 2001): 81-92, https://search.proquest.com/docview/215832160.
13 David J. Comeaux, "Web Design Trends in Academic Libraries—A Longitudinal Study," Journal of Web Librarianship 11, no. 1 (2017): 1-15, https://doi.org/10.1080/19322909.2016.1230031.
14 Shelly Gullikson et al., "The Impact of Information Architecture on Academic Web Site Usability," The Electronic Library 17, no. 5 (Oct. 1999): 293-304, https://doi.org/10.1108/02640479910330714.
15 Kate E. Adams and Mary Cassner, "Content and Design of Academic Library Web Sites for Distance Learners: An Analysis of ARL Libraries," Journal of Library Administration 37, no. 1/2 (2002): 3-13, https://doi.org/10.1300/J111v37n01_02.
16 Brian Detlor and Vivian Lewis, "Academic Library Web Sites: Current Practice and Future Directions," Journal of Academic Librarianship 32, no. 3 (May 2006): 251-58, https://doi.org/10.1016/j.acalib.2006.02.007.
17 Susan J. Gardner, John Eric Juricek, and F. Grace Xu, "An Analysis of Academic Library Web Pages for Faculty," Journal of Academic Librarianship 34, no. 1 (Jan. 2008): 6-24, https://doi.org/10.1016/j.acalib.2007.11.006.
18 Shu Liu, "Engaging Users: The Future of Academic Library Web Sites," College & Research Libraries 69, no. 1 (Jan. 2008): 6-27, https://doi.org/10.5860/crl.69.1.6.
19 Kevin Beswick, "QuickSearch," North Carolina State University Libraries, accessed Nov. 28, 2018, https://www.lib.ncsu.edu/projects/quicksearch.
20 Cole Hudson and Graham Hukill, "One-To-Many: Building a Single-Search Interface for Disparate Resources," in Exploring Discovery: The Front Door to Your Library's Licensed and Digitized Content, ed. Kenneth J. Varnum (Chicago: ALA Editions, 2016), 141-53, http://digitalcommons.wayne.edu/libsp/114.
21 Andrew K. Pace, "Optimizing Library Web Services: A Usability Approach," Library Technology Reports 38, no. 2 (Mar./Apr. 2002): 1-87, https://doi.org/10.5860/ltr.38n2.
22 Ibid.
23 Allison R.
Benedetti, "Promoting Library Services with User-Centered Language," Libraries & the Academy 17, no. 2 (Apr. 2017): 217-34, https://doi.org/10.1353/pla.2017.0013.

10850 ----

President's Message: Imagination and Structure in Times of Change
Bohyun Kim

Bohyun Kim (bohyun.kim.ois@gmail.com) is LITA President 2018-19 and Chief Technology Officer & Associate Professor, University of Rhode Island Libraries, Kingston, RI.

In my last column, I talked about the discussion that LITA had begun regarding forming a new division to achieve financial sustainability and more transparency, responsiveness, and agility. This proposed new division would merge LITA with ALCTS (Association for Library Collections and Technical Services) and LLAMA (Library Leadership and Management Association). When this topic was brought up and discussed at an open meeting at the 2018 ALA Annual Conference in New Orleans, many members of these three divisions expressed interest and excitement. At the same time, there were many requests for more concrete details. You may recall that as a response to those requests, the Steering Committee, which consists of the Presidents, Presidents-elect, and Executive Directors of the three divisions, decided to form four working groups with the aim of providing more complete information about what the new division would look like.

Today, I am happy to report that the work of the Steering Committee and the four working groups is well underway. The Operations Working Group that I have been chairing for the last two months submitted its recommendations on November 23. The Activities Working Group finished its report on December 5. The Budget and Finance Working Group also submitted its second report. The Communications Working Group continues to engage members of all three divisions by sharing new updates and soliciting opinions and suggestions. Most recently, it started gathering input and feedback on potential names for the new division.1 You can see the charges, member rosters, and current statuses of these four working groups on the 'Current Information' page of the 'ALCTS/LLAMA/LITA Alignment Discussion' community on the ALA Connect website (https://connect.ala.org/communities/allcommunities/all/all-current-information).2

To give you a glimpse of our work preparing for the proposed new division, I would like to share some of my experience leading the Operations Working Group. The Operations Working Group consisted of nine members, three from each division, in addition to myself as the chair and one staff liaison. We quickly became familiar with the organizational and membership structures of the three divisions. The three divisions are similar to one another in size, but they have slightly different structures. LITA has 18 interest groups (IGs), 25 committees, and 4 (current) task forces; LLAMA has 7 communities of practice (COPs) and 46 discussion groups, committees, and task forces; ALCTS has 5 sections, 42 IGs, and 61 committees (20 at the division level and 41 at the section level). All committees and task forces in LITA are division-level, while ALCTS and LLAMA have committees that are either division-level or section/COP-level. ALCTS is unique in that it elects section chairs, who serve on the division board alongside ALCTS directors-at-large. ALCTS also has a separate Executive Committee in addition to the board. LLAMA has self-governed COPs, which are formed by the board's approval.
Among all three, LITA has the flattest and simplest structure due to its intentional efforts in the past. For example, there are neither sections nor communities of practice in LITA, and the LITA board eliminated the Executive Committee a few years ago.

The Steering Committee of the three divisions agreed upon several guiding principles for the potential merger. These include (i) open, flexible, and straightforward member engagement; (ii) simplified and streamlined processes; and (iii) a governance and coordinating structure that engages members and staff in meaningful and productive work. The challenge is how to translate those guiding principles into a specific organizational structure, membership structure, and bylaws. Clearly, some shuffling of existing sections, COPs, and IGs in the three divisions will be necessary to make the new division as effective, agile, and responsive as promised. However, when and how should such consolidation take place? Furthermore, what kind of guidance should the new division provide for members to reorganize themselves into a new and better structure? These are not easy questions to answer, nor are they questions that can be immediately answered. Some changes may require going through multiple stages to be completed. This may concern some members, who may prefer all these questions to have definitive answers before they decide whether or not to support the proposed new division.

People often assume that a change takes place after a big vision is formed, and that the change is then executed by a clear plan that directly translates that vision into reality in an orderly fashion. However, that is rarely how a change takes place in reality. More often than not, a possible change builds up its own pressure, showing up in a variety of forms on multiple fronts by many different people while getting stronger, until the idea of this change gains enough urgency. Finally, some vision of the change is crafted to give a form to that idea. The vision for a change also does not materialize in one fell swoop. It often begins with incomplete details and ideas that may even conflict with one another in its first iteration. It is up to all of us to sort them out and make them consistent, so that they become operational in the real world.

Recently, the Steering Committee reached an agreement regarding the final version of the mission, vision, and values of the proposed new division. I hope these resonate with our members and guide us well in navigating the challenges ahead if the membership votes in favor of the proposal.

The New Division's Mission: We connect library and information practitioners in all career stages and from all organization types with expertise, colleagues, and professional development to empower transformation in technology, collections, and leadership, and to advocate for access to information for all.

The New Division's Vision: We shape the future of libraries and catalyze innovation across boundaries. The New Division [name to be determined] amplifies diverse voices and advocates for equal and equitable access to information for all.
The New Division's Values: Shared and celebrated expertise; strategically chosen work that makes a difference; transparent, equitable, flexible, and inclusive structures; an empowering framework for experimental and proven approaches; intentional amplification of diverse perspectives; expansive collaboration to become better together.

In deciding on all operational and logistical details for the new division, the most important criteria will be whether a proposed change will advance the vision and mission of the new division and how well it aligns with the agreed-upon values and guiding principles. The Steering Committee and the working groups are busy finalizing the details about the new division. Those details will first be reviewed by the board of each division and then shared with the membership at Midwinter for feedback.

I did not anticipate that during my service as the LITA President-Elect and President, I would be leading a change as great as dissolving LITA and forming a new division with two other divisions, ALCTS and LLAMA. It has been an adventure filled with many surprises, difficulties, and challenges, to say the least. This adventure taught me a great deal about leading a change for an organization at a high level. When we move from the high-level vision of a change to the matter of details deep in the weeds, it is easy to lose sight of the original aspiration and goal that led us to the change in the first place. Trying to determine as many logistical details as possible becomes tempting to those in a leadership role, because we all want to assure people in our organizations at a time of uncertainty and to make the transition smooth. However, creating a new division itself is a huge change at the highest level. It would be wrong to backtrack on the original goal to make the transition smooth, for it is the original goal that requires a transition, not vice versa. I believe those in a leadership role should accept that their most important work during a time of change is not to try to wrangle logistics at all levels but to keep things on track and moving in the direction of the original aspiration and goal. LITA and the two other divisions have many talented and capable members who will be happy to lend a hand in developing new logistics. The responsibility of leaders is to create space where those people can achieve that freely and swiftly and to provide the right amount of framework and guidance. I hope that all LITA members and those associated and involved with LITA see themselves in the vision, mission, and values of the new division, embrace changes from the lowest to the highest level, and work towards making the new vision into reality together.

1 You can participate in this process at https://connect.ala.org/communities/community-home/digestviewer/viewthread?GroupId=109804&MessageKey=625e8823-21e0-419c-ab2b-1cb4a82b8d09 and http://www.allourideas.org/newdivisionname.
2 This 'Current Information' page will be updated as the plans for the new division develop.
10852 ----

Letter from the Editor
Kenneth J. Varnum

As 2018 draws to a close, so does our celebration of Information Technology and Libraries' 50th anniversary. In the final "ITAL at 50" column, Editorial Board member Steven Bowers takes a look at the 1990s. Much as for Steven, for me this decade was where my career direction and interests crystallized around the then-newfangled "World Wide Web." Taking a look at the topics covered in ITAL over those ten years, it's clear that plus ça change, plus c'est la même chose: the more things change, the more they stay the same. We were exploring then questions of how the burgeoning Internet would allow libraries to provide new services and be more efficient and helpful in improving existing ones. User experience, distributed data and the challenges that causes, who has access to technology and who does not... All topics as vibrant and concerning then as they are now.

With the end of our look back at the last 50 years, we are taking the opportunity to start something new in 2019. There will be a new quarterly column, "Public Libraries Leading the Way," to highlight a technology-based innovation from a public library perspective. Topics we are interested in include the following, but proposals on any other technology topic are welcome.

• Virtual and augmented reality
• Artificial intelligence
• Big data
• Internet of things
• 3-D printing and makerspaces
• Robotics
• Drones
• Geographic information systems and mapping
• Diversity, equity, and inclusion and technology
• Privacy and cyber-security
• Library analytics and data-driven services
• Anything else related to public libraries and innovations in technology

Columns will be in the 1,000-1,500 word range and may include illustrations. These will not be research articles, but are meant to share practical experience with technology development or uses within the library. If you are interested in contributing a column, please submit a brief summary of your idea (https://goo.gl/forms/mCZ2KdLtiwYpsnQ43). I'm grateful to the ITAL Editorial Board, and especially to Ida Joiner and Laurie Willis, for their guidance in shaping this concept.

Regardless of whether you work in a public, or any other, library, I'm always happy to talk with you about how your experience and knowledge could be published as an article in ITAL. Get in touch with me at varnum@umich.edu.

Kenneth J. Varnum, Editor
varnum@umich.edu
December 2018

10875 ----

Articles
50 Years of ITAL/JLA: A Bibliometric Study of Its Major Influences, Themes, and Interdisciplinarity
Brady Lund

Brady Lund (blund2@g.emporia.edu) is a PhD student at Emporia State University's School of Library and Information Management.
ABSTRACT
Over five decades, Information Technology and Libraries (and its predecessor, the Journal of Library Automation) has influenced research and practice in library and information technology. From its inception on, the journal has been consistently ranked as one of the superior publications in the profession and a trendsetter for all types of librarians and researchers. This research examines ITAL using a citation analysis of all 878 peer-reviewed feature articles published over the journal's 51 volumes. Impactful authors, articles, publications, and themes from the journal's history are identified. The findings of this study provide insight into the history of ITAL and potential topics of interest to ITAL authors and readership.

INTRODUCTION
Fifty-one years have passed since the first publication of the Journal of Library Automation (JLA), the precursor to Information Technology and Libraries (ITAL), in 1968: 51 volumes, 204 issues, and 878 feature articles. Information technology and its use within libraries has evolved dramatically since the first volume, as has the content of the journal itself. Given the interdisciplinary nature of Library and Information Science (LIS) and ITAL, and the celebration of this momentous achievement, an examination of the journal's evolution, based on the authors, publishers, and works that have influenced its content, seems apropos. The following article presents a comprehensive study of all 7,575 references listed for the 878 articles (~8.6 refs/article average) published over ITAL's fifty years, identifying the authors and publishers whose work has been cited the most in the journal, the major themes in the cited publications, and an evaluation of the interdisciplinarity of references in ITAL publications. This study not only frames the history of the ITAL journal, but demonstrates an evolution of the journal that suggests new paths for future inquiry.

CONCEPTUAL FRAMEWORK
A major influence for the organization and methodology of this paper is Imad Al-Sabbagh's 1987 dissertation from Florida State University's School of Library and Information Studies, The Evolution of the Interdisciplinarity of Information Science: A Bibliometric Study.1 In this study, Al-Sabbagh sought to examine the interdisciplinary influences on the burgeoning field of information science by examining the references of the Journal of the American Society for Information Science (JASIS), today known as the Journal of the Association for Information Science and Technology (JASIST). In Al-Sabbagh's study, a sample of ten percent of JASIS references was selected for examination.2 The references were sorted into disciplines based on the definitions supplied by Dewey Decimal Classification categories, with the counts for each discipline compared to the total number of sampled references to derive percentages (e.g., if 150 references of 1,000 total JASIS references examined belonged to the category of library science, then 15 percent of references belonged to library science, and so on for all disciplines). The present study deviates slightly from Al-Sabbagh's in that it does not use a sampling method. Instead, all 878 articles published in JLA/ITAL and their 7,575 references will be examined.
The categories for disciplines, instead of being based on Dewey Decimal Classification, will be based on definitions derived from the Encyclopedia Britannica, and will include new disciplines that were not used in Al-Sabbagh's original analysis, such as Information Systems and Instructional Design.3 Additionally, the major authors, publishers, and articles cited throughout JLA/ITAL's history will be identified; this was not done in Al-Sabbagh's study, but will likely provide additional beneficial information for researchers and potential contributors to ITAL.

ITAL is an ideal publication to study using Al-Sabbagh's methodology, in that it is affiliated with librarianship and library science but, due to its content, is also closely associated with the disciplines of information science, computer science, information systems, instructional design, psychology, and many others. ITAL is likely one of the more interdisciplinary journals to still fall within the category of "library science." In fact, as part of Al-Sabbagh's 1987 study, he distributed a survey to several prominent information science researchers, asking them to name journals relevant to information science (this method was used to determine that JASIS was the most representative journal for the discipline of information science). On the list of 31 journals compiled from the respondents' rankings, ITAL ranked as the seventh most representative journal for information science, above Datamation, Scientometrics, JELIS, and Library Hi-Tech.4 This shows that, for a long time, ITAL has been considered an important journal not just in library science, but in information science and likely beyond.

Key Terminology
While the findings of this study are pertinent to the ITAL reader, some of the terminology used throughout the study may be unfamiliar. To acclimate the reader to the terminology used in this study, brief definitions for key concepts are provided below.

Bibliometrics. "Bibliometrics" is the statistical study of properties of documents.5 The present study constitutes a "citation analysis," a type of bibliometric analysis that examines the citations in a document and what they can reveal about said document.

Cited Publications. "Cited publications" are the references ("publications") listed at the end of a journal article.6 The purpose of Al-Sabbagh's study (and the present study) is to analyze these cited publications to determine what disciplines influenced the research published in a specific journal. This bibliometric analysis methodology is distinct from those that examine the influence of a specific journal on a discipline (i.e., the present study looks at what disciplines influenced ITAL, not what disciplines are influenced by ITAL).

Discipline. In this study, the term "discipline" is used liberally to refer to any area of study that is presently or was historically offered at an institution of higher education (sociology, anthropology, education, etc.). In this study, library science and information science are considered distinct disciplines (as was the case with Al-Sabbagh's study).7
LITERATURE REVIEW
The type of citation analysis used by Al-Sabbagh, and adopted as the basis of the current study, is used frequently to examine the interdisciplinarity of library and information science and of specific LIS journals, as noted by Huang and Chang.8 Tsay used a methodology similar to Al-Sabbagh's to examine cited publications in the 1980, 1985, 1990, 1995, 2000, and 2004 volumes of JASIST. In this study, the researcher found that about one-half of the citations in JASIST came from the field of LIS.9 Butler examined LIS dissertations using a similar approach, finding that about 50 percent of the cited publications in the dissertations originated in LIS, with education, computer science, and health science following in the second, third, and fourth positions.10 Chikate and Paul and Chen and Liang conducted similar studies of dissertations in India and Taiwan.11 Each study found different degrees of interdisciplinarity, possibly indicating a fluctuation within the discipline of LIS based on the publication type, country of origin, etc. of the publications used in the study.

Several researchers have used these methods recently to examine library and information science journals, such as Chinese Librarianship,12 Pakistan Journal of Library and Information Science,13 Library Philosophy and Practice,14 and the Journal of Library and Information Technology.15 These studies are more common for journals published outside of the United States, but there is no reason why the methodology would not hold true for a U.S.-based journal like ITAL.

Recently, publications in a wide array of fields have used methodologies similar to Al-Sabbagh's to evaluate interdisciplinarity in a discipline. Ramos-Rodriguez and Ruiz-Navarro (2004) examined reference trends in the Strategic Management Journal.16 Fernandez-Alles and Ramos-Rodriguez (2009) conducted a bibliometric analysis to identify those publications most frequently cited in the journal Human Resource Management.17 Crawley-Low (2006) used a similar methodology to identify the core (most frequently cited) journals in veterinary medicine from the American Journal of Veterinary Research.18 These studies show a growing interest in the use of citation analysis to present new information about a publication to potential authors, editors, and readers.

Jarvelin and Vakkari (1993) noted trends in LIS from 1965 to 1985 based on an examination of cited publications in LIS journals. The authors noted a trend of interest in the topic of information storage and retrieval, with a de-emphasis on classification and indexing and a strengthened emphasis on information systems and retrieval.19 This study deviated from Al-Sabbagh's and related studies of interdisciplinarity—though it employed a similar methodology—in that it examined trends or subtopics within the discipline of LIS. Though it is not a primary focus of the present study, the use of subtopics to further divide the discipline of library science and examine which aspects (management, technology, cataloging, reference) of the discipline are the focus of cited publications is incorporated in several tables in the results section.

METHODS
All references from the 878 articles published in the JLA/ITAL journals (n=7,575) were transcribed to an Excel spreadsheet for analysis (this spreadsheet can be found as a supplemental file [https://ejournals.bc.edu/index.php/ital/article/view/10875/9469]).
The spreadsheet includes separate columns for the primary author, title, publisher, and discipline of each reference. The list of disciplines with their definitions, derived from the Encyclopedia Britannica, is displayed in table 1 below.

Table 1. Definitions of Disciplines Used for This Study.

Library Science: The principles and practices of library operation and administration, and their study.
Information Science: The discipline that deals with the processes of storing and transferring information.
Information Systems: The study of the integrated set of components for collecting, storing, and processing data and for providing information, knowledge, and digital products.
Computer Science: The study of computers, including their design (architecture) and their uses for computations, data processing, and systems control.
Engineering: The application of science to the optimum conversion of the resources of nature to the uses of humankind.
Instructional Design: The systematic development of instructional specifications using learning and instructional theory to ensure the quality of instruction.
Education: The discipline that is concerned with methods of teaching and learning in schools or school-like environments, as opposed to various nonformal and informal means of socialization.
Government: Resources produced within the political system by which a country or community is administered and regulated.
Sociology: A social science that studies human societies, their interactions, and the processes that preserve and change them.
Popular: Newspaper, magazine, and media reports that do not fit better within another category.
Philosophy: The rational, abstract, and methodical consideration of reality as a whole or of fundamental dimensions of human existence and experience.
Psychology: The scientific discipline that studies mental states and processes and behaviour in humans and other animals.
Corporate: Business, corporate, and private-organization publications that do not fit better within another category.
Archival Science: The study of the repository for an organized body of records produced or received by a public, semipublic, institutional, or business entity in the transaction of its affairs and preserved by it or its successors.
Management: The study of the process of dealing with or controlling things or people.
Linguistics: The scientific study of language.
Literature: The art of creation of a written work.
Law: The discipline and profession concerned with the customs, practices, and rules of conduct of a community that are recognized as binding by the community.
Mathematics: The science of structure, order, and relation that has evolved from elemental practices of counting, measuring, and describing the shapes of objects (also includes statistics).
Health Science: The study of humans and the extent of an individual's continuing physical, emotional, mental, and social ability to cope with his or her environment.
Communication Science: The study of the exchange of meanings between individuals through a common system of symbols.
Geography: The study of the diverse environments, places, and spaces of Earth's surface and their interactions.
Physics: The science that deals with the structure of matter and the interactions between the fundamental constituents of the observable universe.
Art/Design: The study of the nature of art, including such concepts as interpretation, representation and expression, and form.
Economics: The social science that seeks to analyze and describe the production, distribution, and consumption of wealth.
Biology: The study of living things and their vital processes.
Museum Studies: The study of institutions dedicated to preserving and interpreting the primary tangible evidence of humankind and the environment.
Music: The art concerned with combining vocal or instrumental sounds for beauty of form or emotional expression, usually according to cultural standards of rhythm, melody, and, in most Western music, harmony.
Chemistry: The science that deals with the properties, composition, and structure of substances (defined as elements and compounds), the transformations they undergo, and the energy that is released or absorbed during these processes.
Science and Technology Studies: The study, from a philosophical perspective, of the elements of scientific inquiry.
Journalism: The collection, preparation, and distribution of news and related commentary and feature materials through such print and electronic media as newspapers, magazines, books, blogs, webcasts, podcasts, social networking and social media sites, and e-mail, as well as through radio, motion pictures, and television.
Anthropology: The study of human beings in aspects ranging from the biology and evolutionary history of Homo sapiens to the features of society and culture that decisively distinguish humans from other animal species.

To determine the discipline in which a cited publication would be classified, the researcher used the cited publication's title, abstract, and journal to select the most appropriate discipline from the table. In those cases where a source could not be easily identified as falling within one specific discipline, the researcher conferred with additional reviewers (professional librarians) to determine the best fit.

Several analyses of these data were conducted to explore various aspects of JLA/ITAL's publication history. For the complete data of the publication's 51 volumes, the top ten most referenced authors, articles, publishers (journals/publishing houses/organizations/websites), and disciplines were identified with the aid of Excel's functions. The same was done separately for both JLA's 14 volumes and ITAL's 37 volumes, allowing for a comparison of the journal before and after the 1982 rebranding. The 51 volumes of JLA/ITAL were also divided into the five decades of the journal's history: 1968-77, 1978-87, 1988-97, 1998-2007, and 2008-18 (eleven volumes instead of ten). For each of these decades, the top ten authors, publishers, and disciplines were identified, and for each of the three categories a table was created to show the top ten of each decade side by side.

Lastly, the titles of the 7,575 cited publications in JLA/ITAL articles were examined using a content analysis to identify major concepts and themes that appear to have influenced JLA/ITAL articles. NVivo content analysis software was utilized for this analysis. Titles were fed from the Excel spreadsheet into the NVivo software, and the word frequency tools were used to identify the most frequently used terms and "generalizations," or types or themes of statements in the titles.20
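For readers who wish to rerun or extend these tabulations, the same counts can be produced programmatically from the supplemental spreadsheet. The sketch below is a minimal Python (pandas) equivalent, not the procedure actually used in this study; the file name and column names are illustrative, and the word-frequency step is only a rough stand-in for NVivo's tools.

import re
from collections import Counter
import pandas as pd

# Load the transcribed references (file and column names are illustrative).
refs = pd.read_csv("jla_ital_references.csv")

# Top ten most-referenced authors, publishers, and disciplines.
for column in ["Author", "Publisher", "Discipline"]:
    print(refs[column].value_counts().head(10))

# Rough equivalent of the NVivo word-frequency step: term counts over the
# titles of cited publications, ignoring a handful of common stopwords.
stopwords = {"the", "a", "an", "of", "and", "for", "in", "on", "to", "its"}
tokens = (word
          for title in refs["Title"].dropna()
          for word in re.findall(r"[a-z]+", title.lower())
          if word not in stopwords)
print(Counter(tokens).most_common(10))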
RESULTS

Table 2 displays the top ten most-cited authors, articles, publishers/publications, and disciplines throughout the journal's fifty-year history. Among the authors, four of the top six are associated with two institutions: the Library of Congress and OCLC. The top ten include four corporate or nonprofit organizations and three academics (associated with institutions of higher education); among the individuals are two women and four men. Of the top ten articles, eight were published before 1973; five appeared in journals (three of them in JLA/ITAL) and five in other (non-journal) publications. Of the top ten publishers, seven are journals; five of the publishers are directly associated with library science. Within the disciplines, LIS represents 60 percent of the total. There are 31 total disciplines represented throughout the 51 volumes, a greater number of disciplines than identified in Al-Sabbagh's study of JASIST.

Table 3 displays the results for JLA. JLA emerged at the same time as Machine-Readable Cataloging (MARC) and OCLC, and this is evident in the authors, articles, and publishers cited in the journal. During this phase of the journal's history, the top three authors—Fred Kilgour, the Library of Congress, and Henriette Avram—dominated the citations. These three authors were cited more than the next seven combined (143 to 101). The cited publications during this period reflected a focus on systems, corporate, and government publications.

Results for the 37 volumes of ITAL are displayed in table 4. During this period, Marshall Breeding emerged as one of the biggest influences on information technology and libraries. All but two of the top articles (Larson and Bizer) were written before 1985. While six publishers were the same as with JLA, three of these six (Library of Congress, Association for Computing Machinery, and College and Research Libraries) changed position in the top ten. The disciplines of information systems, psychology, education, and instructional design rose, while government, corporate, management, linguistics, and electrical engineering dropped; library science, information science, and computer science remained at the top.

Table 2. Overall Most Cited.

Top ten authors (affiliation): 1. U.S. Library of Congress; 2. Fred G. Kilgour (OCLC); 3. Henriette D. Avram (Library of Congress); 4. American Library Association; 5. IBM: International Business Machines; 6. Ohio College Library Center/Online Computer Library Center (OCLC); 7. Marshall Breeding (Vanderbilt University/Independent); 8. Jakob Nielsen (Independent); 9. Karen Markey (University of Michigan); 10. Walt Crawford (Research Libraries Group/Independent).

Top ten articles:
1. American Library Association. (1967). Anglo-American cataloging rules. Chicago, IL: American Library Association.
2. Avram, H. D. (1968). The MARC II format: A communications format for bibliographic data. Washington, DC: Library of Congress.
3. Ruecking Jr., F. H. (1968). Bibliographic retrieval from bibliographic input; the hypothesis and construction of a test. Information Technology and Libraries, 1(4), 227-238.
4. Kilgour, F. G., Leiderman, E. B., & Long, P. L. (1971). Retrieval of bibliographic entries from a name-title catalog by use of truncated search keys. Ohio College Library Center.
5. Kilgour, F. G. (1968). Retrieval of single entries from a computerized library catalog file. Proceedings of the American Society for Information Science, 5, 133-136.
6. Long, P. L., & Kilgour, F. (1972). A truncated search key title index. Information Technology and Libraries, 5(1), 17-20.
7. Hildreth, C. R. (1982). Online public access catalogs: The user interface. OCLC Online Computer Library Center, Inc.
8. Nugent, W. R. (1968). Compression word coding techniques for information retrieval. Information Technology and Libraries, 1(4), 250-260.
9. Curwen, A. G. (1990). International Standard Bibliographic Description. In Standards for the international exchange of bibliographic information: Papers presented at a course held at the School of Library, Archive and Information Studies, University College London (pp. 3-18).
10. Fasana, P. J. (1963). Automating cataloging functions in conventional libraries (No. ISL-9028-37). Lexington, MA: ITEK Corp Information Sciences Lab.

Top ten publishers: 1. ITAL/JLA; 2. ASIST; 3. Association for Computing Machinery; 4. College and Research Libraries; 5. Library of Congress; 6. American Library Association; 7. Library Resources and Technical Services; 8. Library Hi-Tech; 9. Library Journal; 10. OCLC.

Top ten disciplines: 1. Library Science—Technology; 2. Information Science; 3. Library Science—Cataloging; 4. Computer Science; 5. Information Systems; 6. Library Science—General; 7. Government; 8. Library Science—Administration; 9. Instructional Design; 10. Library Science—Academic.

Top ten disciplines with percentages: 1. Library Science (44%); 2. Information Science (16%); 3. Computer Science (8%); 4. Information Systems (7%); 5. Government (3%); 6. Instructional Design (3%); 7. Corporate (2%); 8. Education (2%); 9. Psychology (2%); 10. Sociology (2%).

Table 3. JLA Most Cited.

Top ten authors (affiliation): 1. Fred G. Kilgour (OCLC); 2. U.S. Library of Congress; 3. Henriette D. Avram (Library of Congress); 4. IBM: International Business Machines; 5. American Library Association; 6. William R. Nugent (Inforonics, Inc.); 7. Paul J. Fasana (Columbia University); 8. Philip L. Long (OCLC); 9. Martha E. Williams (University of Illinois); 10. University of California.

Top ten articles:
1. Avram, H. D. (1968). The MARC II format: A communications format for bibliographic data. Washington, DC: Library of Congress.
2. American Library Association. (1967). Anglo-American cataloging rules. Chicago, IL: American Library Association.
3. Ruecking Jr., F. H. (1968). Bibliographic retrieval from bibliographic input; the hypothesis and construction of a test. Journal of Library Automation, 1(4), 227-238.
4. Kilgour, F. G., Leiderman, E. B., & Long, P. L. (1971). Retrieval of bibliographic entries from a name-title catalog by use of truncated search keys. Ohio College Library Center.
5. Long, P. L., & Kilgour, F. (1972). A truncated search key title index. Journal of Library Automation, 5(1), 17-20.
6. Kilgour, F. G. (1968). Retrieval of single entries from a computerized library catalog file. Proceedings of the American Society for Information Science, 5, 133-136.
7. Livingston, L. G. (1973). International standard bibliographic description for serials. Library Resources and Technical Services, 17(3), 293-298.
8. Fasana, P. J. (1963). Automating cataloging functions in conventional libraries (No. ISL-9028-37). Lexington, MA: ITEK Corp Information Sciences Lab.
9. Nugent, W. R. (1968). Compression word coding techniques for information retrieval. Journal of Library Automation, 1(4), 250-260.
10. Avram, H. D. (1970). The RECON pilot project: A progress report. Journal of Library Automation, 3(2), 102-114.

Top ten publishers: 1. Journal of Library Automation; 2. ASIST; 3. Library of Congress; 4. Library Resources and Technical Services; 5. IBM; 6. American Library Association; 7. Association for Computing Machinery; 8. University of Illinois Press; 9. College and Research Libraries; 10. Special Libraries.

Top ten disciplines: 1. Library Science—Technology; 2. Information Science; 3. Library Science—Cataloging; 4. Library Science—General; 5. Computer Science; 6. Government; 7. Corporate; 8. Information Systems; 9. Library Science—Academic; 10. Library Science—Special.

Top ten disciplines with percentages: 1. Library Science (58%); 2. Information Science (14%); 3. Computer Science (6%); 4. Government (5%); 5. Corporate (5%); 6. Information Systems (4%); 7. Management (2%); 8. Linguistics (1%); 9. Electrical Engineering (1%); 10. Psychology (1%).

Table 4. ITAL Most Cited.

Top ten authors (affiliation): 1. U.S. Library of Congress; 2. American Library Association; 3. Marshall Breeding (Vanderbilt University/Independent); 4. Jakob Nielsen (Independent); 5. Karen Markey (University of Michigan); 6. OCLC; 7. Walt Crawford (Research Libraries Group/Independent); 8. Clifford A. Lynch (University of California/Coalition for Networked Information); 9. Charles R. Hildreth (READ Ltd.); 10. J. R. Matthews (San Jose State University/Independent).

Top ten articles:
1. American Library Association. (1967). Anglo-American cataloging rules. Chicago, IL: American Library Association.
2. Hildreth, C. R. (1982). Online public access catalogs: The user interface. OCLC Online Computer Library Center, Inc.
3. Markey, K. (1984). Subject searching in library catalogs. OCLC Online Computer Library Center.
4. Malinconico, S. M. (1979). Bibliographic data base organization and authority file control. Wilson Library Bulletin, 54(1), 36-45.
5. Matthews, J. R., Lawrence, G. S., & Ferguson, D. (1983). Using online catalogs: A nationwide survey. Neal-Schuman Publishers, Inc.
6. Bizer, C., Heath, T., & Berners-Lee, T. (2011). Linked data: The story so far. In Semantic services, interoperability and web applications: Emerging concepts (pp. 205-227). IGI Global.
7. Tolle, J. E. (1983). Current utilization of online catalogs: Transaction log analysis. Volume I of three volumes. Final report.
8. Larson, R. R. (1991). The decline of subject searching: Long-term trends and patterns of index use in an online catalog. Journal of the American Society for Information Science, 42(3), 197-215.
9. Markey, K. (1983). Online catalog use: Results of surveys and focus group interviews in several libraries. Volume II of three volumes. Final report.
10. Ludy, L. E., & Logan, S. J. (1982). Integrating authority control in an online catalog. American Society for Information Science Meeting, 19, 176-178.

Top ten publishers: 1. Information Technology and Libraries; 2. ASIST; 3. Association for Computing Machinery; 4. College and Research Libraries; 5. Library Hi-Tech; 6. American Library Association; 7. Ohio College Library Center; 8. Journal of Academic Librarianship; 9. Library Journal; 10. Library of Congress.

Top ten disciplines: 1. Library Science—Technology; 2. Information Science; 3. Library Science—Cataloging; 4. Computer Science; 5. Information Systems; 6. Instructional Design; 7. Library Science—Administration; 8. Library Science—General; 9. Library Science—Academic; 10. Government.

Top ten disciplines with percentages: 1. Library Science (41%); 2. Information Science (16%); 3. Computer Science (9%); 4. Information Systems (7%); 5. Instructional Design (3%); 6. Government (2%); 7. Education (2%); 8. Sociology (2%); 9. Psychology (2%); 10. Management (2%).

The top ten authors of each decade are shown in table 5. For the first two decades, Fred Kilgour was a dominant influence, receiving 15 more citations than the next closest author (the Library of Congress). In the third decade, Kilgour dropped entirely from the top ten and was supplanted at the top spot by Karen Markey, professor at the University of Michigan.
During the fourth decade, in the wake of the Children's Internet Protection Act (CIPA) and the U.S. Patriot Act, the Library of Congress rose to the top spot, and John Bertot and Paul Jaeger, who wrote extensively on these topics and their legal, social, and administrative implications, rose up the list. Web resources, such as Google, also began to emerge in the fourth decade. In the final decade, Breeding, who writes on library systems as well as information technology in general, rose to the top spot. Tim Berners-Lee, one of the pioneers of the Internet and linked data, took the second spot. Jakob Nielsen, known for his contributions to usability testing, appears in the top three of the rankings for both the fourth and fifth decades. Only the Library of Congress and the American Library Association appear in the top ten list for all five decades.

Table 5. Top Ten Authors of Each Decade.

1968-77: 1. Fred G. Kilgour (OCLC); 2. U.S. Library of Congress; 3. Henriette D. Avram (Library of Congress); 4. IBM: International Business Machines; 5. American Library Association; 6. Paul J. Fasana (Columbia University); 7. William R. Nugent (Inforonics, Inc.); 8. University of California; 9. Kenneth J. Bierman (Oklahoma State University/University of Nevada-Las Vegas); 10. Robert M. Hayes (University of California-Los Angeles).

1978-87: 1. Fred G. Kilgour (OCLC); 2. Robert De Gennaro (Harvard University/University of Pennsylvania); 3. Henriette D. Avram (Library of Congress); 4. IBM: International Business Machines; 5. S. Michael Malinconico (New York Public Library/University of Alabama); 6. U.S. Library of Congress; 7. Frederick W. Lancaster (University of Illinois); 8. Allen B. Veaner (Stanford University/University of California); 9. Alan L. Landgraf (OCLC); 10. American Library Association.

1988-97: 1. Karen Markey (University of Michigan); 2. U.S. Library of Congress; 3. Clifford A. Lynch (University of California/Coalition for Networked Information); 4. Michael K. Buckland (University of California); 5. American Library Association; 6. Christine L. Borgman (University of California-Los Angeles); 7. Charles R. Hildreth (READ Ltd.); 8. Joseph R. Matthews (San Jose State University/Independent); 9. Walt Crawford (Research Libraries Group/Independent); 10. Lois M. Chan (University of Kentucky).

1998-2007: 1. U.S. Library of Congress; 2. Jakob Nielsen (Independent); 3. John C. Bertot (University of Maryland); 4. OCLC; 5. Paul T. Jaeger (University of Maryland); 6. Walt Crawford (Research Libraries Group/Independent); 7. American Library Association; 8. Roy Tennant (University of California/OCLC); 9. Google; 10. Thomas B. Hickey (OCLC).

2008-18: 1. Marshall Breeding (Vanderbilt University/Independent); 2. Tim Berners-Lee (W3 Consortium/University of Oxford/Massachusetts Institute of Technology); 3. Jakob Nielsen (Independent); 4. U.S. Library of Congress; 5. American Library Association; 6. National Information Standards Organization; 7. U.S. Government; 8. John C. Bertot (University of Maryland); 9. OCLC; 10. Jung-Ran Park (Drexel University).

JLA/ITAL appears as the most cited publisher in all decades except the fourth, as shown in table 6. During that decade, ACM and JASIST rose above ITAL, and Library Journal and websites (considered in this study as a collective group) emerged on the list. Library Journal was a frequently used source for Bertot and Jaeger, who authored several ITAL articles during this period. There were also more articles about the Internet, digital libraries, Google and Google Scholar, and the future of libraries during the fourth decade. JASIST appears in the top four of every decade, though it declined in the fifth decade of ITAL. OCLC, IBM, College and Research Libraries, Cataloging and Classification Quarterly, Journal of Academic Librarianship, Library Resources and Technical Services, and Library Hi-Tech all appear in multiple decades of this list.

Table 6. Top Ten Publishers of Cited Articles for Each Decade.

1968-77: 1. JLA; 2. Library of Congress; 3. JASIST; 4. Library Resources and Technical Services; 5. IBM; 6. American Library Association; 7. Special Libraries; 8. College and Research Libraries; 9. Association for Computing Machinery; 10. University of Illinois Press.

1978-87: 1. JLA/ITAL; 2. JASIST; 3. Library Journal; 4. OCLC; 5. University of Illinois Press; 6. Library of Congress; 7. Library Resources and Technical Services; 8. American Library Association; 9. Prentice-Hall; 10. IBM.

1988-97: 1. ITAL; 2. JASIST; 3. College and Research Libraries; 4. American Library Association; 5. Library Resources and Technical Services; 6. OCLC; 7. Library of Congress; 8. Library Hi-Tech; 9. Journal of Academic Librarianship; 10. Cataloging and Classification Quarterly.

1998-2007: 1. Association for Computing Machinery; 2. JASIST; 3. ITAL; 4. College and Research Libraries; 5. American Library Association; 6. Library Journal; 7. Journal of Academic Librarianship; 8. General Websites; 9. Library Hi-Tech; 10. OCLC.

2008-18: 1. ITAL; 2. Library Hi-Tech; 3. Association for Computing Machinery; 4. JASIST; 5. Journal of Academic Librarianship; 6. College and Research Libraries; 7. Computers in Libraries; 8. D-Lib; 9. Cataloging and Classification Quarterly; 10. IEEE.

As shown in table 7, library science and information science maintained the first and second positions for every decade of JLA/ITAL's publication, while computer science and information systems jockeyed for the third and fourth positions every decade except the first (when government reports had a major impact on the journal). Government and corporate (IBM particularly) were important in the first three decades but were replaced by instructional design and education in the final two decades. Sociology appears in four of five decades, while psychology appears in three of five. In the first two decades, electrical engineering (as it applied to the design of computer systems) rounded out the top ten; law emerged in decade four, following CIPA and the Patriot Act; in the final decade, with the discussion of Encoded Archival Description in ITAL, archival science rose to the tenth spot.

Table 7. Top Ten Disciplines of Each Decade (Library Science Subcategories Combined).

1968-77: 1. Library Science; 2. Information Science; 3. Computer Science; 4. Government; 5. Corporate; 6. Information Systems; 7. Management; 8. Linguistics; 9. Electrical Engineering; 10. Chemistry.

1978-87: 1. Library Science; 2. Information Science; 3. Information Systems; 4. Computer Science; 5. Corporate; 6. Government; 7. Management; 8. Sociology; 9. Psychology; 10. Electrical Engineering.

1988-97: 1. Library Science; 2. Information Science; 3. Computer Science; 4. Information Systems; 5. Government; 6. Philosophy; 7. Sociology; 8. Literature; 9. Psychology; 10. Education.

1998-2007: 1. Library Science; 2. Information Science; 3. Computer Science; 4. Information Systems; 5. Instructional Design; 6. Education; 7. Corporate; 8. Sociology; 9. Philosophy; 10. Law.

2008-18: 1. Library Science; 2. Information Science; 3. Information Systems; 4. Computer Science; 5. Instructional Design; 6. Psychology; 7. Government; 8. Education; 9. Sociology; 10. Archival Science.

Table 8 compares all disciplines (including subcategories of library science) in the first decade of JLA/ITAL and the fifth decade.
Compared to the first decade, the fifth decade saw greater diversification of subtopics under library science, which led to "information science" supplanting "library science—technology" in the top spot. Instructional design and archival science emerged from disciplines not discussed in the first decade to become some of the most important disciplines of the fifth decade. The library science subtopics of accessibility and teaching grew significantly as the roles of the librarian evolved.

Table 8. First Ten Years vs. Last Eleven Years: Disciplines (with subcategories of library science).

1968-77: 1. Library Science—Technology; 2. Information Science; 3. Library Science—Cataloging; 4. Library Science—General; 5. Computer Science; 6. Government; 7. Corporate; 8. Library Science—Academic; 9. Information Systems; 10. Library Science—Special; 11. Management; 12. Linguistics; 13. Electrical Engineering; 14. Library Science—Medical; 15. Popular; 16. Library Science—Reference; 17. Chemistry; 18. Physics; 19. Engineering—General; 20. Psychology; 21. Mathematics; 22. Library Science—Local; 23. Communication Science; 24. Health Science; 25. Library Science—Accessibility; 26. Library Science—School; 27. Philosophy; 28. Library Science—Administration; 29. Journalism; 30. Government; 31. Music; 32. Education; 33. Literature.

2008-18: 1. Information Science; 2. Library Science—Technology; 3. Information Systems; 4. Computer Science; 5. Library Science—Cataloging; 6. Instructional Design; 7. Library Science—Accessibility; 8. Library Science—Academic; 9. Library Science—Reference; 10. Library Science—Administration; 11. Psychology; 12. Government; 13. Library Science—General; 14. Education; 15. Popular; 16. Library Science—Teaching; 17. Sociology; 18. Archival Science; 19. Management; 20. Law; 21. Corporate; 22. Mathematics; 23. Philosophy; 24. Literature; 25. Linguistics; 26. Physics; 27. Health Science; 28. Geography; 29. Electrical Engineering; 30. Library Science—Medical; 31. Biology; 32. Art/Design; 33. Museum Studies; 34. Economics; 35. Communication Science; 36. Engineering—General; 37. Journalism; 38. Library Science—Special; 39. Chemistry; 40. Science and Technology Studies; 41. Library Science—School; 42. Library Science—Local; 43. Anthropology.

Table 9 shows the ten biggest themes and most frequently used terms throughout JLA/ITAL's 51 volumes. Library is the most common theme and term. The library catalog, and the associated concept of the integrated library system (ILS), influence the second and third themes. "Online" is an interesting theme/term for the different ways in which it was used throughout the history of the journal: in the early years, "online" referred to the retrieval of computerized bibliographic information; in later years, it came to refer almost exclusively to the use of the World Wide Web. Rounding out the top ten terms are several associated with the study of information science: data, bibliography, and retrieval.

Finally, table 10 depicts the top ten themes for each of the journal's five decades. Libraries remained at the top for all decades; the second spot, however, shifted dramatically. In the first decade, with MARC being a major topic of discussion, "system" and "catalog" rose to the top. In decades two and three, with the melding of the disciplines of library science and information science, "information" rose to the top. In the final two decades, the World Wide Web was influential on the ITAL discourse. Users, usability, and accessibility remain an important theme throughout the history of the journal.

Table 9. Major Themes and Term Frequency in Titles of Cited Publications (All 51 Volumes).

Themes: 1. Library; 2. Catalog; 3. System; 4. Information; 5. Online; 6. Usability; 7. Web; 8. Search; 9. Computer; 10. Digital.

Terms: 1. Library; 2. Information; 3. Online; 4. System; 5. Web; 6. Catalog; 7. Digital; 8. Data; 9. Bibliography; 10. Retrieval.

Table 10. Major Themes in Titles of Cited Publications (By Decade).

1968-77: 1. Library; 2. System; 3. Catalog; 4. Information; 5. Online; 6. Usability; 7. Web; 8. Search; 9. Computer; 10. Digital.

1978-87: 1. Library; 2. Information; 3. System; 4. Catalog; 5. Online; 6. Web; 7. Usability; 8. Digital; 9. Users; 10. Search.

1988-97: 1. Library; 2. Information; 3. Catalog; 4. Web; 5. System; 6. Digital; 7. Online; 8. Usability; 9. Users; 10. Accessibility.

1998-2007: 1. Library; 2. Web; 3. Information; 4. Digital; 5. Usability; 6. Users; 7. Catalog; 8. Search; 9. Accessibility; 10. Data.

2008-18: 1. Library; 2. Web; 3. Digital; 4. Information; 5. Usability; 6. Data; 7. Users; 8. Accessibility; 9. Studies; 10. Academic.

DISCUSSION

One of the major benefits of a bibliometric study/citation analysis is the production of a set of themes, disciplines, seminal sources, influences, and influencers that may benefit potential authors in determining whether their manuscript is suitable for publication in a specific discipline or journal.21 The results of this study demonstrate that ITAL is undoubtedly a library science journal, but that it invites a high level of interdisciplinarity and has experienced a growing impact from the disciplines of information science, computer science, and information systems (which combined presently comprise about 30 percent of total ITAL references). Throughout the journal's history, there has been an emphasis on library systems, particularly systems for library cataloging. Recently, however, there has also been an emphasis on technology, law, and the library, as well as instructional technology, teaching, and the library. ITAL authors take the majority of their citations/ideas from other ITAL articles, JASIST, ACM, and other library technology (Library Hi-Tech, D-Lib) and academic librarianship (College and Research Libraries, Journal of Academic Librarianship) journals. Some of the major authors to read to familiarize oneself with the history and themes of the ITAL publication include Fred Kilgour, Henriette Avram, Karen Markey, and Marshall Breeding. These are findings that potential ITAL authors may find practically useful while crafting their research and writing.

With ITAL having a sustained role as a leading publication in library and information science, this study may have some generalizable findings for the discipline. In 2015, Richard van Noorden produced an interactive chart of the interdisciplinarity of a variety of disciplines, based on data from Web of Science and the National Science Foundation.22 If ITAL is considered representative of a sub-discipline called "library and information science—technology," it can be compared to the interdisciplinarity of the disciplines listed in van Noorden's study. In the last decade of ITAL, 45.4 percent of citations came from outside of LIS. Compared to van Noorden's findings, only 42 of 144 (29 percent) "fields" (or "disciplines," as they are referred to in this study) have a higher proportion of references to outside disciplines.
This LIS-Tech sub-discipline would have a level of interdisciplinarity comparable to the fields of oceanography, botany, philosophy, history, and psychology, and on par with the average for all social sciences.23 This shows that the discipline certainly has its own proprietary knowledge base to build upon, but also values the contributions of knowledge from other disciplines.

Though it is not necessarily the purpose of this study to examine the influence of ITAL on other journals and within the discipline of LIS, some of this information can be gathered rather easily from Google Scholar (by searching for the journal and comparing the number of citations for each article, as displayed by Scholar) and is worth sharing. Table 11 shows the top ten most-cited articles published over the history of JLA/ITAL, with McClure's 1994 article "Network Literacy: A Role for Libraries" receiving the most references of any article published in the journal. Three ITAL articles have been cited by articles which themselves have over 1,000 citations, including one article (2007's "Checking Out Facebook.com") that has been cited by an article which itself has been cited over 10,000 times. Fifty-seven ITAL articles have at least 57 citations, giving the journal an h-index24 of 57.

Table 11. Citations of ITAL Articles in Outside Journals.

1. McClure, C. R. (1994). Network literacy: A role for libraries? Information Technology and Libraries, 13(2), 115-26. (447 citations)
2. Charnigo, L., & Barnett-Ellis, P. (2007). Checking out Facebook.com: The impact of a digital trend on academic libraries. Information Technology and Libraries, 26(1), 23-34. (391 citations)
3. Antelman, K., Lynema, E., & Pace, A. K. (2006). Toward a 21st century library catalog. Information Technology and Libraries, 25(3), 128-39. (267 citations)
4. Spiteri, L. F. (2007). The structure and form of folksonomy tags: The road to the public library catalog. Information Technology and Libraries, 26(3), 13-25. (260 citations)
5. Katz, I. R. (2007). Testing information literacy in digital environments: ETS's iSkills assessment. Information Technology and Libraries, 26(3), 3-12. (226 citations)
6. Jeng, J. (2005). What is usability in the context of the digital library and how can it be measured? Information Technology and Libraries, 24(2), 47-56. (196 citations)
7. Lankes, R. D., Silverstein, J., & Nicholson, S. (2007). Participatory networks: The library as conversation. Information Technology and Libraries, 26(4), 17-33. (189 citations)
8. Dickstein, R., & Mills, V. (2000). Usability testing at the University of Arizona Library: How to let the users in on the design. Information Technology and Libraries, 19(3), 144-51. (188 citations)
9. Schaffner, A. C. (1994). The future of scientific journals: Lessons from the past. Information Technology and Libraries, 13(4), 239-47. (177 citations)
10. Kopp, J. J. (1998). Library consortia and information technology: The past, the present, the promise. Information Technology and Libraries, 17(1), 7. (166 citations)
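For readers unfamiliar with the measure, the h-index reported above is straightforward to compute from a list of per-article citation counts. The following minimal Python sketch (not part of the study's methodology) illustrates the calculation; applied to just the ten counts in table 11 it returns 10, while the full journal-level list yields the reported value of 57.

def h_index(citation_counts):
    # Largest h such that h articles each have at least h citations.
    counts = sorted(citation_counts, reverse=True)
    h = 0
    for rank, citations in enumerate(counts, start=1):
        if citations >= rank:
            h = rank
        else:
            break
    return h

# The ten counts from table 11 alone give an h-index of 10.
print(h_index([447, 391, 267, 260, 226, 196, 189, 188, 177, 166]))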
LIMITATIONS OF THIS STUDY

There were a couple of potential limitations to this study. This bibliometric analysis was conducted in the "old-fashioned" way, using Excel and hand-typing all 7,575 cited publications. This was deemed the most effective way to collect the data, based on the availability of the ITAL journal, but it did take a great deal of time. To save time in recording data, only the first author for each cited publication was listed, and no publication dates were collected, nor were abstracts retained and analyzed (which might have provided additional compelling details about the content of these cited publications). Greater validity for the assignment of disciplines to cited publications might be achieved by having a large team of researchers for analysis, or by using multiple researchers for all citations, not just those that the first researcher deems questionable.25 As with a content analysis, independent review of the data, with comparison and reconciliation of coding, is likely to provide the most consistent and accurate results.

CONCLUSION

Fifty-one volumes of the Journal of Library Automation/Information Technology and Libraries have been published, over which time library technology has evolved from the early MARC era, a time in which the exceptional library would have perhaps a single computer for "online retrieval," to the Internet age, characterized by personal computing, library management systems, and technology-aided instruction. As time has passed, many of the major influences on the journal have changed, yet the journal has remained one of the most influential for library and information science technology. Increased interdisciplinarity in cited publications and emerging emphases on information law and education offer new directions as the journal enters its sixth decade.

ENDNOTES

1 Imad Al-Sabbagh, "Evolution of the Interdisciplinarity of Information Science: A Bibliometric Study" (PhD diss., Florida State University, 1987).
2 Ibid.
3 Encyclopedia Britannica, https://www.britannica.com/ (accessed Sept. 13, 2018).
4 Al-Sabbagh, "Evolution of the Interdisciplinarity."
5 Melissa K. McBurney and Pamela L. Novak, "What is Bibliometrics and Why Should You Care?," Professional Communication Conference, IEEE (2002): 108-14, https://doi.org/10.1109/IPCC.2002.1049094.
6 Lutz Bornmann and Rudiger Mutz, "Growth Rates of Modern Science," Journal of the Association for Information Science and Technology 66, no. 11 (2015): 2,215-222, https://doi.org/10.1002/asi.23329.
7 Al-Sabbagh, "Evolution of the Interdisciplinarity."
8 Mu-Hsuan Huang and Yu-Wei Chang, "A Study of Interdisciplinarity in Information Science: Using Direct Citation and Co-authorship Analysis," Journal of Information Science 37, no. 4 (2011): 369-78, https://doi.org/10.1177/0165551511407141.
9 Ming-Yueh Tsay, "Journal Bibliometric Analysis: A Case Study on the JASIST," Malaysian Journal of Library & Information Science 13, no. 2 (2008): 121-39, http://ejum.fsktm.um.edu.my/article/663.pdf.
10 Lois Buttlar, "Information Sources in Library and Information Science Doctoral Research," Library & Information Science Research 21, no. 2 (1999): 227-45, https://doi.org/10.1016/S0740-8188(99)00005-5.
11 R. V. Chikate and S. K. Patil, "Citation Analysis of Theses in Library and Information Science Submitted to University of Pune: A Pilot Study," Library Philosophy and Practice 222 (2008); Kuang-hua Chen and Chiung-fang Liang, "Disciplinary Interflow of Library and Information Science in Taiwan," Journal of Library and Information Studies 2, no. 2 (2004): 31-55.
12 Akhtar Hussain and Nishat Fatima, "A Bibliometric Analysis of the 'Chinese Librarianship: An International Electronic Journal,' 2006-2010," Chinese Librarianship 31, no. 1 (2011): 1-14, http://www.iclc.us/cliej/cl31HF.pdf.
13 Nosheen Fatima Warraich and Sajjad Ahmad, "Pakistan Journal of Library and Information Science: A Bibliometric Analysis," Pakistan Journal of Information Management and Libraries 12, no. 1 (2011): 1-7, http://eprints.rclis.org/25600/.
14 S. Thanuskodi, "Bibliometric Analysis of the Journal Library Philosophy and Practice from 2005-2009," Library Philosophy and Practice 437 (2010): 1-6, https://digitalcommons.unl.edu/libphilprac/437/.
15 Manoj Kumar and A. L. Moorthy, "Bibliometric Analysis of DESIDOC Journal of Library and Information Technology During 2001-2010," DESIDOC Journal of Library and Information Technology 31, no. 3 (2011): 203-08.
16 Antonio Ramos-Rodriguez and Jose Ruiz-Navarro, "Changes in the Intellectual Structure of Strategic Management Research: A Bibliographic Study of the Strategic Management Journal, 1980-2000," Strategic Management Journal 25, no. 10 (2004): 981-1,004, https://doi.org/10.1002/smj.397.
17 Mariluz Fernandez-Alles and Antonio Ramos-Rodriguez, "Intellectual Structure of Human Resources Management Research: A Bibliometric Analysis of the Journal Human Resource Management, 1985-2005," JASIST 60, no. 1 (2009): 161, https://doi.org/10.1002/asi.20947.
18 Jill Crawley-Low, "Bibliometric Analysis of the American Journal of Veterinary Research to Produce a List of Core Veterinary Medicine Journals," JMLA 94, no. 4 (2006): 430-34.
19 Kalervo Jarvelin and Pertti Vakkari, "The Evolution of Library and Information Science 1965-1985: A Content Analysis of Journal Articles," Information Processing and Management 29, no. 1 (1993): 129-44, https://doi.org/10.1016/0306-4573(93)90028-C.
20 R. Barry Lewis, "NVivo and ATLAS.ti 5.0: A Comparative Review of Two Popular Qualitative Data-Analysis Programs," Field Methods 16, no. 4 (2004): 439-69, https://doi.org/10.1177/1525822X04269174.
21 Thad Van Leeuwen, "The Application of Bibliometric Analyses in the Evaluation of Social Science Research: Who Benefits from It, and Why It is Still Feasible," Scientometrics 66, no. 1 (2006): 133-54, https://doi.org/10.1007/s11192-006-0010-7.
22 Richard van Noorden, "Interdisciplinary Research by the Numbers," Nature 525, no. 7569 (2015): 306-07, https://doi.org/10.1038/525306a.
23 Ibid., 306.
24 Lutz Bornmann and Hans-Dieter Daniel, "What Do We Know about the h Index?," Journal of the American Society for Information Science and Technology 58, no. 9 (2007): 1,381-385, https://doi.org/10.1002/asi.20609.
25 Linda C. Smith, "Citation Analysis," Library Trends 30, no. 1 (1981): 83-106.

10886 ---- Communications

Wikidata: From "an" Identifier to "the" Identifier

Theo van Veen

Theo van Veen (theovanveen@gmail.com) is Researcher (retired), Koninklijke Bibliotheek.

ABSTRACT

Library catalogues may be connected to the linked data cloud through various types of thesauri. For name authority thesauri in particular I would like to suggest a fundamental break with the current distributed linked data paradigm: to make a transition from a multitude of different identifiers to using a single, universal identifier for all relevant named entities, in the form of the Wikidata identifier. Wikidata (https://wikidata.org) seems to be evolving into a major authority hub that is lowering barriers to access the web of data for everyone. Using the Wikidata identifier of notable entities as a common identifier for connecting resources has significant benefits compared to traversing the ever-growing linked data cloud.
When the use of Wikidata reaches a critical mass, for some institutions Wikidata could even serve as an authority control mechanism.

INTRODUCTION

Library catalogs, at national as well as institutional levels, make use of thesauri for authority control of named entities, such as persons, locations, and events. Authority records in thesauri contain information to distinguish between entities with the same name, combine pseudonyms and name variants for a single entity, and offer additional contextual information. Links to a thesaurus from within a catalog often take the form of an authority control number, and serve as identifiers for an entity within the scope of the catalog. Authority records in a catalog can be part of the linked data cloud when including links to thesauri such as VIAF (https://viaf.org/), ISNI (http://www.isni.org/), or ORCID (https://orcid.org/). However, using different identifier systems can lead to having many identifiers for a single entity. A single identifier system, not restricted to the library world and bibliographic metadata, could facilitate globally unique identifiers for each authority and therefore improve discovery of resources within a catalog.

The need for reconciliation of identifiers has been pointed out before.1 What is now being suggested is to use the Wikidata identifier as "the" identifier. Wikidata is not domain specific, has a large user community, and offers appropriate APIs for linking to its data. It provides access to a wealth of entity properties, it links to more than 2,000 other knowledge bases, it is used by Google, and the number of organisations that link to Wikidata is growing with tremendous speed.2 The idea of using Wikidata as an authority linking hub was recently proposed by Joachim Neubert.3 But why not go one step further and bring the Wikidata identifier to the surface directly as "the" resource identifier, or official authority record? This has been argued before,4 and the implications of this argument will be considered in more detail in the remainder of this article.

Figure 1. From linking everything to everything to linking directly to Wikidata.

Figure 1 illustrates the differences between a few possible situations that should be distinguished. On the left, the "everything links to everything" situation shows Wikidata as one of the many hubs in the linked data cloud. In the middle, the "Wikidata as authority hub" situation is shown, where name authorities are linked to Wikidata. On the right is the arrangement proposed in this article, where library systems and other systems for which this may apply share Wikidata as a common identifier mechanism.

Of course, there is a need for systems that feed Wikidata with trusted information and provide Wikidata with a backlink to a rich resource description for entities. In practice, however, many backlinks do not provide rich additional information, and in such cases a direct link to Wikidata would be sufficient for the identification of entities. Figure 2 shows these two situations and other possible variations by means of dashed lines, i.e., systems that feed Wikidata but use the Wikidata identifier as resource identifier for the outside world versus systems that link directly to Wikidata but keep a local thesaurus for administrative purposes.
It is certainly not the intention to encourage institutions to give up their own resource descriptions or resource identifiers locally, especially not when they are an original or rich source of information about an entity. A distinction can be made between the URL of the description of an entity and the URL of the entity itself. When following the URL of a real-world entity in a browser, it is good practice to redirect to the corresponding description of the entity. This is known as the "HTTPRange-14" issue.5 This article will not go into any detail about this distinction other than to note that it makes sense to have a single global identifier for an entity while accepting different descriptions of that entity linked from various sources.

Figure 2. Feeding properties connecting collections to Wikidata (left) and direct linking to Wikidata using resource identifier (right). The dashed lines show additional connecting possibilities.

THE MOTIVATING USE CASE

The idea of using the Wikidata identifier as a universal identifier was born at the research department of the National Library of the Netherlands (KB) while working on a project aimed at automatically enriching newspaper articles with links to knowledge bases for named entities occurring in the text.6 These links include the Wikidata identifier and, where available, the Dutch and English DBpedia (http://dbpedia.org) identifiers, the VIAF number, the Geonames number (http://geonames.org), the KB thesaurus record number, and the identifier used by the Parliamentary Documentation Centre (https://www.parlementairdocumentatiecentrum.nl/). The identifying parts of these links are indexed along with the article text in order to enable semantic search, including search based on Wikidata properties. For demonstration purposes the enriched "newspapers+" collection was made available through the KB Research Portal, which gives access to most of the regular KB collections (figure 3).7

In the newspaper project, linked named entities in search results are clickable to obtain more information. As most users are not expected to know SPARQL, the query language for the semantic web, the system offers a user-friendly method for semantic search: a query string entered between square brackets, for example "[roman emperor]", is expanded by a "best guess" SPARQL query in Wikidata, in this case resulting in entities having the property "position held = Roman emperor". These in turn are used to do a search for articles containing one or more mentions of a Roman emperor, even if the text "roman emperor" is not present in the article. In another example, when a user searches for the term "[beatles]", the "best guess" search yields articles mentioning entities with the property "member of = The Beatles". For ambiguous items, as in the case of "Guernica," which can be the place in Spain or Picasso's painting, the one with the highest number of occurrences in the newspapers is selected by default, but the user may select another one. For the default or selected item, the user can select a specific property from a list of Wikidata properties available for that specific item. The possibilities of this semantic search functionality may inspire others to use the Wikidata identifier for globally known entities in other systems as well.
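To make the "best guess" expansion concrete, the sketch below shows roughly how such a query could be posed to the public Wikidata Query Service from Python. This is an illustration, not the KB's actual implementation, and it assumes P39 as the "position held" property and Q842606 as the item for "Roman emperor".

import requests

# Find all entities whose "position held" (assumed P39) is
# "Roman emperor" (assumed Q842606).
query = """
SELECT ?item WHERE {
  ?item wdt:P39 wd:Q842606 .
}
"""
response = requests.get(
    "https://query.wikidata.org/sparql",
    params={"query": query, "format": "json"},
)
qids = [row["item"]["value"].rsplit("/", 1)[-1]
        for row in response.json()["results"]["bindings"]]

# The resulting Q-numbers can then be fed into a conventional index search
# for newspaper articles that mention any of these entities.
print(qids[:10])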
Figure 3. Screenshot of the KB Research Portal with a newspaper article as result of searching "[architect=Willem Dudok]". The results are articles about buildings of which Willem Dudok is the architect. The name of the building meeting the query [architect=Willem Dudok] is highlighted.

USAGE SCENARIOS

Two usage scenarios can be considered in more detail: (1) manually following links between Wikidata descriptions and other resource descriptions, and (2) performing a federated SPARQL query to automatically bring up linked entities.

In the first scenario, in which resource identifiers link to Wikidata, the user can follow the link to all resource descriptions having a backlink in Wikidata. But why would a user follow such a link? Reasons may include wanting more or context-specific information about the entity, or a desire to search in another system for objects mentioning a specific entity. In the latter case, the information behind the backlink should provide a URL to search for the entity, or the backlink should be the search URL itself. Wikidata provides the possibility to specify various URI templates. These can be used to specify a link for searching objects mentioning the entity, rather than just showing a thesaurus entry. When the backlink does not provide extra information or a way to search the entity, the backlink is almost useless. Thus, when systems provide resource links to Wikidata they give users access to a wealth of information about an entity in the web of data and, potentially, to objects mentioning a specific entity. Some systems only provide backlinks from Wikidata to their resource descriptions but not the other way around; users of such systems cannot easily benefit from these links.

The second scenario of a federated SPARQL query applies when searching objects in one system based on properties coming from other systems. Formulating such a SPARQL query is not easy, because doing so requires a lot of knowledge about the linked data cloud. The alternative is to put the complete linked data cloud in a unified (triple store) database. The technology of linked data fragments might solve the performance and scaling issues but not the complexity.8 Using a central knowledge base like Wikidata could reduce complexity for the most common situation of searching objects in other systems using properties from Wikidata. This use case requires these systems to take the user's query and automatically formulate a SPARQL search. There are many systems linked to Wikidata that do not support SPARQL at all, or support it only in a way that is not intended for the average user. Those systems can still let users benefit from Wikidata by offering a simple add-on to search in Wikidata for entities that meet some criteria and then use the identifiers for a conventional search in the local system, as shown for the case of the historical newspapers.

These two use cases illustrate how the use of a Wikidata identifier can lower the barrier to accessing information about an entity and to finding objects related to an entity, by minimizing the number of hubs, the required knowledge, and the required technology. This is achieved by linking resources to Wikidata and, even more so, by making objects searchable by means of the Wikidata identifier.
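As an illustration of the second scenario, the following sketch shows what a federated query might look like when submitted to a hypothetical local SPARQL endpoint that stores Wikidata entity URIs as creator identifiers, as proposed in this article. The local graph pattern and predicate are assumptions; only the Wikidata service URL and prefixes are real. It assumes P463 as the "member of" property and Q1299 as the item for The Beatles.

# A federated query joining local triples with properties from Wikidata:
# find locally catalogued objects whose creator is, according to Wikidata,
# a member of The Beatles.
FEDERATED_QUERY = """
PREFIX dcterms: <http://purl.org/dc/terms/>
PREFIX wd:  <http://www.wikidata.org/entity/>
PREFIX wdt: <http://www.wikidata.org/prop/direct/>

SELECT ?object ?creator WHERE {
  ?object dcterms:creator ?creator .            # local triples
  SERVICE <https://query.wikidata.org/sparql> { # remote property lookup
    ?creator wdt:P463 wd:Q1299 .
  }
}
"""

The only Wikidata-specific knowledge the local system needs here is the entity URI; the property lookup itself is delegated to Wikidata.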
ADVANTAGES OF USING THE WIKIDATA IDENTIFIER AS UNIVERSAL IDENTIFIER

Summarizing the above, a number of significant advantages of using the Wikidata identifier as universal identifier can be seen. These include:

• Using the Wikidata identifier as resource identifier makes Wikidata the first hub. Applications therefore have, in the first instance, to deal with only one description model. From there it is easy to navigate further: most information is only "one hub away," so less prior knowledge is required to link from one source to another.
• Wikidata identifiers can be used for federated search based on properties in Wikidata, so there is less need to know how to access properties in other resource descriptions.
• Wikidata identifiers facilitate generating "just in case" links to systems having the Wikidata identifier indexed.
• Complicated SPARQL queries using Wikidata as the primary source for properties can be shared and reused more easily compared to a situation with many diverse sources for properties.
• Wikidata offers many tools and APIs for accessing and processing data.
• Some libraries and similar institutions may even decide to use Wikidata directly for authority control when it reaches a critical mass, relieving them of maintaining a local thesaurus.

IMPLEMENTATION

Institutions can gradually adopt the use of Wikidata identifiers without needing to make radical changes in their local infrastructure. A simple first step is automatically generating links to Wikidata in the presentation of an object or to the object description, to provide contextual information and navigation options. As a next step, the Wikidata Q-number of an entity could be indexed along with the descriptions containing it, so these objects become findable via a Wikidata identifier search, e.g. of the form:

https://whatever.local/wdsearch?id=Q937

The Wikidata identifier could then be used in conventional as well as federated searches for a resource, regardless of the exact spelling of a resource name. A search may be refined using Wikidata properties without further requirements with respect to local infrastructures. Institutions having a SPARQL endpoint can allow for a federated SPARQL query combining local data with data from Wikidata. As SPARQL is not easy for the end user, this requires a user interface that can formulate a SPARQL query, shielding the user from having to know SPARQL.

Those institutions willing to start using the Wikidata identifier as resource identifier can unify references in their bibliographic records. Currently, for example, a reference to Albert Einstein, in a simplified, RDF-like (https://www.w3.org/RDF/) XML fragment in a bibliographic record, could look quite different for different institutions (the element names and identifiers below are illustrative), e.g.:

<dc:creator rdf:resource="https://viaf.org/viaf/75121530">Albert Einstein</dc:creator>
<dc:creator rdf:resource="https://isni.org/isni/000000012281955X">Albert Einstein</dc:creator>
<dc:creator rdf:resource="http://dbpedia.org/resource/Albert_Einstein">Albert Einstein</dc:creator>
<dc:creator rdf:resource="https://some.library/thesaurus/000123456">Albert Einstein</dc:creator>

If the Wikidata identifier is used as resource identifier, this could for all institutions become the same:

<dc:creator rdf:resource="http://www.wikidata.org/entity/Q937">Albert Einstein</dc:creator>

In this case it becomes easy to navigate the web, to create common bookmarklets, and to provide additional functionality using the Wikidata identifier.

CATALOGUING PROCESS AND CRITERIA FOR NEW WIKIDATA ENTRIES

For institutions that decide to link their entities directly to Wikidata, their catalog software would have to be configured to support Wikidata lookups. Catalogers would not have to know about linked data or RDF to create links to Wikidata; they would simply have to query Wikidata and select the appropriate entry to link. The cataloging software would then add the selected identifier to the record being edited. If a query in Wikidata does not yield any results, the cataloger would first have to create the item.
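Such a lookup can be built on Wikidata's public wbsearchentities API. The following minimal Python sketch shows the kind of call catalog software might make behind the scenes; the surrounding cataloging workflow is hypothetical.

import requests

def wikidata_lookup(name, language="en"):
    # Return candidate Wikidata entities for a name, for cataloger review.
    response = requests.get(
        "https://www.wikidata.org/w/api.php",
        params={
            "action": "wbsearchentities",
            "search": name,
            "language": language,
            "format": "json",
        },
    )
    return [(hit["id"], hit.get("description", ""))
            for hit in response.json()["search"]]

# The cataloger picks the right entry; its Q-number is stored in the record.
print(wikidata_lookup("Albert Einstein")[:3])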
Creating a new item using the Wikidata user interface (figure 4) is straightforward: create an account, add a new item, and add statements (fields) and values.

Figure 4. Data entry screen for entering a new item in Wikidata.

Catalogers must be aware of some rules when creating items. Wikidata editors may delete items that fall under one of Wikidata's exclusion criteria, such as vandalism, empty descriptions, broken links, etc. In addition, the item must refer to an instance of a clearly identifiable conceptual or material "notable" entity. Notable means that the item must be mentioned by at least one reliable, third-party published source. Here, common sense is required: being mentioned in a telephone book or a newspaper is in itself not considered notability. Entities that are not notable enough to be entered into Wikidata would then remain identified by a link to a local or other thesaurus.

POSSIBLE OBJECTIONS TO WIKIDATA AS AUTHORITY CONTROL MECHANISM

Although it is, at least at the present moment, not the intention of this article to propose the use of Wikidata as the primary local authority control mechanism, some institutions may nonetheless consider the opportunity to do so. There are numerous objections to this idea to note, including:

1) Institutions may consider themselves authoritative sources of information and may therefore want to keep control over "their" thesaurus. The idea that the greater community can make changes to "their" thesaurus may not be acceptable to them. Quality control and error detection certainly are important issues, but experts from outside the library can sometimes provide more and better information about a resource than cataloguing professionals. The community can be relied on and trusted to correct misuse and erroneous input and to add to Wikidata entries. Information that is critical for local usage, such as access control, may still be managed locally. Despite possible objections to using Wikidata for universal authority control, national libraries and other institutions can work together with Wikidata to share responsibility for maintaining the resource, to optimize and harmonize the shared use of Wikidata, and to maintain validity and authority. This might imply more rigorous quality control.

2) Existing systems like VIAF and ISNI currently contain more persons than Wikidata, so why use Wikidata? VIAF and ISNI are domain specific and are more restrictive with respect to updates of their content and the availability of tools and APIs. In Wikidata, both VIAF and ISNI are just one hub away, and for internal use the VIAF and ISNI identifiers remain available. The question here is whether there will come a moment when Wikidata reaches a critical mass and supersedes VIAF and ISNI.

3) There may be disagreement about a certain entity, especially when it concerns political events or persons whose role is perceived differently by different political parties. Wikidata contains neutral properties; the properties that may contain subjective qualifications or might suffer bias are mostly behind the backlinks, like the abstract in Wikipedia. A fundamental difference between Wikipedia and Wikidata is that Wikipedia doesn't have to be consistent across languages. Wikidata is much more structured and therefore more useful for semantic applications.
It doesn’t allow for the different nuances in descriptions like Wikipedia articles do and therefore Wikidata doesn’t reflect different opinions in descriptions and is less subject to bias.9 Furthermore, the cataloguing practices in libraries are subject to bias and subjectivity too. Perception and political view may, for example, be reflected in some subject headings and may also change over time.10 It is debatable whether a cataloger is more neutral and less biased than a larger user community. Although the use and acceptance of Wikipedia as a true source of information may be arguable, in the light of the current “fake news” discussion it is extremely important to guard the correctness of information in Wikipedia. In this context it is interesting to note that “according to a study in Nature, the correctness of Wikipedia articles is comparable to the Encyclopaedia Britannica, and a study by IBM researchers found that vandalism is repaired extremely quickly.”11 4) Some objections have to do with the discussion of “centralization versus decentralization.” Some institutions may not want a central system perceptively having control over their local data. The idea of using Wikidata as a common authority control mechanism is not that different from the use of any other thesaurus or identifier framework like ISBN, ISSN, etc., except for its use of a central resource description. 5) What if Wikidata disappears? There are solutions in terms of mirrors and a local copy of Wikidata. Moreover, national libraries and other, similar institutions that are already responsible for long-term preservation of digital content can take responsibility for keeping Wikidata alive to maximize its viability WIKIDATA | VAN VEEN 80 https://doi.org/10.6017/ital.v38i2.10886 CONCLUSION Reconciliation of linked data identifiers in general, and using the Wikidata identifier as universal identifier in particular, has been shown to have many advantages. Libraries and similar institutions can gradually start using the Wikidata identifier without needing to make radical changes in their local database infrastructure. When Wikidata reaches a critical mass, libraries and similar institutions may want to switch to using Wikidata identifiers as the default resource identifiers or authority records. However, given the enormous growth of the number of collections that link entities to Wikidata that is already taking place, we might end up in a situation where the perception is that “if an item is not in Wikidata, it doesn’t exist” stimulating putting more items in Wikidata and making local descriptions less relevant. From a strategic point of view for adopting Wikidata decision makers may pose the question: “Why do we have a local thesaurus when we already have Wikidata?” The next question, then, will probably not be “Should we go this way?” but rather “When should we go this way and start using the Wikidata identifier as The Identifier?” REFERENCES 1 Robert Sanderson, “The Linked Data Snowball and Why We Need Reconciliation,” SlideShare, Apr. 4, 2016, https://www.slideshare.net/azaroth42/linked-data-snowball-or-why-we-need- reconciliation. 2 Karen Smith-Yoshimura, “The rise of Wikidata as a linked data source,” Hanging Together, Aug. 6, 2018, http://hangingtogether.org/?p=6775. 3 Joachim Neubert, “Wikidata as a Linking Hub for Knowledge Organization Systems? 
4 Theo van Veen, "Wikidata as Universal Library Thesaurus," presented Oct. 2017 at WikidataCon 2017, Berlin, https://www.youtube.com/watch?v=1_NxKBnCOHM.

5 "HTTPRange-14," Wikipedia, accessed Mar. 15, 2019, https://en.wikipedia.org/wiki/HTTPRange-14.

6 Theo van Veen et al., "Linking Named Entities in Dutch Historical Newspapers," in Metadata and Semantics Research, MTSR 2016, ed. Emmanouel Garoufallou (Cham: Springer, 2016), 205–10, https://doi.org/10.1007/978-3-319-49157-8_18.

7 "KB Research Portal," KB | National Library of the Netherlands, http://www.kbresearch.nl/xportal; video demonstration accessed Apr. 26, 2019, https://www.youtube.com/watch?v=J5mCem-hEMg.

8 Ruben Verborgh, "Linked Data Fragments: Query the Web of Data on Web-Scale by Moving Intelligence from Servers to Clients," accessed Mar. 15, 2019, http://linkeddatafragments.org/.

9 Mark Graham, "The Problem with Wikidata," The Atlantic, Apr. 6, 2012, https://www.theatlantic.com/technology/archive/2012/04/the-problem-with-wikidata/255564/.

10 Candise Branum, "The Myth of Library Neutrality," May 15, 2014, https://candisebranum.wordpress.com/2014/05/15/the-myth-of-library-neutrality/.

11 "The Reliability of Wikipedia," Wikipedia, accessed Mar. 15, 2019, https://en.wikipedia.org/wiki/Reliability_of_Wikipedia.

10921 ----

Articles

"Good Night, Good Day, Good Luck": Applying Topic Modeling to Chat Reference Transcripts

Megan Ozeran and Piper Martin

Megan Ozeran (mozeran@illinois.edu) is Data Analytics & Visualization Librarian, University of Illinois Library. Piper Martin (pm13@illinois.edu) is Reference Services & Instruction Librarian, University of Illinois Library.

ABSTRACT

This article presents the results of a pilot project that tested the application of algorithmic topic modeling to chat reference conversations. The outcomes for this project included determining if this method could be used to identify the most common chat topics in a semester and whether these topics could inform library services beyond chat reference training. After reviewing the literature, four topic modeling algorithms were successfully implemented using Python code: (1) LDA, (2) phrase-LDA, (3) DMM, and (4) NMF. Analysis of the top ten topics from each algorithm indicated that LDA, phrase-LDA, and NMF show the most promise for future analysis on larger sets of data (from three or more semesters) and for examining different facets of the data (fall versus spring semester, different times of day, just the patron side of the conversation).

INTRODUCTION

The library at the University of Illinois at Urbana-Champaign has included chat reference services since the spring of 2001.1 Today, this service is extensively used by library patrons, resulting in thousands of conversations each semester.
While in-person reference has edged out chat for the largest number of interactions at the main library information desk over the most recent four years, chat brings in a higher number of complex questions that incorporate teaching or strategizing.2 Since the initial implementation of chat, the library has continually assessed and improved chat reference by evaluating the software, measuring the effectiveness and value of the service, and providing staff training.3

For several years, librarians at the University of Illinois have used chat transcripts for training graduate assistants and new employees and chat statistics for determining staffing. Unlike other forms of reference interactions, chat offers a textual record of the conversation, so librarians have used this unique opportunity in a couple of different ways. In a training exercise, students read through actual transcripts and are guided in recognizing both well-developed and less-than-ideal interactions. They are then asked to think about ways those chat conversations could have been improved and to share strategies for doing so. Graduate assistant supervisors also use chat transcripts to evaluate the performance of individual graduate assistants, checking for appropriate levels of helpfulness and for adherence to the library's chat policies. Finally, part of the library's assessment strategy looks at chat interaction numbers, such as chats per hour, the duration of each conversation, and the level of complexity of each conversation, to help make decisions about optimal chat staffing levels. However, prior to the project described here, the library had not yet analyzed the chat reference conversations on a large scale to understand the range and consistency of topics being discussed.

While these uses of chat data have been successful, such a large body of information from patrons about the library and its collections and services seemed underutilized. In an environment of growing data-informed decision-making, both within the broader library community and at the University of Illinois in particular, it was an opportune time to implement this kind of large-scale topic analysis. If common themes emerged from the chat interactions beyond simply showing the most frequently asked questions, these themes could inform the library's reference services beyond just training for chat reference. For example, patterns in the number of citation questions could indicate the best times to offer a citation management tool workshop; multiple inquiries about a new resource or tool might prompt planning a new workshop; and repeated confusion regarding a service or policy may signal a need to bolster the online guides or FAQ. Since the number of chat transcripts was so large, automating analysis through a programming language such as Python seemed the best course of action.

This article presents the results of a pilot project that tested the application of algorithmic topic modeling to chat conversations. The outcomes for this project included (1) determining if this method could be used to identify the most common chat reference topics in a semester; and (2) determining whether the results could inform reference services beyond just training for chat, such as improving FAQs, workshops, the library website, or other instruction.
LITERATURE REVIEW

Chat reference services are well established in academic libraries, and there are abundant examples in the literature exploring these services. However, there is a lack of research on ways to employ automated methods to analyze chat reference. Numerous articles approach chat analysis via traditional qualitative methods, where research teams hand-code chat themes, topics, or question categories.4 Schiller employed a tool called QDA Miner to partially automate the otherwise human-driven coding process, using the software to automatically generate clusters of manually created codes.5 Only one paper appeared to address the issue primarily by using algorithmic analysis methods. In addition to conducting sentiment analysis, Kohler applied three topic modeling algorithms to chat reference conversations at Rockhurst University.6 Kohler identified the algorithm of non-negative matrix factorization (NMF) as the "winning topic extractor" based on how evenly it distributed the topic clusters across all the chat conversations.7 The other algorithms Kohler tested, latent Dirichlet allocation (LDA) and latent semantic analysis (LSA), had much more skewed distributions of topics. The most common topic identified by LDA appeared in so many of the chat conversations that it was essentially meaningless as a category. LDA is one of the most well-established topic modeling algorithms, but as Kohler found, it does not work very well with short texts like chat conversations.

To supplement the lack of library research in this area, non-library research that has applied topic modeling to short texts was also reviewed. Interestingly, although the NMF algorithm worked well for Kohler's analysis of library chat conversations, there was little mention of NMF in the non-library literature. On the other hand, it was not surprising that LDA was one of the most commonly discussed algorithms, either as an example of what doesn't work or as a basis upon which a modified algorithm was created to perform better for short texts.8 Another common algorithm was biterm topic modeling (BTM). Proposed by Cheng et al., BTM takes pairs of words (biterms), rather than individual words, as the unit on which to base topics.9 By creating biterms, the researchers increased the number of items to sort into topics, thus mitigating a common problem with analyzing short texts.
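The biterm idea is easy to make concrete. A small illustration (ours, not drawn from any of the cited implementations): every unordered pair of distinct words in a short text becomes one biterm, multiplying the evidence available for topic inference on texts that would otherwise be too short.

from itertools import combinations

def biterms(text):
    """Extract every unordered word pair from one short document."""
    words = text.lower().split()
    return list(combinations(words, 2))

chat_line = "how do i renew a book"
print(biterms(chat_line))
# 6 words yield 15 biterms: far more units to cluster than 6 single words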
A final commonly used algorithm was the Dirichlet mixture model (DMM).10 A key feature of DMM for analyzing short texts is that it assumes each text (in this project, each chat conversation) is associated with only one topic. While longer texts like articles or books likely encompass many topics, it is plausible that a chat conversation could be summarized in one topic.

METHODOLOGY

At the time of this project (spring 2018), the library was using locally developed chat software called IWonder. The chat widget is embedded on the library homepage, on the "Ask A Librarian" page, in LibGuides, and within the library's interface for its licensed EBSCO databases. The chat service was available 87 hours per week at the time the data was collected. During the day, chat service is provided by a mix of librarians, library staff, and graduate assistants, most of whom are scheduled at the main library's information desk. Subject-specific libraries, including the engineering library, the agricultural and life sciences library, and the social sciences, health, and education library, also contribute many hours on chat reference from their respective locations. The evening and weekend shifts are all covered by graduate assistants from the University of Illinois School of Information Sciences.

The authors decided that one semester of chat transcripts would be the most appropriate corpus with which to work for this pilot project because it would encompass a substantive and meaningful (but also manageable) number of conversations. In preparation, Institutional Review Board approval was received, and a graduate student completing a degree in information management from the School of Information Sciences was selected to assist with this project through the school's practicum program. This practicum student is an experienced programmer, and his presence on the team allowed the project to proceed more quickly than if the authors had pursued the project without his expertise.

To begin the project, all chat conversations from the spring 2017 semester were obtained by querying the local server using MySQL Workbench, limiting the query to chat logs between the dates 1/17/2017 and 5/12/2017 (inclusive). Because each line of a chat conversation was saved as a separate line in the database, this meant retrieving approximately 90,000 lines of data. The actual text of the chat conversations was unstructured (by its nature), but the text was saved with related metadata. For instance, each chat conversation was automatically given a unique identifier, so the individual lines could be grouped into conversations and put in order by their timestamp. The 90,000 lines represented almost 6,000 individual conversations.

The chat logs were cleaned using a combination of OpenRefine (primarily for ASCII character cleanup) and Python code to remove personally identifiable information (PII) and to make the data easier to analyze.11 By default, the chat software did not collect any information about patrons, but sometimes patrons volunteered PII because they thought it was needed to answer their questions. Therefore, part of the cleaning process involved removing as much of this patron PII as possible, replacing it with the word "REMOVED" to denote the change. In addition, library staff usernames were scrubbed by replacing each username with a generic "staff###", where "###" was a unique (incremented) number assigned to each original username. This maintained the ability to track a single staff member across multiple conversations, if desired, without identifying the actual person. Another important part of the data cleaning was to remove URLs, because these would be unnecessary in identifying topics and they significantly increased the number of unique "words" that the analysis algorithms identified. The URLs were nearly always saved within an HTML tag, so most URLs were easily identified for removal. The data cleaning process has been described here in a linear fashion for ease of understanding, but over the course of the project it was actually an iterative process, as more cleaning issues were discovered during analysis.
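As a rough illustration of these cleaning steps (the project's actual, iterative code is in the GitHub repository cited below; the regular expressions here are simplified assumptions of ours), the core replacements might look like this:

import re

staff_ids = {}

def anonymize_staff(username):
    """Map each real username to a stable, incremented 'staff###' pseudonym."""
    if username not in staff_ids:
        staff_ids[username] = f"staff{len(staff_ids) + 1:03d}"
    return staff_ids[username]

def clean_line(line):
    line = re.sub(r"<a\s[^>]*>|</a>", "", line)      # strip HTML link tags
    line = re.sub(r"https?://\S+", "", line)         # strip bare URLs
    line = re.sub(r"\S+@\S+\.\S+", "REMOVED", line)  # replace email-style PII
    return line.strip()

print(anonymize_staff("jdoe2"))  # -> staff001
print(clean_line("try http://example.edu or email me@illinois.edu"))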
Based on the analyses performed in the related literature, the practicum student wrote code to test five topic modeling algorithms: (1) latent Dirichlet allocation (LDA), (2) phrase-LDA (LDA applied to phrases instead of words), (3) biterm topic modeling (BTM), (4) Dirichlet mixture modeling (DMM), and (5) non-negative matrix factorization (NMF). Ultimately, the processing power and time required to implement BTM meant that this algorithm could not be implemented for this project. However, the other four models (LDA, phrase-LDA, DMM, and NMF) were all successfully implemented. All code related to this project, including the cleaning and analysis, is available on GitHub (https://github.com/mozeran/uiuc-chat-log-analysis).
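For readers who want a starting point, here is a minimal, self-contained sketch of two of these models using scikit-learn. It is a stand-in under our own assumptions (toy corpus, two topics, default parameters), not the project's code, which lives in the repository above.

from sklearn.feature_extraction.text import CountVectorizer, TfidfVectorizer
from sklearn.decomposition import LatentDirichletAllocation, NMF

# One string per cleaned conversation; the real corpus had almost 6,000.
conversations = [
    "hi can you help me find an article in a database",
    "i need to renew a book that is on course reserve",
    "how do i request an interlibrary loan item",
    "thanks so much good night",
    "the full text link gives an error off campus",
]
N_TOPICS = 2  # the project extracted 10; a toy corpus supports only a few

def show_topics(model, feature_names, n_words=5):
    for i, weights in enumerate(model.components_):
        top = [feature_names[j] for j in weights.argsort()[::-1][:n_words]]
        print(f"Topic {i + 1}:", ", ".join(top))

# LDA is fit on raw term counts.
counts = CountVectorizer(stop_words="english")
lda = LatentDirichletAllocation(n_components=N_TOPICS, random_state=0)
lda.fit(counts.fit_transform(conversations))
show_topics(lda, counts.get_feature_names_out())

# NMF is conventionally fit on tf-idf weights instead.
tfidf = TfidfVectorizer(stop_words="english")
nmf = NMF(n_components=N_TOPICS, random_state=0)
nmf.fit(tfidf.fit_transform(conversations))
show_topics(nmf, tfidf.get_feature_names_out())

Parameter choices (vectorizer, number of topics, priors) materially change the resulting word lists, which is part of what the Discussion section below examines.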
RESULTS

Outputs of the LDA, phrase-LDA, DMM, and NMF modeling algorithms are shown in tables 1 through 4. After removing common stop words, the remaining words were put into lowercase and stemmed before the topic modeling algorithms were applied. The objective of the stemming process was to convert singular and plural versions of a word to a hybrid form so that they are treated as the same word. Thus, many words ending in "y" are shown ending in "i". For instance, "library" and "libraries" would both be converted to "librari" and thus be treated as the same word. The phrase "easi search" refers to "Easy Search," the all-in-one search box on the library homepage. The word "ugl" refers to the undergraduate library (UGL). The word "remov" showed up in the topic lists surprisingly frequently, probably because patron PII was replaced with the word "REMOVED." Since explicitly denoting the removal of PII is unlikely to be of import, it makes sense in the future to simply remove the PII without replacement.

Table 1: LDA (top 10 words in each topic)
Topic 1: music, map, laptop, remov, find, ok, one, also, may, score
Topic 2: look, search, find, help, databas, thank, use, articl, research, would
Topic 3: book, librari, thank, help, check, look, remov, reserv, would, els
Topic 4: help, use, student, find, articl, librari, hi, look, tri, question
Topic 5: request, librari, account, item, thank, ok, get, help, loan, number
Topic 6: thank, chat, good, know, one, night, go, okay, think, hi
Topic 7: thank, look, librari, remov, help, would, contact, inform, find, like
Topic 8: search, articl, databas, click, thank, journal, help, page, ok, find
Topic 9: articl, thank, journal, access, look, help, remov, full, link, find
Topic 10: access, tri, link, thank, use, work, get, campu, remov, let

Table 2: Phrase-LDA (top 10 phrases in each topic)
Topic 1: interlibrari loan; lose chat; chat servic; lower level; chat open; writer workshop; spring break; studi room; call ugl; add chat
Topic 2: good night; great day; good day; good luck; drop menu; sound good; nice day; ye great; remov thank welcom; make sens
Topic 3: anyth els; tri find; abl find; find anyth; feel free; ll tri; social scienc; tri access; ll back; abl access
Topic 4: easi search; academ search; find articl; search box; tri search; databas subject; search bar; search term; databas search; search databas
Topic 5: graduat student; grad student; peer review; undergrad student; illinoi undergrad; scholarli sourc; univers illinoi; undergradu student; primari sourc; googl scholar
Topic 6: main librari; librari catalog; librari account; librari homepag; call number; librari websit; netid password; main stack; creat account; borrow id
Topic 7: page remov; click link; open new tab; link remov; send link; remov click; left side; remov link; page click; error messag
Topic 8: give one moment; contact inform; moment pleas; faculti staff; give minut; pleas contact; email address; staff member; faculti member; unit state
Topic 9: full text; journal articl; access articl; find articl; databas journal; light blue; articl titl; titl articl; journal databas; found articl
Topic 10: request book; request item; check book; doubl check; print copi; cours reserv; copi avail; physic copi; book avail; copi past

Table 3: DMM (top 10 words in each topic)
Topic 1: work, open, chat, way, onlin, say, specif, avail, day, sourc
Topic 2: check, titl, research, much, onlin, avail, day, text, sourc, say
Topic 3: pleas, sourc, day, onlin, titl, found, right, hello, may, take
Topic 4: chat, also, copi, pleas, think, onlin, undergrad, sourc, work, way
Topic 5: pleas, sorri, found, item, chat, way, right, open, work, time
Topic 6: found, also, right, much, think, could, research, undergrad, sorri, way
Topic 7: contact, hello, account, sorri, could, ask, titl, moment, may, think
Topic 8: copi, onlin, sorri, ask, think, say, right, also, much, sourc
Topic 9: much, research, way, may, right, think, open, take, hello, result
Topic 10: abl, avail, also, titl, catalog, pleas, say, campu, onlin, take

Table 4: NMF (top 10 words in each topic)
Topic 1: request, take, titl, today, moment, way, item, may, place, say
Topic 2: specif, start, type, journal, topic, research, tab, way, subject, result
Topic 3: ugl, today, ask, wonder, call, may, contact, peopl, someon, talk
Topic 4: sourc, univers, scholarli, research, servic, resourc, tell, illinoi, guid, librarian
Topic 5: account, log, set, vpn, us, password, id, say, campu, problem
Topic 6: main, locat, undergradu, call, tab, review, two, circul, ugl, number
Topic 7: reserv, class, time, undergradu, cours, websit, show, im, titl, onlin
Topic 8: text, full, troubl, problem, still, pdf, websit, onlin, send, moment
Topic 9: chat, night, hey, yeah, oh, well, time, tonight, take, yep
Topic 10: unfortun, uiuc, onlin, wonder, version, graduat, print, seem, way, grad
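The stemmed forms in these tables can be reproduced with the nltk package the project used for cleaning (see note 11). The article does not state which specific stemmer was chosen, but nltk's Porter stemmer, shown here as an illustration, produces the same forms seen above:

from nltk.stem import PorterStemmer

stemmer = PorterStemmer()
print([stemmer.stem(w) for w in ["library", "libraries", "articles", "easy"]])
# -> ['librari', 'librari', 'articl', 'easi']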
DISCUSSION

Interpreting the results of a topic model can be a bit of a guessing game. None of these algorithms look at the semantic meaning of words, so the resulting topics are not based on semantics. Each algorithm simply employs a different method of mathematically determining the likelihood that words are related to each other. When this likelihood is high enough (as defined by the algorithm), the words are listed within the same topic. Identifying topics mathematically is much quicker than a person hand-coding conversations. However, automatic classification also means that the resulting topics could make absolutely no sense to people, who understand the semantic meaning of the words within a topic.

This lack of coherent meaning is most present in the results of the DMM model (table 3). For instance, the words that comprise Topic 1 are the following: "work open chat way online say specify available day source." It is difficult to imagine what overarching concept links all, or even most, of these words. Only a few words appear to have any significance at all: "open" could refer to open access, or to the library's open hours; "online" may refer to finding resources online, or the fact that a student is taking online classes; and "source" is likely some reference to a research resource. These words barely relate to each other semantically, and the remaining seven words don't provide much clarification. Thus, it appears that DMM is not a particularly good topic modeling algorithm for library chat reference.

The results seen from the LDA model (table 1) appear slightly more comprehensible. In Topic 2, for instance, the words are as follows: "look search find help databas thank use articl research would." While not all the words relate to each other, a common theme could emerge from the words look, search, find, database, article, and research. It's possible that this Topic 2 identified chat conversations where a patron needed help finding research articles. Even Topic 6, at first glance a silly list of words, makes some sense: "thank chat good know one night go okay think hi." Greetings and sign-offs probably comprised a good number of the total words in the corpus, so it is understandable that a "greetings" topic could be mathematically identified. Overall, LDA appears to have potential in topic modeling chat reference, but it probably needs to be further tweaked.

When applying the LDA model to phrases (table 2), the coherence increases within the phrases, but the topics are not always as coherent. Topic 1 includes the following phrases: "interlibrary loan, lose chat, chat service, lower level, chat open, writer workshop, spring break, study room, call UGL, add chat." Each phrase, individually, makes perfect sense in the context of this library; as a collection, however, the phrases don't comprise one coherent topic. Four of the phrases explicitly mention chat services (an interesting meta-topic), while the rest appear completely unrelated. On the other hand, Topic 10 does show more semantic relation between the phrases: "request book, request item, check book, double check, print copy, course reserve, copy available, physical copy, book available, copy past." It seems pretty clear that this topic refers to books, whether on reserve, being requested, or being checked for availability. With the wide difference in topic coherence, the phrase-LDA algorithm is not perfect for topic modeling chat reference, but further exploration is warranted.

The final algorithm, NMF (table 4), is also imperfect. It is possible to distill each topic into an actual semantic concept, but there is almost always at least one word that makes it a little less clear. Topic 5 probably provides the best coherence: "account log set VPN use password ID say campus problem." It seems clear this topic refers to identity verification, likely for off-campus use of library resources. The other topics given by the algorithm have more confusing elements, such as in Topic 1, where the relatively meaningless words may, way, and say all appear. It's interesting that Kohler found NMF to work very well, while the results above are not nearly as coherent as those identified in her implementation.12 This is a perfect example of how the tuning of many different parameters can affect the ultimate results of each topic modeling algorithm. This is why the authors think it is worth continuing to explore how to improve the implementation of the LDA, phrase-LDA, and NMF algorithms for chat conversations, as well as share the original code for others to test and revise. It will take many different projects at many different libraries before an optimum topic model implementation is found for chat reference.
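One way to compare implementations less subjectively than eyeballing word lists, offered here as our suggestion rather than a step the authors report, is to score candidate models with a topic coherence measure. A minimal gensim sketch, assuming conversations have already been tokenized and stemmed:

from gensim.corpora import Dictionary
from gensim.models import CoherenceModel, LdaModel

texts = [
    ["find", "articl", "databas", "search"],
    ["request", "book", "cours", "reserv"],
    ["thank", "good", "night"],
]  # one token list per conversation; the real corpus is far larger

dictionary = Dictionary(texts)
corpus = [dictionary.doc2bow(t) for t in texts]

for k in (2, 3):  # candidate topic counts; the project used 10
    lda = LdaModel(corpus, num_topics=k, id2word=dictionary, random_state=0)
    cm = CoherenceModel(model=lda, corpus=corpus, dictionary=dictionary,
                        coherence="u_mass")  # u_mass needs only the corpus
    print(f"{k} topics: u_mass coherence = {cm.get_coherence():.3f}")

Higher (less negative) u_mass scores indicate topics whose top words co-occur more often, giving a rough yardstick for the kind of tuning described above.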
NEXT STEPS

For the most part, the more coherent results from the LDA and NMF topic modeling algorithms support anecdotal understanding of the primary themes in chat conversations. Currently, two members of the Research & Information Services unit, the department responsible for scheduling the chat reference service at the main library, are examining the model outputs to determine whether any of the results are strong enough at this stage to suggest changes to services or resources. They will also share the results with the chat coordinators at other libraries on campus in case the results indicate changes for them. Additionally, results will be shared with the library's Web Working Group, since repeated questions about the same services or locations may suggest the need to display them in a more prominent place on the library website or provide a more discoverable online path to them. Since this was a pilot project that used a fairly small data set, it is anticipated that years of transcripts, along with improved topic model implementation, will reveal even more significant and robust themes.

With the encouraging results of this pilot project, there is much to continue to explore.13 One future question is whether there are differences between fall and spring semesters. If some topics arise more frequently in one semester than the other, perhaps the library needs to offer more workshops during that semester. Alternatively, perhaps support materials should be created (such as handouts or online guides) that emphasize the related services and place them more prominently, while withdrawing or de-emphasizing them in the other semester. Another area for further analysis is how the topics that emerge in the late-night chat interactions compare to other times of day. This will help the library design more relevant training materials for the graduate assistants who staff those shifts, or potentially change who is staffing the shifts. Also of interest is comparing the text written by the chat operators versus the chat users, as this would further spotlight the terminology that patrons use. If patrons are using significantly different terms from staff, then modifying the language of the library's website may reduce confusion.

There are also improvements to make to the data cleaning process, such as better identifying when to remove stop words and when to remove punctuation. These steps weren't perfectly aligned, which is why, for example, the "ll" that appears in Topic 3 of the phrase-LDA results (table 2) is most likely a mutation of contractions like "I'll," "we'll," and "you'll." Generating "ll" as a word from multiple different contractions not only created a meaningless word, but since "ll" occurred more frequently than any unique contraction, it was potentially treated as more important by the topic modeling algorithms.
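One plausible remedy, offered as our suggestion rather than a fix from the project, is to expand contractions before punctuation is stripped, so that "we'll" never degenerates into a bare "ll":

import re

# Assumes straight apostrophes; curly quotes would need normalizing first.
CONTRACTIONS = {"'ll": " will", "'re": " are", "'ve": " have", "n't": " not"}

def expand_contractions(text):
    for suffix, expansion in CONTRACTIONS.items():
        text = re.sub(re.escape(suffix) + r"\b", expansion, text)
    return text

print(expand_contractions("we'll check if you're sure it isn't lost"))
# -> we will check if you are sure it is not lost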
CONCLUSION

This project has demonstrated that topic modeling is one possible way to employ automated methods to analyze chat reference, with mixed success. The library will continue to improve chat reference analysis based on this project experience. The authors hope that other libraries will use the lessons from this project and the code in GitHub as a starting point to employ similar analysis for their own chat reference. In fact, a related project at the University of Northern Iowa Library is evidence of growing interest in topic modeling of chat reference transcripts.14 Considering how frequently patrons use chat reference, it is important for libraries to explore and embrace whatever methods will allow them to assess and improve such services.

ACKNOWLEDGEMENTS

The authors wish to acknowledge the Research and Publication Committee of the University of Illinois at Urbana-Champaign Library, which provided support for the completion of this research. Many thanks are owed to Xinyu Tian, our practicum student, for the extensive work he did in identifying relevant literature and developing the project code.

NOTES

1 Jo Kibbee, David Ward, and Wei Ma, "Virtual Service, Real Data: Results of a Pilot Study," Reference Services Review 30, no. 1 (Mar. 1, 2002): 25–36, https://doi.org/10.1108/00907320210416519.

2 The library uses the READ scale (Reference Effort Assessment Data scale), which allows reference transactions to be translated into a numerical scale that takes into account the effort, skills, knowledge, teaching moment, techniques, and tools used by the staff in the transaction. See readscale.org for more information.

3 David Ward and M. Kathleen Kern, "Combining IM and Vendor-Based Chat: A Report from the Frontlines of an Integrated Service," Portal: Libraries and the Academy 6, no. 4 (Oct. 2006): 417–29, https://doi.org/10.1353/pla.2006.0058; JoAnn Jacoby et al., "The Value of Chat Reference Services: A Pilot Study," Portal: Libraries and the Academy 16, no. 1 (Jan. 2016): 109–29, https://doi.org/10.1353/pla.2016.0013; David Ward, "Using Virtual Reference Transcripts for Staff Training," Reference Services Review 31, no. 1 (2003): 46–56, https://doi.org/10.1108/00907320310460915.

4 Robin Brown, "Lifting the Veil: Analyzing Collaborative Virtual Reference Transcripts to Demonstrate Value and Make Recommendations for Practice," Reference & User Services Quarterly 57, no. 1 (Fall 2017): 42–47, https://doi.org/10.5860/rusq.57.1.6441; Maryvon Côté, Svetlana Kochkina, and Tara Mawhinney, "Do You Want to Chat? Reevaluating Organization of Virtual Reference Service at an Academic Library," Reference & User Services Quarterly 56, no. 1 (Fall 2016): 36–46, https://doi.org/10.5860/rusq.56n1.36; Donna Goda and Corinne Bisshop, "Frequency and Content of Chat Questions by Time of Semester at the University of Central Florida: Implications for Training, Staffing and Marketing," Public Services Quarterly 4, no. 4 (Dec. 2008): 291–316, https://doi.org/10.1080/15228950802285593; Kelsey Keyes and Ellie Dworak, "Staffing Chat Reference with Undergraduate Student Assistants at an Academic Library: A Standards-Based Assessment," The Journal of Academic Librarianship 43, no. 6 (2017): 469–78, https://doi.org/10.1016/j.acalib.2017.09.001; Michael Mungin, "Stats Don't Tell the Whole Story: Using Qualitative Data Analysis of Chat Reference Transcripts to Assess and Improve Services," Journal of Library & Information Services in Distance Learning 11, no. 1–2 (Jan. 2017): 25–36, https://doi.org/10.1080/1533290X.2016.1223965.

5 Shu Z. Schiller, "CHAT for Chat: Mediated Learning in Online Chat Virtual Reference Service," Computers in Human Behavior 65 (Dec. 2016): 651–65, https://doi.org/10.1016/j.chb.2016.06.053.

6 Ellie Kohler, "What Do Your Library Chats Say?: How to Analyze Webchat Transcripts for Sentiment and Topic Extraction," in Brick & Click Libraries Conference Proceedings (Maryville, MO: Northwest Missouri State University, 2017), 138–48, https://www.nwmissouri.edu/library/brickandclick/presentations/eproceedings.pdf.

7 Kohler, 141.
8 For example: Guan-Bin Chen and Hung-Yu Kao, "Re-Organized Topic Modeling for Micro-Blogging Data," in Proceedings of the ASE BigData & SocialInformatics 2015, ASE BD&SI '15 (New York, NY: ACM, 2015), 35:1–35:8, https://doi.org/10.1145/2818869.2818875.

9 X. Cheng et al., "BTM: Topic Modeling over Short Texts," IEEE Transactions on Knowledge and Data Engineering 26, no. 12 (Dec. 2014): 2,928–41, https://doi.org/10.1109/TKDE.2014.2313872.

10 For example: Chenliang Li et al., "Topic Modeling for Short Texts with Auxiliary Word Embeddings," in Proceedings of the 39th International ACM SIGIR Conference on Research and Development in Information Retrieval (ACM Press, 2016), 165–74, https://doi.org/10.1145/2911451.2911499.

11 We used the Python packages gensim, langid, nltk, numpy, pandas, re, sklearn, and stop_words for data cleaning and analysis.

12 Kohler, "What Do Your Library Chats Say?"

13 The library implemented new chat reference software after this project was completed, so analysis of chat conversations that took place after the spring 2018 semester will require a re-working of the data collection and cleaning processes.

14 HyunSeung Koh and Mark Fienup, "Library Chat Analysis: A Navigation Tool" (poster, Dec. 5, 2018), https://libraryassessment.org/wp-content/uploads/2018/11/58-KohFienup-LibraryChatAnalysis.pdf.

10973 ----

Information Security in Libraries: Examining the Effects of Knowledge Transfer

Tonia San Nicolas-Rocca and Richard J. Burkhard

Tonia San Nicolas-Rocca (tonia.sannicolas-rocca@sjsu.edu) is Assistant Professor in the School of Information at San Jose State University. Richard J. Burkhard (richard.burkhard@sjsu.edu) is Professor in the School of Information Systems and Technology in the College of Business at San Jose State University.

ABSTRACT

Libraries in the United States handle sensitive patron information, including personally identifiable information and circulation records. With libraries providing services to millions of patrons across the U.S., it is important that they understand the importance of patron privacy and how to protect it. This study investigates how knowledge transferred within an online cybersecurity education course affects library employee information security practices. The results of this study suggest that knowledge transfer does have a positive effect on library employee information security and risk management practices.

INTRODUCTION

Libraries across the U.S. provide a wide range of services and resources to society. Libraries of all types are viewed as important parts of their communities, offering a place for research, a place to learn about technology, a place to access accurate and unbiased information, and a place that inspires and sparks creativity. As a result, there were over 171 million registered public library users in the U.S. in 2016.1

A library is a collection of information resources and services made available to the community it serves. The American Library Association (ALA) affirms the ethical imperative to provide unrestricted access to information and to guard against impediments to open inquiry.2 Further, in all areas of librarianship, best practice leaves the library user in control of as many choices as possible.3 In a library, the right to privacy is the right to open inquiry without having the subject of one's interest examined or scrutinized by others.4 Many library resources require the use of a library card.
To obtain a library card in the U.S., one must provide official photo identification showing personally identifiable information (PII), such as name, address, telephone number, and email address. PII connects library users or patrons with, for example, the items they checked out and the websites they visited. As such, PII can build up an image of a library patron that could be used to assess the patron's character. In response, the ALA developed a policy concerning the confidentiality of PII about library users.5 Confidentiality extends to "information sought or received and resources consulted, borrowed, acquired or transmitted," and includes, but is not limited to, database search records, reference interviews, circulation records, interlibrary loan records, and other personally identifiable uses of library materials, facilities, or services.6 In more recent years, the ALA has further specified that the right of patrons to privacy applies to any information that can link "choices of taste, interest, or research with an individual."7 When library users recognize or fear that their privacy or confidentiality is compromised, true freedom of inquiry no longer exists. Therefore, it is imperative that libraries use extra care when handling patron personally identifiable information.

While librarians and other library employees may understand the importance of data protection, they generally don't have the resources available to assess information security risk, employ risk mitigation strategies, or offer security education, training, and awareness (SETA) programs. This is of particular concern as libraries increasingly have access to databases of both proprietary and personal information.8 SETA programs are risk mitigation strategies employed by organizations worldwide to increase and maintain end-user compliance with information security and privacy policies. In libraries, information systems are widely used to provide services to patrons; however, little is known about information security practices in libraries.9 Given the sensitivity of the data libraries handle and the lack of information security resources available to them, it is important for those currently working, or planning to work, in the library environment to develop the knowledge necessary to identify risks and to develop and employ risk mitigation strategies that protect the information and information resources they are entrusted with. Therefore, the research question in the present study is: How can cybersecurity education strengthen information security practices in libraries?

Currently, there is a dearth of research on information security practices in libraries.10 This is an important research gap to acknowledge given that patron privacy is fundamental to the practice of librarianship in the U.S., and advancements in technology, coupled with federal regulations, add to the challenges of keeping patron privacy safe.11 Thus, this study contributes to the current literature by evaluating the effects of knowledge transfer as a means to strengthen information security within libraries. Furthermore, this study offers a preliminary investigation as to whether knowledge utilization leads to motivation and participation in information security risk management activities within libraries.

The remainder of this paper proceeds as follows: First, a review of knowledge transfer is covered.
A description of the cybersecurity course, including students and course material, is then provided. Data collection and analysis are then presented. This is followed by a discussion of the findings, limitations, and future research.

LITERATURE REVIEW

Knowledge Transfer in SETA

Knowledge transfer through SETA programs plays a key role in the development and implementation of cybersecurity practices.12 Knowledge is transferred when learning takes place and when the recipient of that knowledge understands the intricacies and implications associated with that knowledge so that he or she can apply it.13 For example, in a security education program, an educator may transfer knowledge about information security risks to users, who learn and apply the knowledge to increase patron privacy. The knowledge is applied when users are able to identify risks to patron data and implement risk mitigation strategies that serve to protect patron information and information system assets. Knowledge transfer can be influenced by four factors: absorptive capacity, communication, motivation, and user participation.14 This study evaluates the extent to which knowledge transferred from a cybersecurity course strengthens information security practices within libraries, adapting the theoretical model proposed by Spears and San Nicolas-Rocca (2015) (see figure 1) to examine the effects of cybersecurity education on information security practices in libraries.15

Figure 1. Factors of Knowledge Transfer Leading to Knowledge Utilization.

Absorptive Capacity

Absorptive capacity is the ability of a recipient to recognize the importance and value of externally sourced knowledge, assimilate it, and apply it; it has been found to be positively related to knowledge transfer.16 Activating a student's prior knowledge can enhance their ability to process new information.17 That is, knowledge transfer is more likely to take place between the instructor and students enrolled in a cybersecurity course if the students have existing knowledge or experience in some related area. For the present study, students stated that prior to enrolling in the cybersecurity course they had little to no knowledge of cybersecurity. One student mentioned, "While I am the director of a small academic library, I have no understanding of cybersecurity. I am taking this course to learn about cybersecurity so that I can better secure the library I work in and to share the information with those who work in the library." Another student mentioned, "My goal is to work in a public library after graduation. I am taking this course because I keep hearing about cybersecurity breaches in the news, and I want to learn more about cybersecurity because I think it will help me in my future job." While none of the students enrolled in the course had cybersecurity experience, all of them had some understanding of principle 3 of the ALA Code of Ethics, which states, "We protect each library user's right to privacy and confidentiality with respect to information sought or received and resources consulted, borrowed, acquired or transmitted."18 Understanding of principle 3 of the Code of Ethics demonstrates existing knowledge in a related area with regard to cybersecurity, albeit limited knowledge. Given this understanding, students should have the ability to process new information from the cybersecurity course.
Communication

The success of any SETA program depends on the ability of the instructor to effectively communicate the applicability and practical purpose of the material to be mastered, as distinguished from abstract or conceptual learning.19 According to current research, knowledge transfer can only occur if communication is effective in terms of type, amount, competence, and usefulness.20 For the present study, students were enrolled in an online graduate-level cybersecurity course at a university we call Mountain View University (MVU); we changed the name to protect the privacy of the research participants. While research suggests that the best form of communication for knowledge transfer is face-to-face communication, the cybersecurity course at MVU is only offered online.21 Therefore, communication relating to the course was conducted via course management software, email, video conferencing, a discussion board, and pre-recorded videos.

Motivation

Motivation can be a significant influence on knowledge transfer.22 That is, an individual's motivation to participate in SETA programs has been found to influence the extent to which knowledge is transferred.23 Specifically, without motivation, a trainee may fail to use information shared with them about methods used to protect and safeguard patron privacy. In the present study, research participants voluntarily enrolled in the cybersecurity course. The cybersecurity course is not a core course or a class required for graduation; therefore, enrolling in the course implies motivation to learn about cybersecurity by participating in course activities and completing assigned work.

User Participation

User participation in information security activities may influence effective knowledge transfer initiatives.24 According to previous research, when users participated in cybersecurity activities, security safeguards were more aligned with organizational objectives and were more effectively designed and performed within the organization.25 For the present study, given that students enrolled in the cybersecurity course, it is expected that they will participate in information security risk management activities, such as the completion of personal and organizational risk management projects.

CYBERSECURITY COURSE INFORMATION

This study examines whether cybersecurity education strengthens information security practices within libraries. Based on the model in figure 1, students enrolled in the cybersecurity course (motivation) and were therefore expected to participate in all course activities and complete assigned work (user participation), such as the ISRM assignments described in the Course Material section below. As per figure 2, the cybersecurity course was offered online and used multiple forms of communication, including email, video conferencing, a discussion board, and pre-recorded videos (communication); students were able to access these resources through Canvas, a learning management system. Students came into the class with some understanding of principle 3 of the ALA Code of Ethics, and given that this knowledge is in a "related area," they may be able to process new information relating to cybersecurity (absorptive capacity). As per the above information and as depicted in figure 1, motivation, user participation, communication, and absorptive capacity will lead to knowledge transfer.
Therefore, this study will focus on how knowledge transfer, as a means to strengthen information security, leads to knowledge utilization by cybersecurity students within information organizations. Specifically, this study will explore the possibility of knowledge utilization leading to motivation and participation in ISRM initiatives in libraries.

Figure 2. Knowledge Transfer Elements: Cybersecurity Knowledge Transfer for Information Organizations.

Course Material

The course was offered to graduate students at Mountain View University. Course material was created based on National Institute of Standards and Technology Special Publications (NIST SP) 800-53 and 800-60, as well as Federal Information Processing Standards (FIPS) Publications 199 and 200. The focus of the course was information security risk management (ISRM). Course requirements included lab exercises, discussion posts relating to current cybersecurity findings and news reports, and ISRM assignments. ISRM assignments included a personal risk management assignment, which then led to the completion of an organizational risk management project (ORMP). Students completed the ORMP for various libraries, healthcare institutions, pharmaceutical companies, government organizations, and small businesses. With instructor approval, students were allowed to select the organization they wanted to work with. The objective of the course was for students to obtain an understanding of ISRM and be able to apply what they learned in the workplace.
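These publications drive the central exercise of such a course: FIPS 199 categorizes each information type by its potential impact (LOW, MODERATE, or HIGH) on confidentiality, integrity, and availability, and FIPS 200 then applies the high-water mark across those ratings to set the system's impact level. A small illustration of that calculation (the ratings for patron records are our hypothetical example, not from the article):

LEVELS = {"LOW": 1, "MODERATE": 2, "HIGH": 3}

# Hypothetical FIPS 199 ratings a student might assign to patron records.
patron_records = {
    "confidentiality": "MODERATE",  # PII and circulation history
    "integrity": "MODERATE",        # corrupted records misdirect service
    "availability": "LOW",          # short outages are tolerable
}

def high_water_mark(*info_types):
    """Return the highest impact rating across all rated information types."""
    worst = max(LEVELS[rating] for t in info_types for rating in t.values())
    return next(name for name, rank in LEVELS.items() if rank == worst)

print(high_water_mark(patron_records))  # -> MODERATE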
Course Communication

SETA programs depend strongly on the ability of the knowledge source to effectively communicate the importance and applicability of the knowledge shared. Current research suggests that the type of communication medium, the relevance and usefulness of the information, and the competency of the instructor can all affect knowledge transfer. Given that face-to-face communication is considered the best method for successful knowledge transfer, it is important to understand whether online communication methods were effective in the cybersecurity course described here, as the main focus of this study is to determine if knowledge transfer leads to knowledge utilization. According to table 1, respondents "Strongly Agree" or "Agree" that the materials used, the relevance of communication, the comprehensibility of instructor communication, and the amount of time spent communicating about cybersecurity in the course were effective (data collection is described in the Data Collection and Analysis section below).

Medium: "The material used in the cybersecurity course I took at MVU communicated security lessons effectively." Strongly Agree 12 (50%); Agree 12 (50%); Neither Agree nor Disagree 0 (0.00%); Disagree 0 (0.00%); Strongly Disagree 0 (0.00%)
Relevance: "Communication during the cybersecurity course I took at MVU was effective in focusing on things I needed to know about cybersecurity for my job." Strongly Agree 10 (45.45%); Agree 12 (54.55%); Neither Agree nor Disagree 0 (0.00%); Disagree 0 (0.00%); Strongly Disagree 0 (0.00%)
Comprehension: "In the cybersecurity course I took at MVU, the instructor's oral and/or written communication with me was understandable." Strongly Agree 12 (54.55%); Agree 10 (45.45%); Neither Agree nor Disagree 0 (0.00%); Disagree 0 (0.00%); Strongly Disagree 0 (0.00%)
Amount: "In the cybersecurity course I took at MVU, the amount of time communicating about cybersecurity was sufficient." Strongly Agree 12 (54.55%); Agree 10 (45.45%); Neither Agree nor Disagree 0 (0.00%); Disagree 0 (0.00%); Strongly Disagree 0 (0.00%)

Table 1. Effectiveness of communication in cybersecurity course.

DATA COLLECTION AND ANALYSIS

The purpose of this study is to determine if knowledge transfer through cybersecurity education, as a means to strengthen information security, leads to knowledge utilization within libraries. Specifically, this study examines whether research participants engage in ISRM activities after completion of the cybersecurity education course. The model in figure 1 is examined via a survey instrument developed by the authors. The survey instrument was available to former students who completed an online, semester-long cybersecurity course from fall 2013 through fall 2017. One hundred twenty-six former students completed one of eight cybersecurity courses, and all were asked to participate in this study. Thirty-nine students accessed the survey, but only thirty-eight agreed to participate. Of those who agreed to participate in the survey, only twenty-two work in a library in the U.S. or a U.S. territory. Of the other sixteen participants, twelve do not currently work within a library environment, and four do not have a job. Therefore, responses from the twenty-two research participants who work in a library in the U.S. or a U.S. territory are reported in this study. Table 2 lists the types of libraries these twenty-two research participants work in.

Type of Library Environment (22 responses):
Academic Library: 3 (13.64%)
Public Library: 11 (50%)
School Library (K-12): 2 (9.09%)
Special Library: 6 (27.27%)

Table 2. Types of libraries research participants work in.

Having knowledge and an understanding of information security policies, work processes, and information and information system use within a library environment, a knowledge recipient may understand the value of the knowledge shared with them through effective SETA programs and utilize the new knowledge to protect information and information resources. According to table 3, most survey participants stated that they have average to excellent knowledge of their library's computing-related policies, work processes that handle sensitive patron information, how access to patron information is granted, and how internal staff tend to use computing devices to access organizational information. A few respondents stated that their knowledge is below average.

"How would you rate your knowledge of your organization's computing-related policies for internal staff computer usage?" Excellent 4 (18.18%); Above Average 10 (45.45%); Average 8 (36.36%); Below Average 0 (0.00%); Poor 0 (0.00%)
"How would you rate your knowledge of your library's work processes that handle sensitive patron information?" Excellent 4 (18.18%); Above Average 11 (50%); Average 6 (27.27%); Below Average 1 (4.55%); Poor 0 (0.00%)
"Within the organization you work for, how would you rate your knowledge of how access to patron information is granted?" Excellent 3 (13.64%); Above Average 12 (54.55%); Average 5 (22.73%); Below Average 2 (9.10%); Poor 0 (0.00%)
"How would you rate your knowledge of how internal staff tend to use computing devices to access organizational information?" Excellent 2 (9.10%); Above Average 11 (50%); Average 8 (36.36%); Below Average 1 (4.55%); Poor 0 (0.00%)

Table 3. Knowledge of organization's computing-related policies.

Knowledge Transfer

For this study, knowledge transfer is measured as the extent to which the cybersecurity student acquired knowledge of, or came to understand, the key educational objectives.
According to table 4, all survey participants stated that during the cybersecurity course they acquired knowledge on information security risks and on solutions to manage information security risks within organizations. Furthermore, 91 percent of the twenty-two survey participants stated that they gained an understanding of both the feasibility of implementing solutions and the potential impact of not implementing solutions to manage information security risk within the organizations in which they work. This is consistent with previous research that has measured knowledge transfer.26

Question: During the cybersecurity course I took at MVU, I ____.
acquired knowledge on information security risks within the organization: 22 (100%)
acquired knowledge on solutions to manage information security risks identified within my organization: 22 (100%)
gained an understanding of the feasibility to implement solutions to manage information security risks identified within my organization: 20 (90.90%)
gained an understanding of the potential impact of not implementing solutions to manage information security risks identified within my organization: 20 (90.90%)

Table 4. Indicators of Knowledge Transfer.

Knowledge Utilization

The desired outcome of knowledge transfer is knowledge utilization.27 This study is interested in the extent to which cybersecurity students have engaged in information security risk management initiatives in their workplaces since the completion of the cybersecurity course. According to table 5, twelve of the twenty-two survey participants have utilized the knowledge transferred to them from the cybersecurity course within the libraries in which they work. Of the twelve survey participants, ten performed security procedures within the organization on an ad hoc, informal basis. Seven worked on defining new or revised security policies. Four implemented new or revised security procedures for organizational staff to follow, and two evaluated at least one security safeguard to determine whether it is being followed by organizational staff.

Question: Since the completion of the cybersecurity course I took at MVU, I have ____ (please check all that apply).
performed security procedures within the organization on an ad hoc, informal basis: 10 (83.33%)
worked on defining new or revised security policies: 7 (58.33%)
implemented new or revised security procedures for organizational staff to follow: 4 (33.33%)
evaluated at least one security safeguard to determine whether it is being followed by organizational staff: 2 (16.66%)
NOT performed any security procedures within the organization: 10 (45.45%)

Table 5. Indicators of knowledge utilization in the library.

Participation

Knowledge transfer through cybersecurity education may influence a cybersecurity student to utilize the knowledge they have gained by participating in ISRM activities. According to table 6, sixteen of the twenty-two survey participants have participated in ISRM activities in the library in which they work since the completion of the cybersecurity course. Fifteen communicated with internal peers or staff on training materials, and seven communicated with internal senior management on training materials. Seven performed a policy review. Five worked on a security questionnaire, one had an interview with an external collaborator, and one analyzed their library's business or IT process workflow.
Question: Since the completion of the cybersecurity course you took at MVU, have you performed any of the following activities within the workplace? (please check all that apply)
Security questionnaire: 5 (31.25%)
Interview with external collaborator (i.e., trainers): 1 (6.25%)
Policy review: 7 (43.75%)
Business or IT process workflow analysis: 1 (6.25%)
Communication with internal peers or staff on training materials: 15 (93.75%)
Communication with internal senior management on training materials: 7 (43.75%)
I have NOT performed any security activities in my workplace: 6 (14.29%)

Table 6. Participation in ISRM activities.

Participation may also include discussions of ISRM activities. According to table 7, sixteen of the twenty-two survey participants have participated in discussions on ISRM activities within the libraries they currently work in. Fifteen participated in discussions on physical security, and ten had discussions on password policy. Seven had discussions on user provisioning, and six had discussions on encryption. Four had discussions on mobile devices, and another four had discussions on vendor security.

Question: Since the completion of the cybersecurity course you took at MVU, have you participated in discussions on the following areas of security? (Check all that apply)
Password policy: 10 (62.5%)
User provisioning (i.e., establishing or revoking user logons and system authorization): 7 (43.75%)
Mobile device: 4 (25%)
Encryption: 6 (37.5%)
Vendor security: 4 (25%)
Physical security: 15 (93.75%)
Disaster recovery, business continuity, or security incident response: 6 (37.50%)
I have NOT participated in any discussions relating to security in my workplace: 6 (27.27%)

Table 7. Participation in discussions on ISRM activities.

Participation in cybersecurity education may also lead to formal responsibility or accountability for ISRM activities. According to table 8, nine of the twenty-two survey respondents stated that since the completion of the cybersecurity course they are formally responsible or accountable for ISRM in the libraries in which they work. Three research participants are responsible for identifying organizational members to participate in cybersecurity training. Five stated that they are responsible for communicating results on cybersecurity training to upper management, and five to peers or staff. Three are responsible for organizational compliance with government regulations. Two are responsible for communicating organizational risk to the board of directors, and one is responsible for organizational compliance with funder requirements.
(Check all that apply)
    Identifying organizational members to participate in cybersecurity training: 3 (33.33%)
    Communicating results to upper management: 5 (55.56%)
    Communicating results to peers or staff: 5 (55.56%)
    Responsible for organizational compliance with funder requirements: 1 (11.11%)
    Responsible for organizational compliance with government regulations: 3 (33.33%)
    Responsible for internal audit: 0 (0%)
    Responsible for communicating organizational risk to the board of directors: 2 (22.22%)
    I am NOT formally responsible for security in my workplace: 13 (59.10%)
Table 8. Participation via accountability for ISRM activities. (Item percentages are calculated against the nine formally responsible participants; the final row is calculated against all twenty-two respondents.)

Motivation

An objective of SETA programs is to motivate knowledge recipients to comply with information security policies that serve to protect information and information resources. As such, cybersecurity education may motivate students to comply with organizational information security policies. According to table 9, since the completion of the cybersecurity course, eighteen of the twenty-two survey participants stated that they believe it is important to protect patrons' sensitive data. Two respondents stated that they wholeheartedly feel responsible to protect their patrons from harm, and another two stated that they would be embarrassed if their organization experienced a data breach.

Since the completion of the cybersecurity course I took at MVU, _________.
    I wholeheartedly feel responsible to protect our patrons from harm: 2 (9.10%)
    I believe it is important to protect our patrons' sensitive data: 18 (81.82%)
    I would be embarrassed if my organization experienced a data breach: 2 (9.10%)
    my job could be in jeopardy if my organization were to experience a data breach: 0 (0.00%)
    I do NOT care about cybersecurity in my organization: 0 (0.00%)
Table 9. Motivation to protect patron privacy.

DISCUSSION

The purpose of this study was to evaluate the effects of knowledge transfer as a means to strengthen information security within libraries. Given the results from the survey instrument, the findings suggest that knowledge transfer through cybersecurity education can lead to knowledge utilization. Specifically, knowledge transfer through cybersecurity education may influence a library employee to utilize the knowledge they have gained by participating in discussions about ISRM activities and by taking on accountability and responsibility for them, as well as by participating in SETA programs. SETA programs are implemented within organizations as a means to increase compliance with information security policies.

The findings suggest that library employees who completed a cybersecurity education course believe that it is important to, or feel that they have a responsibility to, protect patrons' private information. A couple of research participants stated that they would feel embarrassed if their organization experienced a data breach. A student enrolled in a cybersecurity education course may develop an understanding of, and come to value, the information that is passed on from the knowledge source about ISRM activities. With ongoing development and implementation of SETA programs, activating a student's prior knowledge of ISRM activities could enhance their ability to process new information and apply it to their job.

LIMITATIONS AND FUTURE RESEARCH

This research was conducted based on an online cybersecurity course offered at a university located in the western U.S.
Therefore, future research is needed to study how cybersecurity courses in other parts of the U.S. and internationally affect knowledge transfer as a means to strengthen ISRM initiatives in libraries and other information organizations. It would also be valuable to conduct a modified version of this research within a classroom-based, face-to-face cybersecurity course. Furthermore, studying SETA programs implemented in libraries in the United States and internationally would add to this research area.

There were 126 potential research participants identified, and although all were asked to participate, only thirty-eight completed the online survey. Of the thirty-eight completed surveys, responses from twenty-two participants were reported in this article. Participation from additional research participants may have generated different results.

While a major limitation of this study is its small, exploratory pilot design, a next phase of research should further investigate what type of SETA programs would be most effective in different library environments. While cybersecurity education may not be feasible for all library employees to obtain, examining and implementing the most effective SETA program for each library environment could strengthen cybersecurity practices in libraries across the U.S. A future study instrument should take into account the factors that influence knowledge transfer (absorptive capacity, communication, motivation, and user participation) as a means to strengthen ISRM practices. A common and important outcome for SETA programs is user compliance with information security policies. As such, a future study should test library employee knowledge of, and compliance with, information security policies.

CONCLUSION

U.S. libraries handle sensitive patron information, including personally identifiable information and circulation records. With libraries providing services to millions of patrons across the United States, it is important that they understand the importance of patron privacy and how to protect it. This study investigated how knowledge transferred within an online cybersecurity education course, offered as a means to strengthen information security risk management, affects library employees' information security practices. The results of this study suggest that knowledge transfer does have a positive effect on library employees' information security and risk management practices.

REFERENCES

1 “Public Library Survey (PLS) Data and Reports,” Institute of Museum and Library Services, accessed June 10, 2018, https://www.imls.gov/research-evaluation/data-collection/public-libraries-survey/explore-pls-data/pls-data.

2 “Policy concerning Confidentiality of Personally Identifiable Information about Library Users,” American Library Association, July 7, 2006, http://www.ala.org/advocacy/intfreedom/statementspols/otherpolicies/policyconcerning; “Professional Ethics,” American Library Association, May 19, 2017, http://www.ala.org/tools/ethics.

3 “Privacy: An Interpretation of the Library Bill of Rights,” American Library Association, amended July 1, 2014, http://www.ala.org/advocacy/intfreedom/librarybill/interpretations/privacy.

4 Ibid.

5 “Policy concerning Confidentiality of Personally Identifiable Information about Library Users,” American Library Association; “Code of Ethics of the American Library Association,” American Library Association, amended Jan.
22, 2008, http://www.ala.org/advocacy/proethics/codeofethics/codeethics.

6 “Policy concerning Confidentiality of Personally Identifiable Information about Library Users,” American Library Association; “Code of Ethics of the American Library Association,” American Library Association.

7 “Privacy: An Interpretation of the Library Bill of Rights,” American Library Association.

8 Samuel T.C. Thompson, “Helping the Hacker? Library Information, Security, and Social Engineering,” Information Technology and Libraries 25, no. 4 (2006): 222-25, https://doi.org/10.6017/ital.v25i4.3355.

9 Roesnita Ismail and Awang Ngah Zainab, “Assessing the Status of Library Information Systems Security,” Journal of Librarianship and Information Science 45, no. 3 (2013): 232-47, https://doi.org/10.1177/0961000613477676.

10 Ibid.

11 Shayna Pekala, “Privacy and User Experience in 21st Century Library Discovery,” Information Technology and Libraries 36, no. 2 (2017): 48-58, https://doi.org/10.6017/ital.v36i2.9817.

12 Tonia San Nicolas-Rocca, Benjamin Schooley, and Janine L. Spears, “Exploring the Effect of Knowledge Transfer Practices on User Compliance to IS Security Practices,” International Journal of Knowledge Management 10, no. 2 (2014): 62-78, https://doi.org/10.4018/ijkm.2014040105; Janine Spears and Tonia San Nicolas-Rocca, “Knowledge Transfer in Information Security Capacity Building for Community-Based Organizations,” International Journal of Knowledge Management 11, no. 4 (2015): 52-69, https://doi.org/10.4018/IJKM.2015100104.

13 Dong-Gil Ko, Laurie J. Kirsch, and William R. King, “Antecedents of Knowledge Transfer from Consultants to Clients in Enterprise System Implementations,” MIS Quarterly 29, no. 1 (2005): 59-85, https://doi.org/10.2307/25148668.

14 Spears and San Nicolas-Rocca, “Knowledge Transfer in Information Security Capacity Building for Community-Based Organizations,” 52-69; Dana Minbaeva et al., “MNC Knowledge Transfer, Subsidiary Absorptive Capacity and HRM,” Journal of International Business Studies 45, no. 1 (2014): 38-51, https://doi.org/10.1057/jibs.2013.43; Geordie Stewart and David Lacey, “Death by a Thousand Facts: Criticising the Technocratic Approach to Information Security Awareness,” Information Management & Computer Security 20, no. 1 (2012): 29-38, https://doi.org/10.1108/09685221211219182; Mark Wilson et al., “Information Technology Security Training Requirements: A Role- and Performance-Based Model” (NIST Special Publication 800-16), National Institute of Standards and Technology (2018), https://www.nist.gov/publications/information-technology-security-training-requirements-role-and-performance-based-model; San Nicolas-Rocca, Schooley, and Spears, “Exploring the Effect of Knowledge Transfer Practices on User Compliance to IS Security Practices,” 62-78.

15 Spears and San Nicolas-Rocca, “Knowledge Transfer in Information Security Capacity Building for Community-Based Organizations,” 52-69.

16 Janine L. Spears and Henri Barki, “User Participation in Information Systems Security Risk Management,” MIS Quarterly 34, no. 3 (2010): 503-22, https://doi.org/10.2307/25750689; Piya Shedden, Tobias Ruighaver, and Atif Ahmad, “Risk Management Standards: The Perception of Ease of Use,” Journal of Information Systems Security 6, no. 3 (2010): 23-41.

17 Shedden, Ruighaver, and Ahmad, “Risk Management Standards: The Perception of Ease of Use,”
23-41; Janne Hagen, Eirik Albrechtsen, and Stig Ole Johnsen, “The Long-term Effects of Information Security e-Learning on Organizational Learning,” Information Management & Computer Security 19, no. 3 (2011): 140-54, https://doi.org/10.1108/09685221111153537.

18 “Code of Ethics of the American Library Association,” American Library Association.

19 Spears and San Nicolas-Rocca, “Knowledge Transfer in Information Security Capacity Building for Community-Based Organizations,” 52-69; Wilson et al., “Information Technology Security Training Requirements: A Role- and Performance-Based Model” (NIST Special Publication 800-16).

20 Thompson S.H. Teo and Anol Bhattacherjee, “Knowledge Transfer and Utilization in IT Outsourcing Partnerships: A Preliminary Model of Antecedents and Outcomes,” Information & Management 51, no. 2 (2014): 177-86, https://doi.org/10.1016/j.im.2013.12.001; Ko, Kirsch, and King, “Antecedents of Knowledge Transfer from Consultants to Clients in Enterprise System Implementations,” 59-85; Minbaeva et al., “MNC Knowledge Transfer, Subsidiary Absorptive Capacity and HRM,” 38-51; Stewart and Lacey, “Death by a Thousand Facts: Criticising the Technocratic Approach to Information Security Awareness,” 29-38.

21 Martin Spraggon and Virginia Bodolica, “A Multidimensional Taxonomy of Intra-firm Knowledge Transfer Processes,” Journal of Business Research 65, no. 9 (2012): 1273-82, https://doi.org/10.1016/j.jbusres.2011.10.043; Shizhong Chen et al., “Toward Understanding Inter-organizational Knowledge Transfer Needs in SMEs: Insight from a UK Investigation,” Journal of Knowledge Management 10, no. 3 (2006): 6-23, https://doi.org/10.1108/13673270610670821.

22 Maryam Alavi and Dorothy E. Leidner, “Review: Knowledge Management and Knowledge Management Systems: Conceptual Foundations and Research Issues,” MIS Quarterly 25, no. 1 (2001): 107-36, https://doi.org/10.2307/3250961.

23 Ko, Kirsch, and King, “Antecedents of Knowledge Transfer from Consultants to Clients in Enterprise System Implementations,” 59-85.

24 San Nicolas-Rocca, Schooley, and Spears, “Exploring the Effect of Knowledge Transfer Practices on User Compliance to IS Security Practices,” 62-78; Spears and San Nicolas-Rocca, “Knowledge Transfer in Information Security Capacity Building for Community-Based Organizations,” 52-69.

25 Spears and San Nicolas-Rocca, “Knowledge Transfer in Information Security Capacity Building for Community-Based Organizations,” 52-69; Spears and Barki, “User Participation in Information Systems Security Risk Management,” 503-22.

26 San Nicolas-Rocca, Schooley, and Spears, “Exploring the Effect of Knowledge Transfer Practices on User Compliance to IS Security Practices,” 62-78; Janine L. Spears and Tonia San Nicolas-Rocca, “Information Security Capacity Building in Community-Based Organizations: Examining the Effects of Knowledge Transfer,” 49th Hawaii International Conference on System Sciences (HICSS), Koloa, HI, 2016, 4011-20, https://doi.org/10.1109/HICSS.2016.498; Ko, Kirsch, and King, “Antecedents of Knowledge Transfer from Consultants to Clients in Enterprise System Implementations,” 59-85.
27 Ko, Kirsch, and King, “Antecedents of Knowledge Transfer from Consultants to Clients in Enterprise System Implementations,” 59-85; Teo and Bhattacherjee, “Knowledge Transfer and Utilization in IT Outsourcing Partnerships: A Preliminary Model of Antecedents and Outcomes,” 177-86.

10974 ---- 20190318 10974 galley

Public Libraries Leading the Way

The Democratization of Artificial Intelligence: One Library's Approach

Thomas Finley

Thomas Finley (tfinley@friscotexas.gov) is Adult Services Manager, Frisco Public Library.

Chances are that before you read this article, you probably checked your email, used a mapping app to find your way, or typed a search term online. Without your even perceiving it, artificial intelligence (AI) has already helped you to accomplish something today. Email spam filters use variants of AI to help cut down on harmful or useless emails in your inbox.1 With AI doing the fact-crunching, mapping apps quickly preview the best route based on a myriad of factors. Search engine companies like Google have been using AI to suggest or produce results faster, and for longer than anyone outside of the company really knew until recently.2 According to a recent study by Northeastern University and Gallup, 85% of Americans are already using AI products.3

The true revelation behind these recent technological developments may not be the fact that AI is already embedded into the fabric of our modern lives. The real surprise might just be the sudden ubiquitous availability (and approachability) of AI tools for all. As Google's former Chief Scientist of AI and Machine Learning, Fei-Fei Li, said in 2017, “The next step for AI must be democratization, lowering the barriers of entry, and making it available to the largest possible community of developers, users and enterprises.”4 This sounds a lot like most public libraries' mission statements. As with other important workforce development efforts, libraries are uniquely placed to participate in this new revolution as key platforms for the discovery and dissemination of emerging tech knowledge. At the Frisco Public Library (https://www.friscolibrary.com), we saw this AI trend surfacing, we saw AI as a critical future job skill, and we investigated ways to introduce our patrons to this space. As such, the Frisco Public Library has leveraged readily available technology in a cost-effective way that has engaged community interest. Our efforts are also replicable and scalable in terms of multi-nodal experiences both at home and in classroom-based learning.

SOME BASIC DEFINITIONS

Let's take a few steps back to give some broad definitions and boundaries to the scope of AI. According to the Oxford English Dictionary, artificial intelligence is “the capacity of computers or other machines to exhibit or simulate intelligent behavior.”5 In the literature, you will find a further distinction between General AI, Narrow AI, and something called Machine Learning.6

General AI is something that begins to look like science fiction: an artificial intelligence that learns how to learn, then is able to generalize what it has learned and apply that knowledge to a different case. In advanced examples of General AI, scientists are thinking of not putting a specific problem in front of a General AI program to solve; rather, they are giving it an entire dataset so the program itself can choose what problems it should work on,
removing the limited point of view of whoever programs the program.7

Narrow AI is easier to understand because it is what we interact with the most in our day-to-day lives. It is what powers those little speed-ups that help us do things faster every day: search through our emails to help us avoid spam, translate speech to text when we dictate a message on a smartphone, or help us parallel park a car at the touch of a button. Narrow AI accomplishes a specific task extremely fast and accurately, and thus becomes an extension and multiplier of our own human productivity.

A lot of these Narrow AI activities are based in a type of artificial intelligence called Machine Learning (ML). ML is a set of very complex processes that can review large sets of information; create and train models based on this data; make predictions of what will happen next; and then refine that data for better future results.8 Machine Learning is the focus of our efforts at the Frisco Public Library for two main reasons: (1) it is what has been made available through free tools such as Google's open AI resources, and (2) it makes AI attainable in a library setting.

OUR APPROACH: MAKERSPACES FOR EVERYONE, AT HOME

The Frisco Public Library has had four years of success with circulating makerspace technology in reasonably priced, hard-shell waterproof boxes with foam inserts. Each kit is cataloged, RFID tagged, security tagged, and sealed with zip ties to enable self-checkouts (zip ties can be easily cut open at home but prevent items from disappearing in the library). These cases are easy to handle and can take some abuse while protecting their contents. This is important because we circulate about 20 different kinds of robotics kits, no-soldering circuitry kits, 3D scanning kits, programming kits, and Internet of Things kits. Most kits contain the theme item with quick start guides, instruction booklets, and a book to inspire advanced learning. We call these Maker Kits, and we have about 150 total. In our community, they are wildly popular and have circulated more than 4,000 times since their introduction in January 2016.9

AIY: ARTIFICIAL INTELLIGENCE KITS FOR EVERYONE

In 2017, Google released their maker-focused AIY Voice project kit (where AIY is a catchy riff on Do-It-Yourself: Artificial Intelligence Yourself). The kit consists of several components, pairing a Raspberry Pi (an entry-level computer) with a small speaker, housed in a cardboard box with a button prominently placed on top.10 The result is a stripped-down version of an Amazon Echo or Google Home device: essentially a smart speaker. Although the AIY Voice kit is not necessarily initially set up to play music, it is designed to take voice commands like the other products on the market. With a minimum of Python coding expertise, AIY kits enable mass participation in artificial intelligence. There isn't even any soldering required to put this kit together! This is 100% in line with the remarks of Fei-Fei Li (Google's former Chief Scientist for AI and ML) about the need to democratize AI. Google has since released another kit called AIY Vision that uses similar components paired with a camera. More information on the kits can be found at https://aiyprojects.withgoogle.com/.
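To give a sense of just how little code a first voice-command project requires, the sketch below shows a press-Enter-and-speak recognition loop of the kind these kits invite. It is a minimal illustration, not the kit's own demo code: it uses the general-purpose SpeechRecognition Python package (and its recognize_google call to Google's free web speech API) as a stand-in for the AIY library, so it will run on any laptop with a microphone. The package choice and the "goodbye" exit phrase are assumptions made for this example.

    # Minimal voice-recognition loop in the spirit of the AIY Voice kit.
    # Stand-in for the AIY library: uses the general-purpose SpeechRecognition
    # package instead (pip install SpeechRecognition pyaudio).
    import speech_recognition as sr

    def main():
        recognizer = sr.Recognizer()
        microphone = sr.Microphone()
        print("Say 'goodbye' to quit.")
        while True:
            input("Press Enter, then speak...")
            with microphone as source:
                # Calibrate briefly for room noise, then capture one utterance.
                recognizer.adjust_for_ambient_noise(source, duration=0.5)
                audio = recognizer.listen(source)
            try:
                # Send the audio to Google's free web speech API for transcription.
                text = recognizer.recognize_google(audio)
                print("You said:", text)
                if "goodbye" in text.lower():
                    break
            except sr.UnknownValueError:
                print("Sorry, I couldn't make that out.")
            except sr.RequestError as err:
                print("Speech service error:", err)

    if __name__ == "__main__":
        main()

On the kit itself, the equivalent loop would be written against Google's AIY Python library and tied to the box's physical button, but the scale is the same: a couple dozen lines of beginner-level Python are enough to get spoken words into a program.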
FRISCO PUBLIC LIBRARY'S ARTIFICIAL INTELLIGENCE MAKER KITS

Based on our previous experience with other Maker Kits, we made a few modifications to the original Google design that most librarians with access to a 3D printer can accomplish. The original AIY Voice kit uses a punch-out cardboard box to fold and envelop the device. Apart from being an extremely cost-effective way of making a box, there also seems to be a delicious irony (and message) in the contrast of cardboard, a cheap and widely available material, with the advanced tech of AI. Durability being our priority, we knew we needed to upgrade this aspect of Google's original design. Our Maker Librarian, Adam Lamprecht, quickly found a shared design file uploaded to the website www.thingiverse.com, which he modified to better suit our needs (see figure 1).11

Figure 1. AI Maker Kits with 3D printed AIY Voice device.

We then printed these in a variety of colors on our 3D printers and modified the grid-patterned foam inserts to make room for the device and a few other items (see figure 2). We are currently circulating 21 of these kits without major incident.

Figure 2. Interior view of the kits.

LIBRARY INSTRUCTION: PYTHON AS A WINDOW ONTO ARTIFICIAL INTELLIGENCE

Our basic artificial intelligence classes have been key in the introduction of this technology to the public. We reserve 10 kits for a class and pair them with classroom laptops for ease of use. The structure of the class provides a short introduction to the technology and then walks participants through a basic voice recognition coding challenge. All of this is accomplished in Python. Python is great for beginning coders because it is easier to learn than other programming languages, takes less time to write lines of code, and can telescope up into a very large number of projects and applications.12 In fact, according to Neal Ford, Director and Software Architect at Thoughtworks, Python "is very good at solving bigger kinds of problems."13 So with Python, a beginning learner has a programming language that continues to be useful beyond the classroom and into the world of work or school. Python provides another important advantage: it "provides the front-end method of how to hook into Google's open AI," states tech writer Serdar Yegulalp.14 It is this combination of a free, accessible coding language with the powerful (and also free) resources of Google's open AI that truly lowers the barrier to entry for anyone interested in a hands-on experience with artificial intelligence.

LESSONS LEARNED

The AI Maker Kits are, by far, our most complicated circulating kits. We are hearing back from patrons that the kits are right on the mark. Our users get it: they see the power in getting access to these AI tools (utilizing Python) and, by all accounts thus far, are happy with their results. There has been a perception gap, however, between library staff's expectations and what an AI kit can reasonably accomplish.
Adam Lamprecht reports, "Staff members had the expectation that perhaps with this kit, a rookie coder was going to be able to jump directly into developing Deep Learning neural networks (a very advanced subset of artificial intelligence) and so we definitely benefited from ongoing discussions of those broad AI terms and expectations."15

Google's AIY Voice is a good start, but there is plenty of room to grow AI classes for more depth. AIY Vision is the next logical step that would allow us to enter the world of basic image recognition. Our approach does rely on one company's platform, but there are now more platforms for exploring AI. One of these is Amazon's Machine Learning offerings on AWS (Amazon Web Services). These services have recently been opened up to a wider audience, and Amazon is now offering everyone the same online courses it uses to train its own engineers.16 The AWS ML resources are currently behind paywalls, but access to the training alone could be powerful for the right learner.

There are even interesting developments for younger learners in AI with robotics. Anki (www.anki.com) is a consumer robotics company that uses AI to enliven its products. In 2018 it released Vector, a seemingly simple toy that responds to its environment and simple commands with the aid of AI. With the release of its Software Development Kit, the company is letting others look under the hood of its robots, which potentially means an entry point for autonomous (or semi-autonomous) robotic vehicle technology powered by AI.

What is clear is that the world of AI is already upon us. Public libraries are well positioned to help meet the challenge of developing the workforce of the near and far future, with AI classes being a vital tool. The doorway to artificial intelligence is now open; the only question that remains is this: Do you step through it?

REFERENCES

1 Cade Metz, "Google Says Its AI Catches 99.9 Percent of Gmail Spam," WIRED, July 9, 2015, https://www.wired.com/2015/07/google-says-ai-catches-99-9-percent-gmail-spam/.

2 Jack Clark, "Google Turning Its Lucrative Web Search Over to AI Machines," Bloomberg Business, October 26, 2015, https://www.bloomberg.com/news/articles/2015-10-26/google-turning-its-lucrative-web-search-over-to-ai-machines.

3 RJ Reinhart, "Most Americans Already Using Artificial Intelligence Products," Gallup, March 6, 2018, https://news.gallup.com/poll/228497/americans-already-using-artificial-intelligence-products.aspx.

4 Scot Petersen, "Google Joins Chorus of Cloud Companies Promising to Democratize AI," eWeek, March 10, 2017, EBSCOhost Academic Search Complete.

5 "Artificial Intelligence, n.," OED Online, December 2018, Oxford University Press, accessed March 1, 2019.

6 Bernard Marr, "What Is the Difference Between Artificial Intelligence and Machine Learning?," Forbes, December 6, 2016, https://www.forbes.com/sites/bernardmarr/2016/12/06/what-is-the-difference-between-artificial-intelligence-and-machine-learning/#6d40eeec2742.

7 Lex Fridman, "Juergen Schmidhuber: Godel Machines, Meta-Learning, and LSTMs," MIT AI Podcast, December 22, 2018.

8 Serdar Yegulalp, "What Is TensorFlow? The Machine Learning Library Explained," InfoWorld, June 6, 2018, https://www.infoworld.com/article/3278008/tensorflow/what-is-tensorflow-the-machine-learning-library-explained.html.
9 Frisco Public Library, "Unpublished Maker Kit Statistics 2016-2019," 2019.

10 "AIY Projects: Voice Kit," Google, accessed December 15, 2018, https://aiyprojects.withgoogle.com/voice/.

11 Adam Lamprecht, "Google AIY Voice Box," Thingiverse, accessed February 14, 2019, https://www.thingiverse.com/thing:3247685.

12 Elena Ruchko, "Why Learn Python? Here Are 8 Data-Driven Reasons," Dbader.org, accessed February 14, 2019, https://dbader.org/blog/why-learn-python.

13 Christina Cardoza, "The Python Programming Language Grows in Popularity," SD Times, June 15, 2017, https://sdtimes.com/artificial-intelligence/python-programming-language-grows-popularity/.

14 Yegulalp, "What Is TensorFlow? The Machine Learning Library Explained."

15 Adam Lamprecht, email message to the author, February 15, 2019.

16 Mallory Locklear, "Amazon Opens Up Its Internal Machine Learning Training to Everyone," Engadget, November 26, 2018, https://www.engadget.com/2018/11/26/amazon-opens-internal-machine-learning-training/.

10977 ---- "Am I on the library website?": A LibGuides Usability Study

Articles

"Am I on the library website?": A LibGuides Usability Study

Suzanna Conrad and Christy Stevens

Suzanna Conrad (suzanna.conrad@csus.edu) is Associate Dean for Digital Technologies and Resource Management at California State University, Sacramento. Christy Stevens (crstevens@sfsu.edu) is the Associate University Librarian at San Francisco State University.

ABSTRACT

In spring 2015, the Cal Poly Pomona University Library conducted usability testing with ten student testers to establish recommendations and guide the migration process from LibGuides version 1 to version 2. This case study describes the results of the testing as well as raises additional questions regarding the general effectiveness of LibGuides, especially when students rely heavily on search to find library resources.

INTRODUCTION

Guides designed to help users with research have long been included among a suite of reference services offered by academic libraries, though terminology, formats, and mediums of delivery have evolved over the years. Print "pathfinders," developed and popularized by the Model Library Program of Project Intrex at MIT in the 1970s, are the precursor to today's online research guides, now a ubiquitous resource featured on academic library websites.1 Pathfinders were designed to function as a "kind of map to the resources of the library," helping "beginners who seek instruction in gathering the fundamental literature of a field new to them in every respect" find their way in a complex library environment.2 With the advent of the internet, pathfinders evolved into online "research guides," which tend to be organized around subjects or courses. In the late 1990s and early 2000s, creating guides online required a level of technological expertise that many librarians did not possess, such as HTML-coding knowledge or the ability to use web development applications like Adobe Dreamweaver. As a result, many librarians could not create their own online guides and relied upon webmasters to upload and update content.
The online guide landscape changed again in 2007 with the introduction of Springshare's LibGuides, a content management solution that quickly became a wildly popular library product.3 As of December 2018, 614,811 guides had been published by 181,896 librarians at 4,743 institutions in 56 countries.4 The popularity of LibGuides is due in part to its removal of technological barriers to online guide creation, making it possible for those without web-design experience to create content. LibGuides is also a particularly attractive product for libraries constrained by campus or library web templates, affording librarians and library staff the freedom to design pages without requiring higher level permissions to websites. Despite these advantages, in the absence of oversight, LibGuides sites can develop into microsites within the library's larger web presence. Inexperienced content creators can inadvertently develop guides that are difficult to use, lacking consistent templates and containing overwhelming amounts of information. As a result, libraries often find it useful to develop local standards and best practices in order to enhance the user experience.5

Like many academic libraries, the Cal Poly Pomona University Library uses the LibGuides platform to provide the campus community with course and subject guides. In 2015, librarians began discussing plans to migrate from LibGuides version one to the version two platform. These discussions led to broader conversations about LibGuides-related issues and concerns, some of which had arisen during website focus group sessions conducted in early 2015. The focus groups were designed to provide the library with a better understanding of students' library website preferences. Students reported frustration with search options on the library website as well as confusion regarding inconsistent headers. Even though focus group questions were related to the library website, two participants commented on the library's LibGuides as well. The library was using a modified version of the library website header for vendor-provided services, including LibGuides, so it was sometimes unclear to students when they had navigated to an external site.6 To complicate matters, the library also occasionally used LibGuides for other, non-research-related library pages, such as a page delineating the library's hours, because of the ease of updating that the platform affords. One student, who had landed on the LibGuides page detailing the library's hours, described feeling confused about where she was on the library website. She explained that she had tried to use the search box on the LibGuides page to navigate away from the hours page, apparently unaware that it was only an internal LibGuides search. As a result, she did not receive any results for her query. The language the student used to describe the experience clearly revealed her disorientation and perplexity: "Something popped up called LibGuides and then I put what I was looking for and that was nothing. It said no search found. I don't even know what that was, so I just went to the main page." Another participant, who also tried to search for a research-related topic after landing on a LibGuides page, stated, "I tried putting my topics.
I even tried refining my topic, but then it took me to the guide thing." Accustomed to using a search function to find information on a topic, this student did not interpret the research guide she had landed upon as a potentially useful tool that could help with her research. She expected that her search would produce search results in the form of a list of potentially relevant books or articles. The appearance instead of a research guide was misaligned with her intentions and expectations and therefore confusing to her.7

Given both the LibGuides-related issues that emerged during the library website focus groups and the library's plan to migrate from LibGuides version one to version two in the near future, the library's digital initiatives librarian and head of reference and instruction decided to conduct usability testing focused specifically on LibGuides. In addition to testing the usability of specific LibGuides features, such as navigational tabs and subtabs, we were also interested in determining whether some of the insights gleaned from the library website focus groups and from prior user surveys and usability testing regarding users' web expectations, preferences, and behaviors were also relevant in the LibGuides environment. Specifically, prior data had indicated users were unlikely to differentiate between the library's website and vendor-provided content, such as LibGuides, LibAnswers, the library catalog, etc. Findings also suggested that rather than intentionally selecting databases that were appropriate to their topics, students often resorted to searching in the first box they saw. This included searching for articles and books on their topics using search boxes that were not designed for that purpose, such as the database search box on the library's A-Z database page and the LibGuides site search tool for searching all guides. Although many students did not always resort to searching first (many did attempt to browse to specific library services), if they were not immediately successful, they would then type terms from the usability testing task into the first available search box.8

Finally, we were also aware that many of our current LibGuides contained elements that were inconsistent with website search and design best practices as well as misaligned with typical website users' behaviors and expectations, as described by usability experts like Jakob Nielsen. As such, we wanted to test the usability of some of these potentially problematic elements to determine whether they negatively impacted the user experience in the LibGuides environment. If they did, we would have institution-specific data that we could leverage to develop local recommendations for LibGuides standards and best practices that would better meet students' needs.

LITERATURE REVIEW

The Growth of LibGuides

Since Springshare's founding in 2007, LibGuides have been widely embraced by academic libraries.9 In 2011, Ghaphery and White visited the websites of 99 ARL university libraries in the United States and found that 68 percent used LibGuides as their research guides platform.
They also surveyed librarians from 188 institutions, 82 percent of which were college or university libraries, and found that 69 percent of respondents reported they used LibGuides.10 As of December 2018, Springshare's LibGuides Community website indicated that 1,620 academic libraries in the United States and a total of 1,823 academic libraries around the world, not counting law and medical libraries, were using the LibGuides platform.11

LibGuides' popularity is due in part to its user-friendly format, which eliminates most technical barriers to entry for would-be guide authors. For example, Anderson and Springs surveyed librarians at Rutgers University and found they were more likely to update and use LibGuides than the previous static subject guides that were located on the library website and maintained by the webmaster, to whom subject specialists submitted content and any needed changes.12 The majority of librarians reported that having direct access to the LibGuides system would increase how often they updated their guides. Moreover, after implementing the new LibGuides system, 52 percent said they would update guides as needed, and 14 percent said they would update guides weekly; prior to implementation, only 36 percent stated they would update guides as needed, and none said they would do so weekly.

LibGuides Usability Testing and User Studies

Although much literature has been published on the usability of library websites,13 fewer studies have focused on research guides or LibGuides specifically. Of these, several focused on navigation and layout issues. For example, in their 2012 LibGuides navigation study, Pittsley and Memmott confirmed their initial hypothesis that the standard LibGuides navigation tabs located in a horizontal line near the top of each page can sometimes go unnoticed, a phenomenon referred to as "banner blindness." As a result of their findings, librarians at their institution decided to increase the tab height in all LibGuides, and some librarians also chose to add content menus on the homepages of each of their guides. They moved additional elements from the header to the bottom of the guide under the theory that decreased complexity would contribute to increased tab navigation recognition.14

Sonsteby and DeJonghe examined the efficacy of LibGuides' tabbed navigational interface as well as design issues that caused usability problems. They identified user preferences, such as users'
16 Almeida and Tidal employed a mixed methods approach to gather user feedback about LibGuides, including usage of “paper prototyping, advanced scribbling, task analysis, TAP, and semi-structured interviews.”17 The researchers intended to “translate user design and learning modality preferences into executable design principles,” but found that no one layout filled all students’ needs or learning modalities.18 Ouellette’s 2011 study differed from many LibGuides-focused articles in that rather than assigning participants usability tasks, it employed in-depth interviews with 11 students to explore how they used subject guides created on the LibGuides platform and the features they liked and disliked about them. Like some of the aforementioned studies, Oullette found that students did not like horizontal tabbed navigation, preferring instead the more common left-side navigation that has become standard on the web. However, the study was also able to explore issues that many of the usability task-focused studies did not, including whether and how students use subject guides to accomplish their own research-related academic work. Ouellette found that students “do not use subject guides, or at least not unless it is a last resort.”19 Reasons provided for non-use included not knowing that they existed, preferring to search the open web, and not perceiving a need to use them, preferring instead to search for information rather than browsing a guide.20 Such findings call into question the wisdom of expending time and resources on creating guides. However, Ouellette asserted that students were more likely to use research guides when they were stuck, when they were required to find information in a new discipline, or when their instructors explicitly suggested that they use them.21 Nevertheless, most students who had used LibGuides reported that they had done so solely “to find the best database for locating journal articles.”22 Indeed, Ouellette found that the majority of “participants had only ever clicked on the tab leading to the database section of a guide,” a finding that was consistent with Staley’s 2007 study, which found that databases are the most commonly used subject guide section.23 While Ouellette concluded that LibGuides creators should therefore emphasize databases on their guides, both the more recent widespread library adoption of discovery systems that search across databases, in many cases making it unnecessary for students to select a specific database, as well as the common practice of aggregating relevant databases under disciplinary subject headings on library databases pages implicitly call into question the need for duplicating such information on library subject guides. If users can easily find such information elsewhere, these conclusions also cast doubt on the effectiveness of the entire LibGuides enterprise. Information Retrieval Behaviors: Search and Browse Preferences In 1997, usability expert Jakob Nielsen reported that more than half of web users are “search dominant,” meaning that they go directly to a search function when they arrive at a website rather than clicking links. In contrast, only a fifth of users are “link dominant,” preferrin g to navigate sites by clicking on links rather than searching. 
The rest of the users employ mixed strategies, switching INFORMATION TECHNOLOGY AND LIBRARIES | SEPTEMBER 2019 53 between searching and clicking on links in accordance with what appears to be the most promising strategy within the context of a specific page.24 While some researchers have questioned the prevalence of search dominance, Nielsen’s mobile usability studies have indicated an even stronger tendency toward search dominance when users access websites on their mobile devices.25 Moreover, by 2011, Nielsen’s research had indicated that search dominance is a user behavior that gets stronger every year, and that “many users are so reliant on search that it’s undermining their problem-solving abilities.” Specifically, Nielsen found that users exhibited an increasing reluctance to experiment with different strategies to find the information they needed when their initial search strategy failed.26 Nielsen attributes the search dominance phenomenon to two main user preferences. The firs t is that search allows users to “assert independence from websites’ attempt to direct how they use the web.”27 The second is that search functions as an “escape hatch when they are stuck in navigation. When they can’t find a reasonable place to go next, they often turn to the site’s search function.” Nielsen developed a number of best practices based on these usability testing results, including that search should be made available from every page in a website, since it is not possible to predict when users will feel lost. Additionally, given that users quickly scan sites for a box where they can type in words, search should be configured as a box and not a link, it should be located at the top of the page where users can easily spot it, and it should be wide enough to accommodate a typical number of search terms.28 Nielsen’s usability studies have shed light not only on where search should be located but also on how search should function. In 2005, Nielsen reported that searchers “now have precise expectations for the behavior of search” and that “designs that invoke this mental model but work differently are confusing.”29 Specifically, searchers’ “firm mental model” for how search should work includes “a box where they can type words, a button labeled ‘search’ that they click to run the search, [and] a list of top results that’s linear, prioritized, and appears on a new page.” Moreover, Nielsen found that searchers want all search boxes on all websites to function in the same way as typical search engines and that any deviation from this design causes usability issues. He specifically highlighted scoped searches as problematic, pointing out that searches that only cover a subsite are generally misleading to users, most of whom are unlikely to consider what th e search box is actually searching.30 While there is much evidence to support Nielsen’s claims about the prevalence of search dominance, other studies have suggested that users themselves are not necessarily always search or link dominant. Rather, some websites lend themselves better to searching or exploring links, and users often adjust their behaviors accordingly.31 Although we did not find studies that specifically discussed the search and browse preferences and behaviors of LibGuides users, we did find studies of library website use that suggested that though users often exhibit search -dominant tendencies, they also often rely on a mixed approach to library website navigation. 
For example, Hess and Hristova's 2016 study of users' searching and browsing tendencies explored how students access library tutorials and online learning objects. Specifically, they compared searching from a search box on the tutorials landing page, using a tag cloud under a search box, and browsing links.32 Google Analytics data revealed that students employed a mixed approach, equally relying upon both searching and clicking links to access the library's tutorials.33 Similarly, Han and Wolfram analyzed clickstream data from 1.3 million sessions in an image repository and determined that the two most common actions (86 percent of actions) were simple search and click actions.34 However, users in this study exhibited a tendency toward search dominance, conducting simple searches in 70 percent of the actions.35 Niu, Zhang, and Chen presented a mixed methods study analyzing search transaction logs and conducting usability testing focused on comparing the discovery layers VuFind and Primo. Browsing in the context of their study included browsing search results. They found that most search sessions were very brief, and students searched using two or three keywords.36 Xie and Joo tested how thirty-one participants went about finding items on a website, classifying their approaches into what they described as eight "search tactics," including explorative methods, such as browsing.37 Over 88 percent of users conducted at least one search query, and 75 percent employed "iterative exploration," browsing and evaluating both internal and external links on the site "until they were satisfied or they quit."38 Only four of thirty-one, or 6.7 percent, did "whole site exploration," a tactic which included browsing and evaluating most of the available information on a website, looking through every page on the site to find the desired information.39

METHOD

This study addresses the following research questions:

1. When prompted to find a research guide, are students more likely to click links or type terms into a search box to find the guide?
2. Are students more likely to successfully accomplish usability tasks directing them to find specific information on a LibGuide when using a guide with horizontal or vertical tabs?
3. How likely are students to click on subtabs?
4. How and to what extent does a one-, two-, or three-column content design layout affect students' ability to find information on a LibGuide?
5. How and to what extent do students use embedded search boxes in LibGuides?
6. Do students confuse screenshots of search boxes with functioning search tools?

In 2015, the University Library had access to two versions of LibGuides: the live version one instance and a beta version two instance. In order to answer our research questions and make data-informed design decisions that would improve the usability of our LibGuides, we compared the usability of existing research guides in LibGuides version one to test sites on LibGuides version two. Version two guides differed from version one guides in several ways. Version two guides were better aligned with Nielsen's recommendations regarding search box placement and function. Every LibGuide page included a header identical to the library website's header, which contained a global search box that searched both library resources and the library's website.
The inclusion of a visible discovery tool in the header was consistent with usability recommendations in the literature40 as well as our own prior library website usability tests, which indicated many users preferred searching for resources over finding a path to them by clicking through a series of links.

In mid-April 2015, ten students were scheduled to test LibGuides. Each student attempted the same seven tasks, but five students tested the current version of LibGuides and five students tested version two. The sessions were recorded using Camtasia, and students completed usability tasks on a laptop that was hooked up to a large television monitor, allowing the two librarians who were in the room to observe how students navigated the library's website and LibGuides platform. One librarian served as the moderator and the other managed the recording technology.41 Although additional members of the web team were interested in viewing the test
Two possible success routes included browsing to a featured links section on the homepage where a “Research Guides” link was listed (see figure 1) or searching via the top level “OneSearch” discovery layer search box, displayed in figure 2, which delivered results, including articles from databases, books from the catalog, library website pages, and LibGuides pages, in a bento-box format. The purpose of this task was to determine if students browsed or searched to find research guides. We defined browsing as clicking on links, menus, or images to arrive at a result, whereas searching involved typing words and phrases into a search box. AM I ON THE LIBRARY WEBSITE? | CONRAD AND STEVENS 56 https://doi.org/10.6017/ital.v38i3.10977 Figure 1. Featured links section on library homepage. Figure 2. OneSearch search box on library homepage. Task 2 Task 2 was designed to compare the usability of LibGuides version one’s horizontal tab orientation with version two’s left navigation tab option. Students were provided with a scenario in which they were asked to compare two public opinion polls on the topic of climate change for the same COM 100 class. We displayed the appropriate research guide for the students and instructed them to find a list of public opinion polls. The phrase “Public Opinion Polls” appeared in the navigation of both versions of the guide. Figure 3 displays the research guide with horizontal tab navigation and figure 4 with vertical, left tab navigation. INFORMATION TECHNOLOGY AND LIBRARIES | SEPTEMBER 2019 57 Figure 3. Horizontal tab navigation. AM I ON THE LIBRARY WEBSITE? | CONRAD AND STEVENS 58 https://doi.org/10.6017/ital.v38i3.10977 Figure 4. Left tab navigation. Task 3 In the third scenario, students were informed that their professor recommended that they u se a library “research guide” to find articles for a research paper assignment in an Apparel Merchandising and Management class. Students were instructed to find the product development articles on the research guide. The phrase “Product Development” appeared as a subtab in both versions of the guide. This task was intended to test whether students navigated to subtabs in LibGuides. As shown in figure 5, the subtab located on the horizontal navigation menu appeared when scrolled over but was otherwise not immediately visible. In contrast, figure 6 shows how the navigation was automatically popped open on the left tab navigation menu so that subtabs were always visible, a newly available option in LibGuides version two. Figure 5. Horizontal subtab options. INFORMATION TECHNOLOGY AND LIBRARIES | SEPTEMBER 2019 59 Figure 6. Left tab navigation with lower subtabs automatically open. Task 4 On the same Apparel Merchandising and Management LibGuide, students were asked where they would go to find additional books on the topic of product development. The librarian who designed this LibGuide had included search widgets in separate boxes on the page that searched the catalog and the discovery layer “OneSearch.” We were interested in seeing whether students would use the embedded search boxes to search for books. This functionality was identical in both the version one and two instances of the guide, as shown in figure 7. Figure 7. Embedded catalog search and embedded discovery layer search. Task 5 In the fifth scenario, students were told that they were designing an earthquake-resistant structure for a Civil Engineering class. As part of that process, they were required to review AM I ON THE LIBRARY WEBSITE? 
seismic load provisions. We asked them to locate the ASCE Standards on Seismic Loads using a research guide we opened for them. The ASCE Standard was located on the "Codes & Standards" page, which could be accessed by clicking on the "Codes & Standards" tab. The version one instance of the guide was two-columned, and a link to the ASCE seismic load standard was available in the second column on the right, per figure 8. The version two instance of the guide used a single, centered column, and the user had to scroll down the page to find the standard, per figure 9. We wanted to see if students noticed content in columns on the right, as many of our LibGuides featured books, articles, and other resources in columns on the right side of the page, or whether guides with content in a single central column were easier for students to use.

Figure 8. Two-column design with horizontal tabs.

Figure 9. Two-column design with left tab navigation.

Task 6

Because librarians sometimes included screenshots of search interfaces in their guides, we were interested in testing whether students mistook these images of search tools for actual search boxes. In task six, we opened a civil engineering LibGuide for students and told them to find an online handbook or reference source on the topic of finite element analysis. As shown in figure 10, a screenshot of a search box was accompanied by instructional text explaining how to find specific types of handbooks. Within this LibGuide, there were also screenshots of the OneSearch discovery layer as well as a screenshot of a "FindIt" link resolver button.

Figure 10. Screenshots used for instruction.

Task 7

The final task was designed to test whether it was more difficult for students to find content in a two- or three-column guide. Students were instructed to do background research on motivation and classroom learning for a psychology course. They were told to find an "encyclopedic source" on this topic. Within each version of the Psychology LibGuide, there was a section called "Useful Books for Background Research." As shown in figure 11, in the version one LibGuide, books useful for background research were displayed in the third column on the right side of the page. The version two LibGuide displayed those same books in the first column under the left navigation options, as shown in figure 12.

Figure 11. Books displayed in third column.

Figure 12. Two-column display with books in the left column.

RESULTS

Searching vs. Browsing to Find LibGuides

Understanding how students navigate and use LibGuides is important, but if they have difficulty finding the LibGuides from the library homepage, usability of the actual guides is moot. Of the ten students tested, six students used the OneSearch discovery layer located on the library's homepage to search for a guide designed to help them write a paper on climate change for a COM 100 class. Frequently used search terms included "research guide," "communication guides," "climate change," "climate change research guide," "faculty guides," and "COM 100." Of these students, two used search as their only strategy, typing search queries into whichever search box they discovered.
Neither of these students was successful at locating the correct guide. The remaining four students used mixed strategies; they started by searching and resorted to browsing after the search did not deliver exact results. Two of these students were eventually successful in finding the specific research guide; two were not. Of the six students who searched using the discovery layer, only one did not find the LibGuides landing page at all. In general, it seems that the task and student expectations during testing were not aligned with the way the guide was constructed. Only one student went to the controversial topics guide because "climate change is a controversial topic." One student thought the guide would be titled "climate change" and another thought there might be a subject librarian dedicated to climate change. Students would search for keywords corresponding with their course and topic, but generally they did not make the leap to focus more broadly on controversial topics. Only one student browsed directly to the "Research Guides" link on the homepage and found the guide under subject guides for "communication" on the first try. Another student navigated to a "Services and Help" page from the main website navigation and found a group of LibGuides that were labeled "User Guides," designed specifically for new students, faculty, staff, and visitors; however, the student did not find any other LibGuides relevant to the task at hand. The remaining two students navigated to pages with siloed content; one student clicked the library catalog link on the library homepage and began searching using the keywords "climate change." The other student clicked on the "Databases" link. Upon arriving at the Databases A-Z page, the student chose a subject area (Science) and searched for the phrase "faculty guides" in the databases search box. The student was unable to find the research guide because our LibGuides were not indexed in this search box; only database names were listed. Only three out of ten students found the guide; the rest gave up. Two of the successful participants employed mixed strategies that began with searching and included some browsing; the third student browsed directly to the guide without searching. Testers in the LibGuides version one environment attempted the task an average of 3.8 times before achieving success or giving up, compared to an average of 3.2 attempts per tester in version two testing. We defined an attempt as browsing or searching for a result until the student tried a different strategy or started over. For instance, if a student tried to browse to a guide and then chose to search after not finding what they were looking for, that constituted two attempts. Testers in both rounds began on the same library website. One major difference between the two research guides' landing pages was the search boxes; one was an internal LibGuides search box (version one) and one was a global OneSearch box (version two). It is possible that testers in round two made fewer attempts because of the inclusion of the OneSearch box. Of those testing with the LibGuides search box in version one, three searched on the LibGuides landing page. Across both rounds, eight of the students located the LibGuides landing page, regardless of whether or not they found the correct guide.
The two students who did not find the correct guide did land in LibGuides, but they arrived at specific LibGuides pages that served other purposes (one found a OneSearch help guide and the other landed on a new users' guide).

Navigation, Tabs, and Layout

Navigation, tab options (including subtab usage), and layouts were evaluated in tasks two, three, five, and seven. As mentioned in the method section, the first group of five students who tested the interface used the version one LibGuides with horizontal navigation and hidden subtabs. The second round of five students used the version two LibGuides with left navigation and popped-open subtabs. Students in both rounds were able to find items in the main navigation (not including subtabs) at consistent rates, with those in the second round with left navigation completing all tasks significantly faster than the first-round testers (38 seconds faster on average across all tasks). In task two, students were asked to find public opinion polls, which they could access by clicking a "Public Opinion Polls" link on the main navigation. In both rounds, regardless of horizontal or vertical navigation, nine of the students clicked on the tab for the polls. Only one student, testing on version two, was unable to find the tab. Students in version one testing with horizontal navigation attempted this task two times on average before successfully finding the tab; students testing on version two with vertical navigation attempted 1.4 times before finding the tab with the polls or giving up. When asked in task three to find articles on product development, which were included on a "Product Development" subtab under the primary "Library Databases AMM" tab, nine out of ten students were unable to locate the subtab. In LibGuides version one, this subtab was only viewable after clicking the main "Library Databases AMM" tab. In LibGuides version two, this subtab was popped open and immediately visible underneath the "Library Databases AMM" tab. A version two tester was the only student who clicked on the "Product Development" subtab. Students attempted this task 1.8 times in version one testing compared to 1.2 times for those testing version two. It is worth noting that six of the students found product development articles by searching via other means (OneSearch, databases, and other library website links); they just did not find the articles on the LibGuide shown. While they still successfully found resources, they did not find them on the guide we were testing. In task five, we asked students to find the ASCE Standards on Seismic Loads on a specific guide. The version one guide used a two-column design while the version two guide with the same content utilized a single column for all content. While six students found the standards (three in round one and three in round two), only four of ten testers overall did so by browsing to the resource. Three of the students who chose to browse were in round one and the fourth student was from round two. In version one testing with the two-column design, two students found the standards after making two attempts to browse the guide. Both of these students then used the LibGuides "Search this Guide" function to find the correct page for the standards using the keywords "ASCE standards" and "ASCE." The third successful student in this round used a mixed-methods strategy of searching and browsing.
She used the search terms "ASCE standards on seismic loads" and then searched for "seismic loads" twice in the same search box. She landed on the correct tab of the LibGuide, scrolling over the correct standard multiple times, but only found the standards after the sixth attempt. During version two testing, which included the one-column design and global search box, only one student browsed to the standards on the LibGuide. This student scrolled up and down the main LibGuide page, clicked on the left navigation option for "Find Books," then the left navigation option for "Codes & Standards," and scrolled down to find the correct item. Four out of five version two testers bypassed browsing altogether, instead using the OneSearch box on the page header to try to find the ASCE standards. Two of those students found the specific ASCE standards that were featured on the LibGuide; the other two found ASCE standards, just not the specific item we intended for them to find. The four students who did not find the specific standards were equally distributed across both testing groups. On average, students attempted to complete the task 3.6 times in version one testing and 1.6 times in version two testing before either finding the resource or giving up. Task seven asked students to find an encyclopedic source using a three-column design in version one and a two-column design in version two. The version one guide listed encyclopedias in the right-most column of a three-column layout and the version two guide included them under the left navigation in a two-column design. Only three students found the encyclopedia mentioned in task seven, two of whom completed the task using version two's two-column display. Only one student was able to locate the encyclopedia in the third column in version one testing. All seven students who were unable to find the encyclopedia by browsing then attempted to search. Six of these seven students searched for the keywords "motivation and classroom learning" and the seventh for "motivation and learning." Those who landed in OneSearch (six out of seven) received many results and were unable to find encyclopedias. One student searched within LibAnswers for "encyclopedia" and found Britannica. One student attempted to refine by facets, thinking that "encyclopedia" would be a facet similar to "book" or "article." Using search, especially OneSearch, to attempt to find an encyclopedia was ultimately unsuccessful for the students. The search terms students chose were far too general for them to complete the task successfully. Students in version one testing attempted this task 2.4 times compared to 3.2 times for version two testers.

Embedded Search Boxes & Screenshots of Search Boxes

Embedded search boxes and screenshots of search boxes were tested in tasks four and six. The header used in version one LibGuides was limited, defaulting to searching within the guide, and the additional options on the dropdown menu next to the search box did not include a global "OneSearch." In version two guides, a OneSearch box that searched most library resources (articles, books, library webpages, and LibGuides) was included. During task four, which asked students who were already on a specific guide how they would go about finding additional books on product development, version one testers were much more likely to use embedded search box widgets in the guide content.
Three of the five students in version one testing used the search widgets on the page to either search the catalog or search OneSearch. The remaining two students in that round used a header search or browsed. One of these students used the LibGuides "search this guide" function and searched for "producte [sic] development books." This student did not notice the typo in the search term and subsequently navigated out of LibGuides to the library website via the library link on the LibGuides header. The user then searched the catalog for "product development" and was able to locate books. The other of these two students did not use embedded search box widgets or the LibGuides search; she browsed through two guide pages and then gave up. In version two testing, three of five students used the global OneSearch box to find the product development books. The remaining two students chose to search the Millennium catalog linked from a "Books and Articles" tab on the main website header, finding books via that route. During testing of both versions, students tried an average of 1.5 times to complete the task before achieving success or acknowledging failure. Nine out of ten testers found books on the topic of product development. The one tester who did not find the books attempted to complete the task one time; she found product development articles from the prior task and said she would click on the same links (for individual article titles) to find books. In task six, half of the ten students from both rounds attempted to click on screenshots of search boxes or unlinked "FindIt" buttons. A screenshot of the OneSearch box and a Knovel search box were embedded in the test engineering guide. Two users in version one testing and one tester in version two testing attempted to click on the OneSearch screenshot. One student in version two testing attempted to click on the Knovel search box screenshot. One student from version one testing tried to click on a "FindIt" button for the link resolver.

Comparisons Between Rounds

We recorded how many attempts were needed to complete tasks in each round. In round one, which tested LibGuides version one, students took an average of 2.74 tries to complete the tasks. In round two, which focused on LibGuides version two, students took an average of two tries. Average attempts per task are displayed in figure 13. We also timed the rounds to see how many minutes it took students to complete all of the tasks. In the first round, it took 16:07 minutes on average and in the second round 15:29 minutes. This does not appear to constitute an important difference, but there was one tester in round two who narrated his experiences very explicitly and in great detail. His session lasted 23 minutes. If his testing is excluded, then round two had a shorter average of 13:30 minutes. Despite the lower total time spent testing, task success was nearly equal between the two rounds. Details on individual testing times per participant are in figure 14.
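The round-to-round comparison comes down to simple averaging, with one long session treated as an outlier. A minimal sketch of that calculation in Python, using hypothetical per-participant times (the study's actual per-participant values appear only in figure 14; these stand-ins are contrived so the five-person mean matches the reported 15:29):

```python
# Hypothetical round-two session lengths in mm:ss; only the 23:00 outlier
# is taken from the text, the rest are illustrative stand-ins.
round_two = ["14:10", "12:45", "23:00", "13:55", "13:35"]

def to_seconds(mmss: str) -> int:
    """Convert an 'mm:ss' string to total seconds."""
    minutes, seconds = mmss.split(":")
    return int(minutes) * 60 + int(seconds)

def mean_time(times: list[str], exclude: tuple[int, ...] = ()) -> str:
    """Average the session lengths, optionally excluding indexes as outliers."""
    kept = [to_seconds(t) for i, t in enumerate(times) if i not in exclude]
    avg = sum(kept) / len(kept)
    return f"{int(avg // 60)}:{int(avg % 60):02d}"

print(mean_time(round_two))               # 15:29 with all five testers
print(mean_time(round_two, exclude=(2,))) # roughly 13:36 with the outlier dropped
```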
In round one, testers were successful at completing the task, whether they completed it in the manner we predicted or not, for 24 tasks. Round two was slightly lower, with 23 successfully completed tasks. Success was, however, subjective. In task three, we wanted to test whether students found a list of articles on a LibGuide on a certain topic. Nearly all of the students (nine out of ten) found articles on the topic, but only one of them found them via the method we had anticipated. Other tasks produced similar results where the students found resources that technically fulfilled the task we had asked them to complete, even though they did not exercise the feature of the interface we were hoping to evaluate. In these cases, we called this a success, as they had fulfilled the task as written.

Figure 13. Attempts per task for LibGuides v1 compared to LibGuides v2.

Figure 14. Total time per participant for LibGuides v1 compared to LibGuides v2.

DISCUSSION

There were several overarching themes that we discovered during the testing of LibGuides versions one and two. The first relates to Nielsen's conception of search dominance and its implications for finding guides as well as resources within guides. Task one, which asked students to navigate to a relevant LibGuide from the library homepage, revealed that students were much more likely to search for a guide than to navigate to one by using links. Although the library homepage in our study included a clearly demarcated "Research Guides" link, only one tester clicked on it. In contrast, six of the ten students used search as their first and only strategy, and an additional two of ten first clicked on a link and then switched to search as their next strategy. Although our initial search-focused research question and related task looked specifically at how students navigate to guides, most of the other tasks provided additional insight into how students navigate within them as well. Our findings are consistent with Nielsen's observation that search functions as an "escape hatch" when users get "stuck in navigation."42 Many students we tested used mixed strategies to find content, often resorting to searching for content when they were confused, lost, or impatient. While one student explicitly stated that search is a backup for when he cannot find something via browsing, search behaviors from many other students suggested that they were "search-dominant," preferring searching over browsing both on library website pages and from within LibGuides. Similar to Nielsen's studies on reliance on search engine results, students were unlikely to change their search strategies even if they were not receiving helpful results. Students did not engage in what Xie and Joo referred to as "whole site exploration," browsing and evaluating most of the available information on a website to accomplish the assigned tasks.43 While research guides are sometimes designed to function as linear pathways that lead students through the research process or as comprehensive resources that introduce students to a series of tools and resources, all of which could be useful in the research process, the students we tested did not approach guides in this way. Rather than starting on the first tab and comprehensively exploring it tab by tab and content box by content box, students ignored most of the content on the page, searching instead to find the specific information they needed. Our testers' search behaviors were also consonant with Nielsen's observation that scoped searches are inconsistent with users' mental models about how search should function.
Nielsen found that search boxes that only cover a subsection of a site are generally confusing to users and negatively impact users' ability to find what they are looking for on a site. In our study, several students used scoped search boxes both on library website pages and within LibGuides to find content that the search did not index. Version two testers had access to a search box on every page that aligned with their global search expectations, and they frequently used it, so much so that their preference for search disrupted some of the usability questions we were trying to answer in our tasks. For example, users' tendency to search instead of browse interfered with our ability to clearly discern whether it was easier for students to find content on pages with one-, two-, or three-column content designs (many students did not even attempt to find content in the columns). Students' global expectations of search boxes also have implications for their ability to find LibGuides that they have been told exist or to discover the existence of LibGuides that might help them with their research. For example, students with search-dominant tendencies who attempt to use a library search tool that does not index LibGuides or the content within LibGuides will be unlikely to find them. While students did use search boxes embedded within LibGuides content areas, version two testers had access to a global search box located at the top right-hand side of every LibGuides page, and as a result, they were more likely to use the global search than the embedded search boxes. This behavior is consistent with Nielsen's assertion that for ease of use, search should consist of a box "at the top of the page, usually in the right-hand corner," that is "wide enough to contain the typical query."44 Version two testers were quick to find and use the search box in the header that fit this description. Although students often used search boxes, and global ones in particular, to accomplish usability testing tasks, they were sometimes impeded by screenshots of search boxes and links. Several students clicked on them thinking they were live, unable to immediately distinguish that they differed from the functional embedded search boxes that some of the guides also included. As Nielsen observed, "Users often move fast and furiously when they're looking for search. As we've seen in recent studies, they typically scan the homepage looking for 'the little box where I can type.'"45 Librarians sometimes use screenshots of search boxes in an effort to provide helpful visuals to accompany instructional content (text) focusing on how to access and use a specific resource. Because many students scan the page for a search box so that they can quickly find needed information rather than carefully reading material in the content boxes, it could be argued that these screenshots inadvertently confuse students and impede usability. Another way to look at this issue, however, may be that guide content can be misaligned with user expectations and contexts. A user looking to search for articles on a topic who stumbles on a guide may have no reason to do anything other than look for a search box. In contrast, a user introduced to a guide in the context of a course, who is asked to read through the content and explore three listed resources in preparation for a discussion to occur in the next class meeting, will likely have a very different orientation to the guide and perception of its purpose and usefulness.
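To make the scoping problem concrete, here is a toy sketch (not the library's actual discovery stack) of the difference between a global, bento-box search that fans out across every content silo and a scoped box that only matches its own silo; the silo names and titles are invented for illustration:

```python
# Toy content indexes; a real discovery layer would query separate systems.
SILOS = {
    "articles": ["Climate change and public opinion"],
    "catalog": ["Climate change: a very short introduction"],
    "webpages": ["Library hours", "Databases A-Z"],
    "libguides": ["Controversial Topics (COM 100)", "Civil Engineering"],
}

def search(query: str, scope: list[str]) -> dict[str, list[str]]:
    """Return matches per silo, bento-box style, for the silos in scope."""
    terms = query.lower().split()
    return {
        silo: [title for title in SILOS[silo]
               if any(term in title.lower() for term in terms)]
        for silo in scope
    }

# A global OneSearch-style box indexes guides, so guide-seeking queries work:
print(search("COM 100 guide", scope=list(SILOS)))
# A scoped box (e.g., articles only) silently returns nothing useful:
print(search("COM 100 guide", scope=["articles"]))
```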
Students' search behaviors also made us question the efficacy of linking to specific books or articles within a LibGuide. In tasks three through seven, many of the students used OneSearch or the library catalog to search for specific books or articles rather than referencing the guide where potentially useful resources were listed. For example, while trying to find the COM 100 guide during task one, one student commented, "I never really look for stuff. I just go to the databases." Version two testers, who had access to a global search in the header of every LibGuides page, were even more likely to navigate away from the guides to find books or articles. While several studies in the literature had suggested that vertical tab navigation may be more usable than horizontal tab navigation, our study did not bear this out, as students in both rounds were able to find items on vertical and horizontal navigation menus at relatively consistent rates. Similarly, one-, two-, and three-column content design did not appear to affect users' abilities to find information and links on a page; however, users' tendency to search rather than browse interfered with the relevant task's intention of comparing the browsability of different content column designs, and therefore more targeted research on this question is needed. One student commented on the pointlessness of content in second columns, stating, "Nobody ever looks on the right side, I always look on the left cause everything's usually on the left side. Because you don't read from right to left, it's left to right." He was, nevertheless, able to complete the task regardless of the multi-column design. Subtab placement differed considerably between LibGuides versions one and two; version one subtabs were invisible to users unless they hovered over the main menu item on the horizontal menu, while version two allowed us to make subtabs immediately visible on the vertical menu, without any action needed by the user to uncover their existence. Given the subtabs' visibility, we had anticipated that version two testers would be more likely to find and use subtabs, but this turned out not to be the case. Only one out of ten students found the relevant subtab. Although the successful tester was using LibGuides version two, in which the subtab was visible, the fact that nine out of ten testers failed to see the subtab, regardless of whether it was immediately visible or not, suggests that subtab usage may not be an effective navigation strategy. Results from all tasks also suggested that students might not understand what research guides are or how guides might help them with their research. Like many libraries, the Cal Poly Pomona University Library did not refer to LibGuides by their product name on the library website, labeling them "Research Guides" instead in an effort to make their intended purpose clearer. Testing revealed, however, that students are not predisposed to think of a "research guide" as a useful tool to help them get started on their research. One student said, "I'm not sure what the definition of a research guide is." When prompted to think more about what it might be, the student guessed that it was a pamphlet with "something to help me guide the research." The student did not offer any additional guesses about what specifically that help might look like.
Moreover, students' tendency to resort to search itself can also be interpreted as evidence that they are confused about how guides are supposed to help them with research. Instead of reading or skimming information on the guides, students used search as a strategy to attempt to complete the tasks an average of 70 percent of the time across both rounds. Many of their searches navigated students away from the very guides that were designed to help them. The tendency to navigate away from guides was likely increased by the content included in the guides we tested, since many incorporated search boxes and links that pointed to external systems, such as the catalog, the discovery layer, LibAnswers, etc. However, many students' first attempts to accomplish the tasks given to them involved immediately navigating away from LibGuides. Others navigated away shortly after an initial attempt or two to complete the task within the guide. All but one student navigated away from LibGuides to complete tasks; four did so more than five times. Eight of ten students used OneSearch in the header or from the library homepage; the other two used embedded OneSearch boxes on the LibGuides. Results also suggested that it might be easier for students to find guides that are explicitly associated with their courses, through either the guides' titles or other searchable metadata, than to find and understand the relevance of general research guides. Even though general research guides might be relevant to the subject matter of students' courses, guides that explicitly reference a course or courses are easily discoverable and their relevance is more immediately obvious. For instance, the first task asked students to find a "research guide" to help them write a paper on climate change for a COM 100 class. We wanted to see whether students would find the "Controversial Topics" research guide that was designed for COM 100 and that included the course number in the guide's metadata. Mentioning the course number in the task seemed to make it more actionable as an assignment they might expect from a professor. When students searched for "COM 100," they were more likely to find the Controversial Topics guide; two of the three students who found the guide searched using the course number. Had course numbers not been included, they might not have found the guide, since searching for the course number returned the correct guide as the single result. Two additional students unsuccessfully attempted to find the guide by searching for "COM100," without a space. Had the LibGuides search been more effective, or had librarians included both versions of the course code, with and without a space, more students would likely have found the guide.
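One lightweight way to act on that finding is to generate the spelling variants students actually type and include them all in a guide's searchable metadata. A minimal sketch follows; the helper function is hypothetical (it is not a LibGuides feature), and in practice the variants would be pasted into a guide's description or tags by hand or by script:

```python
import re

def course_code_variants(code: str) -> set[str]:
    """Expand a course code like 'COM 100' into common student spellings."""
    match = re.fullmatch(r"([A-Za-z]+)[\s-]*(\d+\w*)", code.strip())
    if not match:
        return {code}  # leave unrecognized codes untouched
    dept, num = match.group(1).upper(), match.group(2)
    return {f"{dept} {num}", f"{dept}{num}", f"{dept}-{num}"}

print(course_code_variants("COM 100"))
# {'COM 100', 'COM100', 'COM-100'} (set order may vary)
```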
Limitations

Limitations of this study include weaknesses in both our usability tasks and the content of some of the LibGuides, which made it difficult to answer our research questions. We may have tested too many different features at once, which can be a pitfall of usability testing in general. Some tasks, such as tasks five and seven, tested both navigation placement and column layouts. In task five, for instance, there were multiple factors that could have led to success or failure: did a student overlook the ASCE standards because of column layout or tab placement, or was the layout moot because the search box was comprehensive enough to allow them to complete the task without browsing the guide's content? Similarly, task two tested a guide with seven tabs. It is not clear if the students who did not click on a tab missed it because of the placement of the navigation on the page or because the navigation contained too many options. Weaknesses in the content of many of the LibGuides used in the study led to additional limitations. Many of the LibGuides were text heavy and included jargon. One student even commented, "It's a lot of words here, so I really don't want to read them." Although we set out to test the usability of different navigation layouts and template designs, factors such as content overload or use of jargon could have influenced success or failure. The wording of task seven, for example, was particularly problematic and led to unclear results. Students were instructed to find an "encyclopedic source" in an attempt to see if they would click on books listed in a third column in version one testing compared to a left column in version two testing. The column header was titled "Useful Books for Background Research" and the box included encyclopedias. Students appeared to struggle with the idea of what constituted an "encyclopedic source." When one student was specifically asked what she thought the term meant, she responded, "not sure." Based on the results of this task, it was difficult to discern if the interface or the wording of the task resulted in task completion failures. The contrived nature of usability testing itself might also have affected our results. For example, one student exhibited a tendency to rush through tasks, a behavior that may have been due to experiencing content overload, anxiety over being observed during the testing process, time limitations of which we were unaware, etc. On the other hand, behavior that we perceived to be rushing might be consistent with the student's normal approach to navigating websites. Whatever the case, it is important to keep in mind that usability testing puts users on the spot because they are testing an interface in front of an audience. The usability testing context can therefore influence user behavior, including the number of times students might attempt to find a resource or complete a given task. Some students might be impatient or uncomfortable with the process, resulting in attempts to complete the testing as quickly as possible, including giving up on tasks more quickly than they would in a more natural setting. Conversely, other students might be more likely to expend more time and effort when performing in front of an audience than they would privately.

CONCLUSION

Usability testing was effective for revealing some of the difficulties students encounter when using our LibGuides and our website and for prompting reflection on the types of content guides include, how that content is presented, and the contexts in which that content may or may not be useful to our students. Analysis of the data from our study and a review of the literature, within the context of existing political realities and constraints within our library, led to our development of several data-informed recommendations for LibGuides creators, most of which were adopted. One of the most important recommendations was that LibGuides should use the same header that is on the library's main website, which includes a global search box.
Use of the same header would not only provide a consistent look and feel but also give users a global search box at the top of the page, aligned with their mental model of how search should function. Our testing confirmed that many students prefer to use global search boxes to find information rather than browsing, or in addition to browsing when they get stuck. While some librarians were not thrilled with what they viewed as the privileging or predominance of the discovery layer on their guides, preferring to direct students to specific databases instead of OneSearch, this recommendation was ultimately accepted due to the compelling nature of the usability data we were able to share. Our recommendation that subtabs should be avoided was also accepted because of how compelling the data was: 90 percent of users failed to find links located on subtabs. We also recommended that librarians should evaluate the importance of all content on their guides to minimize student confusion when browsing. While we acknowledged that there might be contexts when screenshots of search boxes would be useful, we encouraged librarians to think carefully about their use and to avoid them when possible. Additionally, librarians were encouraged to evaluate whether the content they were adding was of core importance to the LibGuide, reflecting on the degree to which it added value or possibly detracted from the LibGuide, perhaps by virtue of lack of relevance or content overload. Content boxes consisting of suggested books on a general subject guide were used as an example, given the difficulty of providing useful book suggestions to students working on wildly different topics. While results from our rounds of usability testing did not indicate that left-side vertical navigation was decidedly more usable than horizontal navigation at the top of the page, we nevertheless recommended that all guides should use left tab navigation: for consistency's sake across guides, because left-side navigation has become standard on the web, and because other LibGuides studies have suggested that left-side navigation is easier to use than horizontal navigation, due to issues such as "banner blindness."46 The librarians agreed, and a template was set up in the administrative console requiring that all public-facing LibGuides use left tab navigation. Based on other usability studies in the literature as well, we also recommended that guides should include no more than a maximum of seven main content tabs.47 Although our study did not provide any actionable data about the relative usability of one-, two-, and three-column content designs, other articles in the literature had emphasized the importance of consistency and avoiding a busy look with too much content. In order to avoid both a busy look and guides that looked decidedly different from each other due to an inconsistent number of columns, we therefore recommended that all guides utilize a two-column layout, with the left column reserved for navigation and all content appearing in a single main column. However, future iterations of LibGuides usability testing should attempt to find ways to test whether limiting content to a single column is indeed more usable than dispersing it across two or more columns.
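Several of these template rules lend themselves to automated review. The sketch below is illustrative only: the guide dictionaries are hypothetical exports rather than anything from the Springshare administrative interface, and the checks simply restate the recommendations above (left navigation, no subtabs, at most seven main tabs, one main content column):

```python
MAX_TABS = 7  # ceiling recommended above, following the literature

def lint_guide(guide: dict) -> list[str]:
    """Flag a guide's departures from the agreed template rules."""
    problems = []
    if guide.get("nav") != "left":
        problems.append("use left tab navigation")
    if len(guide.get("tabs", [])) > MAX_TABS:
        problems.append(f"trim to at most {MAX_TABS} main tabs")
    if any(tab.get("subtabs") for tab in guide.get("tabs", [])):
        problems.append("remove subtabs (nine of ten testers never found one)")
    if guide.get("content_columns", 1) != 1:
        problems.append("move content into a single main column")
    return problems

guide = {"nav": "top", "content_columns": 2,
         "tabs": [{"name": "Home"}, {"name": "Articles", "subtabs": ["Polls"]}]}
print(lint_guide(guide))  # lists the three rules this guide breaks
```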
The group voted on many of our recommendations, and several were simple to implement and oversee because they could translate into design decisions that could be set up as default, unchangeable options within the LibGuides administration module. Other recommendations were more difficult to operationalize and enforce. For example, because our findings indicated that students attempted to search for course numbers to find a guide that they were told was relevant to their research for a specific class, another one of our recommendations to the librarians' group was to include, as appropriate, various course numbers in their guides' metadata in order to both make them more discoverable and make them appear more immediately relevant to students' coursework. This recommendation is not one that a LibGuides administrator could enforce, due to issues revolving around subject matter and curriculum knowledge. The issue of context, and specifically the connection between courses and guides that has the potential to underscore their relevance and purpose to students, also caused us to question the effectiveness of general subject guides in assisting students with their research. If students are more likely to understand the relevance and purpose of a LibGuide when it is explicitly connected to their specific class or assignment and less likely to make the connection between a general research guide and their coursework, then the creation and maintenance of general subject guides might not be worth the time and effort librarians invest in them. This question is made more pressing by studies in the literature that indicate both low usage and shallow use of guides, such as using them primarily to find a database.48 While this question did not lead to a specific recommendation to the librarians' group, we have since reflected that the return-on-investment issue might be effectively addressed via closer collaboration with faculty in the disciplines. If research guides are more clearly aligned with specific research assignments in specific courses, and if faculty members instruct their students to consult library research guides and integrate LibGuides and other library resources into learning management systems, perhaps use and return on investment would improve. Researchers like Hess and Hristova, for example, found that online tutorials that are required in specific courses show high usage.49 The connection between course integration and usage may hold true with LibGuides as well. Regardless, students' frequent lack of understanding of what guides are designed to do and their tendency to navigate quickly away from them rather than exploring them suggests that reconceptualizing what guides are designed to do, and what needs they are designed to meet in what specific contexts, might prove to be a useful exercise. A guide designed as an instructional tool to teach specific concepts, topic generation processes, search strategies, citation practices, etc. within the context of a specific assignment for a specific course may well be immediately perceived as relevant to students in that course. Such a guide discussed in the context of a class might also be perceived as more useful than guides consisting of lists of resources and tools, which are unlikely to be interpreted as helpful by students who stumble upon them while seeking research assistance on the library's website.
As such, thinking about how and in what context students are likely to find guides, and how material might be presented so that guides are quickly perceived as a potentially relevant resource worth exploring, might also prove useful. The importance of talking to users cannot be overemphasized; without collecting user feedback, whether through usability testing or another method, it is difficult to know how students perceive and use LibGuides or any other library online service. Getting user input on navigation flow, template design, and search functionality can provide valuable details that can help libraries improve the usability of their online resources. It is also important to note that in our rapidly changing environment, users' needs and preferences also change. As such, collecting and analyzing user feedback to inform user-centered design should be a fluid process, not a one-time effort. Admittedly, it can sometimes be challenging to make collective design decisions, particularly when librarians have strong opinions grounded in their own personal experiences working with students that conflict with usability testing data. Although it is necessary to incorporate user feedback into the design process, it is also important to be open to compromise in order to achieve stakeholder buy-in for some usability-informed changes. As with many library services, usage of LibGuides is contingent at least in part on awareness, as students are unlikely to use services of which they are unaware or which they are unlikely to discover due to the limitations of a library's search tools. Given the prevalence of search dominance among our users, we should not assume that simply placing a "Research Guides" link on a webpage will lead to usage. Increased outreach, better integration with the content of specific courses and assignments, and a thorough review of LibGuides content by those creating the guides, with an eye toward the specific contexts in which they are likely to be used, taught, serendipitously discovered, etc., are necessary to ensure that the research guides librarians create are worth the time they invest in them. Additional studies focusing on why students do or do not use specific types of research guides, the contexts in which they are most useful, how students use them, and the specific content in guides that students find most helpful are needed to determine whether and to what extent they are aligned with students' information-seeking preferences, behaviors, and needs, as well as how they might be improved to increase their use and usefulness.

APPENDIX 1: LIBGUIDES USABILITY TESTING TASKS

Purpose: Seeing how students browse or search to get to research guides
Task 1: You are writing a research paper on the topic of climate change for your COM 100 class. Your teacher told you that the library has a "research guide" that will help you write your paper. Find the guide.
Start: Library homepage

Purpose: Testing tab orientation on top
Task 2: You need to compare two public opinion polls on the topic of climate change for your COM 100 class. Find a list of public opinion polls on the research guide shown.
Start: http://libguides.library.cpp.edu/controversialtopics OR http://csupomona.beta.libguides.com/controversial-topics

Purpose: Testing subtabs
Task 3: You are writing a research paper for your Apparel Merchandising & Management class on the topic of product development.
Your teacher told you that the library has a "research guide" that includes a list of articles on product development. Find the product development articles on this research guide.
Start: http://libguides.library.cpp.edu/amm OR http://csupomona.beta.libguides.com/AMM

Purpose: Testing searching within the LibGuides pages
Task 4: If you were going to look for additional books on the topic of product development, what would you do next?
Start: http://libguides.library.cpp.edu/amm OR http://csupomona.beta.libguides.com/AMM

Purpose: Testing two-column design
Task 5: You are designing an earthquake-resistant structure for your Civil Engineering course and need to review seismic load provisions. Locate the ASCE Standards on Seismic Loads. Use the research guide we open for you.
Start: http://libguides.library.cpp.edu/civil OR http://csupomona.beta.libguides.com/civil-engineering

Purpose: Seeing if including screenshots of search boxes is problematic
Task 6: Your professor also asks you to find an online handbook or reference source on the topic of finite element analysis. Locate an online handbook or reference source on this topic.
Start: http://libguides.library.cpp.edu/civil OR http://csupomona.beta.libguides.com/civil-engineering

Purpose: Seeing if three columns are noticeable
Task 7: Find resources that might be good for background research on motivation and classroom learning for a psychology course. Find an encyclopedic source on this topic.
Start: http://libguides.library.cpp.edu/psychology OR http://csupomona.beta.libguides.com/psychology

REFERENCES

1 William Hemmig, "Online Pathfinders: Toward an Experience-Centered Model," Reference Services Review 33, no. 1 (February 2005): 67, https://dx.doi.org/10.1108/00907320510581397.

2 Charles H. Stevens, Marie P. Canfield, and Jeffrey T. Gardner, "Library Pathfinders: A New Possibility for Cooperative Reference Service," College & Research Libraries 34, no. 1 (January 1973): 41, https://doi.org/10.5860/crl_34_01_40.

3 "About Springshare," Springshare, accessed May 7, 2017, https://springshare.com/about.html.

4 "LibGuides Community," accessed December 4, 2018, https://community.libguides.com/?action=0.

5 See, for example, Alisa C. Gonzalez and Theresa Westbrock, "Reaching Out with LibGuides: Establishing a Working Set of Best Practices," Journal of Library Administration 50, no. 5/6 (September 7, 2010): 638-56, https://doi.org/10.1080/01930826.2010.488941.

6 Suzanna Conrad and Nathasha Alvarez, "Conversations with Web Site Users: Using Focus Groups to Open Discussion and Improve User Experience," The Journal of Web Librarianship 10, no.
2 (2016): 74, https://doi.org/10.1080/19322909.2016.1161572.

7 Ibid., 74.

8 Suzanna Conrad and Julie Shen, "Designing a User-Centric Web Site for Handheld Devices: Incorporating Data-Driven Decision-Making Techniques with Surveys and Usability Testing," The Journal of Web Librarianship 8, no. 4 (2014): 349-83, https://doi.org/10.1080/19322909.2014.969796.

9 "About Springshare."

10 Jimmy Ghaphery and Erin White, "Library Use of Web-based Research Guides," Information Technology and Libraries 31, no. 1 (2012): 21-31, https://doi.org/10.6017/ital.v31i1.1830.

11 "LibGuides Community," accessed December 4, 2018, https://community.libguides.com/?action=0&inst_type=1.

12 Katie E. Anderson and Gene R. Springs, "Assessing Librarian Expectations Before and After LibGuides Implementation," Practical Academic Librarianship: The International Journal of the SLA Academic Division 6, no. 1 (2016): 19-38, https://journals.tdl.org/pal/index.php/pal/article/view/19.

13 Examples include: Troy A. Swanson and Jeremy Green, "Why We Are Not Google: Lessons from a Library Web Site Usability Study," The Journal of Academic Librarianship 37, no. 3 (2011): 222-29, https://doi.org/10.1016/j.acalib.2011.02.014; Judith Z. Emde, Sara E. Morris, and Monica Claassen-Wilson, "Testing an Academic Library Website for Usability with Faculty and Graduate Students," Evidence Based Library and Information Practice 4, no. 4 (2009): 24-36, https://doi.org/10.18438/B8TK7Q; Heather Jeffcoat King and Catherine M. Jannik, "Redesigning for Usability: Information Architecture and Usability Testing for Georgia Tech Library's Website," OCLC Systems & Services 21, no. 3 (2005): 235-43, https://doi.org/10.1108/10650750510612425; Danielle A. Becker and Lauren Yannotta, "Modeling a Library Website Redesign Process: Developing a User-Centered Website through Usability Testing," Information Technology and Libraries 32, no. 1 (2013): 6-22, https://doi.org/10.6017/ital.v32i1.2311; Darren Chase, "The Perfect Storm: Examining User Experience and Conducting a Usability Test to Investigate a Disruptive Academic Library Web Site Redevelopment," The Journal of Web Librarianship 10, no. 1 (2016): 28-44, https://doi.org/10.1080/19322909.2015.1124740; Andrew R. Clark et al., "Taking Action on Usability Testing Findings: Simmons College Library Case Study," The Serials Librarian 71, no. 3-4 (2016): 186-96, https://doi.org/10.1080/0361526X.2016.1245170; Anthony S. Chow, Michelle Bridges, and Patricia Commander, "The Website Design and Usability of US Academic and Public Libraries: Findings from a Nationwide Study," Reference & User Services Quarterly 53, no. 3 (2014): 253-65, https://journals.ala.org/index.php/rusq/article/view/3244/3427; Gricel Dominguez, Sarah J. Hammill, and Ava Iuliano Brillat, "Toward a Usable Academic Library Web Site: A Case Study of Tried and Tested Usability Practices," The Journal of Web Librarianship 9, no.
2-3 (2015), https://doi.org/10.1080/19322909.2015.1076710; Junior Tidal, "One Site to Rule Them All, Redux: The Second Round of Usability Testing of a Responsively Designed Web Site," The Journal of Web Librarianship 11, no. 1 (2017): 16-34, https://doi.org/10.1080/19322909.2016.1243458.

14 Kate A. Pittsley and Sara Memmott, "Improving Independent Student Navigation of Complex Educational Web Sites: An Analysis of Two Navigation Design Changes in LibGuides," Information Technology and Libraries 31, no. 3 (2012): 52-64, https://doi.org/10.6017/ital.v31i3.1880.

15 Alec Sonsteby and Jennifer DeJonghe, "Usability Testing, User-Centered Design, and LibGuides Subject Guides: A Case Study," The Journal of Web Librarianship 7, no. 1 (2013): 83-94, http://dx.doi.org/10.1080/19322909.2013.747366.

16 Sarah Thorngate and Allison Hoden, "Exploratory Usability Testing of User Interface Options in LibGuides 2," College & Research Libraries 78, no. 6 (2017), https://doi.org/10.5860/crl.78.6.844.

17 Nora Almeida and Junior Tidal, "Mixed Methods Not Mixed Messages: Improving LibGuides with Student Usability Data," Evidence Based Library and Information Practice 12, no. 4 (2017): 66, https://academicworks.cuny.edu/ny_pubs/166/.

18 Ibid., 63; 71.

19 Dana Ouellette, "Subject Guides in Academic Libraries: A User-Centered Study of Uses and Perceptions," Canadian Journal of Information and Library Science 35, no. 4 (December 2011): 436-51, https://doi.org/10.1353/ils.2011.0024.

20 Ibid., 442.

21 Ibid., 442-43.

22 Ibid., 443.

23 Ibid., 443; Shannon M. Staley, "Academic Subject Guides: A Case Study of Use at San Jose State University," College & Research Libraries 68, no. 2 (March 2007): 119-39, https://doi.org/10.5860/crl.68.2.119.

24 Jakob Nielsen, "Search and You May Find," Nielsen Norman Group, last modified July 15, 1997, https://www.nngroup.com/articles/search-and-you-may-find/.

25 Jakob Nielsen, "Macintosh: 25 Years," Nielsen Norman Group, last modified February 2, 2009, https://www.nngroup.com/articles/macintosh-25-years/; Jakob Nielsen and Raluca Budiu, Mobile Usability (Berkeley: New Riders, 2013), chap. 2, O'Reilly.

26 Jakob Nielsen, "Incompetent Research Skills Curb Users' Problem Solving," Nielsen Norman Group, last modified April 11, 2011, https://www.nngroup.com/articles/incompetent-search-skills/.

27 Jakob Nielsen, "Search: Visible and Simple," Nielsen Norman Group, last modified May 13, 2001, https://www.nngroup.com/articles/search-visible-and-simple/.

28 Ibid.

29 Jakob Nielsen, "Mental Models for Search Are Getting Firmer," Nielsen Norman Group, last modified May 9, 2005, https://www.nngroup.com/articles/mental-models-for-search/.

30 Ibid.

31 Erik Ojakaar and Jared M.
Spool, Getting Them to What They Want: Eight Best Practices to Get Users to the Content They Want (and to Content They Didn't Know They Wanted) (Bradford, MA: UIE Reports: Best Practices Series, 2001).

32 Amanda Nichols Hess and Mariela Hristova, "To Search or to Browse: How Users Navigate a New Interface for Online Library Tutorials," College & Undergraduate Libraries 23, no. 2 (2016): 173, https://doi.org/10.1080/10691316.2014.963274.

33 Ibid., 176.

34 Hyejung Han and Dietmar Wolfram, "An Exploration of Search Session Patterns in an Image-Based Digital Library," Journal of Information Science 42, no. 4 (2016): 483, https://doi.org/10.1177/0165551515598952.

35 Ibid., 487.

36 Xi Niu, Tao Zhang, and Hsin-liang Chen, "Study of User Search Activities with Two Discovery Tools at an Academic Library," International Journal of Human-Computer Interaction 30 (2014): 431, https://doi.org/10.1080/10447318.2013.873281.

37 Iris Xie and Soohyung Joo, "Tales from the Field: Search Strategies Applied in Web Searching," Future Internet 2 (2010): 268-69, https://doi.org/10.3390/fi2030259.

38 Ibid., 275; 267-68.

39 Ibid., 268-69.

40 Sonsteby and DeJonghe, "Usability Testing, User-Centered Design," 83-94.

41 We experienced technical difficulties when capturing screens and audio simultaneously in Camtasia. The audio did not sync in real time with the testing and we had to correct sync issues after the fact. A full technical test of screen capture and recording technology might have resolved this issue.

42 Nielsen, "Search: Visible and Simple."

43 Nielsen, "Search and You May Find"; Nielsen, "Incompetent Research Skills"; Iris Xie and Soohyung Joo, "Tales from the Field," 268-69.

44 Jakob Nielsen, "Search: Visible and Simple."

45 Ibid.

46 Pittsley and Memmott, "Improving Independent Student Navigation," 52-64.

47 E.g., Sonsteby and DeJonghe, "Usability Testing, User-Centered Design," 83-94.

48 Ouellette, "Subject Guides in Academic Libraries," 448; Brenda Reeb and Susan Gibbons, "Students, Librarians, and Subject Guides: Improving a Poor Rate of Return," Portal: Libraries and the Academy 4, no. 1 (January 22, 2004): 124, https://dx.doi.org/10.1353/pla.2004.0020; Staley, "Academic Subject Guides," 119-39.

49 Hess and Hristova, "To Search or to Browse," 174.

10979 ---- 20190318 10979 gallley Editorial Board Thoughts: Who Will Use This and Why? User Stories and Use Cases Kevin M.
Kevin M. Ford (kefo@loc.gov) is Librarian, Linked Data Specialist, Library of Congress.

Perhaps I'm that guy. The one always asking for either a "user story" or a "use case," and sometimes both. They are tools employed in software or system engineering to capture how, and importantly why, actors (often human users, but not necessarily) interact with a system. Both have protagonists, but one is more a creative narrative, the other like a strict, unvarnished retelling. User stories relate what an actor wants to do and why. Use cases detail to varying degrees how that actor might go about realizing that desire. The concepts, though distinct, are often confused and conflated. And, jargon though they may be, the concepts have sometimes been employed outside of technology to capture what an actor needs and the path the actor takes to his or her objective, including any decisions that might be made along the way, all in an effort to identify the best solution. By giving the actors a starring role, user stories and use cases ensure focus remains on the actors, their inputs, and the expected outcome. They protect against incorporating unnecessary elements, which could clutter and, even worse, weaken the end product, and they create a baseline understanding by which the result can be measured. And so I find myself frequently asking in meetings, and mumbling in hallways: "What's the use case for that?" or "Is there a user story? If not, then why are we doing it?" You get the idea.

It's a little ironic that I would become this person. Not because I didn't believe in user stories and use cases – quite the contrary, I've always believed in their importance and utility – but because of a book I was assigned during graduate coursework for my LIS degree, and my initial reaction to it. It's not just an unassuming book; it has a downright boring appearance, as one might expect of a book entitled "Use Case Modeling."1 It's a shocking 347 pages. It was a joint endeavor by two authors: Kurt Bittner and Ian Spence. I think I read it, but I can't honestly recall. I assume I did, because I was that type of student and I had a long Chicago El commute at the time. In any case, I know beyond doubt that I was assigned this book, dutifully obtained it, and then picked it up, thumbed through it, rolled my eyes, and probably said, "Ugh, really?"

And that's just it. The joke's on me. The concepts, and as such the book, which I've moved across the country a couple of times, remain near-daily constants in my life. As a developer, I basically don't do anything without a user story and a use case, especially one whose steps (including preconditions, alternatives, variables, triggers, and final outcome) haven't been reasonably sketched out. "Sketched out" is an interesting phrase, because one would think that if entire books were being authored on the topic of use cases, then use cases would be complicated and involved affairs. They can be, but they need not be. The same holds for user stories. Imagine you were designing a cataloging system; here's an example of the latter:

As a librarian I want my student catalogers to be guided through selection of vocabulary terms to improve both their accuracy and speed.2

That single-sentence user story identifies the actors (student catalogers), what they need (a "guided … selection of vocabulary terms"), and why ("to improve their accuracy and speed"). The use case would explore how the student catalogers (the actors) would interact with the system to realize that user story. The use case might be narrowly defined ("Adding controlled terms to records") or might be part of a broader use case ("Cataloging records"), but in either instance the use case might go to some length to describe the interaction between the student catalogers and the system in order to generate a clear understanding of the various interactions. By doing this, the use case helps to identify functional requirements, and it clearly articulates user/system expectations, which can be reviewed by stakeholders before work begins and used to verify delivery of the final product.
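To make the distinction concrete, here is one way the user story above might be elaborated. The template below is a conventional use case outline, and its steps, preconditions, and alternatives are illustrative inventions rather than details taken from the LD4L use case cited in the endnotes:

Use case: Add controlled vocabulary terms to a record
Primary actor: Student cataloger
Precondition: A record is open for editing.
Trigger: The cataloger begins typing in a subject field.
Main flow:
  1. The system suggests matching terms from the controlled vocabulary as the cataloger types.
  2. The cataloger selects a suggested term.
  3. The system stores the term's authorized form and identifier in the record.
Alternative flow:
  2a. No suggestion matches; the system lets the cataloger broaden the search or flag the heading for later review.
Outcome: The record contains only authorized terms, entered more quickly than by manual lookup.

Even a sketch this small surfaces functional requirements (a type-ahead suggestion service, an authority lookup, a review queue) that stakeholders can confirm or correct before any code is written.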
As I have presented this, using these tools might strike you as overly formal and time-consuming. In many circumstances they might be, if the developer has sufficient user and domain knowledge (rare, very, very rare) and especially if the "solution" is not an entirely new system but just an enhancement or augmentation to an existing system. Yet, whether it is a completely new system being developed by someone who has long and profound experience with the domain or a simple enhancement, it may be worth entertaining the questions and the process, even if only informally. I find it is often sufficient to ask "Who will use this and why?" Essentially I'm asking for the "user story" but dispensing with the jargon. Doing so may lead to additional questions, the answers to which would likely check the boxes of a "use case" even if the effort is not identified as such, and it certainly ensures the user-driven nature and need of the request.

This might all sound obvious, but I like to think of it as defensive programming, which is like defensive driving. Yes, the driver coming up to the stop sign on my right is going to stop, but I take my foot off the gas and position it over the brake just in case. Likewise, I'm confident the functional requirements I'm being handed have been fully considered and address a user need, but I'm going to ask for the user story anyway. I'm also leery of scope creep which, to continue the driving analogy, would be equivalent to driving to one store because you need to, but then also driving to two additional stores for items you think might be good to have but for which you have no present need. It's time-consuming, you've complicated your project, you've added expense to your budget, and the extra items might be of little or no use in the end. The number of times I've been in meetings in which new, additional features are discussed because the designers think they are a good idea (that is, there has been no actual user request or input sought) is alarmingly high. That's when I pipe up: "Is there a user story? If not, then why are we doing it?"

User stories and use cases help focus any development project on those who stand to benefit, i.e., the project's stakeholders, and can guard simultaneously against insufficient planning and software bloat. And the concepts, though most often thought of with respect to large-scale projects, apply in all circumstances, from the smallest feature request for an existing system to the redesign of a complex system. If you are not in the habit of asking, try it next time: Who will use this and why?
ENDNOTES

1 Kurt Bittner and Ian Spence, Use Case Modeling (Boston: Addison-Wesley, 2003). Also useful: Alistair Cockburn, Writing Effective Use Cases (Boston: Addison-Wesley, 2001).

2 "Use Case 3.4: Authority tool for more accurate data entry," Linked Data for Libraries (LD4L), accessed March 1, 2019, https://wiki.duraspace.org/display/ld4l/Use+Case+3.4%3A+Authority+tool+for+more+accurate+data+entry.

LITA President's Message
Updates from the 2019 ALA Midwinter Meeting
Bohyun Kim

Bohyun Kim (bohyun.kim.ois@gmail.com) is LITA President 2018-19 and Chief Technology Officer & Associate Professor, University of Rhode Island Libraries, Kingston, Rhode Island.

In this President's message, I would like to provide some updates from the 2019 ALA Midwinter Meeting held in Seattle, Washington.

First, as many of you know, the potential merger of LITA with ALCTS and LLAMA has been temporarily put on hold, due to an initial timeline that was rather ambitious and the lack of time required to deliberate on and resolve some issues in the transition plan to meet that timeline.1 These updates were also shared at the LITA Town Hall during the Midwinter Meeting, where many LITA members spent time discussing topics such as the draft mission and vision statements for the new division, what makes people feel at home in a division, in which areas LITA should redouble its focus, and which activities LITA may be able to set aside without losing its identity. Valuable feedback and thoughts were provided by Town Hall participants, many of whom emphasized the importance of building and retaining a community of library technologists around LITA values, programming, resources, advocacy, service activities, and networking opportunities. The merger-related discussion is to resume this spring, and the leadership of LITA, ALCTS, and LLAMA will make every effort to ensure the best future for the three divisions at this time of great flux and change.

Second, LITA is looking into introducing some changes to the LITA Forum. In the feedback gathered at the LITA Town Hall, the LITA Forum was also mentioned as one of the valuable LITA offerings to its members. The origin of the LITA Forum goes back to LITA's first national conference, held in Baltimore in 1983.2 Since then, the LITA Forum has become a cherished venue for many library technologists, a place where they meet other like-minded people in the field, learn from one another, share ideas and experience, and look for more ways in which technology can be utilized to better serve libraries and library patrons. Initially, the Steering Committee hoped that all three divisions would participate in putting together the LITA Forum, with a wider range of content encompassing the interests not only of LITA members but also of those in ALCTS and LLAMA, in a virtual format in order to engage more members who cannot easily travel, to be held sometime in spring 2020. At the time this idea was conceived more than a year ago, it was assumed that all preparations for the member vote regarding the merger would have been nearly completed by the time of the Midwinter Meeting. However, the Steering Committee unfortunately ran out of time for that preparation. Merger planning also took up almost the entirety of the time that the leadership and the staff of the three divisions had available.
This resulted in an unfortunate delay in proper Forum planning. With the merger conversation on hold at this point and the new timeline for the merger likely to be set back by at least a year, the changed circumstances for the Forum planning had to be reviewed.

After a lively and thoughtful discussion at the Midwinter Meeting, the LITA Board decided that, considering how much work remains to be done regarding merger planning, it may not be practical or feasible to have the next LITA Forum be the first virtual and joint one. However, there was a lot of interest in and excitement about trying a virtual format, since it will allow LITA to reach and serve the needs of more LITA members than the traditional in-person meeting. It was also pointed out that the virtual format may provide an opportunity for LITA to experiment with different and more unconventional conference program formats, which could be a welcome change for LITA members. The LITA Board, however, also acknowledged the value of a physical conference where people get to meet one another in person, which cannot be easily transferred to a virtual conference. If the virtual conference experiment takes place and is successful, LITA may hold its Forum alternating every year between the two formats, virtual and physical.

Planning for and running a fully virtual conference at the scale of a multi-day national forum will require additional time and careful consideration, since it will be the first time the LITA Forum Planning Committee and the LITA Office have attempted this. Logistics management is likely to be quite different in a virtual conference, and attendee expectations and the user experience will also differ significantly from those of a physical conference. As the first step of this investigation, the LITA Forum Planning Committee will explore what the ideal LITA virtual forum may look like in terms of programming formats and participant experience. The LITA Office and the Finance Advisory Committee will also look into the financial side of running the LITA Forum in a virtual format. At this time, it is not yet determined when the next LITA Forum will be held and whether it will be a virtual or a physical one. Once these investigations are completed, however, the LITA Board should be able to decide on the most appropriate path towards the next LITA Forum. Stay tuned for what exciting changes may be coming to the LITA Forum.

Third, I would like to mention that LITA issued a statement regarding the incidents of aggressive behavior, racism, and harassment reported at the 2019 ALA Midwinter Meeting.3 Along with the statement, the LITA Board has decided to commit funds to provide an online bystander/allyship training, which we hope will equip LITA members with tools that empower active and effective allyship, recognize and undo oppressive behaviors and systems, and promote the practice of cultural humility, thereby collectively increasing our collaborative capacity. The LITA statement and the Board decision were received positively by many LITA members. Other ALA divisions such as ALCTS, ALSC, ASGCLA, LLAMA, UNITED, and YALSA have already expressed interest in working together with LITA on this, and the LITA Board is looking into a few options to choose from. More information about the training will be provided soon.
Lastly, I am thrilled to announce that the LITA President's Program at the upcoming ALA Annual Conference in Washington, D.C., in June will feature Meredith Broussard, a data journalist and the author of Artificial Unintelligence: How Computers Misunderstand the World, as the speaker. In her book, Broussard delves into many problems surrounding techno-chauvinism, which displays blind optimism about technology and an abundant lack of caution about how new technologies will be used. She further details how this simplistic worldview, which prioritizes building new things and efficient code above social conventions and human interactions, often misinterprets a complex social issue as a technical problem and results in a reckless disregard for public safety and the public good.

Reviewing the early history of computing and digital technology, Broussard observes: "We have a small, elite group of men who tend to overestimate their mathematical abilities, who have systematically excluded women and people of color in favor of machines for centuries, who tend to want to make science fiction real, who have little regard for social convention, who don't believe that social norms or rules apply to them, who have unused piles of government money sitting around, and who have adopted the ideological rhetoric of far-right libertarian anarcho-capitalists. What could possibly go wrong?"4 I invite all of you to come to this program for more insight and a deeper understanding of what the recent technology innovation involving artificial intelligence (AI) and big data means to our everyday life and where it may be headed. The program information is available in the ALA 2019 Annual Conference Scheduler at https://www.eventscribe.com/2019/ALA-Annual/fsPopup.asp?Mode=presInfo&PresentationID=519109.

ENDNOTES

1 The official announcement can be found at the LITA Blog. See Bohyun Kim, "Update on New Division Discussions," LITA Blog, January 26, 2019, https://litablog.org/2019/01/update-on-new-division-discussions/.

2 Stephen R. Salmon, "LITA's First Twenty-Five Years: A Brief History," Library Information Technology Association (LITA), September 28, 2006, http://www.ala.org/lita/about/history/1st25years.

3 "LITA's Statement in Response to Incidents at ALA Midwinter 2019," LITA Blog, February 4, 2019, https://litablog.org/2019/02/litas-statement-in-response-to-incidents-at-ala-midwinter-2019/.

4 Meredith Broussard, Artificial Unintelligence: How Computers Misunderstand the World (Cambridge, Massachusetts: The MIT Press, 2018), 85.

Letter from the Editor
Kenneth J. Varnum

The current (March 2019) issue of Information Technology and Libraries sees the first of what I know will be many exciting contributions to our new "Public Libraries Leading the Way" column. This feature (announced in December 2018) shines a spotlight on technology-based innovations from the public library perspective. The first column, "The Democratization of Artificial Intelligence: One Library's Approach," by Thomas Finley of the Frisco (Texas) Public Library, discusses how his library has developed a teaching and technology lending program around artificial intelligence, creating kits that community members can take home and use to explore artificial intelligence through a practical, hands-on approach.
If you have a public library perspective on technology that you would like to share in a conversational, 1,000-1,500-word column, submit a proposal. Full details and a link to the proposal submission form can be found on the LITA Blog. I look forward to hearing your ideas.

In addition to the premiere column in this series, the current issue includes the LITA President's column from Bohyun Kim to update us on the 2019 ALA Midwinter meeting, particularly on the status of the proposed ALCTS/LLAMA/LITA merger, and our regular Editorial Board Thoughts column, contributed this quarter by Kevin Ford, on the importance of user stories in successful technology projects.

Articles in this issue cover these topics: improving sitewide navigation; improving the display of HathiTrust records in Primo; using linked data to create a geographic discovery system; measuring information system project success; a systematic approach towards web preservation; and determining textbook cost, formats, and licensing.

I hope you enjoy reading the issue, whether you explore just one article or read it "cover to cover." As always, if you want to share the research or practical experience you have gained as an article in ITAL, get in touch with me at varnum@umich.edu.

Sincerely,

Kenneth J. Varnum, Editor
varnum@umich.edu
March 2019

Communications
Creating and Deploying USB Port Covers at Hudson County Community College
Lotta Sanchez and John DeLooper

Lotta Sanchez (lsanchez@hccc.edu) is Library Associate – Technology, Hudson County Community College. John DeLooper (john.delooper@lehman.cuny.edu) is Web Services – Online Learning Librarian, Lehman College, City University of New York.

ABSTRACT

In 2016, Hudson County (NJ) Community College (HCCC) deployed several wireless keyboards and mice with its iMac computers. Shortly after deployment, library staff found that each device's required USB receiver (a.k.a. dongle) would disappear frequently. As a result, HCCC library staff developed and deployed 3D-printed port covers to enclose these dongles. This, for a time, proved very successful in preventing the issue. This article will discuss the development of these port covers, their deployment, and what worked and did not work about the project.

INTRODUCTION

3D printing was invented in the 1980s but remained a niche product until emerging as a mainstream technology beginning in 2009. It has been speculated that this growth in popularity was due to several factors, most notably the expiration of patents on technologies such as fused deposition modeling.1 The expiration of this patent led to the emergence of several new companies such as MakerBot, which developed and released lower-priced 3D printers in an effort to popularize 3D printing.2 Nevertheless, early 3D printers were still more expensive than most individual consumers could afford. As with laser printers in the 1980s, many libraries combined their role in the growing makerspace movement with their community purchasing power to bring this new technology to libraries across the United States.3 Libraries thus became focal points in the nascent consumer 3D printing movement, frequently providing both training and access to supplies and equipment.
As the popularity of 3D printing grew, new communities of 3D printing users emerged and began to design and share artwork and practical objects created with 3D printing technology, often via communities like Thingiverse and Shapeways.

3D PRINTING AT HUDSON COUNTY COMMUNITY COLLEGE

In August 2014, the Hudson County (NJ) Community College (HCCC) Library moved into a larger facility, nearly doubling its square footage. At this time, many libraries were beginning to open makerspaces, which are facilities for collaboration where the "emphasis is on creating with technology," and HCCC saw an opportunity to join this movement.4 Given the results of student feedback surveys and the observed popularity of 3D printing in public libraries, HCCC librarians sought to purchase a 3D printer as a signature technology for the new makerspace. To support the new makerspace, the library's staff implemented a series of workshops to teach students how to use the 3D printer and create their own projects. In addition, when the makerspace was not in use, the library's administration allowed staff to experiment with the 3D printer, as well as all technologies housed in the makerspace, to allow them to better understand and promote these tools.

ABOUT HUDSON COUNTY COMMUNITY COLLEGE

As per its 2017-18 HCCC Factbook, HCCC is an urban institution "offering courses and classes in a wide variety of disciplines and studies in one of the most densely populated and ethnically diverse areas of the United States."5 As of fall 2017, HCCC's full-time-equivalent student population is 7,712 and includes students representing "more than 90 nationalities." Many of these students hail from outside of the United States, "nearly 58 percent of whom speak a language other than English in their homes." HCCC's demographics also skew young, with students ages 20 through 29 comprising approximately 52 percent of enrolled students. More recently, HCCC has also increased its enrollment of high school students, as "the number of students under the age of 18, who are mostly enrolled through HCCC's various high school initiative programs, has more than quadrupled over the past five years." As with many other community colleges, HCCC's student body includes approximately a 6:4 ratio of female to male students.

THE MAC USB DILEMMA

As part of the move to a new facility, the library purchased several new technologies, including computers such as Dell PCs and Apple iMacs (Macs). The Dell PCs came with wireless keyboards and mice, and in March 2016 the Macs were switched to wireless keyboards and mice as well, because their original keyboards and mice began to break down and needed replacement. Students reported to library staff that the wireless keyboards and mice were a good investment, as they made it easier to move keyboards for better collaboration and for ease of storing backpacks and textbooks on desks. On both the Dell PCs and the Macs, the wireless keyboards and mice required the use of a small USB receiver, known as a dongle, to connect to the computer. As the wireless keyboards were installed, several library staff members raised concerns that wireless keyboards and mice would be tempting targets for theft by patrons. Surprisingly, theft of keyboards and mice did not come to pass. Since deployment, library staff reported no incidents of theft of any keyboards or mice.
However, an unexpected type of theft soon emerged. Library employees noticed that on the iMacs, the Type-A USB dongles, which were needed for the computers to receive input from the keyboards and mice, started disappearing. Staff observed that this seemed to be a problem only among the library's 18 Macs, not its 57 Dell computers, which also had wireless keyboards and wireless receivers. Anecdotal observation suggested that this phenomenon emerged due to the Dells' dark color scheme, which obscured each computer's USB ports and rendered the dongles inconspicuous. In contrast, the iMacs had sleek aluminum finishes, on which the dongles were more visible and seemed to be perceived by students as flash drives (see figures 1-5).

Figure 1. HCCC iMac (back with dongle shown).

Figure 2. HCCC Dell PC (back with dongle shown).

Figure 3. HCCC iMac with USB dongle closeup.

Figure 4. HCCC PC with USB dongle closeup.

Figure 5. Comparison of Mac and PC USB ports.

These perceptions were confirmed as students started visiting service points with dongles from the Macs and turning them in to library staff as "lost flash drives." As there was frequently a lag between a dongle's initial disappearance and its being turned over to staff, students began to report frustration that they would try to use a Mac and find that the mouse and keyboard could not communicate with the computer. This would cause them to assume that the Mac was broken, and library staff would respond by taking the computer out of service until a tech could examine it, often several hours or even days later depending on staffing. During the first semester that these keyboards and mice were deployed, the library found that almost every USB receiver was lost or stolen. This resulted in over $300 of unplanned expenses. In addition, library staff spent dozens of hours inspecting the iMacs after students reported non-functioning keyboards, determining what issue was occurring, ordering replacement parts, and connecting new dongles, a process also referred to as "pairing."

To address this problem, HCCC's director of library technology sought solutions from the library's technology staff. At a staff meeting in the spring 2016 semester, most of the members of the technology unit suggested that the library address the disappearing dongle issue by purchasing new wired keyboards and mice. The director of the technology unit felt that this was a premature solution, as he and the library administration preferred an approach that allowed the library to continue to use the wireless keyboards and mice, which were both costly and requested by the institution's student community. During this meeting, the idea of finding port covers for the dongles arose, and one of the library's technology associates suggested using the library's 3D printer to create a cover that inserts into one Type-A USB port and covers the dongle in the adjacent slot. The library's technology director asked her to create a prototype, and the technology associate began work on creating this port cover.
METHODOLOGY

To create the 3D-printed port cover, the technology associate began with an online search of the 3D-printing community Thingiverse, looking to see if any other 3D port covers already existed. She hoped to find an existing port cover that was both functional and easy to manufacture (in other words, quick to print), since the library's MakerBot Replicator often took hours to print intricate designs and frequently jammed, due to an extruder design flaw that was common to fifth-generation Replicator printers.6 A Thingiverse search found several varieties of port covers, but each was designed solely to occupy a port in order to prevent dust or corrosion, not to cover or hide dongles or other peripherals. Since none of the existing designs adequately met the library's needs, the technology associate created her own design using TinkerCAD, a web-based computer-aided design (CAD) program (see figure 6).

Figure 6. Picture of port cover design in TinkerCAD.

Since each of HCCC's iMac computers contained four Type-A ports, students would often attach other peripherals such as phone charge/sync cables or flash drives. Therefore, the port cover needed to be small enough to allow room for peripherals. The technology associate thus designed a cover that would not hinder students who wanted to insert their USB flash drives or other devices, as is depicted in figures 7-10.

Figure 7. Closeup of port cover.

Figure 8. Alternate angle of port cover.

Figure 9. Picture of dongle with port cover installed.

Figure 10. Port cover allowed space to utilize other USB ports for flash drives, etc.

She then exported the TinkerCAD design as an STL (stereolithography) file and printed prototypes on the MakerBot Replicator using PLA filament. Finding that her initial measurements did not quite fit, she adjusted the models one millimeter at a time and reprinted them until the fit was secure and the dongles were covered. At this point, she printed enough covers for each Mac, along with a few spares in case covers broke or wore out during normal operation.
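The cover was modeled interactively in TinkerCAD; no scripting was involved in the library's workflow. Purely as an illustration of the same idea in parametric form, a few lines of Python can emit equivalent geometry as OpenSCAD source, which OpenSCAD can then export to STL for printing. All dimensions here are hypothetical placeholders, not the measurements used at HCCC:

# Generate OpenSCAD source for a two-part cover: a solid tab that
# friction-fits the neighboring empty USB-A port, plus a hollow hood
# that slides over the dongle. Dimensions (mm) are hypothetical.
PLUG_W, PLUG_D, PLUG_H = 12.0, 9.0, 4.2   # tab width/depth/height
HOOD_W, HOOD_D, HOOD_H = 16.0, 14.0, 8.0  # hood outer size
WALL = 1.6                                # hood wall thickness

lines = [
    "// Anchor tab that occupies an empty USB-A port",
    "cube([%.1f, %.1f, %.1f]);" % (PLUG_W, PLUG_D, PLUG_H),
    "// Hood: outer box minus an inner cavity, open toward the computer",
    "translate([%.1f, 0, 0]) difference() {" % PLUG_W,
    "  cube([%.1f, %.1f, %.1f]);" % (HOOD_W, HOOD_D, HOOD_H),
    "  translate([%.1f, -1, %.1f]) cube([%.1f, %.1f, %.1f]);"
    % (WALL, WALL, HOOD_W - 2 * WALL, HOOD_D + 1 - WALL, HOOD_H - 2 * WALL),
    "}",
]

with open("port_cover.scad", "w") as f:
    f.write("\n".join(lines) + "\n")

With a parametric model like this, the iterative fitting described above amounts to nudging WALL or the tab dimensions and regenerating the file, rather than redrawing the model by hand.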
RESULTS

At the beginning of the fall 2016 semester, the port covers were deployed to each of the library's Macs. During that semester, the technology associate monitored the effectiveness of the port covers. By the end of the semester, four port covers had disappeared, along with one dongle. At the beginning of the spring 2017 semester, the missing dongle was replaced, and replacement port covers were printed and deployed to the machines from which the port covers had disappeared. Again, the success of the port cover installation project was monitored. During this period, four port covers disappeared, along with two dongles. After the spring semester, the technology associate conferred with the director of library technology, and they decided that, given the relatively low cost of the 3D-printer filament used to print the covers and the greatly reduced receiver theft rate, this was an acceptable loss. They therefore decided to continue utilizing the port covers. But in the fall 2017 semester, five covers and each of their corresponding dongles disappeared. Then, during the spring 2018 semester, all of the port covers disappeared at least once, as did the associated dongles. In total, 20 dongles were lost during that semester. The director of library technology and the technology associate conferred once again and decided that, due to this increase in theft and a concurrent change in the college's purchasing process, the library would abandon the 3D-printed port cover experiment.

ANALYSIS

After two seemingly successful semesters, library staff were proud of the changes that resulted from deploying the port covers. Yet given the recurrence of the theft pattern in subsequent semesters, they started to worry that printing new port covers was not a sustainable practice. To that end, the technology associate considered several theories as to what would cause the port covers to disappear. For instance, research by Keizer, Lindenberg, and Steg found that acts of social disorder (such as graffiti or litter) will spread if not stopped promptly.7 Under this framework, it could be suggested that the library was too slow to respond to missing covers and thus permitted the loss of the dongles through insufficient action or maintenance. This theory seems logical since, following an enrollment decline that began in fall 2016, a hiring freeze was instituted, so as staff members left the institution, few positions were replaced. Indeed, as of fall 2018, HCCC's staff is 75 percent part-time, and part-timers are subject to renewal or dismissal every six months. In addition, many library employees are student workers, who often leave at graduation, and other part-time staff tend to find full-time employment or leave the library for full-time work at rates that may exceed those of institutions with more permanent staff. With limited staff resources, many of the library's employees noted anecdotally that they were not able to give as much attention to preventative maintenance on library computers as they had in prior semesters. Therefore, they did not have time to proactively monitor equipment such as port covers and dongles.

It is also possible that a novelty factor was at play. Perhaps when the covers were first deployed, the brightly colored filaments stood out on the aluminum computers, making students more likely to notice them and alter their behaviors accordingly. If this was the case, new students who began their coursework in subsequent semesters would not have known that port covers were an addition made to the library's computers in response to prior issues. Following this speculation, the library's patrons who removed port covers in fall 2017 and spring 2018 might have thought they were removing damaged or nonfunctional flash drives, similarly to the students who brought what they believed were lost flash drives to library staff during the spring 2016 semester. Finally, the difference between semesters could also have been due to random chance, in which case no staff action could have affected the rate at which port covers disappeared.

CONCLUSION AND FUTURE RESEARCH

Being unsure of which of these analyses was most correct, the technology associate had planned to learn from the sudden resurgence in thefts in several ways.
She planned to experiment with adding signage about the importance of dongles and the usage of port covers, and to interview student Mac users to find out their perceptions of the port covers, as well as to gather student-generated ideas and suggestions to prevent future thefts. She also considered designing and experimenting with printing more elaborate port covers to see if increased visibility or an elaborate shape would change theft rates.

However, a complication arose during the 2018 and 2019 fiscal years. During this time, the college's finance office changed its purchasing procedures. First, it eliminated the library's technology budget, centralizing all technology purchases in a "pool" whose total budget was uncertain. To make purchases from this pool, departments had to create a detailed needs justification and obtain approvals from four high-level executives, in addition to the preexisting procedure of obtaining quotes and getting department head and vice president approval. While the library was eventually able to obtain funds from this process, navigating the pool process typically took about six months per purchase, which meant that, in effect, replacement dongles had to come from existing supplies. In addition, the supplies budget line, which was greatly reduced due to the enrollment decline, also came under increased scrutiny, and the purchasing department began to refuse to approve the purchase of batteries. While many of the Mac keyboards were solar powered, and thus did not require batteries, all of the wireless mice, along with the wireless keyboards and mice on the Windows PCs, required either AA or AAA batteries. As battery supplies dwindled, the purchasing department did eventually agree to allow the purchase of more batteries, under the condition that the library begin going through the pool process to purchase wired keyboards and mice. In the meantime, the technology associate continues to monitor wireless dongles, reprint port covers, and swap in wired keyboards from the library's spare-parts inventory to replace wireless ones as dongles disappear.

The creation of 3D-printed port covers was successful at preventing equipment loss at HCCC for only two semesters before failing to fulfill that purpose. Library staff speculated about the cause of this change but were unable to make that determination with certainty before budgetary changes ended the 3D-printed port cover experiment. Nevertheless, this project proved valuable in helping the library learn more about 3D-printing technology and experiment with its practical uses in the library environment.

ENDNOTES

1 Filemon Schoffer, "How Expiring Patents Are Ushering in the Next Generation of 3D Printing," TechCrunch (blog), June 5, 2016, http://social.techcrunch.com/2016/05/15/how-expiring-patents-are-ushering-in-the-next-generation-of-3d-printing/.

2 Christopher Mims, "3D Printing Will Explode in 2014, Thanks to the Expiration of Key Patents," Quartz (blog), July 21, 2013, https://qz.com/106483/3d-printing-will-explode-in-2014-thanks-to-the-expiration-of-key-patents/.

3 Jason Griffey, "Absolutely Fab-Ulous," Library Technology Reports 48, no. 3 (April 2012): 21–24, https://journals.ala.org/index.php/ltr/article/view/4794.

4 Caitlin Bagley, "What Is a Makerspace?
Creativity in the Library," ALA TechSource, December 20, 2012, http://www.ala.org/tools/article/ala-techsource/what-makerspace-creativity-library; United for Libraries, American Library Association Office for Information Technology Policy, and Public Library Association, "Progress in the Making: An Introduction to 3D Printing and Public Policy," September 2014, http://www.ala.org/advocacy/sites/ala.org.advocacy/files/content/advleg/pp/hometip-3d_printing_tipsheet_version_9_Final.pdf.

5 Hudson County Community College, "Fact Book 2017-2018," 2018, https://www.HCCC.edu/uploadedFiles/Pages/Explore_HCCC/Visiting_HCCC(1)/FACTBOOK-%20final%20web%20version.pdf.

6 Adi Robertson, "MakerBot Is Replacing Its Most Ill-Fated 3D Printing Product," The Verge (blog), January 4, 2016, https://www.theverge.com/2016/1/4/10677740/new-makerbot-smart-extruder-plus-3d-printer-ces-2016.

7 Kees Keizer, Siegwart Lindenberg, and Linda Steg, "The Spreading of Disorder," Science 322, no. 5908 (2008): 1681–85.

Articles
Assessing the Effectiveness of Open Access Finding Tools
Teresa Auch Schultz, Elena Azadbakht, Jonathan Bull, Rosalind Bucy, and Jeremy Floyd

Teresa Auch Schultz (teresas@unr.edu) is Social Sciences Librarian, University of Nevada, Reno. Elena Azadbakht (eazadbakht@unr.edu) is Health Sciences Librarian, University of Nevada, Reno. Jonathan Bull (jon.bull@valpo.edu) is Scholarly Communications Librarian, Valparaiso University. Rosalind Bucy (rbucy@unr.edu) is Research & Instruction Librarian, University of Nevada, Reno. Jeremy Floyd (jfloyd@unr.edu) is Metadata Librarian, University of Nevada, Reno.

ABSTRACT

The open access (OA) movement seeks to ensure that scholarly knowledge is available to anyone with internet access, but being available for free online is of little use if people cannot find open versions. A handful of tools have become available in recent years to help address this problem by searching for an open version of a document whenever a user hits a paywall.
This project set out to study how effective four of these tools are when compared to each other and to Google Scholar, which has long been a means of finding OA versions. To do this, the project used Open Access Button, Unpaywall, Lazy Scholar, and Kopernio to search for open versions of 1,000 articles. Results show none of the tools found as many successful hits as Google Scholar, but two of the tools did register unique successful hits, indicating a benefit to incorporating them in searches for OA versions. Some of the tools also include additional features that can further benefit users in their search for accessible scholarly knowledge.

INTRODUCTION

The goal of open access (OA) is to ensure as many people as possible can read, use, and benefit from scholarly research without having to worry about paying to read and, in many cases, without restrictions on reusing the works. However, OA scholarship helps few people if they cannot find it. This is especially problematic for green OA works, which are those that have been made open by being deposited in an open online repository even if they were published in a subscription-based journal. OpenDOAR reports more than 3,800 such repositories.1 As users are unlikely to search each individual repository, an efficient search method is needed to find the OA items spread across so many locations. In recent years, several browser extensions have been released that allow a user to search for an open version of an article while on a webpage for that article. The tools include:

• Lazy Scholar, a browser extension that searches Google Scholar, PubMed, EuropePMC, DOAI.io, and Dissem.in. It has extensions for both the Chrome and Firefox browsers.2

• Open Access Button, which uses both a website and a Chrome extension to search for OA versions.3

• Unpaywall, which also acts through a Chrome extension to search for open articles via the digital object identifier.4

• Kopernio, a browser extension that searches subject and institutional repositories and is owned by Clarivate Analytics. Kopernio has extensions for Chrome, Firefox, and Opera.5

Some of the tools offer other services, such as Open Access Button's ability to help the user email the author of an article if no open version is available, as well as integration with libraries' interlibrary loan workflows. Kopernio and Lazy Scholar offer to sync with a user's institutional library to see if an article is available through the library's collection.6 Although other similar extensions might also exist, this article is focused on the four mentioned above, based on the authors' knowledge of available OA finding tools at the time of the project.

LITERATURE REVIEW

As noted above, scholars have indicated for several years a need for reliable and user-friendly methods, systems, or tools that can help researchers find OA materials. Bosman et al. forwarded the idea of a scholarly commons (a set of principles, practices, and resources to enable research openness) that depends upon clear linkages between digital research objects.7 Bulock notes that OA has "complicated" retrieval in that OA versions are often housed in various locations across the web, including institutional repositories (IRs), preprint servers, and personal websites.8
There is no perfect search option or tool, although some have tried creating solutions, such as the Open Jericho project from Wayne State University, which is seeking to create an aggregator to search institutional repositories and, eventually, other sources as well.9 However, this lack of a central search tool can lead to confusion among researchers.10 Nicholas and colleagues found that their sample of early career scholars drawn from several countries relied heavily on Google and Google Scholar to find articles that interested them.11 Many also turn to ResearchGate and other social media platforms, and risk running afoul of copyright. The results of Ithaka S+R's 2015 survey of faculty in the United States reflect these findings to a certain extent, as variations exist between researchers in different disciplines.12 A majority of the respondents also indicated an affinity for freely accessible materials. As more researchers become aware of and gravitate toward OA options, the efficacy of various discovery tools, such as the browser extensions evaluated in this study, will become even more pertinent.

Previous studies on the findability of OA scholarship have focused primarily on Google and Google Scholar.13 A few have assessed tools such as OAIster, OpenDOAR, and PubMed Central.14 Norris, Oppenheim, and Rowland sought a selection of articles using Google, Google Scholar, OAIster, and OpenDOAR.15 While OAIster and OpenDOAR found just 14 percent of the articles' open versions, Google and Google Scholar combined managed to locate 86 percent. Jamali and Nabavi assessed Google Scholar's ability to retrieve the full text of scholarly publications and documented the major sources of the full-text versions (publisher websites, institutional repositories, ResearchGate, etc.).16 Google Scholar was able to locate full-text versions of more than half (57.3 percent) of the items included in the study. Most recently, Martín-Martín et al. likewise used Google Scholar to gauge the availability of OA documents across different disciplines.17 They found that roughly 54.6 percent of the scholarly content for which they searched was freely available, although only 23.1 percent of their sample were OA by virtue of the publisher.

As of yet, no known studies have systematically evaluated the growing selection of OA finding tools' efficiency and effectiveness at retrieving OA versions of articles. However, several scholars and journalists have reviewed these new tools, especially the more established Open Access Button and Unpaywall.18 These reviews were mostly positive, even as some acknowledged that the tools are not a wholescale solution for locating OA publications. Despite pointing out these tools' limitations, reviewers voiced their hope that the OA finding tools could help disrupt the traditional scholarly publishing industry.19

At least one study has used the Open Access Button to determine the green OA availability of journal articles. Emery used the tool as the first step to identify OA article versions, then searched individual institutional repositories, followed by Google Scholar, as the final steps.20 Emery found that 22 percent of the study sample was available as green OA but did not say what portion of that was found by the Open Access Button.
Emery did note that the Open Access Button returned 17 false positives (six in which the tool took the user to the wrong article or other content, and 11 in which it took the user to a citation of the article with no full text available). She also found at least 38 cases of false-negative returns from the Open Access Button, or articles that were openly available that the tool failed to find. The study did not count open versions found on ResearchGate or Academia.edu.

METHODOLOGY

OA Finding Tools

This study compared the Chrome browser extensions for Google Scholar and four OA finding tools: Lazy Scholar, Unpaywall, Open Access Button, and Kopernio. Each extension was used in the Chrome browser to search for open versions of the selected articles, and the success of each extension in finding any free, full version was recorded. The authors did not track whether an article was licensed for reuse. For the four OA finding tools, the occurrences of false positives (e.g., the retrieval of an error page, a paywalled version, or the wrong article entirely) were also tracked. False positives were not tracked for Google Scholar, which does not purport to find only open versions of articles.

Data collection occurred over a six-week period in October and November 2018. The authors used Web of Science to identify the test articles (N=1,000), with the aim of selecting articles that would give the tools the best chance of finding a high number of open versions. Articles selected were published in 2015 and 2016. These years were selected in order to try to avoid embargoes that might have prevented articles from being made open through deposit. The articles were selected from two disciplines, Applied Physics and Oncology, both of which have a large share in Web of Science and come from a broader discipline with a strong OA culture.21

Each comparison began with searching the Google Scholar extension by article DOI, or by title if a DOI was not available. All versions retrieved by Google Scholar were examined until an open version was located or until the retrieved versions were exhausted. The remaining OA tools were then tested from the webpage for the article record on the journal's website (if available). If no journal page was available, the article PDF page was tested. All data were recorded in a shared Google Sheet according to a data dictionary. Searches for open versions of paywalled articles were performed away from the authors' universities to ensure the institutions' subscriptions to various journals did not impact the results. Authors were limited in the number of articles they could search each day, as some tools blocked continued use, presumably over concerns about illegitimate web activity, after as few as 15 searches.
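The data collection described above was manual, driven through each tool's browser interface. It is worth noting that Unpaywall also offers a free public REST API, so a comparable (though not equivalent) check can be scripted. The sketch below is illustrative only and was not part of this study's method; it relies on the v2 API's is_oa and best_oa_location response fields, and the email address and sample DOI are placeholders:

import json
import urllib.parse
import urllib.request

EMAIL = "you@example.edu"  # Unpaywall asks callers to identify themselves

def check_doi(doi):
    """Return (is_oa, url) for one DOI from the Unpaywall v2 API."""
    url = ("https://api.unpaywall.org/v2/"
           + urllib.parse.quote(doi)
           + "?email=" + urllib.parse.quote(EMAIL))
    with urllib.request.urlopen(url) as resp:
        record = json.load(resp)
    best = record.get("best_oa_location") or {}
    return record.get("is_oa", False), best.get("url_for_pdf") or best.get("url")

for doi in ["10.1234/placeholder-doi"]:  # substitute a real DOI list
    is_oa, oa_url = check_doi(doi)
    print(doi, "open" if is_oa else "closed", oa_url or "")

Error handling and polite rate limiting are omitted for brevity. A script like this scales easily to a thousand DOIs, but it would not replace the manual testing used here, which captured how each extension actually behaves on a publisher's page.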
Study Limitations

This methodology might have missed open versions of articles, even using these five search tools. Although studies have found Google Scholar to be one of the most effective ways of searching for open versions, Way has shown that it is not perfect.22 Therefore, it is possible that this study undercounted the number of OA articles. The study tested the ability of OA finding tools to locate open articles from a journal's main article page, not other possible webpages (e.g., the Google Scholar results page). This design may have limited the effectiveness of some tools, such as Kopernio, which appear to work well with some webpages but not others.

RESULTS

Overall, the tools found open versions for just under half of the study sample (490 articles) and found no open versions for the other 510 articles. Although Lazy Scholar, Unpaywall, Open Access Button, and Kopernio all found open versions, Google Scholar returned the most, with 462 articles (94 percent of all articles with at least one open version). Open Access Button, Lazy Scholar, and Unpaywall all found a majority of the open articles (62 percent, 73 percent, and 67 percent, respectively); however, Kopernio found open versions for just 34 percent of the articles (see figure 1).

Figure 1. Number of open versions found by each tool.

It was most common for three or more of the tools to find an open version for an article, with just 48 articles found by two tools and 98 found by only one tool (see figure 2).

Figure 2. Number of articles where X number of OA finding tools found an open version.

When looking at articles where only one tool returned an open version, Google Scholar had the highest results (84). Open Access Button (4) and Lazy Scholar (10) also returned unique hits, but Unpaywall and Kopernio did not. Open Access Button returned the most false positives with 46, or nearly 5 percent of all 1,000 articles. Lazy Scholar returned 31 false positives (3 percent), Unpaywall returned 14 (1 percent), and Kopernio returned 13 (1 percent).
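Because each article's outcome for each tool was recorded in the shared Google Sheet, totals and unique-hit counts like those above reduce to a simple tally once the sheet is exported. The following sketch shows that computation; the file name, column names, and true/false coding are hypothetical stand-ins, not the study's actual data dictionary:

import csv
from collections import Counter

TOOLS = ["google_scholar", "lazy_scholar", "unpaywall",
         "open_access_button", "kopernio"]

totals = Counter()        # open versions found per tool
unique_hits = Counter()   # articles that only one tool could open

with open("results.csv", newline="") as f:  # hypothetical sheet export
    for row in csv.DictReader(f):
        found = [t for t in TOOLS
                 if row.get(t, "").strip().lower() in {"1", "true", "yes"}]
        for tool in found:
            totals[tool] += 1
        if len(found) == 1:
            unique_hits[found[0]] += 1

for tool in TOOLS:
    print(tool, totals[tool], "found,", unique_hits[tool], "unique")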
DISCUSSION

The results for the OA search tools show that while all four options met with some success, none of them performed as well as Google Scholar. Three of the tools (Lazy Scholar, Open Access Button, and Unpaywall) did find at least half of the open versions that Google Scholar did. It is important to note that Open Access Button, which found the second-fewest open versions, does not search ResearchGate and Academia.edu because of legal concerns over article versions that are likely infringing copyright.23 This could have affected Open Access Button's performance. Likewise, Kopernio's lower rate of finding OA resources might relate to concerns over article versions as well. When creating an account on Kopernio, the user is asked to affiliate with an institution so that the tool can search existing library subscriptions at that institution. For this study, the authors did not affiliate with their home institutions when setting up Kopernio, to get a better idea of which content was open as opposed to content being accessible because of the tool connecting to a library's subscription collection. If the authors were to identify with an institution, the number of accessible articles would likely increase, but this access would not be a true representation of what open content is discoverable.

In addition, some tools might work better with certain publishers than others. For instance, Kopernio did not appear to work with Spandidos Publications, a leading biomedical science publisher that publishes much of its content as gold OA, meaning the entire journal is published as OA. Kopernio found just one open version of a Spandidos article, compared to 153 by Google Scholar. This could be an unintentional malfunction either with Spandidos or Kopernio, which, if fixed, could greatly increase the efficacy of this finding tool. However, Open Access Button, Lazy Scholar, and Unpaywall were able to find OA publications from Spandidos at similar rates (135, 138, and 139, respectively), with no false positives.

While none of the tools performed as well as Google Scholar, some of the tools were easier to use than Google Scholar. Google Scholar does not automatically show an open version first; instead, users often have to first select the "All X Versions" option at the bottom of each record and then open each version until they find an open one. Lazy Scholar and Unpaywall appear (for the most part) automatically, meaning users can see right away if an open version is available and then click a button once to be taken to that version. Although Open Access Button and Kopernio do not show automatically whether they have found an open version, users need to click a button on their toolbar only once to activate each tool and see if it was able to find an open version. Open Access Button also provides the extra benefit of making it easy for users to email authors to ask them to make their works open if an open version is not already available. Relying on Lazy Scholar, Unpaywall, or Open Access Button first causes users no harm, and they can always rely on Google Scholar as a backup.

Whether all four tools are needed is questionable. For instance, a few of the authors found Kopernio difficult to work with, as it seemed to be incompatible with at least one publisher's website and it introduced extra steps in downloading a PDF file. The fact that it also returned by far the fewest open versions (just 36 percent of the ones Google Scholar found, and no unique hits) does not argue well for users to include it in their OA finding toolbox. Also, while Lazy Scholar, Unpaywall, and Open Access Button all performed better on their own, the authors wonder what improvements could be created by combining the resources of the individual tools.

CONCLUSION

The growth of OA finding tools is encouraging for its potential to make OA works more discoverable. Although the study showed that Google Scholar uncovered more articles than any of the other tools, the utility of at least two of the tools, Lazy Scholar and Open Access Button, can still be seen in that both found articles not discovered by the other tools, including Google Scholar. Indeed, using the tools in conjunction with one another appears to be the best method. And although Open Access Button found the second-fewest articles, the tool's effort to integrate with interlibrary loan and discovery workflows, as well as its concern about legal issues, are all promising for its future. Likewise, Kopernio might be a better tool for those interested in combining access to a library collection (which likely holds a large number of final, publisher versions of scholarship) with their search for openly available scholarship.

Future studies can include newer OA finding tools that have entered the market, as well as evaluate the user experience of the tools. Another study can also look at how well Open Access Button's author email feature works.
Also, as Open Access Button and Unpaywall continue to move into new areas, such as interlibrary loan support, research could explore whether these are more effective ways of connecting users to OA material, as well as measure users' understanding of the OA versions they find. Overall, the emergence of OA finding tools offers much potential for increasing the visibility of OA versions of scholarship, although no tool is perfect. If scholars wish to support OA through their research practices, or find themselves unable to purchase or legally acquire the publisher's version, each of these tools can be a valuable addition to their work.

DATA STATEMENT

The data used for this study have been shared publicly in the Zenodo database under a CC-BY 4.0 license at https://doi.org/10.5281/zenodo.2602200.

ENDNOTES

1 Jisc, "Browse by Country and Region," accessed February 15, 2019, http://v2.sherpa.ac.uk/view/repository_by_country/countries_by_region.html.

2 Colby Vorland, "Extension," accessed March 14, 2019, http://www.lazyscholar.org/; Colby Vorland, "Data Sources," Lazy Scholar (blog), accessed March 14, 2019, http://www.lazyscholar.org/data-sources/.

3 "Avoid Paywalls, Request Research," Open Access Button, accessed March 14, 2019, https://openaccessbutton.org/.

4 Unpaywall, "Browser Extension," accessed March 14, 2019, https://unpaywall.org/products/extension.

5 Kopernio, "FAQs," accessed March 14, 2019, https://kopernio.com/faq.

6 Colby Vorland, "Features," Lazy Scholar (blog), accessed March 14, 2019, http://www.lazyscholar.org/category/features/.

7 Jeroen Bosman et al., "The Scholarly Commons—Principles and Practices to Guide Research Communication," Open Science Framework, September 15, 2017, https://doi.org/10.17605/OSF.IO/6C2XT.

8 Chris Bulock, "Delivering Open," Serials Review 43, no. 3–4 (October 2, 2017): 268–70, https://doi.org/10.1080/00987913.2017.1385128.

9 Elliot Polak, email message to author, June 4, 2019.

10 Bulock, "Delivering Open."

11 David Nicholas et al., "Where and How Early Career Researchers Find Scholarly Information," Learned Publishing 30, no. 1 (January 1, 2017): 19–29, https://doi.org/10.1002/leap.1087.

12 Christine Wolff, Alisa B. Rod, and Roger C. Schonfeld, "Ithaka S+R US Faculty Survey 2015," 2015, 83, https://sr.ithaka.org/publications/ithaka-sr-us-faculty-survey-2015/.

13 Mamiko Matsubayashi et al., "Status of Open Access in the Biomedical Field in 2005," Journal of the Medical Library Association 97, no. 1 (January 2009): 4–11, https://doi.org/10.3163/1536-5050.97.1.002; Michael Norris, Charles Oppenheim, and Fytton Rowland, "The Citation Advantage of Open-Access Articles," Journal of the American Society for Information Science and Technology 59, no. 12 (October 1, 2008): 1963–72, https://doi.org/10.1002/asi.20898; Doug Way, "The Open Access Availability of Library and Information Science Literature," College & Research Libraries 71, no. 4 (2010): 302–09; Charles Lyons and H.
Austin Booth, "An Overview of Open Access in the Fields of Business and Management," Journal of Business & Finance Librarianship 16, no. 2 (March 31, 2011): 108–24, https://doi.org/10.1080/08963568.2011.554786; Hamid R. Jamali and Majid Nabavi, "Open Access and Sources of Full-Text Articles in Google Scholar in Different Subject Fields," Scientometrics 105, no. 3 (December 1, 2015): 1635–51, https://doi.org/10.1007/s11192-015-1642-2; Alberto Martín-Martín et al., "Evidence of Open Access of Scientific Publications in Google Scholar: A Large-Scale Analysis," Journal of Informetrics 12, no. 3 (August 1, 2018): 819–41, https://doi.org/10.1016/j.joi.2018.06.012.

14 Norris, Oppenheim, and Rowland, "The Citation Advantage of Open-Access Articles"; Michael Norris, Fytton Rowland, and Charles Oppenheim, "Finding Open Access Articles Using Google, Google Scholar, OAIster and OpenDOAR," Online Information Review 32, no. 6 (November 21, 2008): 709–15, https://doi.org/10.1108/14684520810923881; Maria-Francisca Abad‐García, Aurora González‐Teruel, and Javier González‐Llinares, "Effectiveness of OpenAIRE, BASE, Recolecta, and Google Scholar at Finding Spanish Articles in Repositories," Journal of the Association for Information Science and Technology 69, no. 4 (April 1, 2018): 619–22, https://doi.org/10.1002/asi.23975.

15 Norris, Rowland, and Oppenheim, "Finding Open Access Articles Using Google, Google Scholar, OAIster and OpenDOAR."

16 Jamali and Nabavi, "Open Access and Sources of Full-Text Articles in Google Scholar in Different Subject Fields."

17 Martín-Martín et al., "Evidence of Open Access of Scientific Publications in Google Scholar."

18 Stephen Curry, "Push Button for Open Access," The Guardian, November 18, 2013, sec. Science, https://www.theguardian.com/science/2013/nov/18/open-access-button-push; Bonnie Swoger, "The Open Access Button: Discovering When and Where Researchers Hit Paywalls," Scientific American Blog Network, accessed May 30, 2017, https://blogs.scientificamerican.com/information-culture/the-open-access-button-discovering-when-and-where-researchers-hit-paywalls/; Lindsay McKenzie, "How a Browser Extension Could Shake Up Academic Publishing," Chronicle of Higher Education 68, no. 33 (April 21, 2017): A29; Joyce Valenza, "Unpaywall Frees Scholarly Content," School Library Journal 63, no. 5 (May 2017): 11; Barbara Quint, "Must Buy? Maybe Not," Information Today 34, no. 5 (June 2017): 17; Michaela D.
Willi Hooper, "Product Review: Unpaywall [Chrome & Firefox Browser Extension]," Journal of Librarianship & Scholarly Communication 5 (January 2017): 1–3, https://doi.org/10.7710/2162-3309.2190; Terry Ballard, "Two New Services Aim to Improve Access to Scholarly Pdfs," Information Today 34, no. 9 (November 2017): Cover-29; Diana Kwon, "A Growing Open Access Toolbox," The Scientist, accessed December 11, 2017, https://www.the-scientist.com/?articles.view/articleNo/51048/title/A-Growing-Open-Access-Toolbox/; Kent Anderson, "The New Plugins — What Goals Are the Access Solutions Pursuing?," The Scholarly Kitchen, August 23, 2018, https://scholarlykitchen.sspnet.org/2018/08/23/new-plugins-kopernio-unpaywall-pursuing/.

19 Curry, "Push Button for Open Access"; Swoger, "The Open Access Button"; McKenzie, "How a Browser Extension Could Shake Up Academic Publishing"; Kwon, "A Growing Open Access Toolbox."

20 Jill Emery, "How Green Is Our Valley?: Five-Year Study of Selected LIS Journals from Taylor & Francis for Green Deposit of Articles," Insights 31, no. 0 (June 20, 2018): 23, https://doi.org/10.1629/uksg.406.

21 Anna Severin et al., "Discipline-Specific Open Access Publishing Practices and Barriers to Change: An Evidence-Based Review," F1000Research 7 (December 11, 2018): 1925, https://doi.org/10.12688/f1000research.17328.1.

22 Way, "The Open Access Availability of Library and Information Science Literature."

23 Open Access Button, "Open Access Button Library Service FAQs," Google Docs, accessed February 19, 2019, https://docs.google.com/document/d/1_HWKrYG7Qj7ff05-cx8Kw40mL7ExwRz6ks5Fb10GEGg/edit?usp=embed_facebook.
11015 ---- Library-Authored Web Content and the Need for Content Strategy

Courtney McDonald and Heidi Burkhardt

Courtney McDonald (crmcdonald@colorado.edu) is Learner Experience & Engagement Librarian and Associate Professor, University of Colorado at Boulder. Heidi Burkhardt (heidisb@umich.edu) is Web Project Manager & Content Strategist, University of Michigan.

ABSTRACT

Increasingly sophisticated content management systems (CMS) allow librarians to publish content via the web and within the private domain of institutional learning management systems. "Libraries as publishers" may bring to mind roles in scholarly communication and open scholarship, but the authors argue that libraries' self-publishing dates to the first "pathfinder" handout and continues today via commonly used, feature-rich applications such as WordPress, Drupal, LibGuides, and Canvas. Although this technology can reduce costly development overhead, it also poses significant challenges. These tools can inadvertently be used to create more noise than signal, potentially alienating the very audiences we hope to reach. No CMS can, by itself, address the fact that authoring, editing, and publishing quality content is both a situated expertise and a significant, ongoing demand on staff time. This article will review library use of CMS applications, outline challenges inherent in their use, and discuss the advantages of embracing content strategy.

INTRODUCTION

We tend to look at content management as a digital concept, but it's been around for as long as content. For as long as humans have been creating content, we've been searching for solutions to manage it. The Library of Alexandria (300 BC to about AD 273) was an early attempt at managing content. It preserved content in the form of papyrus scrolls and codices, and presumably controlled access to them. Librarians were the first content managers.1 (emphasis added)

Content is, and has always been, central to the mission of libraries. Content is physical, digital, acquired, purchased, leased, subscribed, and created. "Libraries as publishers" may bring to mind roles in scholarly communication and open scholarship, but the authors argue that libraries' self-publishing dates to the first mimeographed "pathfinder" handout and continues today via commonly used, feature-rich web content management systems (CMSs). Libraries use these CMSs to support research, teaching, and learning in a variety of day-to-day operations.
The sophisticated and complex infrastructure surrounding web-based library content has evolved from the singular, independently hosted and managed "library website" into a "library web ecosystem" comprised of multiple platforms, including integrated library systems, institutional repositories, CMSs, and others. Multiple CMS applications, whether open-source (e.g., WordPress, Drupal), institutionally supported (e.g., Canvas, Blackboard), or library-specific (e.g., Springshare's LibGuides), are employed by most libraries to power the library's website and research guides, as well as to make their collections, in any and all formats, discoverable and accessible.

Library staff at all levels create and publish content through these CMS platforms, an activity that is critical to our users discovering what we offer and accomplishing their goals. The CMS removes technical bottlenecks and enables subject matter experts to publish content without coding expertise or direct access to a server. This disintermediation has many benefits, enabling librarians to share and interact directly with their communities, and reducing costly development overhead.

As with any powerful technology that's simple to use, effectively implementing a CMS is not without pitfalls. Through these tools, we can inadvertently create more noise than signal, potentially alienating the very audiences we hope to reach. Further, effective management of content and workflows across and among so many platforms is not trivial. Distributing web content creation among many authors can quickly lead to numerous challenges requiring expert attention. Governance strategies for library-authored web content are rarely addressed in the library literature. This article will review library use of CMS applications, outline challenges inherent in their use, and discuss the advantages of embracing content strategy as a framework for library-authored web content governance.

CONTENT MANAGEMENT SYSTEMS: A DEFINITION

Any conversation on this topic is complicated by the fact that there is both misunderstanding and disagreement regarding the definition of a content management system. In their survey of 149 libraries covering day-to-day website management, including staffing, infrastructure, and organizational structures, Bundza et al. observed "[w]hen reviewing the diverse systems mentioned, it is obvious that people defined CMSs very broadly."2 Connell surveyed over 600 libraries regarding their use of CMSs, defined as "website management tools through which the appearance and formatting is managed separately from content, so that authors can easily add content regardless of web authoring skills."3 A few respondents "indicated their CMS was Dreamweaver or Adobe Contribute" and another "self-identified as a non-CMS user but then listed Drupal as their web management tool."4 While the authors find the survey definition itself slightly ambiguous (likely in the service of clarity for survey respondents), we also believe that these responses may hint at an underlying and widespread lack of clarity regarding the technology itself.

An early report on potential library use of content management systems by Browning and Lowndes in 2001 opined that "a CMS is not really a product or a technology.
It is a catch-all term that covers a wide set of processes that will underpin the 'Next Generation' large-scale website."5 While technological developments over the last twenty years reveal some limitations to this early characterization, we believe it is fundamentally sound to define the CMS primarily through its functions. Fulton defined a CMS as "an application that enables the shared creation, editing, publishing, and management of digital content under strict administrative parameters."6 The authors concur with Barker's (2016) similarly task-based definition: "A content management system (CMS) is a software package that provides some level of automation for the tasks required to effectively manage content . . . usually server-based, multi-user . . . [and] interact[ing] with content stored in a repository."7 Browning and Lowndes defined the key tasks, or functions, of the CMS as encompassing four major categories: Authoring, Workflow, Storage, and Publishing.8

Barker (2016) also outlined "the big four" of content management as: enterprise content management (e.g., intranets), digital asset management (DAM), records management, and web content management (WCM), with WCM defined as "the management of content primarily intended for mass delivery via a website. WCM excels at separating content from presentation and publishing to multiple channels."9 For the purpose of clarity within the scope of this article, our discussion will primarily focus on content management systems as they are used for WCM, acknowledging that some principles may apply in varying degrees to other categories.
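These four functions, and the separation of content from presentation that WCM emphasizes, can be pictured with a deliberately tiny model. The sketch below is an illustrative toy, not any real product's schema; the workflow states and field names are assumptions made for this example.

# A toy model of the four CMS functions named above: Authoring, Workflow,
# Storage, and Publishing. States and field names are invented for illustration.
STATES = ["draft", "review", "published"]  # a minimal editorial workflow

repository = {}  # Storage: content lives here, apart from any presentation

def author(slug, title, body):
    """Authoring: subject matter experts supply structured content, not markup."""
    repository[slug] = {"title": title, "body": body, "state": "draft"}

def advance(slug):
    """Workflow: move an item one step toward publication."""
    item = repository[slug]
    position = STATES.index(item["state"])
    item["state"] = STATES[min(position + 1, len(STATES) - 1)]

def publish(slug, template="<h1>{title}</h1>\n<p>{body}</p>"):
    """Publishing: presentation is applied only when the content is rendered."""
    item = repository[slug]
    if item["state"] != "published":
        raise ValueError("item has not completed the editorial workflow")
    return template.format(title=item["title"], body=item["body"])

author("hours", "Library Hours", "Open 9 a.m. to 5 p.m. on weekdays.")
advance("hours")  # draft -> review
advance("hours")  # review -> published
print(publish("hours"))

Because the stored item carries no markup of its own, the same content could be rendered to a different channel simply by supplying a different template, which is the "publishing to multiple channels" that Barker attributes to WCM.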
THE CMS AND LIBRARY WEBSITES

The library literature reveals that, generally speaking, libraries began the transition from telnet and Gopher catalog interfaces to launching websites in the 1990s.10 Case studies of library websites from this period through the mid-2000s report library website pages increasing at a rapid rate, in some cases doubling or tripling on a yearly basis.11 A comment from Dallis and Ryner in regard to their own case study provides a sense of what might be considered typical during this period: "The management of the site was decentralized, and it grew to an estimated 8,000 pages over a period of five years."12 This proliferation, in turn, spurred focused interest in content management: "Web content management (WCM) as a branch of content management (CM) gained importance during the Web explosion in the mid-1990s."13

As early as 2001 there were published laments regarding the state of library websites:

Institutions are struggling to maintain their Web sites. Out of date material, poor control over design and navigation, a lack of authority control and the constriction of the Webmaster (or even Web Team) bottleneck will be familiar to many in the HE/FE [Higher Education / Further Education] sector. The pre-millennial Web has been characterized by highly manual approaches to maintenance; the successful and sustainable post-millennial Web will have significant automation. One vehicle by which this can be achieved is the CMS.14

Mach wrote:

The special concerns of Web maintenance have only multiplied with the increased size and complexity of many library Web sites. Not only does the single Webmaster model no longer work for most libraries, but the static HTML page is also in jeopardy. Many overworked Web librarians dream about the instant content updates possible with database-driven site or content management software. But while these technical solutions save staff time, they demand a fair amount of compromise.15

In 2010, Fulton noted, "at one time, all institutions [mentioned in her literature review] could effectively manage their sites outside of a CMS. However, changing standards combined with uncontrollable growth patterns persuaded them to take steps to prevent prolonged chaos."16

Changing Technology, Accessibility, and Literacy

Throughout the early 2000s, advances in consumer technology and in web development (e.g., CSS, HTML 5, Bootstrap), together with the need to comply with web-accessibility standards, resulted in a gradual move from static, hand-coded sites to other solutions. In 2005, Yu stated, "Today's content management solution is either a sophisticated software-based system or a database-driven application."17 After a detailed explanation of the cumbersome process of managing and updating a static site using Microsoft's FrontPage, Kane and Hegarty noted, "The opportunity to migrate the site to a content management system provided a golden opportunity . . . to bring the code into line with best practice."18 This transition also coincided with the growth of viable CMS options, particularly open-source tools. Black stated in 2011: "In the past few years, the field of open-source CMSs has increased, making it more likely that a library will find a viable CMS in the existing marketplace that will meet the organization's needs."19

In 2013, Comeaux and Schmetzke replicated an earlier study of library websites' accessibility, reviewing the homepages of library websites at 56 institutions offering ALA-accredited library and information science programs using Bobby, an automated web-accessibility checker. They found that CMS-powered library websites had a higher average of approved pages and a lower average of errors per page than those not powered by a CMS.20 In a 2017 study, Comeaux manually reviewed 37 academic library websites (members of the Association of Southeastern Research Libraries) and found that approximately three-quarters of CMS-driven sites were responsive, as compared to only one-quarter of sites without a CMS.21

Accessibility also manifests itself on the web in other ways. It is important to consider what we know about literacy and how people read online. The ability to write using plain language, in addition to other essential techniques for effective web writing, is an important aspect of accessibility that must be addressed in tandem with compliance with industry standards such as the Web Content Accessibility Guidelines (WCAG, https://www.w3.org/TR/WCAG20/). A summary of recent results for the Program for the International Assessment of Adult Competencies (PIAAC, https://nces.ed.gov/surveys/piaac/) survey, administered to US adults, reported "the majority of people may struggle to read through a 'simple' bullet-point list of rules . . . Nearly 62% of our population might not be able to read a graph or calculate the cost of shoes reliably."22 Blakiston succinctly observed: "on the web, scanning and skimming is the default."23 These trends have led governmental agencies and others to push increasingly for the adoption of "plain language."24 Skaggs stated, "Adopt plain language throughout your website. Plain language focuses on understanding and writing for the user's goals, making content easily scannable for the user, and writing in easy to understand sentences."25
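Readability formulas offer one rough, automatable check on whether web copy meets this standard. As a minimal sketch, the example below uses the third-party textstat package to compare the estimated US grade level of two announcements; the sample sentences are invented for illustration.

import textstat  # third-party readability package (pip install textstat)

# Two invented ways of announcing the same service interruption.
jargon = ("Patrons may experience a temporary interruption in the availability "
          "of circulation functionality due to scheduled system maintenance.")
plain = "You cannot check out books on Saturday morning while we update our systems."

for text in (jargon, plain):
    # Flesch-Kincaid converts sentence length and syllable counts
    # into an approximate U.S. school grade level.
    grade = textstat.flesch_kincaid_grade(text)
    print(round(grade, 1), "|", text)

Scores like these are only a proxy: they flag long sentences and long words but cannot judge whether the text actually serves the user's goal, so they complement rather than replace editorial review.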
LIBRARY WEBSITES AND THE CHALLENGES OF A DISTRIBUTED ENVIRONMENT

In 2011, Black pointed out one of the chief advantages to using a CMS: "CMSs support a distributed content model by separating the content from the presentation and giving the content provider an easy to use interface for adding content."26 Empowerment to focus on special expertise is noted as another benefit: "Chief among the efficiencies gained in using a CMS is the simple act of giving content authors the tools they need to create webpages and, most importantly, to do so without requiring the technical knowledge that used to be a part of webpage development. Designers can design, writers can write, editors can edit, and technology folks can manage the CMS and support its users."27 Browning and Lowndes agreed: "the concept of 'self-service authoring', whereby staff do not need special skills to edit the content for which they are responsible, can be regarded as a major step towards acceptance of the web as a medium for communication by non-web specialists. Providing this is the key advantage of a CMS."28

Librarians quickly found, however, that while the adoption of a CMS could empower more subject matter experts to participate in web content development and address technical issues such as responsive design and compliance with accessibility standards, the transition to a distributed model of content creation, oversight, and maintenance resulted in larger organizational ramifications. In 2006, approximately a decade following libraries' general move to the web and at an early stage for CMS adoption, Guenther cautioned: "A CMS is only a tool. Purchasing the very best CMS with every bell and whistle available will be a useless exercise without a solid plan to guide people and processes around its use."29 This same article went on to observe:

What makes using a CMS a tremendous advantage is exactly what makes it a potential nightmare. A CMS can make website development really easy; that's the good part. The bad part is, it makes webpage development really easy. One of the first issues you encounter is having to suddenly support a lot more content authors posting a lot more content. What once was an environment with limited activity can become a web development environment requiring considerably more oversight and technical support. Having more hands stirring the pot, so to speak, is wrought with all kinds of challenges.30

Untenable Growth

This model of distributed content creation, in which authorship is undertaken by numerous parties across the organization, generally results in a rapidly increasing quantity of content without necessarily guaranteeing consistent quality. A review of the literature reveals that, more commonly, a distributed model leads to a lack of consistency and focus in library web content's structure and execution. Some papers underscore the problematic quality of the highly individualized nature of the content: "the sheer mass of [libraries'] public web presence has reached the point where maintenance is a problem.
Often the webpages grew out of the personal interests of staff members, who have since left for other jobs, for other responsibilities, or simply retired."31 Blakiston stated, "For a number of years, librarians were motivated to create more web content. It was assumed that adding more content was a service for library users, and it was also seen as a way to improve their web skills and demonstrate their fluency with technology."32 Similarly, Chapman and Demsky described how the University of Michigan Library website grew "in an organic fashion" and noted, "[a]s in many places, the library's longstanding attitude toward the web was that more was more and that there was really no harm in letting the website develop however individual units and librarians thought best."33

Other papers described "authority and decision-making issues . . . differing opinions, turf struggles or a lack of communication . . . a shortage of time and motivation, general inertia, and resistance to change on the part of content authors."34 Iglesias noted, "Some librarians will always be more comfortable creating webpages from scratch, fearing a loss of control. The library as a whole must decide if the core responsibility of librarians is to create content or to create websites."35 Newton and Riggs stated, "This approach to content appears to be at odds with the role of librarians as leaders in information management practices and in supporting users to find, filter and critically evaluate information."36

In her article "Editorial and Technological Workflow Tools to Promote Website Quality," Morton-Owens discussed several studies measuring the severe impact of even small flaws (such as typographical errors) on users' judgments of a website's credibility, and, by extension, of the organization's credibility: "users' experience of a website leads them to attribute characteristics of competence and trustworthiness to the sponsoring organization."37 A. Paula Wilson, citing McConnell and Middleton, summarized the potential pitfalls inherent in a distributed model in which empowerment of content creators overshadows a unified vision, strategy, and approach to library-wide content management:

A decentralized model without the use of guidelines, standards or templates will eventually fail. The website may experience inconsistency in presentation and navigation, outdated and incorrect information, and gaps in content, and its webpages may be noncompliant in usability and accessibility design, so much so that users cannot find information.38

Inconsistent Voice and Lack of Organizational Unity

In addition to such compounding factors, and in contrast to journalistic practice, "libraries lack an editorial culture where content production and management is viewed as a collective rather than a personal effort."39 Morton-Owens noted: "The concept of editing is not yet consistently applied to websites unless the site represents an organization that already relies on editors (like a newspaper)—but it is gaining recognition as a best practice. If the website is the most readily available public face of an institution, it should receive editorial attention just as a brochure or fundraising letter would."40

In an environment with distributed authorship lacking a strong and consistent editorial culture, an organization's "voice" can quickly deteriorate. In web writing, voice is often defined as personality.
Blakiston stated: "The written content you provide plays an essential role in defining your library as an organization."41 Young went further, aligning voice with values, and arguing "[a]ny item of content that your library creates—an FAQ, a policy page, or a Facebook post—should be conveyed in the voice of your library and should communicate the values of your library. A combined expression of content and values defines the voice of your organization."42

In their 2006 article "CMS/CMS: Content Management System/Change Management Strategies," Goodwin et al. insightfully explore organizational challenges:

The effort of developing a unified web presence reveals where the organization itself lacks unity . . . Effective use of a content management system requires an organized and comprehensive consolidation of library resources, which emphasizes the need for a different organizational model and culture—one that promotes thinking about the library as a whole, sharing and collaboration.43

Fulton built on this concept: "Disunity in the library's web interface could signify disunity within the institution. On the other hand, a harmonious web presence suggests an institution that works well together."44 Young drew an inherent connection between a strongly unified organizational identity and a consistent and coherent "content strategy":

While libraries in general can draw on decades or centuries of cultural identity, each individual library may wish to convey a unique set of attributes that are appropriate for unique contexts. In this way, the element of "organizational values" inherent to content strategy signals a larger visioning project for determining the mission, vision, and values of your library. If these elements are already in place, then the work of content strategy can easily be adapted to fit existing values statements. Otherwise, content strategy and organizational values can develop as a joint initiative.45

LIBRARY WEBSITES AND CONTENT STRATEGY

Content strategy is an emerging discipline that brings together concepts from user experience design, information architecture, marketing, and technical writing. Content strategy encompasses activities related to creating, updating, and managing content that is intentional, useful, usable, well-structured, easily found, and easily understood, all while supporting an organization's strategic goals.46 Browning and Lowndes recognized as early as 2002 that strategy would be required as the variety of communication channels for libraries increased: "As local information systems integrate and become more pervasive, self-service authoring extends to the concept of 'write once, re-use anywhere', in which the web is treated as just another communication channel along with email, word processor files and presentations, etc."47

More than a decade later, in the introductory column to a 2013 themed issue of Information Outlook focused on content strategy, Hales stated:

Content strategy is a field for which information professionals and librarians are ideally suited, by virtue of both their education and temperament. Content, after all, is another word for information, and librarians and information professionals have been developing strategies for acquiring, managing, and sharing information for centuries.
Today, however, information is available to more people in more forms and through more channels than ever before, making content strategies a necessity for organizations rather than an afterthought.48

Jones and Farrington posited a common refrain for stating the importance of content strategy for librarianship: "Library website content must be viewed in much the same way as a physical collection," and the "library website, to apply S. R. Ranganathan's Fifth Law, is a growing organism and must be treated as such, especially with the complexity of web content."49 Claire Rasmussen drew connections between Ranganathan's Laws and content strategy in a blog post, pointing out that web content represents an additional set of responsibilities to be managed: "For hundreds of years, librarians have been the primary caretakers of the content corpus. But somebody needs to care for the content that never makes it into a library's collections, too."50

Blakiston and Mayden provided a helpful overview of content strategy and its application in libraries in their article "How We Hired a Content Strategist (And Why You Should Too)," finding many points of connection between skill sets essential to content strategy and those commonly possessed by librarians:

Librarians who have worked in public services may have the needed skills to ask good questions and find out what users need . . . professionals doing this kind of work came from backgrounds including communications, English and library science . . . desirable qualifications for . . . content strategist[s] . . . [include] strategic planning, web skills and project management.51

The circumstances that motivated them to propose and eventually hire a dedicated content strategist at the University of Arizona Libraries hearken back to the discussion earlier in this article regarding the increasing complexity of web librarianship: "the web product manager had independently coordinated all user research and content strategy work. The idea of both managing [a major web redesign project] and leading these other important areas was not realistic."52 Datig also pointed to increasing day-to-day responsibilities when advocating for the importance of content strategy for librarians with outreach and marketing responsibilities: "Lack of time, and a desire for that time to be well spent, is a huge concern for all librarians involved in library outreach and marketing . . . content strategy is an important and overlooked aspect of maintaining an effective and vital library outreach program."53 Hackett reflected on her role as web content strategist in a blog post after a recent website migration, noting: "moving forward with a content strategy . . .
will ensure that University Libraries' website is useful, usable, and discoverable—now and in the future."54 Yet, while the need for strategy is hard to dispute and librarians are theoretically well suited for web content strategy work, Blakiston and Mayden noted that explicit organizational support for content strategy in libraries remained limited: "Despite the growing popularity of content strategy as a discipline, only a handful of libraries had hired staff dedicated to this role at the time we proposed adding a content strategist to our staff."55

CONCLUSION

This article has traced the history of library adoption of web content management systems, the evolution of those systems, and the corresponding challenges as libraries have attempted to manage increasingly prolific content creation workflows across multiple, divergent CMS platforms.

What is the Library Website, Anyway?

While some variation would be expected from institution to institution, largely missing from the conversation is agreement on the purpose and aim of the library website writ large. This lack of definition, together with the technological and growth-related issues already discussed, has doubtless contributed to the confusion. After all, how would we know if we are "building it right" if we are not sure what we are meant to be building in the first place? In response to this ambiguity, the following definition was proposed:

The library website is an integrated representation of the library, providing continuously updated content and tools to engage with the academic mission of the college/university. It is constructed and maintained for the benefit of the user. Value is placed on consumption of content by the user rather than production of content by staff.56

Effective Management of Library Web Content Requires Dedicated Resources and Clear Authority

Inconsistent processes, disconnects between units, varying constituent goals, and vague or ineffective WCM governance structures are recurrent themes throughout the literature. As CMS applications have enabled broader access to web publishing, models of library web management have moved away from workflows structured around strictly technical tasks and permissions, and have instead migrated toward consensus-based, revolving committee structures. While greater involvement of subject matter experts has been noted as a positive earlier in this article, other challenges have also been acknowledged. McDonald, Haines, and Cohen stated: "In the context of web design and governance, consensus is a blocker to nimble, standards-based, user-focused action."57

Library Website as an Integrated Representation of the Organization

As previously discussed, web content governance issues often signal a lack of coordination, or even of unity, across an organization. Demsky stated, "We won't be fully successful until we see it as our website" (emphasis added).58 Internal documentation from the University of Michigan Library emphasized the value of "publicly represent[ing] ourselves as one library," and stated:

The more people are provided with clear communication that shows our offerings and unique items are part of the . . .
Library—rather than confuse users by making primary attribution to a sub-library, collection, or service point—the more people will recognize and understand the library's tremendous, overall value.59

Content Strategy and the Case for Library-Authored Content

No CMS can, by itself, address the fact that authoring, editing, and publishing quality content is both a situated expertise and a significant, ongoing demand on staff time.

Each platform, resource, or database brings its own visual style, terminology, tone and functionality. They are all parts of the library experience, which in turn is one part of the student, research or teaching experience. An understanding of content strategy is critical if staff are to see the connections between their own content and the rest of the content delivered by the organization.60

Libraries must proactively embrace and employ best practices in content strategy and in writing for the web to effectively address considerations of literacy and to present a consistent voice for the organization. These practices position libraries to fully realize the promise of content management systems through embracing an ethos of library-authored content. The authors define library-authored content as collectively owned and authored content that represents the organization as a whole. Library-authored content is:

• collaboratively planned, written, and edited with participation of both subject matter experts and domain experts (i.e., library staff with expertise in content strategy, web librarianship);

• carefully drafted to optimize for clarity within the context of the end-user;

• current, reviewed on a recurrent schedule, and regularly updated;

• consistent across the ecosystem of CMS applications and other platforms, including print materials and social media;

• compliant with industry standards (including but not limited to those related to accessibility), and with relevant internal brand standards; and

• centrally managed as the primary responsibility of one or more domain experts.

In order for libraries to meet the ever-increasing demands on our resources to produce timely, user-centered content that advances our missions for supporting teaching, research, and learning, a cultural shift toward a more collective, collaborative model of web content management and governance is necessary. Content strategy provides a flexible, adaptable framework for libraries to more efficiently and effectively leverage the power of multiple CMS platforms, to present engaging, on-point content, and to provide appropriate, scaffolded support for researchers at all levels — with a team of one or a team of many.

ENDNOTES

1 Deane Barker, "What Web Content Management Is (and Isn't)," in Web Content Management (O'Reilly Media, Inc., 2016), sec. What Web Content Management Is (and Isn't), https://learning.oreilly.com/library/view/web-content-management/9781491908112/.

2 Maira Bundza, Patricia Fravel Vander Meer, and Maria A. Perez-Stable, "Work of the Web Weavers: Web Development in Academic Libraries," Journal of Web Librarianship 3, no. 3 (September 15, 2009): 252, https://doi.org/10.1080/19322900903113233.

3 Ruth Sara Connell, "Content Management Systems: Trends in Academic Libraries," Information Technology and Libraries 32, no. 2 (June 10, 2013): 43, https://doi.org/10.6017/ital.v32i2.4632.

4 Connell, 46.
5 Paul Browning and Mike Lowndes, "JISC TechWatch Report: Content Management Systems," 2001, 3, http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.15.9100.

6 Camilla Fulton, "Library Perspectives on Web Content Management Systems," First Monday 15, no. 8 (July 15, 2010): sec. Review of literature, https://doi.org/10.5210/fm.v15i8.2631.

7 Barker, "What Web Content Management Is (and Isn't)," sec. What Is A Content Management System?

8 Browning and Lowndes, "JISC TechWatch Report," 4. Within a diagram outlining the major functions within the content life-cycle, they include the steps 'Review', 'Archive' and 'Dispose' - steps which, in the experience and observations of the authors, are often overlooked in general library web practice.

9 Barker, sec. Types of Content Management Systems.

10 Laura B. Cohen, Matthew M. Calsada, and Frederick J. Jeziorkowski, "ScratchPad: A Quality Management Tool for Library Web Sites," Content and Workflow Management for Library Websites: Case Studies, 2005, 102–26, https://doi.org/10.4018/978-1-59140-533-7.ch005; Diane Dallis and Doug Ryner, "Indiana University Bloomington Libraries Presents Organization to the Users and Power to the People: A Solution in Web Content Management," Content and Workflow Management for Library Websites: Case Studies, 2005, 80–101, https://doi.org/10.4018/978-1-59140-533-7.ch004; Stephen Sottong, "Database-Driven Web Pages Using Only JavaScript: Active Client Pages," Content and Workflow Management for Library Websites: Case Studies, 2005, 167–85, https://doi.org/10.4018/978-1-59140-533-7.ch008; Ray Bailey and Tom Kmetz, "Migrating a Library's Web Site to a Commercial CMS within a Campus‐wide Implementation," Library Hi Tech 24, no. 1 (January 1, 2006): 102–14, https://doi.org/10.1108/07378830610652130; Juan Carlos Rodriguez and Andy Osburn, "Developing a Distributed Web Publishing System at CSU Sacramento Library: A Case Study of Coordinated Decentralization," Content and Workflow Management for Library Websites: Case Studies, 2005, 51–79, https://doi.org/10.4018/978-1-59140-533-7.ch003; Barbara A. Blummer, "A Literature Review of Academic Library Web Page Studies," Journal of Web Librarianship 1, no. 1 (June 21, 2007): 45–64, https://doi.org/10.1300/J502v01n01_04; Robert Slater, "The Library Web Site: Collaborative Content Creation and Management," Journal of Web Librarianship 2, no. 4 (December 2008): 567–77, https://doi.org/10.1080/19322900802473928; Rebecca Blakiston, "Developing a Content Strategy for an Academic Library Website," Journal of Electronic Resources Librarianship 25, no. 3 (July 2013): 175–91, https://doi.org/10.1080/1941126X.2013.813295; Suzanne Chapman and Ian Demsky, "Taming the Kudzu: An Academic Library's Experience with Web Content Strategy," in Cutting-Edge Research in Developing the Library of the Future, ed. Bradford Lee Eden (Lanham, MD: Rowman & Littlefield, 2015).
11 Cohen, Calsada, and Jeziorkowski, "ScratchPad," 11; Rodriguez and Osburn, "Developing a Distributed Web Publishing System at CSU Sacramento Library," 76–77; Slater, "The Library Web Site," 57.

12 Dallis and Ryner, "Indiana University Bloomington Libraries Presents Organization to the Users and Power to the People," 82.

13 Holly Yu, ed., Content and Workflow Management for Library Web Sites: Case Studies (Hershey, PA: IGI Global, 2005), vi.

14 Browning and Lowndes, "JISC TechWatch Report," 5.

15 Michelle Mach, "Website Maintenance Workflow at a Medium-Sized University Library," Content and Workflow Management for Library Websites: Case Studies, 2005, 128, https://doi.org/10.4018/978-1-59140-533-7.ch006.

16 Fulton, "Library Perspectives on Web Content Management Systems," sec. Review of literature.

17 Yu, Content and Workflow Management for Library Web Sites, 2.

18 Nora Hegarty and David Kane, "New Web Site, New Opportunities: Enforcing Standards Compliance within a Content Management System," Library Hi Tech 25, no. 2 (June 19, 2007): 278, https://doi.org/10.1108/07378830710755027.

19 Elizabeth L. Black, "Selecting a Web Content Management System for an Academic Library Website," Information Technology and Libraries 30, no. 4 (December 1, 2011): 186, https://doi.org/10.6017/ital.v30i4.1869.

20 Dave Comeaux and Axel Schmetzke, "Accessibility of Academic Library Web Sites in North America: Current Status and Trends (2002‐2012)," Library Hi Tech 31, no. 1 (March 1, 2013): 27, https://doi.org/10.1108/07378831311303903.

21 David J. Comeaux, "Web Design Trends in Academic Libraries — A Longitudinal Study," Journal of Web Librarianship 11, no. 1 (January 2, 2017): 12, https://doi.org/10.1080/19322909.2016.1230031.

22 Meredith Larson, "Even If You're Trying, You're Probably Not Writing for the Average American," Federal Communicators Network (blog), October 9, 2018, https://fedcommnetwork.org/2018/10/09/even-if-youre-trying-youre-probably-not-writing-for-the-average-american/.

23 Rebecca Blakiston, Writing Effectively in Print and on the Web—A Practical Guide for Librarians (Rowman & Littlefield, 2017), 110.

24 National Adult Literacy Agency, "Plain English around the World," Simply Put, 2015, http://www.simplyput.ie/plain-english-around-the-world; Plain Language Action and Information Network, General Services Administration, United States Government, "Home | Plainlanguage.Gov," plainlanguage.gov, accessed February 1, 2019, https://www.plainlanguage.gov/.

25 Danielle Skaggs, "My Website Reads at an Eighth Grade Level: Why Plain Language Benefits Your Users (and You)," Journal of Library & Information Services in Distance Learning, 2016, 2, https://doi.org/10.1080/1533290X.2016.1226581.

26 Black, "Selecting a Web Content Management System for an Academic Library Website," 185.

27 Kim Guenther, "Content Management Systems as 'Silver Bullets,'" Online 30, no. 4 (2006): 55.

28 Paul Browning and Mike Lowndes, "Content Management Systems: Who Needs Them?," Ariadne, no. 30 (2002): sec.
The Issue, http://www.ariadne.ac.uk/issue30/techwatch.

29 Guenther, "Content Management Systems as 'Silver Bullets,'" 54.

30 Guenther, 56.

31 Michael Seadle, "Content Management Systems," Library Hi Tech 24, no. 1 (January 1, 2006): 5, https://doi.org/10.1108/07378830610652068.

32 Blakiston, "Developing a Content Strategy for an Academic Library Website," 176.

33 Chapman and Demsky, "Taming the Kudzu," 25.

34 Bundza, Vander Meer, and Perez-Stable, "Work of the Web Weavers," 256.

35 Edward Iglesias, "Winning the Peace: An Approach to Consensus Building When Implementing a Content Management System," in Content Management Systems in Libraries: Case Studies, ed. Bradford Lee Eden (Scarecrow Press, 2008), 177.

36 Kristy Newton and Michelle Riggs, "Everybody's Talking but Who's Listening? Hearing the User's Voice above the Noise, with Content Strategy and Design Thinking," VALA2016 Conference, January 1, 2016, 1, https://ro.uow.edu.au/asdpapers/536.

37 Emily G. Morton-Owens, "Editorial and Technological Workflow Tools to Promote Website Quality," Information Technology and Libraries 30, no. 3 (September 2, 2011): 91, https://doi.org/10.6017/ital.v30i3.1764.

38 A. Paula Wilson, Library Web Sites: Creating Online Collections and Services (Chicago: American Library Association, 2004), 4.

39 Chapman and Demsky, "Taming the Kudzu," 35.

40 Morton-Owens, "Editorial and Technological Workflow Tools to Promote Website Quality," 97.

41 Blakiston, Writing Effectively in Print and on the Web—A Practical Guide for Librarians, 6.

42 Scott W. H. Young, "Principle 1: Create Shareable Content," Library Technology Reports 52, no. 8 (November 18, 2016): 11–12.

43 Susan Goodwin et al., "CMS/CMS: Content Management System/Change Management Strategies," Library Hi Tech 24, no. 1 (January 2006): 55–56, https://doi.org/10.1108/07378830610652103.

44 Fulton, "Library Perspectives on Web Content Management Systems," sec. Discussion.

45 Young, "Principle 1," 12.

46 Kristina Halvorson, "Understanding the Discipline of Web Content Strategy," Bulletin of the American Society for Information Science & Technology 37, no. 2 (January 2011): 23–25, https://doi.org/10.1002/bult.2011.1720370208; Anne Haines, "Web Content Strategy: What Is It, and Why Should I Care?," InULA Notes 27, no. 2 (December 18, 2015): 11–15, https://scholarworks.iu.edu/journals/index.php/inula/article/view/20672/26734; U.S. Department of Health & Human Services, "Content Strategy Basics," usability.gov, January 24, 2016, https://www.usability.gov/what-and-why/content-strategy.html.

47 Browning and Lowndes, "Content Management Systems," sec. The Issue.

48 Stuart Hales, "Providing Content Strategy Services," Information Outlook (Online); Alexandria 17, no. 6 (December 2013): 8.

49 Kyle M. L. Jones and Polly-Alida Farrington, "WordPress as Library CMS," American Libraries; Chicago 42, no. 5/6 (June 2011): 34.
50 Claire Rasmussen, "Do It Like a Librarian: Ranganathan for Content Strategists," Brain Traffic Blog, June 13, 2012, https://web.archive.org/web/20120613173955/http://blog.braintraffic.com/2012/06/do-it-like-a-librarian-ranganathan-for-content-strategists/.

51 Rebecca Blakiston and Shoshana Mayden, "How We Hired a Content Strategist (And Why You Should Too)," Journal of Web Librarianship 9, no. 4 (2015): 196, https://doi.org/10.1080/19322909.2015.1105730.

52 Blakiston and Mayden, 197.

53 Ilka Datig, "Revitalizing Library Websites and Social Media with Content Strategy: Tools and Recommendations," Journal of Electronic Resources Librarianship 30, no. 2 (2018): 63–64, https://doi.org/10.1080/1941126X.2018.1465511.

54 Karen Hackett, "What Is a Web Content Strategist?," Library News, October 17, 2016, https://sites.psu.edu/librarynews/2016/10/17/whats-a-web-content-strategist/.

55 Blakiston and Mayden, "How We Hired a Content Strategist (And Why You Should Too)," 196.

56 Courtney McDonald, Anne Haines, and Rachael Cohen, "From Consensus to Expertise: Rethinking Library Web Governance," ACRL TechConnect (blog), November 2, 2015, https://acrl.ala.org/techconnect/post/from-consensus-to-expertise-rethinking-library-web-governance/.

57 McDonald, Haines, and Cohen.

58 Ian Demsky, "Lessons from My First Year as Web Content Strategist," Library Tech Talk (blog), August 7, 2014, https://www.lib.umich.edu/blogs/library-tech-talk/lessons-my-first-year-web-content-strategist.

59 University of Michigan Library, "Editorial Style and Best Practices," January 23, 2019, sec. Library Branding.

60 Newton and Riggs, "Everybody's Talking but Who's Listening?," 12.
11018 ---- Weathering the Twitter Storm: Early Uses of Social Media as a Disaster Response Tool for Public Libraries During Hurricane Sandy

Sharon Han

Sharon Han (shrnhan@gmail.com) is Candidate for Master of Science in Library and Information Science, School of Information Sciences, University of Illinois.

ABSTRACT

After a disaster, news reports and online platforms often document the swift response of public libraries supporting their communities. Despite current scholarship focused on social media in disasters, early uses of social media as an extension of library services require further scrutiny. The Federal Emergency Management Agency (FEMA) recognized Hurricane Sandy as one of the earliest U.S. disasters in which first responders used social media. This study specifically examines early uses of Twitter by selected public libraries as an information tool during Sandy's aftermath. Results can inform uses of social media in library response to future disasters.

INTRODUCTION

In the Digital Age of instantaneous communication, when disasters hit, they hit us all. The fall and winter of 2017-18 brought a literal and figurative deluge to our screens with the arrival of hurricanes Harvey, Irma, and Maria to the United States. Within moments of each event, websites and news feeds filled with images of destruction and cries for help. The use of social media to bring awareness to victims' situations through hashtags and directly tagging first responders underscores the importance of this technological tool in the twenty-first century. In fact, the ubiquity of social media in documenting Hurricane Harvey has led some to believe that it should be considered the first "social media storm."1 However, many of the most popular social media platforms have existed since the mid-2000s and had already been used to communicate disaster-related information well before Harvey reached the United States' shores. Some of social media's earliest adopters were even public libraries, which had the resources and means to use this information technology as a method of connecting with their communities.

Why should social media matter to public libraries in times of disaster? As a physical manifestation of information access, the public library maintains a relationship with its community that varies across regions, time, and context. Currently, the public library as an entity is in an interventionist period, according to Jaeger's article "Libraries, Policy, and Politics in a Democracy: Four Historical Epochs," where its roles and responsibilities are heavily influenced by outside factors, especially the federal government.2 From tax forms to permits to insurance claims, the government encourages people to use the public library to find and use information necessary to navigate American society. Public demand for accessing government and other resources is especially apparent after natural disasters, which, due to their unpredictable nature, can heighten a community's uncertainty and its need for credible and reliable information.
Public libraries can meet this information need by using social media as one strategy to assess and provide resources in real time.

When Hurricane Sandy landed on New Jersey's shore on October 29, 2012, it prompted a new era for societal response to emergencies and community needs. Due to the hurricane's trajectory into densely populated areas of the American northeast and subsequent widespread flooding, Hurricane Sandy was the deadliest storm of 2012.3 With initial estimated recovery costs of up to $50 billion, the degree of damage to buildings and infrastructure, and the endangerment of people's safety, made swift and coordinated communication paramount in response efforts. Thus, the aftermath of Hurricane Sandy resulted in federal agencies using social media for the first time in coordinating and implementing disaster response.4 As community-based service providers, many public libraries responded to the hurricane by sharing available resources and services with patrons. However, few studies explicitly examine the use of social media as a library tool for supporting communities. This paper explores the role of social media and its impact on public library services in response to Hurricane Sandy as a measure of libraries using digital mediums to support their communities. This study analyzes Twitter posts from three public libraries impacted by the hurricane and compares their content to reported library services after the storm. The analysis is then used to discuss the use of social media as a library tool and recommendations for social media implementation in future disaster response.

BACKGROUND INFORMATION

Library Response to Disasters
According to the Institute of Museum and Library Services' Public Library Data from 2009 to 2011, over half of all public libraries are located within declared "disaster counties."5 This figure makes disaster response an important topic within public librarianship discourse. In addition to assessing damages to buildings and collections, libraries must also meet the needs of their communities. Information needs are heightened after a disaster, as the destruction results in information uncertainty and loss of important resources such as power and telecommunication services.6 Consistent and increased use of public libraries is not unusual post-disaster. For example, despite 35 percent of Louisiana libraries being closed after Hurricane Katrina in 2005, a study found that overall library visitor counts decreased by only 1 percent.7 Frequent use of library resources after a disaster can be attributed to the library's free and low-cost resources, as well as the institution's reputation as a source for reliable and credible information.8 Libraries also extend their resources and services beyond their walls. Library bookmobiles and delivery programs provide services to those who are unable to physically visit the library.
Some libraries use their skills in information management and communication to assist local disaster preparedness groups and response teams.9 In 2011, the Federal Emergency Management Agency (FEMA) declared public libraries eligible for temporary relocation funds in the event of an emergency, a distinction once limited to first responders, hospitals, utilities, and schools.10 Former Executive Director of the American Library Association's (ALA) Washington Office, Emily Sheketoff, stated that such a distinction recognizes libraries as "essential community organizations."11 In the context of Jaeger's interventionist period, it benefits libraries and government agencies alike to have libraries open to serve communities after a disaster.

In the aftermath of Hurricane Sandy, communities suffered from varying degrees of damage, such as flooding, power outages, debris, and downed trees.12 The impact of the storm drove many community members to their local libraries to seek shelter, charge their electronics, file insurance claims and other e-government forms, drop off or pick up donations, and obtain entertainment.13 Despite the many stories of libraries serving disaster victims and working with first responders, such actions have yet to be translated into widespread library policy and procedures. ALA provides a "Disaster Preparedness and Recovery" resources webpage, but it primarily focuses on addressing material and structural needs after a disaster, such as mitigating water damage to collections.14 Other studies also note that a majority of library disaster response literature remains focused on protecting materials.15 Such a limited perspective is highlighted in a national survey in which the majority of librarian respondents believed protecting library materials and performing daily services were their primary goals in the event of an emergency.16 As a result, library communication with the community and local organizations remains a relatively unexplored subject in the context of disaster response.17 While trade journals and websites publish stories of individual libraries serving their communities, formal studies and research are comparatively scarce. With the widespread use of technology and the Internet, one method of communication stands out as an important tool for library outreach and study: social media.

Disaster Response through Social Media
As information providers and advocates of communication technology, libraries should use social media to connect with their communities. Although libraries were early adopters of social media prior to Hurricane Sandy, their use of these tools tends to focus on one-way information sharing instead of a dialogue with their community.18 Social media in the context of disaster response may upend traditional library social media use, which is why this topic needs further examination.
Social media coupled with mobile technology has created a society in which information sharing and communication are constant and instantaneous.19 Since social networking is a relatively new form of media, formal studies on its impact on social behaviors have only come about in the last decade.20 Within this young body of literature, however, social media use in disaster response and recovery is a popular topic for researchers, organizations, and federal agencies.21 Alexander claims that social media provides the following benefits during disaster response:

• Provides an outlet to listen and share thoughts, emotions, and opinions;
• Monitors a situation;
• Integrates social media into emergency plans;
• Crowdsources information;
• Creates social cohesion and promotes therapeutic initiatives;
• Furthers causes; and
• Creates research data.22

Such a comprehensive list is beneficial to this study because it provides a framework through which library social media use can be examined. These benefits stem from the sharing of information with people or entities, which is a large component of library disaster response, as discussed in the previous section. Using Alexander's list as a reference, the three main benefits this study examines in the context of library disaster response are:

1. Monitors a situation. A survey of library patrons impacted by the 2015 South Carolina floods revealed that all respondents used social media to learn about the flooding and impacted areas.23 People now frequently use social media to get updates on situations, whether they were directly or indirectly impacted by the natural disaster itself. Disaster response groups also monitor social media feeds to assess and allocate resources to those in need.24 Libraries can use social media feeds to assess resource and service use, plan outreach opportunities, and even inform the public about their own status during the disaster.

2. Integrates social media into emergency plans. Social media is a low-cost and effective way to coordinate disaster response between organizations and people. Much like bookmobiles, social media serves as outreach through which librarians can improve service accessibility. Librarians can use platforms like Twitter and Facebook to help coordinate their activities and services alongside other responders in the community. Having an established plan of action in which the library's role and responsibilities are clearly outlined will result in more effective service and more efficient response to community needs.25

3. Creates social cohesion and promotes therapeutic initiatives. In alignment with the library's mission of creating and serving communities, social media can act as an extra method of fostering connections in times of need. Disaster victims can take advantage of social media's speed and ubiquity to check in with family, tell them they are safe, and participate in relief efforts.26 Social cohesion through platforms such as Twitter can also create participatory discourse between people and organizations.
For example, then-FEMA administrator Craig Fugate's recommendation to read to children during the hurricane prompted the hashtag #StormReads to trend on Twitter, as many accounts—libraries included—shared their recommended titles.27 Library use of social media can also address growing concerns about rumors and misinformation spread during disasters.28 As providers of reliable and accurate information, libraries help establish source credibility and push more accurate resources to misinformed and unaware community members.

Although there is a substantial amount of research focused separately on libraries responding to disasters and on social media use during disasters, there is a gap in library science literature examining social media as a method of library disaster response. Interestingly, formal studies that mention library disaster response note an explicit absence of social media as a form of emergency communication.29 Despite the current dearth, library social media studies can develop quickly thanks to the abundant amount of data available on social media platforms. As libraries continue to respond to disasters, they will require more deliberate and planned use of social media as a communication tool. Such a need demands a closer examination of how libraries have historically used social media during disasters.

CASE STUDIES: THREE PUBLIC LIBRARIES AND TWITTER
This study examines the social media feeds of three public libraries during and immediately after Hurricane Sandy landed on the northeastern coast as a measure of social media's impact on communication and information sharing among libraries, patrons, and first responders. Due to its frequent use for sharing up-to-date information, Twitter was the social media platform selected for study.30 The public library systems were selected for this analysis based on their varying characteristics and the available literature describing their actions after the hurricane. New York Public Library (NYPL, @NYPL), Princeton Public Library (PPL, @PrincetonPL), and Queens Library (QL, @QueensLibrary) have Twitter accounts that were at least two years old by October 2012. All accounts were active during the time period of interest, although the libraries were closed when Hurricane Sandy landed. NYPL and QL were closed an additional two days due to damages to several branch libraries.31

These library systems serve varied communities. NYPL and QL are urban libraries located in New York City, with 91 and 62 branches respectively, and PPL is a single-branch library located in downtown Princeton, New Jersey. The larger library systems reported flooding and power outages at several branches from the hurricane, while PPL sustained no structural or internal damages.32 However, all library systems were in communities where large numbers of households lost electricity and Internet access and sustained damages from fallen trees and flooding.33 The library systems were mentioned in news reports for services to library patrons affected by the storm, including providing charging stations for electronics, helping people fill out FEMA insurance forms, running programs for children and adults, and having public computers and wireless connections to access the Internet.34 The libraries' coupled use of Twitter and active provision of disaster response services make them ideal candidates for examining the correlation between the two activities.
METHODOLOGY
This study used a filtered search on Twitter to identify tweets from each library's feed within the time period of interest. Within searches, each tweet was recorded and categorized based on content and message format. A single tweet could have more than one category. Common content subcategories were identified to improve analysis. Defined categories are as follows:

• Hurricane Information: Information on the hurricane's status and impact from news and government agencies.
• Library Policies: Information on library policies.
• Library Policies, Renewals/Fines: Information on renewals and fines during the studied time period.
• Library Status: Information on library branch closures.
• Library Event/Service Related to Hurricane: Event or service specifically planned in response to the hurricane.
• Library Event/Service NOT Related to Hurricane: Regular library programming; included event/service cancellations as an indirect/direct result of the hurricane.
• Non-Library Event/Service Related to Hurricane: Information on non-library-sponsored events and services provided in response to the hurricane.
• Replies: A publicly posted message from the library to another Twitter user.
• Social Interactions: Non-informative and informative tweets aimed at conversing with people or organizations in a social manner.

Selected categories were then associated with a corresponding benefit from three of Alexander's defined benefits (table 1).35 After categorizing, the collected data was organized for analysis and comparison.

Table 1. Categories organized by social media benefits.36

Monitoring a situation:
§ Hurricane information
§ Library event/service related to hurricane
§ Replies

Integrating social media into emergency plans:
§ Library policies
§ Library status
§ Non-library event/service related to hurricane

Creating social cohesion and promoting therapeutic initiatives:
§ Library event/service related to hurricane
§ Library event/service NOT related to hurricane
§ Non-library event/service related to hurricane
§ Replies
§ Social interactions
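To make the multi-label coding procedure concrete, the following sketch shows one way the tallies behind figures 1 and 2 and table 1 could be computed. It is illustrative only: the category keys, the tally helper, and the example search query are hypothetical names introduced here, not the author's published instrument; only the category-to-benefit mapping follows table 1.

from collections import Counter

# Content categories from the coding scheme; one tweet may carry several.
# The renewals/fines subcategory is folded into "policies" for this mapping.
BENEFITS = {
    "monitoring a situation": {
        "hurricane_info", "library_event_hurricane", "replies"},
    "integrating into emergency plans": {
        "policies", "status", "non_library_event_hurricane"},
    "social cohesion and therapeutic initiatives": {
        "library_event_hurricane", "library_event_other",
        "non_library_event_hurricane", "replies", "social_interactions"},
}

def tally(coded_tweets):
    """Tally per-day, per-category, and per-benefit counts from coded tweets.

    coded_tweets is a list of (date, categories) pairs, where categories is a
    set of category keys assigned by a coder reviewing a filtered search such
    as: from:PrincetonPL since:2012-10-29 until:2012-11-03
    """
    per_day = Counter(day for day, _ in coded_tweets)                    # figure 1
    per_category = Counter(c for _, cats in coded_tweets for c in cats)  # figure 2
    per_benefit = Counter(
        name
        for _, cats in coded_tweets
        for name, members in BENEFITS.items()
        if cats & members  # a tweet counts toward every benefit it touches
    )
    return per_day, per_category, per_benefit

# Example: a PPL tweet advertising charging stations while chatting with a
# patron counts toward both the monitoring and social-cohesion benefits.
counts = tally([("2012-10-30", {"library_event_hurricane", "social_interactions"})])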
RESULTS
Beginning October 29, each library used Twitter regularly to provide information or to communicate with library followers. Tweet frequencies were counted and compared over the five-day period across libraries (figure 1). While NYPL and QL averaged almost 11 tweets per day, PPL had nearly double their numbers, at about 18 tweets per day. NYPL and QL had generally increasing trendlines in tweets, while PPL's Twitter use fluctuated greatly. NYPL's and QL's low tweet counts during the studied time frame may be attributed to library-wide closures, although only QL's tweet count increased significantly upon reopening.

Figure 1. Number of tweets per day by library.

Content analysis illustrated variations in Twitter use across all three libraries (figure 2). NYPL tweeted the most about its library status and renewal/fine policy, with 21 and 17 tweets, respectively. PPL focused more on advertising library events and services such as electrical outlets, heat, Internet, and entertainment. PPL also used Twitter heavily for social interactions (35 percent of its 112 tweets), including asking questions, recommending books, thanking concerned patrons, and even apologizing for retweeting too many news articles about the hurricane. QL's Twitter use was more of a mix, often posting about library status and socially interacting with other Twitter users.

Figure 2. Twitter content by library.

Each library also differed in the content it tweeted least. NYPL had the fewest tweets about the hurricane, non-library services and events related to the hurricane, other library policies, and social interactions. PPL also had few tweets with information about the hurricane and rarely tweeted about fines and renewals. QL had no tweets about the hurricane, nor did it tweet about any library events or programs that were unrelated to its disaster response.

DISCUSSION
The data collected were analyzed to determine whether each library realized the three identified benefits of social media that directly relate to the library's mission of information access and community building: monitoring a situation, integrating social media into emergency plans, and creating social cohesion and promoting therapeutic initiatives. Each library's consistent responses to Twitter users, status updates, and information about library services illustrate that they all monitored their communities' situations and responded accordingly through services and programs, as evidenced in news reports. Libraries also used Twitter to engage with others and create a social network of library patrons and local institutions. Based on the lack of information about the storm itself and the few recommendations of resources from non-library disaster response groups, it is not apparent that the libraries integrated social media into their emergency policies and procedures. This also resulted in a dissonance between library action and online communication. One notable example: many news reports described librarians aiding patrons with finding and filling out FEMA insurance forms, but only one of the 196 tweets analyzed in this study advertised FEMA assistance at the library.37 PPL tweeted several posts illustrating library use by affected patrons, but also emphasized it was at capacity due to large visitor numbers and shortages in charging stations and Internet bandwidth. PPL did not offer alternatives on Twitter to meet patron information needs.

The lack of a coordinated effort can perhaps be explained in two parts. First, as no two disasters are alike, library response is often a direct reaction to the event and the damages to the institution and community. A busy library would logically place social media communication and coordination as a lower priority than other immediate, tangible needs. Second, librarians may not make a concerted effort to use social media if they are trained to prioritize protecting library collections and conducting regular services.38 While digital and outreach services such as bookmobiles have long been common components of libraries, there is still a noticeable gap in libraries extending these same services using online tools. The libraries in this study used social media as a part of their disaster response, but the lack of planning resulted in each library's Twitter feed acting more as a "triage center," providing basic assistance as the need arose, rather than as an extension of in-house services.

TAKEAWAYS AND FURTHER RESEARCH
While these libraries provided much-needed services in the aftermath of Hurricane Sandy, their implementation of social media as a communication and information-sharing tool illustrates opportunities to develop more coordinated efforts.
As library presence on and use of social media continues to grow, it should be considered a necessary component of library disaster response and collaboration with other government agencies and first responders. While libraries are qualified for FEMA funding, it is uncertain whether local first-responder groups are aware of the services and benefits libraries provide post-disaster. As of 2013, the U.S. Department of Homeland Security's Virtual Social Media Working Group did not include any library organizations, which leaves libraries out of crucial conversations in designing comprehensive disaster response plans.39 In an effort to participate in productive discourse, librarians also need to improve their social media use to better align with their practice when serving distressed communities. While the exact reasons for librarians' lack of effective social media use in disaster response remain speculative, other research has shown that training opportunities for social media use in libraries remain scarce and not very effective.40 Since Hurricane Sandy, social media has only grown as a powerful tool for people and communities, rendering it an essential skill for librarians today. This should motivate librarians, library associations, and other professional groups to consider developing effective training and workshops geared towards intentional use of social media.

Despite its power, social media should be seen as a complementary tool to enhance information services for community members. It will optimize the library's reach, but it cannot completely replace current methods of outreach, nor should it. This is especially important when considering who benefits the most from libraries, many of whom do not necessarily have consistent access to social media.41 Social media use varies across age, socioeconomic status, digital access, and education levels, making it important for librarians to consider whose information needs are and are not being met online. Considering such limitations, learning impactful social media skills and creating a support network amongst disaster response groups will enable libraries to effectively develop outreach strategies and improve disaster response services.

The discussion and takeaways highlight the necessity for further research on social media use in library disaster response. As the history of library development and service informs the direction of libraries today, so too should historic uses of social media as a library service tool guide future work. Continuing research may include case studies of public library response to recent disasters, which would provide better insight into the developing use of social media. The identified patterns and strengths can be used to guide future work in incorporating effective social media policies and protocols in library disaster plans. Considering social media usage by first responders and federal agencies, future research should also include a closer examination of relationships between public libraries, first responders, and disaster information providers in improving coordinated response efforts.

CONCLUSION
When disaster strikes, many communities exhibit a great need for resources and information. Despite libraries providing much-needed services and resources to community members after natural disasters, their use of social media platforms as a tool remains overlooked.
This study examines historical use of social media as a communication and service tool between libraries, community members, and disaster response groups in the aftermath of Hurricane Sandy. The effectiveness of social media use was evaluated using Alexander's review of social media benefits and compared with descriptions of post-Sandy library resources and services described in the literature. The study found social media use to be highly variable based on content and correlations with reported in-house library services. There was no sign of a coordinated effort with other disaster response groups, and the primary objective of the libraries' Twitter accounts was connecting with patrons and other organizations through social interactions. Improvements to social media use could be achieved through intentional coordination with first responders, directed training, and evaluating social media's strengths and limitations in disaster response. If libraries wish to continue providing pertinent information, they need to adapt to the communication methods used by their communities. With social media's strong presence in society, suburban and urban libraries such as the ones examined in this study should improve their use of social media as an effective information-sharing and communication tool. Continuing to examine and assess uses of social media as a disaster response tool can help shape policies and procedures that will enable libraries to better serve their communities.

REFERENCES
1 Maya Rhodan, "'Please Send Help.' Hurricane Harvey Victims Turn to Twitter and Facebook," Time, Aug. 30, 2017, http://time.com/4921961/hurricane-harvey-twitter-facebook-social-media/.
2 Paul T. Jaeger et al., "Libraries, Policy, and Politics in a Democracy: Four Historical Epochs," Library Quarterly 83, no. 2 (Apr. 2013): 166–81, https://doi.org/10.1086/669559.
3 Virtual Social Media Working Group and DHS First Responders Group, "Lessons Learned: Social Media and Hurricane Sandy," U.S. Department of Homeland Security, June 2013, https://www.dhs.gov/sites/default/files/publications/Lessons%20Learned%20Social%20Media%20and%20Hurricane%20Sandy.pdf.
4 Virtual Social Media Working Group and DHS First Responders Group.
5 Bradley W. Bishop and Shari R. Veil, "Public Libraries as Post-Crisis Information Hubs," Public Library Quarterly 32 (2013): 33–45, https://doi.org/10.1080/01616846.2013.760390.
6 Bishop and Veil.
7 Bishop and Veil.
8 Bishop and Veil; Jingjing Liu et al., "Social Media as a Tool Connecting with Library Users in Disasters: A Case Study of the 2015 Catastrophic Flooding in South Carolina," Science & Technology Libraries 36, no. 3 (July 2017): 274–87, https://doi.org/10.1080/0194262X.2017.1358128.
9 Charles R. McClure et al., "Hurricane Preparedness and Response for Florida Public Libraries: Best Practices and Strategies," Florida Libraries 52, no. 1 (2009): 4–7.
10 Michael Kelley, "ALA Midwinter 2011: FEMA Recognizes Libraries as Essential Community Organizations," School Library Journal, Jan. 11, 2011, http://lj.libraryjournal.com/2011/01/industry-news/ala-midwinter-2011-fema-recognizes-libraries-as-essential-community-organizations/.
11 Kelley.
12 Maureen M. Garvey, "Serving A Public Library Community After A Natural Disaster: Recovering From 'Hurricane Sandy,'" Journal of the Leadership & Management Section 11, no. 2 (Spring 2015): 22–31; Cathleen A. Merenda, "How the Westbury Library Helped the Community after Hurricane Sandy," Journal of the Leadership & Management Section 11, no. 2 (Spring 2015): 32–34.
13 Sarah Bayliss, Shelley Vale, and Mahnaz Dar, "Libraries Respond to Hurricane Sandy, Offering Refuge, WiFi, and Services to Needy Communities," School Library Journal, Nov. 1, 2012, http://www.slj.com/2012/11/public-libraries/libraries-respond-to-hurricane-sandy-offering-refuge-wifi-and-services-to-needy-communities/; Joel Rose, "For Disaster Preparedness: Pack a Library Card?," NPR, Aug. 12, 2013, https://www.npr.org/2013/08/12/210541233/for-disasters-pack-a-first-aid-kit-bottled-water-and-a-library-card.
14 "Disaster Preparedness and Recovery," ALA Advocacy, Legislation & Issues, 2017, http://www.ala.org/advocacy/govinfo/disasterpreparedness.
15 Bishop and Veil, "Public Libraries as Post-Crisis Information Hubs."
16 Lisl Zach, "What Do I Do in an Emergency? The Role of Public Libraries in Providing Information During Times of Crisis," Science & Technology Libraries 30, no. 4 (Sept. 2011): 404–13, https://doi.org/10.1080/0194262X.2011.626341.
17 Bishop and Veil, "Public Libraries as Post-Crisis Information Hubs."
18 Liu et al., "Social Media as a Tool Connecting with Library Users in Disasters: A Case Study of the 2015 Catastrophic Flooding in South Carolina"; Zach, "What Do I Do in an Emergency?"
19 Virtual Social Media Working Group and DHS First Responders Group, "Lessons Learned."
20 David Alexander, "Social Media in Disaster Risk Reduction and Crisis Management," Science & Engineering Ethics 20, no. 3 (Sept. 2014): 717–33, https://doi.org/10.1007/s11948-013-9502-z.
21 Alexander; Liu et al., "Social Media as a Tool"; Virtual Social Media Working Group and DHS First Responders Group, "Lessons Learned."
22 Alexander, "Social Media in Disaster Risk Reduction."
23 Liu et al., "Social Media as a Tool."
24 Alexander, "Social Media in Disaster Risk Reduction."
25 Bishop and Veil, "Public Libraries as Post-Crisis Information Hubs."
26 Alexander, "Social Media in Disaster Risk Reduction."
27 Bayliss, Vale, and Dar, "Libraries Respond."
28 Liu et al., "Social Media as a Tool."
29 Liu et al.; Zach, "What Do I Do in an Emergency?"
30 Deborah D. Halsted, Library as Safe Haven: Disaster Planning, Response, and Recovery: A How-to-Do-It Manual for Librarians, First Edition (Chicago: American Library Association, 2014).
31 George M. Eberhart, "Libraries Weather the Superstorm," American Libraries Magazine, Nov. 4, 2012, https://americanlibrariesmagazine.org/2012/11/04/libraries-weather-the-superstorm/; Rose, "For Disaster Preparedness."
32 Bayliss, Vale, and Dar, "Libraries Respond"; Eberhart, "Libraries Weather the Superstorm"; Rose, "For Disaster Preparedness."
33 Bayliss, Vale, and Dar, "Libraries Respond."
34 Bayliss, Vale, and Dar; Eberhart, "Libraries Weather the Superstorm"; Lisa Epps and Kelvin Watson, "EMERGENCY! How Queens Library Came to Patrons' Rescue After Hurricane Sandy," Computers in Libraries 34, no. 10 (Dec. 2014): 3–30; Rose, "For Disaster Preparedness."
35 Alexander, "Social Media in Disaster Risk Reduction."
36 Benefits listed and defined in Alexander, "Social Media in Disaster Risk Reduction and Crisis Management," 717–33.
37 Eberhart, "Libraries Weather the Superstorm"; Rose, "For Disaster Preparedness."
38 Zach, "What Do I Do in an Emergency?"
39 Virtual Social Media Working Group and DHS First Responders Group, "Lessons Learned."
40 Rachel N. Simons, Melissa G. Ocepek, and Lecia J. Barker, "Teaching Tweeting: Recommendations for Teaching Social Media Work in LIS and MSIS Programs," Journal of Education for Library and Information Science 57, no. 1 (Dec. 1, 2016): 21–30, https://doi.org/10.3138/jelis.57.1.21.
41 Alexander, "Social Media in Disaster Risk Reduction."

11075 ---- Challenges and Strategies for Educational Virtual Reality: Results of an Expert-led Forum on 3D/VR Technologies across Academic Institutions

Matt Cook, Zack Lischer-Katz, Nathan Hall, Juliet Hardesty, Jennifer Johnson, Robert McDonald, and Tara Carlisle

Matt Cook (matt_cook@harvard.edu) is Digital Scholarship Program Manager, Harvard Library. Zack Lischer-Katz (zlkatz@ou.edu) is Postdoctoral Research Fellow, University of Oklahoma Libraries. Nathan Hall (nfhall@vt.edu) is Director, Digital Imaging and Preservation Services, University Libraries, Virginia Tech. Juliet Hardesty (jlhardes@iu.edu) is Metadata Analyst, Indiana University Libraries. Jennifer Johnson (jennajoh@iupui.edu) is Digital Scholarship Outreach Librarian, University Library, IUPUI. Robert McDonald (rhmcdonald@colorado.edu) is Dean, University Libraries, University of Colorado Boulder. Tara Carlisle (tara.carlisle@ou.edu) is Head, Digital Scholarship Lab, University of Oklahoma Libraries.

ABSTRACT
Virtual reality (VR) is a rich visualization and analytic platform that furthers the library's mission of providing access to all forms of information and supporting pedagogy and scholarship across disciplines. Academic libraries are increasingly adopting VR technology for a variety of research and teaching purposes, which include providing enhanced access to digital collections, offering new research tools, and constructing new immersive learning environments for students. This trend suggests that positive technological innovation is flourishing in libraries, but there remains a lack of clear guidance in the library community on how to introduce these technologies in effective ways and make them sustainable within different types of institutions. In June 2018, the University of Oklahoma hosted the second of three forums on the use of 3D and VR for visualization and analysis in academic libraries, as part of the project Developing Library Strategy for 3D and Virtual Reality Collection Development and Reuse (LIB3DVR), funded by a grant from the Institute of Museum and Library Services. This qualitative study invited experts from a range of disciplines and sectors to identify common challenges in the visualization and analysis of 3D data, and the management of VR programs, for the purpose of developing a national library strategy.
INTRODUCTION
Virtual reality, 3D data, and other spatial technologies are being adopted in libraries as innovative and immersive tools for enhancing research and teaching.1 VR provides a highly realistic, interactive visualization platform for engaging with 3D data, such as models produced from cultural heritage sites or medical imaging data, presenting many potential applications for a range of academic fields.2 While these technologies are not new, they have become increasingly affordable, enabling widespread adoption beyond their traditional niches. For example, VR equipment has been studied in computer science departments for decades, but costs restricted use to large research labs.3 Consumer-oriented VR headsets emerged in the late 1980s, but at a high price point and with many technical challenges, such as high latency in interactive graphics processing, they were ultimately unsuccessful in the consumer marketplace. Cheaper and technically superior mass-market VR headsets became widely available in 2016 with the release of the Oculus Rift and HTC Vive systems. VR is finally within the budgetary and technical means of libraries of various sizes to adopt and deploy.

At the same time, educators are developing new methods of crafting VR content. Decreasing costs of equipment associated with 3D data creation techniques, such as photogrammetry, laser scanning, and medical imaging (e.g., CT scanning), have encouraged their adoption outside of specialized fields. This content is increasingly used within immersive learning environments. Spatial data creation and visualization tools together can comprise a 3D/VR ecosystem that enables a range of research activities, including 3D scanning of cultural heritage artifacts, drone scanning of landscapes, interactive mapping, and data visualization, all of which can be viewed and analyzed in immersive VR.4

There is already evidence to suggest that VR has many academic benefits. While VR has not yet been proven to lead to better learning outcomes when compared with other educational media (indeed, for learning certain types of facts, videos and lectures are often still more effective), it does offer other types of benefits. VR has been shown to lead to changes in student attitudes, such as increasing student engagement or self-efficacy.5 Furthermore, research has shown the positive impact that 3D and VR visualization can have on analytic tasks for researchers, which indicates the benefit of having this type of equipment and support available through academic libraries.6

Despite decreasing costs and a growing understanding of the potential benefits of the technology, there is still concern in the library field about the cost and sustainability issues associated with bringing these types of technologies into the library. There are currently no standards or best practices in place for adopting 3D/VR, so institutions often have to develop ad hoc solutions, which wastes time by duplicating work already being done in other institutions and makes it difficult to share content due to a lack of interoperability standards.
To begin to address these challenges and aid in the maturation of 3D and VR as learning and research technologies, an interdisciplinary group of librarians from Virginia Tech, Indiana University, and the University of Oklahoma convened to develop a series of three national forums on this topic, funded by the Institute for Museum and Library Services (IMLS), as a project titled Developing Library Strategy for 3D and Virtual Reality Collection Development and Reuse (LIB3DVR).7 Each forum was designed to cover a particular phase of the 3D/VR lifecycle within academic contexts. In June 2018, the second forum was held at the University of Oklahoma on the topic of 3D/VR Visualization and Analysis and considered the following research questions:

RQ1: What are effective strategies for addressing common challenges faced by academic libraries as they implement 3D and VR programs?

RQ2: How are academic librarians using VR to support existing library services, such as curriculum development and access?

RQ3: How can the knowledge and resources of academic library–based 3D/VR programs be shared with other academic and information organizations, such as public libraries and regional higher-education institutions?

This paper presents the findings of the 3D/VR Visualization and Analysis Forum, discusses the common challenges and strategies identified, and indicates key directions forward. The forum assembled invited experts representing academic libraries, commercial software companies, VR and visualization labs, and educational research centers for two days of closed-door discussions. The forum identified common challenges faced by a diverse range of stakeholders and institutions of various types and scales, synthesized strategies and practices discussed by forum participants as possible solutions to those challenges, and presented policies that participants are developing to support VR as a research and learning tool in their institutions.

In addition to convening experts in the field, a public forum was also held that brought together diverse stakeholders from the South Central United States library community to provide opportunities for engagement and knowledge sharing. Participants in the public forum represented local academic libraries, public libraries, public K-12 educators, commercial VR developers, and other academic programs. This enabled the cross-pollination of ideas and sharing of best practices for implementing VR in a range of contexts not represented by the invited experts. Including such a diverse group enabled the wider academic and library communities to benefit from the sharing of information that is otherwise often siloed or restricted to large institutions with substantial economic and knowledge resources.

LITERATURE REVIEW
A growing body of literature on VR has considered the technology's general benefits, explored its potential applications for research, presented methods of integrating VR into the classroom, defined some of the institutional challenges of adopting VR, and considered the use of VR for expanding library services.
The General Benefits of VR
The science that informs the development of contemporary VR systems has its roots in nineteenth-century perceptual research (and, even further back, René Descartes' seventeenth-century theory of vision established the groundwork for contemporary VR systems development), but it has been primarily within the last two decades that computer science and electrical engineering departments have defined the platform characteristics that reveal VR to be uniquely beneficial for working with complex 3D data.8 Under controlled conditions, researchers have identified and tested the prevalence and impact of myriad "real-world" depth cues; benefits related to preservation of the embodied first-person viewer in a virtual environment; and the importance of increased viewing angles for engaging with what would traditionally be considered cluttered data sets.9 Combined, this set of features allows for more efficient analysis of visual information, especially as related to activities where the user is expected to search, identify, describe, and compare subcomponents of complex, multivariate data sets.10 Research has thus shown that VR is valuable because it presents information in context, at human scale, and in a way that is responsive to a wide range of body-centered interactions and representational characteristics that reproduce real-world interactions. These general benefits can be applied across academic disciplines and institutions.

Uses of VR in Research
The general benefits of VR that have been identified are now regularly employed in research capacities across the academy. VR and related 3D data-creation tools are being applied to fields such as digital humanities,11 archaeology,12 cultural heritage preservation,13 medieval studies,14 engineering,15 biology and biodiversity research,16 medicine,17 and architecture.18 In some cases, these approaches draw on the capabilities of 3D/VR to recreate immersive, high-fidelity experiences of real-world spaces, while in other cases, researchers are exploring the capabilities of VR to provide a platform for analyzing spatially oriented research data in the form of 3D models of cultural heritage artifacts and sites or visualizations of multivariate quantitative data.19 In all of these cases, VR extends the capabilities of the human senses to engage with digital research data and scholarly outputs in ways that open new possibilities for discovery and analysis.

Integrating VR into the Classroom
Starting with work on Second Life, Quest Atlantis, and others, researchers have also studied the pedagogical potential of virtual worlds.20 These early virtual worlds consisted of computer-generated 3D environments, but user engagement was limited to viewing them on 2D computer monitors and interacting via keyboard and mouse interfaces.
Early virtual classrooms were designed and studied in the hope that they could effectively bring students and teachers together from across the world and enable them to engage with distant artifacts, locations, and people as part of the curriculum, and early research was concerned with how virtual worlds could emulate or expand on the benefits of traditional learning environments.21 For example, Jeremy Bailenson has argued that VR is particularly well-suited for providing field trips to students, i.e., learning experiences that enable students to visit new places, and Chris Dede has identified the benefits and challenges of VR field trips.22 One of the main challenges of this type of learning technology is the high cost of designing and building the virtual environments; however, as Bailenson points out, once they are created they can be endlessly replicated and shared, "enabling us to share educational opportunities with anyone who has an Internet connection and an HMD [head-mounted display]."23

With the increasing adoption of VR equipment by schools and libraries due to decreasing equipment costs, the focus has shifted to studying the pedagogical benefits of immersive VR experiences. These experiences place the user in an interactive, stereoscopic world, with interface controls modeled on intuitive embodied gestures and movements. VR has been shown to aid design and learning tasks in a range of fields, including design work in architecture classes24 and anatomical instruction in medical schools.25 These VR experiences augment, but do not replace, other forms of classroom learning, just as traditional field trips provide interactive learning experiences that contribute to formal classroom learning.

While these applications are impressive, the high cost of VR adoption leads to concern about evaluating the benefits of VR for learning. What types of benefits are valued, and how do we evaluate those benefits in rigorous ways? Bailenson points out that VR may not facilitate factual knowledge acquisition better than other educational delivery methods; instead, it may have other benefits, such as increased student engagement, enthusiasm, and self-efficacy.26 Indeed, Lischer-Katz et al. showed how a carefully designed course integration using VR could have a positive impact on undergraduate students' self-efficacy in regard to spatial analytic tasks and technology engagement.27

Mina Johnson-Glenberg studied two unique attributes of VR, "the sense of presence, and embodied affordances of gesture and manipulation in the 3rd dimension," and generated findings that supported the hypothesis that "when learners perform actions with agency and can manipulate content during learning, they are able to learn even very abstract content better than those who learn in a more passive and low embodied manner."28 Schneider et al., Kuliga et al., and Pober and Cook have all documented the impact of VR on the process of architectural design.29 In the case of Schneider et al., students engaged with digital and physical versions of the same facilities via virtual and real-world tours over the course of a semester.
While students were critical of the digital surrogates' relative lack of atmospheric detail, they also communicated that "experiencing the 3D-model in real size helped to evaluate the design."30 Similarly, Angulo successfully integrated VR into her undergraduate architecture coursework, resulting in a documented increase in term-project evaluation scores for those students who iterated on designs using a VR viewing tool.31 These studies reflect the potential impact of VR across the design disciplines (e.g., architecture), where such tools are already being deployed in professional settings.32 Collectively, these studies indicate the likely benefits of VR in the classroom across disciplines, especially in fields where accurate perceptions of spatial characteristics—such as depth and scale—are critical to student (and professional) success. Additional work is necessary to develop and streamline pedagogical research instrumentation whereby easily applied metrics can be implemented by library and instructional staff to evaluate the effectiveness of VR on students and other users.

Institutional Experiences of Adopting VR
With the release of consumer-grade VR equipment, more and more institutions are considering the feasibility of adopting VR. As a result, case studies, practical strategies, and models for institutional deployment of VR are beginning to appear in the published literature.33 For example, Austin Olney discusses the implementation of augmented reality (AR) systems at the White Plains Public Library in New York and suggests that this endeavor is more easily accomplished by building off of existing VR capabilities and policies. In particular, logistical and legal concerns that had been addressed when previously deploying "public" VR (e.g., the signing of waivers before use) were useful for rapid deployment of AR.34 Bohyun Kim offers practical considerations for the integration of VR systems into library makerspaces, assessing suitable VR hardware and software to support 3D-modeling activities at the University of Rhode Island Libraries.35 Patterson and co-workers define five service models for integrating VR into libraries (see table 1 below).

Table 1. Library Service Models for VR.36

• Open lab space: walk-in use
• Closed lab space: demonstrations, testing, and staging of new equipment
• Flexible lab space: reservable space and equipment for class or team use
• Equipment checkout: individual use
• Developer kits (laptops and VR equipment): for checkout to use in research, demonstrations, and presentations

One of the major challenges faced by all institutions adopting VR is the concern over user comfort while using VR systems. As Steven LaValle suggests, "experiencing discomfort as a side effect of using VR systems has been the largest threat to widespread adoption of the technology over the past decades."37 Fortunately, published best practices concerning baseline performance standards for consumer headsets and software design considerations have been well established in the literature, suggesting that those looking to adopt VR in institutional settings can do so by drawing on existing technical knowledge concerning how to ensure the relative comfort of VR users.38 This combination of practical and technical considerations will undoubtedly further the adoption of VR across educational institutions worldwide.
The remainder of this article reports on the methodology and findings of this forum, which expand upon the benefits and challenges identified in the literature.

METHODS
The conveners assembled a two-and-a-half-day forum at the University of Oklahoma in Norman, Oklahoma, with fifteen expert participants in attendance. Participants were selected in consultation with an advisory board, with the intention of recruiting a diverse group of national experts in representative fields, including academic librarians, researchers from a variety of disciplines, and commercial game designers and software engineers. The conveners used nominal group technique to generate research data for this study.39 Nominal group technique is a consensus-building method for achieving general agreement on a topic through face-to-face small group discussions, and it "empowers participants by providing an opportunity to have their voices heard and opinions considered by other members" in a structured format.40 This method was adopted in order to reveal key challenges related to the visualization and analysis of 3D and VR data and strategies for designing and managing library programs to support these activities.

Data were generated through community note taking using Google Drive documents designated for each forum session. At the end of each discussion session, a group note taker summarized and presented the views of each small group to the wider forum. Both the raw community notes and the summarized facilitator notes were collected and analyzed. Notes produced from the smaller groups and from the larger group form the basis of the findings. One discussion topic, "Course Integrations and Measuring Impact on Student Learning," was open to the public during the "public forum" portion of the forum on the afternoon of the second day (see the "Findings from the Public Forum" section, below). While these additional attendees were not given the opportunity to participate in the collaborative note taking, participants in the public forum provided anonymous responses to a set of questions on index cards that they submitted to the research team.

Data analysis consisted of grouping data from the community note-taking documents into higher-level categories based on the research questions and emergent themes, following an inductive analysis approach. A central part of the data analysis process involved grouping specific examples of institutional practices and personal perspectives in order to link them to more general, community-wide phenomena. In this way, a set of shared challenges and strategies could be identified at the community level of analysis.

While there was a range of institutional and professional perspectives presented by the forum participants and the intention was to present a diverse set of perspectives on the topics covered in this forum, one limitation of this methodology is that it is limited to small groups of experts, which could potentially exclude other perspectives. The inclusion of the public forum, including more participants from a greater range of institutions, helped to mitigate this limitation. We validated these findings by disseminating drafts to participants and asking them to correct, clarify, or elaborate on the contents. The authors incorporated all participant feedback into a subsequent draft. This project was approved by the Institutional Review Board of the lead organizing institution, Virginia Tech.
FINDINGS
This section discusses the forum findings and aligns them with the project's research questions.

RQ1: What are effective strategies for addressing common challenges faced by academic libraries as they set out to implement 3D and VR programs?
Findings for research question 1 (RQ1) are broken down into three main areas: (1) human-centered design challenges, (2) initiating VR programs in libraries and schools, and (3) curriculum and research integration and assessment.

Human-Centered Design Challenges
Participants agreed that, in many ways, virtual reality is still an immature technology, which is reflected in shared experiences of forum participants, such as encountering simulator sickness and interface learning curves. Because simulator sickness results from, among other things, a graphical rendering performance shortfall that leads to a disconnect between what the eyes and inner ear perceive, or from an unnatural locomotive user interface (UI) decision on the part of content creators, the importance of graphics card and software selection (above and beyond their educational value) was emphasized.41 Adding in-app spatial reference points—such as a virtual horizon line—was mentioned as a quick solution for the disorientation some users experience when engaging with virtual environments. Practical solutions were offered for other common issues related to the academic use of VR: providing ginger candy and defining reasonable per-person time limits for in-headset sessions can help with motion sickness, while small personal mirrors address the self-consciousness (i.e., mussed hair) that some users, particularly students, experience after removing a headset. Indeed, while not physiologically discomforting, instances of self-consciousness have been known to interfere with the educational effectiveness of virtual reality at the K-12 level, and methods for mitigating that experience—such as the deployment of small mirrors for checking one's appearance after a headset session—were discussed by forum participants.42

Most VR systems are able to provide both sitting and standing experiences. Standing experiences are particularly valuable insofar as full-body interface mechanisms allow users to engage their whole bodies in interacting with VR systems, which provides a close correspondence between a user's physical movements and their navigation of the virtual space. This decreases the learning curve of VR learning experiences, since the traditional alternatives—a sometimes tricky controller schema or command line interface—often require software- or discipline-specific training. In the case of seated VR systems, the system is able to accommodate users with disabilities. Not only are students, faculty, and staff with disabilities able to engage with VR content, but the VR technology can also provide heretofore impossible learning experiences by providing lifelike access to scenes that are physically inaccessible to them. While these possibilities are promising, participants agreed that further work needs to be performed to accommodate often-overlooked barriers to accessibility.
Examples of early techniques used to successfully address accessibility concerns include the mirroring of controller mappings to allow for use by either hand, the integration of accessible interfaces such as the Xbox Adaptive Controller, and the creation of open-source developer tools that can virtually adjust for a range of visual impairments.43

Regarding specific VR hardware currently available on the commercial market, participants were quick to point to common factors that limit the approachability, use, and scalability of university- or library-based virtual reality. One critical concern was the cost of VR equipment; at the high end, Oculus Rift, HTC Vive, and assorted Windows Mixed Reality headsets require a tethered connection to a dedicated personal computer (PC), and designated PCs must be outfitted with graphics cards that start at ~$300 (US) and increase rapidly from there. To sustain the performance necessary to deliver a stable and comfortable virtual experience to the end user, a $500 graphics card coupled with another $500-$1,000 worth of computing hardware (e.g., CPU, motherboard, storage, monitor, etc.) is necessary, in addition to the purchase price of the VR headset. Cost-wise, high-end VR remains a prohibitively expensive endeavor outside of well-funded research institutions.
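As a rough illustration of these figures, the following back-of-the-envelope sketch totals the per-station cost quoted above. The headset price range is an assumption not given in the text (consumer tethered headsets commonly listed for roughly $400-$600 at the time), and all values are illustrative rather than vendor quotes.

# Hypothetical per-station budget for tethered, high-end VR, using the
# component figures quoted above; headset prices are assumed, not sourced.
gpu = 500                              # graphics card for stable frame rates
pc_low, pc_high = 500, 1000            # CPU, motherboard, storage, monitor, etc.
headset_low, headset_high = 400, 600   # assumed headset price range

station_low = gpu + pc_low + headset_low       # 1,400
station_high = gpu + pc_high + headset_high    # 2,100

print(f"One tethered station: ${station_low:,}-${station_high:,}")
print(f"A four-station lab:   ${4 * station_low:,}-${4 * station_high:,}")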
Forum participants also noted that the same real-world fidelity that makes virtual reality an impressive learning platform can also produce harmful and disturbing effects. Indeed, many of the most popular VR titles are violent, action-oriented “first-person shooters,” or multiplayer chat rooms in which very little attention is paid to regulating abusive language. Moreover, there are some educational or training applications that, while culturally and politically sensitive, are nonetheless unsettling in a high-fidelity virtual environment. Disturbing places or objects can also negatively impact users; for example, phobia training applications in virtual reality, such as FearlessVR (http://www.fearlessvr.com/), might require academic institutions to deploy disclaimers prior to use, at the risk of disruptive or damaging reactions from users.

Finally, forum participants discussed ways in which earlier design paradigms have subtly influenced the design of 3D/VR applications. Participants noted that while libraries and librarians might immediately assume that a books-on-the-shelf virtual library is a good way to invest time and development resources, the technology affords the means to interact with not just text-based materials (or the spines of books, in the case of browsing activities), but richly detailed “source material” (3D objects of study) as well. In terms of design principles, the entire body can be incorporated into wholly new search and discovery mechanisms that build on the best aspects of the library browsing experience while incorporating novel content types.

Initiating VR Programs in Libraries and Schools

Participants identified a variety of challenges associated with initiating VR programs in libraries and schools, along with a set of strategies for addressing these challenges. They discussed the challenges of developing curricula, the importance of management plans, and the impact of particular institutional landscapes on the success or failure of 3D/VR initiatives, and they offered some insight into the experiences of other information institutions, such as museums, that may be helpful to consider when developing broader strategies.

One emergent theme emphasized the challenges of developing curricula—specifically, customized 3D/VR teaching modules. Some participants noted that the expertise needed to develop these learning modules is unevenly distributed across the university, which makes it difficult to develop these critical teaching components. Furthermore, finding and funding technical expertise is a significant challenge, and participants discussed the difficulties in balancing the investment in untested technologies with the potential benefits of those same technologies. Untested technologies may be difficult to maintain and may require buy-in from administrators. Another challenge is the difficulty in getting researchers to share their project outcomes for use in instruction. Researchers who develop VR tools or content may not want to share the products of their labor because they perceive them to be integral parts of their research agendas, or they may feel that the products of their projects are so customized that they will not be usable by other researchers. Finally, participants in the forum pointed to the general bottleneck of content creation for 3D and VR, which impacts the development of teaching curricula and the production of research-quality 3D data. More specifically, they noted that better workflows need to be developed to make it easier for institutions to create their own 3D models and to acquire and work with 3D models created by other researchers. Participants pointed out that a lack of easily accessible content will likely limit investment in VR programs and integration into curricula, which suggests that support for sharing content between and within institutions could be an important way of promoting the adoption of these new technologies.

Forum participants discussed how developing management plans for 3D/VR hosting spaces is essential to ensure successful initiation of VR programs.
Participants offered a range of models for overseeing user engagement with VR equipment, including placing the equipment in a makerspace-like environment that is always monitored by staff; keeping the space locked, with the option for users to request a key; and implementing the equipment in a fine arts space that has its own operating hours and security staff to oversee public engagement. Indeed, the question of staffing spaces emerged repeatedly throughout the proceedings, with student staff from the host university or college suggested as an effective and inexpensive means of overseeing 3D/VR spaces, assisting users, and troubleshooting the technical issues that are typical of these new and emerging technologies. Students were also identified as potential content creators for those spaces and as promoters of the 3D/VR programs to their peers.

The institutional landscape of the school hosting the 3D/VR program was also identified as an important factor that can impact 3D/VR adoption, since administrators can play a big part in helping or hindering VR implementation. Participants noted that they needed to explicitly justify the time and return on investment for VR in less supportive environments, which is not surprising given ever-tightening budgets across universities. Faculty support also impacts VR program initiation. While a handful of faculty members may be willing to superficially explore VR technology and try it out in one of their classes, wider adoption will require the provision of measurable student outcomes. However, the biggest challenge noted for getting faculty to support VR initiatives was convincing them that it could be useful for their research.

Participants from museums that are implementing VR or augmented reality (AR) technologies also offered some insight into the institutional challenges of these technologies. Some museums are creating 3D scans of historical items, which raises a number of concerns about the accuracy of the models and how that accuracy impacts the meaning of the 3D models in the context of a museum’s wider collecting mission. Museums want to create 3D content that is historically accurate while at the same time being optimized for use in VR and AR applications. The biggest challenge identified is how to make the technology affordable while also maintaining its usefulness to communities of museum visitors. This aligns with concerns expressed by libraries, which also find balancing cost and usability to be a key concern.

Participants from across institutions offered strategies for addressing some of these challenges. First, participants noted that training sessions can provide an important introduction to VR for students and faculty that enables them to have a good initial experience and see the value of VR beyond its novelty. Without proper orientation, users could have an unpleasant initial experience, which could unnecessarily sour them on using VR in the future. Second, participants suggested several demonstration techniques for introducing a wide range of university and library community members to the new technology and exhibiting it in a positive light. “Road shows,” i.e., taking the technology to other parts of campus or to different communities, can be very useful for reaching users who may not otherwise come into the library to engage with emerging technologies.
Library-hosted events, such as hackathons, workshops, and demonstrations, can also help to develop interest among current library patrons. Providing mobile workstations for classroom use or individual patron checkout can also make the technology more accessible. These strategies suggest the importance of convincing potential users of the benefits of 3D/VR for their particular interests or research needs.

Curriculum and Research Integration and Assessment

Integrating 3D/VR technologies into established research and teaching conventions and workflows has proven to be a challenge, and all participants reported continued struggles to establish ways of assigning credit to the creators of 3D/VR content and methods for rigorously assessing the impact of this content on learning outcomes. Regarding tenure and promotion concerns, it is not clear how faculty outside of technologically oriented or applied science disciplines might communicate and track their use of VR as a tool in their courses or research for inclusion in their tenure and promotion portfolios. Creating VR experiences represents a substantial investment of development resources and faculty time, and the experience itself—while distributable and citable—is not typically treated by the research community as a scholarly output. This is symptomatic of the relatively immature content ecosystem (i.e., a scarcity of educational VR software and scholarly 3D content), which necessitates custom software development or forces researchers to use off-the-shelf tools that may involve “blackboxed” data transformations or offer limited functionality.

To address this current lack of fully developed educational software and the necessity of assembling 3D/VR course modules piecemeal, participants concluded that the best place to start the integration process would be by establishing “first principles,” or what is known for certain regarding the strengths and weaknesses of VR. Faculty members who want to include a 3D/VR component in their teaching or research agenda should familiarize themselves with the key benefits of the technology that transcend disciplines and that are supported by evidence-based assessment. Faculty should select task types for their learning activities that lend themselves to VR visualization, and deploy course (or research) content that, due to fragility, distance, rarity, or scale, is relatively inaccessible.

Forum participants discussed how the benefits of 3D/VR for a particular instructional goal or research initiative should be judiciously weighed against the relative cost of purchasing, installing, and maintaining what is still an immature, fast-changing set of interrelated content creation, visualization, and output (e.g., 3D printing) technologies. Moreover, once these technologies are deployed, and learning outcomes or research goals that align with the documented benefits of the technology are identified, assessment strategies should be deployed to evaluate the impact of these tools. Measuring performance, engagement (e.g., measuring time-on-task), comparative economic benefit (i.e., with respect to traditional course content delivery methods), and qualitative variables, such as impact on student self-efficacy, are all useful approaches for measuring the impact of VR in a research or teaching environment, and they may thereby assist in justifying the expansion of these programs and tool sets.
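As one small, concrete example of the engagement measures named above, the following sketch computes per-session time-on-task from session logs; the log format and values are invented solely for illustration.

```python
from datetime import datetime
from statistics import mean

# Hypothetical session log: (student_id, headset_on, headset_off).
sessions = [
    ("s01", datetime(2019, 3, 4, 10, 0), datetime(2019, 3, 4, 10, 18)),
    ("s02", datetime(2019, 3, 4, 10, 20), datetime(2019, 3, 4, 10, 47)),
    ("s03", datetime(2019, 3, 4, 11, 0), datetime(2019, 3, 4, 11, 12)),
]

minutes = [(off - on).total_seconds() / 60 for _, on, off in sessions]
print(f"mean time-on-task: {mean(minutes):.1f} minutes across {len(minutes)} sessions")
```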
RQ2: How are academic librarians using VR to support existing library services, such as curriculum development and access?

Participants discussed a number of applications for VR and related technologies that could help expand library services, including new ways of browsing and engaging with library resources, circulating VR equipment, and offering entirely new types of library services. Some applications that were brought up included using AR to develop virtual books and book “trailers” that display promotional materials for each book as patrons browse the stacks with their AR-enabled devices. Using VR to engage with the materiality of books was also suggested. Participants gave examples of using VR to look closely at Medieval manuscripts, rare books, and artists’ books in order to observe their spatial properties, such as indentations, surface detail/topography, ink, and other material aspects.45 Other participants suggested that VR could be used to compare and recontextualize library collections across geographies, placing the books in the contexts in which they were found and enabling comparative analysis of textual features. Participants also suggested that VR could be used as a platform for presenting numerical data, which could have applications for researchers who analyze data through already established visualization services at academic libraries.

The discussion around expanding access using VR brought up the idea of libraries hosting 3D/VR collections, which could contain 3D scans and VR environments sourced from locally produced content and would be shared by 3D-scanning partners around the world. One question that was raised was how to store the 3D content so that it could be easily transferred to VR systems for access. Participants from the University of Oklahoma discussed their work hosting multi-campus VR walkthroughs of 3D scans of historic sites (e.g., the arches of Palmyra, Syria), which offered an evocative example of how libraries can simultaneously provide access to technology, create and curate collections of scholarly 3D-scan data, and present expert-led events.46 This shows how libraries hosting VR technologies can be both technological and intellectual partners in presenting 3D content to library patrons. Participants also pointed out how VR could contribute to ongoing efforts in the library community to develop tools for linked open data and bibliometrics by creating new ways of visualizing networks of relationships between texts. These examples suggest that VR could be used as a multidimensional visualization platform that would visualize relationships between texts and areas of knowledge in rich and immersive ways, yielding new insights for librarians, library researchers, and other users.47

A circulation model of VR deployment was also discussed and has been found successful at several institutions. This typically followed a traditional “checkout” model of using the circulation desk to loan equipment for patron use for a limited time period. Some libraries check out all the VR pieces separately, while others are experimenting with full VR kits. Circulating VR equipment brings up a number of challenges, such as hygiene concerns, theft and loss of equipment, the cost of licensing and scaling up software purchases, and managing software accounts and updates.
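As a sketch of what circulating a “full VR kit” might look like at the systems level, the record below treats the kit as one checkout unit while still listing components to verify at check-in; the component list and loan policy are hypothetical, not drawn from any participating library.

```python
# Hypothetical catalog record for a circulating VR kit.
VR_KIT = {
    "kit_id": "VRKIT-001",
    "loan_period_hours": 4,
    "components": [
        "headset", "controller_left", "controller_right",
        "link_cable", "face_pad", "carrying_case",
    ],
}

def missing_components(kit: dict, returned: set) -> list:
    """Return the kit components not present at check-in."""
    return [part for part in kit["components"] if part not in returned]

print(missing_components(VR_KIT, {"headset", "controller_left", "carrying_case"}))
# ['controller_right', 'link_cable', 'face_pad']
```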
Adopting a circulation policy for VR may help to ensure the sustainability of the program by centralizing cost, risk, and point of access. However, centralization may not always be the most appropriate solution, as it can put off some faculty and discourage their interest in using the technology. A final challenge of circulating VR hardware is the bottleneck of content creation: libraries find it difficult to provide new VR content, which may make it hard to sustain user interest over time. Some participants pointed out that without a game design curriculum or other creative programs on campus with the capability of developing new content, users may lose interest, which would limit the success of a circulation-based model. If programs do not have access to the knowledge and tools necessary to develop VR content, it becomes difficult to expand services for development and pilot projects. It is therefore critical to form partnerships with content creators early on when developing VR programs.

RQ3: How can the knowledge and resources of academic, library-based 3D/VR programs be shared with other academic and information organizations, such as public libraries and regional higher-education institutions?

Based on participant discussions, several areas of concern were identified that need to be addressed in order for 3D/VR knowledge and resources to be shared with a broader range of institutions: methods for collaborating and coordinating across institutions; strategies for addressing development and hardware resource limitations; and challenges to the widespread adoption of 3D/VR tools in higher education.

Methods for Collaborating and Coordinating Across Institutions

Collaboration and coordination across institutions were found to be important for ensuring that VR tools are made available to both large and small institutions. Few small colleges, cultural heritage institutions, or public K-12 school districts have the financial or technical resources to deploy educational VR at scale. The hardware and software used for educational VR require expertise—in the form of hardware setup, maintenance, administrative capabilities, and software development experience—that represents a significant investment in labor (i.e., staffing overhead) and training on the part of research institutions, such as those participating in the forum. Forum participants agreed that, given the disproportionate concentration of expertise within higher education, initiatives should be undertaken to ensure that tools, workflows, training, and support are provided to organizations outside of academia. A consortium model was suggested as one formal mechanism for addressing this concern.

Strategies for Addressing Development and Hardware Resource Limitations

In the case of software or content, forum participants emphasized the value of open-source and open-access standards for facilitating the widespread distribution and use of educational VR content, especially for those without the development resources to create their own content. Participants suggested an open “app store”-like ecosystem to assist in the distribution of this educational content across organizations. To aid in the successful integration of apps from a central database, participants discussed the prospect of supporting collaborators remotely, perhaps even from within VR, essentially training others on the software using the strengths of VR hardware.
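To make the open “app store” idea concrete, the sketch below shows one hypothetical shape a shared catalog record could take, plus a rough check of whether an app can run on a given hardware tier. Every field name, value, and URL is an invented placeholder, not an existing standard or service.

```python
# Hypothetical catalog record for a shared educational VR application.
app_record = {
    "title": "Protein Structure Walkthrough",
    "discipline": "biochemistry",
    "license": "CC-BY-4.0",          # open licensing eases redistribution
    "source_url": "https://example.edu/vr/protein-walkthrough",  # placeholder
    "min_hardware": "smartphone-vr", # lowest tier that can run the app
}

# Assumed ordering of hardware tiers, least to most capable.
TIERS = ["smartphone-vr", "standalone-headset", "tethered-pc"]

def runs_on(record: dict, tier: str) -> bool:
    """True if the given hardware tier meets the app's minimum requirement."""
    return TIERS.index(tier) >= TIERS.index(record["min_hardware"])

print(runs_on(app_record, "standalone-headset"))  # True
```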
Finally, participants identified several large content-hosting platforms that host a variety of 3D assets relevant to education and that might readily be deployed in VR without extensive development resources. A sandbox or open-ended viewing environment for loading and analyzing arbitrary sets of user-generated 3D models could be provided by the university or academic library, such that the end user would experience 3D-learning objects that were selected and deployed by local educators without the need for software development expertise.48

In contrast to the high-end VR workstations (e.g., Oculus Rift, HTC Vive, etc.), participants were quick to point to the relative affordability of smartphone-based VR solutions. These Google Cardboard–type implementations are especially promising given the widespread adoption of smartphones with sufficient computing power to render interactive educational 3D content stereoscopically. In the case of Sketchfab, a commercial 3D-hosting platform, web-based 3D assets can be uploaded, collected, and accessed from any current smartphone device, and launched quickly into a stereoscopic viewing experience. In this way, more users at a wider variety of educational institutions can make use of some of the uniquely beneficial platform characteristics of VR (e.g., depth cues) without committing to the purchase of high-end VR workstations.

Addressing Challenges to the Widespread Adoption of 3D/VR Tools in Higher Education

Library technologists want to introduce VR to a wide audience of potential beneficiaries, yet this approach may cause faculty to dismiss such implementations as a novelty. Forum participants repeatedly discussed the importance of involving faculty in implementations of VR tools and spaces that have academic value. Participants noted how the uptake of 3D/VR technologies can be thwarted when faculty are uninterested or are not well informed about the potential uses of 3D/VR technologies.

Forum participants noted that university faculty who want to incorporate 3D/VR technologies are faced with many of the same challenges encountered by smaller educational organizations, including lack of guidance on setup and maintenance, administrative pushback, and high cost. Participants agreed that academic libraries should play a critical role in the hosting and administration of these systems. Deployment of 3D/VR technologies in academic libraries would provide a central resource that could be used by many departments, including fields that may lack the resources to invest in their own equipment. Moreover, since librarians already act as research and instructional collaborators with faculty, those relationships can be drawn upon when adopting VR, and showcasing faculty engagement helps to demonstrate the academic value of 3D/VR technologies for library and university administrators.

Further Strategies for Collaborating Across Organizations

Interinstitutional collaboration was put forward as a means of beginning to address the challenges facing both universities and smaller educational organizations seeking to implement 3D/VR programs. Participants agreed that it was incumbent upon the larger universities or institutions to provide open access to their VR software projects and to provide the mentorship and training necessary to successfully deploy these applications for use by smaller institutions.
Forum participants suggested that, in some cases, this will require a physical visit to a collaborator’s location and live demonstrations of tools and techniques. Alternatively, a showcase or summit event could be hosted by large institutions for the purpose of demonstrating 3D/VR technologies to a number of small institutions. At such events, during site visits, or within VR-based training sessions, best practices and the results of empirical research on the efficacy of 3D/VR could be communicated to smaller organizations.

One such institution type, public libraries, was discussed in relation to the potential value of collaborative outreach. Even with limited budgets, public libraries are trying to bring 3D/VR technology to their patrons. They would benefit from becoming strong collaborators with local universities and colleges. For example, summer programming at public libraries could be organized to effectively distribute the software, standards, best practices, and workflows being pioneered at the university level. Forum participants suggested a “hand me down” program for donating earlier-generation headset hardware, which is replaced quite frequently and at great expense but is typically still functional. In this case, smaller organizations would benefit from surplus equipment funded by research grant money or donor contributions, both of which are less common at the public-library level. Along with the open-access software and 3D-asset ecosystems discussed above, this sharing of hardware and knowledge would increase the impact of 3D/VR technologies across multiple organizations and institutions.

Findings from the Public Forum

The public portion of the forum consisted of a half-day, afternoon session attended by local stakeholders from other academic libraries, public schools, public libraries, and other institutions in Oklahoma, Texas, Kansas, and Arkansas. They were invited to attend and engage in discussions with the invited experts on the topics of 3D/VR in relation to smaller institutions, such as public K-12 schools and public libraries. The following section reports on findings from that session, drawing on a set of 38 anonymously completed notecards that participants filled out during the public forum and returned to the project team. For these notecard responses, participants were asked to answer the following three questions:

1. What challenges do small- and medium-sized, public-facing institutions face when implementing VR?
2. How can large institutions support small- and medium-sized institutions that are starting to adopt 3D/VR?
3. What options are there for public libraries or K-12 schools to participate in 3D/VR workflows?

The following sections summarize those responses and the key themes from the public forum discussion.

Challenges Faced by Small- and Medium-Sized Institutions

Public forum participants identified challenges related to cost (including equipment, maintenance, and staffing), content creation, and faculty/administrator buy-in as the top concerns facing small- to medium-sized public institutions looking to deploy educational VR. Regarding cost, it was clear that budgets for innovative, unproven technology that may become quickly obsolete were oftentimes nonexistent, and local VR champions who sought to install such technology faced pushback from administrators who did not understand the technology and were focused on measurable returns on investment rather than exploratory offerings.
Finally, staffing was a challenge identified and shared by multiple public forum participants. Hiring, training, and supporting skilled staff members who are expected to keep abreast of ongoing developments in a still-young technology such as VR requires a sustained investment from administrators. Participants noted that even with skilled staff in place and access to the necessary equipment, the success of a given VR deployment was not guaranteed. One potential bottleneck identified by public forum participants was VR content creation. Due to the relative immaturity of the VR software marketplace and its associated disorganization, public-forum participants described focusing their efforts on supporting local content creation by students, faculty, and staff. Beyond recognizing the need to hire and support costly development workers, public-forum participants also noted how the preservation and further distribution of locally produced VR content require skills and training beyond their level of expertise. Oftentimes, these small- to medium-sized public institutions have a single staff member, who may not have all of the required technical skills, tasked with deploying and developing VR content, which limits the local impact of this potentially transformative educational technology.

Finally, multiple public-forum participants identified the need to garner faculty buy-in as a recurring challenge. Oftentimes faculty do not know about the technology or have a limited understanding of it. To overcome this, public-forum participants noted that a demonstration of VR’s value is critical. They suggested that this demonstration might be accomplished in partnership with larger educational institutions (e.g., universities), whose own staff, content, and expertise could be leveraged to best communicate the value of VR to local administrators and faculty.

Ways Larger Institutions Can Support Smaller Ones

Beyond assisting small- to medium-sized public institutions in demonstrating the value of VR for local faculty and administrators, public-forum attendees suggested that the continued development and distribution of open-source VR software, help with setup and maintenance issues, and formal knowledge-distribution activities—in the form of conferences, consortia, and grant partnerships—would streamline the adoption of VR by these smaller public institutions. Public-forum participants also expressed a desire to engage with the research outputs of larger institutions that focus on the efficacy of VR, which may be useful for working with local faculty and administrators.

Software sharing was specifically identified as a way that larger institutions could support the early efforts of small- to medium-sized institutions as they set out to deploy and integrate VR. Participants were careful to note that additional support, in the form of training and troubleshooting, was equally important to the distribution of the software itself. The affordability of VR software was described as a particularly important issue by a number of participants, hence the group’s focus on open-source solutions. Predicting inevitable hardware and software failures, public-forum participants communicated that onsite support by VR experts from larger institutions would be ideal. Participants noted that, while knowledge sharing is important, guidance on how to set up a specific system sometimes requires onsite support.
Fortunately, there are novel support mechanisms enabled by the technology itself, with public-forum participants suggesting that experts could be “brought in” to provide support within the virtual environment. In this case, a multiplayer VR experience similar to those demonstrated by the content developers in attendance could function as an interactive learning platform for the distribution of information and could even provide training on the technology.

Ways Public Libraries and K-12 Schools Can Participate in 3D/VR

The size and diversity of the public library and K-12 user communities were reflected in the feedback provided by public forum participants concerning ways in which these types of institutions can participate in VR. Participants suggested that within public libraries, focus groups that represent the different ages, physical ability levels, and socioeconomic backgrounds of library users can be recruited to test and refine VR offerings. In this way, the wider public can be introduced to VR technologies and made aware of their benefits. The potential for K-12 students to assist in the VR-content development process, which could have the added benefit of helping students develop valuable technical skills, was also mentioned as a way that these institutions could participate. This programming need not start out as a formal curriculum, participants suggested, but rather as an afterschool program, taking the form of a “modern day A.V. club,” as one participant put it. Public-forum participants identified the need for compensation or incentive programs for adolescents in public libraries who wish to contribute to the content creation process.

Overall, participants in the public forum provided positive feedback about their experience at the event: “Very useful indeed. I learned a lot,” wrote one participant. “It’s nice to hear perspectives of those working in academia and to share our own perspectives . . . ” wrote another public-forum participant. A third participant was enthusiastic about the future, writing: “I love the idea of libraries being partners.” These responses indicate the importance of collaboration between small and large institutions and the value of these types of public forums for sharing knowledge in this field.

SUMMARY OF FINDINGS & DISCUSSION

The findings drawn from the discussions and presentations at the forum offer a broad view of the current concerns of the diverse community involved in implementing 3D/VR in academic institutions for the purposes of education and research. The range of stakeholder groups is expansive and demonstrates a growing interest in immersive visualization technology across many fields and institution types. By reviewing previous literature in conjunction with the group discussion findings, we can identify and summarize a set of common challenges facing libraries and other information institutions that are implementing 3D/VR technologies. In the following section, we describe the challenges or considerations identified for each topic and point towards possible strategies or directions forward for addressing them.
Initiating VR Programs in Libraries and Schools

The main challenges facing libraries and schools as they initiate VR programs include developing interest in and awareness of the emerging technology among faculty, students, and administrators; locating necessary VR expertise within their communities when knowledge is unevenly distributed; getting enough buy-in from administrators to support the allocation of necessary resources; encouraging researchers to share their projects and research outputs for the benefit of the larger community; and overcoming the bottleneck of VR content creation as a limiting factor on institutional investment. From the findings we can identify a set of strategies to begin to address these issues, including:

• Utilize demonstration techniques to generate student and faculty interest in VR (e.g., “road shows” and library-hosted events, such as hackathons, workshops, and demonstrations).
• Develop replicable workflows that can be implemented by a variety of stakeholders.
• Establish management plans for 3D/VR hosting spaces that include using student labor for overseeing VR technology spaces and content creation.
• Develop and validate metrics for evaluating the impact of VR technologies in order to provide evidence for faculty and administrators that VR is worth their time and investment.

Integrating VR into Research and Teaching

From the findings, we can also identify a number of challenges related to integrating VR into research and teaching. One of the major challenges in this area is that the research community does not value VR projects as scholarly or pedagogical outputs. Because of this, faculty members are typically reluctant to invest their time and resources in developing VR curricula if doing so will not contribute to their tenure and promotion portfolios. Related to that problem is the issue of assigning credit or attribution for 3D/VR learning objects when they may be developed by teams, which can discourage sharing and limit reuse of VR learning modules. Finally, because of a lack of metrics for measuring the impact of 3D/VR on student learning, it has been hard for faculty to justify integrating what is sometimes perceived as unproven technology into their classes. Determining which metrics to use has pedagogical as well as economic implications, since without metrics it becomes difficult to rationalize the expense of 3D/VR to institutional administrators. Participants did not have strategies for addressing all of these challenges, but offered the following suggestions:

• Design VR course integrations that take advantage of the particular access and analytic characteristics of VR technologies.
• Weigh instructional goals against the cost of VR equipment and development time.
• Develop assessment strategies and define metrics for evaluating the impact of VR learning activities on students.

Expanding the Role of the Library with VR

The group agreed that libraries are ideal places for hosting 3D/VR equipment, services, and support because they are often centrally located in their communities and can potentially help by centralizing the risk and cost of untested technologies.
Participants identified a number of new techniques for utilizing the benefits of 3D/VR to expand existing library services, such as offering new ways of browsing and engaging with existing library resources, enabling the development of 3D-based digital collections and of curated exhibitions and events that draw from those collections, and adding VR-visualization equipment as another piece of digital technology that libraries can circulate to support the varied uses of the range of patrons interested in this technology. Again, although there was no lack of big ideas regarding how 3D/VR could expand library services, the biggest challenge for libraries adopting 3D/VR into existing services was still a lack of verified educational content, which confirms the dire need to share 3D/VR content within institutions and across the wider community. Without platforms for sharing 3D/VR content and the appropriate institutional and disciplinary incentives to do so, 3D/VR is unlikely to be adopted broadly, and the range of exciting new applications will not be realized beyond niche projects.

Collaborating and Coordinating Across Institutions

Based on the findings drawn from both the expert-led and the public portions of the forum, it is clear that collaboration and coordination across institutions are essential for making 3D/VR a widely successful educational and research tool, because they can enable the sharing of resources with a range of smaller institutions that would otherwise not be able to adopt the technology on their own. Supporting this exchange will require providing faculty at larger institutions with the necessary tools and incentives, an area in which participants agreed that academic libraries could serve as the needed source of technical knowledge and equipment. In summary, participants identified the following approaches for collaborating with and supporting smaller institutions and expanding access to underserved communities:

• Larger institutions should provide tools, workflows, training, and support through on-site visits.
• Universities should partner with public libraries, since they can be hubs for providing access to communities that would otherwise not have the opportunity to engage with 3D/VR outside of academic communities.
• Use open-source and open-access standards and content, including an open “app store” ecosystem of 3D/VR content.
• Use existing databases of free 3D content.
• Use affordable smartphone-based VR applications when more expensive VR systems are not feasible.

These findings contribute to current discussions in the field of library innovation that consider how libraries can adopt and sustain emerging technologies, such as VR and 3D technologies.49 We have identified a set of common challenges and possible strategies for integrating 3D/VR programs into libraries and educational institutions, but additional research is required in this area to produce more detailed workflows for a range of institutional types to follow. There are inherent limitations to any specification, since every context has its own specific requirements for 3D/VR implementation, but as these findings suggest, there are common challenges that can be addressed in systematic and generalizable ways.
These findings offer some examples of this, but additional data collection is necessary to focus on some of the key areas that are still developing.

CONCLUSION

The overriding theme across the findings from the forum is the importance of interinstitutional and interdisciplinary collaboration. Confirming what we had assumed going into this project, it is clear that many of the challenges of 3D/VR can only be solved through systematic and concerted effort across multiple stakeholder groups. 3D/VR is not limited to a niche area. As we can see from the range of participants and applications, it has broad transformative potential and is becoming increasingly mainstream in many contexts. This suggests the importance of addressing these challenges through additional forums and working groups to generate standards and best practices that can be applied across the growing 3D/VR community. Such guidance needs to be specific enough that it can offer practical benefit to stakeholder groups of varying capacities, but flexible enough to be useful for a range of applications and disciplinary practices.

While the findings from the forum suggest a variety of techniques and strategies for addressing the challenges identified, there is still much work to be done to establish standards and best practices, generate institutional support, and enact change within disciplinary cultures in order to better support these communities. In particular, the following areas require further inquiry:

• Develop validated metrics for evaluating the impact of 3D/VR from pedagogical, research, and institutional perspectives.
• Develop guidelines and tools for supporting users with disabilities.
• Support smaller institutions in initiating and supporting 3D/VR projects.
• Find ways to educate skeptical disciplines about the value of research and teaching that uses 3D/VR.
• Develop tools for supporting 3D/VR throughout the research or educational lifecycle, including:
  o project management and documentation tools;
  o universal 3D viewers that integrate with VR equipment and 3D repositories;
  o sustainable, preservation-quality file formats for 3D and VR; and
  o open platforms for hosting 3D/VR content.

There are a number of other projects that are addressing some of these lingering challenges within the field of 3D and VR research and teaching, including Community Standards for 3D Data Preservation (CS3DP), an IMLS-funded project that is using a series of meetings and working groups to develop community-sanctioned standards for preserving 3D data in academic contexts (http://gis.wustl.edu/dgs/cs3dp/); Building for Tomorrow, another IMLS-funded project that is developing guidelines for preserving 3D models in the fields of architecture, design, architectural archives, and architectural history (https://projects.iq.harvard.edu/buildingtomorrow/home); the Smithsonian Institution’s 3D Digitization Program, which is developing workflows and metadata guidelines for a variety of 3D creation processes (https://3d.si.edu/); and the Library of Congress’s Born to Be 3D initiative, which has started convening experts in the field to look at the preservation challenges of “born digital” 3D data, including CAD models, GIS data, etc. (https://www.loc.gov/preservation/digital/meetings/b2b3d/b2b3d2018.html).
The LIB3DVR project team will continue to collaborate with members of these project teams to ensure that knowledge is shared and that any standards and best practices developed for 3D/VR visualization and analysis take into consideration the findings from this forum. The project team is confident that through these initiatives, useful standards and best practices will emerge to assist educators, researchers, librarians, technologists, and other information professionals in addressing the complex challenges of implementing 3D/VR visualization and analysis for scholarly and pedagogical purposes in their institutions.

NOTES

1 Matt Cook and Zack Lischer-Katz, “Integrating 3D and Virtual Reality into Research and Pedagogy in Higher Education,” in Beyond Reality: Augmented, Virtual, and Mixed Reality in the Library, ed. Kenneth J. Varnum (Chicago: ALA Editions, 2019), 69-85.

2 With 3D/VR technologies “a professor may take students on an immersive field trip to Stonehenge, changing the lighting to simulate various phases of solar events; an archaeologist may capture 3D scans of an archaeological excavation and share these data with a colleague on the other side of the world in the form of an immersive virtual exploration of the site; [or] a biochemistry professor may explore complex protein structures with students,” Zack Lischer-Katz et al., “3D/VR Creation and Curation: An Emerging Field of Inquiry,” in Jennifer Grayburn et al., eds., 3D/VR in the Academic Library: Emerging Practices and Trends (CLIR Report 176, February 2019), https://www.clir.org/wp-content/uploads/sites/6/2019/02/Pub-176.pdf.

3 Samuel A. Miller, Noah J. Misch, and Aaron J. Dalton, “Low-cost, Portable, Multi-wall Virtual Reality,” Eurographics Workshop (2005): 1-16, https://ntrs.nasa.gov/archive/nasa/casi.ntrs.nasa.gov/20050240930.pdf; Carolina Cruz-Neira, Daniel J. Sandin, and Thomas A. DeFanti, “Surround-screen Projection-based Virtual Reality: The Design and Implementation of the CAVE,” in Proceedings of the 20th Annual Conference on Computer Graphics and Interactive Techniques (1993): 135-42; Jeremy Bailenson, Experience on Demand: What Virtual Reality Is, How It Works, and What It Can Do (New York: W. W. Norton, 2018).

4 Mieke Pfarr-Harfst and S. Münster, “Typical Workflows, Documentation Approaches and Principles of 3D Digital Reconstruction of Cultural Heritage,” in 3D Research Challenges II (Springer, 2016), 32–46, https://doi.org/10.1007/978-3-319-47647-6_2; Pierre Alliez et al., “Digital 3D Objects in Art and Humanities: Challenges of Creation, Interoperability and Preservation” (white paper, PARTHENOS Project, May 24, 2017), https://hal.inria.fr/hal-01526713v2/document.

5 Bailenson, Experience on Demand; Zack Lischer-Katz, Matt Cook, and Kristal Boulden, “Evaluating the Impact of a Virtual Reality Workstation in an Academic Library: Methodology and Preliminary Findings,” in Proceedings of the Association for Information Science and Technology Annual Meeting (Vancouver, Nov. 2018): 300-08, https://doi.org/10.1002/pra2.2018.14505501033.
6 Ciro Donalek et al., “Immersive and Collaborative Data Visualization Using Virtual Reality Platforms,” in Proceedings of 2014 IEEE International Conference on Big Data (2014): 609-14.

7 The LIB3DVR website is available at http://lib3dvr.org. Information about the grant is available at https://www.imls.gov/grants/awarded/lg-73-17-0141-17.

8 Hermann von Helmholtz and James Powell Cocke Southall, Treatise on Physiological Optics, Vol. 3 (Courier Corporation, 2005; originally published 1867); Andries Van Dam, David H. Laidlaw, and Rosemary Michelle Simpson, “Experiments in Immersive Virtual Reality for Scientific Visualization,” Computers & Graphics 26, no. 4 (2002): 535-55; Doug A. Bowman and Ryan P. McMahan, “Virtual Reality: How Much Immersion Is Enough?” Computer 40, no. 7 (2007): 36-43; David A. Atchison and Larry N. Thibos, “Optical Models of the Human Eye,” Clinical and Experimental Optometry 99, no. 2 (2016): 99-106.

9 Colin Ware and Peter Mitchell, “Reevaluating Stereo and Motion Cues for Visualizing Graphs in Three Dimensions,” in Proceedings of the 2nd Symposium on Applied Perception in Graphics and Visualization (2005), 51-58; Tao Ni, Doug A. Bowman, and Jian Chen, “Increased Display Size and Resolution Improve Task Performance in Information-rich Virtual Environments,” in Proceedings of Graphics Interface (Quebec, Canada, June 7-9, 2006), 139-46; Andrew Forsberg et al., “A Comparative Study of Desktop, Fishtank, and CAVE Systems for the Exploration of Volume Rendered Confocal Data Sets,” IEEE Transactions on Visualization and Computer Graphics 14, no. 3 (2008): 551-63; Marta Kersten-Oertel, Sean Jy-Shyang Chen, and D. Louis Collins, “An Evaluation of Depth Enhancing Perceptual Cues for Vascular Volume Visualization in Neurosurgery,” IEEE Transactions on Visualization and Computer Graphics 20, no. 3 (2014): 391-403; Susan Jang et al., “Direct Manipulation is Better than Passive Viewing for Learning Anatomy in a Three-dimensional Virtual Reality Environment,” Computers & Education 106 (2017): 150-65.

10 Eric D. Ragan et al., “Studying the Effects of Stereo, Head Tracking, and Field of Regard on a Small-scale Spatial Judgment Task,” IEEE Transactions on Visualization and Computer Graphics 19, no. 5 (2013): 886-96; Bireswar Laha, Doug A. Bowman, and John J. Socha, “Effects of VR System Fidelity on Analyzing Isosurface Visualization of Volume Datasets,” IEEE Transactions on Visualization & Computer Graphics 4 (2014): 513-22.

11 Victoria Szabo, “Collaborative and Lab-Based Approaches to 3D and VR/AR in the Humanities,” in Grayburn et al., eds., 3D/VR in the Academic Library: Emerging Practices and Trends (CLIR Report 176, February 2019), 12-23, https://www.clir.org/wp-content/uploads/sites/6/2019/02/Pub-176.pdf; Aris Alissandrakis et al., “Visualizing Dynamic Text Corpora Using Virtual Reality,” 39th Annual Conference of the International Computer Archive for Modern and Medieval English (Tampere, Finland, May 30-June 3, 2018), 205, http://www.diva-portal.org/smash/record.jsf?pid=diva2%3A1213822&dswid=2342.
12 van Dam, Laidlaw, and Simpson, “Experiments in Immersive Virtual Reality,” 535-55; Limp et al., “Developing a 3-D Digital Heritage Ecosystem.”

13 Will Rourk, “3D Cultural Heritage Informatics: Applications to 3D Data Curation,” in Grayburn et al., eds., 3D/VR in the Academic Library: Emerging Practices and Trends (CLIR Report 176, February 2019), 24-38, https://www.clir.org/pubs-reports-pub176/.

14 Bill Endres, Digitizing Medieval Manuscripts: The St. Chad Gospels, Materiality, Recoveries, and Representation in 2D & 3D (Amsterdam: Arc Humanities Press, 2019).

15 Abhishek Seth, Judy M. Vance, and James H. Oliver, “Virtual Reality for Assembly Methods Prototyping: A Review,” Virtual Reality 15, no. 1 (2011): 5-20.

16 Jeremy A. Bot and Duncan J. Irschick, “Using 3D Photogrammetry to Create Open-Access Models of Live Animals: 2D and 3D Software Solutions,” in Grayburn et al., eds., 3D/VR in the Academic Library: Emerging Practices and Trends (CLIR Report 176, February 2019), 54-72, https://www.clir.org/pubs-reports-pub176/.

17 Guido Giacalone et al., “The Application of Virtual Reality for Preoperative Planning of Lymphovenous Anastomosis in a Patient with a Complex Lymphatic Malformation,” Journal of Clinical Medicine 8, no. 3 (2019): 371.

18 Michelle E. Portman, Asya Natapov, and Dafna Fisher-Gewirtzman, “To Go Where No Man Has Gone Before: Virtual Reality in Architecture, Landscape Architecture and Environmental Planning,” Computers, Environment and Urban Systems 54 (2015): 376-84.

19 Fred Limp et al., “Developing a 3-D Digital Heritage Ecosystem: From Object to Representation and the Role of a Virtual Museum in the 21st Century,” Internet Archaeology 30 (2011): 1-38; Donalek et al., “Immersive and Collaborative Data Visualization.”

20 Bryan Carter and Aline Click, “Imagine the Real in the Virtual: Experience Your Second Life,” paper presented at the 22nd Annual Conference on Distance Teaching and Learning (Madison, WI, 2006); Sasha Barab et al., “Making Learning Fun: Quest Atlantis, a Game without Guns,” Educational Technology Research and Development 53, no. 1 (2005): 86-107.

21 Ekaterina Prasolova-Førland, Alexei Sourin, and Olga Sourina, “Cybercampuses: Design Issues and Future Directions,” Visual Computer 22, no. 12 (2006): 1,015-28; Stephen Bronack et al., “Designing Virtual Worlds to Facilitate Meaningful Communication: Issues, Considerations, and Lessons Learned,” Technical Communication 55, no. 3 (2008): 261-69; Kim Holmberg and Isto Huvila, “Learning Together Apart: Distance Education in a Virtual World,” First Monday 13, no. 10 (October 2008), https://firstmonday.org/article/view/2178/2033; Mats Deutschmann, Luisa Panichi, and Judith Molka-Danielsen, “Designing Oral Participation in Second Life: A Comparative Study of Two Language Proficiency Courses,” ReCALL 21, no. 2 (May 2009): 206-26; Diane Carr, Martin Oliver, and Andrew Burn, “Learning, Teaching and Ambiguity in Virtual Worlds,” in Researching Learning in Virtual Worlds, Anna Peachey et al., eds. (London: Springer, 2010), 17–31.
22 Chris Dede, “Immersive Interfaces for Engagement and Learning,” Science 323 (2009): 66-69; Bailenson, Experience on Demand.

23 Dede, 234.

24 Julie Milovanovic et al., “Virtual and Augmented Reality in Architectural Design and Education: An Immersive Multimodal Platform to Support Architectural Pedagogy,” paper presented at the 17th International Conference, CAAD Futures (Istanbul, Turkey, July 2017), https://hal.archives-ouvertes.fr/hal-01586746.

25 Susan Jang et al., “Direct Manipulation is Better than Passive Viewing for Learning Anatomy in a Three-dimensional Virtual Reality Environment,” Computers & Education 106 (2017): 150-65, https://doi.org/10.1016/j.compedu.2016.12.009.

26 Bailenson, Experience on Demand.

27 Lischer-Katz, Cook, and Boulden, “Evaluating the Impact of a Virtual Reality Workstation.”

28 Mina C. Johnson-Glenberg, “Immersive VR and Education: Embodied Design Principles That Include Gesture and Hand Controls,” Frontiers in Robotics and AI 5 (2018): 4, https://doi.org/10.3389/frobt.2018.00081.

29 Sven Schneider et al., “Educating Architecture Students to Design Buildings from the Inside Out,” in Proceedings of the 9th International Space Syntax Symposium (Seoul, Korea, 2013); Saskia F. Kuliga et al., “Virtual Reality as an Empirical Research Tool—Exploring User Experience in a Real Building and a Corresponding Virtual Model,” Computers, Environment and Urban Systems 54 (2015): 363-75; Elizabeth Pober and Matt Cook, “The Design and Development of an Immersive Learning System for Spatial Analysis and Visual Cognition,” paper presented at the Conference of the Design Communication Association (Bozeman, MT, 2016).

30 Schneider et al., “Educating Architecture Students,” 15.

31 Antonieta Angulo, “On the Design of Architectural Spatial Experiences Using Immersive Simulation,” in EAEA 11 Conference Proceedings, Envisioning Architecture: Design, Evaluation, Communication (Milan, Italy, 2013), 151-58.

32 Michelle Goldchain, “Virtual Reality Leads to Better Building Designs, Happier Clients, Says Architecture Firm,” Curbed - Washington, DC (March 10, 2017), https://dc.curbed.com/2017/3/10/14690200/virtual-reality-perkins-will.

33 E.g., Miguel Figueroa, “In a Virtual World: How School, Academic, and Public Libraries Are Testing Virtual Reality in Their Communities,” American Libraries 49, no. 3/4 (April 3, 2018): 26-33; Edward Iglesias, “Creating a Virtual Reality-based Makerspace,” Online Searcher 42, no. 1 (February 1, 2018): 36-39.

34 Austin Olney, “Augmented Reality: All About Holograms,” in Beyond Reality: Augmented, Virtual, and Mixed Reality in the Library, Kenneth J. Varnum, ed. (Chicago: ALA Editions, 2019), 1-16.

35 Bohyun Kim, “Virtual Reality for 3D Modeling,” in Beyond Reality: Augmented, Virtual, and Mixed Reality in the Library, Kenneth J. Varnum, ed. (Chicago: ALA Editions, 2019), 31-46.

36 Brandon Patterson et al., “Play, Education, and Research: Exploring Virtual Reality through Libraries,” in Beyond Reality: Augmented, Virtual, and Mixed Reality in the Library, Kenneth J. Varnum, ed. (Chicago: ALA Editions, 2019), 50-51.

37 Steven LaValle, Virtual Reality (London: Cambridge University Press, 2017), 348.
38 Oculus VR, LLC, “Oculus Best Practices, Version 310-30000-02,” retrieved from http://static.oculus.com/documentation/pdfs/intro-vr/latest/bp.pdf; Robert S. Kennedy, Kay M. Stanney, and William P. Dunlap, “Duration and Exposure to Virtual Environments: Sickness Curves During and Across Sessions,” Presence: Teleoperators and Virtual Environments 9, no. 5 (2000): 463-72.

39 Andre L. Delbecq, Andrew H. Van de Ven, and David H. Gustafson, Group Techniques for Program Planning: A Guide to Nominal Group and Delphi Processes (Glenview, IL: Scott, Foresman, & Co., 1975).

40 Sara S. McMillan, Michelle A. King, and Mary P. Tully, “How to Use the Nominal Group and Delphi Techniques,” International Journal of Clinical Pharmacy 38 (2016): 656, https://doi.org/10.1007/s11096-016-0257-x.

41 Eugenia M. Kolasinski, “Simulator Sickness in Virtual Environments,” Report No. ARI-TR-1027 (Alexandria, VA: Army Research Institute for the Behavioral and Social Sciences, 1995).

42 Lisa Castaneda, Anna Cechony, and Arabella Bautista, “Applied VR in the Schools, 2016-2017 Aggregated Report,” Foundry 10 (2017), http://fineduvr.fi/wp-content/uploads/2017/10/All-School-Aggregated-Findings-2016-2017.pdf.

43 More information about the Xbox Adaptive Controller can be found here: https://www.xbox.com/en-US/xbox-one/accessories/controllers/xbox-adaptive-controller.

44 Information about upcoming VR hardware releases can be found here: https://www.roadtovr.com/simple-guide-oculus-quest-rift-s-valve-index-hp-reverb-comparison/.

45 E.g., see Prof. William Endres’s work on scanning Medieval manuscripts at Lichfield Cathedral, https://lichfield.ou.edu/content/imaging.

46 Dian Schaffhauser, “Multi-Campus VR Session Tours Remote Cave Art,” Campus Technology (Oct. 9, 2017), https://campustechnology.com/articles/2017/10/09/multi-campus-vr-session-tours-remote-cave-art.aspx.

47 Matt Cook, “Virtual Serendipity: Preserving Embodied Browsing Activity in the 21st Century Research Library,” The Journal of Academic Librarianship 44, no. 1 (Jan. 2018): 145-49, https://doi.org/10.1016/j.acalib.2017.09.003.

48 See Cook and Lischer-Katz, “Integrating 3D and Virtual Reality into Research and Pedagogy,” for a discussion of the VR “sandbox” platform developed at University of Oklahoma Libraries, the Oklahoma Virtual Academic Laboratory (OVAL). More information about OVAL can be found here: https://libraries.ou.edu/content/virtual-reality-ou-libraries.

49 E.g., Matt Cook and Betsy Van der Veer Martens, “Managing Exploratory Units in Academic Libraries,” Journal of Library Administration 59, no. 6 (2019): 1-23, https://doi.org/10.1080/01930826.2019.1626647.
11077 ---- Use of Language-Learning Apps as a Tool for Foreign Language Acquisition by Academic Libraries Employees Kathia Ibacache

Kathia Ibacache (kathia.ibacache@colorado.edu) is the Romance Languages Librarian at the University of Colorado Boulder.

ABSTRACT

Language-learning apps are becoming prominent tools for self-learners. This article investigates whether librarians and employees of academic libraries have used them and whether the content of these language-learning apps supports the foreign language knowledge needed to fulfill library-related tasks. The research is based on a survey sent to librarians and employees of the University Libraries of the University of Colorado Boulder (UCB), two professional library organizations, and randomly selected employees of 74 university libraries around the United States. The results reveal that librarians and employees of academic libraries have used language-learning apps.
However, there is an unmet need for language-learning apps that cover broader content, including reading comprehension and other foreign language skills suitable for academic library work.

INTRODUCTION

The age of social media and the advances in mobile technologies have changed the manner in which we connect, socialize, and learn. As humans are curious and adaptive beings, the moment mobile technologies provided apps to learn a foreign language, it was natural that self-regulated learners would immerse themselves in them. Language-learning apps' practical nature, as an informal educational tool, may attract self-learners such as librarians and employees of academic libraries to utilize this technology to advance foreign language knowledge usable in the workplace. The academic library employs a wide spectrum of specialists, from employees offering research consultations, reference help, and instruction, to others specializing in cataloging, archives, acquisitions, and user experience. Regardless of the library work, employees utilizing a foreign language possess an appealing skill, as knowing a foreign language heightens the desirability of employees and strengthens their job performance. In many instances, librarians and employees of academic libraries may be required to have reading knowledge of a foreign language. Therefore, for these employees, acquiring knowledge of a foreign language might be paramount to delivering optimal job performance. This study aims to answer the following questions: (1) Are librarians and employees of academic libraries using language-learning apps to support foreign language needs in their workplace? and (2) Are language-learning apps addressing the needs of librarians and employees of academic libraries? For purposes of this article, mobile language apps are those accessed through a website as well as apps downloaded onto portable smartphones, tablets, desktops, and laptops.

BACKGROUND

Mobile-assisted language learning (MALL) has a user-centered essence that resonates with users in the age of social media. Librarians and employees of academic libraries needing a foreign language to fulfill work responsibilities are a target group that can benefit from using language-learning apps. These apps provide a multifaceted capability that offers time and space flexibility and an adaptability that suits the changeable environment favored by self-learners. Kukulska-Hulme states that it is customary to have access to learning resources through mobile devices.1 For those individuals working in academic libraries, language-learning apps may present an opportunity to pursue a foreign language while accommodating their self-learning style, time availability, space, and choice of device. Considering the features of language-learning apps, some have a more personal quality, where the device interacts with one user, while other apps emulate social media characteristics, connecting a wide array of users. For instance, users learning a language through the HelloTalk app can communicate with native speakers all around the world. Through this app, language learners can send voice notes, send corrections to faulty grammar, and use the built-in translator feature. Therefore, language-learning apps may not only provide self-learners a vehicle to communicate remotely but also a way to interact using basic conversational skills in a given language.
In the case of those working in academic libraries, this human connectedness among users may not be as relevant as the interactive nature of the device, its mobility, the convenience of virtual learning, and the flexibility of the mobile technology. Kukulska-Hulme notes that the ubiquity of mobile learning is affecting the manner in which one learns.2 Although there is abundant literature on mobile language technologies and their usefulness for students' language learning at different school levels, including higher education, scholarship regarding the use of language-learning apps by professionals is scarce.3 Broadbent refers to self-regulated learners as those who plan their learning through goals and activities.4 The author concurs that to engage in organized language learning through a language-learning app, one should have some level of organizational learning or, at a minimum, enough motivation to engage in self-learning. In this context, some scholars believe that the level of self-management of learning will determine the level of learning success.5 Moreover, learners who possess significant personal learning initiative (PLI) have the foundation to accomplish learning outcomes and overcome difficulties.6 PLI may be one factor affecting learners' motivation to learn a language in a virtual environment, away from the formal classroom setting. This learning initiative may play a significant role in the learning process, as it may influence the level of engagement and positive learning outcomes. In terms of learning outcomes, language software developers may also play a role by adapting and broadening content based on learning styles and by considering the elements that would provide a meaningful user experience. In this sense, Bachore conveys that there is a need to address language-learning styles when using mobile devices.7 Bachore also notes that as interest in mobile language learning increases, so do the different manners in which mobile devices are used to implement language learning and instruction.8 Similarly, Louhab refers to context dimensions as the parameters in mobile learning that consider learners' individuality in terms of where the learning takes place, individual personal qualities and learning needs, and the features of their mobile device.9 Bradley also suggests that learning is part of a dialogue between learners and their devices, within a sociocultural context where thinking and learning occur.10 In addition, Bradley infers that users are considered when creating learning activities and when improving them.11 For these reasons, some researchers address the need to focus on accessibility and on developing content designed for different types of users, including differently abled learners.12 Furthermore, adaptation according to the learner's style may be considered a pivotal quality of language-learning apps as software developers try to bridge the gap between formal instruction and a learner-oriented mobile learning platform. Undoubtedly, the technological gap, which includes the cost of the device, interactivity, screen size, and input capabilities, among others, matters when implementing language learning supported by mobile technologies. However, learning style is only one aspect of the equation. A learner's need is another.
For example, the needs of a learner who seeks to acquire familiarity with a foreign language because of an upcoming vacation may be substantially distinct from the needs of professionals such as academic librarians, who may need reading, writing, or even speaking proficiency in a given language. A user-centered approach in language-learning software design may advance the adequacy of these apps, connecting them with a much wider set of learning needs. Referring to mobile apps for language learning, Godwin-Jones asserts that while the capability of devices is relevant, software development is paramount to the educational process.13 Therefore, language-learning software developers may consider creating learning activities that target basic foreign language-learning needs as well as more tailored ones suitable for people who require different content. Kukulska-Hulme refers to "design for learning" as creating structured activities for language learning.14 Although language-learning apps appear to rely on learning activities built around basic foreign language-learning needs, their developers should draw more on learners' evaluative insights to advance software development that meets learners' specific needs. Although mobile technologies as a general concept will continue to evolve, their mobile nature will likely keep the focus on a user experience that satisfies those who prefer the freedom of informal learning.

METHODOLOGY

Instrument

The author used a 26-question Qualtrics survey approved by the Institutional Review Board at the University of Colorado Boulder (UCB). The survey was open for eight weeks and received 199 total responses; however, the number of responses to each question varied depending on the question. The data collected was both quantitative and qualitative in nature, seeking to capture respondents' perspectives as well as measurable data that could be used for statistics. The survey consisted of twelve general questions for all respondents who reported working in an academic library, then branched into either nine questions for respondents who had used a language-learning app or five questions for those who had not. The respondents answered via text fields, standard single- and multiple-choice questions, and a single-answer Likert matrix table. Qualtrics provided a statistical report, which the author used to analyze the data and create the figures.

Participants

The survey was distributed through an email to librarians and employees of UCB's University Libraries. The author also identified 74 university libraries in the United States from a list of members of the Association of Research Libraries, and distributed the survey via email to ten randomly selected library employees from each of these libraries.15 The recipients included catalogers, subject specialists, archivists, and others working in metadata, acquisition, reference, and circulation. In addition, the survey was distributed to the listservs of two library organizations: the Seminar on the Acquisition of Latin American Library Materials (SALALM) and Reforma, the National Association to Promote Library and Information Services to Latinos and the Spanish Speaking. These organizations were chosen due to their connection with foreign languages.

RESULTS

Use of Foreign Language at Work

Of the respondents, 172 identified as employees of academic libraries (66 percent).
Of these, a significant percentage reported using a foreign language in their library work. The respondents belonged to different generational groups; however, most were in the age groups of 30-39 and 40-49 years old. The respondents performed a variety of duties within the categories presented. Due to incomplete survey results, varying numbers of responses were collected for each question. Of 110 respondents, 82 identified their gender as female. In addition, of 105 respondents, 62 percent reported being subject specialists, 56 percent worked in reference, 54 percent identified as instruction librarians, 30 percent worked in cataloging and metadata, 30 percent worked in acquisition, 10 percent worked in circulation, 2 percent worked in archiving, and 23 percent reported doing "other" types of library work.

Figure 1. Age of respondents (n=109): 20-29 years old, 9.17%; 30-39 years old, 29.36%; 40-49 years old, 30.28%; 50-59 years old, 12.84%; 60 years or older, 18.35%.

Figure 2. Foreign language skills respondents used at work (multiple responses allowed, n=106): reading, 102; writing, 65; speaking, 49; listening, 49.

As shown in figure 2, respondents used different foreign language skills at work; however, reading was used with significantly more frequency. When asked, "How often do you use a foreign language at work?," 38 respondents out of 105 used it daily, 29 used it weekly, and 21 used it monthly. In addition, table 1 shows that a large percentage of respondents noted that knowing a foreign language helped them with collection development tasks and reference services. The respondents who chose "other" stated in a text field that knowing a foreign language helped them with translation tasks, building management, creating a welcoming environment, attending to foreign guests, communicating with vendors, researching, processing, and having a broader perspective of the world emphasizing empathy. These respondents also expressed that knowing a foreign language helped them to work with materials in other languages and on digital humanities projects, and to offer library tours and outreach to the community.

Table 1. Types of librarian work benefiting from knowledge of a foreign language (multiple responses allowed, n=104).
Type of librarian work: expressed benefit (%)
Collection development: 61.5
Reference: 57.6
Communication: 56.7
Instruction: 41.3
Cataloging and metadata: 41.3
Acquisition: 40.3
Other: 19.2

Figure 3. Languages respondents studied using an app (multiple responses allowed, n=51): Spanish, 22; French, 13; Portuguese, 13; German, 9; Italian, 8; Japanese, 5; other, 26.

As shown in figure 3, Spanish was the most prominent language studied; thirteen out of 51 respondents studied French, and thirteen studied Portuguese. Additionally, respondents stated in the text field "other" that they have also used these apps to study English, Mandarin, Arabic, Malay, Hebrew, Swahili, Korean, Navajo, Turkish, Russian, Greek, Polish, Welsh, Indonesian, Thai, and Tamil. Regardless, apps were not the sole means of language acquisition. Some respondents specified using books, news articles, Pimsleur CDs, television shows, internet radio, conversations with family members and native speakers, formal instruction, websites, dictionaries, online tutorials, audio tapes, online laboratories, flashcards, podcasts, movies, and YouTube videos.
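Because the percentages for these multiple-response questions are computed against the number of respondents who answered each question, rather than summing to 100, it may help to see the arithmetic spelled out. The sketch below is a hypothetical reconstruction in Python; the filename and column names are invented for illustration, since the raw export is not published, and a Qualtrics export typically stores one column per checkbox option.

```python
# Hypothetical tabulation of a multiple-response question (cf. figure 2 and
# table 1). The filename and column names are invented; in a Qualtrics CSV
# export, each checkbox option is its own column, blank when unselected.
import pandas as pd

df = pd.read_csv("survey_export.csv")
skills = ["reading", "writing", "speaking", "listening"]

# Count respondents who selected at least one option (the per-question n).
answered = df[skills].notna().any(axis=1).sum()

for skill in skills:
    count = df[skill].notna().sum()
    print(f"{skill}: {count} of {answered} ({100 * count / answered:.1f}%)")
```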
Over a third of 49 respondents used a language-learning app for 30 hours or more, and less than a quarter used one for between 11 and 30 hours. Concerning the device preferred for accessing the apps, most of the respondents utilized a smartphone (63.27 percent), followed by a laptop (16.33 percent) and a tablet (14.29 percent). Table 2 shows the elements of language-learning apps that 48 respondents found most satisfactory. They selected "learning in own time and space" as the most desired element, followed by "vocabulary" and "translation exercises." Participants were less captivated by "pronunciation capability" (29.1 percent) and "dictionary function" (16.6 percent).

Table 2. Most satisfactory aspects of language-learning apps (multiple responses allowed, n=48).
Element of a language-learning app: percentage finding satisfactory (%)
Learning in own time and space: 64.5
Vocabulary: 56.2
Translation exercises: 56.2
Making mistakes without feeling embarrassed: 54.1
Responsive touch screen: 52
Self-testing: 52
Reading and writing exercises: 43.7
Game-like features: 37.5
Voice recognition capability: 37.5
Comfortable text entry: 37.5
Grammar and verb conjugation exercises: 35.4
Pronunciation capability: 29.1
Dictionary function: 16.6

Figure 4. Most unsatisfactory elements of language-learning apps (n=30): content, 13; flexibility/interface, 10; grammar, 5; payment, 2.

Conversely, 30 respondents described unsatisfactory elements on the survey. These elements were grouped into the categories shown in figure 4. The elements were payment restrictions, lack of grammatical explanations, monocentric content focused on travel, vocabulary-centric content (although opinions were varied on this issue), and poor interfaces. Respondents also mentioned a lack of flexibility that inhibited learners from reviewing earlier lessons or moving forward as desired, unfriendly interfaces, and limited scope. Other respondents alluded to technical issues with keyboard diacriticals, non-intuitive software, and repetitive exercises. While these elements relate to the language apps themselves, one respondent mentioned missing human interaction, and another reported the lack of a system to prompt learners to be accountable for their own learning process.

Figure 5. Reasons participants had not used a language-learning app (multiple responses allowed, n=53): other, 54.71%; lack of time, 37.73%; prefer traditional setting, 32.07%; screen too small, 1.88%.

Figure 5 shows that, among the specific reasons listed, time restriction (i.e., availability of time to use the app) was the most prevalent reason why respondents had not used a language-learning app. However, a larger percentage of respondents answered "other" to expand on the reason they had not tried this technology. The explanations provided included missing competent content for work; already having sufficient proficiency; preferring books, dictionaries, Google Translate, and podcasts; lacking interest; and having different priorities. Similarly, when asked whether they would use a language-learning app if given an opportunity, a large percentage of 52 respondents answered "maybe" (65.38 percent). However, when 51 respondents answered the question "What elements facilitated your language learning?," 66.6 percent responded that they preferred having an instructor, 54.9 percent liked being part of a classroom, and 41.1 percent liked language activities with classmates.
DISCUSSION

Library Employee Use of Language-learning Apps

The data revealed that a large number of respondents used a foreign language in their library work, reporting that reading and writing were the most needed skills. However, only about half of the respondents had used a language-learning app. Therefore, there appears to be interest in language-learning apps, but use is not widespread at this time. Overall, respondents felt language-learning apps did not offer a curriculum that supported foreign language enhancement for the workplace, especially the academic library workplace. This factor may be one reason why respondents stopped using the apps and why this technology was not utilized more extensively. Interestingly, the majority of the respondents were in their thirties and forties. One may surmise that young Millennials in their twenties would be more inclined to use language-learning apps; however, the data showed a slight lead by respondents in their forties. This information may corroborate the author's inference that generational distinctions among employees of academic libraries do not limit the ability to seek, and even prefer, learning another language through apps. Moreover, a Pew Research Center study showed that generations older than Millennials have welcomed technology, and Gen Xers even had a 10 percent lead over Millennials in tablet ownership.16 Referring to the device used to interact with the language app, most respondents preferred a smartphone; only a smaller fraction preferred a tablet, laptop, or desktop. This data may attest to the mobility of language-learning apps preferred by self-learners and to the notion that language learning may happen outside the classroom setting. However, while smartphones provide ubiquity and a sense of independence, so can tablets. Therefore, what is it about smartphones that ignites preference from a user experience perspective? Is it their ability to make calls, portability, fast processors, Wi-Fi signal, or cellular connectivity that makes a difference? Since tablets can also be considered portable, and their larger screens and web-surfing capabilities are desirable assets, is it the "when and where" that determines the device? While not all respondents reported using an app to learn a language, those who did expressed satisfaction with learning in their own space and time and with translation exercises. Nevertheless, it is striking that few respondents deemed important the ability of the software to help learners with the phonetic aspect of the language. This diminished interest in pronunciation may be connected with the type of language learning needed in the academic library profession. As respondents indicated, language-learning apps tend to focus on conversational skills rather than reading and text comprehension. In addition to those respondents who used an app to learn a new language, one respondent reported reinforcing skills in a language already acquired. A compelling matter to consider is the frequency with which respondents utilize a foreign language in their work. About a third of the respondents used a foreign language at work on a daily basis, and approximately a quarter used it weekly. This finding reveals that foreign language plays a significant role in academic library work.
Since the respondents fulfilled different responsibilities in their library work, one may deduce that foreign language is utilized in a variety of settings beyond strictly desk tasks. In fact, as stated before, respondents reported using foreign language for multiple tasks, including communicating with vendors and foreign guests as well as providing a welcoming environment. Even though 59 respondents stated that knowing a foreign language helped them with communication, respondents appeared to be more concerned with reading comprehension and vocabulary. It is likely reading comprehension was ranked high in importance since library jobs that require foreign language knowledge tend to utilize reading comprehension skills widely. Nonetheless, the author wonders whether subject specialists utilize more skills related to listening and communication in a foreign language, especially those librarians who provide instruction. Therefore, it is curious that they did not prioritize these skills. Perhaps this topic could be the subject of future research. Notwithstanding these results, language-learning apps appear to center on content that improves listening and basic communication instead of reading comprehension. Therefore, the question remains as to whether mobile language apps have enough capabilities to provide a relevant learning experience to librarians and staff working in academic libraries.

Are Language-Learning Apps Responding to the Language Needs of Employees Working in Academic Libraries?

The survey results indicate that language-learning apps are not sufficiently meeting respondents' foreign language needs. Qualitative data showed that there may be several elements affecting the compatibility of language-learning apps with the needs of employees working in academic libraries; however, the findings were not conclusive due to the limited number of responses. When respondents were asked to identify the unsatisfactory elements in these apps, 65.9 percent of 47 respondents found an issue with language-learning apps, but 23 percent of those respondents answered "none." According to respondents, the main problems with the apps were a lack of content and scope suitable for employees of academic libraries, along with shortcomings in flexibility and grammar. Perhaps mobile language-app developers assume that some learners still use a formal classroom setting for foreign language acquisition, and therefore leave more advanced curriculum to that setting. It is also possible that developers consider the market that centers on travel and basic conversation more dominant; this may explain why these apps do not address foreign language needs at the professional level. Finally, these academic library employees appear to perceive a need for these apps to explore and offer a curriculum and learning activities that benefit those seeking deeper knowledge of a language.

CONCLUSION

Mobile language learning has changed the approach to language acquisition. Its mobility, portability, and ubiquity have established a manner of instruction that provides a sense of freedom and self-management that suits self-learners. Moreover, as app technology has progressed, features have been added to devices that facilitate a more meaningful user experience with language-learning apps.
Employees of academic libraries who have used foreign language-learning apps are cognizant of the language-learning activities that support their foreign language needs for work, such as reading comprehension and vocabulary. However, language-learning apps appear to market to conversational needs, providing exercises that focus on travel more than lessons that center on reading comprehension and deeper areas of language knowledge. This indicates a lack of language-learning content that would be more appropriate for those working in academic libraries. Finally, academic library employees who require a foreign language in their work are a target group that may benefit from mobile language learning. Presently, this target group feels language-learning apps are too basic to cover broader professional needs. Therefore, as language-learning app developers consider service to wider groups of people, it would be beneficial for these apps to expand their lesson structure and content to address the needs of academic library professionals.

ENDNOTES

1 Agnes Kukulska-Hulme, "Will Mobile Learning Change Language Learning?," ReCALL 21, no. 2 (2009): 157, https://doi.org/10.1017/S0958344009000202.

2 Ibid., 158.

3 See Florence Martin and Jeffrey Ertzberger, "Here and Now Mobile Learning: An Experimental Study on the Use of Mobile Technology," Computers & Education 68 (2013): 76-85, https://doi.org/10.1016/j.compedu.2013.04.021; Houston Heflin, Jennifer Shewmaker, and Jessica Nguyen, "Impact of Mobile Technology on Student Attitudes, Engagement, and Learning," Computers & Education 107 (2017): 91-99, https://doi.org/10.1016/j.compedu.2017.01.006; Yoon Jung Kim, "The Effects of Mobile-Assisted Language Learning (MALL) on Korean College Students' English-Listening Performance and English-Listening Anxiety," Studies in Linguistics, no. 48 (2018): 277-98, https://doi.org/10.15242/HEAIG.H1217424; Jack Burston, "The Reality of MALL: Still on the Fringes," CALICO Journal 31, no. 1 (2014): 103-25, https://www.jstor.org/stable/calicojournal.31.1.103.

4 Jaclyn Broadbent, "Comparing Online and Blended Learner's Self-Regulated Learning Strategies and Academic Performance," Internet and Higher Education 33 (2017): 24, https://doi.org/10.1016/j.iheduc.2017.01.004.

5 Rui-Ting Huang and Chung-Long Yu, "Exploring the Impact of Self-Management of Learning and Personal Learning Initiative on Mobile Language Learning: A Moderated Mediation Model," Australasian Journal of Educational Technology 35, no. 3 (2019): 118, https://doi.org/10.14742/ajet.4188.

6 Ibid., 121.

7 Mebratu Mulato Bachore, "Language through Mobile Technologies: An Opportunity for Language Learners and Teachers," Journal of Education and Practice 6, no. 31 (2015): 51, https://files.eric.ed.gov/fulltext/EJ1083417.pdf.

8 Ibid., 50.

9 Fatima Ezzahraa Louhab, Ayoub Bahnasse, and Mohamed Talea, "Considering Mobile Device Constraints and Context-Awareness in Adaptive Mobile Learning for Flipped Classroom," Education and Information Technologies 23, no. 6 (2018): 2608, https://doi.org/10.1007/s10639-018-9733-3.

10 Linda Bradley, "The Mobile Language Learner: Use of Technology in Language Learning," Journal of Universal Computer Science 21, no. 10 (2015): 1270, http://jucs.org/jucs_21_10/the_mobile_language_learner/jucs_21_10_1269_1282_bradley.pdf.

11 Ibid.
12 Tanya Elias, "Universal Instructional Design Principles for Mobile Learning," The International Review of Research in Open and Distance Learning 12, no. 2 (2011): 149, https://doi.org/10.19173/irrodl.v12i2.965.

13 Robert Godwin-Jones, "Emerging Technologies: Mobile Apps for Language Learning," Language Learning & Technology 15, no. 2 (2011): 3, http://dx.doi.org/10125/44244.

14 Kukulska-Hulme, "Will Mobile Learning Change Language Learning?," 158.

15 "Membership: List of ARL Members," Association of Research Libraries, accessed April 5, 2019, https://www.arl.org/membership/list-of-arl-members.

16 Jingjing Jiang, "Millennials Stand Out for Their Technology Use," Pew Research Center (2018), https://www.pewresearch.org/fact-tank/2018/05/02/millennials-stand-out-for-their-technology-use-but-older-generations-also-embrace-digital-life/.

11091 ---- Editorial Board Thoughts: Digital Faculty Development Cinthya Ippoliti

Cinthya Ippoliti (cinthya.ippoliti@ucdenver.edu) is Director, Auraria Library, Colorado.

The role of libraries within faculty development is not a new concept. Librarians have offered workshops and consultations for faculty on everything from designing effective research assignments to scholarly impact and open educational resources. In recent months, however, both ACRL and EDUCAUSE have highlighted new expectations for faculty to develop skills in supporting students within a digital environment. As part of ACRL's "Keeping Up With…" series, Katelyn Handler and Lauren Hays1 discuss the rise of faculty learning communities that cover topics such as universal design, instructional design, and assessment. Effective teaching has also recently become the focus of many institutions' efforts to increase student success and retention, and faculty play a central role in students' academic experience.
In addition, the EDUCAUSE Horizon Report echoes these sentiments, positing that "the role of full-time faculty and adjuncts alike includes being key stakeholders in the adoption and scaling of digital solutions; as such, faculty need to be included in the evaluation, planning, and implementation of any teaching and learning initiative."2 Finally, Maha Bali and Autumm Caines mention that "when offering workshops and evidence-based approaches, educational development centers make decisions on behalf of educators based on what has worked in the past for the majority."3 They call for a new model that blends digital pedagogy, identity, networks, and scholarship, where the experience is focused on "participants negotiating multiple online contexts through various online tools that span open and more private spaces to create a networked learning experience and an ongoing institutionally based online community."4 So how does the library fit into this context? What we are talking about here goes far beyond merely providing access to tools and materials for faculty. It requires a deep tripartite partnership among educators, the centers for faculty development, and the library, as each partner brings something unique to the table that cannot be covered by one area alone. The interesting element here is a dichotomy where this type of engagement can span both in-person and virtual environments, as faculty utilize both to teach and to connect with colleagues as part of their own development. The lines between these two worlds suddenly blur, and it is experience and connectivity that are at the center of the interactions rather than the tools themselves. While librarians may not be able to provide direct support in terms of instructional technologies, they can certainly inform efforts to integrate open and critical pedagogy and scholarship into faculty development programming and into the curriculum. Libraries can take the lead on providing the theoretical foundation and application for these efforts, while the specifics of tools and approaches can be covered by other entities. Bali and Caines also observe that bringing together disparate teaching philosophies and skill sets under this broader umbrella of digital support and pedagogy can help provide professional development opportunities for faculty, especially adjuncts, who may not have the ability to participate otherwise. This opportunity can act as a powerful catalyst to influence their teaching by implementing, and therefore modeling, a best-practices approach so that they are thinking about bringing students together in a similar fashion even if they are not teaching exclusively online, but especially if they are.5 Open pedagogy can accomplish this in a variety of ways. Bronwyn Hegarty defines eight areas that constitute open pedagogy: (1) participatory technologies; (2) people, openness, and trust; (3) innovation and creativity; (4) sharing ideas and resources; (5) connected community; (6) learner generated; (7) reflective practice; and (8) peer review.6 These elements are applicable to faculty development practices as well as pedagogical ones. Just as faculty might interact with one another in this manner, so can they collaborate with their students utilizing these methods. By being able to change the course materials and think about the ways in which those activities shape their learning, students can view the act of repurposing information as a way to help them define and achieve their learning goals.
This highlights the fact that an environment where this is possible must exist as a starting point, and it also underlines the importance of the instructor's role in fostering this environment. Having a cohort of colleagues, for both instructors and students, can "facilitate student access to existing knowledge, and empower them to critique it, dismantle it, and create new knowledge."7 This interaction emphasizes a two-way experience where both students and instructors can learn from one another. This is very much in keeping with the theme of digital content, as by the very nature of these types of activities, the tools and methods must lend themselves to being manipulated and repurposed, and this can only occur in a digital environment. Finally, in a recent posting on the Open Oregon blog, Silvia Lin Hanick and Amy Hofer discuss how open pedagogy can also influence how librarians interact with faculty and students. Specifically, they state that "open education is simultaneously content and practice"8 and that by integrating these practices into the classroom, students learn about issues such as intellectual property and the value of information by acting "like practitioners"9 where they take on "a disciplinary perspective and engage with a community of practice."10 This is a potentially pivotal element to take into consideration when analyzing the landscape of library-related instruction, because it frees the librarian from feeling as if everything rests on that one-time instructional opportunity. The development of a community of practitioners that includes the students, faculty, and the librarian has the potential to provide learning opportunities along the way. Including the librarian as part of this model makes sense not only as a way to signal the critical role the librarian plays in the classroom, but also as a way to stress that thinking about, and practicing, library-related activities is (or should be) as much a part of the course as any other exercise.
9 “Opening the Framework.” 10 “Opening the Framework.” 11093 ---- 20190615 11093 galley LITA President’s Message Moving Forward with LITA Bohyun Kim INFORMATION TECHNOLOGY AND LIBRARIES | JUNE 2019 2 Bohyun Kim (bohyun.kim.ois@gmail.com) is LITA President 2018-19 and Chief Technology Officer & Associate Professor, University of Rhode Island Libraries, Kingston, RI. I am happy to share some updates on what I covered in my previous column. First of all, I am excited to report that the merger planning of LITA, ALCTS, and LLAMA is back on track. The merger planning had been temporarily put on hold due to the target date for the merger being delayed from Fall 2019 to Fall 2020, as announced earlier this year. After taking some time after the 2019 ALA Midwinter Meeting, the current leadership of LITA, ALCTS, and LLAMA met, reviewed the work that we have accomplished so far, and decided that the remaining work will now go to the capable hands of the President-Elects of LITA, ALCTS, and LLAMA, who were elected this April. During their term, this new cohort of President-Elects will build on the work done by the cross-divisional working groups, in order to present the three-division merger for the membership vote in Spring 2020 with more details. Another piece of good news is that LITA, ALCTS, and LLAMA will begin experimenting with joint programming in order to kickstart our collaboration while the merger planning continues. The LITA Board decided to hold the next LITA Forum in Fall 2020. ALCTS is also planning for its second virtual ALCTS Exchange to take place in Spring 2020. LITA, ALCTS, and LLAMA will work together on both program committees of the LITA Forum and the ALCTS Exchange to provide a wider and more interesting range of programs at both conferences. If the membership vote result is in favor of the three-division merger, then the new division will be officially formed in Fall 2020, and the planned 2020 LITA Forum may become the first conference of the new division. Shortly after the 2019 ALA Midwinter Meeting, the LITA Board decided to commit funds to create and disseminate an online allyship training to address the issues aggressive behavior, racism, and harassment reported at the Midwinter Meeting.1 Since then, the LITA staff and the LITA Board of Directors have been closely working with the ALA office and several other divisions, ALCTS, ALSC, ASGCLA, PLA, RUSA, and United, reviewing options. It is likely that this training will follow the “train-the-trainer” model, in order to generate and expand the pool of allyship trainers who will develop and run the LITA’s online allyship training for LITA members. Our goal is to expand our collective capacity to strengthen active and effective allyship, recognize and undo oppressive behaviors and systems, and promote the practice of cultural humility, which requires ongoing efforts, not just a one-time event. We hope to be able to announce more details soon once the final plan is determined. I would also like to highlight the LITA award winners who will be celebrated at the 2019 ALA Annual Conference in Washington D.C. 
and to thank the members of the award committees for their hard work.2 The 2019 LITA/Ex Libris Student Writing Award will go to Sharon Han, a Master of Science in Library and Information Science candidate at the University of Illinois School of Information Sciences, for her paper, "Weathering the Twitter Storm: Early Uses of Social Media as a Disaster Response Tool for Public Libraries During Hurricane Sandy," which is included in this issue. Charles McClure and John Price Wilkin were selected as the 2019 winners of the LITA/OCLC Frederick G. Kilgour Award for Research in Library and Information Technology and the Hugh C. Atkinson Memorial Award sponsored by ACRL, ALCTS, LLAMA, and LITA, respectively. Charles McClure is the Francis Eppes Professor of Information Studies in the School of Information and the Director of the Information Use Management and Policy Institute at Florida State University. John Price Wilkin is the Juanita J. and Robert E. Simpson Dean of Libraries at the University of Illinois at Urbana-Champaign. The North Carolina State University Libraries will receive the 2019 LITA/Library Hi Tech Award for Outstanding Communication in Library and Information Technology, which recognizes outstanding individuals or institutions for their long-term contributions to the education of the Library and Information Science technology field and is sponsored by LITA and Emerald Publishing. Other not-to-be-missed LITA highlights at the 2019 ALA Annual Conference in Washington, D.C., include the LITA Top Tech Trends program, widely known for its insightful overview of emerging technologies; the LITA President's Program with Meredith Broussard, a data journalist and the author of Artificial Unintelligence: How Computers Misunderstand the World,3 as the speaker; and the LITA Happy Hour, a lively social gathering of all library technologists and technology enthusiasts. The LITA Avram Camp is also preparing another terrific all-day program of discussion and activities this year for women and non-binary library technologists to examine shared challenges, to network, and to support one another. The LITA Imagineering Interest Group has put together another fantastic program, "Agency, Consent, and Power in Science Fiction and Fantasy," featuring four sci-fi authors: Sarah Gailey, Malka Older, John Scalzi, and Martha Wells. The LITA Membership Committee is also preparing a virtual LITA Kickoff orientation for those who are newly attending the ALA Annual Conference. In this last column that I write as the LITA President, I would like to express my sincere gratitude to the dedicated LITA Board of Directors, the always fantastic LITA staff, and the many LITA leaders and members whose creativity, passion, and energy continue to drive LITA forward. Serving as the chief elected officer of one of the leading membership associations in library technology has been a true honor, and having such a great team of people to work with has been of tremendous help to me in tackling many daunting tasks. It is often said that all LITA Presidents face unique challenges during their terms. I can say that this has certainly been true during my term. Working together with the ALCTS and the LLAMA leadership on the three-division merger was a valuable experience and a privilege. While we could not move things as quickly as we hoped, we have built a great foundation for the next phase of the planning and learned many things together along the way.
Last but not least, I would like to thank everyone who stood for the election and congratulate all newly elected LITA officers: Evviva Weinraub as President-Elect, Hong Ma and Galen Charlton as Directors-at-Large, and Jodie Gambill as LITA Councilor. I am confident that, led by the incoming LITA President, Emily Morton-Owens, the capable and dedicated LITA leadership will continue to accomplish many great things with energetic and forward-thinking LITA members in the coming years. The future of LITA is brighter with these new LITA leaders. Good luck and thank you for your service!

ENDNOTES

1 "LITA's Statement in Response to Incidents at ALA Midwinter 2019," LITA Blog, February 4, 2019, https://litablog.org/2019/02/litas-statement-in-response-to-incidents-at-ala-midwinter-2019/.

2 "LITA Awards & Scholarships," Library and Information Technology Association (LITA), http://www.ala.org/lita/awards.

3 Meredith Broussard, Artificial Unintelligence: How Computers Misunderstand the World (Cambridge, Massachusetts: The MIT Press, 2018).

11101 ---- From Digital Library to Open Datasets: Embracing a "Collections as Data" Framework Rachel Wittmann, Anna Neatrour, Rebekah Cummings, and Jeremy Myntti

Rachel Wittmann (rachel.wittmann@utah.edu) is Digital Curation Librarian, University of Utah. Anna Neatrour (anna.neatrour@utah.edu) is Digital Initiatives Librarian, University of Utah. Rebekah Cummings (rebekah.cummings@utah.edu) is Digital Matters Librarian, University of Utah. Jeremy Myntti (jeremy.myntti@utah.edu) is Head of Digital Library Services, University of Utah.

ABSTRACT

This article discusses the burgeoning "collections as data" movement within the fields of digital libraries and digital humanities. Faculty at the University of Utah's Marriott Library are developing a collections as data strategy by leveraging existing Digital Library and Digital Matters programs. By selecting various digital collections, small- and large-scale approaches to developing open datasets are explored. Five case studies chronicling this strategy are reviewed, along with testing the datasets using various digital humanities methods, such as text mining, topic modeling, and GIS (geographic information system).

INTRODUCTION

For decades, academic research libraries have systematically digitized and managed online collections for the purpose of making cultural heritage objects available to a broader audience. Making archival content discoverable and accessible online has been revolutionary for the democratization of scholarship, but the use of digitized collections has largely mimicked traditional use: researchers clicking through text, images, maps, or historical documents one at a time in search of deeper understanding.
"Collections as data" is a growing movement to extend the research value of digital collections beyond traditional use and to give researchers more flexible access to our collections by facilitating access to the underlying data, thereby enabling digital humanities research.1 Collections as data is predicated upon the convergence of two scholarly trends happening in parallel over the past several decades.2 First, as mentioned above, librarians and archivists have digitized a significant portion of their special collections, giving access to unique material that researchers previously had to travel across the country or globe to study. At the same time, an increasing number of humanist scholars have approached their research in new ways, employing computational methods such as text mining, topic modeling, GIS (geographic information system), sentiment analysis, network graphs, data visualization, and virtual/augmented reality in their quest for meaning and understanding. Gaining access to high-quality data is a key challenge of digital humanities work, since the objects of study in the humanities are frequently not as amenable to computational methods as data in the sciences and social sciences.3 Typically, data in the sciences and social sciences is numerical in nature and collected in spreadsheets and databases with the intention that it will be computationally parsed, ideally as part of a reproducible and objective study. Conversely, data (or, more commonly, "evidence" or "research assets") in the humanities is text- or image-based and is created and collected with the intention of close reading or analysis by a researcher who brings their subjective expertise to bear on the object.4 Even a relatively simple digital humanities method like identifying word frequency in a corpus of literature is predicated on access to plain text (.txt) files, high-quality optical character recognition (OCR), and the ability to bulk download the files without running afoul of copyright or technical barriers. As "The Santa Barbara Statement on Collections as Data" articulates, "with notable exceptions like the HathiTrust Research Center, the National Library of the Netherlands Data Services & API's, the Library of Congress' Chronicling America, and the British Library, cultural heritage institutions have rarely built digital collections or designed access with the aim to support computational use."5 By and large, digital humanists have not been well served by library platforms or protocols. Current methods for accessing collections data include contacting the library for direct access to the data or "scraping" data off library websites. Recently funded efforts such as the Institute of Museum and Library Services' (IMLS's) Always Already Computational and the Andrew W. Mellon Foundation's Collections as Data: Part to Whole seek to address this problem by setting standards and best practices for turning digital collections into datasets amenable to computational use and novel research methods.6
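As a rough illustration of the "scraping" route mentioned above, a harvesting script often looks something like the following sketch. Everything specific in it, the URL pattern, the item identifiers, and the CSS selector, is an assumption for illustration; no real repository's page structure is being described, and a polite scraper would also honor robots.txt and prefer any API or bulk-download option the library offers.

```python
# Illustrative scraping sketch only: the URL pattern, identifiers, and CSS
# selector are placeholders, not any real repository's structure.
import time
import requests
from bs4 import BeautifulSoup

ITEM_URL = "https://collections.example.edu/item/{item_id}"  # placeholder

def harvest_transcripts(item_ids):
    for item_id in item_ids:
        resp = requests.get(ITEM_URL.format(item_id=item_id), timeout=30)
        resp.raise_for_status()
        soup = BeautifulSoup(resp.text, "html.parser")
        transcript = soup.select_one("div.transcript")  # assumed selector
        if transcript is not None:
            with open(f"{item_id}.txt", "w", encoding="utf-8") as out:
                out.write(transcript.get_text(" ", strip=True))
        time.sleep(1)  # throttle requests to avoid burdening the server

harvest_transcripts(["1001", "1002"])  # placeholder identifiers
```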
The University of Utah J. Willard Marriott Library has a long-running digital library program and a burgeoning digital scholarship center, creating a moment of synergy for librarians in digital collections and digital scholarship to explore collaboration in teaching, outreach, and digital collection development. A shared goal between the digital library and digital scholarship teams is to develop collections as data of regional interest that could be used by researchers for visualization and computational exploration. This article will share our local approach to developing and piloting a collections as data strategy at our institution. Relying upon best practices and principles from Thomas Padilla's "On a Collections as Data Imperative," we transformed five library collections into datasets, made select data available through a public GitHub repository, and tested the usability of the data with our own research questions, relying upon expertise and infrastructure from Digital Matters and the Digital Library at the Marriott Library.7

DIGITAL MATTERS

In 2015, administration at the Marriott Library was approached by multiple colleges at the University of Utah to explore the possibility of creating a collaborative space to enable digital scholarship. While digital scholarship was happening across campus in disparate and unfocused ways, there was no concerted effort to share resources, build community, or develop a multi-college digital scholarship center with a mission and identity. After an eighteen-month planning process, the Digital Matters pop-up space was launched as a four-college partnership among the College of Humanities, College of Fine Arts, College of Architecture + Planning, and the Marriott Library. An anonymous $1 million donation in 2017 allowed the partner colleges to fund staffing and activity in the space for five years, including the hire of a Digital Matters director tasked with planning for long-term sustainability. The development of Digital Matters brings new focus, infrastructure, and partners for digital humanities research to the University of Utah and the Marriott Library. Monthly workshops, speakers, and reading groups led by digital scholars from all four partner colleges have created a vibrant community with cross-disciplinary partnerships and unexpected synergies. Close partnerships and ongoing dialogue have increased awareness for Marriott faculty, particularly those working in and collaborating with Digital Matters, of the challenges facing digital humanists and the ways in which the library community is uniquely suited to meet those needs. For example, a University of Utah researcher in the College of Humanities developed "Century of Black Mormons," a community-based public history database of biographical information and primary source documents on black Mormons baptized between 1830 and 1930.8 Working closely with the Digital Initiatives librarian and various staff and faculty at the Marriott Library, they created an Omeka S site that allows users to interact with the historical data using GIS, timeline features, and basic webpage functionality.

INSTITUTION DIGITAL LIBRARY

The University of Utah has had a robust digital library program since 2000, including one of the first digital newspaper repositories, Utah Digital Newspapers (UDN, https://digitalnewspapers.org/).
In 2016, the library developed its own digital asset management system using open-source components such as Solr, Phalcon, and NGINX after using CONTENTdm for over fifteen years.9 This new system, Solphal, has made it possible for us to implement a custom solution to manage and display a vast amount of digital content, not only for our library, but also for many partner institutions throughout the state of Utah. Our main digital library server (https://collections.lib.utah.edu/) contains over 765,000 objects in nearly 700 collections, consisting of over 2.5 million files. Solphal is also used to manage the UDN, containing nearly 4 million newspaper pages and over 20 million articles.

Digital library projects are continually evolving as we redefine our digital collection development policies, ensuring that we are providing researchers and other users the digital content that they are seeking. With such a large amount of data available in the digital library, we can no longer view our digital library as a set of unique, yet siloed, collections, but rather as a wealth of information documenting the history of the university, the state of Utah, and the American West. We are also engaged in remediating legacy metadata across the repository in order to achieve greater standardization, which could support computational usage of digital library metadata in the future. With this in mind, we are working to strategically make new digital content available on a large scale that can help researchers discover this historical content within a collections as data mindset.

Leveraging the existing Digital Library and Digital Matters programs, faculty at the Marriott Library are in the process of piloting a collections as data strategy. We selected digital collections with varying characteristics and used them to explore small- and large-scale approaches to developing datasets for humanities researchers. We then tested the datasets by employing various digital humanities methods such as text mining, topic modeling, and GIS. The five case studies below chronicle our efforts to embrace a collections as data framework and extend the research value of our digital collections.

TEXT MINING MINING TEXTS

When developing the initial collections as data project, several factors were considered to identify the optimal material for this experiment. Selecting already digitized and described material in the University of Utah Digital Library was ideal to avoid the waiting periods required for new digitization projects. The Marriott Library Special Collections’ relationship with the American West Center, an organization based at the University of Utah with the mission of documenting the history of the American West, has produced an extensive collection of oral histories, held in the Audio Visual Archive, whose typewritten transcripts yield high-quality OCR. Given the availability and readiness of this material, we built a selected corpus of mining-related oral histories, drawn from collections such as the Uranium Oral Histories and Carbon County Oral Histories. Engaging in the entire process with a digital humanities framework, we scraped our own digital library repository as though we had no special access to the back end of the system, developing a greater understanding of the process and workflows needed to build a text corpus to support a research inquiry. In this way, we extended our skills so that we would be able to scrape any digital library system should this expertise be needed in the future.
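The article does not reproduce the scraping workflow itself, but its general shape is easy to sketch. The following is a minimal illustration: the item IDs are real examples drawn from the references, but the CSS selector used to locate transcript text is an invention for illustration, since Solphal’s actual markup is not documented here.

```python
# A minimal sketch of harvesting transcript text from public item pages,
# as though we had no back-end access. The "transcript" CSS class is an
# assumed placeholder, not Solphal's real markup.
import time
import requests
from bs4 import BeautifulSoup

ITEM_IDS = ["783960", "783899"]  # example oral history item IDs
BASE_URL = "https://collections.lib.utah.edu/details?id={}"

for item_id in ITEM_IDS:
    response = requests.get(BASE_URL.format(item_id), timeout=30)
    response.raise_for_status()
    soup = BeautifulSoup(response.text, "html.parser")
    node = soup.find("div", class_="transcript")  # assumed selector
    if node is not None:
        with open(f"{item_id}.txt", "w", encoding="utf-8") as fh:
            fh.write(node.get_text(separator="\n"))
    time.sleep(1)  # throttle requests out of politeness to the server
```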
The extensive amount of text produced by the corpus of 230 combined oral histories provided ideal material for topic modeling. Simply put, “topic modeling is an automatic way to examine the contents of a corpus of documents.”10 The output of these models is visualized as word clouds, with word sizes based on the number of co-occurrences within the corpus; larger words indicate more occurrences and smaller ones fewer. Each topic model then points to the most relevant documents within the corpus based on the co-occurrences of the words contained in that model. In order to create these topic models from the corpus of oral histories, a workflow was developed with the expertise of the Digital Matters cohort, implementing an R script that calls the MALLET toolkit and uses the latent Dirichlet allocation (LDA) topic model developed by Blei et al.11

Figure 1. Topic model from text mining the mining-related oral histories found in the University of Utah’s Digital Library.

From the mining-related oral history corpus, twenty-six topic models were created. Once generated, each topic model points to the five interviews that are most related to the words in a particular model. In figure 1, the words carbon, county, country, and Italian are the largest, because the interviews are about Carbon County, Utah. Considering that this geographical area of Utah was the most ethnically diverse in the late 1800s, due to the coal mining industry recruiting labor from abroad, including Italy, these words are not surprising. As indicated by their prominence in the topic model, this set of words co-occurs most often in the interview set.

We approached the process of topic modeling the oral histories as an exploration, but after reviewing the results, we discovered that many of the words which surfaced through this process pointed to deficiencies in the original descriptive metadata, highlighting new possibilities for access points and metadata remediation. Homing in on the midsize words tended to uncover unique material that is not covered in descriptive metadata, as these words are often mentioned more than a handful of times and across multiple interviews. The largest words in the model are typically thematic to the interview and included in the descriptive metadata. For example, investigating the inclusion of “wine” in the topic model shown in figure 1 revealed conversations about the winemaking process amongst the Italian mining community in Carbon County, Utah. In a 1973 interview from the Carbon County Oral History Project, Mary Nicolovo Juliana discusses how her father, a miner, made wine at home.12 Because the topic models are based on co-occurrences across the corpus, the same model also surfaced a 1973 interview with Emile Louise Cances, who, coming from a French immigrant mining family, discusses the vineyards her family had in France.13 Neither oral history’s descriptive metadata makes any reference to wine, so a researcher might miss this content because it isn’t included as an access point. Topic modeling thus allowed for the discoverability of potentially valuable topics that may be buried in hundreds of pages of content.
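The authors’ workflow ran MALLET from R; as a rough stand-in, the same idea can be sketched in Python with scikit-learn’s LDA implementation. The file paths and preprocessing choices below are illustrative rather than the project’s actual settings; only the topic count of twenty-six is taken from the text.

```python
# Sketch of LDA topic modeling over a directory of plain-text transcripts,
# using scikit-learn in place of the MALLET/R workflow described above.
from pathlib import Path
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.decomposition import LatentDirichletAllocation

docs = [p.read_text(encoding="utf-8") for p in Path("corpus").glob("*.txt")]

# Bag-of-words matrix, dropping stop words plus very common and rare terms.
vectorizer = CountVectorizer(stop_words="english", max_df=0.9, min_df=2)
dtm = vectorizer.fit_transform(docs)

# Twenty-six topics, matching the number reported for this corpus.
lda = LatentDirichletAllocation(n_components=26, random_state=0)
doc_topics = lda.fit_transform(dtm)  # rows link each interview to topics

# The highest-weighted words per topic are the raw material that the
# article visualizes as word clouds.
terms = vectorizer.get_feature_names_out()
for i, weights in enumerate(lda.components_):
    top = [terms[j] for j in weights.argsort()[-10:][::-1]]
    print(f"Topic {i:02d}: {', '.join(top)}")
```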
Based on this collections as data project, text mining the mining oral history texts to produce topic models, we are considering employing topic modeling when creating new descriptive metadata for similar collections. Setting a precedent, the text files for this project are hosted on the growing Marriott Library Collections as Data GitHub repository. After we developed this corpus, we discovered that a graduate student in the History department had developed a similar project, demonstrating the research value of oral histories combined with computational analysis.14

HAROLD STANLEY SANDERS MATCHBOOKS COLLECTION

When assessing descriptive metadata for the Harold Stanley Sanders Matchbooks Collection, an assortment of matchbooks from bygone establishments, predominantly in Salt Lake City, including restaurants, bars, hotels, and other businesses, we determined that non-Dublin Core metadata was essential for computational purposes. With the digital project workflow now extending beyond publishing the collection in the Digital Library to also publishing the collection data to the Marriott Library Collections as Data GitHub repository, assessing metadata needs has evolved. Because matchbooks function as small advertisements, they often incorporate a mix of graphic design, advertising slogans, and the address of the establishment. The descriptive metadata was created first with the most relevant fields for computational analysis, including business name, type of business, transcription of text, notable graphics, matchbook colors, and street addresses. For collection mapping capabilities, street addresses were then geocoded using a Google Sheets add-on called Geocode Cells, which uses Google’s Geocoding API (see figure 2).

Figure 2. A screenshot of the Google Sheets add-on Geocode Cells (https://chrome.google.com/webstore/detail/geocode-cells/pkocmaboheckpkcbnnlghnfccjjikmfc).

This add-on proved efficient for this collection, as other geocoding services required zip codes, which were not present on the matchbooks. With latitude and longitude added to the metadata, the collection was then mapped using ArcGIS Online (see figure 3).15

Figure 3. A screenshot of the Harold Stanley Sanders Matchbooks Collection map, made with ArcGIS Online.

The extensive metadata, including geographic-coordinate data, is available on the library’s GitHub repository for public use. After the more computationally ready metadata was created, it was massaged to fit library best practices and Dublin Core (DC) standards. This included deriving Library of Congress Subject Headings for DC subjects from business type and concatenating notable matchbook graphics and slogans for the DC description. While providing the extensive metadata is beneficial for computational experimentation, it adds time and labor to the lifespan of the project.
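The geocoding step itself happened inside Google Sheets, but the equivalent is straightforward to script. This sketch uses geopy’s freely available Nominatim geocoder as a stand-in for Google’s Geocoding API; the input file name and column label are assumptions rather than the collection’s actual export.

```python
# Sketch of geocoding matchbook street addresses that lack zip codes,
# using Nominatim in place of the Google Sheets add-on described above.
import csv
import time
from geopy.geocoders import Nominatim

geolocator = Nominatim(user_agent="matchbook-mapping-demo")

with open("matchbooks.tsv", encoding="utf-8") as fh:  # assumed export file
    rows = list(csv.DictReader(fh, delimiter="\t"))

for row in rows:
    # Most establishments were in Salt Lake City, so supply city and
    # state to compensate for the missing zip codes.
    location = geolocator.geocode(f"{row['street_address']}, Salt Lake City, UT")
    if location is not None:
        row["latitude"], row["longitude"] = location.latitude, location.longitude
    time.sleep(1)  # Nominatim's usage policy asks for ~1 request per second
```

The resulting latitude and longitude columns can then be uploaded to a mapping tool such as ArcGIS Online, as the authors did.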
KENNECOTT COPPER MINER RECORDS

One aspect of our collections as data work at the University of Utah moving forward is the need for long-term planning for resources that contain interesting information that could eventually be used for computational exploration, even if we don’t currently have the capacity to make the envisioned dataset available. The Marriott Library holds a variety of personnel records from the Kennecott Copper Corporation, Utah Copper Division. These handwritten index cards contain a variety of interesting demographic data about the workers who were employed by the company from 1900 to 1919, such as name, employee ID, date employed, address, dependents, age, weight, height, eyes, hair, gender, nationality, engaged by, last employer, education, occupation, department, pay rate, date leaving employment, and reason for leaving. Not all the cards are filled out with the complete level of detail listed above; however, name, date employed, ethnicity, and notes about pay rates are usually included for each employee. Developing a scanning and digitization procedure for creating digital surrogates of almost 40,000 employment records was fairly easy due to an existing partnership and reciprocal agreement with FamilySearch; however, developing a structure for making the digitized records available and providing full transcription is a long-term project.

Librarians used this project as an opportunity to think strategically about the limits of Dublin Core when developing a collections as data project from the start. The digital library repository at the University of Utah provides the ability to export collection-level metadata as .tsv files. With this in mind, the collection metadata template was created with the aim of eventually being able to provide researchers with the granular information on the records. This required introducing a number of new, non-standard field labels to our repository. Since we are not able to anticipate exactly how a researcher might interact with this collection in the future, our main priority was developing a metadata template that would accommodate full transcription of every data point on the card. Twenty new fields in the template reflect the demographic data on the card, and ten are existing fields that map to our standard practices with Dublin Core fields. Because we do not currently have the staffing in place to transcribe 40,000 records, we are implementing a phased approach of transcribing four basic fields, with fuller transcription to follow if we are able to secure additional funding.

Figure 4. Employment card for Alli Ebrahim, 1916.

Figure 5. Employment card for Richard Almond, 1917.
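The exact field labels the library adopted are not published here, so the sketch below simply derives a .tsv template from the card fields named in the text, with a small Dublin Core subset standing in for the ten mapped standard fields and an assumed choice of the four phase-one fields.

```python
# Sketch of a transcription template written as a .tsv, matching the
# repository's export format. Field labels are derived from the card
# fields listed above; the library's actual labels may differ.
import csv

# Assumed phase-one fields (the article does not say which four were chosen).
PHASE_ONE = ["name", "date_employed", "nationality", "pay_rate"]

# The remaining card fields, reserved for fuller transcription later.
FULL_TRANSCRIPTION = [
    "employee_id", "address", "dependents", "age", "weight", "height",
    "eyes", "hair", "gender", "engaged_by", "last_employer", "education",
    "occupation", "department", "date_leaving_employment", "reason_for_leaving",
]

# A subset of the ten standard fields mapped to Dublin Core practice.
DC_FIELDS = ["dc_title", "dc_date", "dc_type", "dc_rights", "dc_identifier"]

with open("kennecott_template.tsv", "w", newline="", encoding="utf-8") as fh:
    writer = csv.writer(fh, delimiter="\t")
    # Header row only; one row per card is appended as transcription proceeds.
    writer.writerow(DC_FIELDS + PHASE_ONE + FULL_TRANSCRIPTION)
```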
WOMAN’S EXPONENT

A stated goal for Digital Matters is to be a digital humanities space that is unique to Utah and addresses issues of local significance such as public lands, water rights, air quality, indigenous peoples, and Mormon history.16 When considering what digital scholarship projects to pursue in 2019, Digital Matters faculty became aware of the upcoming 150th anniversary of Utah women becoming the first in the nation to vote. Working with a local nonprofit, Better Days 2020, and colleagues at Brigham Young University (BYU), Digital Matters faculty and staff decided to embark on a multimodal analysis of the 6,800-page run of the Woman’s Exponent, a Utah women’s newspaper published between 1872 and 1914, primarily under the leadership of Latter-day Saint Relief Society President Emmeline B. Wells. In its time, the Woman’s Exponent was a passionate voice for women’s suffrage, education, and plural marriage, and it chronicled the interests and daily lives of Latter-day Saint women.

Initially, we hoped to access the data through the Brigham Young University Harold B. Lee Library, which digitized the Exponent back in 2000. We quickly learned that OCR from nearly twenty years ago would not suffice for digital humanities research and considered different paths for rescanning the Exponent. After accessing the original microfilm from BYU, we leveraged existing structures for digitization. Through an agreement that the Marriott Library has in place with a vendor for completing large-scale digitization of newspapers on microfilm for inclusion in the Utah Digital Newspapers program, we were able to add the Woman’s Exponent to the existing project without securing a new contract for digitization. The vendor digitized the microfilm; created an index of each title, issue, date, and page; and extracted the full text through an OCR process. They then delivered 330 GB of data to us, including high-quality TIFF and JPEG 2000 images, a PDF file for each page, and METS-ALTO XML files containing the metadata and OCR text.

Acquiring data for the Woman’s Exponent project illuminated the challenges that digital humanists face when looking for clean data. Our original assumption was that if something had already been scanned and put online, the data must exist somewhere. We soon learned, when working with legacy digital scans, that the OCR might be insufficient or the original high-quality scans might be lost over the course of multiple system migrations. As librarians with existing structures in place for digitization, we had the content rescanned and delivered within a month. Our digital humanities partners from outside of the library did not know this option was available and assumed our research team would have to scan 6,800 pages of newspaper content before we were able to start analyzing the data. This incongruity highlighted a cultural difference between digital humanists, with their learned self-reliance, and librarians, who are more comfortable and conversant in looking to outside resources. Indeed, our digital humanities colleagues seemed to believe that “doing it yourself” was part and parcel of digital humanities work.

The Woman’s Exponent project is still in its early phases, but now that we have secured the data, we are considering what digital humanities methods we can bring to bear on the corpus. With the 2020 150th anniversary of women’s suffrage in Utah, we have considered a topic modeling project looking at themes around universal voting, slavery, and polygamy and tracking how the discussion around those topics evolved over the 42-year run of the paper. Another potential project is building a social network graph of the women and men chronicled throughout the run of the paper. Developing curriculum around women in Utah history is of particular interest to the group, as women are underrepresented in the current K-12 Utah history curriculum. Keeping in line with our commitment to collections as data, we have released the Woman’s Exponent as a .tsv file with OCR full-text data, which can be analyzed by researchers studying Utah, Mormon studies, the American West, or various other topics.
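As one small example of what the released file makes possible, a researcher could chart a term across the paper’s run. The column names below (“date”, “ocr_text”) are guesses at the .tsv layout rather than the repository’s documented schema; consult its README for the actual fields.

```python
# Sketch: count occurrences of "suffrage" per publication year in the
# released Woman's Exponent .tsv. Column names are assumptions.
import pandas as pd

df = pd.read_csv("womans_exponent.tsv", sep="\t")

# Normalize dates and count hits of the term in each page's OCR text.
df["year"] = pd.to_datetime(df["date"], errors="coerce").dt.year
df["hits"] = df["ocr_text"].fillna("").str.lower().str.count(r"\bsuffrage\b")

print(df.groupby("year")["hits"].sum())  # raw counts, 1872 through 1914
```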
Collaborators have also developed a digital exhibit on the Woman’s Exponent, which includes essays about a variety of topics as well as sections showcasing its potential for digital scholarship.17

OBITUARY DATA

The Utah Digital Newspapers (UDN) program began in 2002 with the goal of making historical newspaper content from the State of Utah freely available to the public for research purposes. Between 2002 and 2019, over 4 million newspaper pages were digitized for UDN. Due to search limitations of the software system used for UDN at the time, the data model for newspapers was made more granular and included segmentation for articles, obituaries, advertisements, birth notices, etc. This article segmentation project ended in 2016, when it was determined that the high cost of segmentation was not sustainable with mass digitization of newspapers and that users were still able to find the content they were looking for on a full newspaper page. Before the article segmentation project concluded, UDN had accrued over 20 million articles, including 318,044 articles that were tagged as obituaries or death notices.

In 2013, the Marriott Library partnered with FamilySearch to index the genealogical information that can be gleaned from these obituaries. The FamilySearch Indexing (FSI) program crowdsourced the indexing of this data to thousands of volunteers worldwide. Certain pieces of data, such as place names, were mapped to an existing controlled vocabulary, and dates were entered in a standardized format to ensure that certain pieces of the data are machine actionable.18 After the obituaries were indexed by FSI in 2014, a copy of the data was given to the Marriott Library to use in UDN. The indexed data included fields such as name of deceased, date of death, place of death, date of birth, birthplace, and relative names with relationships. Since this massive amount of data didn’t easily fit within the UDN metadata schema, it was stored for several years without being put to use.

Now that we are thinking about our digital collections as data, we are exploring ways that researchers could use this vast amount of data. The data was delivered to the library in large spreadsheets that are not easily usable in any spreadsheet software. We are exploring ingesting the data into a revised newspaper metadata schema within our digital asset management system, or converting the data into a MySQL database so it is possible to search and find relationships between pieces of data. Working with a dataset of this size can be challenging: the data from only two newspapers, including 1,038 obituaries, is a 25 MB file, and the full database is over 10 GB. We are therefore working through how to distribute this data in a form researchers can actually use. We are also looking at the possibility of having FSI index additional obituary data from UDN, which will make the database continually expand.
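A sketch of the database option under consideration follows, using SQLite in place of MySQL so the example stays self-contained; the column names are paraphrased from the indexed fields listed above and would need to match the actual spreadsheets.

```python
# Sketch: stream the large obituary .tsv into a relational database so
# the data becomes searchable. SQLite stands in for MySQL; field names
# are assumptions based on the indexed fields described above.
import sqlite3
import pandas as pd

con = sqlite3.connect("obituaries.db")

# The full dataset exceeds 10 GB, so load it in chunks rather than at once.
for chunk in pd.read_csv("obituaries.tsv", sep="\t", chunksize=50_000):
    chunk.to_sql("obituaries", con, if_exists="append", index=False)

# Once loaded, relationships become queryable; for example, deaths by place:
query = """
    SELECT deceased_name, date_of_death, place_of_death
    FROM obituaries
    WHERE place_of_death LIKE '%Ogden%'
    ORDER BY date_of_death
    LIMIT 20
"""
for row in con.execute(query):
    print(row)
con.close()
```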
CONCLUSION

As the digital library community recognizes the need for computational-ready collections, the University of Utah Digital Library has embraced this evolution with a strategic investment. Implementing the collections as data GitHub repository for computational users is a first step towards providing access to collections beyond the traditional digital library environment. While there may be improved ways to access this digital library data in the future, the GitHub repository filled an immediate need.

Developing standardized metadata for computational use can often require more time from metadata librarians who are already busy with the regular work of describing new assets for the digital library. Developing additional workflows for metadata enhancement and bulk download can delay the process of making new collections available. In most cases, collections need to be evaluated individually to determine what type of resources can be invested in making them available for computational use. For a project needing additional transcription, like the Kennecott Mining Records, crowdsourcing might seem like a potential avenue to pursue. However, the digital library collection managers have misgivings about the training and quality assurance involved in developing a new large-scale transcription project. Combined with the desire to ensure that the people who are working on the project have adequate training and compensation for their labor, we are making the strategic decision to transcribe some initial access points to the collection now and attempt full transcription at a later date, pending additional funding. For the UDN obituary data, leveraging an existing transcription program, at no cost and with minimal supervision needed by librarians, worked well in surfacing additional genealogical data that can be released for researchers.

The collections as data challenge mirrors a perennial digital library conundrum: how much time and effort should librarians invest for unknown future users with unknown future needs? Much like digitization and metadata creation, creating collections as data requires a level of educated guesswork as to what collections digital humanists will want to access, what metadata fields they will be interested in manipulating, and in what formats they will need their data. Considering the limited resources of librarians, should we convert our digital collections into data in anticipation of use, or convert our collections on demand? This “just in case” vs. “just in time” question is worthy of debate and will naturally be dependent on the resources and priorities of individual institutions.

With an increasing number of researchers experimenting with digital humanities methods, collections as data will be a standard consideration when working with new digitization projects at the University of Utah. Visualization possibilities outside of the digital-library environment will be regularly assessed. Descriptive metadata practices beyond Dublin Core will be developed when beneficial to the computational and experimental use of the data by the public. Integrating techniques like topic modeling into descriptive metadata workflows provides additional insight about the digital objects being described. While adding collections as data to existing digitization workflows will require an additional investment of time, developing these projects has also created new opportunities for collaboration within the library and for expanded partnerships at the University of Utah and other institutions in the Mountain West. By leveraging our existing partnerships, we were able to create collections as data pilots organically by taking advantage of our current workflows and digitization procedures.
While we have been successful in releasing smaller-scale collections as data projects, we still need to consider integration issues with our larger digital library program and experiment more with enabling access to large datasets. By producing curated datasets that evolve from unique special collections materials, librarians can extend the research value of the digital library and of the collections that are unique to each institution. As we look towards the future, we see this work continuing and expanding as librarians engage more with digital humanities teaching and support.

ACKNOWLEDGEMENTS

The authors would like to acknowledge Dr. Elizabeth Callaway, former Digital Matters postdoctoral fellow and current Assistant Professor in the Department of English at the University of Utah, for developing the topic modeling workflow used in the collections as data project, Text Mining Mining Texts. Callaway’s expertise was invaluable in creating the scripts to enable distance reading of the text corpus, documenting this process, and training library staff.

REFERENCES

1. Thomas G. Padilla, “Collections as Data: Implications for Enclosure,” College & Research Libraries News 79, no. 6 (June 2018): 296, https://crln.acrl.org/index.php/crlnews/article/view/17003/18751.
2. Thomas Padilla et al., “The Santa Barbara Statement on Collections as Data (V1),” n.d., https://collectionsasdata.github.io/statementv1/.
3. Christine L. Borgman, “Data Scholarship in the Humanities,” in Big Data, Little Data, No Data: Scholarship in the Networked World (Cambridge, MA: The MIT Press, 2015), 161–201.
4. Miriam Posner, “Humanities Data: A Necessary Contradiction,” Miriam Posner’s Blog (blog), June 25, 2015, http://miriamposner.com/blog/humanities-data-a-necessary-contradiction/.
5. Thomas Padilla et al., “The Santa Barbara Statement on Collections as Data (V1),” n.d., https://collectionsasdata.github.io/statementv1/.
6. Thomas Padilla, “Always Already Computational,” Always Already Computational: Collections as Data, 2018, https://collectionsasdata.github.io/; Thomas Padilla, “Part to Whole,” Collections as Data: Part to Whole, 2019, https://collectionsasdata.github.io/part2whole/.
7. “Marriott Library Collections as Data GitHub Repository,” April 16, 2019, https://github.com/marriott-library/collections-as-data.
8. “Century of Black Mormons,” accessed April 25, 2019, http://centuryofblackmormons.org.
9. Anna Neatrour et al., “A Clean Sweep: The Tools and Processes of a Successful Metadata Migration,” Journal of Web Librarianship 11, no. 3–4 (October 2, 2017): 194–208, https://doi.org/10.1080/19322909.2017.1360167.
10. Anna L. Neatrour, Elizabeth Callaway, and Rebekah Cummings, “Kindles, Card Catalogs, and the Future of Libraries: A Collaborative Digital Humanities Project,” Digital Library Perspectives 34, no. 3 (July 2018): 162–87, https://doi.org/10.1108/DLP-02-2018-0004.
11. David M. Blei et al., “Latent Dirichlet Allocation,” Journal of Machine Learning Research 3, no. 4/5 (May 15, 2003): 993–1022, http://search.ebscohost.com/login.aspx?direct=true&db=asn&AN=12323372&site=ehost-live.
12. “Mary Nicolovo Juliana, Carbon County, Utah, Carbon County Oral History Project, No. 47, March 30, 1973,” Carbon County Oral Histories, accessed April 29, 2019, https://collections.lib.utah.edu/details?id=783960.
13. “Mrs. Emile Louise Cances, Salt Lake City, Utah, Carbon County Oral History Project, No. CC-25, February 24, 1973,” Carbon County Oral Histories, accessed April 29, 2019, https://collections.lib.utah.edu/details?id=783899.
14. Nate Housley, “A Distance Reading of Immigration in Carbon County,” Utah Division of State History Blog, 2019, https://history.utah.gov/a-distance-reading-of-immigration-in-carbon-county/.
15. “Harold Stanley Sanders Matchbooks Collection,” accessed May 8, 2019, https://collections.lib.utah.edu/search?facet_setname_s=uum_hssm; “Harold Stanley Sanders Matchbooks Collection Map,” accessed May 8, 2019, https://mlibgisservices.maps.arcgis.com/apps/webappviewer/index.html?id=d16a5bc93b864fc0b9530af8e48c6c6f.
16. Rebekah Cummings, David Roh, and Elizabeth Callaway, “Organic and Locally Sourced: Growing a Digital Humanities Lab with an Eye Towards Sustainability,” Digital Humanities Quarterly, 2019.
17. “Woman’s Exponent Data,” https://github.com/marriott-library/collections-as-data/tree/master/womansexponent; “Woman’s Exponent Digital Exhibit,” https://exhibits.lib.utah.edu/s/womanexponent/.
18. John Herbert et al., “Getting the Crowd into Obituaries: How a Unique Partnership Combined the World’s Largest Obituary with Utah’s Largest Historic Newspaper Database” (Salt Lake City, UT: International Federation of Library Associations and Institutions, 2014), https://www.ifla.org/files/assets/newspapers/SLC/2014_ifla_slc_herbert_mynti_alexander_witkowski_-_getting_the_crowd_into_obituaries.pdf.
11141 ---- Online Ticketed-Passes: A Mid-Tech Leap in What Libraries Are For
Public Libraries Leading the Way
Jeffrey Davis

Jeffrey Davis (jtrappdavis@gmail.com) is Branch Manager at San Diego Public Library, San Diego, California.

Last year a library program received coverage from The New York Times, the Wall Street Journal, the magazines Mental Floss and Travel+Leisure, many local newspapers and TV outlets, online and trade publications like Curbed, Thrillist, and Artforum, and more. That program is New York’s Culture Pass, a joint program of the New York, Brooklyn, and Queens Public Libraries. Culture Pass is an online ticketed-pass program providing access to area museums, gardens, performances, and other attractions. As the New York Daily News wrote in its lede: “It’s hard to believe nobody thought of it sooner: A New York City library card can now get you into 33 museums free.”

Libraries had thought of it sooner, of course. Museum pass programs in libraries began at least as early as 1995 at Boston Public Library, and the online ticketed model in 2011 at Contra Costa (CA) County Library. The library profession has paid this “mid-tech” program too little attention, I think, but that may be starting to change.

WHAT ARE ONLINE TICKETED-PASSES?

The original museum pass programs in libraries circulate a physical pass that provides access to an attraction or group of attractions. Sometimes libraries are able to negotiate free or discounted passes, but many times the passes are purchased outright. The circulating model is still the most common for library pass programs, but it suffers from many limitations. Passes are by necessity checked out for longer than they’re used. They sit waiting for pickup on hold shelves and in transit to their next location. Long queues make it hard for patrons to predict when their requests will be filled, and therefore difficult to plan around. For the participating attractions, physical passes are typically good anytime and so compete with memberships and paid admission.
There are few ways to shape who borrows the passes in order to meet institutional goals. And there are few ways to limit repeat use by library patrons to both increase exposure and nudge users toward membership. As a result, most circulating pass programs only connect patrons to a small number of venues. Despite these limitations, circulating passes have been incredibly popular: as of this writing, there are 967 requests for San Diego Public Library’s 73 passes to the New Children’s Museum. We sometimes see that sort of interest in a new bestseller, but this is a pass that SDPL has offered continuously since 2009.

In 2011, Contra Costa County Library launched the first “ticketed-pass” program, Discover & Go. Discover & Go replaced circulating physical passes with an online system with which patrons, remotely or in the library with staff assistance, retrieve day-passes (tickets) by available date or venue. This relatively simple and common-sense change makes an enormous difference. In addition to convenience and predictability for patrons, availability is markedly increased because venues are much more comfortable providing passes when they can manage their use: patrons can be restricted to a limited number of tickets per venue per year, and venues can match the number of tickets available to days that they are less busy. The latter preserves the value of their memberships while making use of their own “surplus capacity” to bring in new visitors and potential new members. Funding and internal expectations at many venues carry obligations to reach underserved communities, and the programs allow partner attractions to shape public access and receive reporting by patron zip code and other factors.

The ePass software behind Discover & Go is regional by design and supports sharing of tickets across multiple library systems in ways that are impractical to do with physical passes. As new library systems join the program, they bring new partner attractions into the shared collection with them. The Oakland Zoo, for example, needs only to negotiate with their contact at Oakland Public Library to coordinate access for members of Oakland, San Francisco, and San Jose Public Libraries. Because of the increased attractiveness of participation, it’s been easier for libraries to bring venues into the program. In 2011, Discover & Go hoped for a launch collection of five museums but ultimately opened with forty. The success of ticketed-pass programs in turn attracts more partners. Today, Discover & Go is available through 49 library systems in California and Nevada with passes to 137 participating attractions. Similarly, New York’s Culture Pass launched with 33 participating venues and has grown in less than a year to offer a collection of 49.

While big-city programs attract the most attention, pass programs are offered by county systems like Alamance County (NC), consortiums like Libraries in Clackamas County (OR), small cities like Lawrence (MA), small towns like Atkinson (NH), and statewide through the Michigan Activity Pass, which is available through over 600 library sites with tickets to 179 destinations plus state parks, camping, and historical sites. For each library, the participating destinations form a unique collection: a shelf of local riches, idiosyncratic and rooted in place.
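To make the mechanics concrete, here is a small illustration of the kind of availability rules just described; this is a toy model, not Quipu Group’s actual ePass logic: each venue caps tickets per patron per year and releases passes only on dates it has opted into.

```python
# Illustrative availability check for a ticketed-pass system; a toy
# model of the rules described above, not ePass's real implementation.
from dataclasses import dataclass, field
from datetime import date

@dataclass
class VenuePolicy:
    tickets_per_patron_per_year: int
    open_dates: set                 # days the venue has chosen to release passes
    daily_capacity: dict = field(default_factory=dict)

def can_issue(policy, visit_date, patron_uses_this_year):
    """Return True if a patron may claim a pass for the given date."""
    if patron_uses_this_year >= policy.tickets_per_patron_per_year:
        return False                # annual per-venue limit reached
    if visit_date not in policy.open_dates:
        return False                # venue has not opened this date
    return policy.daily_capacity.get(visit_date, 0) > 0  # tickets remain

# Example: a venue releasing 5 passes on a quiet weekday, 2 per patron/year.
quiet_day = date(2019, 7, 9)
policy = VenuePolicy(2, {quiet_day}, {quiet_day: 5})
print(can_issue(policy, quiet_day, patron_uses_this_year=0))  # True
```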
Through various libraries one can find tickets for the Basketball Hall of Fame, Stone Barns Center for Food and Agriculture, Dinosaur Ridge, the Eric Carle Museum of Picture Book Art, Bushnell Park Carousel, California Shakespeare Theater, children’s museums, zoos, aquariums, botanical gardens, tours, classes, performances, and on to the Met, MoMA, Crocker, de Young, and many, many, many more. For kids, “enrichments” like these are increasingly understood as essential parts of learning and exploration. For adults, access to our cultural treasures, including partners like San Francisco’s Museum of the African Diaspora or Chicago’s National Museum of Puerto Rican Arts & Culture (besides being its own reward), enhances local connection and understanding. We’re also starting to see the ticketing platform itself become an asset to smaller organizations (craft studios, school performances, farm visits, nature centers, and more) that want to increase public access without having to take on a new capability. Importantly, ticketed-pass programs are built on the core skills of librarians: information management, collection development, community outreach, user-centered design, customer service, and technological savvy.

THE TECHNOLOGY

Discover & Go was initially funded by a $45,000 grant from the Bay Area Library and Information System (BALIS) cooperative. Contra Costa contracted with library software company Quipu Group to develop the ePass software that runs the program and that is also used by NY’s Culture Pass, Multnomah County (OR) Library’s My Discovery Pass, and a consortium of Oregon libraries as Cultural Pass. Ticketed-pass software is also offered by the LibraryInsight and Plymouth Rocket companies and used by Denver Public Library, Seattle Public Library, the Michigan Activity Pass, and others.

The software consists of a web application with a responsive patron interface and connects over SIP2 or a vendor API to patron status information from the library ILS. Administrative tools set fine-grained ticket availability, blackout dates, and policies including restrictions by patron age, library system, zip code, municipality, number of uses allowed globally and per venue, and more. Recent improvements to ePass include geolocation to identify nearby attractions and improved search filters. Still in development are transfer of tickets between accounts, re-pooling of unclaimed tickets, and better handling of replaced library cards.

The strength that comes from multi-system ticketed-pass programs also carries with it challenges on the patron account side. ILSes each implement protocols and APIs for working with patron account information differently, and library systems maintain divergent policies around patron status. There’s a role for LITA and for library consortia and state libraries to push for more attention to, and consistency on, patron account policies and standards. The emphasis in library automation is similarly shifting. Our ILSes originated to manage the circulation of physical items, a catalog-centric view.
Today, as Robert Anderson of Quipu Group suggested to me, a diverse range of online and offline services and non-catalog offerings orbit our users, calling for a new frame of reference: “It’s a patron-centric world now.”

THE VISION

Library membership is the lynchpin of ticketed-pass and complementary programs in the technical sense, as above, and conceptually: library membership as one’s ticket to the world around. Though I’m not aware of academic libraries offering ticketed-passes, they have been providing local access through membership. At many campuses, the library is the source for one’s library card, which is also one’s campus ID, on- and off-campus cash card, transit pass, electronic key, print management, and more. That’s kind of remarkable and deserving of more attention.

Traditionally, librarians have responded to patron needs by providing information, resources, and services ourselves. New models and technologies are making it easier to complement this with the facilitation approach, of which online ticketed-passes are the quintessential example. We further increase access by reducing barriers of complexity, language, know-how, and social capital, for example, by maintaining community calendars of local goings-on or helping communities take advantage of nearby nature. Online ticketed-pass programs will grow and take their place in the public’s expectations of libraries and librarians: that libraries are the place that helps us (better, more equitably) access the resources and riches around us. Powering this are important new tools for library technologists to interrogate and advance with the same attention we give to both more established and more speculative applications.

11169 ---- Testing for Transition: Evaluating the Usability of Research Guides Around a Platform Migration
Articles
Ashley Lierman, Bethany Scott, Mea Warren, and Cherie Turner

Ashley Lierman (lierman@rowan.edu) is Instruction Librarian, Rowan University. Bethany Scott (bscott3@uh.edu) is Coordinator of Digital Projects, University of Houston. Mea Warren (mewarren@uh.edu) is Natural Science and Mathematics Librarian, University of Houston. Cherie Turner (ckturner2@uh.edu) is Assessment & Statistics Coordinator, University of Houston.

ABSTRACT

This article describes multiple stages of usability testing that were conducted before and after a large research library’s transition to a new platform for its research guides. A large interdepartmental team sought user feedback on the design, content, and organization of the guide homepage, as well as on individual subject guides. This information was collected using an open-card-sort study, two face-to-face, think-aloud testing protocols, and an online survey. Significant findings include that users need clear directions and titles that incorporate familiar terminology, do not readily understand the purpose of guides, and are easily overwhelmed by excess information, and that many of librarians’ assumptions about the use of library resources may be mistaken. This study will be of value to other library workers seeking insight into user needs and behaviors around online resources.

INTRODUCTION

Like many libraries that employ Springshare’s popular LibGuides platform for creating online research resources, the University of Houston Libraries (UHL) has accumulated an extensive collection of guides over the years.
By 2015, our collection included well over 250 guides, with varying levels of complexity, popularity, usability, and accessibility. This presented a major challenge when we planned to migrate our LibGuides instance (locally branded as “Research Guides”) to LibGuides v2 in fall 2015, but also an opportunity: the transition would be an ideal time to appraise, reorganize, and streamline existing guide content. Although UHL had conducted user research in the past to improve usability, in preparing for the migration it became clear that another round of tests would be beneficial in revising our guides for the new platform. Our Research Guides would be presented much differently in LibGuides v2, and the design and organization of information would need to be tailored to the needs of our user community like any other service. User feedback would be vital to reorganizing our guides’ content and to making customizations to the new system.

This article will describe the usability testing process that was employed before and after UHL’s migration to LibGuides v2. Usability testing is one technique in the field of user experience (UX). The primary goal of UX is to gain a deep understanding of users’ preferences and abilities, in order to inform the design and implementation of more useful, easy-to-use products or systems. Best practices for UX emphasize “improving the quality of the user’s interaction with and perceptions of your product and any related services.”1 Usability tests conducted as part of this case study were informed by the work of Jakob Nielsen, who pioneered several UX ideas and techniques, and by the guidance on conducting one’s own usability testing in Steve Krug’s seminal works on the topic, Don’t Make Me Think and Rocket Surgery Made Easy.

UHL’s transition to LibGuides v2 consisted of five stages: (1) card sort testing to determine the best organization of guides in the new system; (2) the migration itself; (3) face-to-face usability testing after migration to study user expectations and behavior after the change; (4) a survey to identify any significant variations in distance users’ experiences; and (5) final analysis and implementation of the results. Incorporating usability testing was a relatively easy and inexpensive process with a high yield of useful insights, which could be adapted as needed to other library settings in order to evaluate similar online resources.

LITERATURE REVIEW

As libraries have moved from traditional paper pathfinders to online research guides of increasing sophistication, there has been substantial study into the effectiveness of online research guides for various audiences and information needs. Several studies highlight the apparent disconnect between students’ and librarians’ perceptions of research guides, especially regarding the purpose, organization, and intended use of the guides.
Reeb and Gibbons used an analysis of surveys and web usage statistics from several university libraries to show that students rarely or never used online guides despite the extensive time spent by librarians to curate and present information resources.2 Similarly, in Courtois, Higgins, and Kapur’s one-question survey (“Was this guide useful?”) the authors were surprised to find that 40 percent of the responses received rated guides unfavorably, noting that “it was disheartening for many guide owners to receive poor ratings or negative comments on guides that require significant time and effort to produce and maintain.”3 Hemmig concluded that in order to increase the value of a guide from a user perspective, librarians must adopt a user-centric approach by guiding the search process, understanding students’ mental models for research, and providing “starter references.”4 Staley’s survey of student users also indicates a need to be mindful of what resources guides are actually expected to provide, as it found that pages linking to articles and databases were far more used than pages with other content.5 Data has also shown that undergraduate students are unable to match their information needs with the resources provided on broad subject-area guides, leading several authors to conclude that students would be able to use course-specific guides more easily. For instance, Strutin found that course guides are among the most frequently used guides, especially when paired with library instruction sessions.6 Several other studies cite survey data, statistics, and educational concepts like cognitive load theory to conclude that ideally, guides would be customized to the specific information needs of each course and its assignments in order to better match the mental models and information-seeking behavior of undergraduate students.7 While the value of online research guides has been under study for quite some time, usability testing of guides is a relatively recent phenomenon. 
In 2010, librarians at Concordia University conducted usability testing of two online research guides and found that undergraduate students generally found the guides difficult to use.8 Librarians at Metropolitan State University conducted two rounds of usability tests on their LibGuides with a broader range of participant types, highlighting the ability to incorporate usability testing as part of an iterative design process.9 At Ithaca College, subject librarians partnered with students in a Human-Computer Interaction course to test both course guides and subject guides through a series of usability tests, pre- and post-test questionnaires, and a group discussion in which students evaluated the findings of the usability tests and discussed their experiences.10 At the University of Nevada, Las Vegas, librarians conducted usability testing with both undergraduate students and librarians, and surprisingly found that attitudes towards the guides were similar in both groups: interface design challenges were the greatest barrier to task completion, rather than the level of expertise of the user.11 Finally, at Northwestern University, librarians conducted several types of usability tests as a part of a transition from the original LibGuides platform to LibGuides v2, to determine what features worked from the original guides and what could be improved or updated during the migration.12

Throughout these and other usability studies, the authors have identified a number of desirable and undesirable elements in research guide design:

• Clean and simple design is highly prioritized by users. Students preferred streamlined text, plentiful white space, and links to “best bets” rather than comprehensive but overwhelming lists of databases.13 These findings also align with accepted web design best practices.
• Guide parts and included resources should be labeled clearly and without jargon.14 Sections and subpages within each guide should be named according to key terms that students recognize and understand. Also, librarians should consider creating subpages using a “need-driven approach,” based on the purpose of each research task or step, rather than by the format of materials or resources.15
• The tabbed navigation of LibGuides v1 is both unappealing to and easily missed by users, and if it must be implemented, great care should be taken to maximize its visibility and usability.16
• Consistency of guide elements, both within a guide and from one guide to the next, helps users more easily orient themselves when using guides; certain elements should always be present in the same place on the page, including navigational elements and table of contents, contact information, supplemental resources such as citation and plagiarism information, and common search boxes.17

With the findings and recommendations of these predecessors in mind, we designed a multi-stage study to expand upon their results and identify new challenges and opportunities that the LibGuides v2 platform might present.

METHODOLOGY

Stage 1: Card Sort

The majority of Research Guides at UHL are organized by subject area, by course, or both. There are a number of guides, however, that are not affiliated with any particular subject area or course, containing task-oriented information that may be valuable across a wide variety of disciplines.
The organizational system for these guides had grown organically over time as new guides were created, rather than being structured intentionally, and it had become evident that these guides were not particularly discoverable or well-used by students. The migration to LibGuides v2 presented an opportunity to reorganize these guides based on user input.

A team of three librarians from the Liaison Services department conducted an open-card-sort study in November 2015, in order to determine how best to organize those Research Guides not already affiliated with a course or subject area. Card sorting is a method of identifying the categories and organization of information that make the most sense to users, by asking users to sort potential tasks into named categories representing the menus or options that would be available on the site. An open-card sort allows users to create and name as many categories as they need, as opposed to a closed-card sort, which requires users to sort the available options into a predetermined set of categories.

To prepare for the study, we reviewed all of our guides to develop a complete list of those not affiliated with a subject or course. For each guide, we developed a brief, clear description of the guide’s topic that would be easy for an average library user to understand, each on a small laminated card. Over an approximately ninety-minute period, we staffed a table in the 24-Hour Lounge of M.D. Anderson Library, where we recruited passersby to participate in the study. After answering a few demographic questions, participants were asked to place the cards into groups that seemed logical to them. They could create as many or as few groups as necessary, but were asked to try to place every card in a group. While the participants organized the cards, they were asked to explain their thought processes and rationale, and one librarian observed the sorting process and took notes on their actions and explanations. When a participant finished grouping the cards, they were asked to write on an index card a name for each of the groups they had created. The final groupings were photographed and the labels retained for recording purposes.

After the testing was complete, participants’ responses were organized into a spreadsheet and reviewed for recurring patterns and commonalities. A new set of categories was developed based on those most commonly created by students during the study, and these categories were titled using the most common terminology used by students in their group labels.

Stage 2: Migration

At the direction of the instructional design librarian (IDL), Research Guide editors at UHL revised and prepared their own guide content throughout fall 2015, eliminating unneeded information and reorganizing what remained. The IDL led multiple trainings and work sessions throughout the process to ensure compliance. During this same time, the IDL completed back-end work in the LibGuides system to prepare for migration, and the Web Services department created a custom layout for the new guide site. The data migration itself took place on December 18, 2015, followed by cleanup and full implementation in January 2016. The IDL provided a deadline by which all content must be ready for public consumption, prior to the start of the spring semester.
After that deadline, the Web Services department switched the URL for UHL’s Research Guides site to the LibGuides v2 instance and made the new system publicly available.

Stage 3: Face-to-Face Testing

After the migration process was complete, the IDL assembled a team of ten other librarian and staff stakeholders from the Liaison Services, Special Collections, and Web Services departments to develop a usability testing protocol. This team assisted the IDL in developing two different face-to-face testing scripts and the text of a survey for distance users, as well as helping to administer face-to-face testing. The method we chose for the face-to-face testing process was think-aloud testing. In a think-aloud test, the user is provided a set of tasks, identified as common potential uses, to complete using the web resource. The user is asked to attempt each task, and to narrate any thoughts or reactions to the resource, as well as the thought process and rationale behind each decision made.

Several members of the team were already familiar with usability practices and had participated in think-aloud user testing before. Training for the others was provided in the form of short practical readings, verbal guidance from the IDL in group meetings, and practice sessions before conducting the face-to-face testing. In the practice sessions, group members volunteered for their roles in the testing, discussed protocol and logistics and asked any questions, and practiced the tasks they would each need to complete: making the recruitment pitch to users, walking through the consent process, using recording software, using the notetaking sheet, and so on. As the team leader and one of the members experienced with usability, the IDL conducted the actual testing interviews.

Each of the face-to-face tests focused on either subject guides or the guide homepage. For both tests, tables were set up in the 24-Hour Lounge for recruitment and testing. Two team members recruited students in the library at the time of testing by offering snacks and library-branded giveaways. Two additional team members facilitated the test and took notes during testing. Both tests also used the same consent forms and demographic questions, and largely the same follow-up questions. Participants in both homepage and subject guide testing were guided to the appropriate starting points and interviewed about their impressions of the homepage and guides, their perceptions of the purpose of these resources, and their understanding of the Research Guides name. Subject guide testers were allowed to select which of our two testing guides they would be more comfortable using: the General Business Resources guide or the Biology and Biochemistry Resources guide. Subject guide testers were also asked how they would seek help if the guide did not meet their needs. Both groups were then asked to complete one of two sets of tasks. The homepage tasks were designed to test users’ ability to find individual guides, either for a specific course or for general information on a subject; the subject guide tasks were designed to test users’ ability to find appropriate resources for research on a given topic. After completing the tasks for their appropriate resources, participants answered several general follow-up questions, with additional questions from the facilitator as necessary.
Stage 4: Survey

Unlike the face-to-face testing, the survey focused only on use of subject guides, not the homepage. Otherwise, however, because the purpose of the survey was to compare the behavior of distance users to the behavior of on-campus users, the survey was designed to mimic the face-to-face test as closely as possible. Several team members with liaison responsibilities identified distance user groups in their subject areas who would be demographically appropriate and available at the needed time, and contacted appropriate faculty members to ask for assistance in distributing the survey via email. Ultimately, the survey was distributed to small cohorts of users in the areas of Social Work, Education, Nursing, and Pharmacy, and customized for each targeted cohort. Each version of the survey linked users to their appropriate subject guide and then asked the same questions regarding impressions of the guide that were asked in the face-to-face testing. Users were also asked to complete tasks using the guide that were similar in purpose to those in the face-to-face testing, and they were prompted to enter the resource they found at the end of each task. Demographic information was requested at the end of the survey to ensure that, in the event of drop-offs, basic demographic information would be more likely to be lost than testing data. The survey was distributed to the target groups over a three-week period in June 2016. Six users at least partially completed the survey, and four completed it in full.

Stage 5: Analyzing and Implementing Results

After completing the face-to-face testing, the IDL reviewed and transcribed the recordings of each test session, along with additional insights from the notetakers. Responses to each interview question were coded and ordered from most to least common, as were patterns of behavior and difficulties in completing each task; a brief sketch of this kind of tallying follows. Task results and completion times were also recorded for each user and organized into a spreadsheet with users' demographic information.
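The coding-and-ordering step just described is simple to reproduce in code. A minimal sketch, assuming hypothetical coded transcripts; the article does not say what tools the IDL actually used.

from collections import Counter

# Hypothetical coded data: for each interview question, one code per participant.
coded_responses = {
    "first_impressions": ["clean", "cluttered", "clean", "confusing", "clean"],
    "purpose_of_resource": ["search tool", "course help", "search tool"],
}

# Order each question's codes from most to least common, as described above.
for question, codes in coded_responses.items():
    ranked = Counter(codes).most_common()
    print(question, ranked)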
The IDL then reported out to Research Guide editors on common responses and task behaviors observed in the testing, and on interpretations of the implications of these results for guide design. After survey responses were collected, the IDL compiled and analyzed the results using a similar process, although the survey received few enough responses that coding was not necessary. Users' responses to questions were noted and grouped, and success and failure rates on tasks were tallied. A second report out to Research Guide editors summarized these results and described which responses closely resembled those received in the face-to-face testing and which varied.

Finally, when all data had been collected, the IDL combined recommendations based on the testing results with other recommendations derived from past UHL studies and from reviewing the literature, and from these developed a set of Research Guides Usability Guidelines. The guidelines were organized from highest to lowest priority, based on how commonly each was indicated in testing or in the literature. Research Guide editors were asked to revise their guides according to these guidelines within one year of their implementation, and advised that their compliance would be evaluated in an audit of all guide content in summer 2017. In the interest of transparency, the IDL also included in the guidelines document an annotated bibliography of the relevant literature review and a formal report on the procedures and results of the usability testing process.

FINDINGS

Card Sort

One significant observation from the card sort was that, while librarians tended to organize guides into groups based on type of user (e.g., "undergraduates," "student athletes," "first-years," etc.), none of the students who participated categorized resources in this way, and they did not seem to be particularly conscious of the categories into which they or other users might fit. Instead, their groupings focused on the type of task for which each guide would be most appropriate, rather than the type of user that would be most likely to use that guide. For example, users readily recognized guides related to citation tasks and preferred them to be grouped together, regardless of the level at which they addressed the topic, and also grouped advanced visualization techniques like GIS with simpler multimedia-related tasks like finding images. Similarly, category labels tended to include "How To . . ." language in describing their contents, focusing on the task for which the guides in that category would be beneficial. This aligns with the recommendation from Sinkinson et al. to name guide pages based on purpose rather than format.18 It is worth noting, however, that all of the students who participated in the card-sort study were undergraduates and may not have fully understood some of the more complex research tasks being described. It should also be noted that all users created some sort of category for "general" or "basic" research tasks, and most either explicitly created an "advanced" research category or created several more granular categories and then verbally described these as being for "advanced" research tasks. In general, organization by task type was most preferred, followed by level of sophistication of task.

Face-to-Face Testing: Homepage

No significant correlations were found between user demographics and users' success rates in completing each task, nor between demographics and time on task. Users' ability to navigate the system was generally consistent regardless of major, year in program, and—somewhat surprisingly—frequency of library use. This is, however, in keeping with Costello et al.'s finding that technology barriers were more significant in user testing than level of experience.19

When testing the homepage, we found that all users were able to find known guides (such as a course guide for a specific course) and appropriate guides for a given task (such as a citation guide for a particular style) quickly and easily. When seeking a guide, users generally used the By Subject view of all guides to locate both subject and course guides. If this view was not helpful, as in the case of citation style guides, users' next step was most commonly to switch to the All Guides view and use the search function to look for key terms. Users understood and used the By Subject and All Guides views intuitively, expressed more confusion and hesitation about the By Owner and By Type views, and disregarded the By Group view entirely.
We had been concerned about whether the search function would confuse users by highlighting results from guide subpages, but on the contrary, the study participants used the search function easily, and the fact that it surfaced results from within guides seemed to help them find and identify relevant terms rather than confusing them. Overall, users responded favorably to the look and feel of guides, albeit with a few specific critiques: the initially limited color palette made it difficult for some users to distinguish parts of a guide from one another, and the text size was found to be uncomfortably small in some areas.

Face-to-Face Testing: Subject Guides

In subject guide testing, we found overwhelmingly that users both valued and made use of link and box descriptions within guides, using them throughout the navigation process as sources of additional information. Users generally preferred briefer descriptions, rather than reading lengthy paragraphs of text, but several noted specific instances in which they would not have understood the nature or purpose of a database without the description that was provided. We also found, conversely, that librarian profile boxes were of less value to users than we had assumed. When asked how they would find help when researching, most subject guide testers said they would turn to Google, ask at the library service desk, or use the Contact Us link in the LibGuides footer; only two mentioned the profile box as a potential source of help at all. Users also seemed unsure of the purpose of the profile box and did not recognize whose photo and contact information they were seeing, in spite of box labels and text.

Contrary to our expectations, users also readily clicked through to subpages of guides to find information, sometimes even when more useful information was actually available on the guide landing page. This was particularly evident in one of the subject guides that included internal navigation links in a box on the landing page: if users saw a term they recognized in one of these links, they would click it immediately, without exploring the rest of the page. In general, users latched on quickly to terms in subpage and box titles that seemed relevant to their tasks, and some expressed feelings of increased confidence and reassurance when seeing a familiar term featured prominently on an otherwise unfamiliar resource. Scanning for keywords in this manner also sometimes led users astray, however: some navigated to inappropriate pages or links because they featured words like "Research" or "Library" in their titles. Users also expressed confusion about page titles that did not match their expectations of tasks they could complete online, such as "Biology Reading Room." These findings support those of many prior authors regarding the importance of including clear descriptions with key words that users readily understand.20

Many of our results from subject guide testing not only ran counter to our expectations, but challenged the assumptions on which we had based our questioning. For example, we had been curious to learn whether links to external websites were used significantly compared to links to library databases, or if they simply cluttered guides unnecessarily. In testing, however, we found that users did not distinguish between the two types of resources at all, and used both interchangeably.
A better question seemed to be not whether users found those links useful, but how to distinguish them from library content—or whether the distinction was necessary from the user's perspective at all. Some team members had also been concerned about the scroll depth of guide pages, but the majority of users not only said they did not mind scrolling, but seemed surprised and amused by being asked. Their own assumptions about this type of resource clearly included the need to scroll down a page.

A few other miscellaneous issues presented themselves in our face-to-face testing. One was that the purpose and nature of Research Guides was not readily evident to users. Many used language that conflated guides with search tools like databases, or even with individual information resources like books or articles. For example, a user asked whether the By Owner view listed the authors of articles available in this resource. The curated and directional nature of Research Guides was not at all clear to users. Furthermore, in spite of the improvements to guide look and feel in LibGuides v2, several users still spoke of guides as being cluttered, lengthy, and overwhelming, leaving them intimidated and unsure of where to begin. Consistently, testers tended to gravitate toward course guides even when subject guides would have been more appropriate for a given task, and some users explained this choice by pointing to the greater specificity of course guide titles. Users demonstrated a great preference for familiarity, gravitating toward terms and resources that were known to them, and even repeating behaviors that had been unproductive earlier in the testing process. Finally, one of the greatest points of confusion for users seemed to be the relationship of Research Guides to physical materials within the library. Users readily and confidently followed links to online resources from Research Guides but expressed confusion and hesitancy when guides pointed to books or other resources available in the library.

Survey

The survey of off-campus users had few responses, but the demographics of the respondents varied more than those of the on-campus testing participants, including graduate students and faculty. The users who did respond showed evidence of less use of guide subpages than we had observed in the face-to-face testing, indicating that the presence of a librarian during testing may have influenced users to explore guides more thoroughly than they would have when working on their own. At the same time, more experienced researchers in the survey group—in this case, a late-program graduate student and a faculty member—were apparently more likely than less experienced users to explore guides thoroughly and to succeed at research tasks. Survey respondents also were far more likely to state that they would use the profile box on guides for help, with some indicating that they recognized their liaison librarian's picture and were familiar with the librarian as a source of assistance. Liaison librarians at UHL often work more closely with higher-level students and faculty than with undergraduates, so this greater familiarity was not surprising.

DISCUSSION

Implementation of Findings

Based on the results of the literature review and testing, a number of changes and recommendations were implemented.
A brief description of the nature and purpose of Research Guides was added to the guide homepage's sidebar, more color variation was added to guides, and font sizes were increased. Existing documentation was also reworked and expanded to create the Research Guides Usability Guidelines document for all guide editors, which included adding or revising the following recommendations:

• Pages, boxes, and resources should all be labeled with clear, jargon-free language that includes keywords familiar to their most frequent users.
• Page design should be "clean and simple," minimizing text and incorporating ample white space.
• Brief, one- to two-sentence descriptions should be provided for all links.
• Each guide should have an introduction on its landing page with a brief description of its contents and purpose. It may be helpful to include links to subpages in this box as well, but this should be done judiciously, as these links may take users off the landing page prematurely.
• Pages and resources aimed at undergraduates should be organized and titled according to their relevance to research tasks (e.g., "Find Articles"), and not by user group.
• Electronic resources should be prioritized on guides over print resources.
• Clear distinctions should be made between library and non-library links when the distinction is important.
• A profile box with a picture should be included, but the importance of this item is not as great as we had previously imagined.

Limitations

One of the most significant challenges in our testing was actually negotiating the IRB application process. Delays in our application raised concerns within the team that we would not receive approval in time to test with students before the start of the summer break. Although we did receive approval in time, the window for testing afterward was extremely narrow. Submitting the application also bound us to the scripts and text that we had originally drafted, which severely limited the flexibility of the testing process. This became a challenge at several points when a particular phrasing or design of a question was found to be ineffective in practice but could not be altered from its original form. Tensions between the requirements for institutional review and the unique needs of usability testing are a persistent problem for user experience development in an academic setting, and must be planned for accordingly as much as possible.

In some cases, as well, we might have improved our results by better designing our questions. One example of this was the question about the name "Research Guides," which anecdotal evidence had suggested might be challenging for users. Simply asking whether that name made sense to the participant was clearly not effective in practice, and did not yield actionable insights. In the future, we might consider informal testing of our planned questions with users in the target demographic before proceeding with full-scale usability testing.

A final challenge was in gathering data on use of guides by distance users. Though we were able to get enough responses to draw some tentative conclusions, we had hoped for a larger pool of data. Though it would make the results more difficult to compare to the in-person testing, reducing the length of the survey might have helped to produce more responses.
Additionally, increased marketing and more flexible timing for survey distribution might also have helped us reach a larger audience.

CONCLUSIONS

The results of our testing were very instructive and led to the creation of valuable documentation for guide editors to use in their work. We also learned a number of lessons relating to process that would be of value to other librarians seeking to perform similar testing at their own institutions. The first of these is that working with a large, interdepartmental team on this type of project—while occasionally unwieldy—is greatly beneficial overall. Even if all the team members are not able to fully participate, involving as many colleagues as possible in the usability testing process lessens the workload for each individual, increases flexibility, and ultimately increases buy-in and compliance with the resulting changes and recommendations. For a platform used directly by a relatively large percentage of librarians, as LibGuides generally is, the number of stakeholders in user research is correspondingly large, and as many of these stakeholders as possible should be involved to some degree. Not only will this distribute the benefits of the process more broadly, it will make it possible to staff more extensive and more frequent testing sessions.

In the course of our testing process, we also came to recognize the value of testers familiar with the user group under examination. A majority of librarians involved in testing were from public-facing departments, with significant student contact in their day-to-day work. As a result, we were able to quickly attract a diverse set of participants for our testing simply through our collective knowledge of students' likely behaviors and preferences: where students were most likely to congregate, what kinds of rewards would motivate them to participate, how to reach them at a distance, and how far their patience would be likely to extend for an in-person interview or an online survey. The incentives and location that the testing teams selected were so effective that the number of volunteers we received overwhelmed our capacity to accommodate them within the allotted testing time, resulting in a substantial pool of responses for analysis. Therefore, we conclude that the effectiveness of user research can be increased by including (or at least consulting) those most familiar with the user group to be studied. Simply assuming that participants will be available may ultimately compromise the effectiveness of testing.

Additionally, time management is an extremely important element of testing development. Failing to fully account for the demands of the IRB process, for example, led to significant limitations for our project concerning the timing of testing, the availability of participants, our capacity for marketing and distribution of the survey, and the quality of our testing instrument. While acknowledging that, as in our case, sometimes the need for usability testing arises on short notice, we recommend allocating as much time and preparation to the process as possible, to ensure that every aspect of the testing can be given adequate attention.

Figure 1. Average monthly guide views by transition period.
As a final note, nearly two years after the best practices were implemented, we collected and compared guide traffic statistics from three key periods:

• September 2014 through December 2015, the sixteen months preceding our transition to LibGuides v2;
• January 2016 through August 2017, our first twenty months on LibGuides v2, during which time best practices had not yet been fully developed and implemented; and
• September 2017 through April 2019, from the beginning of best practices implementation through the time of writing (best practices were implemented gradually between September 2017 and February 2018).

Mindful of the fact that guide usage fluctuates with the academic year, we compared average views for each guide on a monthly basis (a sketch of this comparison appears below). Figure 1 shows the average number of times each guide was viewed in a month for each period of the transition. As the figure shows, for most of the academic year, guide views dropped sharply after our transition from LibGuides v1 to LibGuides v2, and continued to decline slightly with time through the period when our best practices were implemented. There are a number of possible causes for this phenomenon:

• Guide usage may be declining over time generally for a variety of reasons, and the transition to the new look of v2 may have confused and disoriented users in the immediate aftermath, causing use of some guides to be discontinued.
• A substantial number of older guides were eliminated in the transition to v2, some of which may have been more heavily used than suspected, and new guides that have been created since may not yet have gained traction and recognition from users.
• Librarians may also have reduced their efforts to incorporate guides into their teaching and outreach strategies.
• Improved organization in the new system may be helping users to find the guide they need on the first try, without having to move through and examine multiple guides.

In any case, this trend is concerning and merits further investigation, but a direct connection between it and the transition to LibGuides v2 or the implementation of best practices has not been established. A more accurate measure of the effect of the best practices would be a user satisfaction survey, although a comparison would be difficult to make due to the lack of a baseline from before the transition. We will continue to investigate trends in the use of our guides, how our best practices have affected our users, and how those practices can be improved upon in the future.
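The article does not say how the averages behind figure 1 were computed; a minimal sketch of the comparison, assuming a hypothetical CSV export of LibGuides statistics with one row per guide per month (columns: guide, month, views):

import pandas as pd

# Hypothetical export, e.g.:
# guide,month,views
# Biology Resources,2015-09,412
stats = pd.read_csv("guide_views.csv", parse_dates=["month"])

# The three transition periods compared in figure 1.
def period(month):
    if month < pd.Timestamp("2016-01-01"):
        return "v1 (Sep 2014-Dec 2015)"
    if month < pd.Timestamp("2017-09-01"):
        return "v2, pre-best practices (Jan 2016-Aug 2017)"
    return "v2, best practices (Sep 2017-Apr 2019)"

stats["period"] = stats["month"].apply(period)
stats["calendar_month"] = stats["month"].dt.month_name()

# Average views per guide in each calendar month, for each period,
# mirroring the month-by-month comparison plotted in figure 1.
avg_views = stats.groupby(["period", "calendar_month"])["views"].mean()
print(avg_views)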
APPENDIX A: HOMEPAGE TESTING SCRIPT

Welcome and Demographics

Hello! Thank you for agreeing to participate. I'll be helping you through the process, and my colleague here will be taking notes. Before we get started, I'd like to ask you a few quick questions about yourself.

• Are you a student?
o (No:)
▪ What is your status at UH? (Faculty, staff, fellow, etc.)
▪ With what college or area are you affiliated?
o (Yes:)
▪ Are you an undergraduate or a grad student?
▪ What program are you in?
▪ What year are you in now?
• How often do you use this library?
• How often do you use the Libraries' website or online resources?
• About how many hours a week would you say you spend online?
• Have you ever used the Libraries' Research Guides before? (If not) Have you ever heard of them?

Are you ready to start? Do you have any questions?

Homepage Tour

First, I'd like to ask you a few questions about the homepage, which you can see here. Don't worry about right or wrong answers, I just want to know your reactions.

• When you look at this page, what are your first impressions of it?
• Just from looking at these pages, what do you think this resource is for?
• Look at the categories across the top of the screen. What do you think each of those means? What would you use them for?
• What would you call the resources listed here?
• We call these resources "Research Guides." Does that name make sense to you?

Tasks: Odd-Numbered Participants

Now we're going to ask you to complete two tasks using this page and the links on it. This isn't a test, and nothing you do will be the wrong or right answer. We just want to see how you interact with the site and what we can do to make that experience better. Do you have any questions so far? Let's begin. Please try to talk about what you're doing as much as possible, and tell us what you're thinking and why you're taking each step.

1. You need to find sources for an assignment for your history class, and you aren't sure where to start. You clicked a link on the Help section of the library webpage that led you here. Find a guide that you think can help you.
2. You are taking Chemistry 1301, and your professor told you that the library has a research guide especially for this class. Find the guide you think they meant.

Tasks: Even-Numbered Participants

Now we're going to ask you to complete two tasks using this page and the links on it. This isn't a test, and nothing you do will be the wrong or right answer. We just want to see how you interact with the site and what we can do to make that experience better. Do you have any questions so far? Let's begin. Please try to talk about what you're doing as much as possible, and tell us what you're thinking and why you're taking each step.

1. You need to format a bibliography in MLA style, and your professor told you that the library has a research guide that can help. Find the guide you think she meant.
2. You are taking a psychology course for the first time, and you want to find out what types of tools you should use to do research in psychology. You clicked a link on the Help section of the library webpage that led you here. Find a guide that you think can help you.

Follow-Up Questions

Now I'd like to ask you a few follow-up questions.

• Was this easy or hard to do?
• What was the easiest part?
• What was the hardest part?
• What did you like about using this site?
• What's one thing that would have made these tasks easier to complete?

APPENDIX B: SUBJECT GUIDES TESTING SCRIPT

Welcome and Demographics

Hello! Thank you for agreeing to participate. I'll be helping you through the process, and my colleague here will be taking notes. Before we get started, I'd like to ask you a few quick questions about yourself.

• Are you a student?
o (No:)
▪ What is your status at UH? (Faculty, staff, fellow, etc.)
▪ With what college or area are you affiliated?
o (Yes:)
▪ Are you an undergraduate or a grad student?
▪ What program are you in?
▪ What year are you in now?
• How often do you use this library?
• How often do you use the Libraries' website or online resources?
• About how many hours a week would you say you spend online?
• Have you ever used the Libraries' Research Guides before? (If not) Have you ever heard of them?
Are you ready to start? Do you have any questions?

Guide Impressions

First, I'd like to ask you a few questions about this page. Don't worry about right or wrong answers, I just want to know your reactions.

• When you look at this page, what are your first impressions of it?
• Just from looking at this page, what do you think this resource is for? What would you use it for?
• What would you call this type of resource?
• We call resources like this "Research Guides." Does that name make sense to you?
• If you couldn't find what you were looking for on this page, what would you do to find help?

Now we're going to ask you to complete two tasks using this page and the links on it. This isn't a test, and nothing you do will be the wrong or right answer. We just want to see how you interact with the site and what we can do to make that experience better. Do you have any questions so far? Let's begin. Please try to talk about what you're doing as much as possible, and tell us what you're thinking and why you're taking each step.

Tasks: General Business Resources Guide

1. Find a database that you could use for research in a general business class.
2. Imagine you want to find information on census data. Find an appropriate resource on this guide.
3. Find a tool you could use to find a dissertation to use in a general business class.

Tasks: Biology and Biochemistry Resources Guide

1. Find a database that you could use for research in a biology class.
2. Imagine you want to find information on taxonomy. Find an appropriate resource on this guide.
3. Find a tool you could use to find a thesis to use in a biology class.

Follow-Up Questions

Now I'd like to ask you a few follow-up questions.

• Was this easy or hard to do?
• What was the easiest part?
• What was the hardest part?
• What did you like about using this site?
• What did you dislike?
• What's one thing that would have made these tasks easier to complete?
• Did it bother you to have to scroll down the page to find additional information?
• If you had been doing this on your own, do you think you would have kept scrolling, or gone to other pages on the guide?
• Did you notice or read the text below the links?
• Did the names of the different pages on the guide make sense to you? Did you know what to expect?
• Do you think you would use these resources yourself if you were a student in the appropriate class?

APPENDIX C: EXAMPLE SURVEY—SOCIAL WORK STUDENTS

Screening Questions

Are you a University of Houston student, faculty member, or employee?
• Yes
• No

Are you at least 18 years of age?
• Yes
• No

Consent

UNIVERSITY OF HOUSTON CONSENT TO PARTICIPATE IN RESEARCH

PROJECT TITLE: Usability Testing of Library Research Guides

You are being invited to participate in a research project conducted by Ashley Lierman, the Instructional Design Librarian, and a team of other librarians from the University of Houston Libraries.

NON-PARTICIPATION STATEMENT

Your participation is voluntary and you may refuse to participate or withdraw at any time without penalty or loss of benefits to which you are otherwise entitled. You may also refuse to answer any question. If you are a student, a decision to participate or not or to withdraw your participation will have no effect on your standing.
PURPOSE OF THE STUDY

The purpose of this study is to investigate user interactions with the Research Guides area of the UH Libraries' website, in order to understand user needs and expectations and improve the performance of the site.

PROCEDURES

You will be one of approximately fifty subjects to be asked to participate in this survey. You will be asked to provide your initial thoughts and reactions to the Libraries' Research Guides, and to complete three ordinary research tasks using the page and associated links, then answer follow-up questions about your experience. The survey includes 23 questions and should take approximately 20-30 minutes.

CONFIDENTIALITY

Your participation in this project is anonymous. Please do not enter your name or other identifying information at any point in this survey.

RISKS/DISCOMFORTS

No foreseeable risks or discomforts should result from this research.

BENEFITS

While you will not directly benefit from participation, your participation may help investigators better understand our users' needs and expectations from the Libraries' website.

ALTERNATIVES

Participation in this project is voluntary and the only alternative to this project is non-participation.

PUBLICATION STATEMENT

The results of this study may be published in professional and/or scientific journals. It may also be used for educational purposes or for professional presentations. However, no individual subject will be identified.

If you have any questions, you may contact Ashley Lierman at 713-743-9773. ANY QUESTIONS REGARDING YOUR RIGHTS AS A RESEARCH SUBJECT MAY BE ADDRESSED TO THE UNIVERSITY OF HOUSTON COMMITTEE FOR THE PROTECTION OF HUMAN SUBJECTS (713-743-9204).

By clicking the "I Agree to Participate" button below, you affirm your consent to participate in this survey. If you do not consent to participate, you may simply close this window.

• I Agree to Participate

Guide Impressions

Click the link below (will open in a new window) and explore the page it leads to, then return to this survey and answer the questions.

http://guides.lib.uh.edu/socialwork

When you look at the page linked above, what are your first impressions of it?

Just from looking at the page, what do you think this resource is for? What would you use it for?

What would you call this type of resource, if you had to give it a name?

If you couldn't find what you were looking for on the page linked above, what would you do to find help?

On the following pages, you will be asked to complete three brief tasks. This is not a test, and nothing you do will be the wrong or right answer. The purpose of these tasks is simply to allow you to experiment with using the guide in an authentic way. When you have completed all of the tasks, you will be asked a few questions about your experiences.

First Task

Click the link below to open the Social Work Resources guide (will open in a new window):
http://guides.lib.uh.edu/socialwork

On the Social Work Resources guide, find a link to a database that you could use to investigate possible psychiatric medications.

Enter the name of the database you found:

Second Task

Click the link below to open the Social Work Resources guide (will open in a new window):
http://guides.lib.uh.edu/socialwork

Imagine you want to find a psychological assessment. Find an appropriate resource on the Social Work Resources guide.
(You do not need to actually find an assessment, only the name of a resource that would help you locate one.)

Enter the name of the resource you found:

Third Task

Click the link below to open the Social Work Resources guide (will open in a new window):
http://guides.lib.uh.edu/socialwork

On the Social Work Resources guide, find a tool you could use to find historical census data.

Enter the name of the tool you found:

Follow-Up Questions

Were the tasks on the preceding pages easy or difficult to do?
• Extremely easy
• Somewhat easy
• Neither easy nor difficult
• Somewhat difficult
• Extremely difficult

What was the easiest part of completing the tasks?

What was the most difficult part of completing the tasks?

What did you like about using the guide that you were linked to?

What did you dislike about using the guide?

What is one thing that would have made the tasks easier to complete?

Demographics

Thank you for completing the survey! Before you leave, please answer a few demographic questions about yourself.

Are you a student?
• Yes
• No

Type of student:
• Undergraduate
• Graduate
• Not a student

Program or major:

Year in program:
• 1st
• 2nd
• 3rd
• 4th
• 5th or higher
• Not a student

How often do you use the University of Houston Libraries?
• Daily
• A few times a week
• A few times a month
• A few times a year
• Never

How often do you use the Libraries' website or online resources (e.g., databases, catalog, etc.)?
• Daily
• A few times a week
• A few times a month
• A few times a year
• Never

Have you ever used the Libraries' Research Guides before?
• Yes
• No

Ending Screen

We thank you for your time spent taking this survey. Your response has been recorded.

REFERENCES

1 "User Experience Basics," Usability.gov, https://www.usability.gov/what-and-why/user-experience.html.

2 Brenda Reeb and Susan Gibbons, "Students, Librarians, and Subject Guides: Improving a Poor Rate of Return," Portal: Libraries and the Academy 4, no. 1 (2004): 123-30, https://doi.org/10.1353/pla.2004.0020.

3 Martin P. Courtois, Martha E. Higgins, and Aditya Kapur, "Was This Guide Helpful? Users' Perceptions of Subject Guides," Reference Services Review 33, no. 2 (2005): 188-96, https://doi.org/10.1108/00907320510597381.

4 William Hemmig, "Online Pathfinders: Toward an Experience-Centered Model," Reference Services Review 33, no. 1 (2005): 66-87, https://doi.org/10.1108/00907320510581397.

5 Shannon M. Staley, "Academic Subject Guides: A Case Study of Use at San Jose State University," College & Research Libraries 68, no. 2 (2007): 119-40, http://crl.acrl.org/content/68/2/119.short.

6 Michal Strutin, "Making Research Guides More Useful and More Well Used," Issues in Science and Technology Librarianship 55 (2008), https://doi.org/10.5062/F4M61H5K.

7 Kristin Costello et al., "LibGuides Best Practices: How Usability Showed Us What Students Really Want from Subject Guides" (presentation, Brick & Click '15: An Academic Library Conference, Maryville, MO, November 6, 2015): 52-60; Alisa C. Gonzalez and Theresa Westbrock, "Reaching Out with LibGuides: Establishing a Working Set of Best Practices," Journal of Library Administration 50, no. 5-6 (2010): 638-56, https://doi.org/10.1080/01930826.2010.488941;
Jennifer J. Little, "Cognitive Load Theory and Library Research Guides," Internet Reference Services Quarterly 15, no. 1 (2010): 53-63, https://doi.org/10.1080/10875300903530199; Dana Ouellette, "Subject Guides in Academic Libraries: A User-Centered Study of Uses and Perceptions," Canadian Journal of Information and Library Science 35, no. 4 (2011): 436-51, https://doi.org/10.1353/ils.2011.0024.

8 Luigina Vileno, "Testing the Usability of Two Online Research Guides," Partnership: The Canadian Journal of Library and Information Practice and Research 5, no. 2 (2010): 1-21, https://doi.org/10.21083/partnership.v5i2.1235.

9 Alec Sonsteby and Jennifer DeJonghe, "Usability Testing, User-Centered Design, and LibGuides Subject Guides: A Case Study," Journal of Web Librarianship 7, no. 1 (2013): 83-94, https://doi.org/10.1080/19322909.2013.747366.

10 Laura Cobus-Kuo, Ron Gilmour, and Paul Dickson, "Bringing in the Experts: Library Research Guide Usability Testing in a Computer Science Class," Evidence Based Library and Information Practice 8, no. 4 (2013): 43-59, http://ejournals.library.ualberta.ca/index.php/EBLIP/article/view/20170.

11 Costello et al., 56.

12 John J. Hernandez and Lauren McKeen, "Moving Mountains: Surviving the Migration to LibGuides 2.0," Online Searcher 39, no. 2 (2015): 16-21.

13 Ouellette, 447; Denise FitzGerald Quintel, "LibGuides and Usability: What Our Users Want," Computers in Libraries 36, no. 1 (2016): 8; Sonsteby and DeJonghe, 89.

14 Costello et al., 56; Hernandez and McKeen, 20; Sonsteby and DeJonghe, 89.

15 Caroline Sinkinson et al., "Guiding Design: Exposing Librarian and Student Mental Models of Research Guides," Portal: Libraries and the Academy 12, no. 1 (2012): 74, https://doi.org/10.1353/pla.2012.0008.

16 Costello et al., 56; Ouellette, 444-45; Quintel, 8; Kate A. Pittsley and Sara Memmott, "Improving Independent Student Navigation of Complex Educational Web Sites: An Analysis of Two Navigation Design Changes in LibGuides," Information Technology and Libraries 31, no. 3 (2012): 56, https://doi.org/10.6017/ital.v31i3.1880; Sonsteby and DeJonghe, 87.

17 Cobus-Kuo, Gilmour, and Dickson, 50; Costello et al., 56.

18 Sinkinson et al., 74.

19 Costello et al., 56.

20 Costello et al., 56; Hernandez and McKeen, 20; Sonsteby and DeJonghe, 89; Sinkinson et al., 74.
11241 ---- Letter from the Editor

Kenneth J. Varnum

INFORMATION TECHNOLOGY AND LIBRARIES | JUNE 2019
https://doi.org/10.6017/ital.v38i2.11241

Welcome to the June 2019 issue of ITAL. You'll likely notice a new look to the journal when you read this issue's content. Our helpful and supportive partners at Boston College, where Information Technology and Libraries is archived, have updated the journal's content management system to the current version of Open Journal Systems. I am grateful to John O'Connor at Boston College for his patience with and quick, helpful responses to my numerous questions as we adapted to the new user interface and editorial workflows.

Columns in this issue include Bohyun Kim's final "President's Message" as her term concludes, summarizing the work that has gone into the planned division merger that would combine LITA, ALCTS, and LLAMA. Editorial Board member Cinthya Ippoliti discusses the role of libraries in fostering digital pedagogy in her "Editorial Board Thoughts" column. And, in the second of our new "Public Libraries Leading the Way" columns, Jeffrey Davis discusses the technologies and advantages of digital pass systems.

Peer-reviewed articles in this issue include:

• "No Need to Ask: Creating Permissionless Blockchains of Metadata Records," by Dejah Rubel, laying a path for using blockchain for managing metadata.
• "50 Years of ITAL/JLA: A Bibliometric Study of Its Major Influences, Themes, and Interdisciplinarity," by Brady Lund, a thorough study of how our journal has influenced, and been influenced by, other leading information technology journals.
• "Weathering the Twitter Storm: Early Uses of Social Media as a Disaster Response Tool for Public Libraries During Hurricane Sandy," by Sharon Han. This article is the 2019 LITA/Ex Libris Student Writing Award-winning paper.
• "'Good Night, Good Day, Good Luck': Applying Topic Modeling to Chat Reference Transcripts," by Megan Ozeran and Piper Martin, describing a process to categorize chat reference themes using topic mapping software.
• "Information Security in Libraries: Examining the Effects of Knowledge Transfer," by Tonia San Nicolas-Rocca and Richard J. Burkhard, investigating the importance of knowledge transfer across an organization to enhance information security behaviors.
• "Wikidata: From 'An' Identifier to 'The' Identifier," by Theo van Veen, describing how libraries could use Wikidata as a source of linked open data.
Thank you to this issue's authors, and all of Information Technology and Libraries' readers, for supporting peer-reviewed, open-access, scholarly publishing. In closing, I would like to thank the members of the Editorial Board whose terms are ending June 30: Patrick "Tod" Colegrove, Joseph Deodato, Richard Guajardo, and Frank Cervone. I'm grateful to these four individuals, upon whom I've relied for their excellent advice and guidance in steering ITAL's course. We are in the process of appointing new Editorial Board members with two-year terms starting on July 1, and I'll introduce them in the next issue.

Kenneth J. Varnum, Editor
varnum@umich.edu
June 2019

11251 ---- HathiTrust as a Data Source for Researching Early Nineteenth-Century Library Collections: Identification, Coverage, and Methods

Articles

HathiTrust as a Data Source for Researching Early Nineteenth-Century Library Collections: Identification, Coverage, and Methods

Julia Bauder

INFORMATION TECHNOLOGY AND LIBRARIES | DECEMBER 2019

Julia Bauder (bauderj@grinnell.edu) is Associate Professor and Social Studies and Data Services Librarian, Grinnell College.

ABSTRACT

An intriguing new opportunity for research into the nineteenth-century history of print culture, libraries, and local communities is performing full-text analyses on the corpus of books held by a specific library or group of libraries. Creating corpora using books that are known to have been owned by a given library at a given point in time is potentially feasible because digitized records of the books in several hundred nineteenth-century library collections are available in the form of scanned book catalogs: a book or pamphlet listing all of the books available in a particular library. However, there are two potential problems with using those book catalogs to create corpora. First, it is not clear whether most or all of the books that were in these collections have been digitized. Second, the prospect of identifying the digital representations of the books listed in the catalogs is daunting, given the diversity of cataloging practices at the time. This article will report on progress towards developing an automated method to match entries in early nineteenth-century book catalogs with digitized versions of those books, and will also provide estimates of the fractions of the library holdings that have been digitized and made available in the Google Books/HathiTrust corpus.

INTRODUCTION

Digital libraries such as Google Books and HathiTrust have created tantalizing opportunities for research into the history of American culture: automated analyses of the entire corpus of books published at a given point in time. The attraction of this prospect is most clearly demonstrated by the avalanche of papers written using the Google Books Ngram data, which provides counts over time of the words and phrases used in the works that make up the Google Books corpus.
As soon as this data became available in 2009, it was used to make arguments about social, linguistic, and other changes over time as reflected in changes in the words used in print.1 However, for nearly as long, other researchers have been cautioning that the Google Books corpus is not a representative sample of publishing output, let alone of what the public at large was actually reading in a given year, and that its unrepresentativeness makes it dangerous to draw sweeping conclusions from this data.2

One potentially feasible solution to the problem of unrepresentativeness in the Google Books corpus would be to use corpora based on the holdings of a specific library or a group of libraries. Using library holdings to form corpora helps to remedy some known issues with using the Google Books corpus as an indicator of social change, such as the fact that many books did not become popular and/or widely available until well after their official publication date, and that some prolific authors who contributed hundreds of thousands of words to the Google Books corpus were never as widely purchased and read as authors who wrote a single, short, best-selling work.3 Although using books held by a set of libraries at a given time as the corpus has its own problems of unrepresentativeness—particularly, for long-established libraries, the fact that the books on the shelf at a given time represent not only works of interest to current users but also those of interest to users from decades past—triangulating this data with that provided by the Google Books Ngram data would at least give some sense of whether and where these different corpora disagree.4

Creating corpora using books that are known to have been owned by a given library at a given point in time is potentially feasible because digitized records of the books in several hundred nineteenth-century library collections are available in the form of scanned book catalogs: a book or pamphlet listing all of the books available in a particular library. However, there are two potential problems with using those book catalogs to create corpora. First, it is not clear whether most or all of the books that were in these collections have been digitized, incorporated into Google Books and HathiTrust, and hence made available for Ngram analyses. Second, the prospect of identifying the digital representations of the books listed in the catalogs is daunting, as both widely agreed-upon cataloging standards and universal identifiers were not adopted until late in the nineteenth century. This article will report on progress towards developing a fully automated method to match entries in early nineteenth-century book catalogs with digitized versions of those books, and will also provide estimates of the fractions of the library holdings that have been digitized and made available in the Google Books/HathiTrust corpus.

METHODS

Practical considerations dictated using data from HathiTrust rather than from Google Books for this research. The HathiTrust corpus, although not perfectly coextensive with the Google Books corpus, has very substantial overlap with it. The HathiTrust digital archive was founded in 2008, when a group of large academic libraries formed a collaboration to archive and disseminate their digitized books. The vast majority of those digitized books—around 95 percent, as of mid-2017—had originally been scanned as part of the Google Books project; the agreements that Google Books entered into with the libraries typically stipulated that Google had to provide the library with a digital copy of each book scanned from that library.5 It was necessary to use HathiTrust rather than Google Books as the comparison corpus because the metadata for the titles in HathiTrust is readily available in ways that the Google Books metadata is not, including as bulk MARC-data downloads.

The libraries included in this analysis are social libraries, which were a type of quasi-public library that predated the now-standard, tax-supported public library in the United States. These libraries were privately owned and operated, but were open to some large portion of the population of a particular area who were willing and able to pay a fee or buy a share to belong to the library. Although the presence or absence of a book in social library collections is not a perfect indicator of the book's popularity—most social libraries pointedly refused to purchase the "trashy" but widely read sensational fiction of the day—it is a defensible proxy (although with some caveats, as noted above) for the popularity of the "serious" literature and nonfiction works that made up the bulk of these libraries' collections.

Roughly one hundred social library book catalogs published between 1800 and 1860 can be found in HathiTrust.6 For the purposes of the present study, attention was focused on the thirteen library catalogs from ten different American libraries that were published between 1776 and 1825. (A list of these catalogs can be found in appendix A.) These catalogs were chosen because they are likely to present the worst-case scenario in terms of both of the challenges mentioned above: the highest percentage of rare and extremely old books, which Google's partner libraries would have been least likely to permit to be scanned by Google, and, presumably, the most primitive and eclectic cataloging practices.

To the extent that it was possible to do so, this analysis focused on book-length monographs. When serials or pamphlets were listed in a separate section of the catalog, those catalog pages were excluded from the process by which entries were extracted from the catalogs and parsed into CSV files. Serials present particularly intractable matching problems: not only are the original catalogs often unclear about which specific volumes were held, but also HathiTrust's MARC data does not always clearly indicate which volumes are available in HathiTrust. Pamphlets have limited coverage in HathiTrust.

The selected catalogs were downloaded from HathiTrust as PDFs, and the pdftotext software was used to extract the OCR data from the relevant pages of the scans as hOCR (a file format for OCR that includes information about where each word is located on the page in addition to the words themselves).7 Then cleaning scripts were created that parsed the hOCR data into CSV files for analysis, with one catalog entry per line of the CSV file.8 Given the widely varied cataloging practices of the early nineteenth century, several different cleaning scripts were written, each tailored to a particular catalog format. For example, many of the catalogs had entries that spanned multiple lines (see figures 1 and 2), so the scripts for those catalogs had to be able to identify when each new entry started. Many catalogs had extraneous information, such as the name of the donor of the book or the size of the book, that had to be filtered out (see figure 1; F, Q, O, and D refer to the size of the book: folio, quarto, octavo, or duodecimo). In addition, various forms of dittoes were frequently used in these catalogs (see figures 1, 2, and 3), so one of the tasks for the cleaning scripts was to identify the dittoes and replace them with the correct words from the previous entry.

Figure 1. Library Company of Philadelphia, A Catalogue of the Books Belonging to the Library Company of Philadelphia: To Which Is Prefixed, A Short Account of the Institution, with the Charter, Laws, and Regulations (Philadelphia, PA: Printed by Bartram & Reynolds, 1807), 5.

Figure 2. Library Company of Baltimore, A Catalogue of the Books, &c. Belonging to the Library Company of Baltimore: To Which Are Prefixed the Act for the Incorporation of the Company, Their Constitution, Their By-Laws, and an Alphabetical List of the Members (Baltimore, MD: printed by Eades and Leakin, 1809), 46.

Figure 3. Washington Library Company, Catalogue of Books in the Washington Library (Washington, DC: printed by Anderson and Meehan, 1822), 17.

Unfortunately, the horizontal-line dittoes seen in figures 1 and 2—a type of ditto used in seven of the thirteen catalogs—are represented inconsistently or not at all in the hOCR, so they cannot reliably be used to identify places where words need to be carried down from the previous entry. For the catalog of the Library Company of Philadelphia, from which figure 1 was taken, the numbers after the horizontal-line dittoes (which identify the books' locations on the shelves) can be used to distinguish between a line that is indented because it is a continuation of the entry above and a line that is indented but is the start of a new entry. In theory, a cleaning script for the catalog of the Library Company of Baltimore (figure 2) could use a similar process to identify the last line of an entry by watching for the right-justified count of volumes at the end of each entry. When a right-justified digit was encountered, the script could then carry down the first word from that entry if the first word in the next entry was indented. However, these isolated digits were also not handled well by the OCRing process: many do not appear in the hOCR file at all, and those that do are as likely to be OCRed as a colon, an exclamation point, a capital I, etc., as they are to be a digit. Hence, the three catalogs of the Library Company of Baltimore, which use this format and have this OCR issue, were not analyzed for this project.
For example, many of the catalogs had entries that spanned multiple lines (see figures 1 and 2), so the scripts for those catalogs had to be able to identify when each new entry started. Many catalogs had extraneous information, such as the name of the donor of the book or the size of the book, that had to be filtered out (see figure 1; F, Q, O, and D refer to the size of the book: folio, quarto, octavo, or duodecimo). In addition, various forms of dittoes were frequently used in these catalogs (see figures 1, 2, and 3), so one of the tasks for the cleaning scripts was to identify the dittoes and replace them with the correct words from the previous entry. Figure 1. Library Company of Philadelphia, A Catalogue of the Books Belonging to the Library Company of Philadelphia: To Which Is Prefixed, A Short Account of the Institution, with the Charter, Laws, and Regulations (Philadelphia, PA: Printed by Bartram & Reynolds, 1807), 5. HATHITRUST AS A DATA SOURCE | BAUDER 17 https://doi.org/10.6017/ital.v38i4.11251 Figure 2. Library Company of Baltimore, A Catalogue of the Books, &c. Belonging to the Library Company of Baltimore: To Which Are Prefixed the Act for the Incorporation of the Company, Their Constitution, Their By-Laws, and an Alphabetical List of the Members (Baltimore, MD: printed by Eades and Leakin, 1809), 46. Figure 3. Washington Library Company, Catalogue of Books in the Washington Library (Washington, DC: printed by Anderson and Meehan, 1822), 17. Unfortunately, the horizontal-line dittoes seen in figures 1 and 2—a type of ditto which is used in seven of the thirteen catalogs—are represented inconsistently or not at all in the hOCR, so they cannot reliably be used to identify places where words need to be carried down from the previous entry. For the catalog of the Library of Company of Philadelphia, from which figure 1 was taken, the numbers after the horizontal-line dittoes (which identify the books’ locations on the shelves) can be used to distinguish between a line that is indented because it is a continuation of the entry above and a line that is indented but is the start of a new entry. In theory, a cleaning script for the catalog of the Library Company of Baltimore (figure 2) could use a similar process to identify the last line of an entry by watching for the right-justified count of volumes at the end of each entry. INFORMATION TECHNOLOGY AND LIBRARIES | DECEMBER 2019 18 When a right-justified digit was encountered, the script could then carry down the first word from that entry if the first word in the next entry was indented. However, these isolated digits were also not handled well by the OCRing process: many do not appear in the hOCR file at all, and those that do are as likely to be OCRed as a colon, an exclamation point, a capital I, etc., as they are to be a digit. Hence, the three catalogs of the Library Company of Baltimore, which use this format and have this OCR issue, were not analyzed for this project. Table 1. Results of verification Library Date founded if known, or inc. if not known9 Date catalog printed Number of spreadsheet entries Number of entries hand- verified Hand- verified entries that cannot be positively identified Hand- verified, positively identifiable entries that are not in HathiTrust Positively identifiable entries successfully matched when work was in HathiTrust Library Company of Philadelphia 1731 1807 7619 128 0% 16.9% 79.8% Horsham Library Company 1808 1810 143 143 28.4% 5.1% 79.8% Salem (MA) Athenaeum Inc. 
Table 1. Results of verification

Library | Date founded if known, or inc. if not known9 | Date catalog printed | Number of spreadsheet entries | Number of entries hand-verified | Hand-verified entries that cannot be positively identified | Hand-verified, positively identifiable entries that are not in HathiTrust | Positively identifiable entries successfully matched when work was in HathiTrust
Library Company of Philadelphia | 1731 | 1807 | 7619 | 128 | 0% | 16.9% | 79.8%
Horsham Library Company | 1808 | 1810 | 143 | 143 | 28.4% | 5.1% | 79.8%
Salem (MA) Athenaeum | inc. 1810 | 1811 | 1585 | 130 | 0.8% | 11.3% | 72.3%
New York Society Library | 1754 | 1813 | 4522 | 135 | 5.7% | 17.9% | 76.1%
Providence Library Company | 1753 | 1818 | 688 | 688 | 17.1% | 9.4% | 87.2%
Apprentices’ Library (New York, NY)10 | 1820 | 1820 | 1811 | 124 | 34.4% | 15.0% | 69.7%
Washington (DC) Library Company | inc. 1814 | 1822 | 900 | 124 | 12.9% | 3.2% | 83.7%
Boston Library | inc. 1794 | 1824 | 2273 | 138 | 4.1% | 11.1% | 82.5%
Mercantile Library (New York, NY) | 1820 | 1825 | 1386 | 138 | 0% | 11.3% | 86.0%

The catalogs of the other nine libraries could all be parsed with an acceptable success rate and, with one exception, were all included. The exception was the Salem Athenaeum’s 1818 catalog, which was identical in format and nearly identical in content to the Athenaeum’s 1811 catalog. Given the overwhelming similarity, it was decided to include only one of the two catalogs; because the goal of this analysis was to use worst-case-scenario catalogs, the older of the two was chosen for inclusion.

Once the catalogs were parsed into CSV files, they were run through another script that attempted to match each entry in the catalog against metadata from HathiTrust. In February 2019, MARC records containing metadata for 2,824,875 public-domain titles in HathiTrust were downloaded from HathiTrust via their OAI feed and ingested into a local Apache Solr index for searching and matching, using code from the SolrMarc and VuFind projects.11 Because of OCR errors in the catalog files and mistakes in the original catalogs, many of the words in the entries have one or more character-level errors. Therefore, Solr’s fuzzy searching option was used, which allows words to match as long as the Levenshtein distance between them is two or less. (The Levenshtein distance is the number of edits, such as changing one letter to another or adding or deleting a letter, it would take to turn one word into the other; “Hume” and “Home,” for example, are one edit apart.) No attempt was made to match specific editions; as can be seen from the excerpts in figures 2 and 3, many of the catalogs do not contain sufficient detail to do so, even if it were desirable. The goal was merely to determine whether the text of the work, in any edition, was contained in the HathiTrust corpus.

Once the catalogs had been checked against HathiTrust, a sample of the entries was hand-verified. For the two smallest catalogs, those of the Horsham Library Company and the Providence Library Company, all entries were hand-verified. For the other catalogs, a random sample of approximately 130 items (plus or minus 10) was selected: Microsoft Excel’s random-number generator was used to assign each line in the CSV file a number between 0 and 1, and then the lines with the lowest 1.5 percent to 12.5 percent of values (depending on the number of items in the catalog) were examined.

RESULTS

Percentage of Works Included in HathiTrust

At least four of the books in every catalog examined were missing from HathiTrust. As can be seen in table 1, the fraction of books from the hand-verified samples that were missing from HathiTrust ranged from 3.2 percent for the Washington Library Company to just shy of 18 percent for the New York Society Library. The Library Company of Philadelphia, at 16.9 percent, had the second-highest missing percentage. It is not surprising that these two libraries, as two of the oldest and most venerable libraries in the United States at the time, owned the most books that are not represented in HathiTrust, as both have a high percentage of very old and rare works.
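To make the matching step concrete, the following is a minimal sketch of a fuzzy query against a local Solr index of the HathiTrust MARC records. The Solr core name, field names, and sample catalog entry are hypothetical; the ~2 suffix is Solr's standard fuzzy-match syntax, requesting matches within an edit distance of two, as described above. This is an illustration of the technique, not the project's actual matching code.

import requests

# Build a fuzzy Solr query from a (possibly OCR-damaged) catalog entry.
# Appending ~2 to each term asks Solr to match words within an edit
# distance of two, absorbing character-level OCR errors. Very short
# words are skipped, since fuzzy matching on them is too permissive.
entry_words = ["Humes", "Histary", "of", "England"]  # hypothetical OCR output
query = " AND ".join(f"title:{w}~2" for w in entry_words if len(w) > 3)

resp = requests.get(
    "http://localhost:8983/solr/hathitrust/select",  # hypothetical core name
    params={"q": query, "fl": "id,title,author", "rows": 5},
)
for doc in resp.json()["response"]["docs"]:
    print(doc.get("id"), doc.get("title"))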
Not all of the books from these collections that are absent from HathiTrust are such old rarities, however. Only six of the twenty missing works from the Library Company of Philadelphia sample, and no more than eight of twenty-two from the New York Society Library, were published before 1700, for example.12

Percentage of Works That Cannot Be Positively Identified

As can be seen in figures 1 through 3, some catalogs provided relatively full titles (figures 1 and 2), while others described the works in only two or three words each (figure 3). As might be expected, it is much easier to positively identify the works when fuller titles are provided, although two or three words proved to be enough to identify the work unambiguously the majority of the time. (All of the titles shown in figure 3 can be positively identified, for example.) In the samples taken from the nine catalogs, the percentage of titles that were unidentifiably ambiguous ranged from 0 percent (Library Company of Philadelphia, Mercantile Library of New York) to more than one in four (Apprentices’ Library of New York, 34.4 percent; Horsham Library Company, 27.9 percent). The Apprentices’ Library of New York and the Horsham Library Company were particularly problematic because they frequently omitted the name of the author in addition to greatly compressing the title; without an author name, titles such as Modern Geography (Apprentices’ Library) and History of Rome (Horsham Library Company) present far too many potential matches. However, even including the author’s name does not make all greatly compressed entries identifiable. One particularly egregious example comes from the Providence Library Company’s 1818 catalog, which contains an entry reading “Bell’s Inquiry.” The list of candidates for this work includes A Practical Inquiry into the Authority, Nature, and Design of the Lord’s Supper, by William Bell; An Inquiry into the Causes Which Produce, and the Means of Preventing Diseases Among British Officers, Soldiers, and Others in the West Indies, by John Bell; and Inquiry into the Policy and Justice of the Prohibition of the Use of Grain in Distilleries, by Archibald Bell.

Figure 4. New York Society Library, A Catalogue of the Books Belonging to the New-York Society Library (New York: printed by C. S. Van Winkle, 1813), 139.

Success Rates for the Parsing and Matching Scripts

When there was a single, identifiable work that matched the catalog entry, and that work was in HathiTrust, the matching scripts identified it roughly 70 percent of the time or better for every individual catalog. Unsurprisingly, catalogs such as those of the Horsham Library Company and the Apprentices’ Library of New York, whose entries were difficult to positively identify, were also more difficult for the script to match properly, although the matching script still succeeded between roughly 70 and 80 percent of the time. For two other libraries with below-average matching results (the Library Company of Philadelphia and the New York Society Library), many of the matching problems were caused by issues with the scanned catalogs that the data-cleaning scripts did not handle well.
The New York Society Library catalog listed out the contents of multivolume sets in a way that was difficult for the cleaning script to identify and remove (see figure 4); instead, it was common for each volume of the set to end up with its own entry in the dataset. Since the HathiTrust records generally do not list out the contents of each volume, it was very rare for the matching script to correctly match a set based on an entry for one volume in the set. Twenty-seven percent (six out of twenty-two) of the missed matches from that sample failed because of this table-of-contents issue.

For the Library Company of Philadelphia, the problem lies with a quirk in the hOCR: the character heights recorded for many of the horizontal-line dittoes are extremely large—around twenty pixels, when the text around those dittoes is typically around ten pixels high. It appears as if the OCR program may have treated each horizontal-line ditto as an em dash and assigned it a height that would be proportional for an em dash of that length. These extra-tall line heights for the first “word” on the line cause issues for the algorithm that processes the text line by line, causing some entries to be inappropriately divided into two entries in the data spreadsheets. Unsurprisingly, the matching script had difficulty identifying the correct work in HathiTrust when it was trying to match based on only half of the book’s title.

CONCLUSIONS

Although not a complete success, the results of this study provide hope that it might be possible to create full-text corpora based on the works held by individual libraries with minimal manual labor, with a few caveats. The first caveat is that the digitized catalogs of those libraries must meet certain specifications:

1) The catalog is formatted, and has been OCRed, in such a way that it is consistently possible to parse the catalog line by line and to identify algorithmically where each entry starts and ends.

2) The catalog provides at least the authors’ last names, if not their full names, plus a more-or-less complete and accurate transcription of the title proper.

3) Either the catalog contains minimal extraneous information (such as tables of contents or donors’ names), or the extraneous information is consistently formatted in a way that allows it to be algorithmically identified and removed.

The second caveat is that even if all of these conditions are met, the full-text corpora that can be created will probably still be missing some small percentage of the books available in that library. One potential direction for future research would be to examine more closely the books that are absent from HathiTrust to see whether there are any commonalities among them that might bias research done using these corpora, or whether the missing works can safely be treated as random omissions. On the other hand, as was noted above, the catalogs used in this study represent a likely worst-case scenario both for positively identifying the works listed in the catalogs and for those works being present in HathiTrust. Another promising avenue for future research would be to repeat this analysis on catalogs from the mid-to-late nineteenth century to see if the works in those catalogs are in fact more likely to exist in the HathiTrust corpus.
APPENDIX A: AMERICAN LIBRARY CATALOGS FROM 1776 TO 1825 INCLUDED IN HATHITRUST

Boston Library, Catalogue of Books in the Boston Library, June, 1824. Boston: Munroe and Francis, 1824, http://hdl.handle.net/2027/hvd.32044080249337.

General Society of Mechanics and Tradesman of the City of New York, Catalogue of the Apprentices’ Library, Instituted by the Society of Mechanics and Tradesman of the City of New-York, on the 25th November, 1820: With the Names of the Donors: To Which Is Added, an Address Delivered on the Opening of the Institution by Thomas R. Mercein, a Member of the Society. New York: printed by William A. Mercein, no. 93 Gold-Street, 1820, http://hdl.handle.net/2027/nnc2.ark:/13960/t8md1cv2t.

Horsham Library Company, The Constitution, Bye-Laws, and Catalogue of Books, of the Horsham Library Company. Philadelphia, PA: J. Rakestraw, 1810, http://hdl.handle.net/2027/nnc1.cu55910696.

Library Company of Baltimore, A Catalogue of the Books, &c. Belonging to the Library Company of Baltimore: To Which Are Prefixed the Act for the Incorporation of the Company, Their Constitution, Their By-Laws, and an Alphabetical List of the Members. Baltimore, MD: printed by Eades and Leakin, 1809, http://hdl.handle.net/2027/nyp.33433069263907.

Library Company of Baltimore, A Supplement to the Catalogue of Books, &c. Belonging to the Library Company of Baltimore. Baltimore, MD: printed by J. Robinson, 1816, http://hdl.handle.net/2027/nyp.33433069263899.

Library Company of Baltimore, A Supplement to the Catalogue of Books, &c. Belonging to the Library Company of Baltimore. Baltimore, MD: printed by J. Robinson, 1823, http://hdl.handle.net/2027/nyp.33433069263899.

Library Company of Philadelphia, A Catalogue of the Books Belonging to the Library Company of Philadelphia: To Which Is Prefixed, A Short Account of the Institution, with the Charter, Laws, and Regulations. Philadelphia, PA: Printed by Bartram & Reynolds, 1807, http://hdl.handle.net/2027/nyp.33433075914816.

Mercantile Library Association of the City of New York, Catalogue of the Books Belonging to the Mercantile Library Association of the City of New-York: To Which Are Prefixed, the Constitution and the Rules and Regulations of the Same. New York: printed by Hopkins & Morris, 1825, http://hdl.handle.net/2027/nyp.33433057517090.

New York Society Library, A Catalogue of the Books Belonging to the New-York Society Library. New York: printed by C. S. Van Winkle, 1813, http://hdl.handle.net/2027/mdp.39015023478822.

Providence Library Company, Charter and By Laws of the Providence Library Company, and a Catalogue of the Books of the Library. Providence, RI: printed by Miller and Hutchens, 1818, http://hdl.handle.net/2027/nyp.33433059555346.

Salem Athenaeum, Catalogue of the Books Belonging to the Salem Athenæum, with the By-Laws and Regulations. Salem, MA: Printed by Thomas C. Cushing, 1811, http://hdl.handle.net/2027/hvd.32044080252174.
Salem Athenaeum, Catalogue of the Books Belonging to the Salem Athenæum, with the By-Laws and Regulations. Salem, MA: Printed by W. Palfray, 1818, http://hdl.handle.net/2027/hvd.32044080252174.

Washington Library Company, Catalogue of Books in the Washington Library, July 20, 1822. Washington, DC: printed by Anderson and Meehan, 1822, http://hdl.handle.net/2027/chi.098498263.

REFERENCES

1 See, e.g., Jean-Baptiste Michel et al., “Quantitative Analysis of Culture Using Millions of Digitized Books,” Science 331, no. 6014 (January 2011): 176-82, https://doi.org/10.1126/science.1199644; Jean M. Twenge, W. Keith Campbell, and Brittany Gentile, “Male and Female Pronoun Use in U.S. Books Reflects Women’s Status, 1900-2008,” Sex Roles 67, nos. 9-10 (November 2012): 488-93, https://doi.org/10.1007/BF00287963; Patricia M. Greenfield, “The Changing Psychology of Culture from 1800 through 2000,” Psychological Science 24, no. 9 (2013): 1722-31, https://doi.org/10.1177/0956797613479387.

2 Eitan Adam Pechenick, Christopher M. Danforth, and Peter Sheridan Dodds, “Characterizing the Google Books Corpus: Strong Limits to Inferences of Socio-cultural and Linguistic Evolution,” PLOS One 10, no. 10 (October 7, 2015): e0137041, https://doi.org/10.1371/journal.pone.0137041; Alexander Koplenig, “The Impact of Lacking Metadata for the Measurement of Cultural and Linguistic Change Using the Google Ngram Data Sets—Reconstructing the Composition of the German Corpus in Times of WWII,” Digital Scholarship in the Humanities 32, no. 1 (April 2017): 169-88, https://doi.org/10.1093/llc/fqv037.

3 Pechenick et al., 2015; Lindsay DiCuirci, Colonial Revivals: The Nineteenth-Century Lives of Early American Books (Philadelphia: University of Pennsylvania Press, 2019).

4 Robert A. Gross, “Reconstructing Early American Libraries: Concord, Massachusetts, 1795-1850,” Proceedings of the American Antiquarian Society 97, no. 1 (January 1987): 331-451.

5 Jennifer Howard, “What Ever Happened to Google’s Effort to Scan Millions of University Library Books?,” EdSurge, August 10, 2017, https://www.edsurge.com/news/2017-08-10-what-happened-to-google-s-effort-to-scan-millions-of-university-library-books.

6 Book catalogs fell out of favor in the latter half of the nineteenth century as library collections became larger and more dynamic, making book catalogs much more difficult and expensive to compile and to keep up to date. By the end of the nineteenth century, book catalogs had largely been replaced by the card catalog system that remained in use through most of the twentieth century. Although card catalogs were far superior for their primary purposes—maintaining an inventory of books presently owned by the library and allowing library users to locate the books that they wanted—they leave no permanent record of the books listed in the catalog at any particular point in time.
7 Information about pdftotext can be found at https://manpages.debian.org/testing/poppler-utils/pdftotext.1.en.html.

8 The cleaning scripts, as well as data and other code used in this project, are available in https://github.com/julia-bauder/library-catalog-analysis-public.

9 The founding and incorporation dates were taken from the prefatory texts in the book catalogs used in this analysis, as listed in appendix A.

10 The scan of this catalog that is available from HathiTrust is missing pages 3-6.

11 Apache Solr is a widely used, open-source search platform. SolrMarc is a utility that can be used to index MARC records into Solr. VuFind is an open-source library discovery layer built in part on Solr and SolrMarc. For more information, see http://lucene.apache.org/solr/, https://github.com/solrmarc/solrmarc, and https://vufind.org/vufind/, respectively. The HathiTrust OAI feed is available at https://www.hathitrust.org/oai.

12 Five of the missing works from the New York Society Library sample were undated in the catalog.

11273 ---- Automated Storage & Retrieval System: From Storage to Service

Justin Kovalcik and Mike Villalobos

Justin Kovalcik (JDKovalcik@gmail.com) is Director of Library Information Technology, CSUN Oviatt Library. Mike Villalobos (Mike.Villalobos@csun.edu) is Guest Services Supervisor, CSUN Oviatt Library.

ABSTRACT

The California State University, Northridge (CSUN) Oviatt Library was the first library in the world to integrate an automated storage and retrieval system (AS/RS) into its operations. The AS/RS continues to provide efficient space management for the library. However, added value has been identified in materials security and inventory as well as customer service. The concept of library as space, paired with improved services and efficiencies, has resulted in the AS/RS becoming a critical component of library operations and future strategy. Staffing, service, and security opportunities, paired with support and maintenance challenges, enable the library to provide a unique critique and assessment of an AS/RS.
INTRODUCTION

“Space is a premium” is a phrase not unique to libraries; however, due to the inclusive and open environment promoted by libraries, their floor space is especially attractive to those within and outside of the building’s traditional walls. In many libraries, the majority of floor space is used to house the collection. In the past, as collections grew, floor space became increasingly limited. Faced with expanding expectations and demands, libraries struggled to find a balance between transforming space for new services and adding materials to a growing collection. In addition to management activities like weeding, other solutions such as offsite storage and compact shelving rose in popularity as methods to create library space in the absence of new building construction. Years later, as collections move away from print and physical materials, libraries are beginning to reexamine their buildings’ space and envision new features and services. “Now that so many library holdings are accessible digitally, academic libraries have the opportunity to make use of their physical space in new and innovative ways.”1

The CSUN Oviatt Library took a novel approach and launched the world’s first automated storage and retrieval system (AS/RS) in 1991 as a storage solution to resolve its building space limitations. The project was a California State University (CSU) System Chancellor’s Office initiative that began in 1989 and cost more than $2 million to implement. The original concept “came from the warehousing industry, where it had been used by business enterprises for years.”2 By storing physical materials in the AS/RS, the CSUN Oviatt Library is able to create space within the library for new activities and services. “Instead of simply storing information materials, the library space can and should evolve to meet current academic needs by transforming into an environment that encourages collaborative work.”3

Unfortunately, as the first steward of an AS/RS, CSUN made decisions that led to mismanagement and neglect, leaving the AS/RS facing many challenges in becoming a stable and reliable component of the library. However, recent efforts to resolve these issues have resulted in updated systems and improved management and functionality. Whereas in the past low-use materials were placed in the AS/RS to create space for new materials, now materials are moved into the AS/RS to create space for patrons, to secure collections, and to improve customer service. As part of this critical review, the functionality and maintenance along with the historical and current management of the AS/RS will be examined.

BACKGROUND

CSUN is the second-largest member of the twenty-three-campus CSU system. The diverse university community includes over 38,000 students and more than 4,000 employees.4 Consisting of nine colleges offering 60 baccalaureate degrees, 41 master’s degrees, 28 credentials in education, and various extended learning and special programs, CSUN provides its community with numerous opportunities for scholarly success.5

The CSUN Oviatt Library’s AS/RS is an imposing and impressive area of the library that routinely attracts onlookers and has become part of the campus tour.
The AS/RS is housed in the library’s east wing and occupies an area that is 8,000 square feet and 40 feet high, arranged into six aisles. The 13,260 steel bins, each 2 feet by 4 feet, in heights of 6, 10, 12, 15, and 18 inches, are stored on both sides of the aisles, enabling the AS/RS to store an estimated 1.2 million items.6 Each aisle has a storage retrieval machine (SRM) that performs automatic, semiautomatic, and manual “picks” and “deposits” of the bins.7

The AS/RS was assessed in 2014 as responsibilities, support, and expectations of the system shifted and previous configurations were no longer viable. Discontinued and failing equipment, unsupported server software, inconsistent training and use, and decreased local support and management were identified as impediments to greater involvement in library projects and operations. Campus provided funding in 2015 to update the server software as well as major hardware components on three of the six aisles. The work was divided into two phases: the server software upgrade was completed in May 2017, followed by the hardware upgrade in January 2019.8

LITERATURE REVIEW

The continued growth of student, faculty, and academic programs, along with evolving expectations and needs since the late 1980s, has required the library to analyze its services and examine the building’s physical space and storage capacity. In the late 1980s, identifying space for an increasing number of printed materials was the main factor in implementing the AS/RS. In the mid-2010s, creating space within the library for new services depended on a stable and reliable AS/RS. “The conventional way of solving the space problem by adding new buildings and off-site storage facilities was untenable.”9 As Creaghe and Davis predicted in 1986, a benefit of an AS/RS is that, given “the probable slow transition from books to electronic media, an AAF [Automated Access Facility] may postpone the need for future library construction indefinitely.”10

The AS/RS has enabled the library to create space by removing physical materials while enhancing customer service, material security, and inventory control. “The role of the library as service has been evolving in lockstep with user needs. The current transformative process that takes place in academia has a powerful impact on at least two functional areas of the library: library as space and library as collection.”11 In addition, the “increased security the AAF … offers will save patrons time that would be spent looking for books on the open shelves that may be in use in the library, on the waiting shelves, misplaced, or missing.”12

In subsequent years, library services have evolved to include computer labs with multiple high-use printers/scanners/copiers, instructional spaces, individual and group study spaces, makerspaces, etc., in addition to campus entities that have required large amounts of physical space within the library. “It is well-known that academic libraries have storage problems. Traditional remedies for this situation—used in libraries across the nation—include off-site storage for less used volumes, as well as, more recently, innovative compact shelving. These solutions help, but each has its disadvantages, and both are far from ideal. . . .
When the Eastern Michigan University Library had the opportunity to move into a new building, we saw that an AS/RS system would enable us to gain open space for activities such as computer labs, training rooms, a cafe, meeting rooms, and seating for students studying.”13 The AS/RS provides all the space advantages of off-site storage and compact shelving while adding much more value, mitigating the time delays of off-site storage and the confusion of accessing and using compact shelving.

STAFFING & USAGE

1991–1994

Following the 80/20 principle, low-use items were initially selected for storage in the AS/RS. “When the storage policy was being developed in [the] 1990s, the 80/20 principle was firmly espoused by librarians. . . . Thus, by moving lower-use materials to AS/RS, the library could still ensure that more than 80% of the use of the materials occurs on volumes available in the open stacks.”14 Items were identified as low-use if one of the following three conditions was met: (1) the item’s last circulation date was more than five years ago; (2) the item was a non-circulating periodical; or (3) the item belonged to a collection, such as reference, that was not designed to leave an area and received little patron use. In 1991, the AS/RS was loaded with 800,000 low-use items and went live for the first time later that year.

Staffing for the initial AS/RS department consisted of one full-time AS/RS supervisor (40 hours/week), one part-time AS/RS repair technician (20 hours/week), and 40 hours a week of dedicated student employees, for a total of 100 hours a week of dedicated AS/RS management. The AS/RS was largely utilized as a specialized service for internal library operations, with limited patron-initiated requests. AS/RS procedures were customized to each operator and to the task being performed. Skills were developed internally, with knowledge and training shared by word of mouth or accompanied by limited documentation.

2000–Mid-2000s

The AS/RS department functioned in this manner until the 1994 Northridge earthquake struck the campus directly and required partial reconstruction of the library building. Although there was no damage to the AS/RS itself or its surrounding structure, extensive damage occurred in the wings of the library. The damage resulted in the library building being closed and inaccessible. When the library reopened in 2000, it was determined that, due to the AS/RS’s previously low usage, a dedicated department was no longer warranted. The AS/RS supervisor position was dissolved, the student employee budget was eliminated, and the AS/RS technician position was not replaced after the employee retired in 2008. AS/RS operational responsibilities were consolidated into the Circulation Department and AS/RS administration into the Systems Department. Both circulation and systems departments redefined their roles and responsibilities to include the AS/RS without additional budgetary funding, staffing, or training. In order for AS/RS operations to be absorbed by these departments, changes had to occur in the administration, operating procedures, staffing assignments, and access to the AS/RS. All five Circulation staff members and twenty student employees received informal training from members of the former AS/RS department in the daily operations of the AS/RS.
The Circulation members also received additional training for first-tier troubleshooting of AS/RS operations, such as bin alignments, emergency stops, and inventory audits. The AS/RS repair technician remained in the Systems Department; however, AS/RS troubleshooting responsibility was shared among the Systems support specialists, and dedicated AS/RS support was lost. The administrative tasks of scheduling preventive maintenance services (PMs), resolving AS/RS hardware and equipment issues with the vendor, and maintaining the server software remained with the head of the Systems Department.

Without a dedicated department providing oversight for the AS/RS, issues and problems began to occur frequently. Circulation had neither the training nor the resources available to master procedures or enforce quality-control measures. Similarly, the Systems Department became increasingly removed from daily operations. Many issues were not reported at all and came to be viewed as system quirks that required workarounds, or as limitations of the system. For issues that were reported, troubleshooting had to start from scratch each time, and Systems relied on Circulation staff being able to replicate the issue in order to demonstrate the problem. Systems personnel retained little knowledge of daily operations, and troubleshooting became more complex and problematic as different operators had different levels of knowledge and skill to accompany their unique procedures.

Mid-2000s–2015

These issues became further exacerbated when areas outside of Circulation were given full access to the AS/RS in the mid-2000s. Employees from different departments of the library began entering and accessing the AS/RS area and operated the AS/RS based on knowledge and skills they had learned informally. Student assistants from these other departments also began accessing the area and performing tasks on behalf of their informally trained supervisors. Further, without access control, employees as well as students ventured into the “PIT” area of the AS/RS, where the SRMs move and end-of-aisle operations occur. This area contains many hazards and is unsafe without proper training.

During this period, the Special Collections and Archives (SC/A) Department loaded thousands of uncataloged, high-use items into the AS/RS that required specialized service from Circulation. These items were categorized as “Non-Library of Congress,” and inventory records were entered into the AS/RS software manually by various library employees. In addition, paper copies were created and maintained as an independent inventory by SC/A. Over the years, the SC/A paper inventory copies were found to be insufficiently labeled, misidentified, or missing. Therefore, the AS/RS software inventory database and the SC/A paper-copy inventory contained conflicts that could not be reconciled. To resolve this situation, an audit of SC/A materials was completed in spring 2019 to locate inventory that was thought to be missing.

All bound journals and current periodicals were eventually loaded into the AS/RS as well, causing other departments and areas to rely on the AS/RS more heavily. Departments such as Interlibrary Loan and Reserves, as well as patrons, began requesting materials stored in the AS/RS more routinely and frequently. The AS/RS transformed from a storage space with limited usage to an active area with simultaneous usage requests of different types throughout the day.
Without a dedicated staff to organize, troubleshoot, and provide quality control, there was an abundance of errors that led to long waits for materials, interdepartmental conflicts, and unresolved problems. High-use materials from SC/A, as well as currently received periodicals from the main collection, were the catalysts that drove, and eventually warranted, the change in the AS/RS usage model from storage to service. The inclusion of these materials created new primary customers in the form of internal library departments: SC/A and Interlibrary Loan (ILL). With over 4,000 materials contained in the AS/RS, SC/A requires prompt service for processing archival material into the AS/RS and for filling specialized patron requests for these materials. In addition, ILL processes over 500 periodical requests per month that utilize and depend on AS/RS services. The additional storage and requests created an uptick in overall AS/RS utilization that carried over into Circulation Desk operations as well.

2015–Present

The move from storage to service was not only inevitable, given the evolving AS/RS inventory, but necessary in order to regain quality control and manage the library-wide projects that involved the AS/RS. The increased usage of, and reliance on, the AS/RS required that the system be well maintained and managed. Administration of the AS/RS remains within Systems, and Circulation student employees continue to provide supervised assistance. The crucial change was the creation within Circulation of a dedicated operations and project manager: an AS/RS lead position with responsibility for the daily operations and management of the system and service. This was not, however, a complete return to the original staffing concept of the early 1990s; the new position focuses on project management as well as system operations, rather than on system operations alone. The AS/RS lead is the point of contact for all library projects that utilize the AS/RS, for relaying AS/RS issues or concerns to Systems, and for daily AS/RS usage. This shift was necessary due to the increased demand on and reliance upon the system, which changed its charge from storage to service.

CUSTOMER SERVICE

The library noted over time that the AS/RS could be used as a tool in weeding and other collection-shift projects to create space and aid in reorganizing materials. As more high-use materials were loaded into the AS/RS, its indirect advantages became more apparent. Patrons request materials stored within the AS/RS through the library’s website and pick up the materials at the Circulation desk. There is no need for patrons to navigate the library, successfully use the classification system, and search shelves to locate an item that may or may not be there. As Kirsch notes, “The ability to request items electronically and pick them up within minutes eliminates the user’s frustration at searching the aisles and floors of an unfamiliar library.”15 The vast majority of library patrons are CSUN students who commute and must make the best use of their time while on campus. Housing items in the AS/RS creates the opportunity to have hundreds of thousands of items all picked up and returned at one central location. This makes it far easier for library patrons, especially users with mobility challenges, to engage with a plethora of library materials.
The time allotted for library research and/or enjoyment becomes more productive as patrons’ desired materials are delivered within minutes of their arriving in the building. As Heinrich and Willis state, “the provision of the nimble, just-in-time collection becomes paramount, and the demand for AS/RS increases exponentially.”16 AS/RS items are more readily available than shelved items on the floor, as it takes only minutes for AS/RS items to be returned and made available once again. “They may be lost, stolen, misshelved, or simply still on their way back to the shelves from circulation—we actually have no way of knowing where they are without a lengthy manual search process, which may take days. . . . Unlike books on the open shelves, returned storage books are immediately and easily ‘reshelved’ and quickly available again.”17

Another advantage is that there is no need to keep materials in call-number order or to contend with the unpleasant reality of missing and misshelved items. Items in the AS/RS are assigned bin locations that can only be accessed by an operator- or user-initiated request. The workflow required to remove a material from the AS/RS involves multiple scans and procedures that create a level of accountability that does not exist for items stored on floor shelves. Further, users are assured of an item’s availability within the system. Storing materials in the AS/RS ensures that items are always checked out when they leave the library, rather than sitting unaccounted for in library offices and processing areas. It also avoids the patron frustration of misshelved, recently checked-out, or missing items.

SECURITY

The decision to follow the 80/20 principle and place low-use items in the AS/RS meant that high-use items remained freely available to library patrons on the open shelves of each floor. This resulted in high-use items being available for patron browsing and checkout, as well as for patron misuse and theft. The sole means of securing these high-use items involved tattle-tape and security gates installed at the main entrance, which in turn required the development of policies and procedures for enforcement. Beyond the inherent cost and maintenance, and the issue of ensuring items are sensitized and desensitized correctly, gate enforcement became another burden that rested upon the Circulation Department. Even assuming theft would occur by passing through the gates at the main entrance of the library, enforcement is limited in the actions that library employees may perform: touching, impeding the path of, following, detaining, or searching library patrons are restricted actions reserved for campus authorities such as the police.

Rather than attempting to enforce a security mechanism over which we have no authority, the AS/RS provides an alternative for the security of high-use and valuable materials. Storing items in the AS/RS eliminates the possibility of theft or damage by visitors and places control and accountability over the internal use of materials. “There would be far fewer instances of mutilation and fewer missing items.”18 Further, access to the AS/RS area was restricted from all library personnel to only Circulation and Systems employees, with limited exceptions. Individual logins also provided a method of control and accountability, as each operator is required to use a personal account rather than a departmental account to perform actions on the AS/RS. Materials stored in the AS/RS are, “more significantly . . .
safer from theft and vandalism.”19

INVENTORY

Conducting a full inventory of a library collection is time consuming, expensive, and often inaccurate by the time of completion. Missing or lost items, shelf-reading projects, in-process items, etc., create overhead for library employees and generate frustration for patrons searching for an item. Massive, library-wide projects such as collection shifts and weeding are common endeavors undertaken to create space, remove outdated materials, and improve collection efficiency. However, such actions taken on an open-shelf collection are time consuming, costly, and inefficient, and they affect patron activities. These projects typically involve months of work from multiple departments. Items stored within the AS/RS do not present these challenges, because the system is managed by a full-time employee throughout the year rather than on a project basis, is capable of performing inventory audits, and does not affect public services. Therefore, while the cost of managing an item on an open shelf is $0.079 per year, the cost of storing the same item in the AS/RS is $0.02 per year.20

Routine and spot audits ensure an accurate inventory, confirm the capacity level of the system, and establish best management of the bins. AS/RS inventory audits are highly accurate and much more efficient than shelf reading, with little impact on patron services. “While this takes some staff time, it is far less time-consuming than shelf reading or searching for misshelved books.”21

Storing materials in the AS/RS is more efficient than storing them on open shelves; however, bin management is essential to ensuring that bins are configured in the best arrangement to achieve optimal efficiency. The size and configuration of bins directly affect storage capacity. The type of storage, random or dedicated, also influences the capacity, efficiency, and accessibility of items. The 13,260 steel bins in the AS/RS range in height from 6 to 18 inches. The most commonly used bins are the 10- and 12-inch bins; however, there is a finite number of bins at these heights. Unfortunately, the smallest and largest bins are rarely used, due to material sizes and weight capacity; therefore, the AS/RS’s optimal capacity is unattainable, and the number of materials eligible for loading is limited by the number of suitable bins available. The library also determined that dedicated, rather than random, bin storage aided in locating specialized materials, reduced loading and retrieval errors, and enhanced accessibility by assigning highly used bins to reachable locations. In the event that an SRM breaks down and an aisle becomes nonfunctional for retrieving bins, strategically placing the highest-use and specialized locations in bins that can be manually pulled is a proactive strategy. This, however, requires dedicated bins with an accurate and known inventory that has been arranged in accessible locations.
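The per-item costs cited above follow from a simple unit-cost calculation: the annual student-labor budget divided by the number of items managed, using the figures given in note 20. A minimal sketch of that arithmetic:

# Annual per-item management cost = student-labor budget / items managed
# (figures from note 20 of this article).
open_shelf_cost = 31_500 / 400_000  # about $0.079 per item per year on open shelves
asrs_cost = 18_000 / 900_000        # $0.02 per item per year in the AS/RS
print(f"open shelves: ${open_shelf_cost:.3f}; AS/RS: ${asrs_cost:.3f}")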
The AS/RS in contrast consisted of over 800,000 items and successfully sustained the brunt of the earthquake’s impact with no damage to any of the stored items. Unfortunately. the materials that had been loaded into the AS/RS in 1991 were low-use items that were viewed as one step from weeding. Therefore, high-use items stored in open shelves were damaged and required the long process of recovery and reconstruction: identifying and cataloging damaged and undamaged materials, disposal of those damaged, renovation of the area, and purchase of new items. The low-use items stored in the AS/RS by contrast required a few bins that had slightly shifted be pushed back fully into their slots. AS/RS items have proven to be more secure from misplacement, theft, and physical damage from earthquakes as compared to items in open shelves. Maintenance, Support, and Modernization The CSUN Oviatt Library has received two major updates to the AS/RS since it was installed in 1991. In 2011, the AS/RS received updates for communication and positioning components. The second major update occurred in two phases between 2016 and 2018 and focused on software and equipment. In phase one, server and client-side software was updated from the original software created in 1989. In phase two, half the SRMs received new motors, drives, and controllers. Due to the many years of reliance on preventive maintenance (PM) visits and avoidance of modernization, our vendors were unable to provide support for the AS/RS software and had difficulty locating equipment that had become obsolete. Preventive maintenance visits were used to maintain the status quo and are not a long-term strategy for maintaining a large investment and critical component of business operations. Creaghe and Davis note that, “current industrial facility managers report that with a proper AAF [Automated Access Facility] maintenance program, it is realistic to expect the system to be up 95- 98 percent of the time.”22 PM service is essential for long-term AS/RS success; however, preventive maintenance alone is incapable of modernization and ensuring equipment and software do not become obsolete. Maintenance is not the same as support, rather maintenance is an aspect of support. Support includes points of contacts who are available for troubleshooting, spare supplies on hand for quick repairs, a life-cycle strategy for major components, and long- term planning and budgeting. Kirsch attested the following describing Eastern Michigan University’s strategy: “Although the dean is proud and excited about this technology, he acknowledges that just like any computerized technology, when it’s down, it’s down. ” To avoid system problems, EMU bought a twenty-year supply of major spare parts and employs the equivalent of one-and-a-half full-time workers to care for its automated storage and retrieval system.”23 A system that relies solely on preventive maintenance will quickly become obsolete and require large and expensive projects in the future if the system is to continue functioning. Further, modernization provides an avenue for new features and functions to be realized that increase functionality and efficiency. Networking The CSUN Oviatt Library on average receives between three to four visits a year along with multiple emails and phone conversations requesting information from different libraries regarding the AS/RS. These conversations aid the library by viewing the AS/RS in different perspectives and forces the library to review current practices. 
The library has learned through speaking with many different libraries that the needs, design, and configuration of an AS/RS can be as unique as the libraries inquiring. The CSUN Oviatt Library, for example, is much different from the three other CSU system libraries that have an AS/RS. Because our system was outdated, it has been difficult to form or establish meaningful groups or to share information, since the systems are all different from one another. As more conversations occur and systems become more modern and standardized, there is potential for knowledge sharing as well as for group lobbying efforts on features and pricing.

Buy In

User confidence in any system is required in order for that system to be successful. Convincing a user base to accept moving materials from readily available open shelves into steel bins housed within inaccessible, 40-foot-high aisles will be difficult if the system is consistently down. Therefore, the better the AS/RS is managed and supported, the more reliable and dependable the system will be, and the more likely user confidence will grow. Informing stakeholders of long-term planning and welcoming feedback demonstrate that the system is being supported and managed with an ongoing strategy that is part of future library operations. Similarly, administrators need confirmation that large investments and mission-critical services are stable, reliable, and efficient. Creating a new line item in the budget for AS/RS support and equipment life-cycle replacement requires justification along with a firm understanding of the system. In addition, staffing and organizational responsibilities must be reviewed in order to establish an environment that is successful and efficient. Continuous assessment of the AS/RS regarding downtime, projects involved, and services and efficiencies provided helps illustrate the importance and impact of the system on library operations as a whole.

Recording Usage and Statistics

Unfortunately, usage statistics were not recorded for the AS/RS prior to June 2017. Therefore, data are unavailable to analyze previous system usage, maintenance, downtime, or project involvement. Data-driven decisions require the collection of statistics for system analysis and assessment. Following the server software and hardware updates, efforts have been taken to record project statistics, inventory audits, and SRM faults, as well as public and internal paging requests.

CONCLUSION

The AS/RS remains, as Heinrich and Willis described it, “a time-tested innovation.”24 Through lessons learned and objective assessment, the library is positioning the AS/RS to be a critical component of future development and strategy. By expanding the role of the AS/RS to include functions beyond low-use storage, the library discovered efficiencies in material security, customer service, inventory accountability, and strategic planning. The CSUN Oviatt Library has learned, experienced, and adjusted its perception, treatment, and usage of the AS/RS over the past thirty years. Easily overlooked factors such as access to the area, staffing, and inventory auditing, as well as potential functions such as material security and customer service, may not be identified without ongoing analysis and assessment. Critical review, free of a limited or biased perception, has enabled the library to realize the greater functionality the AS/RS is able to provide.
NOTES

1 Shira Atkinson and Kirsten Lee, “Design and Implementation of a Study Room Reservation System: Lessons from a Pilot Program Using Google Calendar,” College & Research Libraries 79, no. 7 (2018): 916-30, https://doi.org/10.5860/crl.79.7.916.

2 Helen Heinrich and Eric Willis, “Automated Storage and Retrieval System: A Time-Tested Innovation,” Library Management 35, no. 6/7 (August 5, 2014): 444-53, https://doi.org/10.1108/LM-09-2013-0086.

3 Atkinson and Lee, “Design and Implementation of a Study Room Reservation System,” 916-30.

4 “About CSUN,” California State University, Northridge, February 2, 2019, https://www.csun.edu/about-csun.

5 “Colleges,” California State University, Northridge, May 8, 2019, https://www.csun.edu/academic-affairs/colleges.

6 Estimated AS/RS capacity was calculated by determining the average size and weight of an item for each size of bin, along with the most common bin layout. The average item was then used to determine how many items could be stored along the width and length (and, if appropriate, height) of the bin, and the results were multiplied. Many factors affect the overall capacity, including bin layout (with or without dividers), stored item type (book, box, records, etc.), weight of the items, and operator determination of full, partial, or empty bin designation. The AS/RS mini-loaders have a weight limit of 450 pounds, including the weight of the bin.

7 “Automated Storage and Retrieval System (AS/RS),” CSUN Oviatt Library, https://library.csun.edu/About/ASRS.

8 “Automated Storage and Retrieval System (AS/RS),” CSUN Oviatt Library, https://library.csun.edu/About/ASRS.

9 Heinrich and Willis, “Automated Storage and Retrieval System,” 444-53.

10 Norma S. Creaghe and Douglas A. Davis, “Hard Copy in Transition: An Automated Storage and Retrieval Facility for Low-Use Library Materials,” College & Research Libraries 47, no. 5 (September 1986): 495-99, https://doi.org/10.5860/crl_47_05_495.

11 Heinrich and Willis, “Automated Storage and Retrieval System,” 444-53.

12 Creaghe and Davis, “Hard Copy in Transition,” 495-99.

13 Linda Shirato, Sarah Cogan, and Sandra Yee, “The Impact of an Automated Storage and Retrieval System on Public Services,” Reference Services Review 29, no. 3 (September 2001): 253-61, https://doi.org/10.1108/eum0000000006545.

14 Heinrich and Willis, “Automated Storage and Retrieval System,” 444-53.

15 Sarah E. Kirsch, “Automated Storage and Retrieval—The Next Generation: How Northridge’s Success Is Spurring a Revolution in Library Storage and Circulation,” paper presented at the ACRL 9th National Conference, Detroit, Michigan, April 8-11, 1999, http://www.ala.org/acrl/sites/ala.org.acrl/files/content/conferences/pdf/kirsch99.pdf.

16 Heinrich and Willis, “Automated Storage and Retrieval System,” 444-53.

17 Shirato, Cogan, and Yee, “The Impact of an Automated Storage and Retrieval System,” 253-61.

18 Kirsch, “Automated Storage and Retrieval.”

19 Shirato, Cogan, and Yee, “The Impact of an Automated Storage and Retrieval System,” 253-61.
20 Cost of material management was calculated by removing building operational costs (lighting, HVAC, carpet, accessibility/open hours, etc.) and focusing on the management of the material instead. The management cost of materials (or unit cost) is determined by dividing the total fixed and variable costs by the total number of units: dividing the $31,500 annual shelving student budget by 400,000 items yields $0.079 per material per year on open shelves, while dividing the $18,000 annual AS/RS student budget by 900,000 items yields $0.02 per material per year in the AS/RS.

21 Shirato, Cogan, and Yee, “The Impact of an Automated Storage and Retrieval System,” 253-61.

22 Creaghe and Davis, “Hard Copy in Transition,” 495-99.

23 Kirsch, “Automated Storage and Retrieval.”

24 Heinrich and Willis, “Automated Storage and Retrieval System,” 444-53.

11369 ---- Virtual Reality: A Survey of Use at an Academic Library

Megan Frost, Michael Goates, Sarah Cheng, and Jed Johnston

Megan Frost (megan@byu.edu) is Physiological Sciences Librarian, Brigham Young University. Michael Goates (michael_goates@byu.edu) is Life Sciences Librarian, Brigham Young University. Sarah Cheng is an undergraduate student, Brigham Young University. Jed Johnston (jed_johnston@byu.edu) is Innovation Lab Manager, Brigham Young University.

ABSTRACT

We conducted a survey to inform the expansion of a virtual reality (VR) service in our library. The survey assessed user experience, demographics, academic interests in VR, and methods of discovery. Currently our institution offers one HTC VIVE VR system that can be reserved and used by patrons within the library, but we would like to expand the service to meet the interests and needs of our patrons. We found use among all measured demographics and sufficient patron interest for us to justify expansion of our current services. The data resulting from this survey and the subsequent focus groups can be used to inform other academic libraries exploring or developing similar VR services.

INTRODUCTION

Virtual reality (VR) is commonly defined as an experience in which a user remains physically within their real world while entering a virtual world (comprising three-dimensional objects) using a headset with a computer or a mobile device.1 VR is part of a spectrum of related technologies ranging from mostly real experiences to completely virtual experiences, such as augmented reality, augmented virtuality, and mixed reality.2 Extended reality (XR) is a term often used when describing these technologies as a whole. Many different XR devices and services are available in academic libraries. The most popular XR devices used in libraries are the HTC VIVE, the Oculus Rift by Facebook, and Google Cardboard.3 Other common XR devices include GearVR by Samsung and PlayStation Virtual Reality by Sony.4 The HTC VIVE and Oculus Rift are technologies that provide an immersive virtual-reality experience.
Google Cardboard provides both non-immersive virtual reality and augmented reality experiences, while mixed reality is provided through various technologies such as Microsoft's HoloLens and mixed-reality headsets from HP, Acer, and Magic Leap. In addition, many academic libraries are using augmented reality apps that can be downloaded on patrons' personal mobile devices.5

Academic libraries are starting to offer various XR services to increase engagement with patrons and teach information literacy.6 Despite the increase in XR service offerings, there is little consistency in the devices used or in how these services are developed at academic libraries, and there is substantial variation in the types of services offered. For example, some libraries make VR headsets available for in-house activities, such as storytelling, virtual travel, virtual gaming, and the development of new skills.7 Other libraries, notably Ryerson University Library and Archives in Toronto, let students and faculty borrow their Oculus Rift headsets for two or three days at a time.8 Some university libraries lend out headsets or 360-degree cameras or provide a virtual-reality space for students to develop content.9 The University of Utah Library offers an open-door, drop-in VR workshop once a week.10 Claude Moore Health Sciences Library at the University of Virginia implemented a project that educated its students and staff on the uses of VR in the health field through a combination of large-group demonstrations, one-on-one consultations, and workshops.11

The XR field is developing quickly, and XR services have the potential to benefit students academically. Some universities are already offering classes on VR platforms.12 This is particularly true in fields that are high risk or potentially discomforting. For example, students in medical fields benefit by practicing virtually before attempting surgery on a human body.13 In addition to potential surgical benefits, the University of New England has been utilizing XR technology to teach empathy to its medical and other health profession students by putting the learner in the place of their patients.14 Other examples of XR usage in the health fields include a recent attempt to introduce VR in anatomic pathology education and the use of virtual operating rooms to train nurses and educate the public.15 One recent study measured the effectiveness of using VR platforms in engineering education and found a dramatic improvement in student performance.16 Many educational institutions outside the university setting have also started exploring how XR could be used to enhance students' educational experience.
This technology has already progressed from being considered a novelty to being an established tool to engage learners.17 One of the perceived benefits of XR use in public libraries by both library patrons and staff is the ability of XR technology to inspire curiosity and a desire to learn.18 In some school programs, students are able to advance their learning through XR apps that allow them not only to absorb information but also to experience what they are learning through hands-on activities and full immersion without danger (e.g., hazardous science experiments) or high cost (e.g., traveling to another country).19 XR has the potential to increase the overall engagement of students, which, according to Carini, Kuh, and Klein's 2006 study, is correlated with how well students learn.20 XR has the ability to capture the attention of students and eliminate distractions. This is particularly true for students with attention deficit disorder, anxiety disorders, or impulse-control disorder.21

The application of XR goes beyond traditional classroom settings. A case study assessing the benefits of VR in American football training found that players showed an overall improvement of 30 percent after experiencing game plays created by their coaches in a virtual environment.22 Although these studies were not conducted in an academic library or university setting, their results are transferable. It is beneficial to academic libraries to provide technologies to their patrons that enhance and advance their learning. Currently, XR apps available for purchase on the Google App Store are still limited. Most app development comes from private companies; however, some universities are giving their students the opportunity to develop XR content.23

OBJECTIVES
At Brigham Young University, we want our VR services to foster the link between academic achievement and virtual reality. In order to do this effectively, our first objective is to determine which VR services will be of most benefit to our patrons. To inform the expansion of future VR services, we conducted a survey of patrons using current VR services in the library. This survey is also intended to help other libraries that are developing VR services and potentially developers interested in creating academic content for students. We were primarily interested in user experience, demographics, academic interests in VR, and methods of discovery.

METHODS
During one semester, January through April 2018, we asked individuals to complete a questionnaire following their use of the library's HTC VIVE system. This questionnaire was administered through an online Qualtrics survey that was distributed via email to patrons after using the library's VR equipment. It consisted of thirteen questions that gathered basic demographic information as well as information on patron interests and experiences with the library's VR services. The complete survey used in this study can be found in Appendix A.

Currently the Harold B. Lee Library at Brigham Young University offers one HTC VIVE VR system that can be used on site in the science and engineering area of the library. It is primarily operated by student employees who work at the science and engineering reference desk. Time slots are reserved through an online registration system on the library's website.
In order to gather more in-depth, qualitative data on patron experience with the library's VR services, we also conducted a focus group with VR users. We recruited participants by adding a question at the end of the Qualtrics survey asking whether the responder would be interested in participating in a focus group. All focus group participants received a complimentary lunch. During the focus group, we asked a series of five questions to gain a deeper understanding of users' VR experience at the library. In particular, we asked participants to explain what went well during their VR experience in the library, what difficulties they experienced, how they envisioned using VR for both academic and extracurricular purposes, and what type of VR content (e.g., software or equipment) they would like the library to acquire. The focus group facilitator asked follow-up questions for clarification as needed. The session was audio recorded, and participant responses were transcribed and coded for themes.

RESULTS AND DISCUSSION
Demographics
The most frequent users of the VR equipment in the library were male students in the science, technology, engineering, or mathematics (STEM) disciplines. The percentage of male students at Brigham Young University is roughly 50 percent, but over 70 percent of our survey respondents were male. That said, there was considerable use among all measured demographics, as shown in figure 1. Over one third of responders were not students. University faculty made up 11 percent of responders during the survey period. The proportion of faculty who responded was higher than the university's faculty-to-student ratio and likely the result of directly advertising the service to non-student university employees. Because some users informed librarians that they had brought spouses and children to use the equipment, we estimate that the 7 percent of responders who were neither students nor university employees mostly consisted of family or friends accompanying students or employees. Over one third of student responders were majoring in disciplines outside of science, technology, engineering, and mathematics. This proportion is small when compared with the share of students in non-STEM majors across campus (approximately 63 percent of students on campus are not majoring in STEM disciplines); however, it demonstrates that there is an interest in VR technology throughout the university. As the VR services are located in the science and engineering area of the library, it is not surprising that more students majoring in these disciplines used these services when compared to students majoring in other disciplines. In fact, 15 percent of responders learned about the services at the reference desk, where they could see other patrons using the VR equipment. The most common discovery method, however, was the various forms of advertisements targeted to both students and employees of Brigham Young University, as shown in figure 2.

Figure 1. Demographics.

Figure 2. Most effective discovery methods: advertisement and word-of-mouth.

Only 7 percent of responders identified research or class assignments as their primary reason for using the services. The large majority of use, as shown in figure 3, was simply for entertainment or fun.
This was not unexpected, especially as most of the users were trying the technology for the first time (see figure 4). However, because we purchased the equipment with the intent to support academic pursuits on campus, we hoped to see a higher percentage of academic use.

Figure 3. Most responders came because it sounded fun.

Figure 4. Most responders were first-time users.

Faculty use was higher than expected (see figure 5). Eleven percent of users during our survey period were faculty. The majority of these responders indicated an interest in potentially using VR technology with their students (see figure 6). While this interest was positive, faculty member suggestions for classroom use remained hypothetical, without any concrete intentions for implementation. This suggests that although faculty interest exists, faculty may need to be informed of specific application ideas in order to be more likely to incorporate this technology into their courses.

Figure 5. Faculty were interested in trying the VR equipment.

Figure 6. Faculty were interested in using VR academically.

A clear majority (72 percent) indicated an intention of returning to the library to use the service again (see figure 7).

Figure 7. Most responders intend to return.

Because our VR services were a small pilot program at the time of the survey, we did not offer a large number of paid apps to users. Table 1 displays the most common apps used by survey responders. Most users tried Google Earth during their session, and employees at the reference desk often recommended this app to new users. Another common app for new users was The Lab, which includes a few small games showcasing the current capabilities of VR. Google Tiltbrush is an app for creating 3D art. Virtual Jerusalem is an app that was created by faculty at Brigham Young University and allows users to walk around and explore the Jerusalem Temple Mount during the time of Christ. The fifth-most-used app we offered was 3D Organon VR Anatomy, which teaches human anatomy.

1. Google Earth
2. The Lab
3. Tiltbrush by Google
4. Virtual Jerusalem
5. 3D Organon VR Anatomy

Table 1. Top five apps used.

Focus Group Data
We conducted a total of three focus group sessions. Each session included between five and eight participants, for a total of twenty-one focus group participants. Because we were primarily interested in student responses, we limited focus group participants to students enrolled at Brigham Young University. The participants were asked to describe what did or did not go well during their VR session. When describing what went well during their VR session, many participants responded with positive comments about the quality of service the library employees provided during their session. Most participants expressed satisfaction with the number and quality of the apps provided by the library. During all three focus groups, participants mentioned that they liked how easy it was to sign up for the VR services.
The most common problems reported by participants related to health or safety concerns, such as feeling dizzy, bumping into objects in the room because of the lack of space, and tripping over the headset wire. Others reported problems related to the level of personal or social comfort with the VR services, such as feeling self-conscious using VR in a semi-open space not exclusively devoted to VR services or being told to be quieter. When asked about ways the library could improve its VR services, the students suggested solutions to many of these problems. A frequent recommendation was that the library dedicate a space to VR. The reasons for this suggestion included minimizing the risk of accidentally bumping into objects, reducing the embarrassment of using the VR equipment in front of spectators, and allowing participants to become more fully immersed in the VR experience without worrying about being too loud. Other common suggestions included providing more than one headset for multiple patrons to use for gaming purposes or team projects, acquiring wireless headsets to eliminate wire tripping hazards, and providing more online training videos to reduce reliance on library workers for common troubleshooting problems. Participants did not provide actionable suggestions on ways to decrease dizziness while operating VR equipment.

When asked how they could see themselves using VR academically, many students responded with some of the more well-known uses of VR technology, such as potential uses in science, medicine, engineering, and the military. However, some students had a very hard time determining how VR could be applied to humanities fields such as English. After some discussion, most students were able to see the relevance of VR in their field, but some said that they most likely would not pursue those functions of VR, using VR exclusively for extracurricular activities. In contrast to the lack of academic uses envisioned by focus group participants, participants had substantially more ideas about how they would use VR for extracurricular purposes, including playing games for stress relief, exercising, exploring the world, and watching movies. Many expressed interest in using VR for extracurricular learning outside their majors, such as virtually being part of significant historic events, exploring ecosystems, and visiting museums or other significant landmarks. Students expressed interest in exploring the many possibilities provided by VR technology but were not especially aware of or interested in how VR might apply to their specific field of study unless they were in an engineering, medical, or other science-related discipline.

CONCLUSIONS
VR is a rapidly growing field, and academic libraries are already providing students access to this technology. In our study, we found considerable interest across campus in using VR in the library; however, the academic interest and use were not as high as we hoped. Future marketing to faculty might benefit from specifically suggesting ideas for academic uses or collaboration. Even though our current VR services are located at the science and engineering help desk, nearly 40 percent of users were not in STEM disciplines. This is encouraging and suggests value in marketing future VR services to all library patrons.
We also found sufficient patron interest to justify exploring related VR services, such as offering classes on creating content and acquiring less expensive headsets that can be borrowed outside of the library. Although this survey was limited to one university, we believe the results can be used to inform other academic libraries as they develop similar VR services.

ENDNOTES

1 Susan Lessick and Michelle Kraft, "Facing Reality: The Growth of Virtual Reality and Health Sciences Libraries," Journal of the Medical Library Association: JMLA 105, no. 4 (2017): 407.

2 Paul Milgram et al., "Augmented Reality: A Class of Displays on the Reality-Virtuality Continuum," in Telemanipulator and Telepresence Technologies 2351 (International Society for Optics and Photonics, 1995), 282–92.

3 Hannah Pope, "Incorporating Virtual and Augmented Reality in Libraries," Library Technology Reports 54, no. 6 (2018): 8.

4 Sarah Howard, Kevin Serpanchy, and Kim Lewin, "Virtual Reality Content for Higher Education Curriculum," Proceedings of VALA (Melbourne, Australia: Libraries, Technology and the Future Inc., 2018), 2.

5 Zois Koukopoulos and Dimitrios Koukopoulos, "Usage Scenarios and Evaluation of Augmented Reality and Social Services for Libraries," in Digital Heritage. Progress in Cultural Heritage: Documentation, Preservation, and Protection (Springer International, 2018), 134–41; Leanna Fry Balci, "Using Augmented Reality to Engage Students in the Library," Information Today Europe/ILI365 (November 17, 2017), https://www.infotoday.eu/Articles/Editorial/Featured-Articles/Using-Augmented-Reality-to-engage-students-in-the-library-121763.aspx.

6 Bruce Massis, "Using Virtual and Augmented Reality in the Library," New Library World 116, nos. 11–12 (2015): 789, https://doi.org/10.1108/NLW-08-2015-0054.

7 Adetoun A. Oyelude, "Virtual and Augmented Reality in Libraries and the Education Sector," Library Hi Tech News 34, no. 4 (2017): 3, https://doi.org/10.1108/LHTN-04-2017-0019.

8 Weina Wang, Kelly Kimberley, and Fangmin Wang, "Meeting the Needs of Post-Millennial: Lending Hot Devices Enables Innovative Library Services," Computers in Libraries (April 2017): 7.

9 "Oxford LibGuides: Virtual Reality: Borrowing VR Equipment," Bodleian Libraries, https://ox.libguides.com/vr/borrowing; "Virtual Reality Services," Penn State University Libraries, https://libraries.psu.edu/services/virtual-reality-services; "VR Studio," North Carolina State, https://www.lib.ncsu.edu/spaces/vr-studio.

10 Oyelude, "Virtual and Augmented Reality," 3.

11 Lessick and Kraft, "Facing Reality: The Growth of Virtual Reality," 409.

12 Oyelude, "Virtual and Augmented Reality," 3.

13 Medhat Alaker, Greg R. Wynn, and Tan Arulampalam, "Virtual Reality Training in Laparoscopic Surgery: A Systematic Review & Meta-Analysis," International Journal of Surgery 29 (2016): 86, https://doi.org/10.1016/j.ijsu.2016.03.034.

14 Elizabeth Dyer, Barbara J. Swartzlander, and Marilyn R.
Gugliucci, "Using Virtual Reality in Medical Education to Teach Empathy," Journal of the Medical Library Association: JMLA 106, no. 4 (2018): 498, https://doi.org/10.5195/jmla.2018.518.

15 Emilio Madrigal, Shyam Prajapati, and Juan Hernandez-Prera, "Introducing a Virtual Reality Experience in Anatomic Pathology Education," American Journal of Clinical Pathology 146, no. 4 (2016): 462, https://doi.org/10.1093/ajcp/aqw133; Nils Fredrik Kleven et al., "Training Nurses and Educating the Public Using a Virtual Operating Room with Oculus Rift," IEEE (2014): 1, https://doi.org/10.1109/VSMM.2014.7136687.

16 Wadee Alhalabi, "Virtual Reality Systems Enhance Students' Achievements in Engineering Education," Behaviour & Information Technology 35, no. 11 (2016): 925, https://doi.org/10.1080/0144929X.2016.1212931.

17 Patricia Brown, "How to Transform Your Classroom with Augmented Reality—EdSurge News," EdSurge, November 2, 2015, https://www.edsurge.com/news/2015-11-02-how-to-transform-your-classroom-with-augmented-reality.

18 Negin Dahya et al., "Virtual Reality in Public Libraries," University of Washington Information School, https://ischool.uw.edu/vrinlibraries.

19 Del Siegle, "Seeing is Believing: Using Virtual and Augmented Reality to Enhance Student Learning," Gifted Child Today 42, no. 1 (2019): 46, https://doi.org/10.1177/1076217518804854.

20 Guillaume Loup et al., "Immersion and Persistence: Improving Learners' Engagement in Authentic Learning Situations," 11th European Conference on Technology Enhanced Learning (2016): 414, https://doi.org/10.1007/978-3-319-45153-4_35; Robert Carini, George Kuh, and Stephen Klein, "Student Engagement and Student Learning: Testing the Linkages," Research in Higher Education 47, no. 1 (2006): 23–24, https://doi.org/10.1007/s11162-005-8150-9.

21 Mariano Alcaniz, Elena Olmos-Raya, and Luis Abad, "Use of Virtual Reality for Neurodevelopmental Disorders: A Review of the State of the Art and Future Agenda," Medicina-Buenos Aires 79, nos. 77–81 (2019): 419–20, https://doi.org/10.21565/ozelegitimdergisi.448322.

22 Yazhou Huang, Lloyd Churches, and Brendan Reilly, "A Case Study on Virtual Reality American Football Training," Proceedings of the 2015 Virtual Reality International Conference 6 (2015): 3, https://doi.org/10.1145/2806173.2806178.

23 "Media Lab," Massachusetts Institute of Technology, https://libraries.psu.edu/services/virtual-reality-services; "The iSchool Technology Resources at FSU: Virtual Reality," Florida State University LibGuides, https://guides.lib.fsu.edu/iSchoolTech/VR.
11519 ---- Meeting Users Where They Are: Delivering Dynamic Content and Services through a Campus Portal

COMMUNICATIONS
Meeting Users Where They Are: Delivering Dynamic Content and Services through a Campus Portal
Graham Sherriff, Dan DeSanto, Daisy Benson, and Gary S. Atwood
INFORMATION TECHNOLOGY AND LIBRARIES | MARCH 2020 https://doi.org/10.6017/ital.v39i1.11519

Graham Sherriff (graham.sherriff@uvm.edu) is Instructional Design Librarian, University of Vermont. Dan DeSanto (ddesanto@uvm.edu) is Instruction Librarian, University of Vermont. Daisy Benson (daisy.benson@uvm.edu) is Library Instruction Coordinator, University of Vermont. Gary S. Atwood (gatwood@uvm.edu) is Education Librarian, University of Vermont.

ABSTRACT
Campus portals are one of the most visible and frequently used online spaces for students, offering one-stop access to key services for learning and academic self-management. This case study reports how instruction librarians at the University of Vermont collaborated with portal developers in the registrar's office to develop high-impact, point-of-need content for a dedicated "Library" page. This content was then created in LibGuides and published using the Application Programming Interfaces (APIs) for LibGuides boxes. Initial usage data and analytics show that traffic to the libraries' portal page has been substantially and consistently higher than expected. The next phase for the project will be the creation of customized library content that is responsive to the student's user profile.

INTRODUCTION
For many academic institutions, campus portals (also referred to as enterprise portals) are one of students' most frequently used means of interacting with their institutions. Campus portals are websites that provide students and other campus constituents with a "one-stop shop" experience, with easy access to a selection of key services for learning and academic self-management. Typically, portals provide features that make it possible for students to obtain course information, manage course enrollment, view grades, manage financial accounts, and access information about campus activities. For faculty and staff, campus portals provide access to administrative resources related to teaching, human relations, and more.
These campus portals are different from library portals, which some libraries implemented in the 2000s as a way to centralize access to key library services.1

Currently, the public-facing websites of many colleges and universities serve a crucial role in marketing the institution to prospective students. This creates an incentive to be as comprehensive as possible and to showcase the full breadth of programs, services, offices, and facilities. A common disadvantage to this approach to institutional web design is information overload: an overwhelming array of labels and links that diminish the ability of current affiliates to find and access the services they need. These sites are designed for external users for whom the research and educational functions of the library are a low priority.

Campus portals, however, are designed for internal users and can take a more selective approach. They give student and faculty users a view of campus services that aligns with their priorities and places them in a convenient interface. In this sense, they are tools for information management. Campus portals play a critical role in students' daily lives because they do much more than simply present information. Carden observes that campus portals have these key characteristics:

• allow a single user authentication and authorization step at the initial point of contact to be applied to all (or most) other entities within the portal;
• allow multiple types and sources of information to be displayed on a single composite screen (multiple "channels");
• provide automated personalization of the selection of channels offered, based on each user's characteristics, on the groups to which each user belongs, and possibly on the way in which the system has historically been used;
• allow user personalization of the selection of channels displayed and the look-and-feel of the interface, based on personal preferences;
• provide a consistent style of access to diverse information sources, including "revealing" legacy applications through a new consistent interface; and
• facilitate transaction processing as well as simple data access.2

In sum, enterprise portals use a combination of advanced technologies that have the ability to present both static and user-responsive information in a space reserved for affiliates of the university. These abilities make it attractive for libraries to leverage the capabilities of a campus portal to present users with dynamic, personalized instructional experiences in a space where users already are. This aligns with the principles of user-centered design, which emphasize the need to empathize with users' needs and perspectives. Simplicity, efficiency, convenience, and responsiveness to each user's individual circumstances are critical.3

The idea of presenting libraries' content through a campus portal is not a new one. Stoffel and Cunningham surveyed libraries in 2004 and, while finding that "library participation in campus portals is . . .
relatively rare," of the sixteen self-selected responding campuses, ten had a library tab or a dedicated library channel within their campus portal, while two more had a channel or tab under development.4 The types of library integration described in most examples consisted of using the portal's campus authentication to link to a user's library account and view borrowed books, fines, holds, and announcements. While resources like federated searches, research guides, and lists of journals and databases appeared in some respondents' portals, they largely appeared as static content rather than responding to the user's profile.

Since 2004, portals have remained a core part of the University of Vermont's information delivery system, but portal integration remains relatively rare among libraries, and most have done little to integrate new tools such as research guides or develop instructional content that leverages a portal's user-responsive design. As a result, there is little in the literature on libraries' integration of content into campus portals, but a small number of case studies provide proof of concept, such as Lehigh University, California State University-Sacramento, and Arizona State University.5 These case studies also illustrate the importance of cross-campus collaboration. Our project required some critical elements, specifically access to the campus portal and a method for publishing content. The projects described in the case studies were successful partly because they were able to apply advanced programming expertise that was not available to our group, such as API coding. Instead, our group was able to obtain these critical inputs through a partnership with the University of Vermont registrar's office.

At the University of Vermont, the campus portal is built on the Banner product licensed from Ellucian and is branded as "MyUVM." It is administered by the registrar's office. Librarians have observed that it is central to students' academic lives. Students go to MyUVM as their pathway to many of the online services and tools that they use. They go there to check email, log in to the learning management system (LMS), check grades, add, drop, or withdraw from courses, check their schedule, and more. They go there to carry out tasks.

Figure 1. Screenshot of MyUVM (https://myuvm.uvm.edu as it was on March 1, 2019).

The importance of MyUVM is communicated to University of Vermont students at orientation. In this way, first-year students learn at the earliest point, even before their academic programs begin, that the portal is their primary gateway for access to campus academic services. This shapes their view of the services available to them and how those services are organized. It also shapes how they reach those services and how they interact with them. At the same time, the selective principle underlying the campus portal means that if something is not present, it is less visible and less accessible, and there is a risk of signaling to students that it is not important to their daily lives or their academic performance.

METHODS
The characteristics of campus portals and their contents motivated instruction librarians to explore the possibility of integrating library services into MyUVM.
In 2014, the University of Vermont Libraries' Educational Services Working Group—a small cross-libraries group of librarians who work on a variety of projects supporting classroom instruction and research assistance—began by defining the desirable scope of possible portal content.

The Educational Services Working Group quickly determined that library content included in the portal should be designed to conform with the principle of priority-based selectivity employed across the portal as a whole. This content should not attempt to represent the full suite of library information and services available. That would replicate the websites of the three libraries on campus and would risk creating overload and disorientation, in a similar way to institutional websites. It is common for actionable and instructional material to become buried beneath links on a library homepage, and the homepages of our three libraries' websites are no different. Our hope was to reposition selected instructional content such as research guides, databases-by-subject, chat reference, and liaison librarian contacts in a venue with which students are used to interacting.

The goal of the project was the strategic positioning of dynamic, responsive information about research services in a venue with which students frequently interact. Research librarians would select and organize the most important and pertinent instructional content. Such selectivity fit well within the portal's principle for curating content: high-use tools and services that directly support students' priorities. Thus the objective for this project would not be the re-creation of the library websites within MyUVM. It was also determined that the scope would exclude content that might be considered marketing or engagement for its own sake, for the same purpose of minimizing users' cognitive load and helping them to quickly find the features they need.

The MyUVM developers in the registrar's office were enthusiastic about working with us on this project, which partly reflects an increased attention across campus to equitable access to student services for all users—something that is important for its own sake, but also for the purposes of accreditation. Following preliminary discussions in early 2018, MyUVM developers created a test "Libraries" page, equivalent to a full screen of content, and assigned to our group the privileges necessary to view it in the MyUVM test environment. Each page in MyUVM is composed of a series of content boxes or channels. In developing our new page, our task was to develop content for the desired channels.

We began our process for composing the page with a card-sorting exercise that identified priorities for the content that should be highlighted. The participants were the group's members, in order to expedite initial decisions about content that could be tested with users at a later point in the project. Items that figured prominently in this process were the libraries' "Ask a Librarian" service, research guides, and search tools (discovery layer, databases, and journal directory). This confirmed that our group's priorities centered on users' transactional interactions with library services and not merely the one-way promotion of library information. The results of the card sorting were then translated into a wireframe (see figure 2).
Each square in the wireframe represented a channel for which we would need to create the appropriate content:

• Ask a Librarian (contact details for the libraries' research assistance services)
• Research Guides (subject and class guides)
• Search Our Collections (search tools for the discovery layer, databases, and journal directory)
• Research Roadmap (the libraries' suite of tutorials on foundational research skills)
• Featured Content (a channel for rotating or temporary time-specific content)
• Libraries (a box with a link to each of the three libraries on campus; we later added a channel for each library)

The wireframe also envisaged the inclusion of a pop-out chat widget.

Figure 2. Wireframe for library content.

As noted, the project needed a process that would enable our group to create and publish this content autonomously, but without requiring advanced programming skills on our part. We learned that MyUVM is capable of publishing content pushed from a webpage by using its URL. This meant that we could create content in LibGuides, a platform with which our group was very familiar, and then push the content of an individual LibGuides box to a MyUVM channel simply by providing the LibGuides box URLs to the portal developers. This method offers several advantages. Importantly, it meant that our group had direct control of the box content and was able to publish it without needing the MyUVM developers to review and authorize every edit.
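Our integration required no custom code on the libraries' side, since the portal consumed the box URLs directly. For libraries that want to prototype the same pattern before involving portal developers, a minimal sketch follows: it fetches the published HTML of a guide page and extracts the markup of a single box. The guide URL and box id are hypothetical placeholders, and actual LibGuides markup varies by instance and version.

```python
# A minimal prototype of surfacing one LibGuides box in another page.
# The guide URL and element id below are hypothetical placeholders.
import requests
from bs4 import BeautifulSoup

BOX_URL = "https://library.example.edu/c.php?g=12345&p=67890"  # placeholder

def fetch_box_html(url: str, box_id: str = "s-lg-box-1") -> str:
    """Fetch a published guide page and return the HTML of one content box."""
    resp = requests.get(url, timeout=10)
    resp.raise_for_status()
    soup = BeautifulSoup(resp.text, "html.parser")
    box = soup.find(id=box_id)
    return str(box) if box else ""

if __name__ == "__main__":
    print(fetch_box_html(BOX_URL))
```

A production integration would cache the fetched markup and strip inherited styles, which is consistent with the display adjustments described below.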
Those involved in this project faced important decisions early in the process regarding which resources we deemed essential for inclusion and best suited to this new online context. Once items were selected, it was important to keep user behaviors in mind as we prioritized "above the fold" content. Students are used to quickly popping into the portal, finding what they need, and popping out. We tried to place interactive content that fit this use pattern in high-visibility places and moved content that required more sustained reading and attention further down the page.

A challenge faced during the design process was our campus's lack of a unified, cross-libraries web presence. The three libraries on our campus have separate websites, but the University of Vermont portal required that we present a unified "Libraries" presence. In some cases, such as links back to library webpages, we were easily able to treat the three libraries separately. In other cases, such as our research guides, we were able to merge resources from multiple libraries. In still other cases, such as our chat widgets, we had to make decisions about which library's resource would be featured and which other versions would be secondary.

The prototyping and testing phases revealed that some content needed to be adjusted in order to display in MyUVM as desired. LibGuides' tabbed boxes and gallery boxes did not display correctly. Also, some style coding inherited from the LibGuides boxes needed to be adjusted in order to display cleanly. One item, "Course Reserves," was present in the wireframe but not the page at the time of implementation. We continue to work on the development of a widget for searching "Course Reserves" holdings. The version of the "Library" page at the time of going live is shown in figure 3.

Figure 3. Screenshot of the "Library" page in MyUVM.

The "Research Guides" channel has a dropdown menu for subject guides and another for class guides. These menus were created using LibGuides widgets, meaning that they update automatically as guides are published and unpublished, and do not require any manual maintenance.

The "Search Our Collections" channel includes three access points to the libraries' collections. This contrasts with the libraries' websites, which display only the discovery layer search box. The latter approach has the advantage of promoting one-stop searching, but also the disadvantage of overwhelming users with non-relevant results.

Channels on the left side of the page are less dynamic and interactive. At the top, links to the three libraries on campus provide highly visible quick access for students looking for the libraries' websites. Similarly, the "Ask a Librarian" channel quickly gets students to reference and consultation services at their home library. The "You Should Know" channel provides a space for rotating content to be changed based on time of year, events on campus, or other perceived student needs.

RESULTS
The "Library" page in MyUVM went live in January 2019, at the same time that spring semester classes began. Our preliminary review of results from the semester, based on data collected from MyUVM, LibGuides Statistics, and Google Analytics, has identified several positive outcomes.

MyUVM data showed that there were 18,891 visits to the "Library" page during the period from mid-January to the end of March, a period of eleven weeks when classes were in session. This volume of traffic substantially exceeded our group's expectations for the first months following implementation, during a period when we were only beginning to promote awareness of the page. Data also showed that usage during this period was generally consistent. The most significant variation in traffic was a small peak in late February that corresponded with a high point in the level of library instruction.

LibGuides Statistics showed an overall increase in usage of subject guides, though it is not possible to attribute this to the MyUVM project with complete certainty. We also observed, however, that for many of our guides during this period, MyUVM was among the top referring sites. LibGuides Statistics also recorded unexpectedly large increases in usage for the "Research Roadmap" that we attribute primarily to the MyUVM project. Four sections of the "Research Roadmap" experienced increases of more than 100 percent during the January-March period. The Research Roadmap's "More Help" page showed a 65 percent drop in visits, but a possible explanation for this is that the highlighting of sections in MyUVM is providing more immediate help to our users in finding what they need and promoting independent use of instructional materials by students.

LibChat Statistics indicated a significant increase in chat reference transactions at Howe Library, the University of Vermont's central library: a 23 percent increase over the count for the fall 2018 semester, with the implementation of the MyUVM project being the most plausible explanation.

All initial data appear to show that users are finding and continuing to use the "Library" tab in the portal. They are discovering guides and using the embedded chat widget.
We plan to gather more usage data for other channels on the page to better inform our picture of what users are doing once they find and view the "Library" tab. As campus portals have become a ubiquitous part of university life, revisiting the library's role in these portals seems worthwhile, especially given that commonplace design tools like LibGuides dramatically lower the technological acumen needed for creating content.

FUTURE DIRECTIONS
The next step for this project is to leverage the ability of a campus portal to create a MyUVM homepage library channel that customizes the display of content, based on unique user characteristics. When the user logs in, they are routed to the portal's landing page, which is dynamically created based upon their student or faculty status, enrollment in a college or school, level of study (graduate/undergraduate), or number of years attending the University of Vermont. This page has the ability to conform to the user in even more granular ways and dynamically display content based upon their major or other demographic categories such as study abroad status, veteran status, or first-year status.

By leveraging the portal's ability to display user-specific content, the University of Vermont Libraries have the ability to customize instructional content tailored to a user's information needs and place that content in a channel that will display alongside other channels on the MyUVM homepage. A first-year history major's library channel could contain tutorials on working with primary sources, a link to their liaison librarian, links to digitized newspaper collections, and help guides for Chicago citation style. A graduate student in nursing might see information about evidence-based practices for developing a clinical question, help guides for using PubMed and CINAHL, and point-of-care resources. A faculty member in psychology might find tutorials for creating alerts in their favorite journals, information about copyright and reserves material, or information about citation-management software. In each case, the portal pushes resources and assistance to each user that best fits their specific need, as informed by the librarians best equipped to address that need (a purely illustrative sketch of this kind of rule-based routing appears below).

This last step of placing dynamic content on the MyUVM homepage will require a great deal of coordination with liaison librarians both to identify the most pertinent disciplinary information to place in the portal and to identify the times of year when certain information is most relevant. To keep portal content dynamic and pertinent to users, a system will need to be created for releasing and removing content on a regular basis, and this scheduling of content will require the input of liaison librarians. The Educational Services Working Group will need to manage this scheduling, as well as the enforcement of portal design conventions in coordination with the MyUVM developers. Although this management may end up being complex, it is not insurmountable, and our next steps will be both to create a system for content creation and management and to begin to create test content for a sample of user groups.
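None of this routing has been built yet, but to make the idea concrete, the following sketch shows the kind of rule-based selection involved. Every profile field, rule, and content item here is hypothetical.

```python
# Illustrative only: rule-based selection of library channel content
# keyed to portal profile attributes. All fields and links are hypothetical.
from dataclasses import dataclass

@dataclass
class Profile:
    status: str  # "student" or "faculty"
    level: str   # "undergraduate" or "graduate"
    major: str
    year: int

CONTENT_RULES = [
    (lambda p: p.major == "history" and p.year == 1,
     ["Working with primary sources (tutorial)", "Chicago style help guide"]),
    (lambda p: p.major == "nursing" and p.level == "graduate",
     ["Building a clinical question", "PubMed and CINAHL help guides"]),
    (lambda p: p.status == "faculty",
     ["Journal alerts tutorial", "Copyright and course reserves info"]),
]

def channel_content(profile: Profile) -> list[str]:
    """Collect every content item whose rule matches the user's profile."""
    items = []
    for rule, content in CONTENT_RULES:
        if rule(profile):
            items.extend(content)
    return items or ["Ask a Librarian"]  # sensible default for everyone else

print(channel_content(Profile("student", "undergraduate", "history", 1)))
```

In practice the rules themselves would be the liaison librarians' responsibility, with the scheduling layer described above deciding when each rule is active.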
We also plan to gather more data and expand our analytics capabilities to assess how users are using content on the MyUVM "Library" page and examine which features are most popular, how much traffic is being driven back to our websites, and how users are interacting with the features on the page.

CONCLUSION
Our project has confirmed our initial inclination that students treat MyUVM as a finding tool for campus resources. Also, faculty have reported accessing library resources through the portal and directing their students to that pathway as well. The immediate high use and consistency of use indicate that we have placed our selected Libraries resources in a high-traffic venue. Instead of attempting to coax students to our web outpost in the wilds of the internet, we have placed an exit ramp from a highway they already travel. This has proven overwhelmingly effective and confirms, on our campus at least, the literature from the mid-2000s pointing out the opportunity created for libraries by campuses' institutional adoption of portal systems.

In all, the project has been a worthwhile venture for the University of Vermont Libraries. We have observed immediate use and better-than-expected levels of traffic, as well as continued use throughout the semester. It appears that once students wear a path to resources in MyUVM, they are continuing to use that path as a way to access library content. We look forward to further customizing that content in the near future.

ACKNOWLEDGEMENTS
We gratefully acknowledge David Alles, Portal Developer, and Naima Dennis, Senior Assistant Registrar for Technology, in the University of Vermont Office of the Registrar, for their contributions to the design and development of this project.

ENDNOTES

1 Scott Garrison, Anne Prestamo, and Juan Carlos Rodriguez, "Putting Library Discovery Where Users Are," in Planning and Implementing Resource Discovery Tools in Academic Libraries, ed. Mary Pagliero Popp and Diane Dallis (Hershey, PA: Information Science Reference, 2012), 391, https://doi.org/10.4018/978-1-4666-1821-3.ch022; Bruce Stoffel and Jim Cunningham, "Library Participation in Campus Web Portals: An Initial Survey," Reference Services Review 33, no. 2 (June 1, 2005): 145–46, https://doi.org/10.1108/00907320510597354.

2 Mark Carden, "Library Portals and Enterprise Portals: Why Libraries Need to Be at the Centre of Enterprise Portal Projects," Information Services & Use 24, no. 4 (2004): 172–73, https://doi.org/10.3233/ISU-2004-24402.

3 Ilka Datig, "Walking in Your Users' Shoes: An Introduction to User Experience Research as a Tool for Developing User-Centered Libraries," College & Undergraduate Libraries 22, nos. 3–4 (2015): 235–37, https://doi.org/10.1080/10691316.2015.1060143; Steven J. Bell, "Staying True to the Core: Designing the Future Academic Library Experience," portal: Libraries and the Academy 14, no. 3 (2014): 369–82, https://doi.org/10.1353/pla.2014.0021.

4 Stoffel and Cunningham, "Library Participation in Campus Web Portals," 145–46.

5 Tim McGeary, "MyLibrary: The Library's Response to the Campus Portal," Online Information Review 29, no. 4 (2005): 365–73, https://doi.org/10.1108/14684520510617811; Garrison, Prestamo, and Rodriguez, "Putting Library Discovery Where Users Are," 393–94.
11571 ---- Public Libraries Leading the Way: On Educating Patrons on Privacy and Maximizing Library Resources

Public Libraries Leading the Way: On Educating Patrons on Privacy and Maximizing Library Resources
T.J. Lamanna
INFORMATION TECHNOLOGY AND LIBRARIES | SEPTEMBER 2019 https://doi.org/10.6017/ital.v38i3.11571

T.J. Lamanna (professionalirritant@riseup.net) is an Adult Services Librarian, Cherry Hill Public Library.

ABSTRACT
Libraries are one of our most valuable institutions. They cater to people of all demographics and provide services to patrons they wouldn't be able to get anywhere else. The list of services libraries provide is extensive and comprehensive, but there are significant gaps in what our services can offer, particularly those regarding technology advancement and patron privacy. Though library classes educating patrons on privacy protection are a valiant effort, we can do so much more and lead the way, maybe not for the privacy industry but for our communities and patrons. Creating a strong foundational knowledge will help patrons leverage these new skills in their day-to-day lives as well as help them educate their families about common privacy issues. In this column, we'll explore some of the ways libraries can utilize their current resources as well as provide ideas on how we can maximize their effectiveness and roll new technologies into their operations.

Though many libraries have policies on how they deal with patron privacy, unfortunately some policies aren't very strong and oftentimes staff isn't trained in the details of these policies. Fortunately, for libraries that don't have these necessary policies, there are some, such as the San Jose Public Library, that offer their own as a framework.1 Those that do have a strong, comprehensive policy must make sure they are enforcing and regularly updating it to comply with new technologies being released. It's a daunting task, but as Article VII of the Library Bill of Rights says, "All people, regardless of origin, age, background, or views, possess a right to privacy and confidentiality in their library use. Libraries should advocate for, educate about, and protect people's privacy, safeguarding all library use data, including personally identifiable information."2 This means we have a responsibility to our patrons to do everything in our power to protect them and teach them to protect themselves.

This requires a concerted effort not just from technology and IT librarians, but from all library workers. A privacy policy means little if those on the front lines are either unaware of the policy or unsure how it is to be implemented. Therefore, all library staff should both understand the fundamental reasons behind library privacy policies and be trained in maintaining them. Libraries may consider implementing this training during staff development days or offer independent training sessions as needed.

Since the introduction of the Patriot Act, libraries stopped collecting patrons' reading habits, but many integrated library systems (ILS) still retain massive amounts of patron information we are unaware of.
I've been administering our ILS for over two years, and I just found another space where items are being unnecessarily retained that I hadn't noticed before. An instance such as this calls for limiting personally identifiable information (PII) to what is strictly necessary.

In limiting the PII gathered in the first place, library staff should consider the following questions: What information do libraries really need to collect to offer library cards or programming? Does your library really need patrons' date of birth or gender? Probably not. If you don't need it, you shouldn't be collecting it, and if you do collect it, make sure you anonymize the data. Using metrics is vital to how libraries function, receive funding, and schedule programming. You can still use the information, but it should not be connected to a patron in any way.
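As one illustration of what that decoupling can look like, the sketch below replaces a direct identifier with a salted one-way hash and generalizes date of birth to an age bracket before a record is counted. It is not drawn from any particular ILS; all field names are hypothetical.

```python
# Illustrative anonymization of a circulation record before it is used
# for metrics. Field names are hypothetical; no real ILS is assumed.
import hashlib
from datetime import date

SALT = b"rotate-this-secret-regularly"  # keep out of version control

def pseudonymize(card_number: str) -> str:
    """One-way salted hash: lets us count distinct users, not identify them."""
    return hashlib.sha256(SALT + card_number.encode()).hexdigest()[:16]

def age_bracket(dob: date, today=None) -> str:
    """Generalize date of birth to a coarse ten-year bracket."""
    today = today or date.today()
    age = today.year - dob.year - ((today.month, today.day) < (dob.month, dob.day))
    low = (age // 10) * 10
    return f"{low}-{low + 9}"

record = {"card_number": "21234001234567", "dob": date(1992, 4, 2), "branch": "Main"}
metric = {
    "user": pseudonymize(record["card_number"]),
    "age_bracket": age_bracket(record["dob"]),
    "branch": record["branch"],
}
print(metric)  # usable for counts and funding reports, not for identification
```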
After educating staff, we can educate patrons on developing better and safer practices regarding personal privacy and security in their daily lives. Practical examples range from teaching patrons how to create strong passwords and back up sensitive files to explaining how malware works and what the "cloud" actually is. This is a start, but it goes far beyond that. I've served many patrons who, even after taking courses on the subject, are overwhelmed by the security measures needed to protect themselves. This isn't necessarily a sign that our classes are ineffective, but it does imply that new tactics are needed. Let's look at a few examples.

Another source of PII that we often overlook is security measures such as closed-circuit television (CCTV) or security/police officers in our buildings.3 They often are either forgotten or outside the purview of the library itself. As the College of Policing states, "CCTV is more effective when directed at reducing theft of and from vehicles, while it has no impact on levels of violent crime."4 While there are justifications for bringing this technology into the library, cameras should only be set up where needed, taking great care not to point them at patron or staff computers. If CCTV is needed, make sure to follow local retention laws and remove the footage as soon as its time has expired. This idea applies to all collected information. There is no reason to archive data beyond the date they can be destroyed, as doing so puts the library and its patrons in a compromised position.

Law enforcement in the library is a tough thing to argue against in our current political climate. But studies have shown that police presence does little to deter crime and may actually disproportionately impact marginalized communities.5 Consider the purpose of law enforcement personnel and whether their presence is actually necessary to the proper functioning of your library. In the event that law enforcement should come in with a subpoena that requires you to turn over your patron data, it's important to have a canary warning that can be removed so your patrons understand what has happened.6

Another way libraries can lead the way in protecting patron privacy both inside and outside the library is by supporting legislation that bans facial recognition software. This type of technology is becoming ubiquitous, but places have already started pushing back, and libraries can be the epicenter of this movement. It's already been banned in Oakland7 and San Francisco8 (one of the homes of this technology), as well as Somerville, Massachusetts, with groups like the Massachusetts Library Association unanimously putting out a moratorium on facial surveillance, which is the practice of recording one's face to create user profiles.9 There are other states working down this path, and it's overwhelmingly heartening to see libraries step up and in front of something they know would damage our communities. We ought to be activists, standing on the front lines and showing our patrons our deepest commitment to them.

Surely there are greater strides we can make, such as revising WiFi policies. WiFi is one of the most used services libraries offer, and many libraries don't use it to its full potential. For instance, some libraries turn off their WiFi when the building is closed, severely limiting patrons' usage. It's a service we pay for, and there is no reason it shouldn't be available at all times. Your IT service should make sure the WiFi is secure, whether it's available at all hours or not. Unlimited access to WiFi becomes invaluable to users who need it for emergencies, including completing work or accessing important online services when the library is closed. While we do have limited bandwidth and IT services must actively maintain WiFi security, libraries should make sure it's available to the public as often as possible.

Now that we've covered using bandwidth when we aren't open, let's talk about libraries with excess bandwidth. No resource should go unused in the library. We have a limited budget and we should make sure every penny is used to serve our communities. One fantastic use of excess bandwidth — especially during closed hours — would be to set up a Tor relay in your library. Tor is an anonymity network that allows people to surf the internet with extra security and privacy in mind. A relay is quite easy to set up, and you can limit how much bandwidth it uses so you aren't shorting anyone in your library (a sample configuration is sketched below). It's a service used by groups such as journalists or activists who want to make positive change in the world and need a safe place to do so. Some are concerned that the Tor network is used for malicious intent, but the Tor Project, the organization that runs the network, constantly works to ensure nothing like that is taking place. Also, anything illicit you can find on the Tor network is available on the regular internet, including places like Facebook or Craigslist, so the stigma of the network should be taken in context. The Tor Project routinely monitors the network and searches out illegal material (there are no hired killers on the Tor network). Given all this, you could help the network greatly by just partitioning a small amount of your bandwidth.
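To give a sense of how little configuration is involved, here is a minimal sketch of a non-exit relay in Tor's torrc configuration file, including the bandwidth caps that keep the relay from crowding out patrons. All values are illustrative; consult the Tor Project's current relay documentation before deploying.

```
# /etc/tor/torrc -- minimal non-exit relay (illustrative values)
Nickname ExampleLibraryRelay
ContactInfo sysadmin@library.example.org
ORPort 9001
# Relay traffic only; never an exit point, and no local client proxy.
ExitRelay 0
SocksPort 0
# Bandwidth caps so the relay never shorts library patrons.
RelayBandwidthRate 1 MBytes
RelayBandwidthBurst 2 MBytes
# Optional monthly ceiling on total relayed traffic.
AccountingMax 200 GBytes
AccountingStart month 1 00:00
```

Running a middle relay like this keeps the library's IP address out of exit traffic entirely, which is the usual recommendation for institutions new to hosting Tor infrastructure.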
Libraries have the unique ability to be transformative. Unlike other non-profits or organizations, we have the ability to pivot. We can both change directions as needed and pave the way for our communities as leaders in the movement toward patron privacy. I leave you with a quote from Hardt and Negri: “…we share common dreams of a better future.”10 That should be our motto.

ENDNOTES

1 “Our Privacy Policy,” San Jose Public Library, accessed August 15, 2019, https://www.sjpl.org/privacy/our-privacy-policy.

2 “Library Bill of Rights,” American Library Association, last modified January 19, 2019, http://www.ala.org/advocacy/intfreedom/librarybill.

3 “Importance of CCTV in Libraries for Better Security,” accessed August 14, 2019, https://www.researchgate.net/publication/315098570_Importance_of_CCTV_in_Libraries_for_better_security.

4 “Effects of CCTV on Crime,” College of Policing, accessed August 14, 2019, http://library.college.police.uk/docs/what-works/What-works-briefing-effects-of-CCTV-2013.pdf.

5 “Do Police Officers in Schools Really Make Them Safer?” accessed August 14, 2019, https://www.npr.org/2018/03/08/591753884/do-police-officers-in-schools-really-make-them-safer.

6 “Warrant Canary,” Wikipedia, https://en.wikipedia.org/wiki/Warrant_canary.

7 Sarah Ravani, “Oakland Bans Use of Facial Recognition Technology, Citing Bias Concerns,” San Francisco Chronicle, July 17, 2019, https://www.sfchronicle.com/bayarea/article/Oakland-bans-use-of-facial-recognition-14101253.php.

8 Kate Conger, Richard Fausset, and Serge F. Kovaleski, “San Francisco Bans Facial Recognition Technology,” New York Times, May 14, 2019, https://www.nytimes.com/2019/05/14/us/facial-recognition-ban-san-francisco.html.

9 Sarah Wu, “Somerville City Council Passes Facial Recognition Ban,” Boston Globe, June 27, 2019, https://www.bostonglobe.com/metro/2019/06/27/somerville-city-council-passes-facial-recognition-ban/SfaqQ7mG3DGulXonBHSCYK/story.html.

10 Michael Hardt and Antonio Negri, Multitude: War and Democracy in the Age of Empire (New York: The Penguin Press, 2004), 128.

ARTICLES

Are Ivy League Library Website Homepages Accessible?

Wenfan Yang, Bin Zhao, Yan Quan Liu, and Arlene Bielefield

INFORMATION TECHNOLOGY AND LIBRARIES | JUNE 2020
https://doi.org/10.6017/ital.v39i2.11577

Wenfan Yang (youngwf@126.com) is a master’s student in the School of Management, Tianjin University of Technology, China. Bin Zhao (andy.zh@126.com) is Professor in the School of Management, Tianjin University of Technology, China.
Yan Quan Liu (liuy1@southernct.edu) is Professor in Information and Library Science at Southern Connecticut State University and Special Hired Professor at Tianjin University of Technology. Arlene Bielefield (bielefielda1@southernct.edu) is Professor in Information and Library Science at Southern Connecticut State University. Copyright © 2020.

ABSTRACT

As a doorway for users seeking information, library websites should be accessible to all, including those who are visually or physically impaired and those with reading or learning disabilities. In conjunction with an earlier study, this paper presents a comparative evaluation of Ivy League university library homepages with regard to the Americans with Disabilities Act (ADA) mandates. Data results from WAVE and AChecker evaluations indicate that although the error of Missing Form Labels still occurs on these websites, other known accessibility errors and issues have significantly improved from five years ago.

INTRODUCTION

An academic library is “a library that is an integral part of a college, university, or other institution of postsecondary education, administered to meet the information and research needs of its students, faculty, and staff.”1 People living with physical disabilities face barriers whenever they enter a library. Many blind and visually impaired persons need assistance when visiting a library to do research. In such cases, searching the collection catalog, periodical indexes, and other bibliographic references is frequently conducted by a librarian or by the person accompanying that individual to the library. Thus, professionals in these institutions can advance the use of academic libraries by the visually impaired, physically disabled, hearing impaired, and people with learning disabilities.

Library websites are libraries’ virtual front doors for all users pursuing information from libraries. Fichter stated that the power of the website is in its popularization.2 Access by everyone, regardless of disability, is an essential reason for its popularization. Whether users are students, parents, senior citizens, or elected officials navigating the library website to find resources or sign up for computer courses at the library, the website can be either a liberating or a limiting experience.3

According to the Web Accessibility Initiative (https://www.w3.org/WAI/), website accessibility means that people with disabilities can use websites. More specifically, it means that people with disabilities can perceive, understand, navigate, and interact with websites and that they can contribute to them. Incorporating accessibility into website design enables people with disabilities to enjoy the benefits of websites to the same extent as anyone else in their community.

This study evaluated the current state of accessibility of the library websites of the American Ivy League universities using guidelines established by the Americans with Disabilities Act (ADA) for those who are visually or physically impaired or who have reading or learning disabilities.
Section 508 of the Rehabilitation Act and the Web Content Accessibility Guidelines (WCAG) from the World Wide Web Consortium (W3C) provide guidelines for website developers that define what makes a website accessible to those with physical, sensory, or cognitive disabilities. Since a broad array of disabilities is recognized under the ADA, websites seeking to be compliant with the ADA should use these technical criteria for website design. This study used two common accessibility evaluation tools, WAVE and AChecker, for both Section 508 and WCAG version 2.0 Level AA.

Among universities in the United States, the eight Ivy League universities (Brown, Columbia, Cornell, Dartmouth, Harvard, Princeton, the University of Pennsylvania, and Yale) all have a long and distinguished history, strict academic requirements, high-quality teaching, and high-caliber students. Because of their good reputations, they are expected to lead by example, not only in terms of academic philosophy and campus atmosphere but also in the accessibility of their various websites. Of course, any library website, whether that of an urban public library or a university library, should be accessible to everyone. Hopefully, this study of their accessibility can enlighten other universities on how to better develop and maintain library websites so that individuals with disabilities can enjoy the same level of access to academic knowledge as everyone else.

LITERATURE REVIEW

In 1999, Schmetzke reported that emerging awareness about the need for accessible website design had not yet manifested itself in the actual design of library websites. For example, at the fourteen four-year campuses within the University of Wisconsin system, only 13 percent of the libraries’ top-level pages (homepages plus the next layer of library pages linked to them) were free of accessibility problems.4 Has this situation changed in the last twenty years?

To answer this question, a number of authors have suggested various methods for evaluating software/hardware for accessibility and usability.5 Included in the process of compiling data is “involving the user at each step of the design process. Involvement typically takes the form of an interview and observation of the user engaged with the software/hardware.”6

Providenti and Zai conducted a study in 2007 focused on providing an update on the implementation of website accessibility guidelines on Kentucky academic library websites. They tested the academic library homepages of bachelor-degree-granting institutions in Kentucky for accessibility compliance using Watchfire’s WebXACT accessibility tester and the W3C’s HTML validator. The results showed that from 2003 to 2007, the number of library homepages complying with basic accessibility guidelines increased.7

Billingham conducted research on the Edith Cowan University (ECU) Library websites. The websites were tested twice, in October 2012 and June 2013, using automated testing tools such as code validators and color analysis programs. The first test found that 11 percent of the WCAG 2.0 Level A to Level AA guidelines were passed, and there was a small increase in the percentage of WCAG 2.0 guidelines passed by all pages tested in the second test.8

While quite a few research studies focus on library website accessibility rather than university websites generally, their conclusions diverge.
Solovieva and Bock (2014) tested 509 webpages at a large public university in the northeastern United States using WAVE (http://wave.webaim.org) and Cynthia Says (http://www.cynthiasays.com). The results indicated that 51 percent of those webpages passed automated website accessibility tests for Section 508 compliance with Cynthia Says. However, when using WAVE for WCAG Priority 1 compliance, which is a more rigorous evaluation level, only 35 percent passed the test.9

Maatta Smith reported that not one of the websites of 127 US members of the Urban Library Council (ULC) was without Errors or Alerts, with the average number of Errors being 27.10 Such results were similar to those of Liu, Bielefield, and McKay.11, 12 Maatta Smith also found that about half (58 of 127) of the urban public libraries provided no information specifically for individuals with disabilities. Of the 127 websites, some caused confusion by using a variety of verbiage to suggest services for individuals with disabilities. Sixty-six of them provided some information about services within the library for individuals with disabilities. The depth of the information varied, but in all instances contact information was included for additional assistance.

Liu, Bielefield, and McKay examined 122 library homepages of ULC members and reported three main findings. First, only seven of the homepages presented as Error free when tested for compliance with the Section 508 standards; the highest percentage of Errors occurred under accessibility requirements 508(a) and 508(n). Second, the number of issues was dependent on the population served: libraries serving larger populations tend to have more accessibility issues than those serving smaller ones. Third, the most common Errors were Missing Label and Contrast Errors, while the highest number of Alerts was related to the device-dependent event handler, which means that a keyboard or mouse is a necessary piece of equipment to initiate a desired transaction.12

Although they were interested in overall website accessibility, Theofanos and Redish focused their research on the visually impaired website user. The authors investigated and revealed six reasons to bridge the gap between accessibility and usability:

1. Disabilities affect more people than you may think. Worldwide, 750 million people have a disability, and three of every ten families are touched by a disability. In the United States, one in five people have some kind of disability, and one in ten have a severe disability. That’s approximately 54 million Americans.
2. It is good business. According to the President’s Committee on the Employment of People with Disabilities, the discretionary income of people with disabilities is $175 billion.
3. The number of people with disabilities and income to spend is likely to increase. The likelihood of having a disability increases with age, and the overall population is aging.
4. The website plays an important role and has significant benefits for people with disabilities.
5. Improving accessibility enhances usability for all users.
6. It is morally the right thing to do.13

Lazar, Dudley-Sponaugle, and Greenidge validated that most blind users are just as impatient as most sighted users. They want to get the information they need as quickly as possible.
They don’t want to listen to every word on the page, just as sighted users do not read every word.14 Similarly, Foley found that using automated validation tools did not ensure complete accessibility: students with low vision found many of the pages hard to use even though the pages had been validated.15

Outcomes of all the research revealed that most university library websites have developed a policy on website accessibility, but the policies of most universities had deficiencies.16 Library staff must be better informed and trained to understand the tools available to users, and when reviewing web pages, audiences of all kinds must be considered.17

RESEARCH DESIGN AND METHODS

This study, a continuing effort from an earlier study on urban library websites, made use of content analysis methodology to examine the website accessibility of the university libraries against the Americans with Disabilities Act (ADA), with a focus on those with visual or cognitive disabilities.18 Under the ADA, people with disabilities are guaranteed access to all postsecondary programs and services. The evaluation of accessibility focuses on the main pages of these university library websites, as shown in table 1, because these homepages considerably demonstrate the institution’s best effort or, at least, its most recent redesign. It was the intent of the authors of this research to reveal the current status of the Ivy League library homepages’ accessibility and the importance that Ivy League universities attach to the accessibility of their websites.

Commonly recognized website evaluators (WAVE, AChecker, and Cynthia Says), along with other online tools, evaluate a website’s accessibility by checking its HTML and XML code. WAVE and AChecker were selected for this study for the robustness of their evaluation based on W3C guidelines, the comprehensiveness of their evaluation reporting, and their ready availability to any institution or individual conducting website evaluations.

WAVE is a web evaluation tool that was utilized to check websites against Section 508 standards and WCAG 2.0 guidelines. This assessment was conducted by entering a uniform resource locator (URL), or website address, in the search box. The evaluation tool provided a summary of Errors, Alerts, Features, Structural Elements, and HTML5 and ARIA. AChecker is a tool that checks single HTML pages for conformance with accessibility standards to ensure the content can be accessed by everyone. It produces a report of all accessibility problems for the selected guidelines, organized into three types of problems: Known Problems, Likely Problems, and Potential Problems. Both WAVE and AChecker help website developers make their website content more accessible.

Data from different periods were compared to show statistically whether enough attention was paid to accessibility issues by the Ivy League university systems. The study team collected the first data set in February 2014, using WAVE for Section 508. In 2018, the AChecker accessibility checker was used for both Section 508 and WCAG 2.0 AA. The Access Board published new requirements for information and communication technology covered by Section 508 of the Rehabilitation Act (https://www.access-board.gov/guidelines-and-standards/communications-and-it/about-the-ict-refresh) on January 18, 2017. The latest WCAG 2.0 guidelines were updated on September 5, 2013 (https://www.w3.org/TR/wcag2ict/).
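The evaluations in this study were run through the tools’ web interfaces, but batch audits like this can also be scripted. The sketch below shows how a set of homepages could be submitted to WebAIM’s subscription WAVE API and the summary counts tallied; the endpoint, parameters, and JSON field names reflect WebAIM’s published API as best understood and should be verified against the current documentation, and the API key is a placeholder.

```python
import requests

# Placeholder credentials and endpoint for WebAIM's subscription WAVE API.
# Verify the request format against https://wave.webaim.org/api/ before use.
WAVE_API = "https://wave.webaim.org/api/request"
API_KEY = "YOUR_WAVE_API_KEY"  # hypothetical; WAVE API access requires a paid key

HOMEPAGES = [
    "https://library.brown.edu",
    "http://library.columbia.edu",
    # ... the remaining Ivy League library homepages from table 1 ...
]

for url in HOMEPAGES:
    resp = requests.get(
        WAVE_API,
        params={"key": API_KEY, "url": url, "reporttype": 1},  # 1 = summary statistics
        timeout=60,
    )
    resp.raise_for_status()
    report = resp.json()

    # The summary report groups results into categories such as "error",
    # "alert", "feature", "structure", and "aria", each with a count
    # (field names assumed; check against current API docs).
    categories = report.get("categories", {})
    counts = {name: cat.get("count", 0) for name, cat in categories.items()}
    print(url, counts)
```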
While the WAVE development team indicated that they have updated the indicators in WAVE regarding WCAG 2.0, the current indicators regarding Section 508 refer to the previous technical standards for Section 508, not the updated 2017 ones. According to AChecker.ca, the versions of the Section 508 standards and the WCAG 2.0 AA guidelines used were published on March 12, 2004, and June 19, 2006, respectively, with neither being the latest version.

This study centered on three research questions:

1. Are the library websites of the eight Ivy League universities ADA compliant?
2. Are there easily identified issues that present barriers to access for the visually impaired on the Ivy League university library homepages?
3. What should Ivy League libraries do to achieve ADA compliance and to maintain it?

Table 1. Investigated Websites of Ivy League University Libraries.

Brown University Library: https://library.brown.edu
Columbia University Libraries: http://library.columbia.edu
Cornell University Library: https://www.library.cornell.edu
Dartmouth Library: https://www.library.dartmouth.edu
Harvard Library: http://library.harvard.edu
Princeton University Library: http://library.princeton.edu
Penn Libraries: http://www.library.upenn.edu
Yale University Library: https://web.library.yale.edu

RESULTS & DISCUSSION

All five evaluation categories employed by WAVE for Section 508 standards, as shown in figure 1, were examined, with a more in-depth review of the homepage of the University of Pennsylvania library. Similar numbers across the five categories are presented on the library homepages of Brown University, Columbia University, and Cornell University. Interestingly, WAVE indicates more Errors and Alerts on the homepage of Yale University.

Figure 1. WAVE Results for Section 508 Standards.

In order to determine the accuracy of the results, the team also used AChecker to reevaluate these homepages in 2018. Known Problems, as a category in AChecker, are as serious as Errors in WAVE: they have been identified with certainty as accessibility barriers and need to be fixed. Likely Problems are problems that could be barriers and require a human to decide whether they need to be fixed. AChecker cannot identify Potential Problems with certainty and requires a human to confirm whether the identified problems need remediation. Figure 2 shows the numbers for each category as detected by AChecker on June 18, 2018, on the eight Ivy League university libraries’ homepages. The library homepage of the University of Pennsylvania was found to contain the most problems, the same result as from WAVE. However, among the seven remaining libraries’ homepages, the homepage of the Harvard University library displayed the same number of problems as the University of Pennsylvania as detected by AChecker.

Figure 2. AChecker Results for Section 508 Standards.
There was significant improvement between 2014 and 2018

The results of this research from WAVE for Section 508 standards signify a significant shift in the accessibility of these websites between 2014 and 2018. Among the five WAVE detection categories on the eight library homepages, the totals of Errors and Alerts decreased during this period. For instance, the total number of Errors decreased from 36 in 2014 to 11 in 2018, and the number of Alerts decreased from 141 to 14. Figure 3 shows the number of Errors on each library homepage, and figure 4 shows the number of Alerts; both show a downward trend from 2014 to 2018. Features, Structural Elements, and HTML/ARIA, however, were all on the rise when comparing the two years’ data sets. The green sections in table 2 indicate a decrease in the numbers in the three categories from 2014 to 2018, and the yellow sections indicate an increase. These data results revealed that Errors and Alerts, the most common problems related to access, had been better controlled during these years, while the other categories might still remain unaddressed.

Figure 3. Change of Errors from 2014 to 2018.

Figure 4. Change of Alerts from 2014 to 2018.

Table 2. Changes of Features, Structural Elements, and HTML/ARIA between 2014 and 2018.

                                 Features       Structural Elements   HTML/ARIA
Year of Data Collection          2014    2018   2014    2018          2014    2018
Total                            108     191    184     233           24      89
Brown University Library         13      15     6       13            0       1
Columbia University Libraries    12      13     23      14            17      0
Cornell University Library       5       6      20      18            0       4
Dartmouth Library                10      8      15      27            0       23
Harvard Library                  20      20     14      24            0       4
Princeton University Library     15      31     45      24            0       3
Penn Libraries                   12      90     29      104           7       50
Yale University Library          21      8      32      9             0       4

Missing Form Labels were the Top Error Against the ADA

The data used in the analysis below were all collected in 2018. All Errors appearing in the data results were collected and analyzed. Figure 5 shows the number of Errors that were identified based on the specific requirements contained in Section 508 of the Rehabilitation Act, as evaluated by WAVE.

Figure 5. Occurrences of Specific Error per Specific 508 Standards.

The term Error refers to accessibility errors that need to be fixed. Missing Form Label was the most frequent Error type. Only two types of Errors occurred on the Ivy League university libraries’ homepages, and these Errors did not appear on every homepage: several Errors appeared on some homepages, while others had none. For example, Linked Image Missing Alternative Text occurred on the library homepage of Harvard University twice. Table 3 shows the distribution of Errors across the eight homepages.

Table 3. Distribution of Errors in Eight Homepages.

                                 Missing Form Label   Linked Image Missing Alternative Text
Brown University Library         0                    0
Columbia University Libraries    1                    0
Cornell University Library       0                    0
Dartmouth Library                3                    0
Harvard Library                  0                    2
Princeton University Library     0                    0
Penn Libraries                   1                    0
Yale University Library          4                    0

Missing Form Label is listed under Section 508(n) and means there is a form control without a corresponding label. This is important because if a form control does not have a properly associated text label, the function or purpose of that form control may not be presented to screen reader users.
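As an illustration of what these two Error types look like in markup, the hedged sketch below shows a failing pattern and an accessible revision for each; the element names and text are invented for demonstration and are not taken from any of the evaluated homepages.

```html
<!-- Missing Form Label: the search box has no programmatically
     associated label, so a screen reader cannot announce its purpose. -->
<input type="text" name="q">

<!-- Fixed: an explicit <label> is tied to the control via for/id. -->
<label for="search">Search the catalog</label>
<input type="text" id="search" name="q">

<!-- Linked Image Missing Alternative Text: the image is the only content
     of the link and has no alt text, so the link is empty to assistive
     technology. -->
<a href="/hours"><img src="clock.png"></a>

<!-- Fixed: the alt text describes the function of the link. -->
<a href="/hours"><img src="clock.png" alt="Library hours"></a>
```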
Linked Image Missing Alternative Text occurred only on the Harvard library homepage among the eight Ivy League university libraries’ homepages. It indicates that an image without alternative text results in an empty link: if an image is within a link that does not provide alternative text, a screen reader has no content to present to the user regarding the function of the link. These website accessibility issues may be easy fixes and considered minor by some; however, if they are not detected, they are major barriers for persons living with low vision or blindness. As a result, users are left at a disadvantage because they lack critical information needed to successfully fulfill their needs. Examples of such Error icons in WAVE are displayed in figures 6 and 7.

Figure 6. Missing Form Label Icon from Yale University Library Homepage.

Figure 7. Linked Image Missing Alternative Text Icon from Harvard Library Homepage.

A total of eleven Errors, as shown in figure 8, were located on the homepages of the eight Ivy League libraries; the figure illustrates the number of Errors that occurred on each library homepage. The average number of Errors per homepage was 1.375. The Yale University library homepage had the most Errors, with a total of four. The library homepages of Brown University, Cornell University, and Princeton University performed best, with zero Errors.

Figure 8. The Total of Errors in Ivy League Libraries’ Homepages.

Six Alerts appear among ADA requirements

The issues that Alerts identify are also significant for website accessibility. Figure 9 shows the six different kinds of Alerts that were identified based on the specific requirements contained in Section 508 of the Rehabilitation Act.

Figure 9. Occurrences of Specific Alert per Specific 508 Standards.

The Noscript Element was the most frequently encountered Alert issue. Alerts that WAVE reports need close scrutiny because they likely represent an end-user accessibility issue. The Noscript Element relates to the 508(l) requirement and means a noscript element is present, whose content is shown only when JavaScript is disabled. Because almost all users of screen readers and other assistive technologies have JavaScript enabled, noscript cannot be used to provide an accessible version of inaccessible scripted content.

Skipped Heading Level ranked second in number. The importance of headings lies in their provision of document structure and their facilitation of keyboard navigation for users of assistive technology. These users may be confused or may experience difficulty navigating when heading levels are skipped. Examples of the icons for these Alerts, which WAVE flags as likely barriers to accessibility, are shown in figures 10 and 11.

Figure 10. Noscript Element Icon from Cornell University Library Homepage.

Figure 11. Skipped Heading Level Icon from Dartmouth Library Homepage.

A total of fourteen Alert problems were detected. Figure 12 illustrates the number of Alerts that occurred on each library homepage. On average, there were 1.75 Alerts present across the eight websites. The library homepages of Yale University and the University of Pennsylvania had the most Alerts, with four on each site. Only the Brown University library’s homepage had zero Alerts.
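To show what the Skipped Heading Level Alert refers to, the sketch below contrasts a skipped heading sequence with a properly nested one; the headings are invented for illustration.

```html
<!-- Skipped Heading Level: the document jumps from <h1> to <h3>,
     which breaks the outline that assistive technology navigates. -->
<h1>Library Services</h1>
<h3>Interlibrary Loan</h3>

<!-- Fixed: heading levels descend one step at a time. -->
<h1>Library Services</h1>
<h2>Borrowing</h2>
<h3>Interlibrary Loan</h3>
```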
Figure 12. The Total of Alerts in Ivy League Libraries’ Homepages.

Linked Image with Alternative Text was the most frequently found Feature issue

Features, as a category of issues, indicate conditions of accessibility that probably need to be improved and usually require further verification and manual fixing. For example, if a Feature is detected on a website, further manual verification is required to confirm its accessibility. Figure 13 shows the number of Features that were identified based on the specific requirements contained in Section 508 of the Rehabilitation Act.

Figure 13. Occurrences of Specific Features per Specific 508 Standards.

Linked Image with Alternative Text, which relates to the 508(a) requirement, was the most frequently encountered Feature issue. It means that alternative text is present for an image that is within a link. By including appropriate alternative text on an image within a link, the function and purpose of the link and the content of the image are available to screen reader users even when images are unavailable. Another frequently occurring Feature was Form Label, which means a form label is present and associated with a form control. A properly associated form label is presented to a screen reader user when the form control is accessed. The evaluation steps were the same as those used for Errors and Alerts. Example icons of Features evaluated by WAVE are displayed in figures 14 and 15.

Figure 14. Linked Image with Alternative Text Icon from Brown University Library Homepage.

Figure 15. Form Label Icon from Penn Libraries Homepage.

This study also ranked the number of Features detected by WAVE on the eight Ivy League library homepages. Figure 16 displays the number of Features that occurred on each library homepage. In total, 191 Features were detected by WAVE on the eight Ivy League university libraries’ homepages. The homepage of the University of Pennsylvania library was found to have 90 Features, by far the most of all the libraries. No library was entirely free of Features according to the WAVE measurement using Section 508 standards.

Figure 16. The Total of Features in Ivy League Libraries’ Homepages.

Table 4A. Comparison between WAVE & AChecker Section 508 Standards on Brown’s and Columbia’s library homepages.

Section 508      Brown University                     Columbia University
Standard         WAVE           AChecker              WAVE           AChecker
                 April   June   April   June          April   June   April   June
Total            33      29     47      47            28      29     79      83
A                9       9      9       9             12      13     12      14
C                0       0      14      14            0       0      26      28
D                0       0      8       8             0       0      14      14
J                0       0      8       8             0       0      14      14
L                0       0      6       6             0       0      12      12
N                1       1      1       1             1       1      0       0
O                23      19     1       1             15      15     1       1
(No issues were found under the remaining Section 508 standards.)

Table 4B. Comparison between WAVE & AChecker Section 508 Standards on Cornell’s and Dartmouth’s library homepages.

Section 508      Cornell University                   Dartmouth College
Standard         WAVE           AChecker              WAVE           AChecker
                 April   June   April   June          April   June   April   June
Total            30      29     107     106           59      68     65      67
A                2       2      2       2             8       8      10      11
C                0       0      36      36            0       0      22      23
D                0       0      32      32            0       0      9       9
J                0       0      33      32            0       0      9       9
L                0       0      3       3             0       0      7       7
N                7       7      0       0             23      29     8       8
O                21      20     1       1             28      31     0       0
(No issues were found under the remaining Section 508 standards.)
Table 4C. Comparison between WAVE & AChecker Section 508 Standards on Harvard’s and Princeton’s library homepages.

Section 508      Harvard University                   Princeton University
Standard         WAVE           AChecker              WAVE           AChecker
                 April   June   April   June          April   June   April   June
Total            51      51     139     139           57      61     74      74
A                20      20     29      29            25      25     20      20
C                0       0      43      43            0       0      32      32
D                0       0      32      32            0       0      10      10
J                0       0      34      34            0       0      10      10
L                0       0      0       0             0       0      1       1
N                5       5      0       0             3       7      0       0
O                26      26     1       1             29      29     1       1
(No issues were found under the remaining Section 508 standards.)

Table 4D. Comparison between WAVE & AChecker Section 508 Standards on Pennsylvania’s and Yale’s library homepages.

Section 508      University of Pennsylvania           Yale University
Standard         WAVE           AChecker              WAVE           AChecker
                 April   June   April   June          April   June   April   June
Total            253     249    129     139           28      29     84      85
A                40      37     14      19            6       7      4       5
C                0       0      82      87            0       0      28      28
D                0       0      11      11            0       0      21      21
G                0       0      0       0             0       0      1       1
J                0       0      11      11            0       0      21      21
L                1       1      9       9             3       3      4       4
M                3       2      0       0             0       0      0       0
N                103     104    1       1             8       8      4       4
O                106     105    1       1             11      11     1       1
(No issues were found under the remaining Section 508 standards.)

A Few 508 Standards Deviated between the Two Evaluators

To determine whether the WAVE tool missed some specific requirements in Section 508, the authors comparatively examined the eight university homepages using both WAVE and AChecker, site by site and synchronously, in April and again in June 2019. There are sixteen provisions in Section 508, labeled A through P. Tables 4A–4D report the issues found under these Section 508 requirements on the eight universities’ homepages. With the exception of requirement G, for which AChecker reported one issue on the Yale library homepage, neither WAVE nor AChecker found any issues during our examination for the following seven requirements (B, E, F, H, I, K, and P):

B. Equivalent alternatives for any multimedia presentation shall be synchronized with the presentation.
E. Redundant text links shall be provided for each active region of a server-side image map.
F. Client-side image maps shall be provided instead of server-side image maps except where the regions cannot be defined with an available geometric shape.
H. Markup shall be used to associate data cells and header cells for data tables that have two or more logical levels of row or column headers.
I. Frames shall be titled with text that facilitates frame identification and navigation.
K. A text-only page, with equivalent information or functionality, shall be provided to make a website comply with the provisions of this part when compliance cannot be accomplished in any other way. The content of the text-only page shall be updated whenever the primary page changes.
P. When a timed response is required, the user shall be alerted and given sufficient time to indicate that more time is required.

The results tabulated in tables 4A–4D indicate that these seven Section 508 requirements are perhaps not problematic for these websites.

CONCLUSIONS

Based on the results, this study determined that the eight Ivy League universities’ homepages exhibited some issues with accessibility for people with disabilities. Considerable effort is necessary to ensure their websites are ready to meet the challenges and future needs of web accessibility. Users with visual impairments can navigate a website only when it is designed to be accessible with assistive technology. While each institution presented both general and comprehensive coverage of services for users with disabilities, it would be more practical and efficient if specific links were posted on the homepage.
According to the American Foundation for the Blind (https://www.afb.org), “usability” is a way of describing how easy a website is to understand and use, while accessibility refers to how easily a website can be used, understood, and accessed by people with disabilities.

This study has concluded that expertise and specialized training and skill are still needed in this area. Principles of accessible website design must be introduced and taught, underscoring that design matters for people with disabilities online just as it does in the physical environment. As highlighted earlier, most of the problems detected through the evaluation tool WAVE can be fixed with the solutions it provides. Frequent review is critical, and websites should be assessed at a minimum on a yearly basis for accessibility compliance. There is much to be done if accessibility is to be realized for everyone.

LIMITATIONS

The authors recognize that this study, using free website accessibility testing tools, has certain limitations. As WAVE notes on its help page, the aim for website developers is not to eliminate every identified problem category (only Errors must be fixed) but to determine whether a website is accessible. At the time of writing, neither WAVE nor AChecker had been updated with the latest WCAG 2.1 AA rules. While WCAG 2.1 is expected to provide new guidelines for making websites even more accessible, more careful and comprehensive studies against the WCAG 2.1 AA rules could further assist university library professionals and their website developers in providing those with disabilities with accessible websites. Moreover, while it is effective to conduct these machine-generated evaluations, it is equally important that researchers check the issues manually to impose human analysis in determining the major issues with content.

ENDNOTES

1 Joan M. Reitz, ODLIS: Online Dictionary for Library and Information Science (Westport, CT: Libraries Unlimited, 2004), 1–2.

2 Darlene Fichter, “Making your Website Accessible,” Online Searcher 37, no. 4 (2013): 73–76.

3 Fichter, “Making your Website Accessible,” 74.

4 Axel Schmetzke, Web Page Accessibility on University of Wisconsin Campuses: A Comparative Study (Stevens Point, WI, 2019).

5 Jeffrey Rubin and Dana Chisnell, Handbook of Usability Testing: How to Plan, Design, and Conduct Effective Tests (Indianapolis, IN: Wiley, 2008), 6–11.

6 Alan Foley, “Exploring the Design, Development and Use of Websites through Accessibility and Usability Studies,” Journal of Educational Multimedia and Hypermedia 20, no. 4 (2011): 361–85, http://www.editlib.org/p/37621/.

7 Michael Providenti and Robert Zai III, “Web Accessibility at Kentucky’s Academic Libraries,” Library Hi Tech 25, no. 4 (2007): 478–93, https://doi.org/10.1108/07378830710840446.

8 Lisa Billingham, “Improving Academic Library Website Accessibility for People with Disabilities,” Library Management 35, no. 8/9 (2014): 565–81, https://doi.org/10.1108/LM-11-2013-0107.

9 Tatiana I. Solovieva and Jeremy M. Bock, “Monitoring for Accessibility and University Websites: Meeting the Needs of People with Disabilities,” Journal of Postsecondary Education and Disability 27, no.
2 (2014): 113–27, http://search.proquest.com/docview/1651856804?accountid=9744.

10 Stephanie L. Maatta Smith, “Web Accessibility Assessment of Urban Public Library Websites,” Public Library Quarterly 33, no. 3 (2014): 187–204, https://doi.org/10.1080/01616846.2014.937207.

11 Yan Quan Liu, Arlene Bielefield, and Peter McKay, “Are Urban Public Libraries’ Websites Accessible to Americans with Disabilities?,” Universal Access in the Information Society 18, no. 1 (2019): 191–206, https://doi.org/10.1007/s10209-017-0571-7.

12 Liu, Bielefield, and McKay, “Are Urban Public Library Websites Accessible.”

13 Mary Frances Theofanos and J. Redish, “Bridging the Gap: Between Accessibility and Usability,” Interactions 10, no. 6 (2003): 36–51, https://doi.org/10.1145/947226.947227.

14 Jonathan Lazar, A. Dudley-Sponaugle, and K. D. Greenidge, “Improving Web Accessibility: A Study of Webmaster Perceptions,” Computers in Human Behavior 20, no. 2 (2004): 269–88, https://doi.org/10.1016/j.chb.2003.10.018.

15 Foley, “Exploring the Design,” 365.

16 David A. Bradbard, Cara Peters, and Yoana Caneva, “Web Accessibility Policies at Land-grant Universities,” Internet & Higher Education 13, no. 4 (2010): 258–66, https://doi.org/10.1016/j.iheduc.2010.05.007.

17 Mary Cassner, Charlene Maxey-Harris, and Toni Anaya, “Differently Able: A Review of Academic Library Websites for People With Disabilities,” Behavioral & Social Sciences Librarian 30, no. 1 (2011): 33–51, https://doi.org/10.1080/01639269.2011.548722.

18 Liu, Bielefield, and McKay, “Are Urban Public Library Websites Accessible,” 195.

ARTICLES

Bento-Box User Experience Study at Franklin University

Marc Jaffy

INFORMATION TECHNOLOGY AND LIBRARIES | MARCH 2020
https://doi.org/10.6017/ital.v39i1.11581

Marc Jaffy (marc.jaffy@franklin.edu) is Acquisitions Librarian, Franklin University.

ABSTRACT

This article discusses the benefits of the bento-box method of searching library resources, including a comparison of the method with a tabbed search interface. It then describes a usability study conducted by the Franklin University Library in which 27 students searched for an article, an ebook, and a journal on two websites: one using a bento box and one using the EBSCO Discovery Service (EDS). Screen recordings of the searches were reviewed to see what actions users took while looking for information on each site, as well as how long the searches took. Students also filled out questionnaires to indicate what they thought of each type of search. Overall, students found more items on the bento-box site and indicated a slight preference for the bento-box search over EDS.
The bento-box site also provided quicker results than the EDS site. As a result, the Franklin University Library decided to implement bento-box searching on its website.

INTRODUCTION

“One page, one search box, results from as many library-resource types as possible.”1

In 2018, the Franklin University Library redesigned its website to provide users with a more modern interface that more closely matched Franklin University’s website. The library also wanted to improve the site’s usability and make it easier for students to find information. To determine how to best improve the user experience, library staff members held a number of meetings to discuss the new site’s layout and contents. Because “students almost always resort[] to searching via Web site search boxes rather than navigating through the Web site by browsing,” a crucial decision involved what search results the redesigned library website would provide.2

As a result of these discussions, the Franklin University Library’s initial website redesign included a persistent search bar in the upper left of each page, which searched the library’s website, as well as a prominent tabbed search bar on the library’s homepage (see figure 1). The homepage search bar provided a default tab that used EBSCO Discovery Service (EDS) to search the library resources cataloged in EDS (most of the library’s databases and catalog) and a second tab which used EBSCO’s Journal Finder to look for e-journals.

Once our new website went live, feedback from patrons demonstrated that the persistent website search bar caused confusion among users who expected it to search the library’s databases rather than the library’s website. We also found the “search journals” tab on the homepage unnecessary. As a result, we removed both the persistent search bar and the journals tab. After these changes, the main search option provided to library users was the EDS search bar on the library’s homepage, although some interior pages of the library’s website provided a search bar related to that page (such as an option to search the catalog on the catalog page).

Although EDS searches mainly for articles and books, it “may overlook user needs for other types of library resources or services.”3 This is a problem because “library users increasingly perceive the discovery interface as a portal to all of the library’s resources.”4 Due to dissatisfaction with the EDS search, the library decided to look for alternatives. One alternative which “[a] number of libraries have turned to [is] the bento-based approach to discovery and display.”5

Figure 1. Redesigned Franklin University Library website with two search boxes on the homepage. The circled search box in the upper left was initially persistent across the entire site.

To determine whether the bento-box search format would serve our users better than EDS, the library designed and conducted a usability study comparing EDS and bento-box searches. By comparing user search behavior and results for each search method, as well as user opinion regarding these different methods of searching library websites, the library hoped to gain a clearer understanding of how its users interact with search boxes on the library’s website and, most importantly, which search method would best serve its users. The remainder of this article sets forth the results of that trial.
After explaining what bento-box search is, as well as reasons a library might want to use bento-box search results, it reports on a usability study the Franklin University Library conducted, discussing both observations from screen recordings of user search behavior and responses to questionnaires.

BENTO-BOX LIBRARY SEARCH

What Is Bento-Box Search?

The term “bento box” comes from “Japanese cuisine where different parts of a meal are compartmentalized in aesthetically pleasing ways.”6 Instead of compartmentalizing food, a bento-box search results page compartmentalizes search results from a variety of different resources on a single page. The user sees a single search box which gives “side-by-side results from multiple library search tools, e.g., library catalog, article index, institutional repository, website, LibGuides, etc.”7

A bento-box search provides results based on searches of individual library resources. This is important because of the difficulty of providing a single search that includes combined results from all resources: “the nature of library content and current technology makes it difficult to create usable ‘blended’ results; catalog materials may crowd out books or vice versa.”8 Bento-box results avoid this problem by “provid[ing] these resources on equal footing, leveraging the ranking algorithms internal to each resource’s individual search interface.”9 As a result, the bento box gives libraries “the best cost/benefit way to improve article search relatively quickly with relatively large user benefit.”10

Figure 2, the University of Michigan’s bento-box results page, illustrates how a bento box provides search results from a variety of resources in visually discrete boxes. This is done behind the scenes by using separate searches to query the individual resources, as demonstrated by figure 3, the architecture of Wayne State University’s bento-box search.

Benefits of a Bento-Box Search Results Page

Wayne State University’s switch to a bento box “resulted in increased access to resources.”11 A bento box can increase access to resources both because it makes library search easier for users and because it provides results in a format that makes it easier for users to find information.

Simplified Search

When deciding what type of search to provide on the library website, the main consideration involves what users expect when searching. Student experiences with internet search engines have influenced their expectations for library search, which leads them to “approach library search interfaces as if they were Google.”12 What do users like about Google? “One of the main reasons that users are satisfied with Google is its simple user interface.”13 Based on their experience with Google and other search engines, library users “expect easy searching” that provides “one-step immediate access to the full text of library resources.”14

Bento-box results let libraries meet these expectations by presenting users with a simple interface that permits easy searching and “returns results across many library sources.”15 Additionally, bento-box results “can integrate library website results, allowing users to type things like ‘how do I renew a book’ into a single search box and get meaningful results.”16 As a result, adopting a bento-box search results page can permit a library to satisfy user search expectations. The bento-box format will provide users the information they seek whether they are looking for an article, a book, or information about the library.
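To make the “separate searches behind one box” idea concrete, here is a minimal sketch of how a bento-box backend might fan a single query out to several sources in parallel and return grouped results. The endpoint URLs and response fields are hypothetical stand-ins; a real implementation would call the library’s actual catalog, discovery, journal, and website search APIs and render each group in its own box.

```python
import asyncio
import aiohttp

# Hypothetical search endpoints; each stands in for a real source such as
# the catalog, an article index, the journal finder, or the website search.
SOURCES = {
    "Catalog":  "https://library.example.edu/api/catalog/search",
    "Articles": "https://library.example.edu/api/articles/search",
    "Journals": "https://library.example.edu/api/journals/search",
    "Website":  "https://library.example.edu/api/site/search",
}

TOP_N = 3  # show only a few results per box, with a "see all" link elsewhere

async def search_source(session, name, endpoint, query):
    """Query one source and return its top results; a failure in one
    source should not take down the whole bento page."""
    try:
        async with session.get(
            endpoint,
            params={"q": query},
            timeout=aiohttp.ClientTimeout(total=5),
        ) as resp:
            data = await resp.json()
            # Assumes each source returns {"results": [{"title": ...}, ...]}.
            return name, data.get("results", [])[:TOP_N]
    except Exception:
        return name, []

async def bento_search(query):
    """Fan the query out to every source in parallel and keep the results
    grouped by source, mirroring the bento box's side-by-side display."""
    async with aiohttp.ClientSession() as session:
        tasks = [search_source(session, n, ep, query) for n, ep in SOURCES.items()]
        return dict(await asyncio.gather(*tasks))

if __name__ == "__main__":
    boxes = asyncio.run(bento_search("lean six sigma"))
    for source, results in boxes.items():
        print(source, [r.get("title") for r in results])
```

Because each source is queried with its own ranking intact and the groups are never merged, this mirrors the “equal footing” property the quoted literature describes.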
Figure 2. University of Michigan Library’s bento-box results.

Figure 3. Wayne State University Library bento-box architecture, from Cole Hudson and Graham Hukill, “One-to-Many: Building a Single-Search Interface for Disparate Resources,” in K. Varnum (ed.), Exploring Discovery: The Front Door to Your Library’s Licensed and Digitized Content (Chicago: ALA Editions, 2016): 147.

Better Presentation of Results

Bento-box results can help alleviate user confusion because “format types [are] more evident: Novice users, such as undergraduates, may not have a good understanding of the difference between books, journals, and articles.”17 The bento-box presentation makes it easier for users to find information since “results from different sources are returned to visually discrete boxes”18 on a single page. This presentation of grouped results benefits library users because “[b]y presenting search results in separate streams, users can more easily navigate to what they need.”19 When the Princeton University Library implemented a bento-box results page (termed All Search), it found that “most users praised the new All Search for its ease of use and also for the ‘bento-box’ approach of grouping results by category. They felt that this clarified the results they were seeing and made it easier for them to pursue different avenues of inquiry.”20

Comparison with Tabbed Searching

One alternative to the bento box is to offer users a tabbed search box, which lets users select a specific resource to search by selecting a tab on the search bar. Before the 2018 redesign, the Franklin University Library’s website provided users with a tabbed search box, as shown in figure 4. Our redesigned website reduced the number of tabs from four to two, but we ultimately removed tabbed search from our website because we did not find it effective.

Figure 4. Previous Franklin University Library website with tabbed search box.

Tabbed search requires users to decide which tab to use. When the Franklin University Library provided a tabbed search box, we found that users had difficulty identifying which tab they should use. In addition to causing user confusion over which tab to use, because each search tab searches only a portion of library resources, tabbed searching “misses a wide swath of available information and resources [which] will make that missing information practically invisible.”21

Another problem with tabbed search is that it requires a library to designate one of the tabs as a default search.
This can lessen the chances that users will search (or use) resources in the non-default tab(s) because library users “tend to favor the most prominent search option.”22 Lown, Sierra, and Boyer cited a study which found that the default option was used 60.5 percent of the time in a tabbed interface, and reported that on the North Carolina State University website the default tab was used 73.7 percent of the time.23

Tabbed search does not meet library user needs because it is “inconsistent and confusing.”24 When Wayne State University switched from tabbed searching to a bento box, “many library resources that were previously hidden from search and discovery on the main library website were, for the first time, exposed to all searches, for all users . . . [which] resulted in increased use and awareness of these resources.”25

Design Considerations

Bento-box results pages are highly customizable. A 2017 review of 38 academic libraries using bento-box search found “much variation in the implementation.”26 A bento-box results page needs to balance providing necessary information with displaying results in a way that is neither overwhelming nor cluttered. The University of Michigan analyzed usage of its bento-box results page and redesigned it to improve how it presented results to library users by displaying “[f]ewer results . . . in each section—with a more prominent ‘see all’ link—than in the original design.”27

Given user expectations and the challenges previously discussed, libraries must design the results to “maximize the exposure of [their] collection[s] and services, in the most appropriate precedence, while preventing cognitive overload.”28 A cluttered results page, with a lack of distinction between categories, will make it difficult for users to find information and will cause confusion rather than ease it.

Another concern with a bento-box results page occurs when some of the result boxes “end up ‘below the fold,’ meaning users will need to scroll down to see them. This creates the same problem as a tabbed search box—users don’t see results from all library sources.”29 Because users are less likely to see below-the-fold search results, the bento-box results page needs to prioritize category locations so that the more important results are above the fold (which requires the library to determine the relative importance of search result categories).

USER EXPERIENCE STUDY AT FRANKLIN UNIVERSITY

The Trial Design

During Franklin University’s fall 2018 and spring 2019 welcome weeks, in addition to providing students with information about the library’s services, staff at the library’s information table asked students to participate in a trial to help determine whether adopting a bento-box results page would benefit our users. Participants were offered a Franklin University coffee mug as an incentive.

The trial asked participants to look for information on two different library websites: one using EDS (Franklin University Library) and one using a bento box (Wayne State University Library). We set up two laptops for participants to use and made screen recordings of the participants’ actions during their searches for later viewing and analysis. After they finished the tasks, we asked participants to fill out a questionnaire (reproduced in Appendix A), which had three background questions and six questions about their experience and thoughts relating to the tasks.

To decide what information to ask participants to look for, we reviewed library websites that used bento boxes to see what categories they searched. We compared those categories to the types of information available on our library’s website. Although we identified a number of possible categories, we decided to limit the trial to three tasks because we did not want to overburden participants. Based on our experience working with students, we decided on the three categories we were most interested in investigating: articles, books, and journals. To see how users searched for items in these categories, we asked participants to complete the following three tasks on each website:
To decide what information to ask participants to look for, we reviewed library websites that used bento boxes to see what categories they searched. We compared those categories to the types of information available on our library’s website. Although we identified a number of possible categories, we decided to limit the trial to three tasks because we did not want to overburden participants. Based on our experience working with students, we decided the three categories we were most interested in investigating: articles, books, and journals. To see how users searched for items in these categories we asked participants to complete the following three tasks on each website: INFORMATION TECHNOLOGY AND LIBRARIES MARCH 2020 BENTO BOX USER EXPERIENCE STUDY AT FRANKLIN UNIVERSITY | JAFFY 8 1. Find an article available through the library on the topic “criminal justice.” 2. Find the ebook Lean Six Sigma for Leaders by Martin Brenig-Jones and Jo Dowdall. 3. Find the electronic journal Business Today. Participants Thirty-four people participated in the trial. However, not all of the participants completed all of the tasks on each library’s website. We discarded the results from participants who did not attempt at least two tasks on each library’s website. Removing those who did not complete at least two tasks on each site left 27 participants (“adjusted” results). Unless otherwise noted, the data discussed below refers to the adjusted results. The trial was open to students, faculty, and staff. Eleven participants took the trial in fall 2018 and 16 participants took it in spring 2019. Most of the participants were undergraduates (21), with some graduate students (6). No doctoral students, faculty, or staff-only participated. (One staff member participated and completed a questionnaire but did not perform enough tasks for their results to be included.) Results We watched the screen recordings to time how long it took students to complete the three tasks on each library’s website. However, if a student flipped between the sites while searching instead of first completing all three searches on one site we did not time the results. We also observed what students did while searching to gain an understanding of how they searched for information. Overall, students spent less time searching on the site using the bento box (Wayne State University Library) than they did on the Franklin University Library site: • Students spent an average of 2 minutes, 35 seconds to complete the tasks on the Wayne State site compared to 3 minutes, 28 seconds on Franklin’s. • Twelve students finished their searches quicker on Wayne State’s site, while six had quicker results from Franklin’s. How Students Searched for Information The screen recordings showed that students looking for information often went to parts of the library websites which did not contain the content they sought. Frequently, they would search whatever part of the site they were on—even if it did not contain the content they needed. On the Wayne State site, when a student used an interior search bar the bento box provided results from a wide range of library resources. A search on the journals page would give results for journals, books, articles, and more—even if the page they were on did not contain the resource they needed to find. 
By contrast, any interior search box found on the Franklin University page would only provide results for whatever portion of the library’s resources that search box accessed: a search on the journals page would only provide journal results, a search on the catalog page would only provide catalog results, etc.

Student action after the initial search demonstrates the need for interior search boxes which can search the library’s entire site. Twelve of 27 students on each site (although not the same 12 students) followed up a search by using a search bar they found on their results page without returning to the homepage to use the main search bar. Students did this even when the page they were on did not relate to the content they were looking for. For library users, the division between the content of the library’s website and the content provided by the library “is not obvious and makes no real sense.”30 The screen recordings of student behavior when searching for content on library websites demonstrated that students also could not distinguish between different areas within the library’s website.

Search for Articles

The first task asked students to find an article on criminal justice. Because it was the first task on the list, most students started with this search. Students had more success finding the article on Franklin’s site than on Wayne State’s site (18 to 14). Although only 14 students were credited with finding the article on Wayne State’s site, several students actually reached the bento-box page which included a category for article results. However, they did not realize that they had found the article and selected an ebook or database instead.

Search for Ebooks

The second task asked students to find the ebook Lean Six Sigma for Leaders by Martin Brenig-Jones and Jo Dowdall. Students had more success finding this book on the bento-box site. Twenty students found the book using the Wayne State Library’s bento-box search, compared to ten who found the book on the Franklin University Library’s site. Many students, in searching for the ebook, typed their search on the results page from the previous search. On the Wayne State University Library’s site, this led to a bento-box results page which included the book. On Franklin University Library’s site, the results were more complex. Between fall 2018 and spring 2019, our EBSCO EDS custom catalog was not renewed, which resulted in the search results for the ebook no longer displaying in EDS. [This non-renewal was not intentional.] As a result, in fall 2018, the ebook that students looked for appeared in EDS search results (both in the Franklin University Library’s main search bar and in interior EBSCO search boxes); however, in spring 2019 it did not. To see what effect this had, we looked at the more limited search results from the fall 2018 trial, when an EDS search on Franklin University Library’s site would successfully find the book. Even then, students had more success finding the book on the bento-box site: 9 students successfully found the book on Wayne State’s site, compared to 7 on Franklin University’s site. Some students on Franklin University Library’s site tried, unsuccessfully, to identify the proper page of the library’s site to find “books.” Instead of the catalog page, they went to a page labeled “textbooks” which helped students locate course reserves.
Because there was no search option for the entire site on that page, those students did not successfully find the ebook.

Search for Journals

The third task asked students to look for the journal Business Today. As with the ebook search, students had more success finding the journal on the bento-box site: 19 students found the journal on Wayne State Library’s site compared to 10 on Franklin University Library’s site. Students searched for the journal in a similar manner to the way they searched for the ebook. Many just put the journal title in the search bar on the page they were on—if that search bar on Franklin University Library’s site provided access to the journal, they would find it with their search. But if they were on a page which did not have a search which included the journal as a result, they would not. The journal search on the Wayne State University Library’s site demonstrated the “below the fold” problem discussed above, because the bento-box result for “journals” was below the fold. As a result, at least one student properly searched for the journal but did not find it, because the result was not visible on the screen and they did not scroll down.

Questionnaires

We asked students a series of questions about their experience searching for information on the two library websites (see Appendix A for the questionnaire and Appendix B for the results). Slightly more students preferred the Wayne State library’s site (14) to Franklin University’s (12). However, five of the users who preferred Franklin University’s site referenced their familiarity with that site. Four of these users specifically referenced their familiarity with the Franklin University Library site in response to a question asking “[w]hy did you prefer the type of search you picked,” while one referenced their familiarity with Franklin University Library’s site in response to a question asking whether there was “anything else you’d like to tell us.” The questionnaire also asked students why they preferred one site to the other and what they liked and disliked about each site.

Preference: Franklin University Library

As mentioned above, many of the comments from those who indicated a preference for the Franklin University Library site indicated the preference was due to familiarity:

• “Might be because I am a bit used to it, I just found it easier to navigate.”
• “Because it’s the one I am familiar with.”
• “Because I'm familiar with it.”
• “I liked both. Both easy to use. Familiar with Franklin's.”

Other comments favored Franklin University Library’s overall website design (as opposed to search):

• “The website is cleaner. Easier to use.”
• “Easier to navigate.”

Some of the comments did indicate a preference for the search technique:

• “One search bar to search all types. Seemed to include more in search results.”
• “It easy to search and straight forward.”
• “Access to research was quicker on the Franklin website. Also, I felt like there was more research material available.”

Preference: Wayne State University Library

Those who preferred the Wayne State search did so more based on the search technique than did those who preferred Franklin:

• “Simple and the search was in one spot.”
• “One search bar that pulled from the [catalog].”
• “Because you can type in exactly what you were looking for and it comes up.”

Others appreciated the way search results were displayed:

• “Their search system organizes the result by type of information, whereas Franklin's website makes you search for the type of material information before displaying the results.”
• “Better layout breaks articles, journals, etc. into separate columns.”
• “Wayne had each section (book, e-journal, article) separately which was easier to find.”
• “The layout.”

Still others just found the Wayne State Library search easier to use:

• “It is more visual and easy to find and easy to use.”
• “It presented the information in an easy way to find.”
• “Easy—all in one.”

What Search Results Do Library Users Want?

We asked participants to rank which results they would like to see displayed when searching on the library’s site. While most applied a numerical ranking, some just circled items. All questionnaire responses were included when compiling these rankings, including rankings from those who did not complete at least two tasks on each website, because user preferences about what search results they want are valid even if they did not perform the required tasks on each library’s website. We converted numerical rankings so that the first choice received six points, the second choice five points, etc. Of the 34 participants who answered this question, 24 provided rankings and ten circled items without indicating how they ranked those items. Where participants circled items, we converted their responses to a numerical equivalent based on how many answers they circled. If they circled only one, it was treated as the first choice and given 6 points. If they circled more than one, we combined the numerical value of the answers and each answer received the average value. (For example, if two answers were circled, they were treated as a first and second choice, and each circled answer was given a score of 5.5.) A short sketch of this conversion follows.
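The conversion lends itself to a few lines of code. The following Python sketch is our illustration of the scoring just described; the function names and category labels are ours and are not part of the study instrument:

```python
# Our sketch of the questionnaire scoring described above (illustrative only).
NUM_OPTIONS = 6  # the six ranked categories on the questionnaire

def score_ranked(ranking):
    """A numerical ranking: first choice earns 6 points, second 5, and so on."""
    return {category: NUM_OPTIONS - i for i, category in enumerate(ranking)}

def score_circled(circled):
    """n circled items are treated as the top n choices, and each receives
    the average of those choices' point values."""
    top_values = [NUM_OPTIONS - i for i in range(len(circled))]
    average = sum(top_values) / len(top_values)
    return {category: average for category in circled}

print(score_ranked(["articles", "journals", "databases"]))
# {'articles': 6, 'journals': 5, 'databases': 4}
print(score_circled({"articles", "books/ebooks"}))
# two circled items are treated as a first and second choice: 5.5 points each
```

Summing these per-response scores across all 34 questionnaires produces the totals reported below.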
The responses indicate students most wanted library search to provide results for articles and journals, followed by databases and ebooks:

• Articles: 144
• Journals: 125
• Databases: 112
• Books/ebooks: 111
• Research guides: 69.5
• Library site: 63.5

Mischo, Norman, and Schlembach reported actual usage of the University of Illinois’ bento-box results page by category between 2015 and 2017. How do the categories our users indicated they would like to see on a bento-box results page compare with the actual use of bento-box results at the University of Illinois? At the University of Illinois, 56.1 percent of click-throughs were for articles (Franklin University students’ first choice), while 33.6 percent were for book and online catalog content (our students’ fourth choice).31 Databases, our students’ third choice, were not a listed category, while journals (Franklin University students’ second choice) were only the fifth-most used resource (and the percentage of click-throughs was low, at only 3.6 percent).32

Limitations

There were a few issues with the study which should be kept in mind when evaluating the results.

Number of Participants

Thirty-four individuals participated in the trial. After removing results from participants who failed to complete a sufficient number of tasks, only 27 participants remained. While this number is small, it does provide information on what students think and, more importantly, how they act when searching for various types of information on the library’s website. Examples of library user experience testing based on similar numbers include:

• The University of Kansas Library conducted “usability testing of our Primo discovery interface on basic library searching scenarios for undergraduate and graduate students” and reported results from 27 users.33
• The University of Southern Mississippi Library conducted usability testing of 24 users (“six participants from each of the following library user groups: undergraduate students, graduate students, faculty, and library employees”) to evaluate and modify their website.34
• Syracuse University conducted usability testing on “ten students . . . and eighteen library staff members.”35

Familiarity with Franklin University Library Website

Student familiarity with the Franklin University Library website affected student opinion. Of 12 students in the adjusted results who preferred Franklin University Library’s site, 5 (41.7 percent) gave an answer indicating that familiarity was a factor in their preference. When considering all the questionnaires (including those from participants who were not included in the adjusted results), 7 out of 17 users (41.2 percent) who preferred Franklin University Library’s site gave an answer referencing familiarity. As a result, opinion may have been skewed in favor of Franklin University, despite Wayne State being slightly favored overall. A good illustration of this problem is the response from a student who, the screen recording showed, did not even attempt any of the tasks on the Franklin University Library website, but who indicated a preference for the Franklin University Library site because it’s “easy to use.”

CONCLUSION

As a result of the user experience study, the Franklin University Library decided that providing bento-box search results would benefit our library’s users. The trial showed that students required less time to conduct their searches using Wayne State’s bento-box search and found more items successfully on Wayne State’s site. The lack of student distinction between different types of library content, along with the likelihood of their entering a search in whatever search box they see on a page, further supports providing bento-box results. Adopting a bento-box results page will permit the library to provide search boxes on interior pages which permit students to search for materials site-wide.
The bento box will let students search for content anywhere on the library’s website without requiring them to first figure out what type of library resource they are looking for and then find the correct section of the library’s website. Additionally, the ebook search issue previously discussed demonstrates the benefits of switching to a bento box. The disappearance of ebook search results from the EDS listing would not have mattered with a bento-box style search, because the bento box would have displayed a box for catalog results. Comments from two of the students who preferred the Wayne State library website demonstrate the benefits of a bento-box format. The bento-box search meant that Wayne State’s site is “simple and the search was in one spot.” It also helps students because “[Wayne State’s] search system organizes the result by type of information, whereas Franklin's website makes you search for the type of material information before displaying the results.”

APPENDIX A: QUESTIONNAIRE

About this Study

The Franklin University library is studying how users search for, and find, information on library websites. The purpose of this study is to ask library users (and potential library users) to search for information on two different library websites and give their opinion on their search experience. You are asked to be a participant as a member of the Franklin University community who is a library user, or a potential library user. We hope to have between 20 and 50 people participate in this study.

If you agree to participate in this study, you will be asked to look for information to find four different resources on two library websites (Franklin University’s website and Wayne State University’s website). You will only be asked to find the information, not to access the information. You will then be asked to fill out a questionnaire providing demographic information and your opinion about library search. As part of this study, your search activity on the websites may be recorded by screen recording software. Your participation in this study is anticipated to take about 15 minutes.

There are no known risks to participation in the study. The benefits of participation include helping the library to better serve its users by identifying how users search for information on the library’s website. In return for your participation in this survey, you will receive a Franklin University coffee mug. This study is conducted anonymously—no personally identifiable information will be collected. Your participation in this survey is voluntary. If you decide to participate, you have the right to refuse to answer any of the questions that make you uncomfortable. You also have the right to withdraw from this study at any time, with no repercussions. This research has been reviewed and approved by the Franklin University Institutional Review Board. For questions regarding participants’ rights, you can contact the Institutional Review Board at irb@franklin.edu.

Please answer the following background questions:

1) What do you do at Franklin University (circle all that apply)?
   a) Non-degree-seeking student
   b) Undergraduate student
   c) Master student
   d) Doctoral student
   e) Staff
   f) Faculty

2) How often do you use the library’s website (circle the best choice)?
   a) Frequently (every week)
   b) Occasionally (every month)
   c) Rarely (less than once a month)
   d) Never

3) How often do you use the search function on the library’s website?
   a) Frequently
   b) Occasionally
   c) Rarely
   d) Never

Please answer the following questions about your experience and preferences when searching for information on library websites:

1) Which library’s search did you prefer:
   a) Franklin University Library
   b) Wayne State University Library

2) Why did you prefer the type of search you picked in the answer to question 1?

3) For Franklin University’s search results:
   a) What did you like?
   b) What didn’t you like?

4) For Wayne State University’s search results:
   a) What did you like?
   b) What didn’t you like?

5) Please rank in order of preference what search results you would want to see displayed when searching on the library’s website:
   a) Articles related to a topic
   b) Books / Ebooks
   c) Databases
   d) Library website
   e) Journals
   f) Research Guides
   g) Other (please list):

6) Is there anything else you’d like to tell us about your experience looking for information on these library websites?

APPENDIX B: ADJUSTED QUESTIONNAIRE RESULTS

Below are the results from participants who completed at least two tasks on each university library’s website. Where the screen recordings indicated that participants did not complete at least two tasks on each of the websites, the questionnaire responses were not recorded.

What do you do at Franklin University (circle all that apply)?
   Undergraduate: 21
   Masters: 6

How often do you use the library’s website (circle the best choice)?
   Occasionally: 9
   Frequently: 12
   Rarely: 6

How often do you use the search function on the library’s website?
   Occasionally: 7
   Frequently: 13
   Rarely: 7

Which library’s search did you prefer:
   Franklin University: 12
   Wayne State University: 14
   N/A: 1

APPENDIX C: SCREEN RECORDING RESULTS

Analysis of screen recordings from participants who completed at least two tasks on each university library’s website. For timed results, we did not include the results of students who flipped between library websites while completing the tasks. (An example of flipping between sites occurred when a student found the article on the Franklin University Library site, then looked for it on the Wayne State University Library site before looking for the ebook on the Franklin University site.)

Time to complete tasks (average):
   Franklin University Library site: 3:28
   Wayne State University Library site: 2:35

Site where student finished search quicker:
   Franklin University Library site: 6
   Wayne State University Library site: 12

Total items found:
   Franklin University Library site: 38
   Wayne State University Library site: 53

Articles found:
   Franklin University Library site: 18
   Wayne State University Library site: 14

Books found:
   Franklin University Library site: 10
   Wayne State University Library site: 20

Journals found:
   Franklin University Library site: 10
   Wayne State University Library site: 19

ENDNOTES
1 Cole Hudson and Graham Hukill, “One-to-Many: Building a Single-Search Interface for Disparate Resources,” in K. Varnum (ed.), Exploring Discovery: The Front Door to Your Library’s Licensed and Digitized Content (Chicago: ALA Editions, 2016): 146.

2 Suzanna Conrad and Nathasha Alvarez, “Conversations with Web Site Users: Using Focus Groups to Open Discussion and Improve User Experience,” Journal of Web Librarianship 10, no. 2 (April 2016): 71, https://doi.org/10.1080/19322909.2016.1161572.

3 Scott Hanrath and Miloche Kottman, “Use and Usability of a Discovery Tool in an Academic Library,” Journal of Web Librarianship 9, no. 1 (January 2015): 4, https://doi.org/10.1080/19322909.2014.983259.

4 Irina Trapido, “Library Discovery Products: Discovering User Expectations through Failure Analysis,” Information Technology and Libraries 35, no. 3 (2016): 22, https://doi.org/10.6017/ital.v35i3.9190.

5 William Mischo, Michael Norman, and Mary Schlembach, “Innovations in Discovery Systems: User Studies and the Bento Approach,” Proceedings of the Charleston Library Conference (2017): 299, https://docs.lib.purdue.edu/cgi/viewcontent.cgi?article=1991&context=charleston.

6 Hudson and Hukill, “One-to-Many,” 142.

7 Emily Singley, “To Bento Or Not to Bento—Displaying Search Results,” http://emilysingley.net/usablelibraries/to-bento-or-not-to-bento-displaying-search-results/.

8 Jonathan Rochkind, “Article Search Improvement Strategy,” https://bibwild.wordpress.com/2012/10/02/article-search-improvement-strategy/.

9 Hudson and Hukill, “One-to-Many,” 145.

10 Rochkind, “Article Search Improvement Strategy.”

11 Hudson and Hukill, “One-to-Many,” 142.

12 Nancy Turner, “Librarians Do It Differently: Comparative Usability Testing with Students and Library Staff,” Journal of Web Librarianship 5, no. 4 (October 2011), https://doi.org/10.1080/19322909.2011.624428; Elena Azadbakht, John Blair, and Lisa Jones, “Everyone's Invited: A Website Usability Study Involving Multiple Library Stakeholders,” Information Technology & Libraries 36, no. 4 (2017): 43, https://doi.org/10.6017/ital.v36i4.9959.

13 Colleen Kenefick and Jennifer A. DeVito, “Google Expectations and Interlibrary Loan: Can We Ever Be Fast Enough?” Journal of Interlibrary Loan, Document Delivery & Electronic Reserves 23, no. 3 (July 2013): 158, https://doi.org/10.1080/1072303X.2013.856365.

14 Kenefick and DeVito, “Google Expectations and Interlibrary Loan,” 157; Carol Diedrichs, “Discovery and Delivery: Making it Work for Users,” Serials Librarian 56, no. 1-4 (January 2009): 81, https://doi.org/10.1080/03615260802679127.

15 Singley, “To Bento or Not to Bento.”

16 Singley.

17 Singley.

18 Hudson and Hukill, “One-to-Many,” 146.

19 Singley, “To Bento or Not to Bento.”

20 Eric Phetteplace and Jeremy Darrington, “A Hybrid Approach to Discovery Services,” Reference & User Services Quarterly 53, no. 4 (2014): 293.

21 Cory Lown, Tito Sierra, and Josh Boyer, “How Users Search the Library from a Single Search Box,” College & Research Libraries 74, no. 3 (2013): 229.
22 Singley, “To Bento or Not to Bento.”

23 Lown, Sierra, and Boyer, “How Users Search the Library from a Single Search Box.”

24 Hudson and Hukill, “One-to-Many,” 150.

25 Hudson and Hukill, 150.

26 Mischo, Norman, and Schlembach, “Innovations in Discovery Systems,” 304.

27 Suzanne Chapman et al., “Manually Classifying User Search Queries on an Academic Library Web Site,” Journal of Web Librarianship 7, no. 4 (October 2013): 419, https://doi.org/10.1080/19322909.2013.842096.

28 Aaron Tay and Feng Yikang, “Implementing a Bento-Style Search in LibGuides V2,” Code4lib Journal no. 29 (July 2015), https://journal.code4lib.org/articles/10709.

29 Singley, “To Bento or Not to Bento.”

30 Chapman et al., “Manually Classifying User Search Queries,” 406.

31 Mischo, Norman, and Schlembach, “Innovations in Discovery Systems.”

32 Mischo, Norman, and Schlembach.

33 Hanrath and Kottman, “Use and Usability of a Discovery Tool,” 5.

34 Azadbakht, Blair, and Jones, “Everyone's Invited,” 34.

35 Turner, “Librarians Do It Differently,” 290.

A Comprehensive Approach to Algorithmic Machine Sorting of Library of Congress Call Numbers

Scott Wagner and Corey Wetherington

Scott Wagner (smw284@psu.edu) is Information Resources and Services Support Specialist, Penn State University Libraries. Corey Wetherington (cjw36@psu.edu) is Open and Affordable Course Content Coordinator, Penn State University Libraries.

ABSTRACT

This paper details an approach for accurately machine sorting Library of Congress (LC) call numbers which improves considerably upon other methods reviewed. The authors have employed this sorting method in creating an open-source software tool for library stacks maintenance, possibly the first such application capable of sorting the full range of LC call numbers. The method has potential application to any software environment that stores and retrieves LC call number information.

BACKGROUND

The Library of Congress Classification (LCC) system was devised around the turn of the twentieth century, well before the advent of digital computing.1
Consequently, neither it nor the system of Library of Congress (LC) call numbers which extend it were designed with any consideration to machine readability or automated sorting.2 Rather, the classification was formulated for the arrangement of a large quantity of library materials on the basis of content, gathering like items together to allow for browsing within specific topics, and in such a way that a new item may always be inserted (interfiled) between two previously catalogued items without disruption to the overall scheme. Unlike, for instance, modern telephone numbers, ISBNs, or UPCs—identifiers which pair an item with a unique string of digits having a fixed and regular format, largely irrespective of any particular characteristics of the item itself—LC call numbers specify the locations of items relative to others and convey certain encoded information about the content of those items. The Library of Congress summarizes the essence of the LCC in this way:

The system divides all knowledge into twenty-one basic classes, each identified by a single letter of the alphabet. Most of these alphabetical classes are further divided into more specific subclasses, identified by two-letter, or occasionally three-letter, combinations. For example, class N, Art, has subclasses NA, Architecture; NB, Sculpture, ND, Painting; as well as several other subclasses. Each subclass includes a loosely hierarchical arrangement of the topics pertinent to the subclass, going from the general to the more specific. Individual topics are often broken down by specific places, time periods, or bibliographic forms (such as periodicals, biographies, etc.). Each topic (often referred to as a caption) is assigned a single number or a span of numbers. Whole numbers used in LCC may range from one to four digits in length, and may be further extended by the use of decimal numbers. Some subtopics appear in alphabetical, rather than hierarchical, lists and are represented by decimal numbers that combine a letter of the alphabet with a numeral, e.g., .B72 or .K535. Relationships among topics in LCC are shown not by the numbers that are assigned to them, but by indenting subtopics under the larger topics that they are a part of, much like an outline. In this respect, it is different from more strictly hierarchical classification systems, such as the Dewey Decimal Classification, where hierarchical relationships among topics are shown by numbers that can be continuously subdivided.3

As this description suggests, LCC cataloging practices can be quite idiosyncratic and inconsistent across different topics and subtopics, and sorting rules for properly shelf-ordering LC call numbers can be correspondingly complex, as we will see below.4 For the purposes of discussion in what follows, we divide LC call number strings into three principal substrings: the classification, the Cutter, and what we will term the specification:
HC125    .G25313    1997
 (a)        (b)      (c)

In the above example, the classification string (a) denotes the subject matter (in this case, General Economic History and Conditions of Latin America), the Cutter string (b) locates the book within this topic on the basis of author and/or title (following a specific encoding process), and the specification string (c) denotes the particular edition of the text (in this case, by year). The classification categorizes the item on the basis of its subject matter, following detailed schedules of the LCC system published by the Library of Congress; the Cutter situates the item alongside others within its classification (often on the basis of its title and/or author5); and the specification distinguishes a specific edition, volume, format, or other characteristic of the item from others having the same author and title. Each of these general substrings may contain further substrings having specific cataloging functions, and though each is constructed following certain rigid syntactical rules, a great deal of variation in format may be observed within the basic framework. The following is an inexhaustive summary of the basic syntax of each of the three call number components:

• The classification string always begins with one to three letters (the class/subclass), almost always followed by one to four digits (the caption number), possibly including an additional decimal. The classification may also contain a date or ordinal number following the caption number.

• The beginning of the Cutter string is always indicated by a decimal point followed by a letter and at least one digit. While the majority of call numbers contain a Cutter, it is not present in all cases. Among the sorting challenges posed by LC call numbers, we note in particular the “double Cutter”—a common occurrence in certain subclasses—wherein the Cutter string changes from alphabetic to numeric, then back to alphabetic, and finally again to numeric. Triple Cutters are also possible, as are dates intervening between Cutters. Certain Cutter strings (e.g., in juvenile fiction) end with an alphabetic “work mark” composed of two or more letters.

• The specification string (which may be absent on older materials) is always last, and usually contains the date of the edition, but may also include volume or other numbering, ordinal numbers, format/part descriptions (e.g., “DVD,” “manual,” “notes”), or other distinguishing information.

Figure 1 shows example call numbers, all found within the catalog of Penn State University Libraries, suggesting the wide variety of possibilities:

B1190 1951                no Cutter string
DT423.E26 9th.ed. 2012    compound specification
E505.5 102nd.F57 1999     ordinal number in classification
HB3717 1929.E37 2015      date in classification
KBD.G189s                 no caption number, no specification
N8354.B67 2000x           date with suffix
PS634.B4 1958-63          hyphenated range of dates
PS3557.A28R4 1955         “double Cutter”
PZ8.3.G276Lo 1971         Cutter with “work mark”
PZ73.S758345255 2011      lengthy Cutter decimal

Figure 1. Example call numbers.
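To make the three-part division concrete, the following Python sketch (our illustration, not the parsing algorithm presented later in this paper) splits a few simple call numbers into classification, Cutter, and specification. It deliberately handles only the most common patterns; as Figure 1 suggests, cases such as dates or ordinals within the classification, or call numbers lacking caption numbers, require the additional machinery discussed below.

```python
import re

# Simplified splitter for the three principal substrings (illustrative only).
# Figure 1's harder cases are deliberately out of scope here.
PATTERN = re.compile(
    r"^(?P<classification>[A-Z]{1,3}\d{1,4}(?:\.\d+)?)"  # class letters + caption
    r"(?P<cutter>\.[A-Z]\d+(?:[A-Z]\d+)*[A-Za-z]*)?"     # one or more Cutter segments
    r"\s*(?P<specification>.*)$"                          # whatever remains
)

for cn in ["HC125.G25313 1997", "PS3557.A28R4 1955", "B1205 1958"]:
    # Absent groups (e.g., the Cutter of B1205 1958) print as None.
    print(cn, "->", PATTERN.match(cn).groupdict())
# HC125.G25313 1997 -> classification 'HC125', Cutter '.G25313', specification '1997'
```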
As one might expect given this irregularity in syntax, systematic machine-sorting of LC call numbers is by no means trivial. To begin with, sorting procedures within the LCC system are to a certain degree contextual—that is, the sorter must understand how a given component of a call number operates within the context of the entire string in order to determine how it should sort. Both integer and decimal substrings appear in LC call numbers, so that a numeral may properly precede a letter in one part of a call number (a ‘1’ sorts before an ‘A’ in the classification portion, for example: H1 precedes HA1), while the contrary may occur in another part (within the Cutter, in particular, an ‘A’ may well precede a ‘1’: HB74.P65A2 precedes HB74.P6512). Similarly, letters may have different sorting implications depending on where and how they appear. Compare, for instance, the call numbers V23.K4 1961 and U1.P32 v.23 1993/94. The V in the former denotes the subclass of general nautical reference works and simply sorts alphabetically, whereas the v in the latter call number functions in part as an indicator that the numeral 23 refers to a specific volume number and is to be sorted as an integer rather than a decimal. Such contextual cues are often tacitly understood by a human sorter, but can present considerable challenges when implementing machine sorting procedures. Furthermore, the lack of uniformity or regularity in the format of call number strings poses various practical obstacles for machine sorting. Taken together, these assorted complexities suggest the insufficiency of a single alphanumeric sorting procedure to adequately handle LC call numbers as unprocessed, plain text strings.

LITERATURE REVIEW

A thorough review of information science literature revealed little formal discussion of the algorithmic sorting of LC call numbers. If the topic has been more widely addressed in the scholarly or technical literature, we were unable to discover it. Nevertheless, the general problem appears to be fairly well known. This is evident both from informal online discussions of the topic (e.g., in blog posts, message board threads, and coding forums) and from the existence of certain features of library management system (LMS) and integrated library system (ILS) software designed to address the issue. In this section we examine methods proffered by some of these sources, and detail how each fails to fully account for all aspects of LC call number sorting.

In a brief article archived online, Conley and Nolan outline a method for sorting LC call numbers through the use of function programming in Microsoft Excel.6 Given a column of plain-text LC call numbers, their approach entails successive processing of the call numbers across several spreadsheet columns with the aim of properly accounting for the sorting of integers. The fully-processed strings are then ultimately ready for sorting in the rightmost column using Excel’s built-in sorting functionality. We note that Conley and Nolan’s method (hereafter “CNM”) only attempts to sort what the authors refer to as the “base call number” (i.e., the classification and Cutter portions), and does not address the sorting of “volume numbers, issue numbers, or sheet numbers” (what we refer to here as the “specification”).7

CNM stems from the tacit observation that ordinary, single-column sorting of LC call numbers is clearly inadequate in an environment like Excel’s. For instance, in the following example, standard character-by-character sorting fails at the third character position, since PZ30.A1 erroneously sorts before PZ7.A1 (as 3 is compared to 7 in the third character position), contrary to the correct order (7 before 30).
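The same failure is easy to reproduce in any environment that compares plain strings; for instance, in Python:

```python
# Plain lexicographic sorting misfiles these two call numbers: at the third
# character, '3' < '7', so PZ30 is placed before PZ7.
print(sorted(["PZ30.A1", "PZ7.A1"]))
# ['PZ30.A1', 'PZ7.A1'] -- but correct shelf order is PZ7.A1, then PZ30.A1
```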
To address this, CNM normalizes the numeric portion of the class number with leading zeros so that each numeric string is of equal length, ensuring that the proper digits are compared during sorting. This entails a transformation,

PZ30.A1 → PZ0030.A1
PZ7.A1 → PZ0007.A1

following which the strings will in fact sort correctly in an Excel column. This technique appears adequate until we compare call numbers having subclasses of different length:

P180.A1 → P0180.A1
PZ30.A1 → PZ0030.A1

Here, while standard Excel sorting will in fact properly order the resulting strings, in other applications, depending on the sorting hierarchy employed, sorting may fail in the second position if letters are sorted before numbers. Hierarchy aside, it is not difficult to see the potential issues that may arise from sorting unlike portions of the call number string against one another in this way, particularly when comparing characters within the Cutter string or in situations involving a “double Cutter.” For instance, the call numbers B945.D4B65 1998 and B945.D41 1981b are listed here in their proper sorting order, but are in fact sorted in reverse by CNM when, in the eighth character position, 1 is sorted before B in accordance with Excel’s default sorting priority. This again illustrates an essential problem of character-by-character sorting: in certain substrings we require a letters-before-numbers sorting priority, while in others a numbers-before-letters order is needed. This impasse makes clear that no single-column sorting methodology can succeed for all types of LC call numbers without significant modification to the call number string.
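A sketch of the leading-zero normalization, and of the double-Cutter failure just described, follows. This is our Python reconstruction of the idea behind CNM, not Conley and Nolan's actual spreadsheet formulas:

```python
import re

def cnm_pad(call_number, width=4):
    """Pad the caption integer to a fixed width with leading zeros,
    in the spirit of CNM (our reconstruction)."""
    return re.sub(r"^([A-Z]{1,3})(\d{1,4})",
                  lambda m: m.group(1) + m.group(2).zfill(width),
                  call_number)

print(sorted([cnm_pad("PZ30.A1"), cnm_pad("PZ7.A1")]))
# ['PZ0007.A1', 'PZ0030.A1'] -- padding fixes the caption-integer problem...

print(sorted([cnm_pad("B945.D4B65 1998"), cnm_pad("B945.D41 1981b")]))
# ['B0945.D41 1981b', 'B0945.D4B65 1998'] -- ...but the double Cutter still
# sorts in reverse: where the strings diverge, '1' compares before 'B'.
```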
Hence neither method is able to properly order the quite ordinary pair of call numbers AC1.G7 v.19 and AC1.G7 v.2, since the first digit of each’s volume number is compared and ordered numerically (i.e., character-by-character), resulting in a mis-sort. Though neither DN nor CNM is ultimately comprehensive (nor designed to be), both methods contain valuable insights and strategies that inform our own approach to the problem. SOFTWARE REVIEW Available software solutions for sorting LC call numbers appear to be nearly as scant as literature on the subject. While GitHub contains a handful of programs that attempt to address the problem, we found none which could be considered comprehensive. Table 1 is a summary of those programs we discovered and were able to examine. The “sqlite3-lccn-extension” program is an extension for SQLite 3 which provides a collation for normalizing LC call numbers, executing from a SQLite client shell. We discovered several limitations in its ability to sort certain call number formats similar to those discussed above in the literature review. For instance, the program cannot correctly sort specification integers (e.g., it sorts v.13 before v.3) or call numbers lacking Cutter strings (e.g., it sorts B 1190.A1 1951 before B 1190 1951). We found similar issues with “js-loc-callnumbers,” a JavaScript program with a web interface into which a list of call numbers can be pasted. The program transforms the call numbers into normalized strings, which are then sorted and displayed to the user. However, we observed that it does not account for dates or ordinal numbers in the classification string, nor can it correctly sort call numbers lacking caption numbers.9 ALGORITHMIC MACHINE SORTING OF LC CALL NUMBERS | WAGNER AND WETHERINGTON 67 https://doi.org/10.6017/ital.v38i4.11585 Program and Author App-Type, Interface Repository URL Last Update “sqlite3-lccn-extension” by Brad Dewar database extension, shell https://github.com/macdewar/sqlite3- lccn-extension Dec. 2013 “js-loc-callnumbers” by Ray Voelker JavaScript, web https://github.com/rayvoelker/js-loc- callnumbers Feb. 2017 “Library-of-Congress- System” by Luis Ulloa Python tutorial, command line https://github.com/ulloaluis/Library-of- Congress-System Sep. 2018 “lcsortable” by mbelvadi2 Google Sheets script https://github.com/mbelvadi2/lcsortabl e May 2017 “library-callnumber-lc” by Library Hackers Perl, Python https://github.com/libraryhackers/libra ry-callnumber- lc/tree/master/perl/Library- CallNumber-LC Dec. 2014 “lc_call_number_compare” by SMU Libraries JavaScript, command line https://github.com/smu- libraries/lc_call_number_compare Dec. 2016 “lc_callnumber” by Bill Dueber Ruby https://github.com/billdueber/lc_callnu mber Feb. 2015 Table 1. List of GitHub software involving LC call number sorting. Several of the programs are rather narrow in scope. The “lcsortable” script is a Google Sheets scheme for normalizing LC call numbers into a separate column for sorting, very much like CNM and DM. Its normalization routine appears to conflate decimals and integers, though, leading to transformations such as HF5438.5.P475 2001  HF5438.0005.P04752001 which would clearly result in a great deal of incorrect sorting across a wide array of LC call number formats. The command-line-based Python program “library-callnumber-lc” processes a call number and returns a normalized sort key, but is not intended to store or sort groups of call numbers. 
DM thus successfully avoids the issue of comparing classification strings of unequal length or format. Nevertheless, despite the improvements of DM over CNM, both approaches are ultimately unable to properly process certain types of common LC call numbers. For example, call numbers with dates preceding the Cutter (e.g., GV722 1936.H55 2006) and call numbers without Cutters (e.g., B1205 1958) both result in errors, as do those containing the aforementioned “double Cutters.” Furthermore, as we previously noted, neither DM nor CNM was designed to handle any portion of the specification string following the Cutter, where the presence of ordinal and volume-type numbering is commonplace. Hence neither method is able to properly order the quite ordinary pair of call numbers AC1.G7 v.19 and AC1.G7 v.2, since the first digit of each volume number is compared and ordered numerically (i.e., character-by-character), resulting in a mis-sort. Though neither DM nor CNM is ultimately comprehensive (nor designed to be), both methods contain valuable insights and strategies that inform our own approach to the problem.

SOFTWARE REVIEW

Available software solutions for sorting LC call numbers appear to be nearly as scant as literature on the subject. While GitHub contains a handful of programs that attempt to address the problem, we found none which could be considered comprehensive. Table 1 is a summary of those programs we discovered and were able to examine.

Program and Author | App-Type, Interface | Repository URL | Last Update
“sqlite3-lccn-extension” by Brad Dewar | database extension, shell | https://github.com/macdewar/sqlite3-lccn-extension | Dec. 2013
“js-loc-callnumbers” by Ray Voelker | JavaScript, web | https://github.com/rayvoelker/js-loc-callnumbers | Feb. 2017
“Library-of-Congress-System” by Luis Ulloa | Python tutorial, command line | https://github.com/ulloaluis/Library-of-Congress-System | Sep. 2018
“lcsortable” by mbelvadi2 | Google Sheets script | https://github.com/mbelvadi2/lcsortable | May 2017
“library-callnumber-lc” by Library Hackers | Perl, Python | https://github.com/libraryhackers/library-callnumber-lc/tree/master/perl/Library-CallNumber-LC | Dec. 2014
“lc_call_number_compare” by SMU Libraries | JavaScript, command line | https://github.com/smu-libraries/lc_call_number_compare | Dec. 2016
“lc_callnumber” by Bill Dueber | Ruby | https://github.com/billdueber/lc_callnumber | Feb. 2015

Table 1. List of GitHub software involving LC call number sorting.

The “sqlite3-lccn-extension” program is an extension for SQLite 3 which provides a collation for normalizing LC call numbers, executing from a SQLite client shell. We discovered several limitations in its ability to sort certain call number formats similar to those discussed above in the literature review. For instance, the program cannot correctly sort specification integers (e.g., it sorts v.13 before v.3) or call numbers lacking Cutter strings (e.g., it sorts B 1190.A1 1951 before B 1190 1951). We found similar issues with “js-loc-callnumbers,” a JavaScript program with a web interface into which a list of call numbers can be pasted. The program transforms the call numbers into normalized strings, which are then sorted and displayed to the user. However, we observed that it does not account for dates or ordinal numbers in the classification string, nor can it correctly sort call numbers lacking caption numbers.9

Several of the programs are rather narrow in scope. The “lcsortable” script is a Google Sheets scheme for normalizing LC call numbers into a separate column for sorting, very much like CNM and DM. Its normalization routine appears to conflate decimals and integers, though, leading to transformations such as

HF5438.5.P475 2001 → HF5438.0005.P04752001

which would clearly result in a great deal of incorrect sorting across a wide array of LC call number formats. The command-line-based Python program “library-callnumber-lc” processes a call number and returns a normalized sort key, but is not intended to store or sort groups of call numbers. It cannot adequately handle compound specifications or Cutters containing consecutive letters (e.g., S100.BC123 1985), and does not appear to preserve the demarcation between a caption integer and caption decimal (i.e., the decimal point), thereby commingling integer and decimal sorting logic. Lastly, “Library-of-Congress-System” is a tutorial/training program written in Python that runs from the command line and supplies a list of call numbers for the user to sort. It does not draw call numbers from a static collection nor allow call numbers to be input by the user; rather, it randomly generates call numbers within certain parameters and following a prescribed pattern. As such, we were not able to satisfactorily test its sorting capabilities for the kind of use-case scenario under discussion. We did not evaluate the remaining two GitHub programs, “lc_call_number_compare” and “lc_callnumber,” as we could not get the former, a JavaScript ES6 module, to execute, and as the latter, a Ruby application which we did not attempt to install, evidently remains unfinished: its GitHub documentation lists “Normalization: create a string that can be compared with other normalized strings to correctly order the call numbers” as among the tasks yet to be completed.

In addition to these open resources, we examined LC sorting capability within the commercial LMS/ILS software we had at hand. The MARC (Machine-Readable Cataloging) 21 protocol, a widely used international standard for formatting bibliographic data, provides a specific syntax for cataloging LC call numbers for the purposes of machine parsing.10 Symphony WorkFlows, the LMS licensed by Penn State University Libraries from SirsiDynix (and thus the only one available for our direct examination), contains within its search module a call number browsing feature which attempts to sort call numbers in shelving order via “Shelving IDs,” call number strings rendered from each item’s MARC 21 “050” data field for sorting purposes. While these Shelving IDs are not visible within WorkFlows (that is, they operate in the background), they can be accessed as plain text strings via BLUEcloud Analytics, a separate, SirsiDynix-branded data assessment and reporting tool peripheral to the LMS. Examination of these sort keys revealed integer normalization strategies similar to those of DM and CNM, with additional processing of volume-type numbering within the specification string.
However, these Shelving IDs are similarly unable to correctly sort “double Cutter” substrings and other syntactic complexities, such as ordinal numbers appearing in the classification. The following Shelving ID transformations of two call numbers in the Penn State University Libraries catalog, for instance, fail to properly account for the ordinal numbers which appear within the classification:

E507.5 36th.V47 2003 → E 000507.5 36TH.V47 2003
E507.5 5th.C36 2000 → E 000507.5 5TH.C36 2000

Consequently, and as expected, these two call numbers sort incorrectly within WorkFlows’ call number browsing panes.11

PROPOSED PARSING AND SORTING METHODOLOGY

Given the sorting difficulties inherent in the single-column approaches outlined above, we suggest a multi-column, tiered sorting procedure in which only like portions of the call number are compared to one another. This requires the call number to be processed, its various components identified, and each component appropriately sorted according to its specific type. This, in turn, requires a sorting algorithm which can identify like substrings by scanning for specific patterns and cues.

“Shelf reading” is a term for the common practice of verifying the correct ordering of items filed on a library shelf, typically unaided by technology, and our approach is primarily informed by the kind of mental procedures one undertakes when performing such sorting “in one’s head.”12 Perhaps the most significant component of this process involves recognizing and interpreting the role and logic of specific types of substrings and identifying their positions within the sorting hierarchy. The overall design of the LC classification, from class to subclass to caption, constitutes a left-to-right progression from general to specific, and the classification portion of a call number can be interpreted as a series of containers holding items of increasingly narrow scope, some of which may be empty (that is, absent). This creates a structure that has a linear, hierarchical aspect, but also contains within it subcategories that share a common position within the structure. The priority that a subcategory (or container) is afforded in the sorting process depends first upon its position in the linear hierarchy, and subsequently on the depth ascribed to it relative to other subcategories that share the same position. Call numbers indicate a subcategory’s position in the linear dimension by including or expanding sections; its depth within a given position is encoded in the character or series of characters chosen to represent it. Thus, the sorting process may be regarded as a comparison of the paths that two call numbers denote through this structure, and the point at which the paths diverge is then the decisive point in determining an item’s position relative to others. This inflection point may occur at any juncture of the comparison, from the first character to the last.

Given these observations, a comprehensive machine-sorting strategy must observe the following provisions (a brief sketch illustrating them follows the list):

1. Characters in call numbers should only be compared to characters that occupy an equivalent section of another call number. (“Like compared to like.”)

2. Within these designated sections, characters should only be compared to characters that occupy a corresponding position (place value) within that section.
3. If call numbers are identical up to the point that one of them lacks a section that the other call number possesses, the one with the “missing” section is ordered first. This is in keeping with the convention that items occupying a more general level in the hierarchy are ordered before those occupying a more specific one. (This principle is often summarized in shelf-reading tutorials as “nothing before something.”)

4. If call numbers are identical up to the point that one of them lacks a character in a given position within a particular section that the other call number possesses, the one missing the character is ordered first. Again, this preserves the general-to-specific scheme of LCC sorting. (Another instance of “nothing before something.”)

5. Whole numbers (e.g., caption integers, volume numbers) must be distinguished from decimals. For character-by-character sorting to work in sections of the call number containing integers, the length of whole numbers must be normalized to assure each digit is compared to another of equal place value.
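As a sketch of how these provisions might be realized (ours, and much simpler than the full implementation described next), consider representing each call number as a fixed-length tuple of sections, with missing sections held as empty strings and whole numbers zero-padded. Tuple comparison then proceeds section by section, and an empty value sorts before any populated one, which is precisely "nothing before something":

```python
def key(subclass, caption="", cutter="", spec=""):
    """Illustrative sort key: one slot per section, '' for absent sections,
    caption integers zero-padded per provision 5."""
    return (subclass, caption.zfill(4) if caption else "", cutter, spec)

shelf = [
    key("B", "1190", ".A1", "1951"),   # B1190.A1 1951
    key("B", "1190", "", "1951"),      # B1190 1951 -- no Cutter
    key("B", "945", ".D4"),            # B945.D4
]
for k in sorted(shelf):
    print(k)
# B945.D4 files first (0945 < 1190), and B1190 1951 precedes B1190.A1 1951:
# its empty Cutter slot sorts before '.A1' -- "nothing before something."
```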
APPLICATION OF METHODOLOGY

ShelfReader is a software application designed by the authors to improve the speed and accuracy of the shelf-reading process in collections filed using the Library of Congress system—and, to our knowledge, is the first such application to do so. It was coded by Scott Wagner in PHP and JavaScript, uses MySQL for data storage and sorting, and is deployed as a web application. ShelfReader allows the user to scan library items in the order they are shelved and receive feedback regarding any mis-shelved items. The program receives an item’s unique barcode identification via a barcode scanner, assembles a REST request incorporating the barcode, and sends it to an API connected to the LMS. The application then processes the response, retrieving the title and call number of the item, along with information about the item’s status (for example, if it has been marked as lost or missing). The call number is passed off to the sorting algorithm, which processes it and assigns it a position among the set of call numbers recorded during that session. A user interface then presents a “virtual shelf” to the user displaying a graphical representation of the items in the order they were scanned. When items are out of place on the shelf, the program calculates the fewest number of moves needed to correct the shelf and presents the necessary corrections for the user to perform until the shelf is properly ordered. A screenshot depicting the ShelfReader GUI during a typical shelf-reading session is presented in figure 2.

Figure 2. A screenshot of the ShelfReader GUI, showing an incorrectly filed item (highlighted in blue text) and its proper filing position (represented by the green band).

ShelfReader’s sorting strategy consists of breaking call numbers into elemental substrings and arranging those parts in a database table so that any two call numbers may be compared exclusively on the basis of their corresponding parts. To this end, a base set of call number components was established. These are shown in table 2, along with their abbreviations (for ease in reference), maximum length, and corresponding MySQL data types. The specific MySQL data type determines the kind of sorting employed in each column:

• varchar: Accepts alphanumeric string data. Sorting is character by character, numbers before letters.

• integer: Accepts numerical data; numbers are evaluated as whole numbers.

• decimal: Accepts decimal values. Specifying the overall length of the column and the number of characters to the right of the decimal point has the effect of adding zeros as placeholders in any empty spaces to the right of the last digit. The values are then compared digit by digit.

• timestamp: A date/time value that defaults to the date and time the entry is made. This orders call numbers that are identical (i.e., multiple copies of the same item) in the order they are scanned.

Section, Component | Abbreviation | Max. Length | MySQL Data Type

Classification
class/subclass | sbc | 3 | varchar
caption number, integer part | ci | 4 | integer
caption number, decimal part | cdl | 16 | decimal
caption date | cdt | 4 | varchar
caption ordinal | co | 16 | integer
caption ordinal indicator | coi | 2 | varchar

Cutter
first Cutter, alphabetical part | c1a | 3 | varchar
first Cutter, numerical part | c1n | 16 | decimal
first Cutter date | cd | 4 | integer
second Cutter, alphabetical part | c2a | 3 | varchar
second Cutter, numerical part | c2n | 16 | decimal
second Cutter date | cd2 | 4 | integer
third Cutter, alphabetical part | c3a | 3 | varchar
third Cutter, numerical part | c3n | 16 | decimal

Specification
specification | sp | 256 | varchar
timestamp | — | — | MySQL timestamp

Table 2. ShelfReader call number components and data types.

When parsing a call number, it must be assumed that each call number may contain all of the components identified above. The following is a general outline of the parsing algorithm which processes the call number (a condensed sketch of steps 1–3 follows the list):

1. An array is created from the call number. Each character, including spaces, is an element of the array.

2. A second array is then created to serve as a template for each call number, replacing the actual characters with ones indicating data type. For example, all integers are replaced with ‘I’s. This makes pattern matching and data-type testing simpler.

3. Pattern matching is used to identify the presence or absence of landmarks such as Cutters, spaces, volume-type numbering, etc.

4. When landmarks are identified, their beginning and ending positions in the call number string are noted.

5. Component strings are created by looping through the appropriate section of the call number template, constructing a string in which the template characters are replaced by the actual characters in the call number string and continuing until a space, the end of the string, or an incompatible character is encountered.

6. Where needed, whole number strings are normalized to uniform length.
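The template device in steps 1–3 can be illustrated briefly. The following Python sketch is our condensation of the idea (ShelfReader's own implementation is in PHP): each character of the call number is mapped to a type code, and landmarks are then detected in the template string rather than in the raw call number.

```python
def make_template(call_number):
    """Map each character to a type code: 'I' for digits, 'A' for letters;
    decimal points, spaces, and other characters are kept as-is
    (our condensed illustration of the template described above)."""
    out = []
    for ch in call_number:
        if ch.isdigit():
            out.append("I")
        elif ch.isalpha():
            out.append("A")
        else:
            out.append(ch)  # '.', ' ', hyphens, slashes, etc.
    return "".join(out)

template = make_template("PS3557.A28R4 1955")
print(template)             # AAIIII.AIIAI IIII
# Landmarks are now simple pattern tests: a Cutter begins at '.A' followed
# by 'I', and a trailing run of 'I's after the space is the specification.
print(template.find(".AI")) # 6 -- index where the Cutter string begins
```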
Some additional points of clarification regarding the algorithm's multi-column approach to sorting are worth mentioning:

1. Any lowercase alphabetic characters are converted to uppercase before processing in order to ensure that letter case does not affect sorting.
2. Components are arranged in the database table from left to right in the order they occur in the call number.
3. If a call number does not contain a given component, the column is left empty (in the case of a varchar column) or is assigned a zero value (in the case of numeric columns).
4. Empty columns and zero columns sort before columns containing data.
5. In columns designated as varchar columns, numbers are compared as whole numbers. This means that, in order to sort correctly, the length of any number stored must be normalized to a uniform length (six places) by adding leading zeros. For example, 17 must be normalized to "000017."
6. Sorting proceeds column by column as long as the call numbers remain identical. When the first difference is encountered, sorting is complete.

Table 3 shows two randomly selected call numbers of rather common configuration, along with the corresponding sort keys created by ShelfReader (empty cells represent unused components):

E169.1.B634 2002
E169.1.B653 1987

sbc | ci   | cdl     | cdt | co | coi | c1a | c1n          | cd | c2a | c2n | cd2 | c3a | c3n | sp
E   | 0169 | 0.10000 |     | 0  |     | B   | 0.6340000000 |    |     |     |     |     |     | 0002002
E   | 0169 | 0.10000 |     | 0  |     | B   | 0.6530000000 |    |     |     |     |     |     | 0001987

Table 3. Example ShelfReader sort-key processing of two similar call numbers.

In this first example, sorting is complete when 3 is compared to 5 in the first numerical Cutter (c1n) column. (Note that we have here truncated the length of certain strings for space and readability.) To illustrate how the application handles call numbers having heterogeneous formats, table 4 shows the sort keys created from two call numbers in an example mentioned above, one with a "double Cutter" and one without:

B945.D4B65 1998
B945.D41 1981b

sbc | ci   | cdl | cdt | co | coi | c1a | c1n      | cd | c2a | c2n      | cd2 | c3a | c3n | sp
B   | 0945 | 0.0 |     | 0  |     | D   | 0.400000 |    | B   | 0.650000 |     |     |     | 0001998
B   | 0945 | 0.0 |     | 0  |     | D   | 0.410000 |    |     | 0.000000 |     |     |     | 0001981B

Table 4. ShelfReader sort-key processing of a "double Cutter" call number and a nearby, single-Cutter call number.

By pushing the second Cutter (B65) in the first call number into the c2a and c2n columns, the issue of comparing incompatible sections of the call number is avoided, as the 1 in the second call number is compared to the placeholder 0 in the first. When the sorting routine reaches this position, it terminates, and any subsequent characters are ignored. Aspects of this multi-column approach may seem counterintuitive at first, but the method mimics what we do when we order call numbers mentally. One compares two call numbers character by character within these component categories until encountering a difference, or until a character or entire category in one of the call numbers is found to be absent.
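Expressed with the same tuple analogy used earlier, the table 4 comparison looks like this (the values are transcribed from table 4; the column layout and assignment are our reconstruction, not ShelfReader's code):

# Sort keys laid out across columns as in table 4:
#             sbc   ci   cdl  co   c1a  c1n   c2a  c2n   sp
key_double = ("B", 945, 0.0, 0,  "D", 0.40, "B", 0.65, "0001998")
key_single = ("B", 945, 0.0, 0,  "D", 0.41, "",  0.00, "0001981B")

# Element-by-element comparison stops at the first difference,
# 0.40 < 0.41 in the c1n position, so the double-Cutter number files
# first and its second Cutter is never compared against an
# incompatible section of the other key.
assert key_double < key_single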
RESULTS

ShelfReader's sorting method is powerful, accurate, and has been extensively tested without issue in a number of different academic libraries within Penn State's statewide system. The application accurately sorts all valid LC call numbers (with the exception of those for certain cartographic materials in the G1000 – G9999 range, which sometimes employ a different syntax and sorting order) as well as those of the National Library of Medicine classification system (which augments LCC with class W and subclasses QS – QZ) and the National Library of Canada classification (which adds to LCC the subclass FC, for Canadian history). While there may conceivably be valid LC or LC-extended call numbers having exotic formats that would fail to sort correctly in ShelfReader, we are not aware of any examples (outside of, once again, the G1000 – G9999 range), nor have we received reports of any from users.

In addition to verifying proper shelf-ordering, ShelfReader contains a number of other features useful for stacks maintenance. The program can identify shelved items that are still checked out to patrons, have been marked missing or lost, or are flagged as in transit between locations, and often reveals items which have been inadvertently "shadowed" (i.e., excluded from public-facing library catalogs) or have shelf labels which do not match their catalogued call numbers. The GUI has different modes to accommodate the user's preferred view (both single-shelf and multi-shelf, stacks views), and allows for a good deal of flexibility in how and when the user wishes to make and record shelf corrections. A reports module is also included, which tracks shelving statistics and other useful information for later reference.

The ShelfReader application code (including the full sorting algorithm) is freely available via an MIT license at https://github.com/scodepress/shelfreader. While ShelfReader was developed and tested using the collections and systems of Penn State University Libraries, its architecture could be adapted and configured for use with other library APIs and adjusted to suit local practices within the general confines of the LC call number structure.13 We can also envision a wide array of potential applications of the sorting functionality within other software environments, and we welcome and encourage users to pursue innovative adaptations of the method.

REFERENCES AND NOTES:

1 Leo E. LaMontagne, American Library Classification: With Special Reference to the Library of Congress (Hamden, CT: The Shoe String Press, 1961). The lengthy development of the LCC is described in detail in chapters XIII and XIV (pp. 221-51).

2 Indeed, as LaMontagne asserts, "The Classification was constructed [ . . . ] to provide for the needs of the Library of Congress, with no thought to its possible adoption by other libraries. In fact, the Library has never recommended that other libraries adopt its system . . . " (ibid., p. 252). Nevertheless, LCC is employed by the overwhelming majority of academic libraries in the United States (Brady Lund and Daniel Agbaji, "Use of Dewey Decimal Classification by Academic Libraries in the United States," Cataloging & Classification Quarterly 56, no. 7 (December 2018): 653-61, https://doi.org/10.1080/01639374.2018.1517851).

3 "Library of Congress Classification," Library of Congress, https://www.loc.gov/catdir/cpso/lcc.html. Italics in original.

4 For a summary of LC sorting rules, see "How to Arrange Books in Call Number Order Using the Library of Congress System," Rutgers University Libraries, https://www.libraries.rutgers.edu/rul/staff/access_serv/student_coord/LibConSys.pdf.
Note that this summary is not comprehensive and does not cover all contingencies.

5 Here we emphasize that our definition of the Cutter string may differ from that of others, including (at times) that of the Library of Congress. For instance, the schedules for certain LCC subclasses regard the first portion of a Cutter as part of the classification itself. Since this paper concerns sorting rather than classification, we favor the simpler and more convenient definition.

6 J. F. Conley and L. A. Nolan, "Call Number Sorting in Excel," https://scholarsphere.psu.edu/downloads/9cn69m421z.

7 Conley and Nolan, "Call Number Sorting in Excel."

8 Tim Dannay, "Sorting LC Call Numbers in Excel," https://medium.com/@tdannay/sorting-lc-call-numbers-in-excel-75de044bbb04.

9 While there is in fact a "hack" or partial patch built into the program which identifies call numbers beginning with the subclass KBG and parses them separately, there is no general support for other call numbers in this category.

10 For the details of this syntax, see "050 - Library of Congress Call Number (R)," Library of Congress, https://www.loc.gov/marc/bibliographic/bd050.html.

11 Testing was conducted on SirsiDynix Symphony WorkFlows Staff Client version 3.5.2.1079, build date June 5, 2017.

12 For an overview, see "Student Library Assistant Training Guide: Shelving Basics," Florida State College at Jacksonville, https://guides.fscj.edu/training/shelving.

13 ShelfReader was written to receive real-time data directly from a SirsiDynix API connected to Penn State University Libraries' LMS, a great improvement over drawing from a static collections database. This does, however, present a challenge for making the program easily adaptable to libraries using other web services. A strategy to adapt the program would need to account for potential differences from institution to institution in barcode structure, in the structure and naming conventions of the REST request, and in the structure and naming conventions of the server response. It is possible that these issues could be resolved via a configuration file made available to the user, but no attempt to address this issue has been undertaken as of yet.

11607 ---- ARTICLES User Experience with a New Public Interface for an Integrated Library System Kelly Blessinger and David Comeaux INFORMATION TECHNOLOGY AND LIBRARIES | MARCH 2020 https://doi.org/10.6017/ital.v39i1.11607 Kelly Blessinger (kblessi@lsu.edu) is Head of Access Services, Louisiana State University. David Comeaux (davidcomeaux@lsu.edu) is Systems and Discovery Librarian, Louisiana State University.
ABSTRACT

The purpose of this study was to understand the viewpoints and attitudes of researchers at Louisiana State University toward the new public search interface from SirsiDynix, Enterprise. Fifteen university constituents participated in user studies to provide feedback while completing common research tasks. Of particular interest to the librarian observers was identifying and characterizing the problems participants encountered as they used the new interface. This study was approached within the framework of cognitive load theory and user experience (UX). Problems that were discovered are discussed along with remedies, in addition to areas for further study.

INTRODUCTION

The library catalog serves as a gateway for researchers at Louisiana State University (LSU) to access the print and electronic resources available through the library. In 2018 LSU, in collaboration with our academic library consortium (LOUIS: The Louisiana Library Network), upgraded to a new library catalog interface. This system, called Enterprise, was developed by SirsiDynix, which also provides Symphony, an integrated library system (ILS) long used by the LSU Libraries. "SirsiDynix and Innovative Interfaces are the two largest companies competing in the ILS arena that have not been absorbed by one of the top-level industry players."1

There were several reasons for the change. Most importantly, SirsiDynix made the decision to discontinue updates to the previous online public access catalog (OPAC), known as e-Library, and focus development on Enterprise. In response to this announcement, the LOUIS consortium chose to sunset the e-Library OPAC in the summer of 2018. This was welcome news to many, especially the systems librarian, who had felt frustrated by the antiquated interface of the old OPAC as well as its limited potential for customization. The newer interface has a more modern design and includes features such as faceted browsing to better suit the twenty-first-century user.

Enterprise also delivers better keyword searching. This is largely because it uses the Solr search platform, which operates on an inverted index. Solr (pronounced "solar") is based on open source indexing technology and is customizable, more flexible, and usually provides more satisfactory results to common searches than our previous catalog. Inverted indexing can be conceptualized similarly to indexes within books. "Instead of scanning the entire collection, the text is preprocessed and all unique terms are identified. This list of unique terms is referred to as the index. For each term, a list of documents that contain the term is also stored."2 Unlike the old catalog, which sorted results by date (newest to oldest), Enterprise ranks results by relevance, like search engines. The new search is also faster because the results are matched to the inverted index instead of whole records.3
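As a rough illustration of the data structure behind this description (not Solr's actual implementation, which adds tokenization, stemming, and relevance scoring), an inverted index maps each unique term to the records that contain it, so a keyword query consults the term list rather than scanning every record. A toy sketch in Python, using titles from this study's own exercises:

from collections import defaultdict

def build_inverted_index(records):
    # Map each unique term to the set of record IDs containing it.
    index = defaultdict(set)
    for rec_id, text in records.items():
        for term in text.lower().split():
            index[term].add(rec_id)
    return index

records = {
    1: "Harriet Tubman and the Fight for Freedom",
    2: "Gerrymandering and Race",
    3: "The Journal of Philosophy",
}
index = build_inverted_index(records)
# A two-term query intersects posting lists instead of scanning records:
print(index["and"] & index["race"])  # -> {2}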
The authors wanted to investigate how well this new interface would meet users' research needs. While library database and website usage patterns can be assessed through quantitative measures using web analytics, librarians are often unaware of the elements that cause frustration for users unless those elements are reported. Prior to Enterprise going live, the library's head of access services solicited internal "power users" in the library to use the new interface. Power users were identified as library personnel in units who used the catalog daily for their work. This group included interlibrary loan, circulation, and some participants in research and instruction services. These staff members were asked to use Enterprise as their default search to help discover problems before it went live. A shared document was created in Google Drive for employees to leave feedback regarding their experiences and suggestions for improvements. The systems librarian had access to this folder and periodically reviewed it and made warranted changes that were within his control.

Several changes were made based on feedback from the internal user group. These included adding the term "Checked Out" in addition to the due date, and the adjustment of features that were not available or working correctly in the advanced search mode, such as Date, Series, Call Number, and ISBN. Several employees were curious about differences between the results in the old system and Enterprise due to the new algorithm. Additionally, most internal users found the basic search too simplistic and not useful, so the advanced search mode was made the default search. Among the suggestions, there was also praise for the new interface. These statements concerned elements of the user-enablement tools, such as "I was able to login using my patron information. I really like the way that part functions," and areas where additional information was now available, such as "I do enjoy that it shows the table of contents—certainly helps with checking citations for ILL."

While the feedback from internal stakeholders was helpful, the authors were determined to gather feedback from patrons as well. To obtain this feedback, the authors elected to conduct usability studies. Usability testing employs representative users to complete typical tasks while observers watch, facilitate, and take notes. The goal of this type of study is to collect both qualitative and quantitative data to explore problem areas and to gauge user satisfaction.4

Enterprise includes an integration with EBSCO Discovery Service (EDS) to display results from the electronic databases subscribed to by the library as well as the library's holdings. EDS was implemented several years ago as a separate tool. The implementation team suspected that launching Enterprise with periodical article search functionality might be confusing to those who were not accustomed to the catalog operating in this manner. Therefore, for the initial roll-out, the discovery functionality was disabled in Enterprise, leaving it to function strictly as a catalog of library resources. This decision will be revisited later. As at many other academic libraries, EDS is currently the default search for users from the LSU Libraries homepage. Other search interfaces, labeled "Catalog," "Databases," and "E-Journals," are also included as options in a tabbed search box.

CONCEPTUAL FRAMEWORK

Two schools of thought helped to frame this research inquiry: cognitive load theory and user experience (UX). Cognitive load theory relates to the amount of new information a novice learner can take on at a given time due to limitations of the working memory.
This theory originated in the field of instructional design in the late 1980s.5 The theory states that what separates novice learners from experts is that the latter know the background, or are familiar with the schema, of a problem, whereas novices start without this information. Accordingly, experts are able to categorize and work with problems as they are presented, whereas new learners need to formulate their problem-solving strategies while encountering new knowledge. As a result, novices are quicker to max out the cognitive load in their working memories while trying to solve problems. UX emerged from the field of computer science and measures the satisfaction of users with their experience with a product.

A 2010 article reviewed materials published in 65 separate studies with "cognitive load" in the title or abstract.6 Early articles on cognitive load focused on learning and instructional development. These studies concentrated on limiting extraneous information (e.g., materials and learning concepts), which affects the amount of information able to be held in the working memory.7 While the research that developed cognitive load theory centered on real-life problem-solving scenarios, later research focused on its impact in e-learning environments and learning through this delivery mode.8 In contrast to cognitive load theory, which was formed by academic study, the concept of UX was developed in response to user/customer satisfaction, particularly regarding electronic resources such as websites.9 User testing allows end users to provide real-time feedback to developers so they see the product working, and in particular, to note where it could be improved. UX correlates well with cognitive load theory for this study, as the concept arose with the widespread use of computers in the workplace and in homes in the mid-1990s.

USER STUDIES

User expectations have shifted beyond the legacy or "classic" OPAC, originally designed for use by experienced researchers with the primary goal of searching for known items.10 User feedback has historically been sought when libraries release new platforms and services and to gauge user satisfaction regarding research tools. "Libraries seek fully web-based products without compromising the rich functionality and efficiencies embodied in legacy platforms."11 A study by Borbely used a combination of the log files of OPAC searches and a satisfaction questionnaire to determine which factors were most important to both professional and nonprofessional users. Their findings indicated that task effectiveness, defined as the system returning relevant results, was the primary factor related to user satisfaction.12

Many of the articles dealing with user studies and library holdings published in recent years have focused on next-generation catalogs (NGCs). This was defined in a 2011 study by 12 characteristics: "A single point of entry for all library resources, state of the art web interface, enriched content, faceted navigation, simple keyword search box with a link to advanced search box on every page, relevancy based on circulation statistics and number of copies, 'did you mean . . .' spell checking, recommendations/related materials based on transaction logs, user contributions (tagging and ranking), RSS feeds, integration with social networking sites, and persistent links."13
Catalogs defined as next-generation provide more options and functionality in a user-friendly, intuitive format. They are typically designed to more closely mimic web search engines, with which novice users are already familiar. Tools within NGCs such as the faceted browsing of results have been reported as popular in user studies, especially among searchers without high levels of previous search experience. "Faceted browsing offers the user relevant subcategories by which they can see an overview of results, then narrow their list."14 A 2015 study interviewed 18 academic librarians and users to seek their feedback regarding new features made possible by NGCs. Their findings indicate "that while the next-generation catalogue interfaces and features are useful, they are not as 'intuitive' as some of the literature suggests, regardless of the users' searching skills."15 This study also indicated that users typically use the library catalog in combination with other tools such as Google Scholar or WorldCat Local, both for ease of use and for a more complete review of the literature.

While Enterprise contains the twelve elements that Yang and Hofmann defined for an NGC, since the discovery element has been disabled, LSU Libraries' use of Enterprise may be better described as an ILS OPAC with faceted results navigation. While the implementation of discovery services and other web tools has shifted users to sources other than the catalog, many searchers often still prefer to use the library's catalog. Reasons for this may include familiarity with the interface, the ability to limit results to smaller numbers, or the unavailability of specific desired search options through other interfaces.

PROBLEM STATEMENT

The purpose of this study was to understand the viewpoints and attitudes of university stakeholders regarding a new interface to the online catalog. In particular, four areas were investigated:

1. Identification of problems searching for books on general and distinct topics.
2. Identification of problems searching for books with known titles and specific journal titles.
3. Exploration of the usability of patron-empowerment features.
4. Identification of other issues and/or frustrations (e.g., "pain points").

METHODOLOGY

Three groups of users were identified for this study: undergraduate students, graduate students, and staff/faculty. The student participants were the easiest to recruit due to a fine forgiveness program that was initiated at LSU Libraries in 2016. This program gives library users the option of completing user testing in lieu of some or all of their library fines (up to $10 per user test). All the student participants were recruited in this manner. Additionally, five faculty/staff members identified as frequent library users were asked to participate. The participant pool for user testing included five undergraduate students, five graduate students, and five faculty and staff members; five participants per group is considered a best practice in user testing.16 The total sample for this study was 15 library users representing three unique user groups. These participants are described in more detail in appendix A.
For the observations, individuals were brought to the testing room in the library. This is a small, neutral conference room with a table, laptop, and chairs for the librarian observers and participants. Each participant was tested individually and was asked to speak aloud through their thought process as they used the new interface. The authors employed a technique known as "rapid iterative testing." This type of testing involves updating the interface soon after problems are identified by a user or observer. Thus, after each user test, confusing and extraneous information was removed (applying cognitive load theory), improving the interface in alignment with the concept of UX. This approach helped to minimize the number of times participants repeatedly encountered the same issues. This framework makes this study more of a working study than a typical user study. A demonstration of this type of testing is included as figure 1.

Figure 1. Iterative Testing Model ("Before" and "After" panels). This shows the logon screen before and after it was modified, based on the observation that users were unsure what information to enter here.

The software Usability Studio was used to record audio and video of the participants' electronic movements throughout each test. Although the software can also record video of users throughout tests, the authors felt that this might make the participants uncomfortable and possibly more reluctant to openly share their opinions. At the beginning of each user study, participants were informed of the purpose of the study, the process, and the estimated time of the observation (30 to 45 minutes). The participants were then asked to sign a consent form for participation in the study.

The interviews began with two open-ended pre-observation questions to gauge the users' previous library experience. The first question asked students whether they had received library training in any of their courses, or, if the participant was a teaching staff or faculty member, whether they regularly arranged library instruction for their students. The second question explored whether they had needed to use library resources in a previous assignment or had required these in one they had assigned. Volunteers were then given a written list of four multi-part task-based exercises, detailed in appendix B. These exercises were designed to evaluate the areas of concern outlined in the problem statement and to let the users explore the system, helping the observers discover unforeseen issues. The observations ended with two follow-up questions that asked the participants to describe their experience with the new interface. They were asked what they liked and what they found frustrating. They were also asked if there were areas where they felt they needed more help, and how the design could be made more user friendly.

After the testing was completed, the audio files were imported into Temi, a transcription tool that provided the text of what the users and the observers said throughout the test periods. The authors reviewed these transcripts and the recorded videos of the users' keystrokes within the system for further clarity. The process and all instruments involved were reviewed by the LSU Institutional Review Board prior to the testing. All user tests took place from March through November 2018.
FINDINGS

Previous Library Training and Assignments

Three of the five undergraduate participants had received some previous library training from their professors or from a librarian who visited their classes. Those who had training tended to recall specific databases they found useful, such as CQ Researcher or JSTOR. The assignments requiring library research mentioned by the undergraduate participants typically required several scholarly resources as a component of an assignment. Four out of five of the graduate-student participants also indicated that they had some library training, and most also indicated they used the library frequently. Two of the graduate-student participants, both studying music, mentioned that a research course was a required component of their degree program.

The staff and faculty members tested mentioned that, depending on the format of the course, they either demonstrated resources to their students themselves or would request a librarian to provide more extensive training. Participant 13, a teaching staff member, mentioned that in a previous class she was "able to get our subject librarian to provide things to cater to the students, and they had research groups, so she [the subject librarian] was very helpful." Some of the teaching staff and faculty mentioned providing specific scholarly resources for their students. They acknowledged that since these were provided, their students did not gain hands-on experience finding scholarly materials themselves. Participant 10, a faculty member, stated that she usually requires that students in one of her courses "do an annotated bibliography. I'll require that they find six to ten sources in the library and usually require that at least three or four of those sources be on the shelf physically, because I want them to actually work with the books, and in addition, to avail themselves of electronic resources."

Most of the staff and faculty participants indicated that, despite its weaknesses, they preferred using the online catalog over EDS, mainly because EDS included materials outside of our collection. When asked to explain, Participant 12, a staff member, said,

Because I feel like [with] EBSCO you get a ton of results, and you know, I'm still looking for stuff that you guys have. Um, [however] because of the way the original catalog is, I feel like I have to go through Discovery to get a pretty accurate search on what LSU has. Because, when I do use the Discovery search, it's a lot more sensitive, or should I say maybe a lot less sensitive, and it will pick up a lot of results. . . . It searches your catalog really well, just like WorldCat does. . . . So, if the catalog was the thing that was able to do that, that would be cool. If the catalog search was more intuitive and inviting, I wouldn't even bother going to some of these other places.

Books: General and Distinct Topics

The observers noticed multiple participants using or commenting on known techniques learned from experience with the old catalog interface. These included Boolean operators such as AND to connect terms within the results. Enterprise does not include Boolean logic in its searches; a goal of the new algorithm's structure is to provide a search closer to natural language.
While most of the student participants typically searched by keyword when looking for books on general topics, staff and faculty participants typically preferred to search within the subject or title fields. Faculty and staff participants also actively utilized the Library of Congress Subject Heading links within records and said that they recommended that their students find materials in this manner. Participant 9, a faculty member, said that he usually told his students to "find one book, then go to the catalog record . . . where you'll get the subject headings. Because . . . you're not going to guess the subject headings just off the top of your head, and that's how they are organized. That's the best way of getting . . . your real list."

Many users were able to deduce that a book was available in print based on the call number listed. However, some undergraduate-student participants were confused by links in individual records and assumed that these were links to electronic versions of the book, when the links primarily led to additional information about the book, such as the table of contents. The observers also found that many students used the publication date limiter when searching for materials, commenting that they typically favored recent publications unless the subject matter made a historical perspective beneficial. The date limiter, while effective for books, is less effective for periodicals, which include a range of publication dates. More advanced researchers, such as one staff participant, enjoyed the option to sort their results by date, but sorted these by "oldest first," indicating that they did this to find primary documents.

Known Title Books

None of the user groups tested had trouble locating a known book title within the new interface, although two undergraduate students remarked that they preferred to search by the author, if known, to narrow the results. Most undergraduates determined whether the books were relevant to their needs based on the title and did not explore the tables of contents or further information available. Graduate students tended to be more sophisticated in their searches for relevance and used the additional resources available in the records. Participant 12, a staff member, mentioned that he liked the new display of the results when he was searching for a specific book. While the old system contained brief title information in the results display, he believed the new version showed more pertinent information, such as the call number, in the initial results. He said, "and this is also great too, because the old system . . . you would bring up information, then there's another tab you have to click on to get to the . . . real meat of it. So . . . this is really good to see if it's a book, to know what the number is immediately, just to not have to go through so many clicks."

Specific Journals

Specific journal results were problematic and confusing in multiple ways. The task regarding journals directed users to find a specific journal title within Enterprise and then to determine whether 2018 issues were available and in what format (e.g., print or electronic). All the student users had trouble determining whether a journal was in print or electronic and whether the year they needed was available. The task of finding a specific journal title and its available date range was also troublesome to many students.
The catalog lists "Publication Dates" for journals prominently in the search results. However, these dates indicate the years that a journal was published, not the years that the library holds. Users need to go into the full record for a journal to see the available years listed under the holdings information. Unfortunately, this was not intuitive to many. Additionally, the presentation of years in records for journals was also unclear to some. For instance, Participant 2, a freshman, did not understand that a dash next to an initial date (e.g., 2000–) indicated that the library held issues from 2000 to the present.

Many student users, especially those familiar with Google Scholar or EDS, did not understand that journals are indexed in the catalog solely by the title of the journal. This is problematic for those who are accustomed to more access points for journals, such as article title and author. Journals were additionally confusing because each format (e.g., print, electronic, or microfilm) has its own record in the catalog. Typically, users clicked on the first record with the title that matched the one they required and assumed that this was all that was available, rather than scrolling through the results to view the different formats and the varying years within. This issue was problematic for all the student participants. Participant 5, a PhD student, summed up this frustration by saying, "I find stuff like that sometimes when I'm looking for other things. Like, it shows the information [for the journal] but great, awesome. I found that [the journal] is maybe, hopefully somewhere, but sometimes you click on whatever link they have, and it goes to this one thing and there's like one of the volumes available. So, this is not useful."

While records in the past were cataloged with all the holdings information in one record, a better arrangement for end users, this practice was changed in the mid- to late 2000s because updating journal holdings was a manual process completed by technical services staff. This was when the influx of electronic journal records began to increase steadily, making that workflow too cumbersome. Due to all these known issues with journals, when asked to search for a specific journal in the catalog, several advanced searchers (graduate students, staff, and faculty) indicated they would not use the catalog to find journals. Several named other sources they preferred to use, whether Google Scholar, interlibrary loan, or subject-specific databases in their fields. After fumbling around with the catalog, Participant 12, a staff member, summed this up by saying, "I guess if I was looking for a journal, I would just go back to the main page, and go from there [from the e-journals option]. I haven't really searched for journals from the catalog. The catalog is usually my last [resort], especially for something like a journal."

Usability of Patron-Empowerment Features

Many participants were confused by the login required to engage with the patron-enablement tools prior to the iterative changes demonstrated in figure 1. Once changes were made clarifying the required login information, patrons were able to use the patron-enablement tools well, placing holds and selecting the option to send texts regarding overdue materials.
However, few undergraduate participants intuitively understood the functionality of the checkboxes next to records for retrieving items later. Some participants assumed that they needed to be logged into the system to use this functionality, similar to EDS. Participant 1, a senior, said that she used a different method for retrieving items later, stating, "normally, I'm going to be honest, if I needed the actual title, I'd put it in a Word document on my computer. I wouldn't do it on the library website." Another graduate student, Participant 14, stated that while he was aware of the purpose of the checkboxes, he would not use them because the catalog would not be the only resource he would be using. He said that his preference was to "keep a reference list [in Word] for every project. And then this reference list will ultimately become the reference for the work done." Participants in every category noted that they did not usually create lists in the catalog to refer to later.

There was enthusiasm regarding the new option to text records, with Participant 6, a staff member, going so far as to say "boy, this is gonna make me very annoying to my friends," and staff Participant 12 stating "that's a really cool feature. I think that's more helpful than this email to yourself." Unfortunately, several issues were discovered regarding the text functionality. The first was that it was not working reliably with all carriers. Once that was resolved, the systems librarian removed extraneous information regarding the functionality. This included text stating that "Standard Rates Apply" and a requirement for users to choose their phone carrier before a text could be sent. Both were deemed unnecessary, as it was assumed that users would know whether they were charged for receiving text messages. Additionally, one of the international graduate-student participants did not understand the connection between texting information and the tool to do this, which was titled "SMS notification." While the texts were successful while the users were performing the studies, it was discovered later that the texts did not include the call numbers for items. After discussion of this problem arose at a LOUIS conference, the decision was made to hide texting as an option until the system was able to properly text all the necessary information. When SirsiDynix fixes this issue, the language around this feature will likely be made more intuitive by labeling it "Text Notifications" instead of SMS.

Other Issues and Frustrations

The researchers noticed that some options were causing confusion, such as the format limiter. Under this drop-down option, several choices were displayed that did not align with known formats in the LSU collections, such as "Continuing Resources." To remedy this, all the formats that could not be identified were removed as options. Another confusing element was the way that records were displayed in initial user tests. Some MARC cataloging information was visible to users, so the systems librarian modified the full record display to hide this information.
Originally, the option to see this information was moved to the side under a button labeled "View MARC Record." However, since this still seemed to confuse users, the button was changed to "Librarian View." Undergraduate-student users reported confusion when they needed to navigate out of the catalog into a new, unfamiliar database interface to obtain an electronic article. Participant 3, a senior, described her feelings when this happened: she felt like she was "not in the hands of LSU anymore. I'm with these people, and I don't know how to work this for sure." Another undergraduate user suggested that the system provide a warning when this occurs, so users would know that they would be navigating in a new system. Since so many of the records link to databases and other non-catalog resources, this was not pursued.

Several undergraduate-student users mentioned that they didn't understand the physical layout of the library, and that they used workarounds to get the materials they needed rather than navigate the library. For example, some were using the "hold" option in the catalog to have staff pull materials for them for reasons not initially intended by the library. Rather than using this feature for convenience, they stated they were using it due to a lack of awareness of the layout of the library or the call number system. One user, Participant 4, a sophomore, used the hold feature to determine whether a book was in print or electronic. When she clicked on the "hold" button in a record and it was successful, she said, "okay, so I can place a hold on it, so I'm assuming there is a copy here."

Follow-Up Questions

Feedback on the new interface was primarily positive. Several participants mentioned that the search limiters were now more clearly presented as choices from drop-down boxes. Additionally, result lists are now categorized by facets such as format and location, which users had options to "include" or "exclude" at their discretion. Participant 9, a faculty member, particularly liked the new Library of Congress subject facet within the search results. She mentioned that these were available in the past interface, but the process to get to them was much more cumbersome. She regarded this new capability as a "game changer" and "something she hadn't even dreamed of." Experienced searchers, such as Participant 6, a staff member, noticed and appreciated the improvements in search results made possible by the new algorithm. She said, "It's very easy to look at, especially compared to the old database, and the keyword searching is a lot better." After conducting a brief search, another staff member, Participant 12, mentioned that he thought the results returned by an author search were much more relevant than in the past. He said, "Sometimes with the older system even name searches can be sort of awful. I mean . . . this is a lot better. . . . If you type in Benjamin Franklin, for me at least, it's difficult to get to the things he actually wrote. You know, you can find a lot of books about [them], and so you kind of have to filter until you can find . . . the subject."

Figure 2. Catalog disclaimer.

The new search is also more forgiving of misspellings than the old version, which responded with "This Search Returned No Results" when an item was misspelled.
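Solr's suggestion and fuzzy-matching machinery is considerably more involved, but the basic behavior can be approximated with similarity scoring over indexed titles. A toy sketch (our illustration, not the Enterprise implementation):

import difflib

titles = [
    "Harriet Tubman and the Fight for Freedom",
    "The Journal of Philosophy",
    "Gerrymandering and Race",
]

def did_you_mean(query, vocabulary, cutoff=0.6):
    # Return the indexed title most similar to a possibly misspelled
    # query, or None when nothing clears the similarity cutoff.
    matches = difflib.get_close_matches(query, vocabulary, n=1, cutoff=cutoff)
    return matches[0] if matches else None

print(did_you_mean("Gerrymandring and Race", titles))
# -> 'Gerrymandering and Race'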
Those who were very familiar with the old interface, such as staff and faculty, were particularly excited by small changes. These included being able to use a browser's back button instead of the navigation internal to the system, and the addition of the ISBN field to the primary search. Prior to the new interface, the system would give an error message when a user attempted to use a browser's back button instead of the internal navigation. Additionally, users mentioned that they liked that features were similar to the previous interface, with additional options. An example of a new feature is the system employing fuzzy logic to provide "did you mean" and autocomplete suggestions when users start typing in titles they are interested in, similar to what Google provides. This same logic also returns results with related terms, eliminating the need for truncation tools.17 One graduate student, however, particularly mentioned missing Boolean operators; they thought these were helpful because students had been taught them and were familiar with them. Due to this comment, and other differences between the old and new interfaces, a disclaimer was added to make users aware of these changes (see figure 2).

Two of the five undergraduate participants and two faculty participants noted that if they didn't understand something, or needed help, they would ask a person at a service desk for assistance. One of the staff members mentioned she would use the virtual chat if she had a question regarding the catalog. She also suggested that a question mark symbol providing additional information when clicked might be helpful if users got confused in the system. To allow users to provide continued feedback, the systems librarian created a web form for users to ask questions or report errors regarding the system.

DISCUSSION

Study participants requested several updates; unfortunately, some of the recommendations were in areas where the systems librarian had little control to make changes. Since the initial advanced search was not easy to customize, the systems librarian created a custom new advanced search that more closely fit the needs of catalog users than the built-in search. One limitation of the default advanced search that several participants and staff users noted was the inability to limit by publication date. To work around this problem, one of the features the systems librarian implemented in the custom search was a date-range limiter. While still falling short of the patrons' desired outcome of inputting a precise date to limit by, the date-range feature was still a step forward. He was also able to make stylistic changes, such as bolding fields or making buttons appear in bright colors to make them more visible. Other changes included eliminating confusing options and reordering the way full records appeared. This included moving the call number to a more visible area than where it was originally located. After a staff participant suggested it, he was also able to make the author name a hyperlinked field. Now users can click on an author's name to see what other books by that author are available within the library. The systems librarian was also able to make the functionality of the checkboxes more intuitive by adding a "Select an Action" option at the top of the list of results, which more clearly indicated what could be done with the checked records: they can be added to a list, printed, e-mailed, or placed on hold.
The username and PIN required to engage with the user-enablement tools were continually problematic and not intuitive. Only one of the student participants knew their login information, a graduate student close to graduation. The username is a unique nine-digit LSU ID number, which students, faculty, and staff don't often use. The PIN is system generated, so there is no way users could intuit what their PIN is. Once the user selects the "I forgot my PIN" option, however, the PIN is sent to them, and they then have the option to change it to something they prefer. This setup is not ideal, especially since many other resources on campus are accessed through a more familiar text-based login and password. The addition of "I forgot my PIN" to this part of the interface helps by anticipating and assisting with this problem, and an example with the nine-digit ID number is provided, but this can also be overlooked. For this reason, and for other security reasons related to paying fines, the library is exploring options to provide a single sign-on mechanism.

The lack of knowledge regarding the physical layout of the library cannot be solely blamed on the users. In 2014, the LSU Libraries made several changes to Middleton Library, the main campus library. The first was the closing of a once-distinct collection, Education Resources, whose titles were merged with the regular collections. The second was weeding a large number of materials on the library's third floor to facilitate the creation of a Math Lab. The resulting shifting of the collection had a direct impact on how patrons were able to locate materials within the library. Due to required deadlines, access services staff needed to place books in groupings out of their typical call number locations. The department is still working to remedy this five years later. In addition, service points previously staffed to assist with wayfinding on the third and fourth floors were closed.

CONCLUSION

To optimize whatever resources the library provides, user feedback is a useful tool. However, there are limitations to this study, the most obvious being that the data collection took place at, and was based on researchers from, one institution. This could limit the applicability of the study's results. The second is the sampling method for the user tests. Student users self-selected by volunteering in lieu of user fines, and staff and faculty identified as frequent library users were purposively selected. Less-experienced users may have encountered different issues as they navigated the system.

Most of the student participants indicated they had received library training in some manner and had been required to use library resources in the past to complete assignments. Only a small number of participants were undergraduates without library training. The authors noted that the student participants who had received library training were more likely to attempt complicated searches and to explore advanced features. However, they also tended to try to conduct searches that were optimized for the previous catalog, such as using Boolean logic. Those with library training were also more likely to identify problematic areas, such as searching for journals, and to develop workarounds to get the materials they desired.
The two graduate students in music, who were required to take a research course, both indicated how helpful this knowledge was in conducting research in their field.

The user tests in this study demonstrated which information points the users at LSU found to be the most relevant, which allowed the systems librarian to redesign a search that better fit their needs. This included hiding or separating extraneous information, such as the additional information regarding texting, and making changes so all MARC coding appeared only under the newly created "Librarian View." While this study demonstrated that the advanced-researcher participants created workarounds regarding journal searches, undergraduate participants also created workarounds (such as placing holds) to accommodate their lack of knowledge regarding the library system and the physical library. Several of the undergraduate participants reported anxiety regarding their ability to navigate systems when the catalog linked to databases with interfaces new to them. The authors found that more advanced researchers appreciated having more data in catalog records, such as information on publishers and Library of Congress Subject Headings. Students without as much exposure to library resources tended to prefer to conduct keyword searches and were more likely to judge the relevance of a record mainly by the title or year of publication. Most of the staff and faculty participants in this study indicated that they preferred to use the OPAC over EDS. Less-seasoned researchers tended to prefer ease and convenience over additional control and functionality. These kinds of generalizations could be tested by additional studies at other universities.

The new user-empowerment features were received positively, especially the new "Text Notifications" feature. Most participants indicated that they found it easy to renew items within the interface. However, the authors discovered that few patrons indicated they would use "My lists" to capture records they would like to retrieve later. The user tests highlighted how many problems LSU library users are having signing on to the system to utilize the user-enablement tools. It is hoped that the upcoming change to single sign-on will alleviate these issues and the users' frustrations. The systems librarian would like to incorporate other changes, such as the request to return to the same spot on a list after going into a full record rather than returning to the top of the list. He is also planning to program the mobile view for the catalog soon; currently the mobile site still links to the desktop version. He has also reintroduced the option to conduct a Boolean search by linking to the old catalog, since so many users are familiar with it. The text messaging is expected to be corrected in an upcoming upgrade.

Overall, response from the participants in this study was positive, especially regarding the new algorithm. They also appreciated the familiarity of the design compared with the previous catalog interface, along with its additional features and functionality. Regardless of the limitations of this study, some of its findings reaffirm those from previous user studies. These include researchers indicating a need to consult multiple resources either in combination with or to the exclusion of the catalog, and NGCs not being as intuitive as expected.
The need to consult multiple resources particularly correlated with this study's findings regarding journals. Librarians were aware that searching for journals in the online catalog was tricky for users due to multiple issues. Many of the experienced participants in this study mentioned that they appreciated the new algorithm because it provided more accurate results. This reaffirmed results from the Borbely study, which indicated that task effectiveness, or the system returning relevant results, was the primary factor related to user satisfaction. Also similar to findings from the literature, users appreciated the newly available faceted-browsing features. Unlike in a previous study, however, it was the advanced searchers, rather than the novices, who mentioned these specifically as an improvement.18

The authors noted that it was common for undergraduate library participants to express confusion regarding navigating the physical library, so the library has taken several steps to remedy this. Since this user testing was completed, a successful grant was written to provide new digital signage to replace outdated signage. This digital signage will be much more flexible and easier to update than the older fixed signage. Additionally, this grant provided a three-year license to the StackMaps software. This software has since been integrated into the catalog and the EDS tool to direct users to physical locations within the libraries. Additionally, the access services department updated physical bookmarks that display the call number ranges and resources available on each floor. These are now available at all the library's public service desks. The library will also continue providing the popular "hold" services for patrons. This is a relatively new service, which was started to offset confusion and to assist patrons during the construction they may have encountered during the changes to the library. Finally, since the fine forgiveness program has been so fruitful for recruitment for user studies, the special collections library also anticipates offering user studies in lieu of photocopying costs in the future.

FUTURE RESEARCH

These user tests made it obvious that finding specific journal information through the catalog was difficult for most users. This is an area that needs remediation, and the systems librarian plans to conduct further user testing to explore avenues to make searching for journal holdings more efficient. Another potential area for further study is assessing Enterprise's integration of article records. As previously mentioned, Enterprise can be configured to include article-level records in its display. However, this functionality would duplicate an existing feature of our main search tab, an implementation of EDS that we have labeled "Discovery." While the implementation team felt that duplicating this functionality on a search tab labeled "Catalog" might initially confuse users, replacing our current default search tab with Enterprise warrants serious consideration. An additional area to explore is a rethinking of the tabbed search box design.
While the tabbed design remains popular in libraries, a trend toward a single search box on the library homepage has been observed in academic libraries.19 A future study with an emphasis on determining the best presentation of the various search interfaces, including either a reshuffling of the available tabs or a move to a single search box, is planned in the near future.

APPENDIX A: STUDY PARTICIPANTS

Participant | Status | Year | Major | Date Tested
1 | Undergrad | Senior | International Studies & Psychology | 3/23/2018
2 | Undergrad | Freshman | Mass Communication | 4/6/2018
3 | Undergrad | Senior | Child and Family Studies | 4/13/2018
4 | Undergrad | Sophomore | Pre-Psychology | 4/24/2018
5 | Graduate | PhD | Music | 4/26/2018
6 | Staff | N/A | English | 5/3/2018
7 | Graduate | Masters | Music | 5/3/2018
8 | Graduate | PhD | French | 5/4/2018
9 | Faculty | N/A | History | 5/7/2018
10 | Faculty | N/A | English | 5/8/2018
11 | Undergrad | Junior | Accounting | 6/1/2018
12 | Staff | N/A | History | 6/4/2018
13 | Staff | N/A | Mass Communication | 8/28/2018
14 | Graduate | PhD | Curriculum and Instruction | 10/3/2018
15 | Graduate | PhD | Petroleum Engineering | 10/2/2018

APPENDIX B: USER EXERCISES

Worksheet

1) You need to do a research paper on gerrymandering and race.
   a) Identify three books that you may want to use.
   b) How would you save these titles to refer to later?
2) You are looking for the book titled Harriet Tubman and the Fight for Freedom by Lois E. Horton. Find out the following and write your answers below.
   a) Does the library own this in print?
   b) What is the call number?
   c) If we have this book, go into the record, and text yourself the information.
   d) Place a hold on this book.
3) You need an article from the Journal of Philosophy. Do we have access to the 2018 issues? What type of access (e.g., print or electronic)?
4) Log in to your personal account to see the following:
   a) What you have checked out currently; if you have materials out, try to renew an item.
   b) Determine any fines you owe.
   c) Add a text notification for overdue notices.

ENDNOTES

1 Marshall Breeding, “Library Systems Report 2018: New Technologies Enable an Expanded Vision of Library Services,” American Libraries (May 1, 2018): 22–35.

2 David A. Grossman and Ophir Frieder, Information Retrieval: Algorithms and Heuristics, 2nd ed. (Dordrecht, The Netherlands: Springer, 2004).

3 Dikshant Shahi, Apache Solr: A Practical Approach to Enterprise Search (Berkeley, CA: Springer eBooks, 2015), EBSCOhost.

4 “Usability Testing,” U.S. Department of Health and Human Services, accessed June 1, 2019, https://www.usability.gov/how-to-and-tools/methods/usability-testing.html.

5 John Sweller, “Cognitive Load During Problem Solving: Effects on Learning,” Cognitive Science 12, no. 2 (1988): 257–85, https://doi.org/10.1207/s15516709cog1202_4.

6 Nina Hollender et al., “Integrating Cognitive Load Theory and Concepts of Human–Computer Interaction,” Computers in Human Behavior 26, no. 6 (2010): 1278–88, https://doi.org/10.1016/j.chb.2010.05.031.

7 Wolfgang Schnotz and Christian Kürschner, “A Reconsideration of Cognitive Load Theory,” Educational Psychology Review 19, no. 4 (2007): 469–508, https://doi.org/10.1007/s10648-007-9053-4.

8 Jeroen J. G. van Merriënboer and Paul Ayres, “Research on Cognitive Load Theory and Its Design Implications for E-Learning,” Educational Technology Research and Development 53, no. 3 (2005): 5–13, https://doi.org/10.1007/BF02504793.

9 Ashok Sivaji and Soo Shi Tzuaan, “Website User Experience (UX) Testing Tool Development Using Open Source Software (OSS),” in 2012 Southeast Asian Network of Ergonomics Societies Conference (SEANES), ed. Halimahtun M. Khalid et al. (Langkawi, Kedah, Malaysia: IEEE, 2012), 1–6, https://doi.org/10.1109/SEANES.2012.6299576.

10 DeeAnn Allison, “Information Portals: The Next Generation Catalog,” Journal of Web Librarianship 4, no. 4 (2010): 375–89, https://doi.org/10.1080/19322909.2010.507972.

11 Breeding, “Library Systems Report 2018.”

12 Maria Borbely, “Measuring User Satisfaction with a Library System According to ISO/IEC TR 9126-4,” Performance Measurement and Metrics 12, no. 3 (2011): 151–71, https://doi.org/10.1108/14678041111196640.

13 Sharon Q. Yang and Melissa A. Hofmann, “Next Generation or Current Generation? A Study of the OPACs of 260 Academic Libraries in the USA and Canada,” Library Hi Tech 29, no. 2 (2011): 266–300, https://doi.org/10.1108/07378831111138170.

14 Jody Condit Fagan, “Usability Studies of Faceted Browsing: A Literature Review,” Information Technology & Libraries 29, no. 2 (2010): 58–66, https://doi.org/10.6017/ital.v29i2.3144.

15 Hollie M. Osborne and Andrew Cox, “An Investigation into the Perceptions of Academic Librarians and Students Towards Next-Generation OPACs and their Features,” Program: Electronic Library and Information Systems 51, no. 4 (2015): 2163, https://doi.org/10.1108/PROG-10-2013-0055.

16 “Best Practices for User Centered Design,” Online Computer Library Center (OCLC), accessed June 7, 2019, https://www.oclc.org/content/dam/oclc/conferences/ACRL_user_centered_design_best_practices.pdf.

17 “Enterprise,” SirsiDynix, accessed June 27, 2019, https://www.sirsidynix.com/enterprise/.

18 Fagan, “Usability Studies.”

19 David J. Comeaux, “Web Design Trends in Academic Libraries—A Longitudinal Study,” Journal of Web Librarianship 11, no. 1 (2017): 1–15, https://doi.org/10.1080/19322909.2016.1230031.
11627 ---- LITA President’s Message: Sustaining LITA

Emily Morton-Owens

Emily Morton-Owens (egmowens.lita@gmail.com) is LITA President 2019-20 and the Assistant University Librarian for Digital Library Development & Systems at the University of Pennsylvania Libraries.

Recently, at the 2019 Midwinter Meeting in Seattle, ALA decided to adopt sustainability as one of the core values of librarianship. The resolution includes the idea of a triple bottom line: “To be truly sustainable, an organization or community must embody practices that are environmentally sound AND economically feasible AND socially equitable.” If you had thought of sustainability mainly in terms of the environment, you have plenty of company. I originally pictured it as an umbrella term for a variety of environmental efforts: clean air, waste reduction, energy efficiency. But in fact the idea encompasses human development in a broader sense. One definition of sustainability involves making decisions in the present that take into account the needs of the future.

Of course our current environmental threats demand our attention, and libraries have found creative ways to promote environmental consciousness (myriad examples include Books on Bikes, seeking LEED or passive house certification for library buildings, providing resources on xeriscaping, and many more). Even if you’re not presently working in a position that allows you to engage directly on the environment, though, the concept of sustainability turns out to permeate our work and values. The ideas of solving problems in a way that doesn’t create new challenges for future people, developing society in a way that allows all people to flourish, and fostering strong institutions: these concepts all resonate with the work we do daily, not only in what we offer our users but also in how we work with each other.

As a profession, we have a history of designing future-proof systems (or at least attempting to). Whenever I’ve been involved in planning a digital library project, one of the first questions on the table is “How do we get our data back out of this, when the time comes?” No matter how enamored we are of the current exciting new solution, we remember that things will look different in the future. Library metadata schemas are all about designing for interoperability and reusability, including in new ways that we can’t picture yet. Someone who is unaccustomed to this kind of planning may see a high project overhead for these concerns, but we have consistently incorporated long-term thinking into our professional values due to the importance we place on free access, data preservation, and interoperability.
The triple-bottom-line approach, considering economic, social, and environmental factors, also influences the LITA leadership. I recently announced the LITA Board’s decision to reduce our in-person participation at ALA Midwinter for 2020, which is partly in response to ALA’s deliberations about reinventing the event starting in 2021. With all the useful collaboration technologies now at our fingertips, it is harder to justify requiring our members to meet in person more than once per year. It is possible for us to do great work, on a continuous and rolling basis, throughout the year. More importantly, we want to offer committee and leadership positions to members who may not be able to travel extensively, for personal or work reasons. (Especially when many do not receive financial support from their employers. And, to come back around to environmental concerns for a moment, think of all the flights our in-person meetings require.) By being more flexible about what participation looks like, we sustain the effort that our members put into LITA through a world of work that is changing.

Financial sustainability is also a factor in our pursuit of a merger with ALCTS and LLAMA. We are three smaller divisions based on professional role, not library type, who share interests and members. We also have similar needs and processes for running our respective associations. Unfortunately, LITA has been on an unsustainable course with our budget for some time—we spend more than we take in annually, due to overhead costs and working within ALA’s processes and infrastructure. The LITA Board has engaged for many years on the question of how to balance our financial future with the fact that our programs require full-time staff, instructors, technology, printing, meeting rooms, etc. Core, as the new merged division will be known, will allow us to correct that balance by combining our operations, streamlining workflows, and containing our costs. The staff will also be freed up to invest more effort in member engagement. We can’t predict all the services that associations will offer in the future, but we know that, for example, online professional development is always needed, so we’re ensuring that the plan allows it to continue. It is inspiring to talk about the new collaborations and subject-matter synergies that the merger will bring with it, but Core will also achieve something important for sustaining a level of service to our membership.

At the ALA level, the Steering Committee on Organizational Effectiveness (SCOE) is also looking at ways to streamline the association’s structure and make it more approachable and welcoming to new members. I would add that a simplified structure should make ALA more accountable to members as well, which is crucial for positioning it as an organization worth devoting yourself to. These shifts are essential because member volunteers are what make ALA happen, and we need a structure that invites participation from future generations of library workers. Taken together, these may look like a confusing flurry of changes.
But librarians have evolved to be excellent at long-term thinking about our goals and values and how to pursue an exciting future vision based on what we know now and what tools (technology, people, ideas) we have at hand. We care about helping our users thrive and are able to take a broad view of what that encompasses. In particular, with the new resolution about sustainability, we’re including the health of our communities and the security of our environment as a part of that mission. Due to their innovative spirit and principled sense of commitment, our members are well-placed to lead transformations in their home institutions and to participate in the development of LITA. As we weigh all these changes, we value the achievements of our association and its past leaders and members, and seek to honor them by making sure those successes carry on for our future colleagues.

11631 ---- Letter from the Editor (September 2019)

Kenneth J. Varnum

https://doi.org/10.6017/ital.v38i3.11631

Editorial Board Changes

Thanks to the dozens of LITA members who applied to join the Board this spring. The large number of interested volunteers made the selection process challenging. I’m pleased to welcome six new members to the ITAL Editorial Board for two-year terms (2019-2021):

• Lori Ayre (Independent Technology Consultant)
• Jon Goddard (North Shore Public Library)
• Soo-yeon Hwang (Sam Houston State University)
• Holli Kubly (Syracuse University)
• Brady Lund (Emporia State University)
• Paul Swanson (Minitex)

In This Issue

Welcome to LITA’s new President, Emily Morton-Owens. In her inaugural President’s Message, “Sustaining LITA,” Morton-Owens discusses the many ways LITA strives to provide a sustainable organization for its members. We also have the next edition of our “Public Libraries Leading the Way” column. This quarter’s essay is by Thomas Lamanna, “On Educating Patrons on Privacy and Maximizing Library Resources.” Joining those essays are six excellent peer-reviewed articles:

• “Library-Authored Web Content and the Need for Content Strategy,” by Courtney McDonald and Heidi Burkhardt
• “Use of Language-Learning Apps as a Tool for Foreign Language Acquisition by Academic Libraries Employees,” by Kathia Ibacache
• “Is Creative Commons A Panacea for Managing Digital Humanities Intellectual Property Rights?,” by Yi Ding
• “Am I on the Library Website?,” by Suzanna Conrad and Christy Stevens
• “Assessing the Effectiveness of Open Access Finding Tools,” by Teresa Auch Schultz, Elena Azadbakht, Jonathan Bull, Rosalind Bucy, and Jeremy Floyd
• “Creating and Deploying USB Port Covers at Hudson County Community College,” by Lotta Sanchez and John DeLooper

Call for PLLW Contributions

If you work at a public library, you’re invited to submit a proposal for a column in our “Public Libraries Leading the Way” series for 2020. Our series has gotten off to a strong start with essays by Thomas Finley, Jeffrey Davis, and Thomas Lamanna. If you would like to add your voice, please submit a proposal through this Google form.

Kenneth J. Varnum, Editor
varnum@umich.edu
September 2019
11723 ---- ARTICLES

Using Augmented and Virtual Reality in Information Literacy Instruction to Reduce Library Anxiety in Nontraditional and International Students

Angela Sample

https://doi.org/10.6017/ital.v39i1.11723

Dr. Angela Sample (asample@oru.edu) is Head of Access Services, Oral Roberts University.

ABSTRACT

Throughout its early years, the Oral Roberts University (ORU) Library held a place of pre-eminence on campus. ORU’s founder envisioned the Library as central to all academic function and scholarship. Under the direction of the founding dean of learning resources, the Library was an early pioneer in innovative technologies and methods. However, over time, as is the case with many academic libraries, the Library’s reputation as an institution crucial to the academic work on campus had diminished. A team of librarians is now engaged in programs aimed at repositioning the Library as the university’s hub of learning. Toward that goal, the Library has long taught information literacy (IL) to students and faculty through several traditional methods, including one-shot workshops and sessions tied to specific courses of study. Now, in conjunction with disseminating augmented, virtual, and mixed reality (AVMR) learning technologies, the Library is redesigning instruction to align with various realities of higher education today, including uses of AVMR in instruction and research and following best practices from research into serving (1) online learners, (2) international learners not accustomed to Western higher-education practices, and (3) learners returning to university study after being away from higher education for some time or having changed disciplines of study. The Library is developing innovative online tutorials, targeted at nontraditional and international graduate students, that use various combinations of AVMR with the goal of diminishing library anxiety. Numerous library and information science studies have shown a correlation between library anxiety and reduced library use, and library use has been linked to student learning, academic success, and retention.1 This paper focuses on IL instruction methods under development by the Library. Current indicators are encouraging as the Library embarks on the redesign of IL instruction and the early incorporation of AVMR into IL instruction for nontraditional and international students.

LITERATURE REVIEW

The patron approaches the reference desk with eyes downcast.
In a voice so soft that it is barely above a whisper, the patron mumbles, “Is this where I can get help with research?”

Some variation on the above scenario has long been familiar to academic reference librarians. In 1986, Mellon put a name to this nervousness of patrons; she called it library anxiety.2 Since then, librarians have implemented various measures to help put patrons at ease and minimize their library anxiety. Scholars have studied many of these measures, both to determine the efficacy of such interventions and to better understand the causes of library anxiety. This paper describes one library’s intervention: a virtual-reality tour that allows patrons to learn about some of the services available at the library prior to their initial visit, in an attempt to reduce some aspects of their library anxiety.

LIBRARY ANXIETY

Library and information science (LIS) researchers have long recognized that anxiety related to libraries and research can have a detrimental effect on students. Mizrachi described library anxiety as

the feeling of being overwhelmed, intimidated, nervous, uncertain, or confused when using or contemplating use of the library and its resources to satisfy an information need. It is a state-based anxiety that can result in misconceptions or misapplications of library resources, procrastination, and avoidance of library tasks.3

Since Mellon’s theoretical framing of library anxiety in 1986, researchers have studied a number of library-related anxieties, including research anxiety, information literacy anxiety, library technophobia, and computer anxiety. Various studies have focused on different groups of students—freshmen, nontraditional students, and international students, to name a few—who may experience higher levels of library anxiety. Another area of interest to researchers has been the efficacy of various measures aimed at reducing students’ library anxiety.

Causes and Factors

Researchers have found several causes of library anxiety. In her seminal article, Mellon used a grounded theory approach to understand and “describe students’ fear of the library as library anxiety.”4 Mellon noted most of the students in her study described their feelings as being lost in the library, which she stated “stemmed from four causes: (1) the size of the library; (2) a lack of knowledge about where things were located; (3) how to begin; and (4) what to do.”5 Head and Eisenberg also found a majority of students (84 percent) had difficulty knowing where to begin.6 Bostick and later Jiao and Onwuegbuzie named “five general antecedents of library anxiety . . . namely, barriers with staff, affective barriers, comfort with the library, knowledge of the library, and mechanical barriers.”7 Barriers with staff are the feelings students have regarding the accessibility and approachability of library staff.8 Affective barriers are students’ self-perceptions of their competence in using the library and library resources.
Affective barriers arise from feelings of inadequacy and can be heightened by the perception that others possess library skills that one alone does not.9 Comfort with the library deals with the student’s perception of the library as a “safe and comforting environment.”10 Knowledge of the library is students’ knowledge of “where things are located and how to find their way around in the building.”11 Mechanical barriers refer to students’ perception of the reliability of machines in the library (e.g., copiers, printers, and computers).12

Researchers investigating the information-seeking behavior of students have identified stages at which library anxiety occurs. In her work, Kuhlthau identified six stages of information seeking in which students may experience library anxiety: task initiation, topic selection, prefocus exploration, focus formulation, information collection, and search closure.13 In presenting her theoretical model of the academic information search process (AISP) of undergraduate Millennial students (figure 1), Blundell described the varying levels of anxiety students may feel throughout this process depending upon their success at finding needed information.14 Anxiety at Stage 2: Development/Refinement “ranges from mild to extreme, depending on the success of the student’s AISP in finding information he/she believes is appropriate for addressing the academic need.”15 At Stage 3, “Based on information located through the AISP in Stages 1 & 2, [the] student either fulfills [the] academic need with minimal anxiety, refocuses AISP with mid to high-level anxiety, or abandons the academic need completely with high/extreme levels of anxiety.”16

Figure 1. Blundell AISP Model.17

Although Blundell studied undergraduate Millennial students’ information-seeking behaviors, the same behaviors may also be descriptive of other groups of students. Blundell omitted anxiety at or prior to Stage 1, when the assignment is received by the student. One reason for this omission may be a seemingly paradoxical finding by many researchers regarding students’ inflated belief in their research skills as compared to their actual level of information literacy (IL) skills.18 Students with a high self-assessment of their IL skills may feel confident at the onset of research, experiencing anxiety only when encountering low success rates while searching for information or when experiencing information overload. However, many other students may experience anxiety at the onset of receiving an assignment, particularly on a topic in which they have little or no knowledge.
Indeed, many of the causes of library anxiety described from Mellon’s and later Jiao’s and Onwuegbuzie’s work can be positioned throughout all six of Kuhlthau’s and all three of Blundell’s stages of information seeking and could explain some of the potential steps Blundell noted in her model. Negative Effects In addition to the obvious discomfort students might feel, library anxiety, as with other forms of anxiety, can have a detrimental effect on students’ academic performance. As Mellon noted, “Students become so anxious about having to gather information in a library for their research paper that they are unable to approach the problem logically or effectively.”19 The findings from Jiao’s and Onwuegbuzie’s numerous studies support the negative effect library anxiety can have on students’ academic performance in various ways, including research performance, research proposal writing, and study habits.20 Research has also shown the link between higher levels of library anxiety and avoidance of the library.21 Avoidance of the library could hinder students’ academic performance or retention; studies have linked library use to higher GPAs and increased retention rates.22 Other negative effects of library anxiety include the reluctance of students to ask for help from a librarian and the tendency to procrastinate until it is too late to do well on assignments. When library anxiety is at a level high enough to cause students to enter a panic mode, logical thinking, the ability to apply existing skills, and building or acquiring new skills can be impaired. At-Risk Student Groups Acknowledging the negative effects library anxiety can have on students’ academic performance, several studies have looked to determine whether particular demographic groups of students experience library anxiety at higher rates and what factors or causes may be most prevalent in the causes of library anxiety for a particular group. In one study conducted by Jiao, Onwuegbuzie, and Lichtenstein, students who fell into the following groups tended to have the highest levels of library anxiety: “male, undergraduate, not speak English as their native language, have high levels of academic achievement, be employed either part- or full-time, and visit the library infrequently.”23 Some studies have focused on learning more about the library anxiety of a particular group. Some of the groups investigated include graduate, international, and nontraditional students. Still others have focused on possible racial differences in the prevalence of library anxiety. Although a few studies have found library anxiety to be higher for undergraduate students than graduate students, one of the most often-studied groups at risk for library anxiety has been graduate students.24 These researchers have looked at a number of factors in relation to graduate students’ library anxiety. In an early study, they found graduate students with the preferred learning style of visual learners tend to have higher levels of library anxiety. 25 In another study of graduate students, they examined the relation between library anxiety and trait anxiety, defined as “the relative stable proneness within each person to react to situations seen as stressful. ”26 Jiao and Onwuegbuzie, together with Bostick, investigated the potential relationship between race and library anxiety in 2004, which study they replicated in 2006. 
In both, the researchers found INFORMATION TECHNOLOGY AND LIBRARIES MARCH 2020 USING AUGMENTED AND VIRTUAL REALITY IN INFORMATION LITERACY INSTRUCTION | SAMPLE 5 Caucasian American graduate students reported higher levels of library anxiety than their African American counterparts.27 Another group frequently examined in library anxiety studies is international students. Mizrachi noted “studies involving international students in American universities consistently show their levels of library anxiety to be much higher than their American peers.”28 Onwuegbuzie and Jiao found international ESL students “had higher levels of library anxiety associated with ‘barriers with staff,’ ‘affective barriers,’ and ‘mechanical barriers,’ and lower levels of library anxiety associated with ‘knowledge of the library’ than did native English speakers.”29 Later, Jiao and Onwuegbuzie found the most prevalent causes of library anxiety for international students were mechanical barriers (library technology) as the greatest source, followed by affective barriers. 30 In the more recent pilot study by Lu and Adkins, the greatest barriers for international students were affective and staff barriers, while mechanical barriers, such as technologies, were no longer a significant cause of anxiety for most.31 Collins and Veal found adult learners in their study had the highest degree of library anxiety pertaining to affective barriers. 32 In their study, Kwon, Onwuegbuzie, and Alexander revealed graduate students who had higher levels of library anxiety resulting from affective barriers and knowledge of the library had weaker critical-thinking skills, lower self-confidence, less inquisitiveness, and reduced systematicity (“less disposed toward organized, logical, focused, and attentive inquiry”).33 Kwon found similar results in undergraduate students.34 Interventions Recognizing the multiple causes and multidimensional aspects of library anxiety, librarians have devised a number of interventions aimed at addressing one or more of its causes. Some of the means to address barriers with staff have focused on outreach, engaging library instruction, online presence, and other similar efforts to reach students and provide needed support for students’ research. Librarians have used information literacy instruction (ILI), reference desk consultations, and print and online guides to address library anxiety stemming from affective barriers, knowledge of the library, and even the mechanical barriers arising from lack of technology skills. A common intervention is ILI, which several studies have found to have some success in reducing students’ library anxiety. Bell explored students’ levels of library anxiety before and after a one- credit IL course.35 Platt and Platt examined the efficacy of two 50-minute ILI sessions, required of students enrolled in the Research Methods in Psychology course, in reducing library anxiety, which found “the greatest changes . . . were related primarily to knowledge of what resources are available in the library and how to access them.”36 In contrast to the typical one-session IL class, Fleming-May, Mays, and Radom investigated and found a three-workshop instruction model correlated with students’ increased confidence in using the library and lessening library anxiety. 
Notwithstanding the benefits of library instruction sessions in relieving library anxiety, Pellegrino found students were far more likely to ask a librarian for help when their instructor, rather than a librarian, encouraged or required them to do so.38

By familiarizing students with the location and arrangement of library services in the building, library orientations have been found to help relieve library anxiety.39 Library orientations primarily aim to address one of the causes of library anxiety: a lack of knowledge of the library. These orientations often introduce students to various library staff, which may also help with the dimension of library anxiety due to barriers with staff.

Other interventions have been attempted with some success. Martin and Park found students were more apt to request assistance from the librarian if persuaded the consultation would save time.40 McDaniel found in a study of graduate students that the use of peer mentors was effective in reducing affective barriers.41 Robbins discussed the use of library events to help ease students’ anxiety, but found in the follow-up survey that many students were unaware of the events.42 DiPrince et al. discussed ways the use of a print guide can help alleviate library anxiety.43

ORU LIBRARY

Oral Roberts University

The ORU Library serves the students, faculty, and staff of Oral Roberts University (ORU). ORU is a small, private, not-for-profit, liberal arts college located in Tulsa, Oklahoma. Founded in 1963 by Oral Roberts, the university enrolls approximately 3,600 students. ORU is an interdenominational Christian institution focused on a whole-person education of spirit, mind, and body. ORU offers more than 150 majors, minors, and pre-professional programs in degree fields including business, biology, engineering, nursing, and ministry.44

History

“The first building will be the Library which is the core of the whole academic structure.”45 —Oral Roberts (1962)

From the founding of ORU, founder Oral Roberts had a vision of the library’s centrality to academics.46 This set an early precedent for the library’s importance to the academic work of ORU’s students and faculty. Expanding on the traditional view of the academic library as mainly a repository of books and articles, and through the vision of early library administrators, the ORU Library emerged as an early adopter of electronic technology with the DAIRS (Dial Access Information Retrieval System) computer.47

Throughout the years, due to a number of factors, the ORU Library receded from its place of pre-eminence in academics on campus. Library practices followed the general trend of academic libraries: the Library continued to acquire needed materials (e.g., books, journals, and access to databases), and library instruction likewise kept up with current models of instruction. The typical method of instruction for undergraduates has been teaching one or two sessions to a class at the request of the instructor. Largely through the efforts of the instruction librarian, IL became a required component of undergraduate education at ORU. With rare exceptions, undergraduate students at ORU are required, as a part of Comp 102: Composition II, to attend two sessions of an IL course.
Other forms of ILI include workshops and sessions for undergraduates working on their senior papers and other sessions for graduate and postgraduate students, all typically at the request of course instructors. With the new addition of augmented, virtual, and mixed reality (AVMR) learning technologies, and at the behest of their dean, ORU librarians have begun to look at ways to incorporate these technologies into their classes and daily work. Several ORU instructors are using AVMR technologies in their classes.48 To help prepare students for the use of these technologies in their classes, one ORU instruction librarian has begun to introduce students to AVMR technologies. Other ORU instruction librarians are exploring ways to use AVMR technologies to create visualizations of library and research concepts, such as a 3D visualization of how Boolean logic works in database searches. ORU instruction librarians are also exploring ways to incorporate AVMR technologies into a new program of online ILI. Although still in the very early stages of planning, the proposed online ILI will include a virtual tour of the library. This paper focuses on the implementation of, and early feedback from, a formative assessment of a virtual tour of the ORU Library.

ORU Modular Students

In addition to traditional 15-week semesters, two colleges at ORU offer graduate modular programs: the College of Education and the College of Theology and Ministry. Many of the students who enroll in these programs are nontraditional students who are returning to college after some time away. Several of these students work full-time jobs and have family obligations in addition to their academic work. Often, these students are not local to the Tulsa campus; several are US students who live out of state, and many others are international students. The modular classes offered by both programs can be a hybrid of online and on-campus formats. The College of Theology and Ministry offers one-week courses on campus; the College of Education offers two-and-a-half-day on-campus classes. Modular classes are intensive due to the compressed nature of the curriculum. Often, modular students are visiting campus for the first time and, in addition to locating their classes, are very busy with coursework. Adding to these pressures, modular students may be using computer technologies in new ways. Navigating the Library’s resources is yet another stressor for many of these students. Students who are not familiar with the operations of an academic library may not be aware of Library services or how to access them.

The Project

In January 2017, the Global Learning Center opened on the campus of ORU. One hallmark of this renovated structure is the integration of AVMR technologies.49 Although several professors from various disciplines and colleges have implemented AVMR in their curricula, students’ use of the facilities was somewhat lower than had been hoped. In the Fall 2018 semester, the idea of creating a virtual tour of the ORU Library arose from a conversation between the author and a colleague, Dan Eller. Eller described an online ILI course he envisioned for ORU’s graduate Theology modular students. As a part of this course, he envisioned a virtual tour that could help students by reducing their library anxiety.
Early in 2019, ORU’s associate vice president of technology and innovation, Michael Mathews, contacted Dr. Mark Roberts, dean of learning resources (of which the ORU Library is a part), to propose making AVMR learning technologies available through the ORU Library. Dean Roberts agreed and created an AVMR team of library faculty to oversee this project. In the Spring 2019 semester, the ORU IT department sent one of its employees, Stephen Guzman, to work with the Library’s AVMR team to set up an AVMR station and to help make these new technologies available and known to ORU students. In addition to the other AVMR projects Guzman helped the Library’s AVMR team begin, he volunteered to take the 360 images when he learned of the library’s desire to create a virtual tour. Guzman also helped in the selection of editing software, 3DVista, for which the Library acquired a license.

Working with the 360 images Guzman took and stitched together, the author used 3DVista to create a virtual tour of the Library. This software allows elements to be added to the 360 images that make up the virtual tour, enhancing the viewer’s experience and providing information and hyperlinks to external webpages. Some of the elements added to the ORU Library Virtual Tour are hotspots that enable a viewer to move from one area to another, icons that present pop-up windows with more information, and other icons that link to the online profiles of various library faculty. Throughout the tour, icons are used consistently for the same functions. For example, icons with arrows allow the viewer to move from one location to another (figure 2), while icons with question marks displayed over Library personnel (figure 3) open the person’s profile webpage when clicked. Icons that contain the letter “i” present pop-up windows with information and related links. The tour begins from outside the building so new visitors will be able to recognize the building when they arrive on campus (figure 4). Viewers can navigate through subsequent 360 images by clicking on the arrow icons, so that the viewer virtually travels the same path they will follow to enter the library when on campus. There are two other ways to navigate the tour: the viewer can click on the small icons of scenes displayed on the left side of the screen to move to another area, or click the red dots on the floor plans displayed at the upper right of the screen, which indicate the locations of the various scenes.

Figure 2. AVMR Station near the Reference Desk, ORU Library Virtual Tour.

Other elements of the tour include the small icons of scenes on the left of the screen, beneath which are the names of the various areas. The title of the current scene appears in yellow lettering, helping orient the viewer. Small floorplans located in the upper right of the screen offer additional information on the location of the area (figure 3); viewers can toggle these floorplans on and off. Another feature supplying location information is the dropdown menu for the floorplans (the dark blue bar at the upper right of the screen), which shows the floor of the building on which the area is located. In the lower right of the screen, an information icon is available with details on what behavior to expect when clicking on icons and a description of the various ways to navigate the tour.

Figure 3. Dean Mark Roberts near Alexa and the Self-Checkout station at the Circulation Desk, ORU Library Virtual Tour.

Figure 4. ORU campus, ORU Library Tour.
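Although 3DVista is configured through its visual editor rather than through code, the tour described above is, in effect, a small data model of scenes connected by hotspots. The sketch below illustrates that structure only; the class names, fields, and sample values are assumptions made for this illustration and do not reflect 3DVista’s actual project format.

from dataclasses import dataclass, field
from typing import List, Optional

# Hypothetical model of the tour's building blocks. 3DVista keeps its own
# internal project format, so this merely mirrors the concepts described
# in the text above; none of these names come from the software itself.

@dataclass
class Hotspot:
    icon: str                           # "arrow", "question", or "i"
    x: float                            # position within the 360 image (0.0-1.0)
    y: float
    target_scene: Optional[str] = None  # arrow icons: the scene to jump to
    popup_text: Optional[str] = None    # "i" icons: text for the pop-up window
    profile_url: Optional[str] = None   # question-mark icons: staff profile page

@dataclass
class Scene:
    name: str                           # shown in yellow when the scene is active
    image_file: str                     # the stitched 360 panorama
    floor: int                          # drives the floor-plan dropdown
    hotspots: List[Hotspot] = field(default_factory=list)

# The tour opens outside the building so first-time visitors will
# recognize it when they arrive on campus.
campus = Scene(
    name="ORU Campus",
    image_file="campus_360.jpg",
    floor=1,
    hotspots=[
        Hotspot(icon="arrow", x=0.52, y=0.60, target_scene="Library Entrance"),
    ],
)

Modeling the tour this way makes the navigation explicit: every arrow hotspot names the scene it leads to, so the chain of scenes can be checked against the path a visitor would actually walk from the campus entrance into the building.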
METHODOLOGY

The aim of the virtual tour of the library is to reduce several dimensions of students’ library anxiety. The primary goal of the tour is to reduce anxiety related to knowledge of the library by familiarizing students with images and information regarding the building prior to their arrival on campus. Another aim is to reduce barriers with staff, which we address by providing information along with images of library faculty. Affective barriers and mechanical barriers are two of the most prevalent causes of library anxiety, which the intervention of the tour does not directly address. The hope is, however, that by minimizing any anxiety stemming from knowledge of the building and barriers with staff, students will be encouraged to consult with librarians, particularly as information on the variety of ways to contact librarians is included in the pop-up window on the Reference Desk.

Pre- and Post-Surveys

The pre- and post-surveys administered to students included 42 statements from Bostick’s Library Anxiety Scale, a 5-point Likert survey instrument developed in 1992 that contains 43 statements. The pre-survey also contains demographic questions. The one statement omitted from Bostick’s original survey was number 40, “The change machines are always out of order,” as the ORU Library does not have change machines.50 With the exception of the demographic questions, the post-survey duplicates the pre-survey, containing the same 42 statements. Although several researchers have adapted Bostick’s Library Anxiety Scale, such as Blundell’s adaptation to add “elements related specifically to information technology (both hardware such as computers, and software such as online research databases),”51 for the purposes of this preliminary inquiry the researcher decided to use the original questions from the Library Anxiety Scale. The original statements were used because reducing library anxiety stemming from information technology use was not a goal of this study.

Administration of Survey

A link to the pre-survey was posted on the homepage of the ORU Library. The author sent email invitations containing a link to the pre-survey to students enrolled in the June 2019 summer modular Theology classes. The author met with groups of Education modular students during the week they were on campus (June 24–30, 2019) to recruit participation. In a library session, another librarian encouraged her modular students to participate in the study. At the end of the pre-survey, participants were given a unique number, with instructions to note it, for use in logging in to the post-survey. The link to the virtual tour appeared on the final screen of the pre-survey. The link to the post-survey was provided on the same page as the virtual tour, allowing participants to navigate to the post-survey when desired.
The surveys asked for no identifying information; however, the unique number provided on the pre-surveys and entered by the participants on the post-surveys allowed the researcher to link each participant’s responses to both surveys. Once the results were downloaded, each participant’s pre- and post-survey responses were coded P1 through P7 to track any potential effects of the virtual tour on participants’ responses. Because of the low rate of participation, formal statistical analyses were not applied to these findings. The results were examined in two ways. Each participant’s pre- and post-survey responses were compared to determine whether responses changed from pre- to post-survey, and the total number of responses on each point of the Likert scale to each of the 42 statements was examined to determine trends in participants’ levels of library anxiety.
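The pairing and tallying described above can be expressed compactly in code. The following sketch is a minimal illustration of the procedure, not the actual instrument or scripts used in the study; the file names, column labels, and the two sample statements with their polarity flags are assumptions for demonstration.

import csv
from collections import defaultdict

# Likert coding: 1 = strongly disagree ... 5 = strongly agree.
# For a negatively worded statement, a drop in agreement is a favorable
# change; for a positively worded one, a rise is favorable. Only two of
# Bostick's 42 statements are shown here, purely as examples.
POLARITY = {
    "I feel comfortable using the library.": +1,
    "I am unsure how to begin my research.": -1,
}

def load_responses(path):
    """Return {participant_code: {statement: rating}} from a CSV with
    columns: code, statement, rating."""
    responses = defaultdict(dict)
    with open(path, newline="") as f:
        for row in csv.DictReader(f):
            responses[row["code"]][row["statement"]] = int(row["rating"])
    return responses

def tally_changes(pre, post):
    """Count favorable and unfavorable shifts per statement across the
    participants who completed both surveys (matched by their code)."""
    tally = {stmt: {"positive": 0, "negative": 0} for stmt in POLARITY}
    for code in pre.keys() & post.keys():        # matched pairs only
        for stmt, sign in POLARITY.items():
            if stmt not in pre[code] or stmt not in post[code]:
                continue
            delta = post[code][stmt] - pre[code][stmt]
            if delta == 0:
                continue                         # no change on this item
            direction = "positive" if delta * sign > 0 else "negative"
            tally[stmt][direction] += 1
    return tally

if __name__ == "__main__":
    pre = load_responses("pre_survey.csv")       # hypothetical file names
    post = load_responses("post_survey.csv")
    for stmt, counts in tally_changes(pre, post).items():
        print(stmt, counts)

Tallying each matched pair this way yields per-statement positive and negative counts of the kind reported in tables 1 through 5 below.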
RESULTS

Although approximately 100 students enrolled in either the graduate Theology or graduate Education modular classes visited the campus June 24–30, 2019, participation in this preliminary study was extremely low. To date, only seven participants have completed both the pre- and post-surveys. The responses from this formative assessment will be used by the ORU Library to guide future iterations of the virtual Library tour and its inclusion in ILI. The following discusses initial findings from the pre- and post-surveys.

Most of the participants reported little or no discomfort or anxiety with using the Library. All participants indicated they are US citizens, and all indicated some level of familiarity with the Library. Four reported they had visited the Library often; three responded they had visited the Library previously, but not often. Of the seven participants, five indicated they are graduate students, one marked “other,” and one reported doctoral-student status. Ages of the participants varied: one was 20–29, one 30–39, two 40–49, and three were 50 years or over. The following describes the effect the virtual tour of the Library had on participants’ responses. Interestingly, one participant showed no change in responses from pre- to post-survey. Note: Bostick’s original categorization of the statements has been retained for all 42 of the statements on both instruments.

Knowledge of the Library

The principal aim of the virtual tour was to reduce library anxiety related to knowledge of the library by acquainting students with “where things are located and how to find their way around in the building.”52 Bostick categorized 5 of the 42 statements as knowledge of the library. Based on participants’ responses, there is some indication the tour did help acquaint students with the library. The changes in participants’ responses showed a greater positive trend after viewing the virtual tour, although on two statements responses showed a negative trend (table 1). Table 1 shows the questions on which participants had a change in their responses from pre- to post-survey. The number in the Positive column indicates the number of participants whose responses displayed a favorable change in perception of that statement following the virtual tour. The number in the Negative column shows the number of participants whose post-survey responses showed a negative effect of the virtual tour.

Statement | Positive | Negative
I don’t feel physically safe in the library. | 1 | 1
I enjoy learning new things about the library. | 3 | 1
I want to learn how to do my own research. | 1 |
The library is a safe place. | 2 |
The library is an important part of my school. | 2 |
Totals | 9 | 2

Table 1. Statements in the knowledge of the library category that showed change on the post-survey.

The number of strongly disagree responses in this category was unchanged from pre- to post-survey. The only statement that received any strongly disagree responses was “I don’t feel physically safe in the library,” which received five. Taken together with the two disagree responses to this statement, all the participants feel safe in the library. To the statement “The library is a safe place,” all seven participants answered either agree (five on the pre-survey, four on the post-survey) or strongly agree (two on the pre-survey, three on the post-survey) (figure 6). Curiously, responses to “I enjoy learning new things about the library” changed from no disagree responses on the pre-survey to one disagree response on the post-survey. The other shift in the number of disagree responses was on the statement “The library is an important part of my school” (two on the pre-survey, one on the post-survey), indicating a slight improvement (figure 5).

Figure 5. Comparison of strongly disagree and disagree responses in knowledge of the library category.

To the statements in this category, none of the respondents replied undecided, except to the statement “I enjoy learning new things about the library,” which received one undecided response on the pre-survey and none on the post-survey. The other change in this category was to the statement “The library is an important part of my school,” which moved from no undecided responses on the pre-survey to one on the post-survey. The respondents, for the most part, wanted to learn to do their own research, with five responses of agree or strongly agree on both the pre- and post-surveys. Five of the participants felt the library is of importance (one agree and four strongly agree). Six of the seven participants reported they enjoy learning new things about the library. The shift in responses from five agree and one strongly agree on the pre-survey to two agree and four strongly agree indicates the tour might have affected participants’ views on this statement (figure 6).

Figure 6. Comparison of strongly agree and agree responses in knowledge of the library category.

Affective Barriers

While not a direct goal of the virtual tour, the responses of participants showed the most gains on the post-survey within the category of affective barriers. This seems to indicate that viewing the virtual tour improved students’ self-perceptions of their competence in using the library and library resources. Out of the 42 statements on each of the instruments, 12 are in Bostick’s affective barriers category. The statements in table 2 are those on which participants had a change in their responses from pre- to post-survey. The numbers in the Positive column indicate the number of participant responses that improved on the post-survey. A number in the Negative column indicates the number of participants’ post-survey responses that moved in a negative direction.
Statement | Positive | Negative
A lot of the university is confusing to me. | 2 |
I am unsure how to begin my research. | 2 |
I can never find things I need in the library. | 3 |
I don’t know what resources are available in the library. | 2 |
I don’t know what to do next when the book I need is not on the shelf. | 1 |
I feel comfortable using the library. | 3 |
I get confused trying to find my way around the library. | 2 |
I’m embarrassed that I don’t know how to use the library. | 1 | 1
The directions for using the computers are not clear. | |
Totals | 17 | 1

Table 2. Statements in the affective barriers category that showed change on the post-survey.

Looking at the responses to statements in this category reveals some possible effects of the tour and some potential areas of library anxiety. The responses to “I don’t know what resources are available in the library” were split on the pre-survey, with three responses of agree, one of strongly agree, one undecided, one of strongly disagree, and two of disagree. The post-survey responses showed almost no change; the only change was one additional undecided response, with no strongly agree responses (figures 7.1, 8.1, 9.1). These findings indicate more information on what sources are available to patrons may be needed in the virtual tour.

Most of the respondents indicated confidence about where to begin research. On both pre- and post-surveys, there were five responses of strongly disagree or disagree to the statement “I am unsure how to begin my research” (figure 7.1). Most indicated they feel confident in using the library, based on the responses to the statements “I’m embarrassed that I don’t know how to use the library,” “I feel comfortable using the library,” “I can never find things in the library,” and “I get confused trying to find my way around the library” (figures 7.1, 7.2, 9.1, 9.2). Responses were equally positive to the statements “The library won’t let me check out as many items as I need,” “A lot of the university is confusing to me,” “I don’t know what to do next when the book I need is not on the shelf,” and “I can’t find enough space in the library to study” (figures 7.1, 7.2, 8.1, 8.2, 9.1, 9.2). Responses were divided on the statement “I feel like I’m bothering the reference librarian if I ask a question” (figures 8.2, 10.2). This finding needs further research to determine what causes students’ reluctance to ask the librarian for assistance.

Figure 7.1. Comparison of strongly disagree and disagree responses in affective barriers category.

Figure 7.2. Comparison of strongly disagree and disagree responses in affective barriers category.

Figure 8.1. Comparison of undecided responses in affective barriers category.

Figure 8.2. Comparison of undecided responses in affective barriers category.

Figure 9.1. Comparison of strongly agree and agree responses in affective barriers category.

Figure 9.2. Comparison of strongly agree and agree responses in affective barriers category.
Mechanical Barriers
Although not a goal of the study, there was positive change in participants’ responses to both statements in the category of mechanical barriers. It is unclear how the virtual tour might have caused the improvement in participants’ perception of the reliability of machines in the library.

Statement | Positive | Negative
The computer printers are often out of paper. | 1 |
The copy machines are usually out of order. | 1 |
Totals | 2 |

Table 3. Statements in the mechanical barriers category that showed change on the post-survey.

In this category, on both the pre- and post-surveys, there was one strongly disagree response to both statements. No respondents replied agree or strongly agree to the statements in this category. Responses of disagree to both statements increased from one on the pre-survey to two on the post-survey. The number of undecided responses fell from five to four on the post-survey. As noted above, it is not clear what caused the change in responses.

Barriers with Staff
A secondary goal of the tour was to reduce barriers with staff, and thus library anxiety, by providing information about and images of library faculty. In doing so, this study sought to reduce the anxiety students may have regarding the accessibility and approachability of library staff. In this category, responses showed some positive effects of the virtual tour on how participants viewed library staff. However, the responses of participants exhibited the most variability in this category, with almost an equal number of responses being positive or negative after viewing the tour. The reasons for this variance are unclear. In future studies, additional space for comments will be included on the surveys, as well as possible follow-up focus-group discussions, to determine the causes of negative trends in responses.

Table 4 shows the statements within this category on which participants had a change in their responses from pre- to post-survey. The number in the positive column indicates how many participant responses changed in a favorable direction on the post-survey. The number in the negative column indicates the number of participants whose post-survey responses moved in a negative direction. On the survey instruments, 12 of the 15 statements categorized as Bostick’s barriers with staff showed changes in responses.

Statement | Positive | Negative
I can always ask a librarian if I don’t know how to work a piece of equipment in the library. | 1 |
I can’t get help in the library at the times I need it. | 1 | 1
If I can’t find a book on the shelf the library staff will help me. | 2 | 2
Library staff don’t have time to help me. | 1 | 1
The librarians are unapproachable. | 2 |
The library is a comfortable place to study. | 2 |
The library staff doesn’t care about students. | 1 | 3
The library staff doesn’t listen to students. | 1 |
The reference librarians are not approachable. | 2 |
The reference librarians are unhelpful. | 2 |
The reference librarians don’t have time to help me because they’re always busy doing something else. | 1 | 1
There is often no one available in the library to help me. | 2 | 1
Totals | 15 | 12

Table 4. Statements in the barriers with staff category that showed change on the post-survey.

The findings in this category, overall, were favorable.
Most participants feel the librarians and library staff care and are responsive and available to students. Pre-survey responses indicated one or two of the participants felt librarians are unapproachable or unhelpful. Post-survey responses reflected a positive change in participants’ views on librarians’ approachability and helpfulness. Participants also reported the library to be a comfortable study location and that the rules are reasonable (figures 10.1, 10.2, 11.1, 11.2, 12.1, 12.2).

Figure 10.1. Comparison of strongly disagree and disagree responses in barriers with staff category.

Figure 10.2. Comparison of strongly disagree and disagree responses in barriers with staff category.

Figure 11.1. Comparison of undecided responses in barriers with staff category.

Figure 11.2. Comparison of undecided responses in barriers with staff category.

Figure 12.1. Comparison of strongly agree and agree responses in barriers with staff category.

Figure 12.2. Comparison of strongly agree and agree responses in barriers with staff category.

Comfort with the Library
According to Collins and Veal, comfort with the library reflects students’ perceptions of the library as a “safe and comforting environment.”53 Out of the 42 statements, Bostick placed 8 within this category, all of which showed some change in responses from pre-survey to post-survey. The changes reflected in this category were positive, but it is unclear how the virtual tour might have influenced participants’ perceptions on statements such as “There is too much crime in the library” or “Good instructions for using the library’s computers are available.” Further investigation is needed to determine what may account for changes in perception on statements such as these. Table 5 depicts the changes, both positive and negative, in participants’ responses on the statements in this category.

Statement | Positive | Negative
Good instructions for using the library’s computers are available. | 2 |
I don’t understand the library’s overdue fines. | 1 | 2
I feel comfortable in the library. | 2 |
I feel safe in the library. | 2 | 1
The library never has the materials I need. | 1 |
The people who work at the circulation desk are helpful. | 3 | 1
The reference librarians are unfriendly. | 1 |
There is too much crime in the library. | 1 | 2
Totals | 12 | 7

Table 5. Statements in the comfort with the library category that showed change on the post-survey.

The following bar graphs compare the responses on the pre-surveys to the post-survey responses within this category. As with other categories, responses were mostly favorable in this category (figures 13, 14, 15).

Figure 13. Comparison of strongly disagree and disagree responses in comfort with the library category.

Figure 14. Comparison of undecided responses in comfort with the library category.

Figure 15. Comparison of strongly agree and agree responses in comfort with the library category.
CONCLUSION
The ORU Library has found the virtual tour to be of use in familiarizing students with the library. Students who viewed the tour during its creation commented anecdotally that they wished such a tour had been available when they began college and noted the assistance that the tour will provide to new students.

A limitation of this study is its low participation, including no participation from students in some of the groups that other studies have shown may have higher levels of library anxiety (e.g., new students, international students). However, given the indications of positive effects of the virtual tour from our study results and anecdotal statements, we are encouraged that this tool will assist our students in reducing library anxiety, with the result that they will visit and use the library more often, to their benefit. Again, although participation was low, these results have also encouraged ORU librarians to seek other ways to include AVMR and other innovative technologies in our instruction, outreach, and services.

The 360 virtual tour of the Library is undergoing updates and additions to provide students with disabilities with information on access points and accessible restrooms. Other projects underway include incorporating AVMR in IL sessions, the addition of a digital sandbox with various technologies and equipment including a VR station, and the addition of VR equipment in our designated faculty research room for use by university faculty to learn and teach students how to use AVMR technologies. The response from students and faculty to these new services has been enthusiastic, encouraging us that the ORU Library is positively influencing and supporting the academic work of ORU faculty and students.

RECOMMENDED READING
Varnum, Kenneth J. Beyond Reality: Augmented, Virtual, and Mixed Reality in the Library. Chicago: ALA Editions, 2019.

Elliott, Christine, Marie Rose, and Jolanda-Pieta van Arnhem. Augmented and Virtual Reality in Libraries. Lanham, MD: Rowman & Littlefield, 2018.

ENDNOTES
1 Anthony J. Onwuegbuzie and Qun G. Jiao, “Information Search Performance and Research Achievement: An Empirical Test of the Anxiety-Expectation Mediation Model of Library Anxiety,” Journal of the American Society for Information Science & Technology 55, no. 1 (2004): 41–54, https://doi.org/10.1002/asi.10342; Qun G. Jiao and Anthony J. Onwuegbuzie, “Is Library Anxiety Important?,” Library Review 48, no. 6 (1999), https://doi.org/10.1108/00242539910283732; Qun G. Jiao and Anthony J. Onwuegbuzie, Library Anxiety: The Role of Study Habits (paper presented at the Annual Meeting of the Mid-South Educational Research Association (MSERA), Bowling Green, Kentucky, November 15–17, 2000), http://files.eric.ed.gov/fulltext/ED448781.pdf.
2 Constance A. Mellon, “Library Anxiety: A Grounded Theory and Its Development,” College & Research Libraries 47, no. 2 (1986), https://doi.org/10.5860/crl_47_02_160; see also Constance A. Mellon, “Library Anxiety: A Grounded Theory and Its Development,” College & Research Libraries 76, no. 3 (2015), https://doi.org/10.5860/crl.76.3.276.
3 Diane Mizrachi, “Library Anxiety,” Encyclopedia of Library and Information Sciences (Boca Raton, FL: CRC Press, 2017): 2782.
4 Mellon, “Library Anxiety” (1986): 163; see also Mellon, “Library Anxiety” (2015): 280.
5 Mellon, “Library Anxiety” (1986): 162; see also Mellon, “Library Anxiety” (2015): 278.
6 Alison J. Head and Michael B. Eisenberg, Truth Be Told: How College Students Evaluate and Use Information in the Digital Age: Project Information Literacy Progress Report (University of Washington's Information School, 2010): 3.
7 Sharon Lee Bostick, “The Development and Validation of the Library Anxiety Scale” (PhD diss., Wayne State University, 1992); Qun G. Jiao and Anthony J. Onwuegbuzie, “Antecedents of Library Anxiety,” Library Quarterly 67, no. 4 (1997): 72, https://doi.org/10.1086/629972.
8 Jiao and Onwuegbuzie, “Antecedents of Library Anxiety.”
9 Mellon, “Library Anxiety” (1986); see also Mellon, “Library Anxiety” (2015); see also Constance A. Mellon, “Attitudes: The Forgotten Dimension in Library Instruction,” Library Journal 113, no. 14 (1988).
10 Kathleen M. T. Collins and Robin E. Veal, “Off-Campus Adult Learners’ Levels of Library Anxiety as a Predictor of Attitudes Toward the Internet,” Library & Information Science Research 26, no. 1 (2004): 4, https://doi.org/10.1016/j.lisr.2003.11.002.
11 Mizrachi, “Library Anxiety,” 2784.
12 Anthony J. Onwuegbuzie, “Writing a Research Proposal: The Role of Library Anxiety, Statistics Anxiety, and Composition Anxiety,” Library & Information Science Research 19, no. 1 (1997), https://doi.org/10.1016/S0740-8188(97)90003-7.
13 Carol Collier Kuhlthau, “Developing a Model of the Library Search Process: Cognitive and Affective Aspects,” Research Quarterly 28 (Winter 1988), https://www.jstor.org/stable/25828262; Carol C. Kuhlthau, “Inside the Search Process: Information Seeking from the User’s Perspective,” Journal of the American Society for Information Science 42, no. 5 (1991), https://doi.org/10.1002/(SICI)1097-4571(199106)42:5<361::AID-ASI6>3.0.CO;2-%23.
14 Shelley Blundell, “Documenting the Information-Seeking Experience of Remedial Undergraduate Students,” Proceedings from the Document Academy 1, no. 1 (2014), https://doi.org/10.35492/docam/1/1/4.
15 Blundell, “Documenting the Information-Seeking Experience,” 5.
16 Blundell, “Documenting the Information-Seeking Experience,” 6.
17 Used by permission of the author. Retrieved from http://remedialundergraduateaisp.pbworks.com/w/file/88755941/ModelRevised%20-%208.4.jpg.
18 Blundell, “Documenting the Information-Seeking Experience”; Melissa Gross and Don Latham, “Attaining Information Literacy: An Investigation of the Relationship Between Skill Level, Self-Estimates of Skill, and Library Anxiety,” Library & Information Science Research 29, no. 3 (2007), https://doi.org/10.1016/j.lisr.2007.04.012; Melissa Gross and Don Latham, “Undergraduate Perceptions of Information Literacy: Defining, Attaining, and Self-Assessing Skills,” College & Research Libraries 70, no. 4 (2009), https://doi.org/10.5860/0700336; Melissa Gross and Don Latham, “Experiences With and Perceptions of Information: A Phenomenographic Study of First-Year College Students,” Library Quarterly 81, no.
2 (2011), https://doi.org/10.1086/658867; Melissa Gross, “The Impact of Low-Level Skills on Information-Seeking Behavior: Implications of Competency Theory for Research and Practice,” Reference & User Services Quarterly (2005), https://www.jstor.org/stable/20864481.
19 Mellon, “Attitudes,” 138; Jiao and Onwuegbuzie, “Antecedents of Library Anxiety.”
20 Qun G. Jiao and Anthony J. Onwuegbuzie, “Perfectionism and Library Anxiety among Graduate Students,” Journal of Academic Librarianship 24, no. 5 (1998), https://doi.org/10.1016/S0099-1333(98)90073-8; Jiao and Onwuegbuzie, “Is Library Anxiety Important?”; Qun G. Jiao and Anthony J. Onwuegbuzie, “Library Anxiety among International Students” (paper presented at the Annual Meeting of the Mid-South Education Research Association, Point Clear, Alabama, November 17–19, 1999), https://eric.ed.gov/?id=ED437973; Qun G. Jiao and Anthony J. Onwuegbuzie, “Self-Perception and Library Anxiety: An Empirical Study,” Library Review 48, no. 3 (1999), https://doi.org/10.1108/00242539910270312; Qun G. Jiao and Anthony J. Onwuegbuzie, “Identifying Library Anxiety through Students’ Learning-Modality Preferences,” Library Quarterly 69, no. 2 (1999), https://doi.org/10.1086/603054; Qun G. Jiao and Anthony J. Onwuegbuzie, Library Anxiety: The Role of Study Habits; Qun G. Jiao and Anthony J. Onwuegbuzie, “Library Anxiety and Characteristic Strengths and Weaknesses of Graduate Students’ Study Habits,” Library Review 50, no. 2 (2001), https://doi.org/10.1108/00242530110381118; Qun G. Jiao and Anthony J. Onwuegbuzie, “Dimensions of Library Anxiety and Social Interdependence: Implications for Library Services,” Library Review 51, no. 2 (2002), https://doi.org/10.1108/00242530210418837; Qun G. Jiao and Anthony J. Onwuegbuzie, The Relationship Between Library Anxiety and Reading Ability (paper presented at the Annual Meeting of the Mid-South Educational Research Association, Chattanooga, Tennessee, November 6–8, 2002), https://eric.ed.gov/?id=ED478612; Qun G. Jiao and Anthony J. Onwuegbuzie, “Reading Ability as a Predictor of Library Anxiety,” Library Review 52, no. 4 (2003), https://doi.org/10.1108/00242530310470720; Anthony J. Onwuegbuzie and Vicki L. Waytowich, “The Relationship between Citation Errors and Library Anxiety: An Empirical Study of Doctoral Students in Education,” Information Processing & Management 44, no. 2 (2008), https://doi.org/10.1016/j.ipm.2007.05.007; Onwuegbuzie, “Writing a Research Proposal”; Anthony J. Onwuegbuzie and Qun G. Jiao, “I’ll Go to the Library Later: The Relationship between Academic Procrastination and Library Anxiety,” College & Research Libraries 61, no.
1 (2000), https://doi.org/10.5860/crl.61.1.45; Onwuegbuzie and Jiao, “Information Search Performance and Research Achievement”; Anthony J. Onwuegbuzie, Qun G. Jiao, and Sharon L. Bostick, Library Anxiety: Theory, Research, and Applications, vol. 1 (Lanham, MD: Scarecrow Press, 2004).
21 Jiao and Onwuegbuzie, “Identifying Library Anxiety”; Qun G. Jiao, Anthony J. Onwuegbuzie, and Art A. Lichtenstein, “Library Anxiety: Characteristics of ‘At-Risk’ College Students,” Library & Information Science Research 18, no. 2 (1996), https://doi.org/10.1016/S0740-8188(96)90017-1; Nahyun Kwon, “A Mixed-Methods Investigation of the Relationship between Critical Thinking and Library Anxiety among Undergraduate Students in Their Information Search Process,” College & Research Libraries 69, no. 2 (2008), https://doi.org/10.5860/crl.69.2.117; Mellon, “Attitudes.”
22 Gaby Haddow, “Academic Library Use and Student Retention: A Quantitative Analysis,” Library & Information Science Research 35, no. 2 (2013), https://doi.org/10.1016/j.lisr.2012.12.002; Adam Murray, Ashley Ireland, and Jana Hackathorn, “The Value of Academic Libraries: Library Services as a Predictor of Student Retention,” College & Research Libraries 77, no. 5 (2016), https://doi.org/10.5860/crl.77.5.631; Krista M. Soria, “Factors Predicting the Importance of Libraries and Research Activities for Undergraduates,” Journal of Academic Librarianship 39, no. 6 (2013), https://doi.org/10.1016/j.acalib.2013.08.017; Krista M. Soria, Jan Fransen, and Shane Nackerud, “Library Use and Undergraduate Student Outcomes: New Evidence for Students’ Retention and Academic Success,” portal: Libraries and the Academy 13, no. 2 (2013), https://doi.org/10.1353/pla.2013.0010; Krista M. Soria, Jan Fransen, and Shane Nackerud, “Stacks, Serials, Search Engines, and Students’ Success: First-Year Undergraduate Students’ Library Use, Academic Achievement, and Retention,” Journal of Academic Librarianship 40, no. 1 (2014), https://doi.org/10.1016/j.acalib.2013.12.002; Krista M. Soria, Jan Fransen, and Shane Nackerud, “Beyond Books: The Extended Academic Benefits of Library Use for First-Year College Students,” College & Research Libraries 78, no. 1 (2017), https://doi.org/10.5860/crl.78.1.8.
23 Jiao, Onwuegbuzie, and Lichtenstein, “Library Anxiety,” 1.
24 Jiao and Onwuegbuzie, “Identifying Library Anxiety”; see also Bostick, “The Development and Validation”; Barbara Fister, Julie Gilbert, and Amy Ray Fry, “Aggregated Interdisciplinary Databases and the Needs of Undergraduate Researchers,” portal: Libraries and the Academy 8, no.
3 (2008), https://doi.org/10.1353/pla.0.0003; Mellon, “Library Anxiety”; Jiao and Onwuegbuzie, “Perfectionism and Library Anxiety among Graduate Students”; Jiao and Onwuegbuzie, “Is Library Anxiety Important?”; Jiao and Onwuegbuzie, “Library Anxiety among International Students”; Jiao and Onwuegbuzie, “Self-Perception and Library Anxiety: An Empirical Study”; Jiao and Onwuegbuzie, “Identifying Library Anxiety through Students’ Learning-Modality Preferences”; Jiao and Onwuegbuzie, Library Anxiety: The Role of Study Habits; Jiao and Onwuegbuzie, “Library Anxiety and Characteristic Strengths and Weaknesses of Graduate Students’ Study Habits”; Jiao and Onwuegbuzie, “Dimensions of Library Anxiety and Social Interdependence”; Jiao and Onwuegbuzie, The Relationship Between Library Anxiety and Reading Ability; Jiao and Onwuegbuzie, “Reading Ability as a Predictor of Library Anxiety”; Onwuegbuzie and Waytowich, “The Relationship between Citation Errors and Library Anxiety”; Onwuegbuzie, “Writing a Research Proposal”; Onwuegbuzie and Jiao, “I'll Go to the Library Later”; Onwuegbuzie and Jiao, “Information Search Performance and Research Achievement”; Onwuegbuzie, Jiao, and Bostick, Library Anxiety: Theory, Research, and Applications.
25 Onwuegbuzie and Jiao, “The Relationship”; Anthony Onwuegbuzie and Qun G. Jiao, “Understanding Library-Anxious Graduate Students,” Library Review 47, no. 4 (1998), https://doi.org/10.1108/00242539810212812.
26 Jiao and Onwuegbuzie, “Is Library Anxiety Important?”
27 Qun G. Jiao, Anthony J. Onwuegbuzie, and Sharon L. Bostick, “Racial Differences in Library Anxiety among Graduate Students,” Library Review 53, no. 4 (2004), https://doi.org/10.1108/00242530410531857; Qun G. Jiao, Anthony J. Onwuegbuzie, and Sharon L. Bostick, “The Relationship Between Race and Library Anxiety among Graduate Students: A Replication Study,” Information Processing & Management 42, no. 3 (2006), https://doi.org/10.1016/j.ipm.2005.03.018.
28 Mizrachi, “Library Anxiety,” 2784.
29 Anthony J. Onwuegbuzie and Qun G. Jiao, “Academic Library Usage: A Comparison of Native and Non-Native English-Speaking Students,” Australian Library Journal 46, no. 3 (1997): 263, https://doi.org/10.1080/00049670.1997.10755807; Jiao and Onwuegbuzie, “Antecedents of Library Anxiety.”
30 Jiao and Onwuegbuzie, “Library Anxiety among International Students.”
31 Yunhui Lu and Denice Adkins, “Library Anxiety among International Graduate Students,” Proceedings of the American Society for Information Science and Technology 49, no. 1 (2012), https://doi.org/10.1002/meet.14504901319.
32 Collins and Veal, “Off-Campus Adult.”
33 Nahyun Kwon, Anthony J. Onwuegbuzie, and Linda Alexander, “Critical Thinking Disposition and Library Anxiety: Affective Domains on the Space of Information Seeking and Use in Academic Libraries,” College & Research Libraries 68, no. 3 (2007): 276, https://doi.org/10.5860/crl.68.3.268.
34 Kwon, “A Mixed-Methods Investigation.”
35 Judy Carol Bell, “Student Affect Regarding Library-Based and Web-Based Research Before and After an Information Literacy Course,” Journal of Librarianship & Information Science 43, no. 2 (2011), https://doi.org/10.1177/0961000610383634.
36 Jessica Platt and Tyson L. Platt, “Library Anxiety among Undergraduates Enrolled in a Research Methods in Psychology Course,” Behavioral & Social Sciences Librarian 32, no. 4 (2013): 248, https://doi.org/10.1080/01639269.2013.841464.
37 Rachel A. Fleming-May, Regina Mays, and Rachel Radom, “‘I Never Had to Use the Library in High School’: A Library Instruction Program for At-Risk Students,” portal: Libraries and the Academy 15, no. 3 (2015), https://doi.org/10.1353/pla.2015.0038.
38 Catherine Pellegrino, “Does Telling Them to Ask for Help Work?,” Reference & User Services Quarterly 51, no. 3 (2012), https://doi.org/10.5860/rusq.51n3.272.
39 Kathy Christie Anders, Stephanie J. Graves, and Elizabeth German, “Using Student Volunteers in Library Orientations,” Practical Academic Librarianship: The International Journal of the SLA 6, no. 2 (2016): 17–30, http://hdl.handle.net/1969.1/166249.
40 Pamela N. Martin and Lezlie Park, “Reference Desk Consultation Assignment: An Exploratory Study of Students’ Perceptions of Reference Service,” Reference & User Services Quarterly 49, no. 4 (2010), https://doi.org/10.5860/rusq.49n4.333.
41 Sarah McDaniel, “Library Roles in Advancing Graduate Peer-Tutor Agency and Integrated Academic Literacies,” Reference Services Review 46, no. 2 (2018), https://doi.org/10.1108/RSR-02-2018-0017.
42 Elaine M. Robbins, “Breaking the Ice: Using Non-Traditional Methods of Student Involvement to Effect [sic] a Welcoming College Library Environment,” Southeastern Librarian 62, no. 1 (2014), https://digitalcommons.kennesaw.edu/seln/vol62/iss1/5.
43 Elizabeth DiPrince et al., “Don’t Panic!,” Reference & User Services Quarterly 55, no. 4 (2016), https://doi.org/10.5860/rusq.55n4.283.
44 Oral Roberts University, “About ORU” (2019), https://www.oru.edu/admissions/undergraduate/.
45 Oral Roberts, Our Partnership with God [sound recording], Eighth World Outreach, Oral Roberts Evangelistic Association (Tulsa, OK: Abundant Life Recordings, 1962).
46 Oral Roberts, Our Partnership.
47 Margaret M. Grubiak, “An Architecture for the Electronic Church: Oral Roberts University in Tulsa, Oklahoma,” Technology and Culture 57, no. 2 (2016), https://doi.org/10.1353/tech.2016.0066.
48 Stephanie Hill, “ORU Receives Innovation Award,” press release, May 2, 2017, http://www.oru.edu/news/oru_news/20170502-glc-innovation-award.php?locale=en.
49 Hill, “ORU Receives.”
50 Bostick, “The Development and Validation,” 160.
51 Blundell, “Documenting the Information-Seeking Experience,” 263.
52 Mizrachi, “Library Anxiety,” 2784.
53 Collins and Veal, “Off-Campus Adult,” 7.
11787 ---- User Experience Methods and Maturity in Academic Libraries ARTICLES User Experience Methods and Maturity in Academic Libraries Scott W. H. Young, Zoe Chao, and Adam Chandler INFORMATION TECHNOLOGY AND LIBRARIES | MARCH 2020 https://doi.org/10.6017/ital.v39i1.11787

Scott W. H. Young (swyoung@montana.edu) is UX and Assessment Librarian, Montana State University. Zoe Chao (chaoszuyu@gmail.com) is UX Designer, Truist Financial. Adam Chandler (alc28@cornell.edu) is Director of Automation, Assessment, and Post-Cataloging Services, Cornell University.

ABSTRACT
This article presents a mixed-methods study of the methods and maturity of user experience (UX) practice in academic libraries. The authors apply qualitative content analysis and quantitative statistical analysis to a research dataset derived from a survey of UX practitioners. Results reveal the type and extent of UX methods currently in use by practitioners in academic libraries. Themes extracted from the survey responses also reveal a set of factors that influence the development of UX maturity. Analysis and discussion focus on organizational characteristics that influence UX methods and maturity. The authors conclude by offering a library-focused maturity scale with recommended practices for advancing UX maturity in academic libraries.

INTRODUCTION
User experience (UX) is a design practice for creating tools and services from a user-centered perspective. Academic libraries have been practicing UX for some time, with UX methods having been incorporated across the profession. However, there has been a lack of empirical data showing the extent of UX methods in use or the state of UX maturity in libraries. To help illuminate these areas, we distributed a survey to UX practitioners working in academic libraries that inquired into methods and maturity. We followed a mixed-methods approach involving both qualitative content analysis and quantitative statistical analysis to analyze the dataset. Our results reveal the most- and least-common UX methods currently in use in academic libraries. Results also demonstrate specific organizational characteristics that help and hinder UX maturity. We conclude by offering a set of strategies for reaching higher levels of UX maturity.
BACKGROUND AND MOTIVATION: UX IN ACADEMIC LIBRARIES
UX has been represented in the literature of library and information science for at least two decades, when “the human interaction involved in service use” was recognized as a factor affecting the value and impact of libraries.1 The practice of UX has expanded and evolved and is now a growing specialty in the librarianship profession.2 UX in libraries is motivated by a call to actively pay close attention to users’ unique and distinctive requirements, which allows libraries to more effectively design services for our communities.3 As a practice, UX is now beginning to be represented in graduate curricula, public services and research support, access services, space design, and web design.4 With its attunement to a set of practices and principles, UX can be viewed as a research and design methodology similar and related to other methodologies that focus on users, services, problem solving, participation, collaboration, and qualitative data analysis.5 Notably, UX is related to human-centered design, service design, and participatory design.6

Specific methods of UX practice are today wide-ranging. They include surveys, focus groups, interviews, contextual inquiry, journey mapping, usability testing, personas, card sorting, A/B testing, ecology maps, observations, ethnography, prototyping, and blueprinting.7 Some UX methods are incorporated into agile development processes.8 Though tools and techniques are available to library UX practitioners in abundance, the rate of adoption of these tools is less understood. In a notable contribution to this question, Pshock showed through a nationwide survey that the most familiar UX methods among library practitioners included usability testing, surveys, and focus groups.9

The question of methods is related to the question of maturity: how advanced is library UX practice? In addition to the rate of adoption of methods and tools, several different UX maturity models have been advanced in recent years.
Priester derives maturity from four factors: culture of innovation, infrastructure agility, acceptance of failure, and library user focus.10 In discussing UX capacity in libraries, MacDonald proposes a six-stage maturity model: unrecognized, recognized, considered, implemented, integrated, and institutionalized.11 Sharon defines maturity as a combination of staff resources and organizational buy-in.12 Similarly, Sheldon-Hess proposes a five-level scale of UX maturity, based primarily on the degree of implementation of UX practice and user-centered thinking in an organization.13 And even earlier, Nielsen proposed an eight-level scale of UX maturity, starting with a “hostility toward usability” and concluding with a “user-driven” organization.14 After reviewing a number of different maturity models, Anderson reports that the most common hierarchies include the following steps: (1) Absence/Unawareness of UX Research, (2) UX Research Awareness—Ad Hoc Research, (3) Adoption of UX research into projects, (4) Maturing of UX research into an organizational focus, (5) Integrated UX research across strategy, and (6) Complete UX research culture.15

The field of library UX shows a clear and compelling interest in UX maturity, and we can benefit from further empirical evidence that can help illuminate the current state and future progress toward UX maturity, including the rate of adoption of methods, resource allocation toward UX, and organizational buy-in. The research presented in this paper is motivated by the need to provide current and comprehensive data to answer questions related to UX maturity in academic libraries.

METHODS
Research Questions
The research questions for this study are the following:

• RQ1: How mature is UX practice within academic libraries?
• RQ2: What factors influence UX maturity?

To answer these questions, we distributed a survey to UX practitioners working in academic libraries. Survey responses were analyzed qualitatively using content analysis and quantitatively using statistical analysis.

Survey Participants
The team members sent out the survey on May 23, 2018, to library profession electronic discussion lists.16 Of the 87 received responses, 74 included an institution name. We identified size and setting classification for these institutions using the Carnegie Classification of Institutions of Higher Education (see table 1).17 Eight institutions could not be mapped to the Carnegie classification because they are outside the United States (n = 6) or have different scopes (one research lab and one information school). Six schools have more than one response; these responses are treated separately to represent the diversity of opinion and experience within an organization.

Classification | Response Count | Percentage
Four-year, large | 49 | 56
Four-year, medium | 10 | 11
University outside US | 6 | 7
Four-year, small | 5 | 6
Non-university | 2 | 2
Four-year, very small | 1 | 1
Two-year, very large | 1 | 1
Unspecified | 13 | 15

Table 1. Institutional profiles of survey respondents, with response counts.

Materials and Procedure
Our online survey was organized into two main parts. After an initial informed consent section, the survey investigated (1) Demographics and UX Methods and (2) UX Maturity.
Demographics and UX Methods
In the first main part of the survey, participants were asked to select among 20 different UX methods that “you personally use at least every year or two at your institution.” The list of methods is derived from the UX Research Cheat Sheet by Nielsen Norman Group.20 Participants were asked to complete an optional free-text response question: “Would you like to add a comment clarifying the way you completed [this question]?”

UX Maturity
In the second main part of the survey, participants were asked to identify the UX maturity stage that “properly describes the current UX status” in their organization. The stages were adapted from the eight-stage scale of UX maturity proposed by Nielsen Norman Group:

• Stage 1: Hostility Toward Usability
• Stage 2: Developer-Centered UX
• Stage 3: Skunkworks UX
• Stage 4: Dedicated UX Budget
• Stage 5: Managed Usability
• Stage 6: Systematic User-Centered Design Process
• Stage 7: Integrated User-Centered Design
• Stage 8: User-Driven Organization

We concluded the survey by asking participants to optionally “explain why you selected that stage” with a free-text response.

Research Data Analysis
Content Analysis
We followed the methodology of content analysis.21 Each qualitative survey response functioned as a meaning unit, with meaning units sorted into themes and subthemes. Each article author coded units independently; themes were resolved through discussion among the author group. The process of coding via content analysis allowed us to identify overarching trends in UX practice and maturity. Results are further discussed below.

Statistical Analysis
Data preparation and statistical analysis were conducted using R version 3.4.1. Base R was used for our statistical analysis. Other R packages utilized in the project are listed in table 2.

R package name | Version
ggplot2 | 3.0.0
tibble | 2.1.1
dplyr | 0.7.5
tidyr | 0.8.1
stringr | 1.4.0
readr | 1.1.1
readxl | 1.3.1

Table 2. R packages used in the analysis.

Data Preparation
The following steps were taken in the data analysis:

1. Content analysis into themes (see above).
2. Normalize institution names. We received more than one response from a few institutions; these were treated as separate responses that happened to have the same demographics.
3. For responses that included institution names, we added a total student population variable to the response using values derived from Wikipedia and the Carnegie Classification of Institutions of Higher Education.
4. We coded the variables derived during the content analysis as 0 or 1 dummy variables, that is, 0 = not present and 1 = present. Coding them in this way allows us to bring them into a multiple linear regression model.
5. Using an R script, we tested each response for the presence of each coded theme, 0 or 1 (a sketch of this step appears below).
6. Plots were created using the R ggplot2 library.
7. Linear regression models were conducted using the base R lm function.

Research Dataset
Dataset, survey instrument, and R code are available through Dryad at https://doi.org/10.5061/dryad.jwstqjq5d.22
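To make the dummy-coding step concrete, the following is a minimal sketch in R. It is an illustration only, not the authors' published script: the data frame `responses` and its columns are hypothetical, and in the study the themes were assigned manually through content analysis before being tested by script.

library(dplyr)
library(stringr)

# Hypothetical coded responses; each row holds the theme labels assigned
# to one free-text survey answer during content analysis.
responses <- tibble::tibble(
  id     = 1:3,
  themes = c("ux lead; growth", "leadership support: no", "ux group; resources")
)

# Dummy-code each theme as 0/1 so it can enter a multiple linear regression.
coded <- responses %>%
  mutate(
    ux_lead  = as.integer(str_detect(themes, "ux lead")),
    ux_group = as.integer(str_detect(themes, "ux group")),
    growth   = as.integer(str_detect(themes, "growth"))
  )
coded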
Survey Respondents
Eighty-seven participants responded to one or more components of the survey. See table 3 for a breakdown of survey responses.

Survey Question | Responses
UX Methods multiple choice: “Please check the following UX methods that you personally use at least every year or two at your institution.” | 81
UX Methods free-text response: “Would you like to add a comment clarifying the way you completed [the question related to UX methods]?” | 20
UX Maturity Stage multiple choice: “Which of the following [maturity stages] do you think properly describes the current UX status in your organization?” | 79
UX Maturity Stage free-text response: “Please explain why you selected that stage.” | 54

Table 3. Survey responses.

RESULTS
Our research results demonstrate that certain characteristics of a library organization are related to UX maturity. These characteristics include the type and extent of UX methods that are currently in use, as well as organizational factors such as leadership support, staffing, and collaboration. We further explicate below according to our two research questions.

RQ1: How Mature Is User Experience Practice within Academic Libraries?
Our survey asked participants to identify which stage of the Nielsen Norman Group maturity scale “properly describes the current UX status” in their organization. Our findings indicate that most libraries are in a low-to-middle range of maturity, with more than 75% of respondents placing their organization at either Stage 3, Stage 4, or Stage 5 (figure 1).

Figure 1. Histogram of responses by stage, showing that the majority of respondents placed their organization at either Stage 3, Stage 4, or Stage 5.

RQ2: What Factors Influence UX Maturity?
Overview of Statistical Analysis Results
We use linear regression for two different applications in this study (see appendix A for a glossary of terms related to statistical analysis). The process of creating a statistical model allows us to see, with varying degrees of confidence, the impact of different variables on UX maturity stage. The results of the linear regression help us to tease out the variables with the most predictive value. Using certain methods does not cause the library to be at a higher stage; rather, libraries that use certain methods tend to be at a higher stage, statistically. That is what is meant by “predictive” in this context. Linear regression provides a ground truth for what we think we are seeing in survey responses:

A useful general principle in science is that when you don’t know the true form of a relationship, start with something simple. A linear equation is perhaps the simplest way to describe a relationship between two or more variables and still get reasonably accurate predictions.23

The other reason we are using the linear regression output is to inform a possible future version of a UX maturity survey instrument, one more finely tuned to libraries than the Nielsen instrument alone that we used in this iteration.24 We feel that our use of multiple linear regression is appropriate and helpful given the exploratory nature of our study. The complete output is available at https://doi.org/10.5061/dryad.jwstqjq5d.
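The kind of simple model used in the next section can be sketched with base R's lm function, reading the adjusted R-squared and p-values from its summary. The data frame `survey` and its values below are hypothetical stand-ins, not the study data.

# Simple linear regression of maturity stage on one explanatory variable.
survey <- data.frame(
  stage     = c(3, 4, 5, 3, 6, 4, 5, 2),
  rank_size = c(10, 25, 40, 5, 60, 30, 55, 15)
)

model <- lm(stage ~ rank_size, data = survey)
summary(model)                # coefficient estimates and p-values
summary(model)$adj.r.squared  # proportion of variance explained, adjusted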
Size of Institution
We used the institution’s student population, the number of full-time enrolled students, as our proxy for the size of the library, our assumption being that larger enrollment generally means a larger number of library staff. There are different ways UX maturity level could be compared with student enrollment. Because the range in our sample is very wide, from 1,098 to 200,000 students across the sample of institutions, we attempted to control for the vast differences in size between the smallest and largest institutions by sorting them from smallest to largest and assigning each a rank from 1 to 69 (the total number of cases in our dataset with both a stage and a population defined) as an additional demographic variable. We then created a simple linear regression model comparing maturity stage as a function of ranked size. The null hypothesis is that there is no relationship between ranked size of institution and stage. Stage is the response variable and ranked size of the institution is the explanatory variable.

The adjusted R-squared for this relationship is 0.027. This means that only about 3% of the variance is accounted for by the ranked size of the institution. The probability, or p-value, of getting our observed result if the null hypothesis is true is 0.095 (almost 10%). This exceeds the standard .05 significance level commonly used in statistical analysis. Therefore the size of the institution is not a reliable predictor of UX maturity level in our sample, a counterintuitive finding. The full statistical summary is available in the appendix.

Methods Currently in Use by Academic Libraries
Our next RQ2 finding relates to the type and extent of UX methods that are currently in use in academic libraries. Our survey asked participants to select which UX methods “you personally use at least every year or two at your institution.” User surveys, usability testing, and user interviews stand out as the most commonly used. Figure 2 shows response counts for all of the methods in the survey.

Figure 2. Number of respondents that selected each method in the survey, showing the type and extent of UX methods currently in use in academic libraries.

We then examined the number of methods in use per institution compared to the reported maturity stage (figure 3). The number of methods used per institution illustrates a trend: more methods used at an institution generally means the institution is at a higher stage of maturity.

Figure 3. The number of methods used per institution, illustrating that more methods currently in use at an institution generally indicates a higher level of maturity.
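Frequency charts like figures 2 and 3 can be produced with ggplot2, the plotting library named in the methods section. The sketch below is illustrative only; the counts and column names are hypothetical.

library(ggplot2)

# Hypothetical method counts for illustration.
method_counts <- data.frame(
  method      = c("User surveys", "Usability testing", "User interviews", "Journey maps"),
  respondents = c(70, 68, 60, 15)
)

# Horizontal bar chart of respondents per method, sorted by frequency.
ggplot(method_counts, aes(x = reorder(method, respondents), y = respondents)) +
  geom_col() +
  coord_flip() +
  labs(x = NULL, y = "Number of respondents")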
Another way of representing the same two variables (reported number of methods and maturity stage) is with a scatterplot and statistical test (figure 4). In this simple linear regression model we have two variables: the response variable is stage and the explanatory variable is the number of UX methods used in the past two years. In plotting these two variables on a chart, we can draw a line that minimizes the distance between the line and all of the points on the plot. Like the chart above, the linear relationship between total methods and stage is clearly visible. The total number of methods practiced accounts for about 18% of the variance when predicting the correct maturity stage. (Recall from our discussion about ranked size of institution that rank accounts for less than 3% of the variation and is not even statistically significant.) In this case, the p-value is far below the 0.05 threshold, meaning the likelihood that we are seeing a relationship by random chance is very low. Therefore the total number of methods is predictive of stage. Generally, the more methods respondents chose, the higher the maturity stage. We can see from this data that the number of methods used is more predictive of maturity stage than institution size.

Figure 4. Maturity stage compared against total number of methods used, showing the positive relationship between number of UX methods used and UX maturity stage.
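A scatterplot with a fitted least-squares line, in the style of figure 4, can be sketched as follows; the data frame is the same kind of hypothetical stand-in used above.

library(ggplot2)

# Hypothetical stand-in data: maturity stage and count of methods used.
survey <- data.frame(
  stage         = c(3, 4, 5, 3, 6, 4, 5, 2),
  total_methods = c(5, 8, 12, 4, 15, 9, 11, 3)
)

# Scatterplot with a least-squares regression line overlaid.
ggplot(survey, aes(x = total_methods, y = stage)) +
  geom_point() +
  geom_smooth(method = "lm", se = FALSE) +
  labs(x = "Total UX methods used", y = "Maturity stage")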
For a more granular view, figure 5 shows the relation of specific UX methods used in different UX research phases (as categorized in the survey question, with methods organized by discovery, exploration, listening, and testing) to reported maturity stage.

Figure 5. Showing the relation of specific UX methods to reported maturity stage.

Factors that Influence UX Methods: Recency, Formality, Regularity
We then applied content analysis to the free-text questions of our survey. Following the question that asked participants to select among 20 different UX methods that “you personally use at least every year or two at your institution,” the free-text question asked, “Would you like to add a comment clarifying the way you completed [this question]?” Each of the 20 free-text responses to this question was counted and categorized as a “meaning unit.” Themes were extracted from the free-text survey responses. We identified 3 themes across 20 meaning units: formality, regularity, and recency (see table 4).

Theme | Definition | Number of Meaning Units* | Example Meaning Unit
Recency | How new or developed a library’s UX practice is | 7 | “I am fairly new here and we are still developing a process that is well-rounded.”
Formality | How formal or structured the UX practice is | 9 | “We are aware of many of the techniques mentioned, but we don’t have a formal process for implementing them.”
Regularity | How often or frequently UX is practiced | 4 | “Right now we are doing a workflow analysis of interlibrary loan, but once completed probably wouldn’t do that for another three to four years.”

*Each free-text response was counted and categorized as a single meaning unit.

Table 4. Thematic analysis of free-text responses to the question “Would you like to add a comment clarifying the way you completed [this question related to UX methods]?” (n = 20).

Factors that Influence UX Maturity: Leadership Support, Collaboration, UX Lead, UX Group, Growth, Resources, and Strategic Alignment
We also conducted a content analysis on the free-text responses to the survey question related to the UX maturity scale that asked participants to “explain why you selected that stage.” Each of the 54 free-text responses to this question was counted and categorized as a “meaning unit.” Themes were extracted from the free-text survey responses. We identified 7 themes across 54 meaning units: leadership support, collaboration, UX lead, UX group, growth, resources, and strategic alignment (see table 5).

Theme | Definition | Number of Meaning Units* | Example Meaning Unit
Leadership support | The degree to which UX work is seen, understood by, and supported by library leadership. | 32 | “Just last year, the UX team moved into administration so that we can tie our work to strategic planning for the organization.”
UX group | The presence of a committee or working group that conducts or otherwise supports UX work. | 31 | “I also chair a Web Working Group which focuses on improving our website from a usability standpoint.”
Collaboration | The degree to which UX work is collaboratively shared by individuals and departments throughout the library. | 30 | “I don't know if UX has become a necessarily planned activity across the whole organization. I am team of one, and though I’ve tried, I haven’t been able to add anyone else to form an official UX team as well.”
UX lead | Personnel assigned to UX work, especially a dedicated UX lead. | 30 | “I have recently been hired to partially work with UX and another person has been appointed UX coordinator.”
Growth | The degree to which expansion occurs around staffing, resources, and organizational understanding of UX work. | 13 | “We . . . will soon be posting a position for a UX librarian.”
Resources | The amount of time and budgetary resources dedicated to UX. | 10 | “Budget is our biggest constraint when it comes to UX testing.”
Strategic alignment | The inclusion of UX or user-centeredness in strategic planning. | 2 | “We do employ user research to determine where to target priorities and strategy. However, I do not think we have a robust process for iterative testing or participatory design yet.”

*Each free-text response was counted and categorized as a single meaning unit.

Table 5. Thematic analysis of free-text responses to the question “Please explain why you selected [the current UX status in your organization]” (n = 54).

This data can be visualized to show the relationships between UX maturity stage and the coded thematic responses (figure 6).

Figure 6. Coded responses versus selected stage (0 indicates no comment related to that theme), showing that a lack of leadership support is often cited as a reason for not advancing past Stage 3; the presence of dedicated staff in the form of a UX lead or a UX group is often cited as a reason for reaching Stage 5.
Full UX Maturity Model: UX Maturity as a Function of UX Methods
In building a full model for the purposes of quantitative data analysis, we are attempting to predict the maturity stage based on the many different variables that appear in our dataset. This statistical exercise is a heuristic tool that can help us understand the survey responses and draw results from the dataset that reveal key characteristics of UX maturity in libraries. We approached building a full model using a modified backward stepwise approach. With this approach, we begin with the full range of variables and work backward step by step to focus only on those variables that combine to form a model that makes the best predictions about the response variable, the UX maturity stage, for each case. Through this process, those variables that are less predictive are removed from the model one by one until we can settle on a model that explains the most variance.25 The modified backward stepwise “step” function used to create our model required 18 iterations before settling on the best version (a code sketch of this procedure appears after table 6).

Using adjusted R-squared as our metric, our full model accounts for 62% of the variance for this dataset. Adjusted R-squared is an appropriate measure because it allows us to include many variables but penalizes the inclusion of too many (when too many variables are included, the adjusted R-squared value decreases). With this model, we can make reasonable estimates of the maturity stage that a survey respondent selected by knowing which methods they use, combined with the coded explanation the respondent provided via the free-text survey questions. The coded responses (see tables 4 and 5) provided measurable insights into the organizational context of our respondents’ institutions, and this allows us to analyze and predict their respective maturity levels. With this additional information, we have a model that represents the multiple dimensions available in the dataset (see appendix B for additional data analysis).

In table 6, we show the relationship between specific UX methods and the UX maturity stages. We see here that journey mapping, for example, is a highly influential factor for UX maturity.

Variable | Estimated Influence on Maturity Stage | P-value (Significance)
Journey maps | 1.7 | 0.001***
Design review | 1.3 | 0.010*
User interviews | 0.9 | 0.047*
Usability testing | 0.8 | 0.158
Benchmark testing | 0.7 | 0.067
Usability bug review | 0.3 | 0.498
User stories | 0.2 | 0.440
Requirements and constraints | 0.2 | 0.514
User surveys | -0.3 | 0.492
Diary/camera studies | -0.6 | 0.325
FAQ review | -0.8 | 0.062
Prototype testing | -0.8 | 0.076
Field studies | -1.3 | 0.003**

*p < .05 (statistically significant result), **p < .01, ***p < .001 (a highly statistically significant result)

Table 6. Relationship between UX method variables and predicted maturity stage.
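For readers who want to reproduce this kind of analysis, the sketch below shows backward elimination with base R's step function. Note that step selects variables by AIC rather than adjusted R-squared, so this is only an approximation of the modified procedure described above; the data frame and its columns are hypothetical.

# Hypothetical modeling data: maturity stage plus 0/1 method indicators.
model_data <- data.frame(
  stage           = c(3, 4, 5, 3, 6, 4, 5, 2, 6, 3),
  journey_maps    = c(0, 0, 1, 0, 1, 0, 1, 0, 1, 0),
  field_studies   = c(1, 0, 0, 1, 0, 1, 0, 1, 0, 1),
  user_interviews = c(0, 1, 1, 0, 1, 1, 1, 0, 1, 0)
)

full_model    <- lm(stage ~ ., data = model_data)          # start with all predictors
reduced_model <- step(full_model, direction = "backward")  # drop weak predictors (by AIC)
summary(reduced_model)$adj.r.squared                       # variance explained, adjusted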
In table 7, we show the relationship between the coded responses from the free-text survey questions (presented in tables 4 and 5) and the UX maturity stages. We see through this analysis that variables such as “resources” are important for advancing maturity. Similarly, we see that a lack of “leadership support” has a strong negative effect on maturity.

Variable | Estimated Influence on Maturity Stage | P-value (Significance)
Resources: yes | 2.9 | 0.014*
Collaboration: yes | 0.8 | 0.147
Growth: yes | 0.2 | 0.561
Resources: no | -0.2 | 0.615
UX lead: no | -0.5 | 0.216
Leadership support: yes | -0.7 | 0.177
UX lead: yes | -0.9 | 0.038*
UX group: no | -0.9 | 0.022*
Leadership support: no | -1.0 | 0.009**
Strategic alignment: no | -2.8 | 0.012*

*p < .05 (statistically significant result), **p < .01, ***p < .001 (a highly statistically significant result)

Table 7. Relationship between organizational variables and predicted maturity stage, in descending order of influence on maturity stage.

A Statistical Example Case: Estimating UX Maturity
To help the reader understand the statistical summary provided by our model, we take a close look at one case drawn from one actual survey participant. In this example case, the respondent’s institution is a four-year, large university. The intercept in a multiple regression model represents the mean response (stage) when the predictors are all zero.26 It is a baseline; for this model it happens to be 4.1119. Our example institution has practiced the following methods, with their respective influence on UX maturity included in parentheses:

• User interviews (+0.9521)
• Usability testing (+0.7984)
• Benchmark testing (+0.7124)
• Usability bug review (+0.2692)
• Field studies (-1.3346)
• Prototype testing (-0.8454)
• User surveys (-0.3204)

Additionally, this institution has the following organizational characteristics, with their respective influence on UX maturity included in parentheses:

• Leadership support: yes (-0.6842)
• Resources: no (-0.2192)

By adding these numbers to the starting point (4.1119), we can calculate that the model predicts a stage of 3.440 for this large, four-year university library. The survey respondent actually selected stage 3, so the model’s prediction is 0.440 greater than the selected stage. This leftover part, or error, is the residual (-0.440).
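The arithmetic of this example can be checked directly in R, using the intercept and coefficient values reported above.

# Predicted stage = intercept + sum of the coefficients that apply
# to this institution (values taken from the example above).
intercept <- 4.1119
effects <- c(
  user_interviews        =  0.9521,
  usability_testing      =  0.7984,
  benchmark_testing      =  0.7124,
  usability_bug_review   =  0.2692,
  field_studies          = -1.3346,
  prototype_testing      = -0.8454,
  user_surveys           = -0.3204,
  leadership_support_yes = -0.6842,
  resources_no           = -0.2192
)

predicted <- intercept + sum(effects)
round(predicted, 3)      # 3.440
round(3 - predicted, 3)  # residual: -0.440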
DISCUSSION

In interpreting our results, we have identified four key areas that we wish to emphasize: the significance of leadership support, the importance of organization-wide collaboration, the role of applied UX methods, and the emerging theory and practice of UX and design in libraries.

Leadership Support and Strategic Alignment

A major theme evident in the results relates to leadership support and strategic alignment. As expressed by the survey respondents, leadership support is the degree to which UX work is seen, understood, and supported by library leadership and organizational strategic planning. In particular, a lack of support and visioning from leadership exerts negative pressure on UX maturity. On the other hand, when UX is coordinated with leadership vision and situated within strategic planning, UX maturity is rated more highly. From a leadership perspective, UX maturity relies on an allocated budget and designated staff to move beyond an ad hoc approach and reach higher levels on the maturity scale. One might expect that the larger an institution, the more advanced the UX maturity stage. However, based on our data analysis, size of institution is not a significant factor in UX maturity. Therefore the resources provided to library UX activities may depend less on how large an institution is than on whether leadership acknowledges the importance of UX and provides official, particularly financial, support.

Organizational Collaboration

Another major theme was collaboration—the degree to which UX research is collaboratively shared by individuals and departments throughout the library. Higher levels of UX maturity are driven by a widespread understanding of UX within an organization, with user research data integrated into decision-making across multiple touchpoints. Conversely, a lack of collaboration was a factor that hindered maturity. Many respondents shared similar experiences, telling us that other staff or departments within the organization are not ready to embrace the potential of UX data, methods, and insights. We recognize that cultivating UX is an organic process that can result in uneven growth of UX within an organization. Some units may be ready to move further and faster while others may hesitate to contribute or collaborate. Not every department will immediately see the relevance or value of UX work for their area. Accounting or human resources, for example, might consider UX as beyond the scope of their practice. Thinking inclusively and holistically from the perspective of user-centered service design, however, opens up new connections between UX and the work of all departments across the organization. UX can help center those users—even internal users—who interact with service points such as accounting or human resources in ways that can improve the service experience for all involved.

Applied UX Methods

Across the 20 methods that we included in the survey, our results indicate that the application of different UX methods varies widely in type and extent. Many methods are in use to varying degrees. As the methods relate to maturity, we find that a greater number of methods in use during the previous two years was indicative of a higher maturity rating. In short, more methods lead to more maturity.
The five most common methods included usability testing, user surveys, user interviews, accessibility evaluation, and field studies. These methods are similar in their ease of implementation and their wide representation in the library literature, and because of their commonness, they are not strongly indicative of UX maturity, high or low. The five least common methods included journey maps, benchmark testing, design review, FAQ review, and diary/camera studies. In this grouping we see a set of UX methods that are not as well known or widely discussed, but which can paint a more complete picture of the user experience. Journey mapping in particular was strongly and positively influential on UX maturity in our statistical model. This result does not necessarily indicate that a library can boost UX maturity simply by creating a journey map. Rather, we interpret this to indicate that the method itself is reflective of a coordinated UX effort in the institution. Journey mapping aims to obtain a high-level overview of a user’s interactions with every touchpoint to accomplish a task. As such, the successful implementation of a journey map relies on cross-functional and cross-departmental input and interpretation. This result calls for greater collaboration toward greater UX maturity.

UX as an Emerging Practice within Libraries

Many respondents focused on the newness or the maturity of their library’s UX practice, and most responses connected low methods usage to the newness of the practice. In these responses, we see that UX in libraries is still a new field, and the practice is emerging with variations across institutions in terms of methods and maturity. We note that institutional size was not a factor that influenced maturity—some smaller institutions reported mature UX practice while some larger institutions reported lower UX maturity. In this result, we see that the amount of possible resources matters less than the intentional application of those resources in support of UX work. As institutions begin to see the value of UX and dedicate increasingly more resources relative to their budgets, UX maturity increases. Our survey respondents shared a variety of experiences along this journey toward maturity. Many told us “I’m new here” and that their library doesn’t fully understand UX and isn’t yet ready to include UX research in decision-making or strategic planning, or that the institution doesn’t have a plan yet for how to integrate the UX librarian into library operations. Still others reported that librarians in other units or library administrators are not required or encouraged to consult with the UX librarian or integrate UX research. In this way, many libraries continue a more traditional model of decision-making that does not regularly apply intentional methods to account for the voices of users. On the upper end of the maturity scale, on the other hand, we see wide adoption of UX as a legitimate area of work across units and within leadership groups. In this way, some libraries have demonstrated more responsiveness to UX and have more successfully integrated UX practices into strategic and operational workflows.
Through the survey responses, we see a three-step progression that marks the emergence of UX as a trusted and legitimate methodology for understanding user experiences and designing library services: recency, formality, and regularity (table 4). In the earlier stages of maturity, survey respondents emphasize the newness or recency of a group or person assigned to conduct UX work. From there, a UX practice emerges as increasingly more formal as more UX methods are introduced more often into different contexts. Finally, as a library reaches UX maturity, we see a frequent application of a wide variety of UX methods in all corners of the library and with many stakeholders, along with organizational decision-making that regularly includes UX research data.

A UX Maturity Scale for Libraries

To aid understanding of the UX maturity scale and the characteristics related to each of its stages, we have adapted the Nielsen Norman UX maturity scale for a library context. Table 8 shows a set of organizational characteristics that correspond to the eight stages of UX maturity. The indicators in table 8 are presented as an approximate guideline for understanding and diagnosing UX maturity.

Stage | Key Indicators
Stage 1–2 | Apathy or hostility to UX practice; lack of resources and staff for UX
Stage 3 | Ad hoc UX practices within the organization; UX is practiced, but unofficially and without dedicated resources or staff; leadership does not fully understand or support UX
Stage 4 | Leadership beginning to understand and support UX; dedicated UX budget; UX is assigned fully or partly to a permanent position
Stage 5 | The UX lead or UX group collaborates with units across the organization and contributes UX data meaningfully to organizational and strategic decision-making
Stage 6 | UX research data is regularly included in projects and decision-making; a wide variety of methods are practiced regularly by multiple departments
Stage 7–8 | UX is practiced throughout the organization; decisions are made and resources are allocated only with UX insights as a guide

Table 8. Key indicators for UX maturity in academic libraries.

This scale reflects the research presented in this paper while building on related models and prior research (more granularity is available in Stages 2–6 because we received more survey responses representing those stages). We note that our research is consonant with prior work in this area. Priestner includes a greater focus on library users (in contrast to a focus on library staff) as a key driver of library UX maturity.27 MacDonald reports that UX work is defined by applied methods, in particular, qualitative research.28 Sharon describes a UX maturity model based on two primary factors: the presence of UX researchers on staff and whether the organization actually listens to and responds to UX research.29 Finally, Sheldon-Hess bases library UX maturity on the extent of applied UX methods and the level of user-centeredness present in an organization, as indicated by the degree to which staff consider user perspectives in internal communications and decision-making.30 Taken together, we see common strands that can help illuminate the key factors of UX maturity in libraries: applied methods, leadership support in the form of resources and strategic alignment, organizational collaboration, and decision-making that includes UX research.
Strategies for Climbing the Maturity Scale: Toward a More User-Centered Library

Our results reveal a few key barriers and boosts to higher maturity, and one key point of stagnation. Across the maturity scale, important factors that positively influence maturity involve leadership support and resource allocation toward UX in the form of personnel and infrastructure such as physical space, materials, strategic direction, and a working budget. Notably, respondents in our survey reported being stuck at Stage 3 due to a lack of leadership support. For instance, when resource-related comments appeared, we primarily heard about a lack of resources, which impaired maturity. Participants reported a mixture of personnel in support of UX work. Some libraries have a staff member dedicated to UX but lack a committee structure to support and advocate for the work. Other libraries do not have dedicated UX staff but had formed committee infrastructure to collaboratively move UX forward. Participants who lacked either a UX group or a UX lead reported lower levels of maturity and were particularly stagnated at Stage 3 (see figure 5 above). Alternatively, libraries are boosted to Stage 5 with the presence of a fully empowered UX lead who has the support of a UX group or committee that can network throughout the organization and drive collaboration and cross-functional implementation of UX methods and research data. We found that respondents from libraries that possessed both dedicated UX staff and a UX group tended to place themselves higher on the maturity scale. For those who reside at Stage 5, the UX group and the UX lead are the two main themes present in the survey. To move forward to Stage 5, a library needs to organize a UX group with an appointed lead to coordinate UX practice widely throughout the organization, including in library spaces, web presence, learning services, and digital initiatives. A systematic and cooperative UX approach planned by an official UX group and led by a designated UX lead is the key indicator of Stage 5. The support for the group and its lead needs to come from not only leadership but also colleagues throughout the library, which relates to the two major themes of leadership support and organizational collaboration. Stage 7–8 is achievable only with significant investment in UX. Given parent entity pressures, existing hierarchies, and prevailing non-user-centered cultures, libraries face a formidable set of challenges on the road to becoming user-centered organizations.31 This road is somewhat illuminated by the small number of survey respondents who marked themselves at a Stage 7 or 8. Highlights from their responses are instructive. One respondent told us,

We have multiple teams in the library to help with service design, conducting and gathering user research, and helping library staff think more about the user in their everyday work. We also have a couple Special Interest Groups (SIGs) dedicated to user research, UX, and assessment. We also have multiple departments within the library with UX expertise.

From this response, we can see the key characteristics of UX maturity: leadership support up the line along with widespread collaboration throughout the organization. Staff infrastructures including multiple UX-oriented committees help drive and coordinate UX work.
This respondent also reported the recent hiring of an assessment librarian situated in the library’s administration department who will help coordinate UX work throughout the organization. These elements work together to meaningfully integrate user perspectives into both digital and physical spaces and in multiple units. Moreover, this respondent marked 19 out of 20 UX methods currently in use (all but diary/camera studies), thus reinforcing the symbiotic relationship between UX maturity and UX methods: the variety of methods in use is a signal of maturity, and correspondingly, a greater maturity allows the space and resources for the application of more and different methods. Another survey respondent at Stage 7 remarked the following:

My workplace has been very supportive in addressing UX issues both in digital and physical spaces. Since being hired, I have created workflows that incorporate data that we gather from users. If there isn’t data gathered in a certain area, we usually find a way to update workflows so that we can get that data. Almost every project that I have worked on digitally and in the physical spaces at the library has been the result of UX/UI data that has been gathered from our users.

The elevated level of maturity at this library is especially reflected through the practice of “almost every project” being driven by user data. A truly user-centered library indeed integrates user data across all projects and advocates for the user at every opportunity. This respondent also marked a high variety of methods currently being practiced: 17 out of a possible 20 (methods not in use include diary/camera studies, user stories, and competitive analysis), further underscoring the two-way connection between methods and maturity. In further considering the upper reaches of maturity, we are inspired by an emerging theory of design-oriented librarianship that signals a professional paradigm shift such that UX could become recognized as a fundamental component of library research, education, and practice.32 By investing more in UX methods, practices, and principles, libraries can achieve greater value and empowerment for our communities by designing more user-centered services and tools.33 Ultimately, achieving Stage 7–8 will result from deeply integrating user-centeredness across all operational phases, strategic planning, and decision-making of a library organization.

LIMITATIONS

We note a few limitations of our study. First, the UX stages used in the survey were defined by Jakob Nielsen in 2006 for corporate application, so the scale is perhaps a bit dated. Further, the main goal of our statistical analysis is to develop a model that can accurately predict the UX maturity of a library based on the UX methods employed at the institution combined with organizational characteristics. Allison outlines three broad categories of error in regression analysis:

Measurement error: Very few variables can be measured with perfect accuracy, especially in the social sciences.

Sampling error: In many cases, our data are only a sample from a larger population, and the sample will never be exactly like the population.
Uncontrolled variation: [Age and schooling] are surely not the only variables that affect a person’s income, and these uncontrolled variables may “disturb” the relationship.34

In terms of measurement error, survey respondents may have bias when self-reporting maturity stage due to social pressures to produce desirable responses, meaning people tend to respond to self-report items in a manner that makes themselves look good.35 The resulting measurement error takes the form of over-reporting “desirable behavior” or under-reporting “undesirable behavior.” This is evident in some responses for UX maturity stages. For example, one respondent chose Stage 5—“Managed Usability”—but the comment described a slightly different picture:

I think we are still floundering between “Dedicated UX Budget” and “Managed Usability.” . . . We are at the stage where people know they should consult with us, but either they don’t OR they do but don’t really hear the results, they are using us to confirm what they want to hear.

In terms of sampling error, self-selection bias is a factor: our respondents might not be representative of the full population of UX librarians. We also did not make all of our questions mandatory, and as a result were not able to make use of all possible data within the scope of our survey. In terms of uncontrolled variation, our survey and statistical model do not fully account for all variables that influence UX maturity in libraries; for example, we included a limited list of UX methods, and we did not include questions that inquired specifically into the presence of a UX lead or UX group.

FUTURE DIRECTIONS

We see at least three paths forward for future research related to UX methods and maturity. First, librarianship would benefit from a UX maturity scale created specifically with and for our field’s theoreticians and practitioners. We propose one such scale above, but our scale has not undergone further testing, research, or validation. We note especially the library UX maturity scales of Sheldon-Hess and MacDonald, which could be further synthesized or built upon.36 Second, a self-assessment tool for diagnosing UX maturity could be developed based on a validated maturity scale. And third, the theory advanced by Clarke that librarianship can be usefully conceived of and practiced as a design discipline warrants further critical attention, especially as it relates to the application of UX methods and the development of UX maturity.37

CONCLUSION

We applied a mixed-methods approach that involved content analysis and statistical analysis to a profession-wide survey. Our research data and analysis demonstrate the type and extent of UX methods currently in use by academic libraries. The five most common methods are usability testing, user surveys, user interviews, accessibility evaluation, and field studies. The five least common methods are journey maps, benchmark testing, design review, FAQ review, and diary/camera studies. Furthermore, we identify the organizational characteristics that help or hinder the development of UX maturity.
UX maturity in libraries is related to four key factors: the number of UX methods currently in use; the level of support from leadership in the form of strategic alignment, budget, and personnel; the extent of collaboration throughout the organization; and the degree to which organizational decisions are influenced by UX research. When one or more of these four connected factors advances, so too does UX maturity. We close by emphasizing three key factors for reaching higher levels of UX maturity. First, we encourage library leadership to see the value of UX and support its practice through strategic alignment and resource allocation. Second, we encourage libraries to commit to integrating UX principles and practices across all units, especially into leadership groups and through organization-wide collaboration and workflows. Third, UX methods should be reinforced and amplified with personnel, such as a standing UX group and a dedicated UX lead that can help direct UX work and enhance UX maturity. Libraries have the promise and potential to more deeply practice UX. Doing so can allow libraries to more deeply connect with users and reach higher levels of UX maturity, with the ultimate result of delivering tools and services that further empower our user communities.

APPENDIX A: GLOSSARY OF STATISTICAL TERMS

adjusted R-squared: Adjusted R2 = variance of fitted model values / variance of response values. “The adjusted R-squared compares the descriptive power of regression models—two or more variables—that include a diverse number of independent variables—known as a predictor. Every predictor or independent variable, added to a model increases the R-squared value and never decreases it. So, a model that includes several predictors will return higher R-squared values and may seem to be a better fit. However, this result is due to it including more terms. The adjusted R-squared compensates for the addition of variables and only increases if the new predictor enhances the model above what would be obtained by probability. Conversely, it will decrease when a predictor improves the model less than what is predicted by chance.” Source: https://www.investopedia.com/ask/answers/012615/whats-difference-between-rsquared-and-adjusted-rsquared.asp

confidence level: “The confidence level tells you how sure you can be. It is expressed as a percentage and represents how often the true percentage of the population who would pick an answer lies within the confidence interval. The 95% confidence level means you can be 95% certain; the 99% confidence level means you can be 99% certain. Most researchers use the 95% confidence level.” Source: https://researchbasics.education.uconn.edu/confidence-intervals-and-levels/

confidence interval: “A confidence interval is an interval which has a known and controlled probability (generally 95% or 99%) to contain the true value.” Source: https://stats.oecd.org/glossary/detail.asp?ID=5055

explained variance: “Explained variance (also called explained variation) is used to measure the discrepancy between a model and actual data.
In other words, it’s the part of the model’s total variance that is explained by factors that are actually present and isn’t due to error variance.” Source: https://www.statisticshowto.datasciencecentral.com/explained-variance-variation/

explanatory and response variables: “The response variable is the focus of a question in a study or experiment. An explanatory variable is one that explains changes in that variable. It can be anything that might affect the response variable.” Source: https://www.statisticshowto.datasciencecentral.com/explanatory-variable/

multiple regression: “Multiple regression is a statistical method for studying the relationship between a single dependent [or response] variable and one or more independent [or explanatory] variables. It is unquestionably the most widely used statistical technique in the biological and physical sciences.”38

null hypothesis: “In general, this term relates to a particular hypothesis under test, as distinct from the alternative hypotheses which are under consideration. It is therefore the hypothesis which determines the probability of the type I error. In some contexts, however, the term is restricted to an hypothesis under test of ‘no difference’.” Source: https://stats.oecd.org/glossary/detail.asp?ID=3767

probability or p-value: “The p value is the probability of getting our observed result, or a more extreme result, if the null hypothesis is true.”39

simple linear regression: “Simple linear regression models the relationship between the magnitude of one variable and that of a second - for example, as X increases, Y also increases. Or as X increases, Y decreases.”40

statistical significance: “Statistical significance refers to the claim that a result from data generated by testing or experimentation is not likely to occur randomly or by chance but is instead likely to be attributable to a specific cause. Having statistical significance is important for academic disciplines or practitioners that rely heavily on analyzing data and research, such as economics, finance, investing, medicine, physics, and biology. Statistical significance can be considered strong or weak. When analyzing a data set and doing the necessary tests to discern whether one or more variables have an effect on an outcome, strong statistical significance helps support the fact that the results are real and not caused by luck or chance. Simply stated, if a statistic has high significance then it’s considered more reliable.” Source: https://www.investopedia.com/terms/s/statistical-significance.asp

variance: “The variance is the mean square deviation of the variable around the average value.
It reflects the dispersion of the empirical values around its mean.” Source: https://stats.oecd.org/glossary/detail.asp?ID=5160

APPENDIX B: ADDITIONAL DATA ANALYSIS

Model: Stage as a Function of Population Rank

Model: Stage as a Function of Total Methods

Model: Variables that Combine to Produce the Most Accurate Stage Predictions

ENDNOTES

1 David Liddle, “Best Value—The Impact on Libraries: Practical Steps in Demonstrating Best Value,” Library Management 20, no. 4 (June 1, 1999): 206–14, https://doi.org/10.1108/01435129910268982.

2 Daniel Pshock, “The User Experience of Libraries: Serving The Common Good,” User Experience 17, no. 2 (2017), https://web.archive.org/web/20190822051708/http://uxpamagazine.org/the-user-experience-of-libraries/.

3 Bruce Massis, “The User Experience (UX) in Libraries,” Information and Learning Sciences 119, no. 3/4 (March 12, 2018): 241–44, https://doi.org/10.1108/ILS-12-2017-0132.

4 Rachel Fleming-May et al., “Experience Assessment: Designing an Innovative Curriculum for Assessment and UX Professionals,” Performance Measurement and Metrics 19, no. 1 (December 15, 2017): 30–39, https://doi.org/10.1108/PMM-09-2017-0036; Rachel Ivy Clarke, Satyen Amonkar, and Ann Rosenblad, “Design Thinking and Methods in Library Practice and Graduate Library Education,” Journal of Librarianship and Information Science (September 8, 2019), https://doi.org/10.1177/0961000619871989; Aja Bettencourt-McCarthy and Dawn Lowe-Wincentsen, “How Do Undergraduates Research? A User Experience Experience,” OLA Quarterly 22, no. 3 (February 22, 2017): 20–25, https://doi.org/10.7710/1093-7374.1866; Juan Carlos Rodriguez, Kristin Meyer, and Brian Merry, “Understand, Identify, and Respond: The New Focus of Access Services,” portal: Libraries and the Academy 17, no. 2 (April 8, 2017): 321–35, https://doi.org/10.1353/pla.2017.0019; Asha L. Hegde, Patricia M. Boucher, and Allison D. Lavelle, “How Do You Work? Understanding User Needs for Responsive Study Space Design,” College & Research Libraries 79, no. 7 (2018), https://doi.org/10.5860/crl.79.7.895; Amy Deschenes, “Improving the Library Homepage through User Research—Without a Total Redesign,” Weave: Journal of Library User Experience 1, no. 1 (2014), https://doi.org/10.3998/weave.12535642.0001.102.

5 Amanda Kraft, “Parsing the Acronyms of User-Centered Design,” in 2019 ASCUE Proceedings (Association Supporting Computer Users in Education (ASCUE), Myrtle Beach, South Carolina, 2019), 61–69, https://eric.ed.gov/?id=ED597115.

6 IDEO, The Field Guide to Human-Centered Design (San Francisco: IDEO, 2015); Joe Marquez and Annie Downey, Library Service Design: A LITA Guide to Holistic Assessment, Insight, and Improvement (Lanham, MD: Rowman & Littlefield, 2016); Scott W. H.
Young and Celina Brownotter, “Toward a More Just Library: Participatory Design with Native American Students,” Weave: Journal of Library User Experience 1, no. 9 (2018), https://doi.org/10.3998/weave.12535642.0001.901.

7 Aaron Schmidt and Amanda Etches, Useful, Usable, Desirable: Applying User Experience Design to Your Library (Chicago: ALA Editions, 2014); Joe J. Marquez and Annie Downey, Getting Started in Service Design: A How-To-Do-It Manual for Librarians (Chicago: American Library Association, 2017).

8 Zoe Chao, “Rethinking User Experience Studies in Libraries: The Story of UX Café,” Weave: Journal of Library User Experience 2, no. 2 (2019), https://doi.org/10.3998/weave.12535642.0002.203.

9 Daniel Pshock, “Results from the 2017 Library User Experience Survey,” Designing for Digital, March 6, 2018, https://web.archive.org/web/20190829163234/https://d4d2018.sched.com/event/DM8H/d16-02-results-from-the-2017-library-user-experience-survey.

10 Andy Priestner, “Approaching Maturity? UX Adoption in Libraries,” in User Experience in Libraries: Yearbook 2017, ed. Andy Priestner (Cambridge, England: UX in Libraries, 2017), 1–8.

11 Craig M. MacDonald, “‘It Takes a Village’: On UX Librarianship and Building UX Capacity in Libraries,” Journal of Library Administration 57, no. 2 (February 17, 2017): 194–214, https://doi.org/10.1080/01930826.2016.1232942.

12 Tomer Sharon, “UX Research Maturity Model,” Prototypr (blog), 2016, https://web.archive.org/web/20190829163113/https://blog.prototypr.io/ux-research-maturity-model-9e9c6c0edb83?gi=c462f7ac4600.

13 Coral Sheldon-Hess, “UX, Consideration, and a CMMI-Based Model,” Coral Sheldon-Hess Blog (blog), 2013, https://web.archive.org/web/20190117144529/http://www.sheldon-hess.org/coral/2013/07/ux-consideration-cmmi/.

14 Jakob Nielsen, “Corporate UX Maturity: Stages 1–4,” Nielsen Norman Group, 2006, https://web.archive.org/web/20190709231540/https://www.nngroup.com/articles/ux-maturity-stages-1-4/; Jakob Nielsen, “Corporate UX Maturity: Stages 5–8,” Nielsen Norman Group, 2006, https://web.archive.org/web/20190709231533/https://www.nngroup.com/articles/ux-maturity-stages-5-8/.

15 Nikki Anderson, “UX Maturity: How to Grow User Research in Your Organization,” Medium, May 1, 2019, https://medium.com/researchops-community/ux-maturity-how-to-grow-user-research-in-your-organization-848715c3543.

16 Including: the User Experience Working Group under the Digital Libraries Federation Assessment Interest Group (DLF AIG UX), Code4Lib, Assessment Listserv of Association of Research Libraries (ARL), Access Conference List, Coalition for Networked Information (CNI), Library and Information Technology Association (LITA), Library User Experience (LibUX) Slack Channel, and ALA User Experience Interest Group.
17 Current index and classification list available from http://www.carnegieclassifications.iu.edu/classification_descriptions/size_setting.php; data at time of analysis available from Indiana University Center for Postsecondary Research (2018), Carnegie Classifications 2018 public data file, https://web.archive.org/web/20191006220952/http://carnegieclassifications.iu.edu/downloads/CCIHE2018-PublicDataFile.xlsx.

18 Five institutions appear in this count twice, that is, on five occasions, two persons from the same institution responded separately to the survey. We invited this type of response to capture diversity of opinion and experience within an organization.

19 One institution appears in this count twice, for the same reason as explained in the previous endnote.

20 Susan Farrell, “UX Research Cheat Sheet,” Nielsen Norman Group, February 12, 2017, https://web.archive.org/web/20190828224735/https://www.nngroup.com/articles/ux-research-cheat-sheet/.

21 Klaus Krippendorff, Content Analysis: An Introduction to Its Methodology, 3rd edition (Los Angeles: SAGE, 2012).

22 Scott W. H. Young, Zoe Chao, and Adam Chandler, “Data From: User Experience Methods and Maturity in Academic Libraries,” distributed by Dryad, https://doi.org/10.5061/dryad.jwstqjq5d.

23 Paul D. Allison, Multiple Regression: A Primer (Thousand Oaks, CA: Pine Forge, 1999), 6.

24 We are aware that our use of linear regression with this small sample surely “over-fits” the dataset, that is, the model is unlikely to predict as accurately if applied to a different dataset. The model will undergo further refinement in the future.
25 We made a conscious choice to leave some variables in this model that are statistically insignificant. We did so because it might be too early to fully dismiss these elements as unimportant; it could be that our sample was too small to really be certain. Furthermore, our primary emphasis is on creating a model that does a good job of accurately predicting stage based on an array of different characteristics. Removing all the nonsignificant variables in this model would actually lower the prediction accuracy. Adjusted R-squared accounts for additional variables.

26 For more description of the multiple linear regression model, please see https://web.archive.org/web/20191006231250/https://newonlinecourses.science.psu.edu/stat462/node/131/.

27 Priestner, “Approaching Maturity? UX Adoption in Libraries.”

28 Craig M. MacDonald, “User Experience Librarians: User Advocates, User Researchers, Usability Evaluators, or All of the Above?,” Proceedings of the Association for Information Science and Technology 52, no. 1 (2015): 1–10, https://doi.org/10.1002/pra2.2015.145052010055.

29 Sharon, “UX Research Maturity Model.”

30 Sheldon-Hess, “UX, Consideration, and a CMMI-Based Model.”

31 MacDonald, “‘It Takes a Village.’”

32 Rachel Ivy Clarke, “Toward a Design Epistemology for Librarianship,” The Library Quarterly 88, no. 1 (2018): 41–59, https://doi.org/10.1086/694872; Rachel Ivy Clarke, “How We Done It Good: Research through Design as a Legitimate Methodology for Librarianship,” Library & Information Science Research 40, no. 3 (July 1, 2018): 255–61, https://doi.org/10.1016/j.lisr.2018.09.007.

33 Clarke, Amonkar, and Rosenblad, “Design Thinking and Methods in Library Practice and Graduate Library Education.”

34 Allison, Multiple Regression: A Primer, 14.

35 Delroy L. Paulhus, “Socially Desirable Responding: The Evolution of a Construct,” in The Role of Constructs in Psychological and Educational Measurement (Mahwah, NJ: Lawrence Erlbaum, 2002), 49–69.

36 Sheldon-Hess, “UX, Consideration, and a CMMI-Based Model”; MacDonald, “‘It Takes a Village.’”

37 Rachel Ivy Clarke, “Design Thinking for Design Librarians: Rethinking Art and Design Librarianship,” in The Handbook of Art and Design Librarianship, ed. Paul Glassman and Judy Dyki, 2nd edition (Chicago: ALA Neal-Schuman, 2017), 41–49; Clarke, “Toward a Design Epistemology for Librarianship”; Clarke, “How We Done It Good”; Clarke, Amonkar, and Rosenblad, “Design Thinking and Methods in Library Practice and Graduate Library Education”; Shannon Marie Robinson, “Critical Design in Librarianship: Visual and Narrative Exploration for Critical Praxis,” The Library Quarterly 89, no. 4 (October 1, 2019): 348–61, https://doi.org/10.1086/704965.

38 Allison, Multiple Regression: A Primer, 1.

39 Geoff Cumming, Understanding the New Statistics: Effect Sizes, Confidence Intervals, and Meta-Analysis (New York: Routledge, 2012), 26.
40 Peter Bruce and Andrew Bruce, “Regression and Prediction,” in Practical Statistics for Data Scientists: 50 Essential Concepts (Sebastopol, CA: O’Reilly Media, 2017).

11811 ---- Near-field Communication (NFC): An Alternative to RFID in Libraries ARTICLES Near-field Communication (NFC) An Alternative to RFID in Libraries Neeraj Kumar Singh INFORMATION TECHNOLOGY AND LIBRARIES | JUNE 2020 https://doi.org/10.6017/ital.v39i2.11811 Neeraj Kumar Singh (neerajkumar78ster@gmail.com), PhD, is Deputy Librarian, Panjab University, Chandigarh, India.

ABSTRACT

Libraries are central agencies for the dissemination of knowledge. Every library aspires to provide maximum opportunities to its users and to ensure optimum utilization of available resources. Hence, libraries have been seeking technological aids to improve their services. Near-field communication (NFC) is a type of radio-frequency technology that allows electronic devices—such as computers, mobile phones, tags, and others—to exchange information wirelessly across a small distance. The aim of this paper is to explore NFC technology and its applications in the modern era. The paper will discuss the potential use of NFC in the advancement of traditional library management systems.

INTRODUCTION

Similar to other identification technologies such as radio-frequency identification (RFID), barcodes, and QR codes, near-field communication (NFC) is a short-range (4–10 cm) wireless communication technology. NFC is based on the existing 13.56 MHz RFID contactless card standards, which have been established for several years and are used for payment, ticketing, electronic passports, and access control, among many other applications. Data rates range from 106 to 424 kilobits per second. A few NFC devices are already capable of supporting up to 848 kilobits per second, which is now being considered for inclusion in the NFC Forum specifications.1
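For a rough sense of what these data rates mean in practice, the following small sketch (in Python) computes the nominal time to move a one-kilobyte payload at each rate; these are raw air rates only, and protocol overhead, which is not modeled here, makes real transfers slower.

# Nominal transfer time for a 1 KB payload at the NFC air rates above.
payload_bits = 1024 * 8
for kbps in (106, 424, 848):
    seconds = payload_bits / (kbps * 1000)
    print(f"{kbps} kbit/s -> {seconds * 1000:.1f} ms")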
Compared to other wireless communication technologies, NFC is designed for proximity or short-range communication, which provides a dedicated read zone and some inherent security. Its 13.56 MHz frequency places it within the ISM band, which is available worldwide. It is a bi-directional communication technology, meaning that data can be exchanged in both directions, with a typical range of 4–10 cm depending on the antenna geometry and the output power.2 NFC is convenient and fast: the action is automatically triggered when a phone comes within 10 cm of an NFC tag, giving instant access to the content on the mobile device without a single click.3

RFID and NFC technologies are similar in that both use radio waves, and both exchange data between electronic devices in active mode as well as in passive mode. In active mode, outgoing signals come directly from the power source, whereas in passive mode the signals use the reflected energy they have received from the active signal. In RFID technology the radio waves can send information to receivers up to hundreds of meters away, depending on the frequency of the band used by the tag. Provided with a high amount of power, these signals can also be sent over extreme distances, as in the case of airport radar, which at large airports typically controls traffic within a radius of 100 kilometers of the airport below an elevation of 25,000 feet. RFID is also used very often in tracking animals and vehicles.

In contrast, items like passports and payment cards should not be capable of long-distance transmissions because of the threat of theft of personal information or funds. NFC is designed to meet this need. NFC tags are very small, so they fit on the inner side of devices and products, such as inside luggage, purses, and packs, as well as inside wallets and clothing, and can be tracked. NFC technology has added security features that make it much more secure than the previously popular RFID equivalent, and it is difficult to steal information stored in it. NFC has a short working range compared to other wireless technologies, so it can be widely used for payments, ticketing, and service admittance, and thus has proved to be a safer technology. It is because of this security feature that the technology is used in cellular phones to turn them into a wallet.4

The main differences between NFC and RFID are:

• Though both RFID and NFC use radio frequencies for communication, NFC can be said to be an extension of RFID technology. RFID technology has been in use for more than a decade, but NFC has emerged on the scene more recently.

• RFID has a wider range, whereas NFC has limited communication and operates only at close proximity. NFC typically has a range of a few centimeters.

• RFID can function at many frequencies and many standards are in use, but NFC requires a fixed frequency of 13.56 MHz, and some other fixed technical specifications, to function properly.

• RFID technology can be used for applications such as item tracking, automated toll collection on roads, and vehicle movement, which require wide-area signals.
NFC is appropriate for applications that carry data that needs to be kept secure, such as mobile payments and access controls, which involve sensitive information.

• RFID operates over long distances while exchanging data wirelessly, so it is not secure for applications that store personalized data, and items using RFID are susceptible to various fraud attacks such as data corruption. NFC’s short working range considerably reduces the risk of data theft, eavesdropping, and “man in the middle” attacks.

• NFC has the capability to communicate both ways and thus is suitable for advanced interactions such as card emulation and peer-to-peer sharing.

• A number of RFID tags can be scanned simultaneously, while only a single NFC tag can be scanned at a time.

HOW NFC WORKS

The extended functionality of a traditional RFID system led to the formation of the NFC Forum. The NFC Forum has defined three operating modes for NFC devices: reader/writer mode, peer-to-peer mode, and card-emulation mode (see figure 1). The NFC Forum technical specifications for the different operating modes are based on ISO/IEC 18092 NFCIP-1, JIS X 6319-4, and ISO/IEC 14443. These specifications must be used to derive the full benefit from the capabilities of NFC technology. Contactless smart card standards are referred to as NFC-A, NFC-B, and NFC-F in NFC Forum specifications.5

Figure 1. NFC Operation Modes6

Reader/Writer Mode

In reader/writer mode (see figure 2), an NFC-enabled device is capable of reading NFC Forum-mandated tag types, such as a tag embedded in an NFC smart poster. This mode allows NFC-enabled devices to read the information that is stored on NFC tags embedded in smart posters and displays. Since these tags are relatively inexpensive, they provide a great marketing tool for companies.

Figure 2. Reader Mode7

The reader/writer mode on the radio frequency interface is compliant with the NFC-A, NFC-B, and NFC-F schemes. Examples of its use include reading timetables, tapping for special offers, and updating frequent flyer points.8

Peer-to-Peer Mode

In peer-to-peer mode (see figure 3), both devices must be NFC-enabled in order for them to communicate with each other to exchange information and share files. The users of NFC-enabled devices can thus quickly share information and other files with a touch. As an example, users can exchange data such as digital photos or virtual business cards via Bluetooth or WiFi.

Figure 3. Peer-to-Peer Mode9

Peer-to-peer mode is based on the NFC Forum’s Logical Link Control Protocol Specification and is standardized in the ISO/IEC 18092 standard.

Card-Emulation Mode

In card-emulation mode (see figure 4), an NFC device behaves like a contactless smart card so that users can perform transactions such as purchases, ticketing, and transit access control with just a touch. An NFC device may have the ability to emulate more than one card. In card-emulation mode, an NFC-enabled device communicates with an external reader much like a traditional contactless smart card. This allows contactless payments and ticketing by NFC-enabled devices without changing the existing infrastructure.

Figure 4. Card-Emulation Mode

By adding NFC to a contactless infrastructure one can enable two-way communications. In the air transport sector, this could simplify many operations such as updating seat information while boarding or adding frequent flyer points while making a payment.10
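To make reader/writer mode concrete, here is a minimal Python sketch using the open-source nfcpy library (our choice for illustration; the article does not prescribe any particular library or platform). It waits for a tag to enter the read zone and prints any NDEF records stored on it.

import nfc  # nfcpy: pip install nfcpy

def on_connect(tag):
    # Called when a tag enters the read zone (roughly 4-10 cm).
    print("Tag UID:", tag.identifier.hex())
    if tag.ndef is not None:
        for record in tag.ndef.records:
            print("NDEF record:", record)
    return True  # hold the connection until the tag is removed

# Open a USB contactless reader and poll in reader/writer mode.
with nfc.ContactlessFrontend("usb") as clf:
    clf.connect(rdwr={"on-connect": on_connect})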
NFC STANDARDS AND SPECIFICATIONS

The NFC specifications are defined by an industry organization called the NFC Forum, which has nearly 200 member companies. The NFC Forum was formed in 2004 with the objective of advancing the use of NFC technology. This was achieved by educating the market about NFC technology and developing specifications to ensure interoperability among devices and services. The NFC Forum members work together in task forces and working groups. As noted earlier, NFC technology is based on existing 13.56 MHz RFID standards and includes several protocols such as ISO 14443 type A and type B, and JIS X 6319-4 (which is also a Japanese Industrial Standard, known as Sony FeliCa). The ISO 15693 standard, an additional 13.56 MHz protocol established in the market, is being integrated into the NFC specification by an NFC Forum task force. Smartphones in the market already support the ISO 15693 protocol.11 These NFC specifications, and especially the specifications for the extended NFC functionalities, are in turn standardized by international standards organizations such as ISO/IEC, ECMA, and ETSI.12

Initially, the RFID standards ISO/IEC 14443 A, ISO/IEC 14443 B, and JIS X6319-4 were also promoted as NFC standards by different companies working in the field, such as NXP, Infineon, and Sony. The first NFC standard was ECMA 340, based on the air interfaces of ISO/IEC 14443 A and JIS X6319-4; ECMA 340 was subsequently adopted as the ISO/IEC standard 18092. At the same time, major credit card companies like Europay, Mastercard, and Visa introduced the EMVCo payment standard, which is based on ISO/IEC 14443 A and ISO/IEC 14443 B. These groups harmonised the over-the-air interfaces within the NFC Forum. They are named NFC-A (ISO/IEC 14443 A based), NFC-B (ISO/IEC 14443 B based), and NFC-F (FeliCa based).13

NFC Tags

An NFC tag is a small microchip embedded in a sticker or wristband that can be read by mobile devices that are within range. Information regarding the item is stored in these microchips.14 An NFC tag has the capability to send the information stored on it to NFC-enabled mobile phones. NFC tags can also perform various actions, such as changing the settings of handsets or even launching a website.15 Tag memory capacity varies by the type of tag. For example, a tag may store a phone number or a URL.16 The most common use of the NFC tag function on an object is mobile wallet payment processing, where the user swipes or flicks a mobile phone on an NFC tag to make payment. Google’s version of this system is Google Wallet.17

Figure 5. A Quick Overview of the Tag Types18

Applications of NFC

Since it emerged as a standard technology in 2003, NFC technology has been implemented across multiple platforms in various ways. The primary driving force behind NFC is its application in the commercial sector, in which the implementation of the technology focuses on such areas as sales and marketing. Many new and interesting applications are also emerging in other fields such as education and healthcare. All of these may impact libraries, librarians, and library users, either by prompting adaptations to existing collections and services or inspiring innovation in our profession.19

• Mobile payment: Customers with NFC-enabled smartphones can link them with their bank accounts and are able to pay by simply tapping their phones to an NFC-enabled point-of-sale terminal.20
All of these may impact libraries, librarians, and library users, either by prompting adaptations to existing collections and services or inspiring innovation in our profession.19 • Mobile payment: Customers with NFC-enabled smartphones can link with their bank accounts and are able to pay by simply tapping phones to an NFC-enabled point-of-sale.20 INFORMATION TECHNOLOGY AND LIBRARIES JUNE 2020 NEAR FIELD COMMUNICATION (NFC) | SINGH 7 • Access and authentication: “keyless access” to restricted areas, cars, and other vehicles. One can imagine other potential uses of NFC in the future with the devices in the home being controlled by it.21 • Transportation and ticketing: NFC-enabled phones can connect with an NFC-enabled kiosk to download a ticket, or the ticket can be sent directly to an NFC-enabled phone over the air (OTA). The phone can then tap a reader to redeem that ticket and gain access. 22 • Mobile marketing: NFC tags they can be embedded into the indoor and outdoor signage. Upon tapping their smartphone on an NFC-enabled smart poster, the customer can read a consumer review, visit a website, or even view a movie trailer. • Healthcare: NFC medical cards and bracelet tags can store relevant, up-to-date patient information like health history, allergies, infectious diseases, etc. • Gaming: NFC technology is the bridge between physical and digital games. Players can tap each other’s phones together and earn extra points or receive access to a new level, or get clues, by using NFC application.23 • Inventory tracking, smart packaging, and shelf labels: NFC-tagged objects could provide a wide variety of information in different use environments. NFC-enabled smartphones can be used to tap the tags to access book reviews and information about the book’s author and recommend the book to other readers. Users could check out a book or add it to a wish list to check out at a later date. Indeed, with NFC, library records and metadata could theoretically be stored on and retrieved from library physical holdings themselves, allowing a patron to tap a book or resource borrowed from the library to recall its title, author, and due date.24 APPLICATIONS OF NFC IN LIBRARIES: INTRODUCING THE SMART LIBRARY Some libraries are beginning to use NFC technology as an alternative to RFID. Yusof et al. proposed a newly developed application called the Smart Library, or “S-Library,” that has adopted the NFC technology.25 In the S-Library, library users can perform many library transactions just by using their mobile smartphones with integrated NFC technology. The users of S-Library are required to download and install an app in their compatible mobile phone. This app provides the user relevant and easy to use library functionality such as searching, borrowing, returning, and viewing their transaction records. In this S-Library model the app is integrated with the library management software. The S-library app needs to be installed on the mobile device, and the mobile device requires an internet connection that will connect it to the LMS. The S-Library provides five major functionalities to the user: scan, search, borrow, return, and transaction history. In the scanning function, users can access the information of a book by simply touching their mobile phone to the NFC tag on the book. As soon as the phone touches the book, information regarding its title, author, contents, synopsis, etc. will automatically be displayed on the screen of the mobile device. 
Users can search for books by entering keywords such as book title, author name, or year. Through the borrowing function, the app allows users to check out books of interest: the user just needs to touch their mobile phone to the NFC-tagged book to borrow it, and the transaction is automatically stored in the LMS database. The returning process is similar. The user selects the return function on the menu and touches the mobile device to the book, and the return transaction is automatically performed and stored in the LMS database. However, the library must ensure that the book is physically returned through its NFC-enabled book drop system, and only then should the transaction be updated in the LMS. The user can check the due date for the current transaction as well as their transaction history. The transaction history function allows the user to view a list of the books they have borrowed over time and their status.26

Data transmission for NFC technology can be up to 848 kilobits/second, whereas the data transmission rate with RFID technology is 484 kilobits/second. Taking advantage of this high data rate, the response time for the S-Library is also very fast. This is a huge improvement over RFID technology, and especially over barcode technology, where the data transmission rate is variable, inconsistent, and dependent upon the quality of the barcodes. The second key advantage of the S-Library is that the time taken to read a tag (the communication time between a reader and an NFC-enabled device) is very short. The third advantage of NFC is its usability in comparison to the other two technologies. NFC technology is human-centric because it is intuitive and fast, and the user is able to use it anywhere, anytime, using their mobile phone. With RFID and barcode technology, usability is item-centric, as the person has to go to a specific device located in the library.27

Most of the shortcomings of RFID and barcode technology have been overcome by the S-Library. With barcode technology, barcode quality, printing clarity, print contrast ratio, and a low level of security were all challenges. RFID technology has drawbacks of its own, such as a lack of common RFID standards, security vulnerabilities, and reader and tag collision, which happens when multiple tags are energized by the RFID reader simultaneously and reflect their respective signals back to the reader at the same time. Because NFC is touch-based, it presents a viable alternative that helps library users overcome these weaknesses of the older technologies. Yusof et al. found many advantages to the S-Library: faster book borrowing; time savings for users as well as library staff; a connection that can be initialized in less than a second; no required configuration on the mobile device; and higher usability ratings and security.28 However, the S-Library also has some limitations. First, device compatibility is an issue, because the S-Library presently supports only the Android platform. Second, as the S-Library application only supports up to a 10-centimeter range, coverage is an issue.
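To make the transaction flow concrete, here is a minimal sketch of the borrow and return operations that an S-Library-style app would trigger against the LMS. The class, field names, and three-week loan period are illustrative assumptions, not details of Yusof et al.’s implementation.

```python
from datetime import date, timedelta

LOAN_PERIOD_DAYS = 21  # assumption: a three-week loan period

class SLibraryBackend:
    """Toy stand-in for the LMS database an S-Library-style app would call."""

    def __init__(self):
        self.loans = {}    # tag_id -> (patron_id, due_date)
        self.history = []  # (patron_id, tag_id, action, date) tuples

    def borrow(self, patron_id: str, tag_id: str) -> date:
        """Record a checkout when the phone touches the book's NFC tag."""
        if tag_id in self.loans:
            raise ValueError("Item is already checked out")
        due = date.today() + timedelta(days=LOAN_PERIOD_DAYS)
        self.loans[tag_id] = (patron_id, due)
        self.history.append((patron_id, tag_id, "borrow", date.today()))
        return due

    def return_item(self, tag_id: str) -> None:
        """Record a return; in practice this follows the physical book drop."""
        patron_id, _ = self.loans.pop(tag_id)
        self.history.append((patron_id, tag_id, "return", date.today()))

lms = SLibraryBackend()
print("Due:", lms.borrow("patron-001", "tag-9781234567897"))
lms.return_item("tag-9781234567897")
```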
Mobile Payments

NFC technology can be used for several library functions such as making payments, paying library fines, purchasing tickets to library events, or donating to the library. Users may also be able to use their digital wallet to pay for photocopying, printing, scanning, etc. To prepare for such uses, libraries should inquire about the possibility of adding NFC payment capabilities to their existing hardware and should also consider NFC support when purchasing new machines. Already, Bibliotheca’s Smartserv 1000 self-serve kiosk, introduced in September 2013, includes NFC as a payment option. Other library automation companies would also be worth monitoring for NFC integration.29

Library Access and Authentication

NFC-enabled devices can be used to access the library and to authenticate users. These capabilities suggest that NFC technology may play an important role in the next generation of identity management systems. Of particular interest in this context are several applications of NFC in two-factor authentication, which generally combines a traditional password or other digital credential with a physical, NFC-enabled component. For example, an authentication system could require the user to type in a fixed password in addition to tapping an NFC-enabled phone, identity card, or ring to the device they are logging in to. IBM has demonstrated a two-factor authentication method for mobile payment in which a user first types in a password and then taps an NFC-enabled credit card, issued by their bank, to their NFC-enabled smartphone. Libraries could investigate similar access and authentication applications for NFC, both for internal use (staff badges and keys) and for public services. Particularly if NFC mobile payment finally gains consumer traction, library patrons may begin to expect that they can use their NFC-enabled mobile devices to replace not just their credit cards but also their library cards. Already, D-Tech’s RFID AIR Self Check unit allows library patrons to log into their user accounts by tapping their NFC-enabled phone to the kiosk. The patron then uses the kiosk’s RFID reader to check out their library materials and receives a receipt via email or SMS. Beyond its application in circulation, NFC authentication can be applied to streamline access to other library services and resources.30 NFC-enabled devices could be used to reserve library spaces: classrooms, auditoriums or community halls, digital media labs, meeting rooms, etc. Library users could use NFC authentication to access digital library resources, such as databases, e-journals, e-book collections, and other digital collections. NFC might allow libraries of all kinds to provide more convenient access and authentication options to users, though privacy and security considerations would certainly need to be addressed. NFC access and authentication will also have an impact on academic libraries. At universities where NFC access systems are deployed, student identification cards can be replaced with NFC-enabled mobile phones for after-hours services such as library building entry, WiFi access, and printing, copying, and scanning services. The inconvenience of multiple logins can be eliminated. However, libraries will have to take responsibility for protecting student information and library resources with added security.31
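The two-factor pattern described above reduces to two independent checks: something the user knows (a password) and something the user has (an NFC token that answers a challenge). The following standard-library Python sketch illustrates the logic only; a real deployment would use salted, deliberately slow password hashing and a secure element on the token rather than the bare SHA-256 and shared key shown here.

```python
import hashlib
import hmac

def verify_two_factor(password: str, stored_pw_hash: bytes,
                      tag_key: bytes, challenge: bytes,
                      tag_uid: bytes, tag_response: bytes) -> bool:
    # Factor 1: something the user knows. (Illustration only; real systems
    # use salted, slow hashes such as bcrypt or argon2.)
    pw_ok = hmac.compare_digest(
        hashlib.sha256(password.encode()).digest(), stored_pw_hash
    )
    # Factor 2: something the user has. The NFC token proves possession of
    # its key by answering a random challenge from the login system.
    expected = hmac.new(tag_key, challenge + tag_uid, hashlib.sha256).digest()
    tag_ok = hmac.compare_digest(expected, tag_response)
    return pw_ok and tag_ok
```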
Promotion of Library Services

Librarians can borrow ideas from commercial implementations of NFC-based marketing to enhance promotions for library resources, services, and events. As a first step, as Kane and Schneidewind suggested, NFC tags can complement several promotional uses of QR codes that have already been piloted or implemented in libraries.32 For promotional use, libraries can easily embed NFC tags in their new book displays, linking them to bestseller lists or current acquisitions lists in the library catalog or digital collections. Similarly, if the reference book collection is tagged with NFC tags, it could be linked to the relevant digital collections of databases or e-books. NFC tags placed on library building doors or on library promotional material could share information such as library hours, opening days, schedules of events, membership rules, or floor plans for the building. As an example, at the Renison University College Library in Ontario, Canada, visitors can tap an NFC-enabled “library smartcard” to retrieve a digital brochure of library services in a variety of formats, including PDF, EPUB, and MP3.33

To promote outreach programs and events, libraries can take advantage of NFC’s interactive capabilities instead of merely sharing links. For example, libraries could use NFC tags on their event posters so that users can scan them to register for an event, save the event to their personal calendar, join the Friends of the Library program, or even download a library app. Users could tap a smart poster promoting a virtual reference service to send a text message to a librarian. NFC-enabled promotional materials can engage users with library content even when they are outside of the library building itself. A brilliantly creative example was created by the Field Museum of Chicago, which used NFC-enabled outdoor smart posters throughout the city to promote an exhibit on the 1893 World’s Fair. The event posters depicted a figure from 1893 inviting the viewer to “See What They Saw.” Users could tap their NFC-enabled mobile device to the smart poster (or read a QR code) to download an app from the Field Museum that included 360° images of the fair as well as videos highlighting items in the exhibition.34
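At the data level, a smart poster of this kind is often just a tag carrying an NDEF URI record. As a minimal sketch, assuming the NFC Forum URI record type (type “U” with a one-byte URI-prefix code) and a payload small enough for a short record, the bytes written to such a tag could be built like this; the event URL is a hypothetical placeholder:

```python
# Build one short NDEF "U" (URI) record, assuming the NFC Forum URI record
# type definition and a payload under 256 bytes (short-record form).
def ndef_uri_record(uri: str) -> bytes:
    # Common URI abbreviation codes; longest prefixes are checked first.
    prefixes = {"https://www.": 0x02, "http://www.": 0x01,
                "https://": 0x04, "http://": 0x03}
    code, rest = 0x00, uri
    for prefix, c in prefixes.items():
        if uri.startswith(prefix):
            code, rest = c, uri[len(prefix):]
            break
    payload = bytes([code]) + rest.encode("utf-8")
    header = 0xD1  # MB=1, ME=1, SR=1, TNF=0x01 (well-known type)
    return bytes([header, 0x01, len(payload), 0x55]) + payload  # type 'U'

# Hypothetical event page for a library smart poster
print(ndef_uri_record("https://library.example.edu/events/worlds-fair").hex())
```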
Inventory Control

The smart packaging use case brings forward a very important question for libraries that use RFID for inventory control: can existing RFID tags and infrastructure be leveraged to provide additional services to patrons with NFC-enabled mobile devices? The concept is not new; Walsh envisioned using library RFID tags to store book recommendations or other digital information, which users could then access with a conveniently located RFID reader.35 What NFC brings to Walsh’s vision is that a dedicated RFID reader may no longer be necessary; a patron could use their own NFC-enabled smartphone to read a tag rather than taking it to a special location to be read. An exciting and immediate use for NFC in libraries is self-checkout: a patron browsing the stacks could tap an NFC-tagged book with their NFC-enabled phone to check it out without visiting the circulation desk or waiting in line.36

Smart Packaging

A sector close to librarians’ hearts is publishing, and several publishers have started testing smart packaging for books, using embedded NFC tags to share additional content with readers such as book reviews, reading lists, etc. With digital extras, the concept of smart packaging has significant implications for libraries, offering a new opportunity to connect physical collections with digital content. One can envision a future in which a user who taps an NFC-enabled library book gets access to relevant digital information: bibliographic information in a variety of citation formats, editorial reviews, the author’s biography, a projected rating for the book, and links to other similar information.

Borrowing and Returning Books

One of a library’s key functions is circulating physical books from its collections. Due to the low cost of barcode technology, many libraries around the world use it for circulation management. However, barcode technology has several constraints: it requires a line of sight to the barcode, it does not provide security for the library collection, it does not offer any benefit for collection management, and it makes it challenging for libraries to satisfy the increasing demands of their users (for example, reserving checked-out books or checking transaction history). This leads to the need for a new technology to improve circulation management, inventory, and the security of library collections. Librarians are known as early adopters of technology and have started using RFID to provide circulation services more effectively and efficiently, to secure library collections, and to satisfy the increasing demands of users; for example, tagging books allows multiple books to be checked out together by placing a stack of books near a reader.

RECOMMENDATIONS

According to McHugh and Yarmey, the implementation of NFC has been slow and unsteady, and they do not foresee immediate implementation in libraries.37 However, they recommend that librarians learn about and prepare for NFC. They suggest, for example, that librarians:

• follow the progress of research and scholarship on NFC and the commercial progress of NFC technology to better anticipate its adoption in your community;
• experiment with NFC technology and develop prototype applications for NFC use in the library;
• offer an informational workshop on NFC for users and library colleagues;
• inquire with the RFID vendor about tag compatibility with NFC and the ability to rewrite tags;
• monitor the progress of security and privacy aspects of NFC technology and educate users about these issues;
• develop or update your library security policy;
• allow patrons to “opt in” to any NFC services at your library, providing other modes of communication where possible;
• develop and share best practices for NFC implementations; and
• support research on NFC in libraries via planning grants, research forums, and conference sessions.

CONCLUSIONS

Beyond the potential benefits of NFC, librarians should also be aware of and prepared for the privacy and security concerns that accompany the technology.
User privacy is of the utmost concern. NFC involves users’ mobile devices generating, collecting, storing, and sharing a significant amount of personal data. Several of these functions, particularly mobile payment, necessitate the exchange of highly confidential data, including but not limited to a user’s financial accounts, purchase history, etc. Spam, the sending of unwanted content (e.g., advertisements, coupons, or adware) to users’ mobile devices without their consent, may also be a concern. Librarians should also use special caution when considering the implementation of NFC for library promotions or services.

Security is a significant concern and an active area of research, as many NFC implementations involve the exchange of sensitive financial or otherwise personal data. An important concept in NFC security, particularly in the context of mobile payment, is the idea of a tamper-proof “secure element” as a basic protection for sensitive or confidential data such as account information and credentials for authentication.38 Outside of continued standardization, the most effective measures for protecting NFC data transmissions are data encryption and the establishment of a secure channel between the sending and receiving devices (e.g., using a key agreement protocol and/or via SSL).
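As an illustrative sketch of what “key agreement plus encryption” means in practice, the snippet below derives a shared session key between two endpoints and encrypts a message with it, using the third-party Python cryptography package. This is the generic pattern only, not the specific protocol used by any NFC stack or payment scheme.

```python
# Generic key-agreement-plus-encryption sketch using the third-party
# "cryptography" package (pip install cryptography). Real NFC secure
# channels are negotiated by the underlying stack and/or a secure element.
import os

from cryptography.hazmat.primitives import hashes
from cryptography.hazmat.primitives.asymmetric.x25519 import X25519PrivateKey
from cryptography.hazmat.primitives.ciphers.aead import AESGCM
from cryptography.hazmat.primitives.kdf.hkdf import HKDF

reader_priv = X25519PrivateKey.generate()  # e.g., a self-check kiosk
phone_priv = X25519PrivateKey.generate()   # e.g., a patron's device

# Each side combines its own private key with the other's public key;
# both arrive at the same shared secret without ever transmitting it.
shared = reader_priv.exchange(phone_priv.public_key())
assert shared == phone_priv.exchange(reader_priv.public_key())

# Derive a symmetric session key from the shared secret.
session_key = HKDF(algorithm=hashes.SHA256(), length=32, salt=None,
                   info=b"illustrative nfc channel").derive(shared)

# Encrypt application data (e.g., a checkout message) over the channel.
nonce = os.urandom(12)
ciphertext = AESGCM(session_key).encrypt(nonce, b"checkout:item-42", None)
```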
For security concerns, as with privacy concerns, librarians have a crucial role to play in user education. There are important steps that individual users can and should take to protect their devices, such as setting a lock code for the device, knowing how to remotely wipe a stolen phone, and installing and regularly updating antivirus software. However, many users are unaware of the vulnerability of their mobile devices and often fail to enact even basic protections.

By empowering objects and people to communicate with each other at a different level and establishing a “touch to share” paradigm, NFC technology has the potential to transform the information environment surrounding our libraries and fundamentally alter the ways in which library patrons interact with information.

ENDNOTES

1 Doaa Abdel-Gaber and Abdel-Aleem Ali, “Near-Field Communication Technology and Its Impact in Smart University and Digital Library: Comprehensive Study,” Journal of Library and Information Sciences 3, no. 2 (December 2015): 43–77, https://doi.org/10.15640/jlis.v3n2a4.

2 “NFC Technology: Discover What NFC Is, and How to Use It,” accessed March 17, 2019, https://www.unitag.io/nfc/what-is-nfc.

3 Apuroop Kalapala, “Analysis of Near Field Communication (NFC) and Other Short Range Mobile Communication Technologies” (project report, Indian Institute of Technology, Roorkee, 2013), accessed March 19, 2019, https://idrbt.ac.in/assets/alumni/PT-2013/Apuroop%20Kalapala_Analysis%20of%20Near%20Field%20Communication%20(NFC)%20and%20other%20short%20range%20mobile%20communication%20technologies_2013.pdf.

4 Ed, “Near Field Communication vs Radio Frequency Identification,” accessed March 10, 2019, http://www.nfcnearfieldcommunication.org/radio-frequency.html.

5 “What It Does,” NFC Forum, accessed March 12, 2019, https://nfc-forum.org/what-is-nfc/what-it-does.

6 José Bravo et al., “m-Health: Lessons Learned by m-Experiences,” Sensors 18, 1569 (2018): 1–27, https://doi.org/10.3390/s18051569.

7 Vedat Coskun, Busra Ozdenizci, and Kerem Ok, “The Survey on Near Field Communication,” Sensors 15, no. 6 (2015): 13348–405, https://doi.org/10.3390/s150613348.

8 Coskun, Ozdenizci, and Ok, “The Survey on Near Field Communication,” 13352.

9 Coskun, Ozdenizci, and Ok, “The Survey on Near Field Communication.”

10 “How NFC Works?,” CNRFID, accessed January 12, 2019, http://www.centrenational-rfid.com/how-nfc-works-article-133-gb-ruid-202.html.

11 Coskun, Ozdenizci, and Ok, “The Survey on Near Field Communication,” 13352.

12 C. Ruth, “NFC Forum Calls for Breakthrough Solutions for Annual Competition,” accessed March 21, 2019, https://nfc-forum.org/newsroom/nfc-forum-calls-for-breakthrough-solutions-for-annual-competition/.

13 M. Roland, “Near Field Communication (NFC) Technology and Measurements,” accessed May 12, 2019, https://cdn.rohdeschwarz.com/pws/dl_downloads/dl_application/application_notes/1ma182/1MA182_5E_NFC_WHITE_PAPER.pdf.

14 Roland, “Near Field Communication (NFC) Technology and Measurements.”

15 “What is a Near Field Communication Tag (NFC Tag)?,” Techopedia, accessed May 27, 2019, https://www.techopedia.com/definition/28812/near-field-communication-tag-nfc-tag.

16 “What is Meant by the NFC Tag?,” Quora, accessed July 12, 2019, https://www.quora.com/What-is-meant-by-the-NFC-tag.

17 S. Profis, “Everything You Need to Know About NFC and Mobile Payments,” CNET, accessed June 27, 2019, https://www.cnet.com/how-to/how-nfc-works-and-mobile-payments/.

18 “The 5 NFC Tag Types,” accessed March 24, 2019, https://www.dummies.com/consumer-electronics/5-nfc-tag-types/.

19 Abdel-Gaber and Ali, “Near-Field Communication Technology and Its Impact in Smart University and Digital Library,” 64–71.

20 Iviane Ramos de Luna et al., “NFC Technology Acceptance for Mobile Payments: A Brazilian Perspective,” Review of Business Management 19, no. 63 (2017): 82–103, https://doi.org/10.7819/rbgn.v0i0.2315.
21 Rajiv, “Applications and Future of Near Field Communication,” accessed March 14, 2019, https://www.rfpage.com/applications-near-field-communication-future/.

22 “NFC in Public Transport,” NFC Forum, accessed April 12, 2019, http://www.smart-ticketing.org/downloads/papers/NFC_in_Public_Transport.pdf.

23 “Gaming Applications with RFID and NFC Technology,” SmartTech, accessed May 14, 2019, https://www.smarttec.com/en/applications/gaming.

24 Sheli McHugh and Kristen Yarmey, “Near Field Communication: Recent Developments and Library Implications,” Synthesis Lectures on Emerging Trends in Librarianship 1, no. 1 (March 2014): 1–93.

25 M.K. Yusof et al., “Adoption of Near Field Communication in S-Library Application for Information Science,” New Library World 116, no. 11/12 (2015): 728–47, https://doi.org/10.1108/nlw-02-2015-0014.

26 Yusof et al., “Adoption of Near Field Communication,” 734–36.

27 Yusof et al., “Adoption of Near Field Communication,” 744.

28 Yusof et al., “Adoption of Near Field Communication,” 745.

29 Abdel-Gaber and Ali, “Near-Field Communication Technology and Its Impact in Smart University and Digital Library,” 64.

30 McHugh and Yarmey, “Near Field Communication,” 27.

31 McHugh and Yarmey, “Near Field Communication,” 734.

32 Danielle Kane and Jeff Schneidewind, “QR Codes as Finding Aides: Linking Electronic and Print Library Resources,” Public Services Quarterly 7, no. 3–4 (2011): 111–24, https://doi.org/10.1080/15228959.2011.623599.

33 McHugh and Yarmey, “Near Field Communication,” 31.

34 McHugh and Yarmey, “Near Field Communication,” 31.

35 Andrew Walsh, “Blurring the Boundaries between our Physical and Electronic Libraries: Location-Aware Technologies, QR Codes and RFID Tags,” The Electronic Library 29, no. 4 (2011): 429–37, https://doi.org/10.1108/02640471111156713.

36 Projes Roy and Shailendra Kumar, “Application of RFID in Shaheed Rajguru College of Applied Sciences for Women Library, University of Delhi, India: Challenges and Future Prospects,” Qualitative and Quantitative Methods in Libraries 5, no. 1 (2016): 117–130, http://www.qqml-journal.net/index.php/qqml/article/view/310.

37 McHugh and Yarmey, “Near Field Communication,” 61–62.

38 Garima Jain and Sanjeet Dahiya, “NFC: Advantages, Limits and Future Scope,” International Journal on Cybernetics & Informatics 4, no. 4 (2015): 1–12, https://doi.org/10.5121/ijci.2015.4401.
11837 ---- Creating and Managing a Repository of Past Exam Papers

COMMUNICATIONS

Creating and Managing a Repository of Past Exam Papers

Mariya Maistrovskaya and Rachel Wang

INFORMATION TECHNOLOGY AND LIBRARIES | MARCH 2020
https://doi.org/10.6017/ital.v39i1.11837

Mariya Maistrovskaya (mariya.maistrovskaya@utoronto.ca) is Digital Publishing Librarian, University of Toronto. Rachel Wang (rachel.wang@utoronto.ca) is Application Programmer Analyst, University of Toronto.

ABSTRACT

Exam period can be a stressful time for students, and having examples of past papers to help prepare for the tests can be extremely helpful. It is possible that past exams are already shared on your campus—by professors in their specific courses, via student unions or groups, or between individual students. In this article, we will go over the workflows and infrastructure to support the systematic collection, provision of access to, and repository management of past exam papers. We will discuss platform-agnostic considerations of opt-in versus opt-out submission, access restriction, discovery, retention schedules, and more. Finally, we will share the University of Toronto setup, including a dedicated instance of DSpace, batch metadata creation and ingest scripts, and our submission and retention workflows that take into account the varying needs of stakeholders across our three campuses.

BACKGROUND

The University of Toronto (U of T) is the largest academic institution in Canada. It spans three campuses and serves more than 90,000 students through its 700 undergraduate and 200 graduate programs.1 The University of Toronto structure is the product of its rich history and is thus largely decentralized. As a result, the management of undergraduate exams is carried out individually by each major faculty at the Downtown (St. George) Campus, and centrally at the University of Toronto Mississauga (UTM) Campus and the University of Toronto Scarborough (UTSC) Campus.

The Faculty of Arts and Science (FAS) at the St. George Campus has traditionally made exams from its departments available to students. In the pre-internet era, students were able to consult print and bound exams in departmental and college libraries’ reference collections. With the rise of online technologies, the FAS Registrar’s Office seized the opportunity to make access to past exams more equitable for students and worked with the University of Toronto Libraries (UTL) Information Technology Services (ITS) to digitize and make exams available online. They were initially shared electronically via the Gopher protocol and later via Docutek ERes, one of the first available course e-reserves systems.
After the UTL became an early adopter of the DSpace (https://duraspace.org/dspace/) open source platform for its institutional repository in 2003, the UTL ITS created a separate instance of DSpace to serve as a repository of old exams. The repository makes the last three years of exams from the FAS, UTM, and UTSC available online in PDF. About 5,500 exam papers are available to students with a U of T login at any given time.

Discussed below are some of the considerations in establishing and maintaining a repository of old exams on campus, along with practical recommendations and shared workflows from the UTL.

CONSIDERATIONS IN ESTABLISHING A REPOSITORY OF OLD EXAMS

If you are looking to establish a repository of old exams, these are some of the considerations to take into account when planning a new service or evaluating an existing one.

The Source of Old Exams

Depending on the level of centralization on your campus, exams may be administered by individual academic departments or submitted by instructors/admins into a single location and managed centrally. The stakeholders involved in this process may include the office of the registrar, campus IT, departmental admins or libraries, etc. Establishing a relationship with such stakeholders is key to getting access to the files. When arranging to receive electronic files, consider whether they could be accompanied by existing metadata. Alternatively, if the university archives or records management already receive copies of campus exams, you may be able to obtain them there. Print versions will need to be digitized for online access; later in this article we will share metadata creation strategies for this scenario. It is also possible that exams may be collected in less formal ways, for example, via exam drives by student unions and groups.

The UTL works closely with the FAS Registrar’s Office to receive a batch of exams annually. The UTL receives a copy of print FAS exams that get digitized by the ITS staff. The UTL also receives exams from two U of T campuses, UTM and UTSC, that arrive in electronic format via the campus libraries. The U of T Engineering Society and the Faculty of Law each maintain their individual exam repositories, and the Arts and Science Student Union maintains a bank of term tests donated by students.

Content Hosting and Management

One of the key questions to answer is which campus department or unit will be responsible for hosting the exams, managing content collection, processing and uploads, and providing technical and user support. These responsibilities may be within the purview of a single unit or may be shared between stakeholders. Here are some examples of the tasks to consider:

1. Collecting exams from faculty or receiving them from a central location
2. Managing restrictions (exams that will not be made available online)
3. Digitizing exams received in print
4. Creating metadata or converting metadata received with the files
5. Uploading exams to the online repository
6. Removing exams from the online repository
7. Providing technical support and maintenance (e.g., platform upgrades, troubleshooting)
8. Providing user support (e.g., assistance with locating exams)

At U of T, tasks 1–2 are taken care of by Registrar Offices at FAS and UTM and by the Library at UTSC.
Tasks 3–8 are performed centrally by the UTL ITS, with the exception of digitization services for exams received from the UTM and UTSC campuses. Further details and considerations related to the content management system and processing pipelines are outlined in the “Infrastructure and Workflows” section below.

Collection Scope

Depending on the sources of your exams, you may need to establish scope rules for what gets included in the collection. For example:

• Will you only include final exams? Will term tests also be included?
• Will solutions be posted with the exams?
• Will additional materials, such as course syllabi, also be included?

At the UTL, only final exams are included in the repository, and no answers are supplied.

Exam Retention

Making old exams available online is always a balancing act between the interests of students, who want access to past test questions, and the interests of instructors, who may have a limited pool of questions to draw from or who may teach different course content over time and want to ensure that the questions continue to be relevant. At the UTL, in consultation with campus partners, the balance was achieved by only posting the three most recent years of exams in the repository. As soon as a new batch is received, the UTL removes the batch of exams more than three years old.

Opt-In versus Opt-Out Approach

Where exam collection is driven centrally by a registrar’s office, for example, that office may require that all past exams be made available to students. As with the retention considerations, the needs of instructors who draw questions from a limited pool can be accommodated via opt-outs, individual exam restrictions, and ad hoc take-down requests. An alternative approach to exam collection would be an opt-in model where faculty choose to submit exam questions on their own schedule.

At the UTL, the FAS and the UTM campus both operate under the opt-out model. The UTL receives all exam questions in regular batches unless they have been restricted by instructors’ requests. Occasional withdrawal requests from instructors require approval from the Registrar’s Office. Conversely, the UTSC campus operates under the opt-in model, where individual departments submit their exams to the library. While this model provides the most flexibility, the volume of exams received from this campus is consequently relatively small.

Repository Access

When making old exams available online, one of the things to consider is who will have access to them. Will the exams only be available to students of the respective academic department, or to all students, or to the general public? Will access be possible on campus as well as off campus? If the decision is made to restrict access, is there an existing authorization infrastructure in place that the repository could take advantage of, such as an institutional single sign-on or the library’s proxy access? At the UTL, access to the Old Exams Repository is provided through EZProxy in the same fashion as subscription resources made available via the library.

Discoverability and Promotion

How will students find out about the exams available in the repository?
Will the repository be advertised via the library’s website, promoted by course instructors, or linked with other course materials? Considering the challenge of promoting a resource like this alongside a variety of other library resources, it is preferable to make it known to students via the same channels through which they receive other course information. For many institutions this would be via their learning management system or their course information system. At U of T, the Old Exams Repository is linked from the Library website. Previously, the link was embedded in the university’s learning management system course template. With a recent transition to a new learning management system, such exposure is yet to be reestablished.

INFRASTRUCTURE AND WORKFLOWS

Minimum CMS Requirements

A repository of old exams does not require a specific content management system (CMS) or an off-the-shelf platform. Your institution may already have all the components in place to make it happen. Here are the minimum requirements you will want to see in such a system:

• File upload by staff (preferably in batch)
• File download by end users
• Basic descriptive metadata
• Search / browse interface
• Access control / authentication (if you choose to restrict access)

The UTL uses a stand-alone instance of DSpace for its Old Exams Repository. DSpace is open-source software for digital repositories used across the globe, primarily in academic institutions. The UTL chose this platform since it was already running an instance of DSpace for its institutional repository (IR) and had the infrastructure and expertise on site. However, this is not a solution we would recommend to an institution with no existing DSpace experience. While DSpace is an open-source platform, maintaining it locally requires significant staff expertise that may not be warranted considering that a collection of exams would only use a fraction of its robust functionality. If you do consider using DSpace, a hosted solution may be preferable in situations where local IT resources and expertise are limited.

Distributing Past Exams via an Existing Digital Repository

An institution that already maintains a digital repository may consider adding exams as a collection to the existing infrastructure. When choosing to do so, it is important to consider whether the exams use case may be different from your IR use case, and whether the new collection will fit in the existing mission and policies. Differences may include the following:

• Access level. IR missions tend to revolve around providing openly accessible materials, whereas exams may need to be restricted. Will your repository allow selective access restrictions for the exams collection?
• Longevity. IR materials are usually intended to be kept long-term, whereas exams may be on a retention schedule. For that reason, it also does not make sense to assign permanent identifiers to exams as many repositories do for their other materials.
• File types and metadata. Unlike the variety of research outputs and metadata usually captured in an IR, exams have a uniform object type and metadata. This makes them suitable for batch transformations and uploads.

Batch Metadata Creation Options

Because of the uniform object type, exams are well suited to batch processing, transformations, and uploads.
At UTL, metadata is created from the filenames of scanned PDF files by a Python script.2 The script breaks up the filename into Dublin Core metadata fields based on the pattern shown in figure 1. See figure 2 for a snippet of the script populating Dublin Core metadata fields.

Figure 1. File-naming pattern for metadata creation at UTL.

Figure 2. A screenshot of the UTL script generating Dublin Core metadata from filenames.
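The published UTL scripts contain the authoritative logic; purely as an illustrative sketch, the snippet below shows the general shape of such a filename-to-Dublin Core step, using a hypothetical filename pattern (the actual UTL pattern in figure 1 differs) and emitting the dublin_core.xml file that a DSpace Simple Archive item folder expects.

```python
import re
from xml.sax.saxutils import escape

# Hypothetical filename pattern for illustration only; the real UTL pattern
# (figure 1) and parsing logic live in the scripts cited in endnote 2.
PATTERN = re.compile(
    r"(?P<campus>[A-Z]+)_(?P<course>[A-Z0-9]+)_(?P<year>\d{4})(?P<term>[A-Z]{3})\.pdf$"
)

def filename_to_dublin_core(filename: str) -> str:
    """Turn an exam filename into the dublin_core.xml a DSpace Simple Archive expects."""
    match = PATTERN.match(filename)
    if match is None:
        raise ValueError(f"Unrecognized filename: {filename}")
    values = [
        ("title", "none", f"{match['course']} final exam, {match['term']} {match['year']}"),
        ("date", "issued", match["year"]),
        ("description", "none", f"Campus: {match['campus']}"),
    ]
    rows = "\n".join(
        f'  <dcvalue element="{element}" qualifier="{qualifier}">{escape(value)}</dcvalue>'
        for element, qualifier, value in values
    )
    return f"<dublin_core>\n{rows}\n</dublin_core>\n"

print(filename_to_dublin_core("UTSC_BIOA01_2019DEC.pdf"))
```

Alongside dublin_core.xml, each item folder in a DSpace Simple Archive also includes a contents file listing the bitstreams to ingest (here, the exam PDF).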
Once metadata is generated, the second Python script (figure 3) packages the PDF and metadata file into a DSpace Simple Archive (DSA), which is the format that DSpace accepts for batch ingests.

Figure 3. A screenshot of the UTL script packaging a PDF and metadata into a DSpace Simple Archive.

The DSA then gets batch uploaded into the respective campus and exam-period collections (figure 4) using the DSpace native batch import functionality. Figure 5 shows what an individual exam record looks like in the repository. After a new batch is uploaded, collections older than three years are removed from the repository. The UTL’s exams processing scripts are openly available on GitHub under an Apache License 2.0 (https://github.com/utlib/dspace-exams-ingest-scripts/).

Figure 4. A screenshot of collections in the UTL’s Old Exams Repository.

Figure 5. A screenshot of a record in the UTL’s Old Exams Repository.

CONCLUSION

Having access to examples of past exam questions can be extremely helpful to students in preparing for upcoming tests. It is possible that old exams are already being shared on your campus in official or unofficial ways, in print or electronically. Facilitating online sharing of electronic copies means that all students, on and off campus, will have equitable access to these valuable resources. We hope that the considerations and workflows outlined in this article will help institutions establish such services locally.

ACKNOWLEDGEMENTS

The authors would like to acknowledge the UTL librarians and staff who contributed to the setup and maintenance of the Old Exams Repository over the years: Marlene Van Ballegooie, metadata technologies manager, who operated the filename-to-Dublin Core metadata crosswalk; Sean Xiao Zhao, former applications programmer analyst, who converted it into Python; and Sian Meikle, associate chief librarian for digital strategies and technology, who was at the inception of the original exam-sharing service and provided valuable historical context and feedback on this article.

ENDNOTES

1 University of Toronto, “Quick Facts,” accessed November 4, 2019, https://www.utoronto.ca/about-u-of-t/quick-facts.

2 University of Toronto Libraries, “Exam Metadata Generation and Ingest for DSpace,” GitHub repository, last modified September 20, 2019, https://github.com/utlib/dspace-exams-ingest-scripts/.

11847 ---- Virtual Reality: The Next Big Thing for Libraries to Consider

Editorial Board Thoughts

Virtual Reality: The Next Big Thing for Libraries to Consider

Breanne Kirsch

INFORMATION TECHNOLOGY AND LIBRARIES | DECEMBER 2019
https://doi.org/10.6017/ital.v38i4.11847

Breanne Kirsch (Breanne.Kirsch@briarcliff.edu) is University Librarian, Briar Cliff University.

I had the pleasure of attending the EDUCAUSE Annual Conference from October 14–17, 2019. This was my first time at EDUCAUSE, but I was impressed with the variety of programs, vendors, and options for learning about technology and higher education. After recently completing my coursework for a second master’s in educational technology, I was curious to see what new technologies would be highlighted at EDUCAUSE. I found out about some new trends, such as the growth of esports in high schools and higher education. Esports are competitions in which players or teams compete through computers in video games.1 There were over 20 programs and sessions about virtual reality at EDUCAUSE. Since there were so many programs about virtual reality, I wanted to share a little of what I learned, including how some higher education institutions are creating VR content, how they are using pre-created content, and how VR fits in libraries.

Since virtual reality is still new to many higher education institutions, I wasn’t sure how many would be creating content, but I did attend a couple of sessions about how 360-degree content is being created. Virtual reality content creation seems to happen most frequently in the medical field so students can practice different procedures that may not happen very frequently in their jobs, allowing them to experience a wider variety of procedures that they will eventually encounter in the workplace. Health sciences libraries are generally ahead of the curve in providing VR services to patrons.2 Additionally, STEM areas are finding more uses for VR, such as VR laboratories, so expensive lab equipment does not need to be purchased, but students can still participate in VR lab experiences.

Creating VR content using tools such as Unity can be difficult and time-consuming. Some educators are using 360-degree cameras to create virtual settings that are easier to produce and can still be used by students. Tim Fuller and Rich Kappel spoke about how they used a 360-degree camera and Matterport scans to create 360-degree virtual environments for students to explore and engage with robotics technology. Tags can then be added to include pictures or videos, or to link to websites with more information. This creates a shareable link that can be distributed to students.

I was able to use my iPhone and the Google Street View app to create a 360-degree tour of my library. It is not of high enough quality to view in virtual reality with an Oculus Go or other VR headset, but it is a great starting point for creating a 360-degree virtual tour of a library on a budget. This was free (since I already had an iPhone).
There is a wide variety of freely available 360-degree content that can be used by educators in the classroom, and more is being created. What does this mean for libraries? While quick virtual tours can be created with smartphones, higher quality VR experiences can also be created by librarians using a 360-degree video camera. These experiences could be used to teach students information literacy skills or search strategies in a VR environment. While this would be harder to do right now with the technologies available, it could become easier down the road. Meanwhile, librarians can create 360-degree virtual tours. Libraries can offer VR services, such as a VR lab or checking out standalone VR headsets such as the Oculus Go or Oculus Quest. Just like with the makerspaces trend, libraries are well situated to support virtual reality in education.

Our library circulates an Oculus Go, and when we were considering adding a virtual reality headset, there were some risks we considered prior to purchasing it. There are health risks for some people when using virtual reality headsets, such as motion sickness, dizziness, and, in some cases, epileptic seizures. It is important to explain this to students before they check out the device, so they know to immediately quit using the Oculus Go if they have an adverse reaction. Additionally, we keep cleaning wipes with the Oculus Go to help keep it sanitary when multiple people are using it. A tablet or smartphone needs to be associated with the Oculus Go in order to update apps or download new apps. Therefore, a passcode needs to be added so students can’t purchase paid apps on the Oculus Go with the associated credit card. Privacy can also be a concern, especially when using the social apps, which is why I decided not to download the social apps on the Oculus Go at this time. Some of the scary apps, such as the Face Your Fear app, can cause students to scream, so it is important that students realize how realistic the experiences are before using them. One final consideration when offering VR services is staffing. There needs to be someone trained in the library who can help teach students how to use the VR headset and experiences. I’ve trained each of our student workers in how to use the headset so they can show other students. While these are some important considerations when deciding whether to offer VR services or not, I believe the benefits outweigh the risks.

Virtual reality is expected to continue to grow, especially with wireless headsets such as the Oculus Go and Oculus Quest available. It is important for libraries to be ready to offer support with virtual reality, just as we’ve offered support for prior technologies including tablets, laptops, computers, 3D printers, etc. Libraries can start small by circulating an Oculus Go or creating a 360-degree library tour. Libraries with more resources could create a VR lab or provide support for creating VR content, such as 360-degree video cameras or tools like Unity. It will be exciting to see how libraries can support VR in the future.

FURTHER READINGS

Van Arnhem, Jolanda-Pieta, Christine Elliott, and Marie Rose. Augmented and Virtual Reality in Libraries. Lanham: Rowman & Littlefield, 2018.

Varnum, Kenneth J. Beyond Reality: Augmented, Virtual, and Mixed Reality in the Library. Chicago: ALA Editions, 2019.

ENDNOTES
1 Matthew A. Pluss, Kyle J. M. Bennett, Andrew R. Novak, Derek Panchuk, Aaron J. Coutts, and Job Fransen, “Esports: The Chess of the 21st Century,” Frontiers in Psychology 10, no. 156 (2019), https://doi.org/10.3389/fpsyg.2019.00156.

2 Susan Lessick and Michelle Kraft, “Facing Reality: The Growth of Virtual Reality and Health Sciences Libraries,” Journal of the Medical Library Association 105, no. 4 (2017), https://doi.org/10.5195/jmla.2017.329.

11859 ---- Cultivating Digitization Competencies: A Case Study in Leveraging Grants as Learning Opportunities in Libraries and Archives

ARTICLE

Cultivating Digitization Competencies: A Case Study in Leveraging Grants as Learning Opportunities in Libraries and Archives

Gayle O'Hara, Emily Lapworth, and Cory Lampert

INFORMATION TECHNOLOGY AND LIBRARIES | DECEMBER 2020
https://doi.org/10.6017/ital.v39i4.11859

Gayle O’Hara (gayle.ohara@wsu.edu) is Manuscripts Librarian, Washington State University. Emily Lapworth (emily.lapworth@unlv.edu) is Digital Special Collections & Archives Librarian, University of Nevada Las Vegas. Cory Lampert (cory.lampert@unlv.edu) is Head of Digital Collections, University of Nevada Las Vegas. © 2020.

ABSTRACT

This article is a case study of how six digitization competencies were developed and disseminated via grant-funded digitization projects at the University of Nevada, Las Vegas Libraries Special Collections and Archives. The six competencies are project planning, grant writing, project management, metadata, digital capture, and digital asset management. The authors will introduce each competency, discuss why it is important, and describe how it was developed during the course of the grant project, as well as how it was taught in a workshop environment. The differences in competency development for three different stakeholder groups will be examined: early career grant staff gaining on-the-job experience; experienced digital collections librarians experimenting and innovating; and a statewide audience of cultural heritage professionals attending grant-sponsored workshops.

INTRODUCTION

Digitization of cultural heritage resources is commonly viewed as an important and necessary task for libraries, archives, and museums. There are many reasons for engaging in digitization projects and creating digital collections, including providing increased access to unique collections, preserving fragile records, raising the global profile of the institution, meeting user demand, and supporting the teaching, learning, and research needs of host institutions. In addition, there is an expectation among the public that research resources are digitized and available online. From the perspective of librarians and archivists, digitization of special collections and archives materials involves more than just reformatting analog materials into a digital format (this article uses the term “digitization” to refer to the entire lifecycle of digitization projects involving special collections and archives materials, from planning to preservation). Materials must be selected and prepared, the digital surrogates must be described and preserved, and access must be provided to the appropriate audiences. Digitization work is often project-based, since each set of materials to be digitized may require different equipment, specifications, approaches, or workflows.
Digitization projects and workflows can be a solo affair, a temporary project team, or a permanent functional area complete with staff specializing in activities such as project management, grant writing, web development, or metadata. Staff learning needs will vary significantly depending on organizational characteristics, assigned roles, project specifications, and the motivation of individuals. Overall, the library and archives profession’s approach to teaching and developing digitization competencies is somewhat haphazard. There are many methods to learn about digitization, including self-study of published resources, online tutorials and resources, conference presentations, workshops, continuing education courses, and master’s in library and information science (MLIS) program classes.1 In many graduate school programs there has been a move toward integrating digital library theory and practice, but courses are necessarily broad in nature, and not every student will be required, or have the opportunity, to complete a practicum or internship while studying. This can make it difficult for new librarians to identify which skills are most in demand and which type of self-study is most useful for the job market. Identifying key competencies, and how to acquire them, may be helpful in supporting new librarians as they make the jump from graduate education to their first professional position, but it is not a challenge limited to newer professionals. Even seasoned librarians and archivists, with practical experience in their portfolios, may find that their local experience does not translate to different organizations, is too broad for a particular project, or is not deep enough for them to lead the initiation of a new digitization program.

The Digital Collections department at the University of Nevada, Las Vegas (UNLV) has a decade-long record of hiring early career librarians for grant-funded projects, providing them with opportunities to develop digitization competencies on the job. From 2017 to 2019, UNLV’s Digital Collections department completed two grant-funded digitization projects that specifically set out goals to contribute to competency development for multiple stakeholders. Early career project managers learned, practiced, and refined skills; the department experimented with and innovated its own workflows; and the project team held two workshops to contribute to the development of digitization competencies throughout the state. The six main competencies that were developed during the grant projects are project planning, grant writing, project management, metadata, digital capture, and digital asset management. The authors, who were members of the grant project teams, will discuss the six competencies in this article. Using the grant projects as a case study, they will describe each competency and share how it was used and developed within the project team via on-the-job learning, and within the state via the statewide workshops.

LITERATURE REVIEW

The idea of professional competencies for librarians and archivists is well established and documented in academic literature, and defined competencies are recognized as valuable tools for education, recruitment, professional development, and evaluation.
Drawing from organizational project management literature, Daniel, Oliver, and Jamieson define competency as the ability to apply combined knowledge, skills, and abilities in service of a measurable and observable goal.2 In the United States, the American Library Association (ALA) defines “Core Competencies of Librarianship” and “Competencies for Special Collections Professionals.”3 The Competency Framework of the Archives & Records Association of the United Kingdom and Ireland (ARA) describes five levels of experience: novice, beginner, competent, proficient, and expert/authoritative.4 ARA’s recognition of the varying dimensions of competency is a helpful guide and aligns with the reality of different levels of expertise. However, the competencies identified by ALA, ARA, and other similar professional organizations are necessarily broad; competencies for specific library roles are harder to generalize and define.

In order to identify the knowledge, skills, and abilities required of “digital librarians,” researchers such as Choi and Rasmussen analyzed job announcements and surveyed practitioners.5 Job announcement analysis shows that there is no single definition of a digital librarian; instead, digital librarian positions consist of many varied roles and responsibilities in almost infinite combinations. The competencies discussed in this article (project planning, grant writing, project management, metadata, digital capture, and digital asset management) were locally important to UNLV’s digitization projects, but they also align with the competencies identified in previous research. In their study of projects undertaken in the National Digital Stewardship Residency (NDSR) program, Blumenthal et al. found that project management skills and technical skills (including metadata, workflow enhancement/development, digital asset management, and digitization) were important.6 The level of required technical competency tended to vary by project, but workflow enhancement stood out as a universally important skill. A 2019 analysis of the latest career trends for information professionals by San Jose State University’s (SJSU) iSchool noted that there is increasing demand for project management skills across all career types.7 This usually encompasses the ability to organize complex tasks and collaborate with other departments or institutions in service of a shared goal. SJSU also cited “new technologies” as a necessary skill. However, they specified that this refers to “all iterations relating to interest in, familiarity with, or experience with new and emerging technologies” (emphasis in the original). In Choi and Rasmussen’s article analyzing job ads, the authors note that many of the frequently stated job requirements tend to be vaguely described or cover broad areas, with current trends in digital libraries, competency in general technological knowledge, and the current state of information technology being the three most frequently mentioned competencies.8 Digital asset management, digital scanning, digital preservation, and metadata were some of the specific technical skills desired, as well as project management, planning, and organization. Research shows that the more generic the competencies, the more broadly applicable they are; but specific competencies depend on the local environment, the role of the position, and the variables of the project or responsibilities.
The wide range of competencies required by the digital library field, paired with the specificity of local implementation, requires new librarians and archivists to seek out learning opportunities that target both theory and practice. In fact, one of the most important aspects of practical experience is the benefit gained by encountering concepts in real-world situations that require decision-making, iteration, and sometimes even failure. The education field points to the Kolb model of experiential learning, a cycle composed of four elements: concrete experience, reflective observation, abstract conceptualization, and active experimentation.9 These elements mirror the process of learning observed in the grant case studies. New project staff are often trained to do tasks, then reflect upon what went well or was challenging. Permanent staff in leadership roles then encourage and facilitate discussions of abstract concepts, such as the philosophy behind an organization's decision to prioritize efficiency or the concepts involved in creating authentic digital surrogates. While it may not happen in every project, within both grant cases the final phase of the learning cycle was also reached, as project staff and permanent employees worked together to move practice forward through testing, experimentation with new methods, and ultimately innovation of new models for digital library practices in the area of large-scale digitization.

Kolb's model can be useful throughout the library and archives field, as shown in the following example. The Federal Agencies Digital Guidelines Initiative (FADGI) started in 2007 as a collaboration of federal agencies seeking to articulate common sustainable practices and guidelines for digitized and born-digital cultural heritage resources. The FADGI website is a treasure trove of approved and recommended guidelines covering still image digitization, embedding metadata, project planning for digitization activities, and more.10 It essentially provides step-by-step guides for all aspects of digitization and is a tool that anyone interested or actively involved in digitization should be familiar with and consult on a regular basis. However, FADGI technical standards are relatively prescriptive, so organizations often have to decide how to implement them within their local environments, taking into consideration a wide range of variables. If every new digitization project manager conscientiously implemented the FADGI standards without associated institutional context, they could be investing their organization in long-term cost commitments that cannot be sustained over time or that do not meet the project goals. This scenario points to the need for hands-on experience and learning as outlined in the Kolb model. The digitization project manager may want to revisit the goals of the project (access, preservation, or both) and resource allocations (storage capacity, software and hardware specifications, staff time and expertise), and then pilot a subset of materials by capturing it to the FADGI standard, calculating the storage sizes of the resulting files, and reviewing any associated workflows for long-term management. Through this small experiential exercise, much information can be gained, reflected upon, and then used to conceptualize how to proceed.
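To make the storage calculation concrete, the following is a minimal sketch of the kind of back-of-the-envelope estimate a pilot might produce. The item counts, page dimensions, and capture specifications are illustrative placeholders only; the appropriate resolution and bit depth should be taken from the FADGI tier chosen for the materials.

# Rough storage estimate for a digitization pilot. An uncompressed TIFF is
# approximately (width * ppi) * (height * ppi) * (bits per pixel / 8) bytes.

def tiff_bytes(width_in: float, height_in: float, ppi: int, bits_per_pixel: int) -> int:
    """Approximate size of one uncompressed TIFF master file."""
    pixels = (width_in * ppi) * (height_in * ppi)
    return int(pixels * bits_per_pixel / 8)

def pilot_estimate(item_count: int, width_in: float, height_in: float,
                   ppi: int, bits_per_pixel: int = 24,
                   derivative_overhead: float = 0.15, copies: int = 2) -> float:
    """Total storage in gigabytes for masters, access derivatives, and backup copies."""
    masters = tiff_bytes(width_in, height_in, ppi, bits_per_pixel) * item_count
    total = masters * (1 + derivative_overhead) * copies
    return total / 1e9  # decimal gigabytes

# Example: 500 letter-size documents at 400 ppi, 24-bit color, with access
# derivatives and a second preservation copy. Prints roughly 51.6 GB.
print(f"{pilot_estimate(500, 8.5, 11, 400):.1f} GB")

Even a crude estimate like this, multiplied out to the full collection, quickly reveals whether a proposed capture specification fits the organization's storage budget over the long term.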
Most of the tasks associated with digital library projects demand increasing competency over time: progressing from enacting a technical standard in an organizational context, to revising it across projects or local environments, to educating others about the role of the standard, and, at the highest levels of competency, to actively participating in the creation or revision of the standard itself as it changes over time. The ability not only to implement but also to refine and even innovate comes from a process of mastering the competency in question. Experiential learning is an important method for developing and refining competencies from a novice to a more expert level, but not all librarians and archivists have the opportunity to learn from more experienced colleagues on the job. Matusiak and Hu emphasize the importance, but also the inconsistency, of integrating experiential learning into MLIS programs.11 For those who do not gain practical experience in library school or on the job, workshops are an additional learning opportunity that can help professionals bridge the gap from written resources to local implementation. The Illinois Digitization Institute is one example, described in detail by Maroso in 2005.12 Digital Directions is a conference that presents the "fundamentals of creating and managing digital collections" in two days.13 Other available workshops focus more closely on specific aspects of digitization, such as metadata or preservation, or offer vendor training for specific equipment.

In the following examination of UNLV's digitization grant projects and workshops, the authors address six competencies that were either employed or developed by staff or have been identified in existing literature. These competencies may be viewed as critical building blocks for digitization projects, and the authors address how they were developed to different levels of expertise using different methods: experiential learning and workshops.

OVERVIEW OF GRANT PROJECTS

UNLV's Digital Collections completed two grant projects with three main goals: (1) the large-scale digitization of archival collections, (2) the development of large-scale digitization models and workflows that could be reused, and (3) statewide workshops to share those models and workflows with other libraries and archives institutions. Both projects were funded by Library Services & Technology Act (LSTA) grants administered by the Nevada State Library and Archives. The first project, "Raising the Curtain: Large-Scale Digitization Models for Nevada Cultural Heritage," digitized mainly visual materials on the topic of Las Vegas entertainment, while the second project, "Building the Pipelines: Large-Scale Digitization Models for Nevada Cultural Heritage," digitized mostly text documents about water issues in southern Nevada.

Digital Collections hired two types of temporary project-specific staff for the two digitization grants: project managers and student assistants. The project manager for each grant coordinated the day-to-day activities, such as preparation of the materials, digital capture, quality control, metadata, and ingest into the digital collection management system, as well as helping to fine-tune workflow documentation. The student assistants contributed to digital capture, quality control, metadata creation, and upload to the digital collection management system.
These grant projects are strong examples of experiential learning and competency development. Two of the authors were principal investigators (PIs) for both of the grants, and one author was the project manager for the second, Building the Pipelines, grant. At the time of hire, the project manager for the second grant had experience working in special collections and archives but had not previously worked in a digital environment. One student assistant was hired for this project; she had already worked on the first large-scale digitization grant project in Digital Collections and was familiar with the digitization workflow, as well as the hardware and software. Employing a student who had already experienced the concrete tasks (phase 1, "concrete experience," in the Kolb model) allowed her to help the new project staff as they together performed "reflective observation" (phase 2) and learned from their compiled shared experience. The project PIs were intentional in designing opportunities for discussion. They regularly met with the student and project manager to help them understand what they were seeing and experiencing in the context of the organization's mission and the grant goals (Kolb's "abstract conceptualization"). The Building the Pipelines grant project facilitated each of them gaining more competency and moving to the next level while also helping the PIs learn through experimenting with new approaches (the final phase of "active experimentation"). The same experiential learning model was also successfully used for the first, Raising the Curtain, grant project.

As previously stated, conducting a day-long digitization workshop for Nevada libraries and archives was a goal of both large-scale digitization grant projects undertaken by UNLV Digital Collections. The Nevada Statewide Large-Scale Digitization Workshops, which were held towards the end of each grant period, were free for participants, and travel grants were available thanks to the grant funding. The workshops sought to provide an overview of large-scale digitization using UNLV projects as examples, as well as to provide practical advice related to developing digitization competencies. The first workshop, held at UNLV in May 2018, consisted of presentations and discussions addressing the basics, methods, and challenges of large-scale digitization. The second workshop, held in May 2019, still shared what UNLV learned about large-scale digitization during the grant project, but widened the scope to address multiple important digitization competencies, whether a project is large or small.

COMPETENCIES

Whether presented in a project-based learning environment, a one-day workshop, or a self-study scenario, learners can benefit from a clear understanding of what is meant by competency in each of the areas that make up a successful digitization project. Below, the authors share the competencies most critical to success in the case study projects. These were also the competencies selected as priorities for the workshops. While expertise in all of the competencies is not mandatory in order to start a digitization project or apply for a grant, reflection and planning for each of these steps should be addressed prior to initiating any project.
By identifying available resources (such as existing documentation, available staff with expertise to consult, or approval from a supervisor for a self-study plan), project managers can ensure that if there are any competency gaps, they will learn the needed competencies to carry out the project. In addition, throughout the learning process, interpersonal skills such as proactive communication, adaptability to change, flexibility in an evolving job scope, and cultivation of comfort with ambiguity are all qualities that are just as necessary as any technical skill in mastering competency in digitization.

Project Planning

This competency can be defined as the ability to create a shared and documented vision and plan so that specifications, expectations, roles, goals, and timelines are considered in advance and clear to everyone involved. Planning for a digitization project is best approached holistically. The planning period is the time to consider all needed competencies and plan for their implementation. Writing up a project plan is important, especially since digitization can involve many collaborators and stakeholders. Even if one is working alone, there are so many components, steps, and details involved in digitization projects that it is important to plan ahead for them and to document everything. Brainstorm and write down ideas and plans for the project, from the overall scope, goals, timelines, and roles, to the specific details of each component, including specifications and workflows for digital capture, metadata, access, preservation, assessment, and promotion (see Appendix A, "An overview of planning and implementing digitization projects"). The plan should be communicated, remain flexible, and be updated (or better yet, versioned) to document changes implemented during the project.

An important part of project planning is selecting materials for digitization. To develop competency in effectively selecting materials, a person should be familiar with the materials and the digitization process, or collaborate closely with people who are. It is often not until one is in the weeds discussing the nitty-gritty details of a project that the challenges and actual viability of digitizing specific materials become apparent. Format is a huge factor in digitization, as are description and an understanding of how materials will be used.14 Digitizing a group of materials that can all be processed the same way is much easier than undertaking a project to digitize many different formats that require different digitization specifications, equipment, description, processing, etc. One must also take into account legal and ethical considerations. Successful selection of materials takes all factors into account and targets materials that fit with the overall goals and vision of a specific project.15

In the case of UNLV's grant projects, the head of Digital Collections and the Digital Collections librarian identified the main goals, developed tentative workflows, and authored the grant applications as co-PIs. The PIs had multiple years of experience planning and completing digitization projects, which they drew upon to plan these projects. They both started off developing their digitization competencies by completing pilot projects, developing workflows, and writing grants to fund smaller-scale, highly curated "boutique" projects.
As they honed their skills and the department's workflows over the years, and the organization built the capacity and expertise to successfully scale up the rate of digitization, digital object production grew from one staff member using one scanner to digitize a couple hundred items in a year, to a robust department with a digitization lab that produces tens of thousands of digital surrogates per year.

The PIs documented the vision and goals of the projects in the grant applications, along with timelines, desired outcomes, the roles of the team members, and budgets. The grant application provided a structure to help with the bigger picture of project planning, and the Digital Collections librarian also used a template to create detailed digitization plans for the collections. The template was developed locally based on past experience planning and implementing digitization projects (see Appendix B for UNLV's "digitization plan template"). Project planning was completed prior to the hire of the project managers and student assistants. The project managers and student assistants were responsible for enacting the project plans, and during the projects they were empowered to adapt and improve upon the plans. The modeling provided by the PIs, coupled with the day-to-day experience of the project managers, led to the continuous improvement and adaptation of workflows through experiential learning. The grant application and digitization plan, along with all of the prepared workflow documentation and tracking spreadsheets, provided a concrete example of how large digitization projects can successfully be planned. By implementing and refining the plans herself, the project manager gained direct experience and intimate knowledge of the plans, including what worked well and what did not. The project manager therefore developed enough competency in project planning to create plans herself, and the PIs further refined their own planning skills, allowing them to plan for even larger or more complex projects in the future. Based on previous experience with project-based learning, the PIs had already established expertise at roughly level 4 in the ARA tiers. Level 5 includes innovation, which was a target of the grant project, as it required the PIs not only to successfully map past experience to a new situation but also, in cases where experiences did not map, to gain new knowledge through experimentation.

The project team included project planning as a topic of the statewide digitization workshops, sharing digitization plan templates, finalized workflows, and other planning resources that aided in the successful completion of the grant projects. Building upon feedback from the first workshop in 2018, the second one addressed the ability to create a digitization project plan of any scale, recognizing that many Nevada institutions do not have the ability to engage in large-scale projects. Despite the emphasis on the foundational importance of project planning, most attendees noted that they do not currently create detailed digitization plans prior to starting a project. Providing examples of plans and practical resources, sharing hands-on experiences, and welcoming discussion were helpful to participants, as indicated by feedback on the post-workshop survey.
The workshop organizers scheduled time for participants to work on their own digitization plans and also offered private consultations, but many participants did not have a specific project in mind and did not seem ready to jump into the details of project planning during the workshop. Overall, these teaching strategies helped participants gain a better idea of how to plan digitization projects, but they do not match the experience of creating or implementing a plan oneself.

Grant Writing

All projects begin with an idea, but only a small fraction of possible projects are acted upon, due primarily to a scarcity of resources. Grant writing is not a necessary competency for all projects, but it is a valuable skill that can secure funding for projects that otherwise would not have been prioritized or possible. In its simplest form, a grant proposal is a well-communicated idea with supporting rationale that explains why a project is a priority to undertake.16 Grant applications are usually composed of a narrative section that covers the main goals, a budget with associated costs for the project, letters of support from partners, and details about the project team leading the work. Even if a grant is not needed to undertake a project, the process of writing one often mirrors the very same decision-making that is necessary in the project planning step. Project planning is recommended for all digitization projects, and it is nearly always required by external grant funders.

Grant writing can be undertaken alone, in a team, or (in larger institutions) as part of a research office or external funding program. In any case, it can be defined as the skill of writing text, calculating costs, and compiling relevant documentation to successfully propose projects for the award of external funding. Competency in grant writing requires excellent communication skills, including the ability to craft persuasive arguments advocating for the project and the analytical ability to interpret instructions and guidelines to ensure the project is in compliance with the funder's requirements. Often, grant writing involves several people: disciplinary experts, collaborative partners, commercial vendors or contractors, technicians, and advisory boards. Being able to facilitate discussions and coordinate actions is vital to wrangling the pieces of a large grant pre-award, as well as to successful grant administration once funded. Grants are competitive in nature, so creativity and originality in framing a problem can mean the difference between a highly ranked grant and one that is passed over by reviewers. One method of developing competency in grant writing is to read as many grant proposals as possible, specifically targeting those for similar projects.17 In addition, some funders recruit panels of grant reviewers, and seeking out opportunities to participate in these reviews is a valuable education. In the case of UNLV Digital Collections and the PIs, grant writing was honed over time by some of the strategies mentioned above: reading other grant applications, serving on grant review panels, collaborating with other stakeholders, and communicating with the granting agency to understand criteria and solicit feedback.
Although the grant proposals were written and the grants were secured prior to the hire of the project managers, the project managers were able to develop a thorough understanding of the grant process. By successfully completing the grant projects, in addition to reviewing the grant proposals, contributing to quarterly reports, and discussing the projects with the PIs and other stakeholders, the project managers gained valuable experience and understanding to inform their own future grant applications.

Given the scarcity of resources, the statewide digitization workshops made it a priority to address various aspects of locating grant opportunities, preparing to write proposals, seeking out collaborations to strengthen applications, and the mechanics and timelines to expect when applying for grants. One of the panel sessions in the workshops included a presentation by the state library's grant administrator, who provided an overview of the state process and what the board looks for when reviewing project proposals. Many participants found this particularly helpful because seeking out and applying for grants for digitization projects was not within their frame of reference, especially as many did not believe they had the requisite expertise in digitization. Awareness of a need, gathering information, and analyzing examples are some of the first steps in developing a competency. The workshops helped attendees take these first steps of developing competency in grant writing and management but fell short of actually helping them write their own grants. In this case, however, that was appropriate, since the attendees did not have specific projects in mind and likely needed to spend more time in the first stages of competency development before jumping into implementation. Workshops are most effective when the level of the content is appropriate to the level of expertise of the attendees.

Project Management

Project management training is not often specifically emphasized in MLIS programs; while there is literature on this topic, most people learn on the job.18 A successful project manager demonstrates mastery of this competency by taking responsibility and assuming leadership of the project throughout the process, even if they are not intimately involved in the day-to-day tasks. They are often responsible for hiring and training project team members as well as communicating and responding to project team members and stakeholders. While tracking and analyzing progress using appropriate metrics, they are often the first to raise a red flag if the project is experiencing delays or challenges. Because they are responsible for ensuring the completion of the main goals of the project within the specified timeline, they often need to analyze bottlenecks and propose possible solutions in order to deliver high-quality results. Ideally, they learn from their experiences and also help other team members and the organization learn from experience. A key role of the project manager is not only to deliver the outputs, but also to assess and analyze, both during the project, in order to make improvements, and afterward, in order to inform future projects. Therefore, investment in mentoring and supporting a project manager, whether a temporary or permanent staff member, can greatly influence how much learning takes place during the project and how that acquired knowledge is transferred to others.
Documentation is a key part of project management and needs to happen at every stage of the project: while planning, during implementation, and at the conclusion.19 Documenting concrete data, including the time spent on specific activities, helps in predicting costs for future projects, as well as in making recommendations regarding future staffing and equipment. Mastering this competency involves planning, an eye for both details and the big picture, clarity, transparency, communication, and dedicated recordkeeping from the start of the project to the end.

Much as in project planning, the UNLV PIs had multiple years of experience stewarding projects from start to finish, which assisted them in the on-the-job development of the project management competency. They were able to share with the project managers their accumulated years of learning experiences on both projects, providing guidance on what to look for and how to comprehensively document the current digitization projects. This mentorship, combined with the experience of managing the day-to-day workings of the digitization projects, allowed the early career librarians to develop this competency. In addition, monthly project staff meetings, complemented by on-the-spot consultation when necessary, contributed to the ease of competency development.

During the statewide digitization workshops, the project teams discussed digitization project management and shared strategies and tools, such as using Google Sheets and Trello to track workflows and progress. The teams also provided advice on aspects of project management such as managing student workers, troubleshooting equipment, transparent communication, and more. The project team chose to focus specifically on their own large-scale digitization experience because literature and resources about general and library project management are readily available. In addition, participants were encouraged to consider how their non-digitization experiences with project management could be translated to this kind of project, as a way to encourage reflective learning based on their individual experience.

Metadata

Digitizing materials would not be a valuable endeavor without comparable investment in describing them with metadata that aids users in discovering and using the digital objects. Developing a project plan that includes metadata approaches is essential in scoping project work and resources. Metadata assignment and quality review is often a far more resource-intensive step than the process of digital capture. Metadata is one digitization competency that is robustly addressed in library school programs. Standards are well documented, and examples of digital library metadata are easily accessible online. The importance of metadata to the library and archives profession means that many professionals already have a foundational knowledge. What makes metadata a difficult competency to master is the level of detail and specificity it entails, which makes the step from theory to practice challenging.
Metadata competencies require an understanding of recognized standards; the ability to interpret and apply them; and an awareness of metadata mobility, including reusability, interoperability, and flexibility for migration or transfer.20 Metadata-related skills require comfort moving along a wide spectrum of varied tasks, often toggling between awareness and understanding of high-level philosophical issues (such as the inclusiveness of subject terms) and a laser-focused eye for detail to troubleshoot data issues (like correcting spreadsheets or code). Metadata work traverses several phases of the digitization lifecycle: from initial preparation of collections, during capture, through ingest into systems, and over the long term to maintain and preserve the assets.

Metadata quality itself is difficult to quantify, making this a competency that can be tricky to evaluate. Mastery can be indicated through the identification and study of appropriate standards, including compliance with any data reuse requirements (such as those of a regional digital library consortium) or metadata requirements to ensure compatibility with existing systems and data. Beyond the selection of or adherence to standards, metadata can be subjective and needs to be created with attention to the level of specificity required for the project. Completion of successful projects demonstrates efficient processing of records balanced with an appropriate level of metadata richness. Documentation of the metadata approach via a metadata application profile (MAP), as well as training materials and examples for metadata creators, are also good indicators of metadata expertise. While technical skills are valuable for metadata competencies, communication and soft skills should not be underestimated as part of this skill set. Metadata is often an area where collaboration is required: many libraries have catalogers, metadata librarians, or aggregators that can advise and sometimes train or provide documentation for projects. Before creating a new metadata approach from scratch, consultation can be a very effective way to gain greater competency.

At UNLV, the choice of an already-processed collection eased the metadata decisions for digitization. This meant there was already a certain amount of basic metadata regarding the collection; in addition, having the curator function as a subject expert engaged in prepping the collection gave the project team a ready-made list of prioritized subject terms, people, and corporate bodies to input as each folder in the collection was digitized. The Building the Pipelines project manager had prior coursework in metadata, as well as experience assigning metadata in a previous internship. Using UNLV's metadata application profile as a guide, along with the existing metadata procedures established for the project, the project manager was able to hone a better understanding of metadata theory applied in practice, including how to best capture the "aboutness" of these particular digital objects. The project manager also observed the importance of consistency in applying metadata by performing quality control of the student-created metadata. A final contributing factor in developing competency in this area was that the team, consisting of the Digital Collections librarian, the project manager, and the student assistant, had many resources available to solve problems together.
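At its core, a MAP is a documented set of fields, obligations, and controlled vocabularies. The following is a minimal sketch of how a profile like this might be expressed and enforced in code during quality control; the field names and vocabulary terms are hypothetical placeholders, not UNLV's actual profile.

# A toy metadata application profile: each field records whether it is
# required and any controlled vocabulary. All names below are hypothetical.
MAP = {
    "title":   {"required": True,  "vocabulary": None},
    "date":    {"required": True,  "vocabulary": None},
    "subject": {"required": False, "vocabulary": {"Water supply", "Las Vegas (Nev.)", "Dams"}},
    "rights":  {"required": True,  "vocabulary": {"In Copyright", "No Copyright - United States"}},
}

def validate_record(record: dict) -> list[str]:
    """Return a list of human-readable problems with one metadata record."""
    problems = []
    for field, rule in MAP.items():
        values = record.get(field) or []
        if rule["required"] and not values:
            problems.append(f"missing required field: {field}")
        if rule["vocabulary"]:
            for v in values:
                if v not in rule["vocabulary"]:
                    problems.append(f"{field}: '{v}' not in controlled vocabulary")
    return problems

record = {"title": ["Pipeline survey, folder 3"], "date": ["1971"],
          "subject": ["Water supply"], "rights": ["In Copyright"]}
print(validate_record(record) or "record passes the profile")

Running a check like this over a batch of student-created records is one way the consistency review described above can be made systematic rather than purely visual.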
As previously mentioned, the team met to review any project concerns and to pull in adjacent team members such as technical services staff, the metadata librarian, the curator, or even those with experience in programming and application development who could advise on how the metadata would appear in other systems, such as the one being developed for future digital collections. This larger group feedback was invaluable in the learning process and often touched on the more abstract concepts underpinning the tasks.

At the statewide workshops, a metadata "bootcamp" was held in which staff addressed the types of metadata standards attendees were likely to encounter, the role of a metadata application profile, how to identify an existing MAP and apply it to collection materials, as well as the value of having a subject expert available for consultation. While reuse of existing descriptive data (e.g., finding aids or inventories) was an important topic of the first workshop, in response to feedback the second workshop's metadata bootcamps focused more on the concrete steps required to make digitized images searchable regardless of other workflows or systems that might be in use. Again, this was an example of tailoring the content to the learning level of the audience. While all participants were familiar with metadata, many did not have experience using a MAP or taking interoperability into consideration. Many recognized a need to devote more time to developing this competency, regardless of project.

Digital Capture

Whether it is done in-house or outsourced to a vendor, competency in digital capture (digitization in the most specific sense) is key. This competency requires considering the materials to be digitized, how they will be displayed, and how long-term access will be provided to the digital objects. When working in-house, technical mastery is not required, but it is necessary to have a solid idea of hardware and software capabilities, as well as whom to consult should difficulties arise (and they will).21 Mastery of this competency means having a vision for the ongoing presentation and use of the digitized material and outlining specifications to make that happen. Documenting digitization specifications is useful not only for the project manager and for future projects, but also as a training tool for students, interns, and volunteers. It can also be a source of important preservation and technical metadata, ensuring files created today are sustainable into the future. In addition, a robust quality control workflow should be in place prior to uploading digital objects for display and use.

A key component of digital capture is efficiently preparing the selected materials. At UNLV, experience has taught the Digital Collections department that digitization is most successful when using materials that have already been physically processed (surveyed and arranged) and for which an inventory (finding aid) has been created. Digitization of archival materials can quickly become complicated because they are often not physically uniform or consistent, and sometimes they are grouped together for digitization into complex/compound/aggregate digital objects. Well-thought-out workflows for naming and tracking individual files can make the digital capture process smoother, especially when files are related (such as the front and back of a photo, or the pages of a scrapbook), as in the sketch below.
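The following is a minimal sketch of one such naming convention for compound objects. The collection code and the box/folder/item/sequence pattern are hypothetical, not UNLV's actual scheme; the essential ideas are that related files share a common stem and that zero-padding keeps them sorted correctly in any file browser or spreadsheet.

# Consistent, sortable file names for compound objects, e.g., the front and
# back of a photo or the pages of a scrapbook. The pattern is illustrative.

def file_name(collection: str, box: int, folder: int, item: int,
              sequence: int, ext: str = "tif") -> str:
    return f"{collection}_b{box:02d}_f{folder:03d}_{item:04d}_{sequence:03d}.{ext}"

# Two sides of one photograph stay adjacent when sorted:
print(file_name("pipelines", 4, 12, 87, 1))  # pipelines_b04_f012_0087_001.tif
print(file_name("pipelines", 4, 12, 87, 2))  # pipelines_b04_f012_0087_002.tif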
Such item-level documentation is critical to managing the large volume of files created in digital capture. Any conservation or preservation concerns of the physical materials should also be addressed prior to capture. Additional consultation may be required if unforeseen complications or problems arise during digital capture; item-level review may not be possible for all materials during the planning stage. For instance, there may need to be an alternate workflow for items that contain personally identifiable information or that are too fragile to undergo scanning or capture.

There are a number of options for capturing images to create digital surrogates, including digital camera systems and a variety of scanners. Depending upon the method of capture, additional software may be needed to edit, output, and ingest the images into a digital management system. For a text-heavy collection, optical character recognition (OCR) software makes the items full-text searchable. For audiovisual materials, digital capture is even more complex. The local hardware, software, and procedures for capture all may require an investment in hands-on time learning and testing procedures. The repetitive nature of capturing items may also call for some investigation of ergonomics or more human-friendly configurations of these variables.

At UNLV, step-by-step documentation for using the various hardware and software is key to developing staff competencies. Such documentation includes screenshots of steps in the process to contribute to comprehensive understanding and correct implementation of the workflow. Project-based staff also make suggestions, as they move through projects, to improve current workflows. The clear documentation, repetition of tasks, access to workflows of prior digitization projects, consultation with experienced staff, and review of available resources (such as the previously mentioned FADGI website) all contributed to competency development for the project managers. Although the PIs have years of experience digitizing, it is a detailed process that can be forgotten without use and practice, and it is a competency that must be continually cultivated because of changing technology.

If the decision is made to outsource digital capture, there are a number of factors to take into account in order to find the right vendor. Issues to consider include cost, company stability, prior clients and completed projects, timelines, where the work is performed, and preferred communication methods. Requesting a quote for services can be a good way to gain visibility into vendor communications, flexibility, and workflows, and will be essential if the project funds are administered in conjunction with any state or organizational purchasing rules or guidelines. Although it can be time-consuming, it is vital for this research and legwork to take place prior to starting the project (see the "Project Planning" section). In outsourcing, confidence in the digital capture partner is key. Mastery of this aspect of digitization means a comprehensive, transparent agreement, a regular flow of communication, and comfort in letting go of control over a major part of the project.
Resources provided by the Northeast Document Conservation Center (NEDCC) and the Sustainable Heritage Network help in weighing the pros and cons of both in-house and outsourced digital capture.22 Project management skills can also be very useful here, as working with a vendor shifts the needed competency from digital capture to more of a project management focus. UNLV often employs vendors for the more challenging formats mentioned above, such as oversized materials like maps and architectural drawings, and for materials like newspapers that require specialized zoning in the metadata to retrieve articles. Working with a vendor can be an informative experience, teaching communication skills, negotiation of contracts, building appropriate timelines, and quality reviewing deliverables. Some granting agencies cover only a limited timeframe, and outsourcing digital capture can free up an organization's time to do more library-centric work like metadata or archival processing. For the Building the Pipelines project, most of the material in the selected collections was flat printed material that was not oversized or in challenging formats such as film/transparent material, newspapers, or media (audio/video). This led to a high comfort level for in-house digital capture, as there were established procedures for the archival collection.

At the statewide workshops, participants attended a digital capture session where they were presented with digital capture workflows and information about UNLV's decision-making regarding digitization equipment, outsourcing vendors, and technical standards, and then they went into the digitization lab to observe the equipment in action. The digital capture bootcamp was facilitated by the head of Digital Collections, the student assistant, and the visual resources curator (who is a professional photographer). This unstructured session offered a place for attendees to preview equipment that might be suitable for their projects, to get a sense of costs if they were looking to purchase equipment, and to observe digital capture in a large-scale workflow (a specially designed rapid-capture overhead camera system), a medium-scale workflow (a digital SLR camera and copy stand), and a small-scale workflow (flatbed and map scanners). Attendees were encouraged to match equipment to their project needs or to identify whether outsourcing was an appropriate approach for their collection. Attendees were not able to use the equipment themselves or practice the digital capture workflows, but the small workshop format allowed them to view demonstrations in person, ask specific questions, and see example workflows in action, which is a step above what online research or resources provide for competency development.

Digital Asset Management

Competency in digital asset management goes beyond identifying the storage capacity necessary for a project. Digital asset management includes the storage, access, and preservation of digital files and their accompanying metadata. There are different ways to provide access to digital objects, some of the most popular being online content management systems like Omeka or digital collection management systems like CONTENTdm.23 As mentioned previously, metadata is important for staff and users to discover and locate digital objects.
Competency in digital asset management requires technical knowledge of how to securely and efficiently transfer digital files that are requested, or how to provide secure and user-friendly online access. It also requires planning to ensure that whichever approach is taken is sustainable and can meet demand. Good digital preservation means planning and implementing the necessary actions to ensure that digitized resources continue to be discoverable, accessible, and usable well into the future. In the case of digitized libraries and archives materials, this means that they must be well documented and trustworthy. Preserving digital materials includes maintaining multiple copies of files, capturing checksums to verify whether the bits of a file have been corrupted over time, and in some cases migrating file formats so that items can be viewed and used with future hardware and software. Models for digital preservation include the Open Archival Information System (OAIS) model and the National Digital Stewardship Alliance (NDSA) levels of preservation.24 Software and tools to aid in digital preservation tasks are available, as is training. However, digital preservation is still relatively new to many in the libraries and archives profession, although some individuals and institutions have developed very sophisticated and carefully considered programs and approaches. Since digital preservation is based on technology, it will always be changing. One must not only learn and be able to implement the current standards and best practices of digital preservation, but also keep up with changes. Success in digital preservation requires ongoing effort and evaluation: staff and users should be able to find, understand, view, and use digital resources at any point in the future.

For the grant project managers, this was the most challenging competency. While they were exposed to the complexities of digital preservation at UNLV, this process was already well established, having been developed over time by the PIs and other library staff. The project managers essentially stewarded the newly digitized objects up to this point and then handed over the reins to the Digital Collections librarian. While they were free to ask questions and developed an understanding of the standards that contribute to long-term digital preservation, the project managers did not implement this particular workflow, nor did they contribute to adapting it.

It is important to keep in mind that digital preservation is not an all-or-nothing proposition; small steps can be taken by libraries and archives professionals to address short-term digital preservation while gaining a better understanding of long-term solutions.25 Given the complexity of this competency, it was difficult to train participants in the statewide digitization workshop setting. However, UNLV's Digital Collections staff emphasized the multiplicity of options available for libraries and archives with varying levels of resources and encouraged participants to be open to starting despite ambiguity about the ultimate long-term solution for their organization. Digital Collections staff also provided an overview of these options and shared the evolution of digital preservation strategies at UNLV, including suggesting some first steps such as creating an inventory of digital assets and a digital preservation plan.
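One of those first steps, an inventory of digital assets with fixity information, requires nothing beyond a scripting language's standard library. The following is a minimal sketch, not UNLV's workflow: it records a SHA-256 checksum for every file under a directory, and can be re-run later to detect files whose bits have changed. The directory and file names are placeholders.

# Inventory files with SHA-256 checksums, then re-run later to detect
# corruption. A small first step toward the NDSA levels of preservation.
import csv
import hashlib
from pathlib import Path

def sha256(path: Path) -> str:
    h = hashlib.sha256()
    with path.open("rb") as f:
        for chunk in iter(lambda: f.read(1 << 20), b""):  # read 1 MB at a time
            h.update(chunk)
    return h.hexdigest()

def write_inventory(root: Path, out_csv: Path) -> None:
    """Record the relative path, size, and checksum of every file under root."""
    with out_csv.open("w", newline="") as f:
        writer = csv.writer(f)
        writer.writerow(["path", "bytes", "sha256"])
        for p in sorted(root.rglob("*")):
            if p.is_file():
                writer.writerow([p.relative_to(root), p.stat().st_size, sha256(p)])

def verify(root: Path, inventory_csv: Path) -> list[str]:
    """Return paths that are missing or whose checksum no longer matches."""
    changed = []
    with inventory_csv.open(newline="") as f:
        for row in csv.DictReader(f):
            p = root / row["path"]
            if not p.is_file() or sha256(p) != row["sha256"]:
                changed.append(row["path"])
    return changed

# write_inventory(Path("masters"), Path("inventory.csv"))
# print(verify(Path("masters"), Path("inventory.csv")) or "all files verified")

Even this basic level of fixity checking, scheduled regularly, addresses short-term preservation while an organization works out its long-term approach.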
Developing expertise in this competency requires in-depth research, consultation, and analysis to customize plans for local circumstances. The statewide workshops provided only an hour-long introduction to the topic and a broad overview as an example. Digital preservation is, however, a topic that is well addressed by other, more intensive workshops, such as the Society of American Archivists' Digital Archives Specialist courses and the POWRR Institute.26

SUMMARY OF COMPETENCY DEVELOPMENT: EXPERIENTIAL LEARNING VERSUS WORKSHOPS

Learning through Experience for Project Teams

UNLV's grant projects are examples of how specific time-bound projects and grant funding can be used to develop both individual and organizational competencies, and to share what is learned via workshops, aiding in the professional development of others. The early career project managers advanced the most in competency development because of the opportunity for focused training and experiential learning through practice. They progressively developed digitization competencies in a number of ways, including training from the PIs, working with experienced student assistants and staff, reading locally created documentation, observing project activities and decision-making by the team, proposing solutions to challenges and testing them through trial and error, learning by doing tasks and suggesting small iterations to improve them, consulting the workflows from previous projects, and reviewing recommended resources such as the FADGI website. The project managers, though temporary employees, were treated with the same status as permanent staff and encouraged to attend meetings, ask questions, take risks, and experiment in a safe and controlled environment of learning.

Given the many multifaceted details and tasks that go into a digitization project from start to finish, it is unrealistic to expect staff to remember everything without engaging in the process themselves. Training for the grant projects broke each project down into a series of discrete steps, including preparation, digital capture, quality review, OCR transcription, metadata creation and review, and upload into the digital asset management system. Each task was reviewed and practiced in a linear manner. Given the volume of materials, basic mastery and self-sufficiency for the grant project staff were achieved fairly quickly. This allowed project staff to then identify areas for workflow improvements and test adjustments for increased efficiency. Despite having just two dedicated staff members, one of whom had no prior digitization experience, the team created over 55,000 digital surrogates during the ten-month Building the Pipelines project, far exceeding the original goal of 10,000 images.

In both grant projects the project managers were able to develop digitization competencies as a result of on-the-job experience, enriching their skill sets while also assisting UNLV Digital Collections in refining workflows for large-scale digitization projects. This in turn strengthened competencies on the organizational level, as well as those of the PIs. In the best-case scenario for this kind of project, temporary staff develop valuable digitization competencies via project-based work; however, that is not always the case, and temporary project-based positions can be very harmful to the personal and professional development of workers.
When undertaking a project that uses temporary labor, the organization should plan for and prioritize equitable hiring practices, fair compensation and benefits, and a positive and productive experience for temporary staff.27

Learning through Experience for Organizations

Grants are temporary in nature, so it is important that organizations that fund them, and those that receive funding, think about the long-term implications of the temporary work. It is important for project staff to clearly document all of the details of the digitization approaches and workflows that worked successfully in the grant, as well as any problems that can be avoided. All the extra work of testing and refining new workflows completed by the project staff can (and should) be adopted and integrated by permanent staff into the existing structure of the department or institution. One of the drawbacks for institutions undertaking grant-funded projects is that temporary staff leave and take their expertise with them. It is essential for permanent staff not only to teach, but also to be open to and active in learning from the temporary staff during the project, even if the permanent staff are not doing the day-to-day work. Building opportunities for information-sharing and knowledge transfer into a project plan vastly increases the value of the grant project funding. This organizational learning is a form of accountability to the funder, ensuring that projects can be sustained and that lessons learned contribute to increased capacity in the funded organization and beyond.

Learning through Workshops for Professional Development

Grant projects also pave the way to share lessons learned with colleagues via workshops or collaborative endeavors. As previously stated, conducting a day-long digitization workshop for Nevada libraries and archives institutions was a goal of both large-scale digitization grant projects undertaken by UNLV Digital Collections. Besides the metropolitan areas surrounding Las Vegas and Reno, much of Nevada is rural and sparsely populated. These workshops provided a forum for people who might not usually come together to meet and talk about their work. Many libraries and archives institutions in Nevada are small and may have limited or no experience with digitization. The workshops sought to provide an overview of large-scale digitization using UNLV projects as examples, as well as to provide practical advice related to developing digitization competencies.

The first workshop at UNLV, held in May 2018, consisted of presentations and discussions addressing the basics, methods, and challenges of large-scale digitization (see Appendix C for the May 2018 agenda, "Nevada Statewide Large-Scale Digitization Symposium"). The grant team surveyed participants after the workshop and received mostly positive responses. Sixteen out of nineteen people who completed the survey said they learned something, thirteen said they were confident and likely to apply what they learned, and eleven said they would attend a follow-up workshop. The comments from the surveys showed that participants wanted more interactive activities, and many of them were not ready to implement large-scale digitization at their institutions; they wanted to learn more about the basics of digitization first.
This feedback highlighted two challenges of workshop-based learning: the tendency toward passive delivery of large amounts of information, and the difficulty of designing content for an audience with unknown or varying skill levels. The second workshop, held in May 2019, still shared what UNLV learned about large-scale digitization during the grant project, but widened the scope to address multiple important digitization competencies, whether a project is large- or small-scale (see Appendix D for the May 2019 agenda, "Nevada Statewide Large-Scale Digitization Symposium"). Prior to the workshop, attendees were surveyed about their expertise level and topics of interest and were asked to review a project planning document with their local materials in mind. Sessions were designed as bootcamps with more extensive documentation that could be used as a template for implementation at attendees' home organizations. Participants were encouraged to ask questions and share their own experiences during the workshops and were given the option to sign up for a private consultation. Across the workshop, the team endeavored to allow for more interactive, hands-on learning.

Although UNLV adjusted the second workshop based on the feedback from the first, teaching practical how-to skills that are broadly applicable in a one-day workshop is challenging. Digitization is a complicated and technical undertaking that is most easily learned via hands-on experience, which is most effectively gained through repetition rather than a one-day workshop. There was not enough time or equipment for participants to actually practice parts of the digitization process themselves, so experiential learning was not always an option for every competency. Also, if participants return to an organization with different equipment, hardware, and software, there are limits to the value of hands-on training. Another potentially problematic issue is staying up to date with the rapid technological changes that characterize digital collections. If a person gains a basic intellectual understanding of digitization via a workshop or other professional education opportunity, and then returns to their setting without starting a specific project in a timely manner, there is a risk that the knowledge they gained becomes outdated. Despite these drawbacks, workshops are still valuable venues for colleagues to come together and learn from one another. They can also provide demonstrations or hands-on learning activities that help to bridge the gap from written theory to local implementation.

CONCLUSION

Online access to libraries and archives materials is expected and increasingly necessary in order for institutions and their collections to remain vital, useful, and relevant. Ideally, digitization in libraries, archives, and museums would be a permanent functional area with specialized staff. However, many medium-sized and smaller libraries and archives institutions do not have the capacity to sustain such an area. Competencies in the areas of project planning and management, grant writing and administration, digital capture, metadata, and digital asset management are instrumental in completing a successful digitization project or instituting a digitization program in any setting.
Despite the proliferation of professional workshops, online resources, literature, and conferences regarding digitization skills, it can be difficult to make time to study these materials and put such learning into practice in a way that builds toward more sophisticated learning through experience. The diversity of collection materials to be digitized, the range of local circumstances, and the pace of technological change prevent any profession-wide standardized approach to digitization education. Instead, individuals, organizations, and the profession as a whole must strategically invest in the most effective and efficient methods and opportunities for developing digitization competencies.

Locally, UNLV Digital Collections has found that experiential project-based learning is the most effective way to pilot new workflows and develop competencies. Project-based experiences, if thoughtfully designed with an eye to mentoring and supporting temporary staff, provide an opportunity for individuals to develop and practice these competencies in a hands-on way that encourages deep learning. There is a unique place for small pilot projects, modest grant projects, or one-time experimental projects to create a space for this kind of learning in almost any organization. As capacity increases, digitization projects can also be designed to develop competencies at the staff functional group level, the organizational level, or the regional level.

Workshops in turn can be an opportunity for project teams or experienced individuals to share what they have learned and teach basic competencies to others. Although not as comprehensive and effective as experiential learning, workshops can provide a solid introduction to digitization competencies, especially if interactive and hands-on learning methods are incorporated and organizations remain available for consultation or questions from attendees. Workshops that have a pre- and post-session component can add continuity, and workshops that can be offered multiple times have the ability to evolve and scale. Rotating instructors, incorporating hands-on sessions, and ongoing mentoring are all ways to improve workshop-based learning.

Scaffolding these approaches and sharing what is learned individually or locally with others is a way to continue to develop the capacity of libraries and archives institutions to provide global online access to unique historical materials. Although this approach is already widespread in the profession, it is important not to leave individuals or institutions with fewer resources behind. When planning new digitization projects or initiatives, institutions should consider adding and investing in new positions, partnerships, and regional collaborations and networks. When new permanent positions are not possible, temporary positions should be designed to be empowering and valuable for workers, rather than exploitative and harmful. In an age where technology is changing rapidly and is driven by large, well-resourced corporations, developing the profession's competencies in digitization, keeping pace with digital technologies, and remaining relevant in the information environment depend on decentralized, peer-to-peer educational opportunities that use efficient and effective methods of teaching, such as interactive and hands-on learning.
APPENDIX A
An Overview of Planning and Implementing Digitization Projects
Created by Emily Lapworth for local use, March 8, 2018, and shared at the statewide digitization workshops. These steps were written for large-scale digitization but can be applied to a digitization project of any size.

1) Identify collections for digitization.
a) Brainstorm your goals for this project. Think about what you will do with these digital surrogates and who your audience is.
b) Criteria for selection of materials:
i) Formats: Start simple. If everything is the same format, large-scale workflows are easier to apply. Ultimately you will need to create different workflows for formats with differing requirements. For example, print photos are digitized differently than film negatives; text documents benefit from transcription using optical character recognition (OCR) software, while photos do not; and handwritten materials present additional discoverability challenges. When creating complex digital objects that contain multiple formats, things can become even more complicated.
ii) Condition: Fragile materials require extra handling time and possibly additional physical treatment prior to digitization.
iii) Existing arrangement and description: It is easiest if online access can directly mirror physical access, but the materials may need additional arrangement and description before digitization, depending on your goals. Existing item- or folder-level description is ideal. If there is any hierarchy in the existing description, especially inconsistent or complex hierarchy, consider how you will reuse that description for digital objects.
iv) Copyright: Plan on providing public online access only if you own the copyright, have permission from the copyright holder, or have a strong case for fair use.
c) See the preparation step (below) to come up with some idea of how you will undertake this project. It will likely be modified during the actual preparation, but you need some idea of what you will do and how you will do it in order to gather support and resources.
2) Assess the technical infrastructure needed to create, manage, provide access to, and preserve the digital files.
a) Estimate how much storage space you will need, and how much space will be needed for long-term digital preservation (a rough sizing sketch follows this step).
b) Make sure that your current digital preservation policies and workflows will be able to accommodate this project. Adjust them if needed.
c) Identify what equipment and software will be needed and whether you already have it, can acquire it, or can use someone else's.
d) Assess whether your existing workflows and systems for providing access to digital materials will be able to accommodate this project, and what changes you might need to make.
e) Technology could be a great area for collaboration! If you lack certain resources, explore opportunities to collaborate with other institutions.
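The storage arithmetic behind step 2a can be sanity-checked in a few lines of Python. This is a minimal sketch with placeholder assumptions (letter-size pages, 600 ppi, 24-bit color, 25,000 pages), not figures from any project described in this article; substitute your own dimensions and counts.

def tiff_size_bytes(width_in, height_in, ppi, bytes_per_pixel=3):
    """Approximate size of one uncompressed 24-bit (3 bytes/pixel) scan."""
    return int(width_in * ppi) * int(height_in * ppi) * bytes_per_pixel

pages = 25_000                             # hypothetical project size
per_page = tiff_size_bytes(8.5, 11, 600)   # one letter-size page at 600 ppi
masters = pages * per_page

print(f"One page master: {per_page / 1024**2:.0f} MiB")
print(f"{pages:,} page masters: {masters / 1024**4:.1f} TiB")
print(f"Masters plus one preservation copy: {2 * masters / 1024**4:.1f} TiB")

An estimate like this covers only uncompressed masters; derivatives, metadata, and preservation copies stored at a second location add to the total.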
3) Coordinate with other stakeholders to verify choices and plans for digitization.
a) Find out what kind of support there is (financial, staffing, etc.) from management, administration, and the community.
b) Identify possible collaborators and discuss plans, make agreements, etc.
c) Decide who will manage and oversee the project and how different responsibilities will be distributed.
d) Identify and apply for grants if appropriate.
4) Prepare collections for digitization.
a) Arrangement: Assess how the materials are physically arranged and described, and whether that will help or slow down your anticipated workflows. Plan for and complete additional processing if needed.
b) Decide how you will display digitized materials. Mirroring the existing arrangement is easiest, but you also have to consider the file formats you want to create.
c) Description: Figure out how you can reuse existing description. Plan metadata fields, vocabularies, and prioritized subject terms and names.
d) Prepare preliminary metadata. Reuse what you already have!
e) Prepare physical materials. Verify that the physical contents of the collection match existing description or inventory. Remove staples, unbind, unsleeve, flatten, etc. Identify and address any preservation or conservation issues.
f) Identify physical formats (this will help determine the timeline and what equipment is needed).
g) Decide: outsource or in house?
h) Create and test workflows and procedures.
i) Create documentation for workflows and procedures (important for the duration of the project, for reuse in future projects, and for future employees stewarding these digital assets to know what you did and how you did it).
j) Create and prepare systems, documents, or mechanisms to track work (it's important to stay organized, especially when dealing with a large amount of materials or a team of workers).
5) Digitize collections.
a) Set up consistent file-naming procedures and make sure they are followed (a naming-and-derivatives sketch follows this step).
b) When dealing with mixed materials in house: Depending on equipment and the composition of materials, start with the easiest format or what you have the most of, then take note of other formats (e.g., transparencies, oversize, etc.) that require different equipment or settings so you can group them together to do all at once later.
c) Keep specifications simple if possible, especially if you have student workers doing the digitization. (For example, if you have complex digital objects with both text and photographic prints and can digitize both materials on the same equipment without changing settings, do so. If you normally digitize text at 300 ppi but want photos at 600 ppi, rather than having the technician stop and change the settings, capture everything at 600 ppi if you have the space.)
d) Auto-crop is a great tool if you have it, but otherwise try to improve the efficiency of your processes with any tools at your disposal. Sometimes this can be as simple as placing the item with the correct orientation to avoid the need to manually rotate later.
e) File formats: Archival images are generally TIFFs. Smaller derivative files may be necessary for access or to speed up OCR processes. Sometimes it's better to output them at the time of scanning than to batch process later.
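As a concrete illustration of steps 5a and 5e, the sketch below applies a predictable naming pattern to a folder of master TIFFs and writes smaller JPEG access derivatives. The naming pattern (collection_box_folder_item_page), the directory layout, and the identifier MS-0042 are hypothetical, and the sketch assumes the Pillow imaging library and that every TIFF in the folder is one page of the same item; a real workflow would drive these values from a tracking spreadsheet or database.

from pathlib import Path
from PIL import Image  # Pillow: pip install Pillow

MASTERS = Path("masters")      # uncompressed TIFFs from the scanner
DERIVATIVES = Path("access")   # smaller JPEGs for online access
DERIVATIVES.mkdir(exist_ok=True)

def master_name(collection, box, folder, item, page):
    """Build a predictable, sortable name, e.g. MS-0042_01_03_0007_001."""
    return f"{collection}_{box:02d}_{folder:02d}_{item:04d}_{page:03d}"

for page, tiff in enumerate(sorted(MASTERS.glob("*.tif")), start=1):
    stem = master_name("MS-0042", 1, 3, 7, page)
    master = MASTERS / f"{stem}.tif"
    tiff.rename(master)                      # normalize the master's name
    with Image.open(master) as img:
        img = img.convert("RGB")             # JPEG cannot hold alpha/16-bit data
        img.thumbnail((2000, 2000))          # cap the longest side in pixels
        img.save(DERIVATIVES / f"{stem}.jpg", "JPEG", quality=85)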
6) Process images.
a) See above: Try to improve your digitization workflows and procedures to shave time off of image processing.
b) OCR: If you have textual materials, OCR transcription makes them much more accessible with far less manual work than creating detailed metadata. This is especially true for large aggregations of textual documents. Resist the urge to pursue perfect OCR transcription: something is better than nothing, and when dealing with scale, you do not have the time to correct everything. This is also an opportunity for crowdsourcing, if you have the technical resources to set it up. (A batch OCR sketch follows this step.)
c) OCR file output: Depending on how you choose to display and make the digital surrogates available, you may need to output text files and/or PDF/As.
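The sketch below illustrates the batch OCR described in step 6b: run every access image through the Tesseract engine and save an uncorrected plain-text sidecar file. It assumes the open-source Tesseract binary plus the pytesseract and Pillow packages are installed; the directory names carry over from the previous sketch and are, again, hypothetical.

from pathlib import Path
from PIL import Image
import pytesseract  # requires the Tesseract OCR engine to be installed

PAGES = Path("access")   # OCR the smaller derivatives; it is faster
TEXT = Path("ocr_text")
TEXT.mkdir(exist_ok=True)

for page in sorted(PAGES.glob("*.jpg")):
    with Image.open(page) as img:
        text = pytesseract.image_to_string(img)  # uncorrected transcription
    (TEXT / f"{page.stem}.txt").write_text(text, encoding="utf-8")
    print(f"{page.name}: {len(text.split())} words recognized")

In the spirit of step 6b, nothing here tries to correct the output; even raw text is enough to make a large aggregation of documents keyword-searchable.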
7) Describe and provide access.
a) Reuse description that already exists (e.g., from an inventory or a finding aid). If a finding aid exists, make sure you are using all available information and understand how description is inherited and can be reused.
b) At the beginning of the project, transform the metadata that already exists into a format you can use to describe the digital objects. You can add to this existing metadata throughout the workflow.
c) At the beginning of the project, identify preferred subject terms and important names to look out for and add to digital object metadata when appropriate. This is especially important when metadata is created by students, teams, or anyone unfamiliar with the subject matter of the collection. It will help ensure consistency and make faceting better for users.
d) Explore how search engine optimization (SEO) works for your public online access system. Take that into consideration when creating metadata in order to optimize discovery of the materials.
e) Make it as easy as possible for users to identify the provenance of the digital object and to find other digital objects from the same collection.
f) Consider the links between the original collection description and the digital surrogates. Consider adding digitization information or links to digital surrogates into finding aids and other records, and consider adding a link to the finding aid in the digital object metadata. Consider using persistent identifiers, such as ARKs (Archival Resource Keys), to do this instead of using regular URLs.
g) Find out how your access system indexes full-text transcripts and how it displays different file formats. Consider whether you are able to, and want to, offer multiple file formats of a digital object. For example, a compound digital object that includes both text and images could be available as a collection of image files, a single PDF file, or both. Identify what would be most useful to your users.
h) Don't forget about structural, administrative, technical, and preservation metadata!
8) Implement quality control (QC) procedures.
a) Have a strategy (e.g., sampling), guidelines, and goals for QC.
b) For staff performing quality control, identify the most important things to look for.
c) Decide how much time should be spent on QC.
d) Identify and acquire any automated tools that can be used.
e) Set up procedures or steps to follow when errors are found.
9) Preserve digital assets.
a) You should have already planned how you will ensure access to and preservation of the digital files and metadata in the long term. Best practice is to have policies in place identifying what digital assets should be preserved and to what extent. Identify applicable standards and best practices, and implement software and technical solutions.
b) Set up workflows and procedures to ensure that the digital files receive appropriate ongoing digital preservation treatment (a simple fixity-manifest sketch follows this step).
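For step 9b, one minimal form of ongoing preservation treatment is fixity checking. The sketch below writes a SHA-256 manifest for the master files so that a later audit can detect silent corruption; the file layout is hypothetical, and a production workflow would more likely follow a packaging standard such as BagIt.

import hashlib
from pathlib import Path

MASTERS = Path("masters")
MANIFEST = Path("manifest-sha256.txt")

def sha256(path, chunk=1024 * 1024):
    """Hash a file in chunks so large TIFFs need not fit in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        while block := f.read(chunk):
            digest.update(block)
    return digest.hexdigest()

with MANIFEST.open("w", encoding="utf-8") as out:
    for tiff in sorted(MASTERS.glob("*.tif")):
        out.write(f"{sha256(tiff)}  {tiff.name}\n")

Re-running the loop later and comparing the output against the stored manifest is a simple integrity check between fuller preservation audits.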
10) Publicize and promote.
a) Work with administration, collaborators, and other stakeholders to publicize and promote the project.
b) Depending on your audience, social media, academic listservs, and professional organization publications can be other avenues to spread the word.
c) Set up harvesting with your regional digital library for inclusion in the Digital Public Library of America.
11) Assess.
a) Web statistics can be used to track the use of online materials. See section 8, "Online Interactions," of SAA/ACRL's "Standardized Statistical Measures and Metrics for Public Services" for general information on what to collect, and the Digital Library Federation's "Best Practices for Google Analytics" for specific information on Google Analytics. If you are a CONTENTdm user, see "Google Analytics in CONTENTdm."28
b) Surveys, interviews, and focus groups are other methods that can be used to gather feedback.
c) Record and compile any oral or written feedback received from stakeholders and audiences.
d) Analyze feedback and use statistics to identify areas of success and areas for improvement. Make improvements as necessary and incorporate findings into planning for future projects.

APPENDIX B
Digitization Plan Template
Template created by Emily Lapworth for local use and shared at the statewide digitization workshop.

Project overview
Collection name(s):
Collection number(s):
Link to finding aid(s) or existing description(s):
Project staff:
Project supervisor:
Research value/audience:
Goals:
Available resources: staff, money, equipment, software, etc.
Additional resources needed: staff, training, money, equipment, software, etc.
Priority level: Low, medium, or high. Why is this being digitized now? Part of the regular workflow, part of a grant project, or specially requested?
Publicity and promotion plans:
Assessment plans:
Estimated time frame/due date: Estimate how much time should be spent on the collection, or when it should be finished.
Date completed/approximate hours spent:
Formats and quantity of items: e.g., seven boxes of photographic prints, three folders of flat text documents, two drawers of oversize materials, etc.
Existing arrangement & description: How is the collection currently arranged, and what description is currently available?
Copyright: What is the copyright status of the materials, and can you legally digitize and provide access to them?
Restricted or sensitive materials: e.g., skip over restricted folders, digitize a restricted-item notice, or physically cover PII (personally identifiable information, such as Social Security numbers) during digitization.
Preservation issues: Any fragile or delicate materials that need extra attention?
Supply needs: e.g., envelopes needed for rehousing.
Notes for future/follow-up: e.g., missing items, materials that should be restricted, recommended additional processing, rehousing, digitization, metadata enhancement, etc.

Preparation
What will be digitized and what will not be? e.g., series X will not be digitized at this time; it consists of audiovisual materials, which would need to be outsourced.
How will items be arranged and described online? How will identifiers/file names be assigned? e.g., each folder = a compound object; file titles from the finding aid will be used as titles for the digital objects.
What physical preparation must take place before digitization? e.g., remove all staples and fasteners.

Digitization
Equipment/technical specs:
• Outsourcing or in-house equipment to be used
• File types (e.g., TIFFs)
• File quality (e.g., 24-bit color, 600 ppi)
• File naming
Other specifications:
• Where will digital files be stored and preserved?
• How will special physical formats be handled? (e.g., scrapbooks: entire page or individual photos; magazines: entire issue or just the cover?)
Digital file processing:
• Image correction?
• Cropping or other editing?
• OCR or transcription?
• Create derivative files?
Digital file quality control: What procedures and workflows will you put in place to ensure that everything is digitized accurately and according to the project specifications?

Metadata
What standards, fields, guidelines, and controlled vocabularies will you use?
Metadata quality control: What procedures and workflows will you put in place to ensure that all metadata is accurate, consistent, and conforms to the project specifications?

Access
How will digital objects be accessed? What systems, workflows, and procedures will be used to provide access?

APPENDIX C
Nevada Statewide Large-Scale Digitization Symposium
Funded by LSTA
May 18, 2018

Coffee and Pastries 9:00 - 9:30
Digitization Lab Tour 9:30 - 10:00
Welcome 10:00 - 10:15
Opening remarks from the Dean of University Libraries and the Director of Special Collections and Archives.
Session: What is Large-Scale? [Live Streaming Begins] 10:15 - 11:00
This session will cover the characteristics of large-scale digitization and what sets it apart from other types of digitization projects. The UNLV Entertainment Project Team will also provide an update on the LSTA-funded project they undertook to digitize over 25,000 items from UNLV's entertainment collections.
Panel: Methods for Ramping Up - Identifying Resources 11:00 - 12:00
There is a mandate to increase efficiency in digitization, but what resources can help you get there? This session will detail four methods to increase digitization output and address how organizations of varying resource levels can adopt them.
Lunch 12:00 - 1:00
Enjoy a catered lunch and some discussion time with colleagues from across the state and region. There will be time to walk around the room and share digitization activities at your organization via whiteboards. During lunch you can also browse the "Equipment Buffet," where we will have handouts and displays on various types of digitization equipment and outsourcing vendors.
Panel: Challenges of Digitization at a Larger Scale 1:00 - 2:00
Ramping up digitization is not as simple as merely increasing numbers. In this session we will discuss the challenges encountered in each phase of digitization when scaling up and some strategies to meet those challenges.
Break [Live Streaming Ends] 2:00 - 2:15
During the break, browse the "Equipment Buffet," where we will have handouts and displays on various types of digitization equipment and outsourcing vendors. Using the provided worksheet, shop the buffet and rank how well each product meets your digitization needs.
Discussion: Resource 5: Statewide Collaboration (in groups) 2:15 - 3:15
The last session of the day will focus on an additional resource for ramping up digitization: your peers and partners right here in Nevada!
We will review the notes about organizational projects and shared challenges, identify potential partnerships or collaborations, discuss grant opportunities, and work as a group to prioritize our state's most at-risk collections.
Wrap Up / Assessment 3:15 - 3:30
Before everyone departs for home, we will share contact information from attendees, complete a workshop evaluation, and discuss follow-up activities for next year. All attendees will leave with a customized plan of action for their organization.
Attendee Learning Objectives:
• Be able to define the characteristics of digitization projects (mass, large-scale, boutique) and where your organization fits. Decide on the type of digitization appropriate for your organization to move toward.
• Understand the pros and cons of each method and the type of resources needed to support implementation. Identify one or more methods or resources for your organization to target to increase your organizational capacity.
• Understand the complexities of large-scale digitization and identify one or more challenges at your organization.
• Gain perspective on projects across Nevada. Be able to identify at least one future collaborative opportunity.

APPENDIX D
Nevada Statewide Large-Scale Digitization Workshop
Funded by LSTA
May 10, 2019

Workshop outcomes:
• Digitization boot camp sessions guided by survey responses
• UPR LSTA project update and lessons learned
• Project consultations available
• Reflections on statewide workshops - compare over one year

AGENDA
8:00 - 9:00 *concurrent sessions
Coffee and Pastries
Digitization Lab Equipment Consultations
Welcome 9:00 - 9:15
Opening remarks from the Dean of University Libraries.
Panel: Challenges of Digitization at a Larger Scale 9:15 - 10:00
What does it take to complete a large digitization project? In this case-study panel presentation, we will cover the approach used in digitizing the Union Pacific Railroad water documents, including writing the grant and selecting materials, preparing archival collections for efficient digitization, managing the project, the student technician perspective, and troubleshooting imaging and technical issues.
Panelists: Project Manager; Curator; Digital Collections Librarian; Student Technician; Visual Resources Curator
Goal: Overview of large-scale digitization and project deliverables.
Boot Camp: Preparing to Digitize 10:00 - 11:00
Goal: Dig into the decisions needed to create a digitization plan. There will be a short presentation on the planning document, including asking, "What makes a good project?" We will discuss labor and students and complete hands-on activities with actual collections to encourage work on individual plans.
11:00 - 12:00 *concurrent sessions
Boot Camp: Capture Images - Group A
Boot Camp: Create Metadata - Group B
Goal: Provide introductions to two main workflows in digitization projects: digital capture and metadata creation. There will be demonstrations, hands-on activities, and a chance to ask questions, with the goal of helping to complete digitization plans.
Lunch 12:00 - 1:00
1:00 - 2:00 *concurrent sessions
Boot Camp: Capture Images - Group B
Boot Camp: Create Metadata - Group A
Goal: Provide introductions to two main workflows in digitization projects: digital capture and metadata creation.
There will be demonstrations, hands-on activities, and a chance to ask questions, with the goal of helping to complete digitization plans.
Boot Camp: Finding External Funding 2:00 - 2:30
Goal: Learn what opportunities exist to secure funding for your project. Hear tips on successful grant writing. Discuss possible collaboration opportunities across the state.
Presenting Online Images: DAMs Overview 2:30 - 3:30
Goal: See several options for presenting your collection to an online audience. Options will highlight strategies for many staffing configurations, including the solo librarian/historian, institutions with low IT resources, common systems in the profession, and complex open-source development communities focused on a digital asset management platform (Islandora 8).
Wrap Up / Assessment 3:30 - 3:45
Goal: Complete a short survey on the workshop and ideas for future statewide events related to digitization.
One-on-One Consultations Available 3:45 - 4:30

ENDNOTES
1 Some examples include: "Moving Theory into Practice: Digital Imaging Tutorial," Cornell University Library/Research Department, http://preservationtutorial.library.cornell.edu/contents.html; "BCR's CDP Digital Imaging Best Practices Version 2.0," Bibliographical Center for Research, June 2008, https://sustainableheritagenetwork.org/system/files/atoms/file/bcrcdpImagingBP.pdf; "New Self-Guided Curriculum for Digitization," Digital Public Library of America, https://dp.la/news/new-self-guided-curriculum-for-digitization/; Elizabeth La Beaud, "Analysis of Digital Preservation Course Offerings in ALA Accredited Graduate Programs," SLIS Connecting 6, no. 2 (2017): 10, https://doi.org/10.18785/slis.0602.09.
2 Anne Daniel, Amanda Oliver, and Amanda Jamieson, "Toward a Competency Framework for Canadian Archivists," Journal of Contemporary Archival Studies 7, article 4 (2020): 1–13, https://elischolar.library.yale.edu/jcas/vol7/iss1/4.
3 "ALA's Core Competences of Librarianship," American Library Association, 2009, http://www.ala.org/educationcareers/careers/corecomp/corecompetences; "Guidelines: Competencies for Special Collections Professionals," Association of College and Research Libraries, 2017, http://www.ala.org/acrl/standards/comp4specollect.
4 Archives & Records Association of the United Kingdom and Ireland, "The ARA Competency Framework," 2016, https://www.archives.org.uk/160-cpd/cpd/700-competency-framework.html.
5 Youngok Choi and Edie Rasmussen, "What is Needed to Educate Future Digital Librarians," D-Lib Magazine 12, no. 9 (September 2006), https://doi.org/10.1045/september2006-choi; Youngok Choi and Edie Rasmussen, "What Qualifications and Skills are Important for Digital Librarian Positions in Academic Libraries? A Job Advertisement Analysis," The Journal of Academic Librarianship 35, no. 5 (2009): 457–67, https://doi.org/10.1016/j.acalib.2009.06.003.
6 Karl-Rainer Blumenthal et al., "What Makes a Digital Steward: A Competency Profile Based on the National Digital Stewardship Residencies," LIS Scholarship Archive (2017), https://doi.org/10.17605/OSF.IO/TNMRA.
7 "MLIS Skills at Work: A Snapshot of Job Postings," San Jose State University School of Information, 2019, https://ischool.sjsu.edu/lis-career-trends-report.
8 Choi and Rasmussen, "What Qualifications."
9 David A. Kolb and Ronald Fry, "Toward an Applied Theory of Experiential Learning," in Theories of Group Process, ed. Cary L. Cooper (London: John Wiley, 1975), 33–57.
10 "Guidelines," Federal Agencies Digital Guidelines Initiative, http://www.digitizationguidelines.gov/guidelines/.
11 Krystyna K. Matusiak and Xiao Hu, "Educating a New Cadre of Experts Specializing in Digital Collections and Digital Curation: Experiential Learning in Digital Library Curriculum," Proceedings of the American Society for Information Science and Technology 49, no. 1 (2012): 1–3, https://doi.org/10.1002/meet.14504901018.
12 Amy Lynn Maroso, "Educating Future Digitizers," Library Hi Tech 23, no. 2 (June 1, 2005): 187–204, https://doi.org/10.1108/07378830510605151.
13 "Agenda, Digital Directions: Fundamentals of Creating and Managing Digital Collections, October 19-20, 2020, Tucson, AZ," Northeast Document Conservation Center, https://www.nedcc.org/preservation-training/dd20/agenda.
14 Kim Christen and Lotus Norton-Wisla, "Digitization Project Decision-making: Starting a Digitization Project," Center for Digital Scholarship and Curation, Sustainable Heritage Network, July 1, 2017, https://sustainableheritagenetwork.org/digital-heritage/digitization-project-decision-making-starting-digitization-project.
15 Kim Christen and Lotus Norton-Wisla, "Digitization Project Decision-making: Should We Digitize? Can We Digitize?," Center for Digital Scholarship and Curation, Sustainable Heritage Network, July 1, 2017, https://sustainableheritagenetwork.org/digital-heritage/digitization-project-decision-making-should-we-digitize-can-we-digitize-0.
16 Taylor Surface, "Getting a Million Dollar Digital Collection Grant in Six Easy Steps," OCLC Next, December 6, 2016, http://www.oclc.org/blog/main/getting-a-million-dollar-digital-collection-grant-in-six-easy-steps/.
17 Institute of Museum and Library Services, "Putting Your Best Foot Forward: Tips on Making Your Preliminary Proposal Competitive," December 31, 2015, https://www.imls.gov/blog/2015/12/putting-your-best-foot-forward-tips-making-your-preliminary-proposal-competitive.
18 Examples of project management literature relevant to cultural heritage digitization projects include: Cyndi Shein, Hannah E. Robinson, and Hana Gutierrez, "Agility in the Archives: Translating Agile Methods to Archival Project Management," RBM: A Journal of Rare Books, Manuscripts, and Cultural Heritage 19, no. 2 (2018), https://rbm.acrl.org/index.php/rbm/article/view/17418/19208; Michael Dulock and Holley Long, "Digital Collections Are a Sprint, Not a Marathon: Adapting Scrum Project Management Techniques to Library Digital Initiatives," Information Technology and Libraries 34, no. 4 (2015), https://doi.org/10.6017/ital.v34i4.5869; Michael Middleton, "Library Digitisation Project Management," Proceedings of the IATUL Conferences (1999), http://docs.lib.purdue.edu/iatul/1999/papers/20; "DLF Project Managers Toolkit," Digital Library Federation, https://wiki.diglib.org/DLF_Project_Managers_Toolkit; Theresa Burress and Chelcie Juliet Rowell, "Project Management for Digital Library Projects with Collaborators Beyond the Library," Journal of College & Undergraduate Libraries 24, no. 2–4 (2017), https://doi.org/10.1080/10691316.2017.1336954.
19 "Guiding Digital Success," Online Computer Library Center (OCLC), https://www.oclc.org/content/dam/oclc/contentdm/guiding_digital_success_handout.pdf.
20 Useful metadata resources include: Digital Public Library of America, "Metadata Application Profile," https://pro.dp.la/hubs/metadata-application-profile; Dublin Core Metadata Initiative, "Guidelines for Dublin Core Application Profiles," https://www.dublincore.org/specifications/dublin-core/profile-guidelines/; Oksana L. Zavalina et al., "Developing an Empirically-based Framework of Metadata Change and Exploring Relation between Metadata Change and Metadata Quality in MARC Library Metadata," Procedia Computer Science 99 (2016): 50–63, https://doi.org/10.1016/j.procs.2016.09.100.
21 "Guidelines: Technical Guidelines for Digitizing Cultural Heritage Materials," Federal Agencies Digital Guidelines Initiative, http://www.digitizationguidelines.gov/guidelines/digitize-technical.html; "Digital Preservation at the Library of Congress," Library of Congress, https://www.loc.gov/preservation/digital/.
22 Robin L. Dale, "Reformatting: 6.7 Outsourcing and Vendor Relations," Northeast Document Conservation Center, https://www.nedcc.org/free-resources/preservation-leaflets/6.-reformatting/6.7-outsourcing-and-vendor-relations; "Deciding to Outsource or Digitize In-House," Digital Stewardship Curriculum, Center for Digital Scholarship and Curation/Sustainable Heritage Network, https://www.sustainableheritagenetwork.org/system/files/atoms/file/1.20_OutsourcingvsInHouse.pdf.
23 "Omeka," Roy Rosenzweig Center for History and New Media, https://omeka.org/; "CONTENTdm: Build, showcase, and preserve your digital collections," OCLC, https://www.oclc.org/en/contentdm.html.
24 "ISO 14721:2012 Space data and information transfer systems—Open archival information system (OAIS)—Reference model," International Organization for Standardization, https://www.iso.org/standard/57284.html; "Levels of Digital Preservation," National Digital Stewardship Alliance/Digital Library Federation, https://ndsa.org/activities/levels-of-digital-preservation/.
25 "From Theory to Action: 'Good Enough' Digital Preservation Solutions for Under-resourced Cultural Heritage Institutions," Preserving Digital Objects with Restricted Resources (POWRR), August 2014, http://commons.lib.niu.edu/handle/10843/13610.
26 "Digital Archives Specialist (DAS) Curriculum and Certificate Program," Society of American Archivists, https://www2.archivists.org/prof-education/das; "POWRR Institutes," Digital POWRR, https://digitalpowrr.niu.edu/institutes/.
27 Sandy Rodriguez et al., "Collective Responsibility: Seeking Equity for Contingent Labor in Libraries, Archives, and Museums," Institute for Museum and Library Services white paper, http://laborforum.diglib.org/wp-content/uploads/sites/26/2019/09/Collective_Responsibility_White_Paper.pdf.
28 SAA-ACRL/RBMS Joint Task Force on Public Services Metrics, "Standardized Statistical Measures and Metrics for Public Services in Archival Repositories and Special Collections Libraries," 2018, https://www2.archivists.org/standards/standardized-statistical-measures-and-metrics-for-public-services-in-archival-repositories; Molly Bragg et al., "Best Practices for Google Analytics in Digital Libraries," Digital Library Federation Assessment Interest Group Analytics working group, 2015, https://doi.org/10.17605/OSF.IO/CT8BS; "Google Analytics in CONTENTdm," OCLC, https://help.oclc.org/Metadata_Services/CONTENTdm/Get_started/Google_Analytics_in_CONTENTdm.

11863 ---- Collaboration and Integration: Embedding Library Resources in Canvas
ARTICLES
Collaboration and Integration: Embedding Library Resources in Canvas
Jennifer L. Murray and Daniel E. Feinberg
INFORMATION TECHNOLOGY AND LIBRARIES | JUNE 2020
https://doi.org/10.6017/ital.v39i2.11863
Jennifer L. Murray (jennifer.murray@unf.edu) is Associate Dean, University of North Florida. Daniel E. Feinberg (daniel.feinberg@unf.edu) is Online Learning Librarian, University of North Florida.
ABSTRACT
The University of North Florida (UNF) transitioned to Canvas as its Learning Management System (LMS) in summer 2017. This implementation brought opportunities for a more user-friendly learning environment for students. Working with students in in-person, hybrid, and online courses made clear that the library needed a place in the Canvas LMS: students had to remember how to access and locate library resources and services outside of Canvas. During this time, the Thomas G. Carpenter Library's online presence was enhanced, yet it was still not visible in Canvas. It became apparent that the library needed to be integrated into Canvas courses, which would enable students to transition easily between their coursework and the resources and services that support their studies. In addition, librarians who worked with students looked for ways for students to easily find library resources and services online. After much discussion, it became clear to the Online Learning Librarian (OLL) and the Director of Technical Services and Library Systems (Library Director) that the library needed to explore ways to integrate more with Canvas.

INTRODUCTION
Online learning is not a new concept at UNF. In fact, in-person, hybrid, and online courses have used online learning in some capacity since distance learning took hold in higher education. UNF transitioned to Canvas as its Learning Management System (LMS) in summer 2017, replacing Blackboard. This change, which affected all of UNF's online instruction and student learning, brought new benefits and challenges and provided a more secure system for students taking in-person, hybrid, and distance learning courses. During this transition, UNF's library went through many changes in its virtual presence. Students, particularly those whose classes used Canvas, needed a user-friendly way to use the library website and its resources virtually. In response, the library's resources took on a greater online presence. Ultimately, however, many students needed resources that they did not realize were available electronically from the library. Through instruction and research consultations (both in-person and virtual), students could be directed back to the library homepage to access resources; the reality, though, was that unless library instruction or professors pointed out library resources, students turned instead to Google or other easy-to-find online resources to which they had previously been exposed.

HOW THE PROJECT ORIGINATED
By spring 2018, a growing number of UNF courses had been converted to online or hybrid formats. As students used Canvas more, librarians received feedback from in-person and online sessions that students had difficulty accessing the library's resources while in Canvas. The lack of library visibility in Canvas made the librarians acknowledge that this was a real problem. Students had to open a new browser window to access the library and then go back to Canvas to complete their assignments, a process that involved multiple steps. This caused frustration among students, who had to remember the library URL while also getting used to navigating their new courses in Canvas.
Librarians consistently spent large amounts of time during library instruction sessions and research consultations showing students how to navigate to the library website. In effect, more time was spent guiding students to library resources, such as programmatic or course-specific Springshare-hosted LibGuides (also known as library guides) or the library homepage, than on the resources themselves. Rather than focusing on how to use library resources and become more information literate, students spent their time simply locating the library website to reach UNF's online resources. Together, the OLL and Library Director talked about possibilities in Canvas that would benefit all students attending UNF, both in person and online. Canvas is located in UNF's MyWings, a portal where all students go for coursework, email, and resources that support their academic studies at UNF. It became apparent that, if possible, there needed to be a quicker way for students to access UNF library resources.

LITERATURE REVIEW
With the advent of online learning, it became obvious that students needed library access within their online learning management system; for campuses such as UNF, this meant within Canvas. This was especially true for UNF students who are exclusively distance or online students. Farkas noted that librarians have worked to determine the best ways to provide library materials and services and to embed librarians in the LMS.1 Over the last fifteen years, LMSs have become more important in supporting the growth of online learning. Pomerantz noted that the LMS has become critical to instruction and online learning: approximately 99 percent of institutions adopt an LMS and 88 percent of faculty utilize an LMS platform.2 This "puts it in the category with cars and cellphones as among the most widely adopted technologies in the United States."3 Library guides integrated into an LMS gained visibility, but integration did not guarantee that faculty and students would utilize them. That is why it was critical to continuously collaborate and communicate with faculty, students, and librarians to bring attention to the resources that could assist students. Farkas noted that librarians at The Ohio State University found that no matter how the library was integrated into a university's LMS, library usage there depended largely on whether the course instructor promoted the library to their students.4 The reality libraries faced was that without visibility in an LMS, students who were online or distance learners had to remember or find the library's website. While this seemed inconsequential, it led students to use Google or other resources instead of their university or college library's discovery tool or databases. Farkas noted that Shank and Dewald's seminal article described library integration in an LMS as having two levels, macro and micro. When there was a single way to access the library in the LMS, it was termed macro; this single pathway required less maintenance because there was one route to the library from the LMS.5 The University of Kentucky embedded the library by adding a library tab in Blackboard.
Other institutions like Portland State University, Ohio State University, and Oakland State University developed library widgets to make the library more accessible.6 The addition of library and research guides to library instruction was critical to increase visibility for students and to make sure students had easier access to the library through their LMS. Getting librarians access to the LMS at their institutions is an ongoing issue.7

UNF librarians wanted to determine best practices for how the library could integrate into Canvas, so research was needed to see what other university libraries were doing. The librarians at UNF discovered no obvious preference in the published examples for how to get the library into Canvas. Davis observed that "claiming a small amount of real estate in the LMS template . . . is an easy way to put the library in students' periphery." Simply adding a library link or page to each course was "the digital equivalent for students of walking past the library on the way to class."8 However, it seemed that much depended on how an LMS was used at an institution and on the technical expertise available. Thompson and Davis noted that the "LMS technology has added another layer of complexity to the puzzle. As technology evolves to address changes in learning design, student and faculty attitudes, expectations, perceptions will continue to be a critical piece of the course integration puzzle."9

While looking at other institutions, the librarians found a variety of ways in which Canvas and the library were integrated. Examples ranged from embedded Springshare library guides, to modules of quizzes or tutorials, to online mini-courses and embedded librarians in LMS courses.10 Penn State University examined its method of adding library content to Canvas. It already had a specific way of putting library guides in Canvas, but they were not in a highly visible location for students to easily access. When faculty put guides in their courses in collaboration with librarians, the guides were used; however, many faculty did not use these librarians or resources. A student survey and user studies were used to learn how to get the students and faculty who did not use the guides and content to use them more. Penn State worked with its COMM 190 instructors to administer a survey, offered for extra credit to ensure responses.11 "General findings included: 53 percent of students did not know what a course guide was; 41 percent of students had used a guide for a class project; 60 percent accessed a course guide via the LMS; and 37 percent of students used course guides on their own."12 Many students were interested in doing their library research within Canvas itself. The guides needed to be in a prominent place in Canvas without overwhelming the course content, and course-specific guides needed an introductory statement describing what each guide was about. The release of Springshare's LTI tool made it an optimal time to embed Penn State's library guides smoothly into Canvas.13 The Learning Tools Interoperability (LTI) tool allows online software to be integrated into Canvas.
In effect, when professors want to add a tool to their course, LTI provides a more seamless and controlled avenue. In the case of library guides, it creates a way for guides to be embedded into the LMS with few problems.

Another example of a library integration into a campus LMS was at the Utah State University (USU) Merrill-Cazier Library, which looked to maximize the effectiveness of Springshare's library guides when it assessed the design and reach of library guides within its LMS.14 USU took a unique approach, building an LTI tool that automatically pulled in the most appropriate library guide when the "Research Help" link in Canvas was activated by a professor. The library also saw this as an opportune time to redesign its subject guides and ensure there were guides for all disciplines, and it provided usage data to subject librarians to help determine where there might be opportunities to interact with classes and provide more library instruction. Overall, the study and the feedback received from students helped the library find ways to improve how librarians used and thought about library guides and expanded their reach based on usage data.15 The ability to add library guides to Canvas gave students a way to access library materials without having to leave the online classroom.

Many libraries have conducted focus groups and usability studies, which provided valuable feedback on faculty and student knowledge and understanding of guides and on ways to improve the information shared, assisting students with their coursework and faculty in their online teaching. Research indicated that exploring and implementing library guide integration in an LMS led to a need to improve guides and design them more consistently.16

The literature also indicated the importance of a strong relationship with the department that manages the LMS. Integrations were much easier when a relationship was already established, and that relationship sometimes led to discovering additional opportunities to integrate with the LMS. Penn State saw an increase of over 200,000 hits to its course guides, believed to be a result of the LTI integration.17 This, however, did not guarantee that students benefited from the course guides, much as library statistics showing page hits do not prove that resources were actually used. In addition, faculty were able to turn off the LTI course guide tool, which reduced the chances of student usage or awareness of the course guide. Based on feedback from students and faculty, it did not matter where the course guides were, since they could be ignored anyway. A Penn State blog was developed by the Teaching and Learning with Technology unit to make instructors aware of the online services librarians provide.18 "Although automatic integration allows specialized library resources to be targeted at all LMS courses, that does not mean that they'll be accessed. It is important then to build ongoing relationships with stakeholders, providing not just information that such integrations exist, but also reasons why to use them."19

However, not all universities and colleges integrated the library into their LMS strictly through a library guide or a link to the library. Karplus noted that students spent more time online rather than going to the physical academic library.
Karplus discussed two benefits of combining the digital world with academic library resources: it made online research a more normal occurrence, and it made students more comfortable accessing online resources.20 While using Blackboard, St. Mary's College's goal was to incorporate library information literacy modules into existing courses. Using the Blackboard LMS, students were able to access all components of their courses, including assigned readings; this became their academic environment. Therefore, information literacy modules, tutorials, and outside internet resources could be added to the LMS.21 Tutorials combined with pre- and post-testing gave faculty instant feedback. Librarians were also able to participate in Blackboard through discussion boards and work with students.22 There was a constant need to update the modules and the information added to Blackboard. Librarians having access to the Blackboard site allowed students to use library resources more readily. "The site can be the focal point for many librarians in one location thus ensuring a consistent, collaborative instructional program."23 Overall, the goal of integrating campus librarians into an LMS was to get students to use the library in order to be more successful in their academic endeavors.

DEVELOPING A PLAN OF ACTION
Initially, the OLL and Library Director brainstormed possible integration ideas, ranging from adding a library global navigation button to the Canvas UI to adding a link to the library in the Canvas help menu. At the same time, they researched what other libraries had done. After brainstorming, they realized that additional conversations needed to occur within the library and with UNF's Online Learning Support Team, part of the Center for Instruction and Research Technology (CIRT), the group that manages Canvas. The discussion of integrating the library and Canvas was a complex matter. UNF administrators asked for a proposal to be written so it could be brought to the library, Online Learning Support Team, and Information Technology Services (ITS) stakeholders for discussion and approval. That proposal, along with much-needed discussion, was critical to determining the possibilities and the actions that needed to be taken. Throughout, it was important to keep in mind what would best serve the faculty and students.

When brainstorming discussions began with UNF's Online Learning Support Team, it was important for the library to determine what options were available to embed the library in Canvas. The library had a strong relationship with UNF's Online Learning Support Team and ITS administrators, which made this an easy process to pursue. What the OLL and Library Director initially wanted was to add a simple link to the global navigation in Canvas that would take all users to the library homepage. However, it became apparent that this was not possible because that space is limited and many departments on campus would like greater visibility in Canvas. The next option, which was easier to implement, was to add a link to the library homepage under the help menu in Canvas. Although this menu link was added, it was so hidden in Canvas that the OLL and Library Director felt it would never be found by faculty, let alone students.
CIRT administrators apprised the OLL and Library Director of the other possibilities available. After researching options, the library recommended creating access to library resources and services using a Springshare LTI tool for library guides, which CIRT agreed to. Library guides, or LibGuides, are library webpages built with Springshare software. Using the LTI tool seemed like a great possibility, since it would allow for more of a presence in addition to the help-menu link to the library homepage. After approval from library administration and initial discussions with ITS, the project moved forward.

IMPLEMENTATION
The project took about a year to complete, from the time discussions began internally in the library to the time the integration went live (see figure 1).

Figure 1. Project Timeline

A seamless entryway to the library seemed like a good idea based on observations and feedback from students, but the OLL and Library Director started by completing an environmental scan to see what other institutions had done and to get ideas on ways the UNF library could integrate into Canvas. They learned that it had been done in a variety of ways: integration at the global navigation level, integration at the course level, and an added link to the library under the help menu in Canvas. It became clear that an integration into Canvas was an obvious progression, strengthening online learning and giving students the ability to benefit from the resources the library subscribed to in support of their curricular needs. Conversations then occurred with UNF's Online Learning Support Team to discuss integration options further. After much discussion, a decision was made to pursue an added link to the library website under the Canvas help menu and a new LTI tool at the course level. Since Canvas was used in so many courses, it was determined that agreement from a university-wide campus committee was needed on how to go about adding library guides to Canvas courses. Librarians were also approached at this time for their input and feedback. The goal seemed obvious to the librarians: when approached, they readily bought in to supporting students in Canvas by way of the help button and the LTI tool integration. For the librarians, the goal was to solve the problem of making sure that students could easily access library materials. Overall, the library faculty's preference was to embed the library website under the Canvas help menu while also having the Student Resources LibGuide inside all Canvas courses using the Springshare LTI tool. After all internal approvals were obtained, the link to the library was seamlessly added under the Canvas help menu. The Springshare LTI tool required more work and discussion before it could be implemented. After approval was granted by the UNF Online Learning Support Team and the campus ITS Security Team, the integration began. Configuration options for the LTI tool were explored, and the Systems and Digital Technologies Librarian worked closely with the UNF Online Learning Support Team and Springshare to set up the LibApps LTI tool.
The first step was to configure Springshare's Automagic LTI tool to automatically add LibGuides to courses in Canvas. This included adding an LTI tool name and description, which appeared in Canvas during setup and in the course navigation. It was also decided, based on feedback from across campus, to set the Student Resources LibGuide as the default guide for all courses. Instructors could request to use a different LibGuide for their course. To enable this, two parameters had to be set in the Automagic LTI tool to enable LibGuide matching between Canvas and LibGuides:

• LTI Parameter Name: For UNF, this was set to "context_label" to select the course code field in Canvas.
• LibGuides Metadata Name: This was set to the appropriate value to identify the metadata field used in LibGuides.

If an instructor decided to change the default guide to another guide, these two parameters would need to be entered into a specific LibGuide's custom metadata so that Canvas could link to the designated guide to display in a course (a sketch of this matching logic appears below). The change had to be made in the LibGuide itself, so it was handled by the Systems and Digital Technologies Librarian. Few instructors had requested this yet, but when used, the library would also have to ensure the match carried over each semester by updating the metadata in the guide to the new course code. After the configuration was completed on the Springshare side, the next step was to set up the integration in the Canvas test environment. An external application had to be installed in Canvas to allow the Springshare LTI tool to run. After it was tested, the application was applied across all courses and set to display by default, which the majority of faculty preferred. Faculty who did not want to use the integration had the ability to manually turn it off in Canvas. During the implementation setup, a few minor issues were encountered. After seeing what the Student Resources Guide looked like in Canvas, it became clear that the header and footer were not needed and just cluttered the guide. They were both removed in the LTI setup options to ensure a cleaner-looking guide. Since the LibGuides were being embedded into another system (Canvas), the formatting of the guides had to be adjusted. The other issue encountered was trying to add available Springshare widgets, such as the library hours or room booking content, to the guide using the Automagic LTI tool. While this was not successful, it was determined that the additional options were not needed. Once the integration was set up in the Canvas test environment, demonstrations were held and input was gathered from stakeholders through campus-wide meetings with faculty. It was critical to determine whether faculty would utilize LibGuides in their Canvas courses. An overview of the integration and its benefits was given to the Campus Technology Committee and Distance Learning Committee faculty. A demonstration was also given so that these faculty committees could see what the integration would look like in their courses. Overall, the feedback obtained from the faculty was very positive. The preference was to make the configuration opt-out, where the library guides would automatically display in Canvas courses. Many faculty members were excited about the integration and looked forward to having it in their courses.
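To make the course-to-guide matching concrete, here is a minimal sketch of the kind of lookup described above. It is illustrative only: the function, guide URLs, and metadata dictionary are hypothetical stand-ins rather than Springshare's actual implementation. The one grounded assumption is that an LTI 1.1 launch carries a context_label parameter, which Canvas populates with the course code.

# Hypothetical sketch of LTI course-to-guide matching; not Springshare code.
DEFAULT_GUIDE = "https://libguides.example.edu/student-resources"  # invented URL

# Custom metadata entered in individual LibGuides (course code -> guide URL).
GUIDE_METADATA = {
    "ENC1101": "https://libguides.example.edu/first-year-writing",
    "BIO2010": "https://libguides.example.edu/biology",
}

def pick_guide(lti_launch_params):
    """Return the guide URL to embed for a Canvas course, falling back to
    the default Student Resources guide when no course-specific guide exists."""
    course_code = lti_launch_params.get("context_label", "")
    return GUIDE_METADATA.get(course_code, DEFAULT_GUIDE)

# Canvas sends context_label with each LTI launch:
print(pick_guide({"context_label": "ENC1101"}))  # course-specific guide
print(pick_guide({"context_label": "HIS3000"}))  # default guide

The opt-out behavior described above corresponds to the fallback branch: every course receives the Student Resources guide unless a course-specific match has been registered.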
After demos took place and final setup was completed based on feedback, the integration was set up in the Canvas production environment and announced via newsletters, emails, and social media. As of the fall 2019 semester, the library's Student Resources Guide was integrated into all courses in Canvas (see figure 2).

Figure 2. Student Resources Library Guide in Canvas

BENEFITS OF THE INTEGRATION

Students depend on their campus LMS to complete their coursework, support their studies, and, in the case of UNF, have easier access to the online campus. The LibGuide integration not only streamlined their way to library resources but also promoted library usage by students who may not have known how to get to the resources available to them. Faculty, it should be noted, were able to replace the default student LibGuide with a more specific subject or course guide. Either way, it brought more awareness to resources and services that supported curricular needs. The Springshare chat widget in the guide also gave students the ability to communicate directly with a librarian from within Canvas. This integration not only increased the library's visibility in the online environment but also provided all students, whether in-person, hybrid, or online, with direct access to the resources they needed for their coursework.

CHALLENGES OF THE INTEGRATION

Although integrating the library into Canvas offered many benefits, there were also many challenges in making the integration happen. There were many more stakeholders than expected: library administration, the Canvas administrators, library faculty, and teaching faculty committees all needed to provide input before the project could take place. Since the project grew organically, stakeholders were brought in as the project grew and unfolded. Once the project received approval from the library and CIRT administrators, ITS administrators had to give the final approval to proceed with the integration of library guides. The process to implement the integration took some time to figure out. In addition, getting buy-in from the teaching faculty was key, as the navigation options in their Canvas courses would be impacted. Making sure the faculty understood how the integration would assist their students was important, as the goal was to help students succeed with their coursework. A further concern was whether faculty would tell their students about the guides or whether students would find the link to the LibGuide on their own. Determining how the news of the library and Canvas integration would be communicated to the UNF community was the final step. The Library Director, OLL, and CIRT administrators needed to determine the best communication routes; in the end, the news was shared through emails, UNF updates and newsletters, and word of mouth from teaching faculty. It was crucial that students be aware of these tools. This meant that, going forward, UNF would depend on word of mouth or students' curiosity about the Canvas navigation bars themselves.

DISCUSSION AND NEXT STEPS

Integrating the library with UNF's learning management system, Canvas, took much planning and collaboration, which was key to creating a more user-friendly learning environment for students.
In reflecting on what went well and what did not, the UNF librarians learned several important lessons that will help improve the implementation of future projects. To begin, it is important to identify and involve stakeholders early on so they can provide feedback along the way. Getting buy-in from the teaching faculty is also key, since the integration affects the navigation options in their Canvas courses. At UNF, the OLL and Library Director initially did not realize how many groups of teaching faculty and departments would need to approve this Canvas change and implementation. It was important to help them understand the importance of the integration and how it could assist their students with their coursework. Considering the content of the library guides was also important because of the impact it would have on Canvas courses. For example, at the UNF Library some students thought that the librarian's picture on the default guide was in fact their professor and began to contact her, which caused much confusion for students and professors alike. Along the way, communication is critical so that everyone is kept informed as the integration progresses. Communicating at the appropriate times, and gathering all information about configuration options before starting conversations with stakeholders, proved important as well. Finally, investigating the ability to retrieve usage statistics from day one would be extremely beneficial, providing data to assess how often the library guides are being used in the LMS and by whom. This information would help determine next steps and explore other potential integration opportunities. At UNF, the librarians were not able to implement statistics as part of the integration, which has made it more difficult to assess the usage of the library guides in Canvas. Now that the integration has been completed, ensuring it continues to meet the needs of faculty and students will be important. Feedback will need to be gathered from stakeholders to find out whether they find the integration useful, whether any issues are being encountered, and whether they have recommendations for ways to enhance it. Usage statistics will also be gathered as soon as they are available. This will show which instructors are using the library guides in their courses and which are not. For those who are using them, it will be an opportunity to target those courses for instruction; for those who are not, it will be an opportunity to find out why and to make sure they are aware of the benefits of using the guides in their courses. Exploring other integration possibilities, especially as the technology continues to evolve, will be important to ensure the library continues to reach students. While the natural progression of the UNF integration would be to embed librarians in the Canvas platform, others have faced challenges.
“According [to] the Ithaka S & R Library Survey 2013 by Long and Schonfeld, 80–90 percent of academic library directors perceive their librarians’ primary role as contributing to student learning while only 45–55 percent of faculty agree librarians contribute to student learning.”24 Even though this is a challenge, faculty collaboration with librarians is crucial for the embedded librarian role. Without a requirement of embedded librarianship, marketing the librarians and what they can do for students will be essential for the role to be successful.25 At UNF, conversations will have to be held to determine what other integrations would be of interest and possible at the university. The UNF Library will also be looking to improve the design and layout of library guides. Now that their visibility has increased, it will be important to standardize them and ensure they all have a consistent look and feel, which will make it easier for students to find the information and resources they are looking for.

CONCLUSION

In today's rapidly changing technological world, it is critical to make resources available regardless of where students are physically located. Integrating the library's LibGuides into Canvas not only brought more visibility to the library, its resources, and its services, but it also brought the library to where the students were engaged with the university. As noted by Farkas, “positioning the library at the heart of the virtual campus seems as important as positioning the library as the heart of the physical campus.”26 Providing resources to students at their point of need enabled them to easily access the information they needed to help them succeed in their courses. It also allowed faculty to integrate the library resources most beneficial to their courses, enhancing their teaching as well as the educational needs of their students. The UNF Library will continue to look at how library resources are used and how best to serve the online community going forward. It will be important to explore ways to enhance existing services with existing technology, but also to look ahead and determine what may be possible down the road with new and upcoming technologies. In addition, assessing how the library connects to online learners and gathers feedback from students and faculty will be critical to contributing to the success of students.

ENDNOTES

1 Meredith Gorran Farkas, “Libraries in the Learning Management System,” American Libraries Tips and Trends (Summer 2015), https://acrl.ala.org/IS/wp-content/uploads/2014/05/summer2015.pdf.

2 Jeffrey Pomerantz et al., “Foundations for a Next Generation Digital Learning Environment: Faculty, Students, and the LMS” (Jan 12, 2018): 1–4.

3 Pomerantz et al., “Foundations for a Next Generation Digital Learning Environment.”

4 Farkas, “Libraries in the Learning Management System.”

5 Farkas, “Libraries in the Learning Management System.”

6 Farkas, “Libraries in the Learning Management System.”

7 Farkas, “Libraries in the Learning Management System.”

8 Robin Camille Davis, “The LMS and the Library,” Behavioral & Social Sciences Librarian 36, no. 1 (Jan 2, 2017): 31–5, https://doi.org/10.1080/01639269.2017.1387740.
9 Liz Thompson and Davis Vess, “A Bellwether for Academic Library Services in the Future: A Review of User-Centered Library Integrations with Learning Management Systems,” Virginia Libraries 62, no. 1 (2017): 1–10, https://doi.org/10.21061/valib.v62i1.1472.

10 Davis, “The LMS and the Library.”

11 Amanda Clossen and Linda Klimczyk, “Chapter 2: Tell Us a Story: Canvas Integration Strategy,” Library Technology Reports 54, no. 5 (2018): 7–10, https://doi.org/10.5860/ltr.54n5.

12 Clossen and Klimczyk, “Chapter 2,” 8.

13 Clossen and Klimczyk, “Chapter 2,” 8.

14 Britt Fagerheim et al., “Extending our Reach,” Reference & User Services Quarterly 56, no. 3 (2017): 180–8, https://doi.org/10.5860/rusq.56n3.180.

15 Fagerheim et al., “Extending our Reach,” 187.

16 Fagerheim et al., “Extending our Reach,” 188.

17 Amanda Clossen, “Chapter 7: Ongoing Implementation: Outreach to Stakeholders,” Library Technology Reports 54, no. 5 (2018): 28.

18 Clossen, “Chapter 7,” 29.

19 Clossen, “Chapter 7,” 29.

20 Susan S. Karplus, “Integrating Academic Library Resources and Learning Management Systems: The Library Blackboard Site,” Education Libraries 29, no. 1 (2006): 5, https://doi.org/10.26443/el.v29i1.219.

21 Karplus, “Integrating Academic Library Resources and Learning Management Systems.”

22 Karplus, “Integrating Academic Library Resources and Learning Management Systems.”

23 Karplus, “Integrating Academic Library Resources and Learning Management Systems.”

24 Beth E. Tumbleson, “Collaborating in Research: Embedded Librarianship in the Learning Management System,” Reference Librarian 57, no. 3 (Jul 2016): 224–34, https://doi.org/10.1080/02763877.2015.1134376.

25 Tumbleson, “Collaborating in Research.”

26 Farkas, “Libraries in the Learning Management System.”

11877 ---- VR Hackfest
Public Libraries Leading the Way: VR Hackfest
Chris Markman, M Ryan Hess, Dan Lou, and Anh Nguyen
INFORMATION TECHNOLOGY AND LIBRARIES | DECEMBER 2019

Chris Markman (Chris.Markman@cityofpaloalto.org) is Senior Librarian, Information Technology & Collections, Palo Alto City Public Library. M Ryan Hess (ryan.hess@cityofpaloalto.org) is Library Services Manager, Digital Initiatives, Information Technology & Collections, Palo Alto City Public Library. Dan Lou (dan.lou@cityofpaloalto.org) is Senior Librarian, Information Technology & Collections, Palo Alto City Public Library. Anh Nguyen (anh.nguyen@cityofpaloalto.org) is Library Specialist, Information Technology & Collections, Palo Alto City Public Library.

We built the future of the Internet…today! The eLibrary team at the Palo Alto City Library held a VR Hackfest weaving together multiple emerging technologies into a single workshop. During the event, participants had hands-on experience building VR scenes, which were loaded to a Raspberry Pi and published online using the Distributed Web. Throughout the day, participants discussed how these technologies might change our lives, for good and for ill.
And afterward, an exhibit showcasing the participants' VR scenes was placed at our Mitchell Park branch to stir further conversation.

MULTIPLE EMERGING TECHNOLOGIES EXPLORED

The workshop was largely focused on A-Frame code, a framework for publishing 3D scenes to the web (https://aframe.io/). However, we also integrated a number of other technologies, including a Raspberry Pi, QR codes, a Twitter bot, and the Inter-Planetary File System (IPFS), which is a distributed web technology.

Virtual Reality Built With A-Frame Code

In the VR Hackfest, participants first learned how to use A-Frame code to render 3D scenes that can be experienced through a web browser or VR headset. A-Frame is a new framework that web publishers and 3D designers can use to design web sites, games, and 3D art. A-Frame is an extension of HTML, the code used to build web pages. Anyone who is familiar with HTML will pick up A-Frame very quickly, but it is simple enough even for beginners. For example, figure 1 shows some raw A-Frame code.

Figure 1. Try this code example! https://tinyurl.com/IPFSVR02.

Save the code in figure 1 as an HTML file and open it with a WebVR-compatible browser like Chrome, and you will then see a blue cube in the center of your screen. By just changing the values of a few parameters, novice coders can easily change the shape, size, color, and location of primitive 3D objects, add 3D backgrounds, and more. Advanced users can also insert JavaScript code to make the 3D scenes more interesting. For example, in the workshop we provided JavaScript that animated a 3D robot head (see figure 1), pre-loaded into the CodePen (https://codepen.io) interface for quicker editing and iteration.
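Because the code itself survives here only as an image (figure 1), the following is a minimal reconstruction of the kind of A-Frame page the article describes: an HTML sketch assembled from the surrounding description (a saved HTML file that renders a blue cube), not the authors' exact code. The A-Frame release number in the script URL is an assumption appropriate to the article's date.

<html>
  <head>
    <!-- Load the A-Frame library; the pinned release is an assumption. -->
    <script src="https://aframe.io/releases/0.9.2/aframe.min.js"></script>
  </head>
  <body>
    <a-scene>
      <!-- A blue cube positioned one meter up and three meters in front of the viewer. -->
      <a-box color="blue" position="0 1 -3"></a-box>
    </a-scene>
  </body>
</html>

Changing the color or position attributes, or swapping a-box for another primitive such as a-sphere, demonstrates the parameter tweaks described above.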
The Inter-Planetary File System (IPFS)

The collection of 3D scenes created in the VR Hackfest was published to the Internet using the Inter-Planetary File System (IPFS), an open-source distributed web technology originally created in Palo Alto by Protocol Labs in 2014 and now actively improved by a global network of software developers. IPFS allows anyone to publish to the Internet without a server, through a peer-to-peer network that can also work seamlessly with the regular Internet through HTTP "gateways." In November 2019, Brave Browser (https://brave.com) became the first to offer seamless IPFS integration, capable of spawning its own background process, or daemon, that can upload and download IPFS content on the fly without the need for an HTTP gateway or a separate browser extension. Unlike p2p technologies such as BitTorrent, IPFS is best suited for distributing small files available for long periods of time rather than the quick distribution of large files over a short period of time. This is an oversimplification of what is really happening behind the scenes (part of the magic involves content-addressable storage and asynchronous communication methods based on pub/sub messaging, to name a few), but the ability to share and publish 3D environments and 3D objects in a way that can instantly scale to meet demand could have far-reaching consequences for future technologies like augmented reality.

Figure 2. Workshop attendees.

IPFS can load content much faster and more securely (through features like automated cryptographic hash checking), and it allows people to publish directly to the Internet without the need of a third-party host. Google, Facebook, and Amazon Web Services need not apply. The same technology has already been used to overcome censorship efforts by governments, but like any technology it has its downsides. Content on IPFS is essentially permanent, allowing free speech to flourish, but it could also make undesirable content, like hate speech or child pornography, all but impossible to control.
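To illustrate the content-addressed model and the HTTP gateways mentioned above, here is a brief sketch in Python. The CID below is a placeholder rather than a real published scene; the https://ipfs.io/ipfs/<CID> gateway path, however, is the standard way to reach IPFS content from an ordinary browser or script.

import urllib.request

# Placeholder content identifier (CID); a real CID is produced when
# content is added to IPFS, and the same hash always names the same bytes.
cid = "QmYourSceneCidGoesHere"

# Any public HTTP gateway can serve IPFS content to regular web clients.
gateway_url = f"https://ipfs.io/ipfs/{cid}"

# This request would fail for the placeholder CID above, but succeeds
# for any CID that nodes on the network are actually hosting.
with urllib.request.urlopen(gateway_url) as response:
    scene_html = response.read()

Because the address is derived from the content itself, any copy served by any node can be verified against the CID, which is the "automated cryptographic hash checking" noted above.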
Toward 21st Century Literacy

Like our other technology programs, the VR Hackfest was designed to engage customers around new forms of literacy, particularly around understanding code and thinking critically about emerging communication technologies. In 2019, we are already seeing how technologies like machine learning and social media are impacting social relations, politics, and the economy. It is no longer enough to know how to read and write the code that underlies the web. True literacy must also encompass how these technologies interface with each other and how they impact people and society.

Figure 3. The free-standing exhibit.

To this end, the VR Hackfest sought to take participants on a journey, both technological and sociological. Once the initial practice with the code was completed, we moved on to a discussion of the consequences of using these technologies. With the distributed web, for example, we explored questions like:

• What are the implications of permanent content on the web which no one can take down?
• What power do gatekeepers like the government and private companies have over our online speech?
• What does a 3D web look like, and how will that change how we communicate, tell stories, and learn?

After the workshop ended, we continued the conversation with the public through an exhibit placed at our Mitchell Park branch (see figure 3). In this exhibit, we showcased the VR scenes participants had created and introduced the technologies underlying them. But we also asked people to reflect on the future of the Internet and to share their thoughts by posting on the exhibit itself. Public comments reflected the current discourse around the Internet. Responses (see figure 5) were generally positive—most of our customers mentioned better download speeds or other efficiency increases, but a few also highlighted online privacy and safety improvements. We recorded an equal number of pessimistic and technical responses to the same question; these often demonstrated either knowledge of similar technology (e.g., "how is this different than Napster?") or displeasure with the current state of the world wide web (e.g., "less human connections" or "more spyware and less freedom").

OUTCOMES

One surprise outcome was that our project reached the attention of the developers of IPFS, who happen to live a few blocks away from the library. After reading about the exhibit online, their whole team visited our team at the library. In fact, one of their team turned out to be a former child customer of our library! The workshop itself, which was featured as a summer reading program activity, also brought in record numbers. Originally open to 20 participants and later expanded to 30, the workshop grew a waitlist that more than quadrupled our initial room capacity.

Clearly, people were interested in learning about these two emerging technologies. We also want to take a moment to highlight the number of design iterations this project went through before making its way into the public eye. The free-standing VR Hackfest exhibit was originally conceived as a wall-mounted computer kiosk that encouraged users to publish a short message directly to the web with IPFS, but this raised too many privacy concerns, and ultimately our building design does not make mounting a computer on the wall an easy task. Our workshop also initially focused much more on command-line skills working directly with IPFS, but user testing with library staff showed learning A-Frame was more than enough.

Figure 4. Building the exhibit.

Figure 5. Exhibit responses (a count of post-it notes categorized as optimistic, pessimistic, technical, spam, or illegible).

Figure 6. Visit from Protocol Labs Co-Founders.

The VR Hackfest was also a win because it combined so many different skills into a single project. We were not only working with open-source tools and highlighting new technologies, but also building an experience for workshop attendees and showcasing their work to thousands of people.

Future Work

Our immediate plans include re-use of the exhibit frame for future public technology showcases and offering another round of VR Hackfest workshops, perhaps in a smaller group so participants have the chance to view their work while wearing a VR headset.

Figure 7. 3D Mock-up.

Beyond this, we also think libraries have the opportunity to harness the distributed web for digital collections, potentially undercutting the cost of alternative content delivery networks or file hosting services. Through this project we have already tested things like embedding IPFS links in MARC records and building a 3D object library. Essentially, all the pieces of the "future web" are already here, and it is just a matter of time before all modern web browsers offer native support for these new technologies. In general, our project demonstrated the popularity of 21st-century literacy programs. But it also demonstrated the significant technical difficulties of conducting cutting-edge technology workshops in public libraries. Clearly, the demand is there, and our library will continue to strive to re-imagine library services.

11883 ---- Integrated Technologies of Blockchain and Biometrics Based on Wireless Sensor Network for Library Management
Meng-Hsuan Fu
INFORMATION TECHNOLOGY AND LIBRARIES | SEPTEMBER 2020 https://doi.org/10.6017/ital.v39i3.11883

Meng-Hsuan Fu (msfu@mail.shu.edu.tw) is Assistant Professor, Shih Hsin University (Taiwan). © 2020.

ABSTRACT

The Internet of Things (IoT) is built on a strong internet infrastructure and many wireless sensor devices. Presently, Radio Frequency Identification-embedded (RFID-embedded) smart cards are ubiquitous, used for many things including student ID cards, transportation cards, bank cards, prepaid cards, and citizenship cards.
One example of places that require smart cards is libraries. Each library, such as a university library, city library, local library, or community library, has its own card, and the user must bring the appropriate card to enter a library and borrow material. However, it is inconvenient to bring various cards to access different libraries. Wireless infrastructure has been well developed, and IoT devices are connected through this infrastructure. Moreover, the development of biometric identification technologies has continued to advance. Blockchain methodologies have been successfully adopted in various fields. This paper proposes the BlockMetrics library, based on integrated technologies using blockchain and finger-vein biometrics, which are adopted into a library collection management and access control system. The library collection is managed by image recognition, RFID, and wireless sensor technologies. In addition, a biometric system is connected to a library collection control system, enabling the borrowing procedure to consist of only two steps: first, the user adopts a biometric recognition device for user authentication, and then performs a collection scan with the RFID devices. All the records are recorded in a personal borrowing blockchain, which is a peer-to-peer transfer system and permanent data storage. In addition, users can check the status of their collections across various libraries in their personal borrowing blockchain. The BlockMetrics library is based on an integration of technologies that include blockchain, biometrics, and wireless sensor technologies to improve the smart library.

INTRODUCTION

The Internet of Things (IoT) connects individual objects together through their unique addresses or tags, based on sensor devices and wireless network infrastructure. Presently, "smart living" (a term that includes concepts such as the smart home, smart city, smart university, smart government, and smart transportation) is based on the IoT, which plays a key role in achieving a convenient and secure living environment. Gartner, a data analytics company that presents the top ten strategic technology trends for the next year at the end of each year, listed blockchain as one of the top ten in 2017, 2018, 2019, and 2020.1 The fact that blockchain has been proposed as one of the top strategic technology trends for four consecutive years reflects sustained interest among technology experts and developers. In a blockchain, a block is the basic storage unit where data is saved and protected with cryptography and complex algorithms. The technology of peer-to-peer transfer is adopted when data or information is exchanged without the need for a third party. In other words, data is transferred directly from node to node, or user to user, thanks to the decentralized nature of the blockchain. In addition, a blockchain is authorized and maintained by all nodes in the same blockchain network. Each node has an equal right (also known as equal weight) to access the blockchain and authorize new transactions. Thus, all transactions are published and broadcast to all nodes, and content cannot be altered by a single node or a minority of nodes. Additionally, transaction content is secured by cryptography and complex secure algorithms. Therefore, transactions occur and are preserved in a fully secure and private network.
In practice, the blockchain has been applied to various fields including finance, medicine, academia, and logistics. The blockchain has also been adopted for personal transaction records for its privacy and security properties and because it offers immutable and permanent data storage. In this research, blockchain technologies are adopted to store the records of collections borrowed from various libraries in a personal borrowing blockchain.

Table 1. Definition of Key Terms of Blockchain

Blockchain: A blockchain comprises many blocks. It has the characteristics of security, decentralization, immutability, distributed ledgers, transparent logs, and irreversible data storage.
Block: A block is the basic unit in a blockchain. Each block consists of a block header (with nonce, previous block hash, timestamp, and Merkle root) and many transactions in a block body.
Nonce: The counter of the algorithm; the hash value changes once the nonce is modified.
Merkle root: A secure hash algorithm (SHA) is used in the Merkle root to transform data into a meaningless hash value.
Transaction: Each transaction is composed of an address, hash, index, and timestamp. All transactions are stored in blocks permanently.
Hash: A secure hash algorithm (SHA) transforms input data into meaningless output data, called a hash, consisting of letters and digits, in order to protect data content during transmission.
Biometrics: Using human physical characteristics, including finger vein, iris, voice, and facial features, for recognition.
Sensor Network: A sensor is a small, portable node with a data record function and power source. A sensor network is composed of many sensors based on a communication infrastructure.
IoT: The Internet of Things (IoT) is a system to connect sensors and devices together in an internet environment. Presently, many IoT applications have been adopted, such as smart home, health care, and smart transportation.

Although the IoT and wireless networks have been well developed, people still own many different RFID-embedded cards, such as public transportation cards, credit cards, student cards, medical cards, identification cards, membership cards, or library cards. An RFID-embedded card is issued for each place or purpose, requiring the user to bring the appropriate cards to access the corresponding functions. In this study, the library is used as the objective to which blockchain technologies are applied, because currently each library has its own library card for entering the library and borrowing material. This implies that users may have to carry several library cards to access a university library, community library, and district library on the same day. Here, biometrics can be adopted to solve the problem of having to carry many access control cards and of managing various borrowing policies. In this study, the BlockMetrics library is designed based on the technologies of blockchain and biometrics within the environment of a wireless sensor network with IoT devices. Borrowing records are transferred and stored through blockchain technologies, automatic library access control is managed by biometric identification, and the borrowing and returning of library materials are achieved under a wireless sensor network with IoT devices to create a convenient, efficient, and secure library environment.
The key terms of blockchain and the related concepts of biometrics, sensor networks, and IoT applied in this research are defined in table 1.

RELATED WORKS

Blockchain Technology

Nakamoto presented bitcoin as a peer-to-peer electronic cash system that uses blockchain technologies, which include smart contracts, cryptography, decentralization, and consensus through proof of work. Because this electronic system is based on cryptography, a trusted third party is not required in the payment mechanism. Additionally, peer-to-peer technology and a timestamp server are adopted, and each block is given a hash in serial order. This procedure solves the problem of double-spending during payment.2 In addition, proof of work is used in a decentralized system for authentication by most nodes in the blockchain network. Each node has equal rights to compete to receive a block, and each node can vote to authenticate a new block.3 Košt’ál et al. define proof of work (PoW) as an asymmetric method with complex calculations whose difficulty is adjusted by the problem-solving duration.4 However, PoW has drawbacks, such as high power consumption and the fact that some users can control the blockchain if their share of the nodes in the same blockchain network reaches 51 percent.5 Despite the possible presence of malicious parties, the information in a blockchain is difficult to modify because of the distributed ledger methodology, in which each node has the same copy of the ledger, making it difficult for a single node or minority of nodes to change or destroy the stored data.6

A block is the basic unit in a blockchain; in other words, a blockchain is composed of connected blocks. One of the core blockchain technologies is the distributed ledger, in which a ledger can be distributed to every node all over the world.7 Each block is composed of a block header and a block body, and the block size field is 4 bytes. The block header is a combination of the version, previous block hash, Merkle root, timestamp, difficulty target, and nonce. A block header is 80 bytes in total, and the transaction counter is between 1 and 9 bytes (see table 2).

Table 2. Block Element

Block size: 4 bytes
Block header (80 bytes): version (4 bytes), previous block hash (32 bytes), Merkle root (32 bytes), timestamp (4 bytes), difficulty target (4 bytes), nonce (4 bytes)
Block body: transaction counter (1–9 bytes), transactions

Blockchain technologies have been applied to various fields including finance, art, hygiene, healthcare, and academic certificates. For example, in healthcare, user medical records are stored in a blockchain so users can check their health conditions and share them with their family members in advance. In business, blockchain has been adopted into supply chain management for monitoring activities during goods production. In the academic field, certificates are permanently saved in a blockchain, where users can retrieve them from their mobile devices and show them upon request.8 Because blockchain is protected by cryptography and offers privacy, reliability, and decentralization, an increasing number of applications are beginning to adopt it. As an application for a library system, a blockchain, in combination with biometrics within a wireless infrastructure, can be adopted for personal borrowing records.
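As a concrete illustration of the block structure in table 2, here is a minimal Python sketch (Python being one of the languages the paper later names for its implementation). It is a simplified teaching example, not a production blockchain: the header is serialized as JSON rather than the fixed 80-byte binary layout, and the difficulty value is arbitrary.

import hashlib
import json
import time

def sha256_hex(data: bytes) -> str:
    # Secure hash algorithm (SHA-256), as described for the Merkle root and hashes.
    return hashlib.sha256(data).hexdigest()

def block_hash(header: dict) -> str:
    # Serialize the header deterministically, then hash it; changing the
    # nonce (or any other field) changes the resulting hash.
    return sha256_hex(json.dumps(header, sort_keys=True).encode())

# A toy header carrying the six fields listed in table 2.
header = {
    "version": 1,
    "previous_block_hash": "0" * 64,             # a genesis block has no predecessor
    "merkle_root": sha256_hex(b"transactions"),  # placeholder for a real Merkle tree
    "timestamp": int(time.time()),
    "difficulty_target": 0,                      # arbitrary for this sketch
    "nonce": 0,
}
print(block_hash(header))

Chaining comes from placing each block's hash in the next block's previous_block_hash field, which is what makes stored records difficult to alter retroactively.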
Library Borrowing Management

Each library has its own regulations. For example, the National Central Library (NCL) in Taiwan has created reader service directions in which the general principles and rules for library card application, reader access to library materials, reference services, requests for library materials, violations, and supplementary provisions are clearly stated. According to the NCL reader service directions, citizens are required to present their national ID card, and foreigners are asked to present their passport, to apply for a library card. Users are allowed to access the library when they have a valid library card. Those who have library cards but have forgotten to bring them can apply for a temporary library card to enter the library, but this is limited to three times.9 This rule is specific to the NCL; other libraries in the same country have their own regulations. Another example is the Taipei Public Library. Citizens can apply for a library card using their citizen's ID card, passport, educational certificate, or residence permit. A Taipei citizen can apply for a family library card using their family certificate. Users can borrow material and return it to any of the libraries in Taipei city. However, these policies are only applicable to users who hold library cards issued by libraries in Taipei city.10 As for university libraries, each library also has its own regulations. For instance, Shih Hsin University (SHU) issues its own library card for access to its library. Alumni are requested to present their ID cards and a photo to apply for a library card in person. The number of items and their loan periods are clearly stated in the rules set by the SHU library.11 Again, the regulations are individually set by each university library.

Biometrics

RFID-embedded smart cards such as student cards, transportation cards, and bank cards are widely used; however, they can be stolen, lost, or forgotten at home. Biometrics is becoming more widely used in access-control systems for homes, offices, buildings, government facilities, and libraries. For these systems, the fingerprint is one of the most commonly used biometrics. Users place their finger on a read device, usually a touch panel. This method ensures a unique identity, is easy to use and widely accepted, boasts a high scan speed, and is difficult to falsify. However, its effectiveness is influenced by the age of the user and the presence of moisture, wounds, dust, or particles on the finger, in addition to the concern for hygiene because of the use of touch devices. Face recognition has been used in various applications such as unlocking smart devices, performing security checks and community surveillance, and maintaining home security. This method of biometric identification is convenient, widely accepted, difficult to falsify, and can be applied without the awareness of the person. However, limitations of face recognition include errors that can occur due to lighting, facial expression, and cosmetics. Also, privacy is an issue in face recognition because it may take place at a distance without the user's consent. Another form of biometric identification uses the iris as an inner biometric indicator because the iris is unique to each person. Nevertheless, this method is also prone to errors that can be caused by bad lighting and the possible presence of diseases such as diabetes or glaucoma.
Devices used for iris recognition are expensive and thus rarely adopted in biometrics.12 Speech recognition is used for equipment control, such as smart device switches; however, it can be affected by noise, the physical condition of the user, or the weather. Vein recognition, using finger or palm veins, is becoming more prevalent as a form of biometric identification for banks and access control, but it can be limited by the possible presence of bruises or a lower temperature. Nonetheless, vein recognition ensures a unique identity and is easy to use, convenient, accurate, and widely accepted; thus, many businesses are adopting vein recognition for various usages. To summarize, biometric identification is convenient, reduces the error rate in recognition, and is difficult to falsify. Therefore, biometric identification is suitable for access control.

BLOCKMETRICS LIBRARY

The BlockMetrics library is based on the integration of blockchain and biometric technologies in a wireless sensor network with IoT devices. Figure 1 shows the BlockMetrics library architecture, a bottom-up structure consisting of five layers: hardware, software, internet, transfer and security, and application. All components are described in detail from the bottom to the top layers.

In the hardware layer, sensor nodes are physically located on library collection shelves, entrance gates, and relevant equipment to be connected with the upper layers. RFID tags are attached to each item in the library, including books, audio resources, and videos. Tag information is read and transferred by RFID readers. The biometric devices used in this study include fingerprint readers, palm- and finger-vein identifiers, and face or iris recognition devices for biometric authentication when users enter libraries or borrow collections. All images, including action images, collection images, and surveillance images, are recorded with cameras. Ground surveillance, library collection recognition, image processing, and user identification are handled by graphics processing units. Touch panels are used for typing or searching for information, and there is a particular process for user registration. For general input and output of information, I/O devices include speakers, microphones, keyboards, and monitors. The entrance gate is connected to biometric devices and recognition systems for automatic access control. Microprocessors and servers, which make up the core of the hardware, handle all the functions that run in the operating system. Data and programs are run and securely saved on a large memory drive. Data transmission occurs through the wireless data collector, and data collection and transfer in the library are based on a wireless environment. In the BlockMetrics library, a library collection database is used to store and maintain all the library material information in a local library for backup usage, and a blockchain is used to record personal borrowing and returning history.

Figure 1. BlockMetrics Library Architecture

In the software layer, open-source office software (such as Writer, Impress, or Calc, provided by LibreOffice) is used to record library collections and handle library affairs. Biometric recognition identifies a user's biological features collected from biometric devices such as a finger-vein recognition device, which is adopted in this research.
All images and videos, including ground and entrance surveillance and library material borrowing and returning, are recorded by cameras and processed with video-processing software. The data of the images, videos, personal information, and library collections is managed and saved through an image and file management system as well as a database management system. The software programs associated with creating, modifying, and maintaining library processes are written in open-source programming languages, in particular Python and R, which were ranked among the top ten programming languages by IEEE Spectrum in 2019.13 Various functions are saved as packages that are free to download and can be modified and reproduced into customized programs for specific purposes. The hardware houses the CPU, which runs the general library operations, and software programs are maintained by the operating system and management information system. Library stock management, collection search, personal registration, and other library-related functions can be designed and developed through App Inventor. Each borrow and return record is connected with the personal identification recognized by the finger-vein reader and recognition system and is saved as a transaction. The transactions that take place in a specific period are saved in a block, and blocks are connected to one another to form a personal borrowing blockchain.

The internet technology layer is built upon the hardware and software structure. The main purpose of internet technologies is to connect the equipment and devices with the internet, where the internet plays the role of an intermediary for all devices communicating and cooperating in the library. In the internet technology layer, Bluetooth connects devices such as earphones, speakers, and audio guides over short distances. Files are exchanged between smart devices, including smartphones, tablets, and iPads, through near-field communication, which is a contactless sensor technology. RFID is adopted for collection borrowing and returning services in the library. Because fiber-optic cables have been ubiquitously planted within infrastructure with the development of smart cities, most libraries have been built with them as well. Users and vehicles are more easily and accurately located by the Global Positioning System (GPS), which also assists image recognition when material is taken from the shelves. Sensors transfer sensing data to the relevant devices for recording or processing under the wireless sensor network infrastructure. Libraries are currently built with Wi-Fi environments, but Li-Fi, which creates a wireless environment using light, is one future trend. Mobile devices operate under wireless communications, and most countries provide 4G and 4G+, with some supporting even 5G. The internet technology layer is the tool provider for intercommunication among devices. Data security and transmission reliability are extremely important issues when various equipment is linked together and connected to the internet. The user interface is the bridge between the user and the devices; in other words, the user gives commands to the devices or software through a user interface or app.
Biometric recognition devices, RFID readers, and entrance control equipment are connected via the internet in this study. The devices send information to the corresponding devices for specific purposes in a specified order. Collected data, such as private user identification, is secured by the cryptography utilized in blockchain technology. The finger-vein identification used as personal identification is combined with the borrow and return records, stored as transactions, and secured under a secure hash algorithm before being saved into a blockchain. All data and personal identification are transferred under the corresponding secure methodologies.

In the BlockMetrics library, self-check-in and self-checkout rely mostly on RFID technology and finger-vein biometrics. The borrow and return records are stored in a personal borrowing blockchain, with copies of the records kept in the blockchains of the user and the library, the biometrics system, and the library servers. Entrance control is automatically managed by finger-vein recognition. Library stock management is based particularly on image assistance and RFID technology. New user registration requires only a few identification questions and finger-vein characteristic extraction. The BlockMetrics library has no circulation desk and provides an automated borrow and return mechanism through a single sign at the entrance and exit. The five layers in the BlockMetrics library architecture communicate with each other such that operations are inseparably related. The BlockMetrics library scenario is described in the next section.

SCENARIO

In this section, the scenarios for registration, entry, and material borrowing and returning in the BlockMetrics library are described in detail. In figures 2 and 3, the user side indicates the actual user actions and is represented with solid lines, and the background shows mostly background operations, indicated with dotted lines. In figure 2, when a new user comes into the BlockMetrics library, the registration procedure starts with a biometric pattern extraction and recognition of the user. Finger-vein authentication is selected as the personal biometric for entrance and material borrowing. On the user side, registration is completed with only two steps: the first is finger-vein extraction, and the second is simply providing personal information. The biometric recognition data is processed and stored in the appropriate database, which is linked to the personal identification management system. Personal information is secured through the cryptography used in blockchain technology; thus, all information is securely stored. The registration procedure is performed only once, at the first entry; afterward, all registered users can enter the BlockMetrics library using finger-vein authentication. Biometric recognition proceeds with a biometrics database that verifies user identity, and the verification results are then sent to the entrance control management for gate control. The entrance is automatically controlled because the results from the biometric recognition step are sent as the rules for entrance control. Users are permitted to enter the library when they pass the biometric recognition step. Users do not have to bring library cards to enter or borrow material, increasing convenience and decreasing identity infringement when library cards are lost.
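A sketch of the entry decision in figure 2 follows. Everything here is hypothetical scaffolding for illustration, since the paper does not specify a matching algorithm or threshold; it reflects only the common practice that biometric matchers return a similarity score compared against a cutoff.

# Hypothetical sketch of the figure 2 entry flow; every name and the
# threshold below are invented for illustration, not taken from the paper.
MATCH_THRESHOLD = 0.90  # assumed similarity cutoff

def similarity(sample, template) -> float:
    # Placeholder matcher: a real system extracts vein features and
    # compares them; here exact equality stands in for a match.
    return 1.0 if sample == template else 0.0

def verify_and_admit(finger_vein_sample, enrolled_templates, open_gate):
    """Compare a live finger-vein sample against enrolled templates and
    open the entrance gate only when the best match clears the threshold."""
    best_user, best_score = None, 0.0
    for user_id, template in enrolled_templates.items():
        score = similarity(finger_vein_sample, template)
        if score > best_score:
            best_user, best_score = user_id, score
    if best_score >= MATCH_THRESHOLD:
        open_gate()  # verification result forwarded to entrance control
        return best_user
    return None  # unrecognized visitor; the gate stays closed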
Figure 3 shows the scenario of library material borrowing and returning. On the user side, library material borrowing consists of four simple steps performed by the user: 1) retrieving items, 2) authenticating with the finger-vein recognition device, 3) placing items on the RFID reader, and 4) exiting the library. When the user removes a book from the shelf, an infrared detector is triggered, recognizing that a book was removed from the shelf. Then, image recognition identifies the specific book, and the book's status is marked as charged to the user in the stock database. If the user wants to leave the library without borrowing anything, the user just scans their finger with the finger-vein device to open the entrance gate. If the user wants to borrow library materials such as books or videos, the borrowing procedure is quickly completed after finger-vein scanning and placing all materials, including books and videos, under the RFID read area. In the background, the user's recognition results from the finger-vein scan are saved in the biometrics database, which is connected to the blockchain. When the library materials are placed together in the RFID read area, all the tags are read at once while the materials' statuses in the database are updated. User information and material borrowing information are linked and saved as transactions that are stored in the personal borrowing blockchain.

Figure 2. BlockMetrics Library Scenario—Registration and Entry

Figure 3. BlockMetrics Library Scenario—Borrowing and Returning

To return library material, the user only needs to put the library materials in the specific area with the RFID reader, and the return procedure is complete. The RFID tags of returned materials are read and recorded, and their status in the stock database is updated. Personal borrow and return records are saved as transactions and stored in the personal borrowing blockchain as well.
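The following sketch shows what one such borrowing transaction might look like, using the fields that table 1 lists for a transaction (address, hash, index, and timestamp). The exact record layout is an assumption for illustration, and the user address is shown as a hash of a finger-vein-derived identifier rather than any raw biometric data.

import hashlib
import json
import time

def make_borrow_transaction(user_address: str, rfid_tags: list, index: int) -> dict:
    """Build a borrowing transaction with the table 1 fields
    (address, hash, index, timestamp); the layout is illustrative only."""
    tx = {
        "address": user_address,  # hashed user identifier, not raw biometrics
        "items": rfid_tags,       # RFID tags read together at checkout
        "action": "borrow",
        "index": index,
        "timestamp": int(time.time()),
    }
    tx["hash"] = hashlib.sha256(json.dumps(tx, sort_keys=True).encode()).hexdigest()
    return tx

# Example: a verified user checks out two tagged items in one RFID scan.
user_address = hashlib.sha256(b"finger-vein-template-id").hexdigest()
tx = make_borrow_transaction(user_address, ["TAG-0001", "TAG-0002"], index=0)
print(tx["hash"])

A matching "return" transaction appended later gives the permanent, cross-library borrowing history the paper describes, without any record being edited in place.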
LIMITATIONS

Some biometric technologies, such as facial recognition or fingerprint recognition systems, have already been adopted by libraries. These tools have increased the efficiency of access and borrowing procedures. However, all the records, which include personal information (e.g., fingerprints, historical borrowing records, logs of library access), are still stored in individual libraries' databases. The blockchain model may not be suitable for all current library systems because the database design of each library is unknown. At present, library classification systems are set by each library individually. Therefore, integrating library information among national or international libraries will be a huge task. How to establish general regulations for all libraries to develop and manage library information will need additional research. After that, the information management system should be designed and built by collecting diverse comments from all library managers. The work should be completed by interdisciplinary experts in library management, information engineering, biometrics system design, and data management.

The costs may include management committees, collection coding design, system development, hardware layout, and related training plans. Also, there may be unpredictable privacy issues that will not be known until the system is in practical operation. Lastly, some users will need an adaptation period while the new technologies are implemented; its duration can depend on how smooth the interface design is, whether the system is easy and clear to use, and what benefits the technologies bring to users' lives. The limitations can be summarized as: 1) integrating library information such as stock data and serial numbers, 2) establishing general regulations, 3) creating a consistent library management system, 4) the cost of the system, 5) the potential for privacy breaches, and 6) library patron resistance or reluctance to use the technology.

CONCLUSION

In this research, the BlockMetrics library is designed under a wireless sensor network infrastructure combined with blockchain, biometric, IoT, and RFID technologies. The library access control system is based on finger-vein biometric recognition, in which users can register with their finger-vein information through biometric devices and input personal information via various I/O devices. Thus, automatic and secure library access control is achieved through biometric recognition. Additionally, image recognition, GPS, and RFID are adopted in the library collection management, providing a simplified way to borrow and return library material. Blockchain technologies are utilized to record the personal borrowing history of collections from various libraries in a personal borrowing blockchain, where records are permanently stored. Users can clearly see their borrowing status in their own blockchain and manage their borrowing information through an application. To summarize, users can enter the library with finger-vein recognition instead of a specific library card. Then, if they would like to check out library material, users can retrieve the items, pass them through the RFID reader, scan their finger vein, and go. The BlockMetrics library is designed for convenience and security, which are achieved by combining a wireless sensor network with the integration of blockchain and biometric technologies. This method eliminates the inconvenience of having to carry many library cards, increases the efficiency of collection borrowing procedures, and simplifies the management of collections borrowed from different libraries. Adoption of these biometric technologies is still in its early stages. Some libraries have begun using different tools, but few libraries have adopted all of them. Such adoption simplifies both access and borrowing procedures, while all the records remain stored in a particular library's database for private access only. The development of the BlockMetrics library will help to integrate biometric technologies and blockchain under the infrastructure of a wireless sensor network to maintain library access recognition, library collections, library users, and borrowing records across libraries, raising user convenience and satisfaction, library management efficiency, and library security. In the near future, the library transaction formula in a blockchain will be developed for collection borrowing storage. Library collection serial numbers will be considered in the information management system as well.
11905 ---- LITA President's Message: Joining Together

Emily Morton-Owens

INFORMATION TECHNOLOGY AND LIBRARIES | DECEMBER 2019

Emily Morton-Owens (egmowens.lita@gmail.com) is LITA President 2019-20 and the Assistant University Librarian for Digital Library Development & Systems at the University of Pennsylvania Libraries.

In writing this column I am looking ahead, as I have been throughout my term as Vice-President and President of LITA, to the possibility of our merger with ALCTS and LLAMA. Recently our discussions have included an exploration on all sides of how a division can support members through their career. This has inspired me to reflect on how LITA has always taken a broad and inclusive view of what library technology work is and can be in the future. I believe the proposed Core division can support and extend that tradition.

One question that I've heard posed from time to time is "Am I technical enough for LITA?" Long-time LITA members like to answer that with a full-throated "yes!" If you're interested enough to ask the question, we want you to join us in using technology as a part of your work. We want you to be supported in doing so at your current skill level, whether or not you want to make technology more a part of your work than it is today. If you want to go deeper into technology, we'll be there with you. While the culture of the for-profit technology industry can promote imposter syndrome, we want LITA to be a haven.

In LITA's events and meetings, we consistently see different facets of library technology work reflected. Some of us are training users in new technologies or creating programs that get young people excited about coding. Others are working to make online resources accessible and easy for our users to benefit from. We have members who are manipulating metadata, creating services to help researchers comply with data management requirements, creating websites that guide users to the information they need, and preserving cultural heritage in digital forms. Some of us manage tech projects or workers. Some of our members work on large tech teams with generous resources and others are spinning magic just from their own skills.
When I started working in libraries, my bosses and mentors were often librarians who had started in technical services or other roles, before "automation." Eager to improve their own workflows, and getting pulled into ILS migrations and catalog development, they had become the technology experts. These accidental systems librarians have always been some of my favorite colleagues because of their sure-footed approach to our data. Recently I've come to work with colleagues who are accidental systems librarians in the opposite sense: tech workers who took jobs in libraries and embraced what we do. One developer on my team, who had no previous library experience, took to our projects and ethical stance like a duck to water. He told me that he now goes to parties and tells people about how librarians are defenders of privacy and protectors of information. LITA embraces growth in any direction because we want to support learning and problem-solving with a foundation of shared principles and resources.

I don't see these developments as time-based or inevitable in any given person's career. There are plenty of library tech workers who prefer being an individual contributor and think they have their biggest impact doing direct work on applications. And many of my technical services colleagues prefer to define their work goals in those terms, no matter how adept they become with tech tools. Whether or not they seek out a management position, our members will probably find themselves exhibiting leadership in some context, like developing standards or advocating for standards. Instead of a rigid path of career development, many librarians today have fluid and multi-faceted careers. For myself, I have held similar positions at quite different types of libraries—public, medical, academic. LITA has always been a part of my experience, though, providing a sort of collegial bedrock through a lot of change. The people are what make LITA, LITA: friendly, principled, and quirky. LITA members are the kind of people who will learn all they can about a technology like the Amazon Alexa—and then unplug the one on the exhibit floor at Annual.

As I was thinking about all this and writing this column, leadership, collections, and technical services kept coming up. There is such strong and fruitful cross-pollination among these specialties, and I see that as something that would enhance the member experience—both for current LITA members who want more contact with expert colleagues and for current LLAMA and ALCTS members who want learning opportunities and support for their work with technology. LITA members love to share their knowledge and hash through challenges together. Sometimes I wish more ALA members would feel comfortable giving us a try, and perhaps Core will be a new, friendly face for that ongoing outreach. If, in the future, someone asked the new question "Am I technical enough for Core?" I'm sure the answer will be the same: "Yes, please join us!"

11923 ---- Letter from the Editor

Kenneth J. Varnum

INFORMATION TECHNOLOGY AND LIBRARIES | DECEMBER 2019
https://doi.org/10.6017/ital.v38i4.11923

Earlier this fall, I had the privilege of participating in the Sharjah Library Conference, a three-day event hosted by the Sharjah Book Authority in the United Arab Emirates with programming coordinated by the ALA International Relations Office.
The experience of meeting with so many librarians from cultures different from my own was truly rewarding and enriching. It was both refreshing and invigorating to see, first-hand, the global importance of the local matters that occupy so much of my professional life. I returned to my regular job with a newfound appreciation for how much the issues I spend so much of my professional time on—information access, equity, user experience, and the like—are universal. It is easy to get lost in the weeds of my own circumstances and environment, and sometimes difficult to look up and explore what colleagues, known and unknown, are doing and thinking.

The experience reinforces the importance of open access publications such as Information Technology and Libraries. While "open access" doesn't remove every possible barrier to accessing the knowledge, experience, and lessons contained within its virtual cover, it does remove the all-important paywall. And that is no small thing in a community of library technologists who interact and exchange information through social media, email, and other tools. Our open access status gives this journal a vibrant platform for sharing knowledge, experience, and expertise with all who seek it.

I hope you find this issue's contents useful and informative, and will share the items you find most important with your peers at your institutions and beyond. I invite you to add your own knowledge and experience to our collective wisdom through a contribution to the journal. For more details, see the About the Journal page or get in touch with me.

Sincerely,

Kenneth J. Varnum, Editor
varnum@umich.edu
December 2019

11937 ---- ARTICLES

Virtual Reality as a Tool for Student Orientation in Distance Education Programs: A Study of New Library and Information Science Students

Sandra Valenti, Brady Lund, and Ting Wang

INFORMATION TECHNOLOGY AND LIBRARIES | JUNE 2020
https://doi.org/10.6017/ital.v39i2.11937

Dr. Sandra Valenti (svalenti@emporia.edu) is Assistant Professor, School of Library and Information Management, Emporia State University. Brady Lund (blund2@g.emporia.edu) is a doctoral student of Library and Information Management at Emporia State University. Ting Wang (twang2@emporia.edu) is a doctoral student of Library and Information Management, Emporia State University.

ABSTRACT

Virtual reality (VR) has emerged as a popular technology for gaming and learning, with its uses for teaching presently being investigated in a variety of educational settings. However, one area where the effect of this technology on students has not been examined in detail is as a tool for new student orientation in colleges and universities. This study investigates this effect using an experimental methodology and the population of new master of library science (MLS) students entering a library and information science (LIS) program. The results indicate that students who received a VR orientation expressed more optimistic views about the technology, saw greater improvement in scores on an assessment of knowledge about their program and chosen profession, and saw a small decrease in program anxiety compared to those who received the same information as standard text and links.
The majority of students also indicated a willingness to use VR technology for learning for long periods of time (25 minutes or more). The researchers concluded that VR may be a useful tool for increasing student engagement, as described by game engagement theory.

LITERATURE REVIEW

Computer-assisted instruction (CAI) has, for many years, been considered an effective method of instructional delivery that improves student engagement and outcomes.1 New technologies, such as the learning management system (LMS), online video, laptops and tablets, word processors, spreadsheets, and presentation platforms, have all significantly altered how knowledge is transferred and measured in students. When adopted by instructors, these technologies can improve the quality of student learning and work and the evaluation of that work. Empirical research has shown that learning technologies do indeed contribute to better learning than a lecture alone.2 Positive reaction to the adoption of new learning technologies among student populations has been shown across all grade levels, from pre-K through postgraduate education.3

Research in the fields of instructional design technology (IDT) and information science (IS) has shown that the novelty of new learning technology provides short-term improvement in outcomes.4 This supports the broader hypothesis that engagement increases retention of knowledge. These findings would suggest that, at least in the short term, instructors could anticipate improvement in knowledge retention through the use of a new technology like virtual reality. When used in sustained instructional efforts, many learning technologies show some promise for improving the attainment of learning outcomes.5 This is why interest in learning technology has grown so significantly in the past two decades and the job outlook for instructional designers is increasing faster than the national average.6

A large proportion of instructional technologies are not truly "adopted" by instructors, but rather used only in one-off sessions and then discarded.7 There seem to be some common factors among those technologies that are adopted and used regularly by instructors:

1. Practicality, or the amount of work the new technology requires versus the perceived value of said technology;
2. Affordability, or the cost of a new technology versus the perceived value of said technology; and
3. Stability, or the likelihood of the product to be continuously supported and updated by its manufacturer (e.g., a product like Microsoft Office has a higher likelihood of ongoing maintenance).8

As noted by Lund and Scribner, only recently, with the introduction of free VR development programs and inexpensive viewers/headsets like Google Cardboard, has VR fit these criteria.9 It is finally practical to use VR as a learning tool for classrooms with large numbers of students. "Virtual reality is the computer-created counterpart to actual reality.
Through a video headset, computer programs present a visual world that can, pixel-perfectly, replicate the real world—or show a completely unreal one."10 Virtual reality is distinct from augmented reality, which augments a real-world, real-time image (e.g., viewed through a camera on a mobile device) with computer-generated information, such as images, text, videos, animation, and sound.11 The focus of the present study is virtual reality only, not related augmented (or mixed) reality technology.

An important contribution to the study of virtual reality in library and information science (LIS) is Varnum's Beyond Reality.12 This short introductory book covers both theoretical and practical considerations for the use of virtual, augmented, and mixed reality in a variety of library contexts. While the book describes how VR can be utilized in a variety of library education (for non-LIS majors) contexts, it does not include an example of how virtual reality may be used for library school education. It also does not investigate in significant detail the use of virtual reality for a virtual orientation to an academic program. These are the gaps that the following study attempts to address.

The present study may be viewed through the framework of game engagement theory, as described by Whitton.13 Game engagement theory suggests that five major learning engagement factors exist and that using gaming activities may improve how well learning activities address these factors. These factors include:

• challenge, the motivation to undertake the activity;
• control, the level of choice;
• immersion, the extent to which an individual is absorbed into the activity;
• interest, an individual's interest in the subject matter; and
• purpose, the perceived value of the outcome of the activity.

It has been suggested by several researchers, including Dede, that immersive experiences like VR touch on similar factors of engagement.14

EMPORIA STATE UNIVERSITY'S SCHOOL OF LIBRARY AND INFORMATION MANAGEMENT

The setting for this study is Emporia (KS) State University's School of Library and Information Management (ESU SLIM). ESU SLIM is the oldest library school west of the Mississippi River, founded in 1902. Compared to other LIS education programs, ESU SLIM is unique in that it offers a hybrid course delivery format. The six core courses in the MLIS degree program are online with two in-person class weekends for each class. Each class weekend is eleven hours: from 6 to 9 p.m. Friday and 9 a.m. to 5 p.m. Saturday at one of nine distance education locations scattered throughout the western half of the United States. Due to this course delivery format, the student population of ESU SLIM may skew slightly older and have more individuals who are employed full-time relative to residential master's programs.

ESU SLIM uses a cohort system, with a new group of students beginning annually at each of the eight distance locations as well as the main Emporia, Kansas campus. Before each new cohort begins its first course, a one-day, in-person student orientation is offered on the campus at which the cohort will attend classes. The purpose of this experimental study is to examine how well VR technology can support or satisfy the role of the in-person student orientation by emulating the experience and information students receive during this informational session.
METHODS

This study was designed with a pre-test/post-test experimental design. Depending on the state in which the students reside, they were assigned either to the experimental or the control group. The experimental group received a cardboard VR headset (similar to Google Cardboard) and a set of instructions on how to use it. They were instructed to utilize this headset to view an interactive experience that introduced elements of library service and library education as a form of new student orientation. Students in the control group received a set of links that contained the same information as the VR experience, but in a more static (non-immersive, non-interactive) setting.

Participants for this study were library school students from four states: South Dakota, Idaho, Nevada, and Oregon. These students were all enrolled in a mixed-delivery program in LIS. For each core course in the program, students attend two intensive, in-person, weekend class sessions. The rest of the course content is delivered via a learning management system. For this study, the researchers were particularly interested in understanding the role of VR orientation for distance education students, as these students do not have access to the physical university campus and thus miss out on information that in-person interaction with faculty and the library environment might provide. This also seemed like a worthwhile population to study given that a large portion of LIS programs have adopted the distance education (online or mixed-delivery) format.

In March 2019, a sample of this population was asked to complete a short survey to indicate their interest in virtual reality for new student orientation and the extent to which acquiring information via this medium might relieve their anxiety and increase their success in the program. Sixty-one percent of students indicated at least some elevated level of anxiety about their first MLS course, while 55 percent agreed that knowing more about the program's faculty and course structure and purpose would decrease that anxiety. Students were also asked to indicate the most pressing information needs they have about the program. These needs are displayed in table 1 below. This information was used to guide the design of the VR content for this study.

Table 1. Information needs expressed by new MLS students

Information Need | Number of Respondents (out of 55)
Information about ESU's curriculum | 50
What courses professors normally teach | 42
Information about information access | 41
Information about librarianship in general | 39
Professors' research interests | 35
Information about ESU's faculty | 27
To see who they are via a video introduction | 25
Information about ESU's library | 24
Why they teach for ESU's MLS program | 23
A little personal information about faculty | 20
Information about my regional director | 14
To which associations do faculty belong | 13
Information about ESU's physical spaces | 5
Information about ESU's archives | 4

These students were also asked to indicate the extent to which they would like to use VR to virtually "meet" faculty, learn more about the program's format, see program spaces, and learn about library services, using a five-point Likert scale. The findings for this question are displayed in figure 1.
Figure 1. New MLS Students' Reception to Using VR as an Orientation Tool. [Bar chart: frequency of responses, from strongly agree to strongly disagree, for four statements: "I'd Like to Use VR to 'Meet' Faculty," "I'd Like to Use VR to Learn More About the Program Format," "I'd Like to Use VR to See the Classrooms," and "I'd Like to Learn More About Library Services Using VR."]

Based on the largely positive response towards using VR for new student orientation, the researchers progressed to the experimental phase of the study. A VR experience was developed using Veer VR (veer.tv), a completely free and intuitive VR-creation platform. Within this platform, creators are able to upload images that were captured using a 360-degree VR camera (we used a Samsung Gear 360 camera) and drag-and-drop interactive elements, including text boxes, videos, audio, and transitions to new images. Thus, it was possible to create a VR experience within the setting of an academic library where users could navigate throughout the building, virtually meet faculty, and learn about fundamental concepts in librarianship. For this phase of the study a set of research questions was defined, a hypothesis created, and independent and dependent variables identified:

Research Questions

1. Will VR improve students' knowledge of topics related to their library school and basic library topics, relative to those without a VR experience?
2. Will VR reduce students' anxiety about their library program, relative to those without a VR experience?
3. Will students' perceptions towards the usefulness of VR be significantly different based on whether or not they utilized the VR experience?

Hypothesis

Use of VR will improve students' knowledge of topics related to library schools and librarianship, reduce their anxiety, and result in a more positive perspective towards VR technology.

Variables

Independent variable: Whether a student viewed the VR experience for a virtual orientation or viewed the web links for an online orientation.

Dependent variables: Change in students' scores on a post-test assessment of orientation knowledge, compared to their pre-test scores. Change in students' anxiety levels and perceptions of VR.

EXPERIMENTAL PHASE

The experimental phase of the study was conducted in August 2019. Twenty-nine students agreed to participate in this study. The age and gender characteristics of this population are as follows: fourteen under age 35, eleven age 35–44, four age 45+; nine male, seventeen female, and three fluid or transgender. Thirty-three percent of the students who agreed to participate were in the control group, while 67 percent were in the experimental group.

All participants in the study received a free VR headset, which was theirs to keep. Funding for these VR headsets was provided by a generous grant from a benefactor at the researchers' university. Participants in the control group were encouraged to use the VR headset after they had completed their participation in the study. Both groups received instructions with their viewer that directed them to complete a pre-test survey, embedded within a module of their learning management system account.
Following the pre-test, the experimental group was instructed to use the VR experience created by the researchers to learn about their library school, its faculty, and library concepts. The control group was instructed to use links provided in the module to experience the same content, but without the VR experience. Following the experience, both groups were instructed to complete a post-test survey in the module, as well as a follow-up survey that asked questions about how long they interacted with the content, how the experience affected their program anxiety, and additional comments. Once the data was collected for all participants, the researchers conducted a series of analyses on the data, including an analysis of covariance (ANCOVA) for post-test scores among the control and experimental groups, and an ANCOVA for program anxiety following the experimental treatment.15

RESULTS

Figure 2 displays the amount of time participants in the experimental group spent using the VR experience. Nearly 60 percent of participants spent more than 25 minutes using the virtual reality experience. This finding may seem remarkable, given that the average attention span of students is generally no more than a handful of minutes, but it aligns with that of Geri, Winer, and Zaks, who found that engagement with interactive video lengthens the attention span of users, and it supports the premise of engagement theory as discussed in the literature review.16 Only 10 percent of individuals assigned to the experimental group decided not to use the headset. Additionally, about one-third of participants in both the experimental and control groups indicated that they used the VR headset to view other content after they completed the study.

Figure 2. Amount of Time Experimental Group Participants Spent in VR Experience

Table 2 shows responses to the Likert questions about the participants' post-test perspectives of VR. Participants in the VR group generally had more favorable perspectives on their experience than participants in the control group. Participants in the control group, however, were a bit more optimistic on the idea that VR has promising uses for education and librarianship (though both groups expressed optimistic perspectives on these questions). There was some indication that participants would be willing to use VR for student orientation again, as both groups responded favorably to the idea that VR orientation information is appropriate and negatively to the idea that it would be better to get information from other sources.

Tables 3 and 4 display the ANCOVA for pre-test/post-test score change among groups and the change in anxiety among the groups, respectively. Post-test scores for the experimental group (17.23 correct out of 20 questions, or 86 percent) and control group (17.38/20, or 87 percent) were virtually identical; however, the pre-test scores differed (the experimental group scored 72 percent, worse than the control group's 78 percent), so the change in scores was actually greater for the experimental group. As shown in table 3, though, this difference in score change was not found to be statistically significant, F(1, 20) = .641, p = .4, r = .01. That is, no significant difference was found as to whether VR improves scores compared to links. It can be concluded, however, that
It can be concluded, however, that INFORMATION TECHNOLOGY AND LIBRARIES JUNE 2020 VIRTUAL REALITY AS A TOOL FOR STUDENT ORIENTATION | VALENTI, LUND, AND WANG 8 the links and VR together did improve scores from the pre-test to the post-test, with ANCOVA values of F (1, 20) = 7.6, p < .01, r = .47. Table 2. Post-test Perspectives of VR for Experimental and Control Groups Question Control (text- links)* Experimental (VR)* The instructions were easy to understand and follow 3 3.38 The viewer/text-links were fun to use 3 3.63 The VR/text-links content was engaging 3 3.13 I would recommend continuing VR/text- links use 2.67 3 I felt better informed about the topics presented 2.5 3.11 The information given was helpful 2.5 3.38 I feel more connected to the school than before 2.5 2.88 Virtual reality is just a fad 2 2.88 There are exciting uses for VR in education 4 3.5 There are exciting uses for VR in librarianship 4 3.5 Using VR is too time consuming 2 3 I’d rather get information in formats other than VR 2.5 2.89 VR orientation information is appropriate 4 3.38 *Five-point Likert Scale (level of agreement—1, strongly disagree; 5, strongly agree) Table 3. ANCOVA for Pre-test/Post-test Change in Scores Degrees of Freedom F- value p- value Pretest 1 .135 .7 Group 1 .641 .4 Error 18 Total 19 Corrected Total 20 Though the VR group generally reported less anxiety on a five-point Likert scale following the experiment than the control group (both groups showed some reduction), this difference was not statistically significant at p<.05 (though it was significant at p<.1). It is worth noting that few students indicated prior experience with VR before this study, so it may have simply been the unfamiliar technology that resulted in anxiety not dropping as far as anticipated, not the nature of the content. At the same time, it is worth noting, as Bawden and Robinson did, that information overload, which could certainly be the product of immersive VR orientations, is connected to INFORMATION TECHNOLOGY AND LIBRARIES JUNE 2020 VIRTUAL REALITY AS A TOOL FOR STUDENT ORIENTATION | VALENTI, LUND, AND WANG 9 information anxiety.17 Thus, it may be better, in the design of VR orientations, to keep the amount of new information at a minimum, only introducing broad concepts and allowing more freedom and flexibility for the user. Table 4. ANCOVA for Anxiety Following the Orientation Experience Sum of Squares df Mean Square F Sig. Between Groups 3.219 1 3.219 3.44 9 .07 9 Within Groups 17.733 19 .933 Total 20.952 20 DISCUSSION Participants in this study expressed willingness to use VR for extended periods of time (over 25 minutes) and demonstrated strong levels of engagement. Based on this finding, it seems possible that a well-designed VR orientation could be a suitable substitute for the in-person orientation for distance students. This is a significant finding, given that the majority of existing research on orientation for distance education students focuses on the design of online course modules or video streaming for orientation, which are not nearly as immersive and dynamic as physical presence in the environment.18 VR much more closely emulates physical presence than non- interactive/immersive videos and text. Those among the participants who were in the experimental (VR) group expressed more favorable perspectives towards the technology. This suggests that experience with the technology increases comfort and interest in the technology. 
DISCUSSION

Participants in this study expressed willingness to use VR for extended periods of time (over 25 minutes) and demonstrated strong levels of engagement. Based on this finding, it seems possible that a well-designed VR orientation could be a suitable substitute for the in-person orientation for distance students. This is a significant finding, given that the majority of existing research on orientation for distance education students focuses on the design of online course modules or video streaming for orientation, which are not nearly as immersive and dynamic as physical presence in the environment.18 VR much more closely emulates physical presence than non-interactive, non-immersive videos and text.

Those among the participants who were in the experimental (VR) group expressed more favorable perspectives towards the technology. This suggests that experience with the technology increases comfort and interest in the technology. It aligns with the findings of Theng et al., among others, who found that users of VR were more likely to accept the technology after using it.19 Additionally, participants stated interest in using VR for other purposes, and one-third of participants had already utilized the technology to explore other apps suggested by the researchers.

The findings of this study align with game engagement theory in several of its key aspects. VR is shown to have garnered the interest of the students who participated in the study, as indicated in table 2, aligning with the aspect of interest. They could see the purpose of the experience and were able to take control of the experience to ensure that they interacted with the information necessary to satisfy this purpose. This is opposed to the control group, which had to follow links and read text in a sequential order with little control or creativity involved. Accordingly, greater improvement in scores was observed for the experimental group. Even though the improvement was not statistically significant, this could likely be explained by the relatively small sample size. With a larger number of participants, the statistical strength of the differences between the two study groups may have been more pronounced. This is one limitation of the present study.

In addition to a small participant group, several other limitations exist with this study. Participants came from only a small sample of states, all in the western half of the United States. A less homogeneous sample may have produced more robust results. Some VR headsets arrived late due to delays in distributing them, giving the students less opportunity to review the content than they otherwise may have had. Finally, the researchers were not able to easily troubleshoot problems with accessing the VR experience for distance students. While they did their best to help all participants figure out how to use the technology, several students opted to discontinue participation when the technology gave them trouble. This also led to a smaller study sample population than initially anticipated.

CONCLUSION

The findings of this study may have several important implications for library professionals who are considering using VR technology for library orientations or instruction. This study found VR to have a positive effect on students' interest and to slightly increase scores and reduce anxiety among them. While there is no indication from this study whether VR would produce positive effects over a sustained period of time (e.g., every class session over the course of a semester), in limited usage it appears to at least draw students' attention more than traditional online teaching options like static text and links. The same VR experience developed to introduce students to basic concepts of librarianship and the library could be used for undergraduate and graduate students in all majors during library orientation sessions. This may make the library a more memorable component of students' early university experiences, as opposed to lecture information that students are likely to easily forget. Library professionals may consider these factors when deciding whether to opt for the more traditional methods of instruction and orientation or to experiment with a more innovative method of teaching like virtual reality.

ENDNOTES
1 Jennifer J. Vogel et al., "Using Virtual Reality with and without Gaming Attributes for Academic Achievement," Journal of Research on Technology in Education 39, no. 1 (2006): 105–18, https://doi.org/10.1080/15391523.2006.10782475.

2 Yigal Rosen, "The Effects of an Animation-based On-line Learning Environment on Transfer of Knowledge and on Motivation for Science and Technology Learning," Journal of Educational Computing Research 40, no. 4 (2009): 451–67, https://doi.org/10.2190/EC.40.4.d; Elisha Chambers, Efficacy of Educational Technology in Elementary and Secondary Classrooms: A Meta-analysis of the Research Literature from 1992–2002 (Carbondale, IL: Southern Illinois University at Carbondale, 2002).

3 Elisha Chambers, "Efficacy of Educational Technology in Elementary and Secondary Classrooms: A Meta-analysis of the Research Literature from 1992–2002," PhD diss., Southern Illinois University at Carbondale, 2002.

4 Jason M. Harley et al., "Comparing Virtual and Location-based Augmented Reality Mobile Learning: Emotions and Learning Outcomes," Educational Technology Research and Development 64, no. 3 (2016): 359–88, https://doi.org/10.1007/s11423-015-9420-7; Jocelyn Parong and Richard E. Mayer, "Learning Science in Immersive Virtual Reality," Journal of Educational Psychology 110, no. 6 (2018): 785–95, https://doi.org/10.1037/edu0000241; Paul Legris, John Ingham, and Pierre Collerette, "Why Do People Use Information Technology? A Critical Review of the Technology Acceptance Model," Information and Management 40, no. 3 (2003): 191–204, https://doi.org/10.1016/S0378-7206(01)00143-4.

5 Zaid Khot et al., "The Relative Effectiveness of Computer-based and Traditional Resources for Education in Anatomy," Anatomical Sciences Education 6, no. 4 (2013): 211–15, https://doi.org/10.1002/ase.1355; Michael J. Robertson and James G. Jones, "Exploring Academic Library Users' Preferences of Delivery Methods for Library Instruction," Reference & User Services Quarterly 48, no. 3 (2011): 259–69.

6 Joshua Kim, "Instructional Designers by the Numbers," Inside Higher Ed (2015), https://www.insidehighered.com/blogs/technology-and-learning/instructional-designers-numbers.

7 Elena Olmos-Raya et al., "Mobile Virtual Reality as an Educational Platform: A Pilot Study on the Impact of Immersion and Positive Emotion Induction in the Learning Process," Eurasia Journal of Mathematics Science and Technology Education 14, no. 6 (2018): 2045–57, https://doi.org/10.29333/ejmste/85874.

8 Brady D. Lund and Shari Scribner, "Developing Virtual Reality Experiences for Archival Collections: Case Study of the May Massee Collection at Emporia State University," The American Archivist, https://doi.org/10.17723/aarc-82-02-07.

9 Lund and Scribner, "Developing Virtual Reality Experiences for Archival Collections."

10 Kenneth J. Varnum, "Preface," in Kenneth J. Varnum, ed., Beyond Reality: Augmented, Virtual, and Mixed Reality in the Library (Chicago: ALA Editions, 2019): x.

11 Brady D. Lund and Daniel A. Agbaji, "Augmented Reality for Browsing Physical Collections in Academic Libraries," Public Services Quarterly 14, no. 3 (2018): 275–82, https://doi.org/10.1080/15228959.2018.1487812.

12 Kenneth J.
Varnum, ed., Beyond Reality: Augmented, Virtual, and Mixed Reality in the Library (Chicago: ALA Editions, 2019).

13 Nicola Whitton, "Game Engagement Theory and Adult Learning," Simulation and Gaming 42, no. 5 (2011): 596–609, https://doi.org/10.1177/1046878110378587.

14 Chris Dede, "Immersive Interfaces for Engagement and Learning," Science 323, no. 5910 (2010): 66–69, https://doi.org/10.1126/science.1167311.

15 Pat Dugard and John Todman, "Analysis of Pre-test-Post-test Control Group Designs in Educational Research," Educational Psychology 15, no. 2 (1995): 181–98, https://doi.org/10.1080/0144341950150207.

16 Nitza Geri, Amir Winer, and Beni Zaks, "Challenging the Six-minute Myth of Online Video Lectures: Can Interactivity Expand the Attention Span of Learners?," Online Journal of Applied Knowledge Management 5, no. 1 (2017): 101–11.

17 David Bawden and Lyn Robinson, "The Dark Side of Information: Overload, Anxiety and Other Paradoxes and Pathologies," Journal of Information Science 35, no. 2 (2009): 180–91, https://doi.org/10.1177/0165551508095781.

18 Moon-Heum Cho, "Online Student Orientation in Higher Education: A Developmental Study," Educational Technology Research and Development 60, no. 6 (2012): 1051–69, https://doi.org/10.1007/s11423-012-9271-4; Karmen Crowther and Alan Wallace, "Delivering Video-streamed Library Orientation on the Web: Technology for the Educational Setting," College and Research Libraries News 62, no. 3 (2001): 280–85.

19 Yin-Leng Theng et al., "Mixed Reality Systems for Learning: A Pilot Study Understanding User Perceptions and Acceptance," International Conference on Virtual Reality (2007): 728–37, https://doi.org/10.1007/978-3-540-73335-5_79.

11977 ---- ARTICLE

Filling the Gap in Database Usability: Putting Vendor Accessibility Compliance to the Test

Samuel Kent Willis and Faye O'Reilly

INFORMATION TECHNOLOGY AND LIBRARIES | DECEMBER 2020
https://doi.org/10.6017/ital.v39i4.11977

Samuel Kent Willis (samuel.willis@wichita.edu) is Assistant Professor and Technology Development Librarian, Wichita State University. Faye O'Reilly (faye.oreilly@wichita.edu) is Assistant Professor and Digital Resources Librarian, Wichita State University. © 2020.

ABSTRACT

Library database vendors often revamp simpler interfaces of their database platforms with script-enriched interfaces to make them more attractive.
Sadly, these enhancements often overlook users who rely on assistive technology, leaving electronic content difficult for this user base despite the potential of electronic materials to be easier for them to access and read than print materials. Even when providers are somewhat aware of this user group's needs, there are questions about the effect of their efforts to date and whether accessibility documentation from them can be relied upon. This study examines selected vendors' VPAT reports (Voluntary Product Accessibility Template) through a manual assessment of their database platforms to determine their overall accessibility.

INTRODUCTION

Libraries are now providing more access to online databases than ever before. In fact, as Blechner notes, most of the "information patrons seek is located in indexes and databases that are only available digitally. Students and faculty rely heavily on these resources in completing course assignments and conducting research."1 Vendors frequently revamp simpler interfaces of their database platforms with script-enriched interfaces to make them more attractive to students.2 Sadly, these enhancements often overlook users who rely on assistive technology, leaving electronic content difficult for this user base despite the potential of electronic materials to be easier for them to access and read than print materials.

Online databases not only bridge the gap for distance users but can also improve service to users with print disabilities.3 Resources produced digitally or properly digitized for online dissemination more readily allow all users, including patrons with physical or mental impairments, to make use of them than do print materials. These resources allow all patrons to have access to updates and new publications at the same time, and they can be presented in multiple formats.4 Key features of electronic access that are helpful to users are zooming in on text and automatic reflow to reduce the need to scroll, improving color contrast or changing colors to make looking at the screen easier on the eyes, and the capability of the text to be read aloud by either a built-in feature or user-provided assistive technology such as a screen reader or refreshable braille display.5 All of this, however, presupposes that the content can be accessed using the platform provided by the vendor to navigate the database, and that the documents are made at least minimally accessible. The question, then, is how well do these platforms interact with the assistive technologies employed by the largest minority group in the United States (persons with disabilities), who rely on libraries to facilitate "their full participation in society" and to achieve academic success?6

Many vendors provide accessibility documentation pertaining to their database platforms. Some note considerable limitations in accessibility while others claim to be highly accessible when in fact they may be no better than the former. Accessibility guidelines like those Section 508 of the Rehabilitation Act sets forth are a good starting point, but related literature has emphasized that even conformance to these standards does not guarantee resources will be usable for all.7

LITERATURE REVIEW

Accessibility in libraries has been examined from a variety of vantage points.
Some studies were an inspiration to our work and complementary to it, though our manual and holistic review of library databases from third parties was a unique approach.

Dermody and Majekodunmi conducted a usability study of electronic databases, focusing on students unable to fully make use of analog materials.8 They asserted that technology, online databases in particular, can either be a help or a hindrance to users with print disabilities.9 After having visually impaired students use screen readers to test three proprietary databases, the authors concluded that their use of the platforms was disrupted by advanced features designed to engage users. Study participants were frustrated to have to abandon a research article applicable to their topic because it was presented in an unreadable format.10 The authors found that as website design evolves to enhance the user experience, screen reader users and others who rely on assistive technology are often overlooked and unable to make use of the sites due to the construction of the platforms and due to inaccessible PDFs.11 Regarding accessibility assessment, the authors asserted that database providers were unlikely to catch all issues or evaluate their products accurately.12 The legal responsibility for these shortfalls, however, belongs to the subscribing institutions.13 The results of Dermody and Majekodunmi's survey demonstrated that the usability of electronic databases was stunted by the limitations of screen readers, by the platforms or materials themselves, and by insufficient information literacy training for assistive technology users.

In 2015, Blechner wrote about the challenges law students with disabilities face in their education, similar to any undergraduate or graduate program. That study was conducted by a librarian with screen-reading software and an accessibility checklist. Blechner highlighted that difficulty using research databases with assistive technology to locate material and complete assignments was a barrier to completing legal education programs or passing the bar.14 In academic institutions, student success is related to library access. As much of a library's resources are online, inaccessible electronic resources present a massive issue.15 Database design is especially important to users who use assistive technologies to access online resources. Blechner pointed out that an additional barrier to online resource access was an average delay of three years before an accessible version of a requested platform or service was prepared.16 If an undergraduate degree takes four years to complete, a freshman living with a disability would be a senior before they had equitable access. Blechner stressed a need for librarians to go beyond addressing the accessibility of their native web platforms and to inspect vendor platforms prior to subscribing to them. Libraries "rarely raise the issue when selecting electronic indexes and databases for procurement from outside vendors."17 Libraries cannot adequately serve patrons and comply with legal requirements if they are unable to provide meaningful access to information for all library patrons. A significant point from Blechner's article was that compliance with federal standards does not guarantee a service is easy to use or usable at all.
“A product can receive a rubber stamp even when it is not functional or usable despite a company’s good faith efforts to provide an accessible product.”18 Other authors have supported this claim, which, along with our own observations, was an impetus for this research. INFORMATION TECHNOLOGY AND LIBRARIES DECEMBER 2020 FILLING THE GAP IN DATABASE USABILITY | WILLIS AND O’REILLY 3 In Chapter 8 of Ensuring Digital Accessibility through Process and Policy, Lazar, Goldstein, and Taylor used different web accessibility evaluation methods to verify vendor accessibility information on their platforms. The three methods they examined were (1) Having users with disabilities test the platform or content using assistive technology; (2) Conducting an expert review to ensure compliance with usability standards; and (3) Performing an automated scan of the content using scanning software. Regardless of method chosen for evaluation, the authors stressed the importance of continuous evaluation, as content can easily become inaccessible through changes to the user interface. The authors identified strengths and weaknesses to each of the approaches but recommended that whenever possible Method One be used from early on in the development with a goal of ongoing improvement, and that Method Two be used in conjunction with it. When specifically examining the accessibility review of vendor-supplied database content, the authors noted that a Voluntary Product Accessibility Template (VPAT) is one form of Method Two; however, its findings are only reliable insofar as the template is completed by an accessibility expert, and even then there is room for disagreement. 19 This supported the approach we undertook in this study to examine vendor databases and compare our findings with vendors’ VPATs when available. In our professional experience, some VPAT creators are experts in accessibility, while others are not, and even among experts opinions vary, which led us to the same conclusion as Lazar et al.: “Multiple experts, working independently, can increase the validity of the accessibility inspection.”20 Jennifer Tatomir created a checklist, the Tatomir Accessibility Check-list (TAC), to apply the accessibility guidelines to a usability study.21 At the time the article was written in 2010, the then- current web accessibility standards would have been the WCAG 2.0 (released in 2008) and Section 508 standards, last revised in 1998 to include equitable access to information and data under the protection of the law. WCAG is now in version 2.1, with version 2.2 already in development, and Section 508 requirements were updated in 2017 to include many WCAG principles. The TAC examined (1) documents and webpages; (2) bypass links; (3) page element labels; (4) captions for images and figures; (5) scripts and code that would interfere with assistive technology; (6) duplicate links; (7) transcripts for audiovisual material; (8) site organization; (9) timed responses; and (10) the accessibility of web forms.22 While the testing criteria used in this study differed from ours on several points, Tatomir and Durrance’s work supported our creation of the Accessibility Remediation Guide (ARG), a checklist of which Section 508 standards would be the most important to our libraries (see Appendix A). The ARG will be discussed in more detail later in this article. Finally, DeLancey conducted an assessment of the accuracy of 17 vendors’ VPATs which was similar to one aspect of our research. 
Finally, DeLancey conducted an assessment of the accuracy of 17 vendors' VPATs, which was similar to one aspect of our research. Her work used automated assessment tools as the primary measure for comparison against VPATs, while this study is a direct comparison of two expert reviews.23 The goal of our research project was to determine the accuracy of vendor-supplied accessibility documentation—VPATs in particular—to inform future communications with those vendors as well as collection development decisions moving forward.

The studies cited in this paper used sighted librarians, students using screen readers, and native users of screen readers to conduct accessibility testing. Ideal candidates for accessibility testing would of course be users with disabilities. However, this approach can be complemented by a review for basic usability and compliance with Section 508 standards. Librarians are also ideal candidates for accessibility testing since they have access to and expertise in using research databases and are committed to providing access to all. Librarians can also provide information in advance, in anticipation of need. The findings of such accessibility testing could be beneficial in drafting licensing agreements that would ensure a higher level of service for patrons with disabilities. As Blechner said, it is "critical that libraries independently exercise their power as buying agents to improve the state of electronic resource accessibility."25 Librarians can be instrumental in the development of database platforms moving forward by continually checking the accessibility of these platforms and sharing opportunities for enhancement with vendors.26

METHODOLOGY

This study made use of the ARG (Appendix A) for both VPAT accuracy analysis and overall testing of database accessibility. The ARG was based on the standards set forth in Section 508 of the Rehabilitation Act and related VPAT creation guidelines. The ARG has 11 criteria and was originally intended for accessibility evaluation of new databases, but two criteria were merged with others to make nine in order for it to be easier for a graduate student to evaluate. Due to the breadth of technologies covered in a VPAT, the authors determined that many of the sections in a VPAT were not relevant to our examination. An example of this is Section 1194.25, which refers to the physical accessibility of kiosks and the like and therefore has no bearing on electronic content.

The functionalities we chose to test were a restricted subset of the functionalities assessed in a VPAT, but this set was selected for several reasons. Some of the guidelines were selected due to their wide impact on a variety of assistive technologies related to the needs of persons with disabilities, including blindness, deafness, limited vision, hearing, or mobility. Following these guidelines would improve the performance of the platforms for use with screen readers and keyboards, eye tracking software, refreshable braille displays, and other assistive technology.27 Other tests were chosen as a result of our preliminary investigation and use of the databases, and the resulting evidence that these were areas of concern. Finally, some of the items to be examined were selected because a lack of accessibility in these areas would result in drastic limitations to the usability, and therefore utility, of the databases overall, even if they rarely applied.

The reasons behind this study were threefold. Firstly, 62 percent (48) of our vendors had provided no VPAT.
This test would fulfill a similar purpose, allowing us to know how accessible these databases without VPATs were and to identify particular areas requiring remediation in anticipation of patron needs. Second, our library had anecdotal evidence that some of the VPATs that were provided contained inaccuracies, but without a thorough examination it was impossible to know the particulars or extent of the issues. Finally, the goal of the project was to identify trends in database accessibility and usability for persons with disabilities, comparing major database providers with smaller vendors. These findings will give insight into what most needs to be addressed based on the size and type of content provider and will likely have some bearing on similar institutions’ collections.

These are the criteria we used in testing. Other institutions, if following our example, would likely want to adapt the list to meet their needs and institutional priorities:

1) Keyboard Navigation and Intuitive Forms
2) Presence of Keyboard Traps
3) Platform Optical Character Recognition (OCR)
4) Document OCR
5) Alternative Text
6) Table Data
7) Skip Navigation
8) Transcripts
9) Closed Captions

Note: Criteria 3, 8, and 9 included testing support materials, including video tutorials for Criteria 8 and 9.

Once we had determined which sections to include, we hired and trained a graduate student to use the ARG to examine each database to which we subscribed on the nine criteria. We were awarded a grant to fund the student’s work. The student tested each database platform and a minimum of three items in each database, manually checking them using a keyboard and screen reader (NVDA). This testing fulfilled the majority of our priorities but was supplemented by her checking for transcripts and captions for video content. While the findings of a manual test cannot be comprehensive, this work complements existing VPATs in enabling us to identify areas in need of development in vendor platform usability. It is noteworthy that by testing the databases manually with a screen reader, certain limitations in the usability of the databases were discovered that would not have been revealed by the automated checks performed in similar studies. An excellent example of this is poorly designed skip navigation, which was found for nearly half of our databases.

Using the data we collected with the assistance of our graduate student, we compiled and compared our findings on the various vendors. Our scoring (based on representative random sampling) gave one point for a database passing a single criterion, half a point for a minor issue, and no points for any criterion that failed our tests. The scores were then added together, ignoring any criteria that did not apply to particular databases, to form a composite score. When analyzing vendors with multiple databases, their overall score was based on the average of the individual database scores. This enabled us to codify a percentage of accessibility for every vendor and compare them. For the purposes of this study, we will refer to any vendor that provided the University Libraries with 15 or more database subscriptions as a large vendor (LV), and the rest as small vendors (SVs). Given that we only subscribe to 15 or more databases from a few vendors, some of the vendors we classified as SVs would likely be considered LVs at other institutions.
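As a minimal sketch of the scoring arithmetic just described (an illustration only, not the instrument used in the study; the criterion names and the example database are hypothetical):

```python
# Composite scoring: 1 point for a pass, 0.5 for a minor issue, 0 for a fail;
# criteria that do not apply are ignored. Illustrative sketch only.
PASS, MINOR_ISSUE, FAIL, NOT_APPLICABLE = 1.0, 0.5, 0.0, None

def composite_score(results):
    """results maps criterion name -> PASS, MINOR_ISSUE, FAIL, or NOT_APPLICABLE.
    Returns the share of applicable criteria passed (N/A criteria are skipped)."""
    applicable = [score for score in results.values() if score is not None]
    return sum(applicable) / len(applicable)

def vendor_score(databases):
    """A vendor's overall score is the average of its individual database scores."""
    return sum(composite_score(db) for db in databases) / len(databases)

# Hypothetical database: a minor skip-navigation issue, inaccessible document
# OCR, and no audiovisual content (so transcripts do not apply).
example_db = {
    "keyboard_navigation": PASS,
    "keyboard_traps": PASS,
    "skip_navigation": MINOR_ISSUE,
    "document_ocr": FAIL,
    "transcripts": NOT_APPLICABLE,
}
print(f"{composite_score(example_db):.1%}")  # 2.5 of 4 applicable -> 62.5%
```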
RESEARCH FINDINGS

VPAT Accuracy Assessment

As previously stated, one goal of this research was to measure the accuracy of vendor-supplied VPATs. Of the databases assessed, 227 had an associated VPAT from the vendor, but the rest did not (see Appendix B for a list of all databases by vendor). We used the ARG (Appendix A) and compared the vendors’ claims on the VPAT to our manual testing of the database functionality. Of the 227 databases, only 10 were found to fully match the claims the vendor made on the VPAT for the criteria assessed from the ARG. Databases where the VPAT claims did not match the findings of the testing on one criterion were given a score of “Partial Match.” Of the 227 databases, 138 were considered partial matches (see figure 1 for details). The main incongruity between VPATs and our results was databases not having sufficient skip navigation, meaning they did not have appropriate or functional bypasses. These issues are likely due to outdated VPATs that do not reflect the latest changes to the databases but could also be the result of vendors’ lack of understanding of what it means to be truly usable by persons with disabilities.

For databases that failed two or more of the criteria tested, a score of “Not a Match” was given; 79 of the 227 databases failed in this way (see figure 1 for details). For these databases, skip navigation and alternative text were the main issues. When presenting essential content in an image, like a photo or chart, these databases did not provide an alternative presentation of that content, which means only sighted users could access the data from that image. These findings are similar to the data from the overall usability study, which found that vendors struggled with skip navigation and alt text, as we will discuss below. Some of these databases were also found to have keyboard traps that prevented screen reader users from navigating the entire site and at times even trapped the user’s navigation in a single content area. This number of inconsistencies was even higher than the authors anticipated and reinforced all the more the importance of not taking the information in VPATs for granted, especially when the VPAT is several years old and the platform has undergone any changes.

Figure 1. VPAT Accuracy Assessment (Partial Match: 61%; Match: 4%; Not a Match: 35%)

Accessibility Analysis Overall

Related to the VPAT accuracy assessment, we conducted manual tests of our databases and database platforms, both those with and without a VPAT provided by the vendor. Of our 351 databases, 124 (35 percent) had no related VPAT, and on the whole, examining all criteria, we found them notably less accessible. That said, there were exceptions where databases with no associated VPAT still had accessibility information giving reasonable detail, and others where the VPAT provided was inaccurate or where it highlighted significant accessibility issues (see tables 1 and 2). The average composite score of VPAT-linked databases was 74 percent, compared to 67 percent for those with none (see table 3 for comparison). Each criterion was compared, and any instance where one category of databases scored more than five percentage points higher than the other was highlighted.
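The “percent good of applicable” figures reported in table 3 below can be reproduced from the counts in tables 1 and 2: full credit for a Good rating, half credit for a Partial one, divided by the number of applicable databases. A minimal sketch of that comparison (our illustration, not the authors’ code), using the Download OCR row as input:

```python
# "Percent good of applicable": full credit for Good, half credit for Partial.
def percent_good(good, partial, applicable):
    return 100 * (good + 0.5 * partial) / applicable

# Download OCR counts taken from tables 1 and 2.
with_vpat = percent_good(good=50, partial=32, applicable=160)     # 41.25
without_vpat = percent_good(good=61, partial=14, applicable=102)  # 66.67
# Gaps of more than five percentage points between the groups were highlighted.
if abs(with_vpat - without_vpat) > 5:
    print(f"Download OCR: {with_vpat:.2f}% with VPATs vs {without_vpat:.2f}% without")
```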
Table 1. Summary of Issues for Databases with VPATs (227 total)

Criterion              Good    Partial   Poor    Applicable    N/A
Download OCR             50       32       78       160         67
Skip Navigation          68      124       35       227          0
Transcripts              42        4       15        61        166
Alt Text                 71       39       12       122        105
Tables                   17       38        4        59        168
Captions                 35        0        8        43        184
Platform OCR            108       41        7       156         71
Keyboard Navigation     202       22        3       227          0
Keyboard Traps          224        0        3       227          0
Average               90.78    33.33    18.33    142.44      84.56

Table 2. Summary of Issues for Databases without VPATs (124 total)

Criterion              Good    Partial   Poor    Applicable    N/A
Download OCR             61       14       27       102         22
Skip Navigation          47       27       48       122          2
Transcripts               3        0        8        11        113
Alt Text                 52       38       26       116          8
Tables                   30        8        2        40         84
Captions                  6        1        4        11        113
Platform OCR             88       10       16       114         10
Keyboard Navigation      74       35       15       124          0
Keyboard Traps          123        0        1       124          0
Average               53.78    14.78    16.33     84.89      39.11

Table 3. Comparison of Databases with and without VPATs (351 total)

Criterion              Percent Good of Applicable,   Percent Good of Applicable,
                       Databases with VPATs (227)    Databases without VPATs (124)
Download OCR                    41.25%                        66.67%
Skip Navigation                 57.27%                        49.59%
Transcripts                     72.13%                        27.27%
Alt Text                        74.18%                        61.21%
Tables                          61.02%                        85.00%
Captions                        81.40%                        59.09%
Platform OCR                    82.37%                        81.58%
Keyboard Navigation             93.83%                        73.79%
Keyboard Traps                  98.68%                        99.19%
Average                         73.57%                        67.04%

The biggest barriers to accessibility found in this study pertained to downloadable files’ OCR, skip navigation, transcripts, and alternative text (see figure 2 and table 4). The accessibility of downloadable files through OCR or alternative formats (TXT, HTML, etc.) was found to be the biggest concern, though it did not apply to all databases. Its overall score for applicable databases was 51 percent, based on the frequency and severity of the issues. Many database platforms had full text available for download only through PDFs that were images of text or that had other issues that kept them from working with assistive technologies. Inaccessible downloadable files were more than twice as frequent as inaccessible full text online. Often HTML or TXT formats were not available for download, but in instances where the full text was available through the vendor’s platform, that alternative means of accessing the information mitigated the issue. Other times, however, the full text on the platform itself was not accessible.

Figure 2. Accessibility Issues by Database (number of databases rated Good, Partial, or Poor on each testing criterion)

Table 4. Summary of Issues by Database Platform (351 total)

Criterion              Good    Partial   Poor    Applicable     N/A    Percent Good of Applicable
Download OCR            111       46      105       262          89           51.15%
Skip Navigation         115      151       83       349           2           54.58%
Transcripts              45        4       23        72         279           65.28%
Alt Text                123       77       38       238         113           67.86%
Tables                   47       46        6        99         252           70.71%
Captions                 41        1       12        54         297           76.85%
Platform OCR            196       51       23       270          81           82.04%
Keyboard Navigation     276       57       18       351           0           86.75%
Keyboard Traps          347        0        4       351           0           98.86%
Average              144.56    48.11    34.67    227.33      123.67           72.67%

A lack of, or poorly executed, skip navigation accounted for the second greatest number of issues by vendor. This criterion’s final score was 55 percent. When skip navigation existed, the most common problem was that it did not redirect to the main content. Often, for example, on the search results page, the link would take the user to the filters in the margin with no easy way to bypass them and get to the actual results.
Eighty-three databases were found to have no skip navigation whatsoever, but the majority of issues found were from existing bypass links not working as intended.

Databases with audiovisual materials made up a relatively small portion of our databases, but when these types of items existed, problems were not infrequent. Additionally, we examined support videos made available by database providers to test all multimedia content for transcripts and captions. Twenty-seven out of 72 (38 percent) were determined to have inaccurate transcripts or to be in need of them. Captions are irrelevant to non-visual materials, so they were applicable to only 54 databases. Of these, 13 (24 percent) were found lacking. Transcripts were therefore the bigger issue.

Nearly half of the databases with images had at least minor issues with alternative text, whether in documents or the platforms themselves. In many cases, this issue was not identified by the vendor in any accessibility documentation because alternative text was present, but not properly descriptive. Thirty-eight databases (16 percent of applicable) had major issues where images were important to the performance of the platform or database and no alternative text was provided. In database materials, charts and graphs often lacked any alternative text, though on occasion we found the information conveyed in the chart was covered in the main text. In those instances, it was not counted as an issue.

The results for tables were similar. Both in the platforms and the documents, tables often lacked the identifying header and cell information screen readers need to make sense of the data. A few were entirely unreadable. Fifty-two of 99 databases with tabular data (53 percent) had problems, but most were not major, and for this reason tables were of less concern than alternative text.

Finally, keyboard navigation was a rarer issue but was still found to be a concern in 75 databases (21 percent). This was often related to images or forms not having descriptive text for screen readers, so non-visual users would be unable to know the purpose of the form or other element. On a few occasions, database platforms had keyboard traps that prevented screen reader users from navigating the entire site, or, more often, buttons or links that could be used only with a mouse. While our testing only included keyboard navigation, it is important to remember that if a site is not usable by keyboard, it is unlikely to work with other assistive technology used for navigation. While this area was the least frequent concern of all the criteria we tested, it is nevertheless a vital part of making any website or platform truly usable.

All these findings were important to our study, as they helped us to identify areas of need, especially for databases that had no corresponding VPAT. Whether the databases had a VPAT or not, this research provided us with the details needed to reach out to database providers and request specific improvements.
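Because broken bypass links were the single most common failure, this is one area where even a lightweight automated screen can complement the manual keyboard-and-NVDA testing used in this study. The following is a toy sketch, not part of the study’s methodology, that assumes a static HTML page and uses the third-party requests and BeautifulSoup libraries to flag skip links whose target does not exist:

```python
import requests
from bs4 import BeautifulSoup  # pip install requests beautifulsoup4

def check_skip_links(url):
    """Flag 'skip navigation' links that point at a missing in-page target."""
    html = requests.get(url, timeout=30).text
    soup = BeautifulSoup(html, "html.parser")
    skip_links = [a for a in soup.find_all("a", href=True)
                  if a["href"].startswith("#") and "skip" in a.get_text().lower()]
    if not skip_links:
        print("No skip-navigation links found at all.")
        return
    for link in skip_links:
        target = link["href"].lstrip("#")
        # A bypass link only works if an element with that id (or a named
        # anchor) actually exists on the page.
        if not target or (soup.find(id=target) is None
                          and soup.find("a", attrs={"name": target}) is None):
            print(f"Broken bypass link: {link['href']}")

check_skip_links("https://example.com/")  # placeholder URL
```

A check like this cannot catch links that resolve but land on the wrong region, such as the filter-panel problem described above, and pages rendered by JavaScript would need a browser driver instead of requests; it only screens for the grosser failures found here.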
Vendor Comparison by Size

The final goal of our research was to compare the relative accessibility of database providers based on the number of databases we subscribed to from each. While at times we may have subscribed to only a small number of databases from a larger content provider, there was a general correlation between what we considered SVs in this study and those vendors that only offer a more limited number of collections. In assessing how accessible a provider was for each criterion, we added all good scores (one point for each related database) to the partial scores (one half point for each), then divided the sum by the total number of applicable databases. In this way, minor issues were not recorded as negatively as major issues.

Overall accessibility of the LV databases was found to be significantly higher than that of the SV databases (see tables 5 and 6 for findings for LVs and SVs, respectively). Our findings showed our LVs to have an average score of 74 percent accessibility, compared to 69 percent for SVs, both averages being based on the number of applicable databases. There were two tested criteria, however, on which LVs scored lower than SVs: downloadable files’ OCR and tables. The details of each criterion will be discussed below. Most LV content is on a consistent platform, and we found, similar to an earlier study, that this consistency helped those materials to be more accessible.28 The issues LV databases most often had were related to individual items rather than to the platform as a whole. For example, LVs were found to have frequent problems with PDF files. Given that our LVs account for 61 percent of our databases (214 of 351) and their databases are typically larger than those of SVs, this has significant impact on ongoing vendor communication and accessibility remediation efforts.

Skip navigation issues were the largest problem found for SVs. Interestingly, while no LV was entirely missing skip navigation across its platforms, a lack of proper functionality was a major concern for half of them, accounting for 121 databases; thirteen LV databases were found with no skip navigation at all. In contrast, 70 SV databases (52 percent of SV content for which this criterion applied) had no skip navigation or had skip navigation that failed to function entirely. An additional 30 SV databases and the 121 LV databases noted above had improperly functioning bypass links. Overall, SVs were more likely to have no skip navigation at all, and LVs were more likely to have it improperly set up.

Full text OCR results varied greatly depending on the type. Platform OCR showed little difference between LVs and SVs, both being found to be 82 percent accessible. As mentioned previously, downloadable files’ OCR had more accessibility problems than platform OCR, but there was a large difference between LVs and SVs. For this criterion, LV content was found to be accessible only about 40 percent of the time, and SV content 70 percent of the time. This may be because SVs generally have smaller databases, so it is less difficult to address accessibility needs for individual items. Whatever the cause, the disparity between LVs and SVs in this area was very significant.

Transcripts and captions were far more common for LVs than SVs. Fifty-five databases (26 percent of LV databases) included audiovisual material, including support tutorials, while only 17 (12 percent) of SV content did.
LV content was found to be accessible 73 percent of the time for transcripts and 82 percent for captions. Applicable SVs, on the other hand, were only 41 percent accessible for transcripts and 66 percent accessible for captions. This demonstrates the need for development in both of these areas, but especially for transcripts, which, when synchronized with the videos, have the capability to fulfill more user needs than captions can.

Closely following transcripts was alternative text for non-textual content like charts, diagrams, and other images. It is worth mentioning that some databases have images neither in their platforms nor in their collection materials. If the platform is simple and the database only provides abstracts, for example, there may be no images, in which case this criterion does not apply. Nearly one-third (113) of the databases were found to have no images. Of the 238 databases with images, we found at least some issues with 115 (48 percent of applicable, 33 percent overall), with no significant difference between SVs and LVs as a whole. Individually, the platforms varied greatly, and regarding major limitations in alternative text there were 21 such SV databases but only 17 LV databases.

While table accessibility applied to only 99 databases, there were significant issues found, particularly with one LV. Given the disparity between LVs, it is impossible to draw meaningful conclusions comparing LVs and SVs for this criterion. Further study is needed in this area.

Finally, the areas of least frequent concern were keyboard navigation and keyboard traps. Seventy-five databases (21 percent) were found to have suboptimal navigation. In this case, LVs did not have as many issues as SVs. Optimization is needed for them, but only one LV had major issues in this area. Forty percent of SV databases (55 of 137) had at least some navigation issues identified, whereas only nine percent of LV databases (20 of 214) had any issues in this area. As for major issues, only four databases were identified in our study as having keyboard traps, two from SVs and two from LVs. These only seemed to appear in separate platforms and never in large ones, suggesting that our vendors are likely aware of this issue and avoiding it in newly created platforms. The authors hope the remaining databases with this issue will not be neglected in making these improvements.

To sum up, LV content was found to be more accessible overall. Their largely consistent platforms more often had working skip navigation (29 percentage points more), transcripts (32 points more), and captions (16 points more) for multimedia content, and superior keyboard navigation (18 points more). SV platforms, however, had a higher score on downloadable files’ OCR (31 points more) and on tables (24 points more). See table 7 for a detailed comparison.

Table 5. Issues by LV Database (214 total)

Criterion              Good    Partial   Poor    Applicable    N/A
Download OCR             52       25       86       163         51
Skip Navigation          80      121       13       214          0
Transcripts              38        4       13        55        159
Alt Text                 59       41       17       117         97
Tables                   14       35        4        53        161
Captions                 31        0        7        38        176
Platform OCR            106       39        8       153         61
Keyboard Navigation     194       13        7       214          0
Keyboard Traps          212        0        2       214          0
Average               87.33    30.89    17.44    135.67      78.33
Table 6. Issues by SV Database (137 total)

Criterion              Good    Partial   Poor    Applicable    N/A
Download OCR             59       21       19        99         38
Skip Navigation          35       30       70       135          2
Transcripts               7        0       10        17        120
Alt Text                 64       36       21       121         16
Tables                   33       11        2        46         91
Captions                 10        1        5        16        121
Platform OCR             90       12       15       117         20
Keyboard Navigation      82       44       11       137          0
Keyboard Traps          135        0        2       137          0
Average               57.22    17.22    17.22     91.67      45.33

Table 7. Comparison of LV Databases and SV Databases (351 total)

Criterion              Percent Good of Applicable    Percent Good of Applicable
                       Databases from LVs (214)      Databases from SVs (137)
Download OCR                    39.57%                        70.20%
Skip Navigation                 65.65%                        37.04%
Transcripts                     72.73%                        41.18%
Alt Text                        67.95%                        67.77%
Tables                          59.43%                        83.70%
Captions                        81.58%                        65.63%
Platform OCR                    82.03%                        82.05%
Keyboard Navigation             93.69%                        75.91%
Keyboard Traps                  99.07%                        98.54%
Average                         73.52%                        69.11%

CONCLUSION AND LIMITATIONS

This investigation was intended to complement existing studies related to library database accessibility. It was unique in that it manually analyzed content from every database subscription in the University Libraries, rather than relying on only major or representative databases or on automated tests. Building a comparison between vendor VPATs and our manual assessment was a key value of this research that we hope will be further developed in future inquiry. The comparison of different types of vendors was also important. While the consistency of LV platforms was found to improve the sites overall, the authors expected that LV content would be more compliant with accessibility regulations than it was found to be. From a usability and accessibility perspective, the increased cost of these databases was deemed to be associated with too little improvement of service. It matters little how clean a platform looks to visual users, for example, if it is impossible or very difficult to use for non-visual users.

As anticipated, there were few instances of keyboard traps (when a keyboard and screen reader user is caught in a loop or on a single link when attempting to navigate through the website). When these occur, however, they are a major concern, as they render the site virtually useless for non-mouse users. There was no significant difference between LVs and SVs on three of nine criteria—including keyboard traps—and on two criteria SVs were superior. Therefore, even though LVs were found to be more accessible on average (74 percent versus 69 percent), the authors urge LVs to work diligently to address the areas where they were found to be deficient. Both aspects of this study indicated that vendors generally misunderstand the execution of skip navigation and alternative text: the usability testing showed that many databases failed to fulfill these criteria, while the separate review of VPAT accuracy showed that vendors claimed compliance with criteria their platforms did not fully meet.

This study is limited in that only a few samples could be examined for each content type in every database platform. The authors anticipate that a deeper investigation would bring to light additional accessibility concerns. Another limitation of this research was related to the time involved in testing. Database platforms changed during the course of this work, and the results of this study pertain to only a short period of time, making some of them outdated even at the time of this writing.
Therefore, the manual testing we have performed would work best when used in conjunction with automated tools for testing database content, as other studies have done. The authors hope that further study in this area will involve persons with varied impairments testing the platforms directly, and assert that there is potential for collaboration between vendors and libraries in this area.

APPENDIX A: ACCESSIBILITY REMEDIATION GUIDE

The authors developed the ARG for database testing prior to signing licensing agreements with vendors. While initially created based on VPAT Version 1 criteria as defined in the Section 508 Standards, it was adapted and cross-referenced with VPAT Version 2 criteria following the refresh of Section 508 in January of 2018. The organization of the criteria was altered greatly at that time, but VPATs from vendors may use either version, depending on the age of the VPAT. Finally, it was used in this study to create the testing criteria. Each entry below lists the relevant testing criteria, the VPAT Version 1-1.6 standard, the corresponding VPAT Version 2-2.3 standards, and any notes.

Section 1194.22 (web-based intranet and internet information and applications), with related standards after the Section 508 Refresh

Testing criteria: 5 (alternative text)
VPAT Version 1 standard: A) A text equivalent for every non-text element shall be provided (e.g., via “alt,” “longdesc,” or in element content).
VPAT Version 2 standards: E101 (Web, Software), E201 (Application); WCAG: 1.1.1 Non-text Content

Testing criteria: 8 and 9 (transcripts and closed captions)
VPAT Version 1 standard: B) Equivalent alternatives for any multimedia presentations shall be synchronized with the presentation.
VPAT Version 2 standards: 500 (Software); WCAG: 1.2.2 Captions (Prerecorded) and 1.2.3 Audio Description
Notes: For streaming media only. “Equivalent alternatives” include transcripts.

Testing criteria: 3, 4, and 6 (platform OCR, document OCR, and table data)
VPAT Version 1 standard: D) Documents shall be organized so they are readable without requiring an associated style sheet.
VPAT Version 2 standards: E205.2-4 (Electronic Content); WCAG: 1.3.2 Meaningful Sequence
Notes: “Documents” describes the webpage. Is the webpage well organized so it is readable without style elements (colors, blocking, font sizes, etc.)?

Testing criteria: 1 and 3 (keyboard navigation and intuitive forms, and platform OCR)
VPAT Version 1 standard: L) When pages utilize scripting languages to display content, or to create interface elements, the information provided by the script shall be identified with functional text that can be read by assistive technology.
VPAT Version 2 standards: E205.2-4 (Electronic Content); WCAG: 2.1.1 Keyboard
Notes: Does the database include interactive content (buttons, check boxes, or other mouse input; news tickers; media players; browser games; etc.)? Is this content accurately identified via text for use with screen readers?

Testing criteria: 1 (keyboard navigation and intuitive forms)
VPAT Version 1 standard: N) When electronic forms are designed to be completed online, the form shall allow people using assistive technology to access the information, field elements, and functionality required for completion and submission of the form, including all directions and cues.
VPAT Version 2 standards: E205.2-4 (Electronic Content); WCAG: 3.2.1 On Focus
Notes: The definition of “form” includes search boxes in databases. Are the search box and its purpose accurately identified?

Testing criteria: 7 (skip navigation)
VPAT Version 1 standard: O) A method shall be provided that permits users to skip repetitive navigation links.
VPAT Version 2 standards: E205.2-4 (Electronic Content); WCAG: 2.4.1 Bypass Blocks and 1.3.1 Info and Relationships

Section 1194.24 (Video and Multimedia Products), with related standards after the Section 508 Refresh

Testing criteria: 8 and 9 (transcripts and closed captions)
VPAT Version 1 standard: E) Display or presentation of alternate text presentation or audio descriptions shall be user-selectable unless permanent.
VPAT Version 2 standards: 400 (Hardware); WCAG: 1.2.1 and 1.2.3 Audio Description or Media Alternative
Notes: For streaming media only.

Section 1194.31 (Functional Performance Criteria), with related standards after the Section 508 Refresh

Testing criteria: 3, 4, and 6 (platform OCR, document OCR, and table data)
VPAT Version 1 standard: A) At least one mode of operation and information retrieval that does not require user vision shall be provided, or support for assistive technology used by people who are blind or visually impaired shall be provided.
VPAT Version 2 standards: 302.1 (Vision); WCAG: 1.4.5 Images of Text
Notes: Do PDFs have optical character recognition (OCR) text, or are they only images of text? If they do have OCR text, is it accurate? Is it missing information in images or figures?

Testing criteria: 8 and 9 (transcripts and closed captions)
VPAT Version 1 standard: C) At least one mode of operation and information retrieval that does not require user hearing shall be provided, or support for assistive technology used by people who are deaf or hard of hearing shall be provided.
VPAT Version 2 standards: 302.4 (Hearing); WCAG: 1.2.1 and 1.2.2

Testing criteria: 1 and 2 (keyboard navigation and intuitive forms, and presence of keyboard traps)
VPAT Version 1 standard: F) At least one mode of operation and information retrieval that does not require fine motor control or simultaneous actions and that is operable with limited reach and strength shall be provided.
VPAT Version 2 standards: 302.7 (Limited Manipulation), 302.8 (Limited Reach); WCAG: 2.1.1 Keyboard

Section 1194.41 (Information, Documentation and Support), with related standards after the Section 508 Refresh

Testing criteria: 3, 8, and 9 (platform OCR, transcripts, and closed captions)
VPAT Version 1 standard: B) End-users shall have access to a description of the accessibility and compatibility features of products in alternate formats or alternate methods upon request, at no additional charge.
VPAT Version 2 standards: 602.2 (Accessibility and Compatibility Features) and 603.2 (Information on Accessibility and Compatibility Features); WCAG: 3.3.5 Help

APPENDIX B: DATABASE BY VENDOR LIST USED IN VPAT ACCURACY AUDIT

In the list below, each vendor is followed by the names of its databases.

AAPG (American Association of Petroleum Geologists) AAPG/Datapages ABC-CLIO ARBAonline ACLS (American Council of Learned Societies) ACLS Humanities E-Book ACM (Association for Computing Machinery) ACM Digital Library ACS (American Chemical Society) SciFinder Adam Matthew Digital African American Communities Migration to New Worlds American Indian Histories and Cultures American West Digital Collection AIAA (American Institute of Aeronautics & Astronautics) AIAA Electronic Library Alexander Street Press Academic Video Online African American Music Reference American Civil War: Letters and Diaries American History in Video Anthropological Field Work Online Anthropology Online Art and Architecture in Video Asian American Drama BBC Video Collection Black Drama Black Studies in Video Border and Migration Studies Online Broadway HD Classical Music in Video Classical Music Library Classical Performance in Video Classical Scores Library Contemporary World Drama Counseling and Psychotherapy Transcripts: Volume I Counseling and Psychotherapy Transcripts: Volume II Counseling and Therapy in Video Dance Online: Dance in Video Dance Online: Dance Studies Collection Diagnosing Mental Disorders: DSM-5 and ICD-10 Disability in the Modern World Drama Texts Collection Early Encounters in North America Education in Video Engineering Case Studies Online Environmental Issues Online Ethnographic Sound Archives Online Ethnographic Video Online Food Studies Online Gilded Age Global Issues Library Human Rights Studies Online Illustrated Civil War Newspapers and Magazines Images of America: A History of American Life in Images and Texts International Business Online LGBT Studies in Video LGBT Thought and Culture Music Online: Listening (United States) Music Periodicals of the 19th Century New World Cinema: Independent Features and Shorts (1990-present) North American Immigrant Letters, Diaries and Oral Histories North American Indian Thought and Culture North American Women's Drama North American Women's Letters and Diaries Nursing and Mental Health in Video: A Symptom Media Collection Nursing Education in Video PBS Video Collection Performance Design Archive Psychological Experiments Online Royal Shakespeare Company Collection Silent Film Online Sixties: Primary Document and Personal Narratives 1960–1974 Social Theory Social Work Online Sony Pictures Classics Theatre & Drama Premium Theatre in Context Theatre in Performance Theatre in Video: Volume I Theatre in Video: Volume II Twentieth Century Drama Underground & Independent Comic, Comix and Graphic Novels Women and Social Movements in the United States, 1600–2000 World History in Video 60 Minutes: 1997–2014 American Institute of Physics Scitation Index SPIN American Mathematical Society MathSciNet APA (American Psychological Association) APA Books E-Collections ASM International ASM Handbooks Online ASME (American Society of Mechanical Engineers) ASME Digital Collection ASTM ASTM Standards &
Engineering Digital Library BioOne BioOne Books 24x7 FinancePro ITPro Britannica Encyclopedia Britannica Online Spanish Reference Center Business Expert Press Business Expert Press Cabell's Cabell's Directory - Psychology Set Cabell's Directory - Educational Set Cambridge Crystallographic Data Centre Cambridge Structural Database (WebCSD) WebCSD Cambridge University Press Historical Statistics of the United States (HSUS) Chadwyck Healey Early English Books Online Black Abolitionist Papers Black Studies Center Black Studies Center: History Makers Module Early English Books Online Text Creation Project CLCD (Children's Literature Comprehensive Database) Children's Literature Comprehensive Database (CLCD) CQ Press CQ Researcher Credo Reference Masterworks Credo Reference dataZoa dataZoa EBSCO Agricola Alt-HealthWatch America: History & Life (EBSCO) American Antiquarian Society (AAS) Historical Periodicals Collection (Series 1–5) American Doctoral Dissertations 1933–1955 Anthropology Plus Applied Science & Technology Abstracts Art Abstracts Art Full Text Art Index Retrospective ATLA (American Theological Library Association) Historical Monographs Collection: Series I ATLA (American Theological Library Association) Historical Monographs Collection: Series II Auto Repair Reference Center Biography Reference Bank Book Collection: Nonfiction Book Review Digest Plus Business Abstracts with Full Text Business Source Complete CINAHL Complete Communication & Mass Media Complete Computer Source: Consumer Edition Consumer Health Complete Criminal Justice Abstracts with Full Text eBook Collection (formerly NetLibrary) EBSCO Databases EconLit Education Full Text Ergonomics Abstracts ERIC (EBSCO) European Views of the Americas: 1493 to 1750 Fuente Academica General Science Full Text GeoRef GeoRef in Process GreenFILE Health Source: Consumer Edition Health Source: Nursing/Academic Edition History Reference Center Humanities Full Text Library Literature & Information Science Full Text Library, Information Science & Technology Abstracts (LISTA) Literary Reference Center MedicLatina MEDLINE (EBSCO) Mental Measurements Yearbook with Tests in Print MLA Directory of Periodicals MLA International Bibliography Music Index Native American Archives Newspaper Source Plus Novelist Plus OmniFile Full Text Mega Philosopher's Index PsycARTICLES Psychology and Behavioral Sciences Collection PsycINFO PsycTESTS Readers' Guide Full Text Regional Business News Religion & Philosophy Collection RILM Abstracts of Music Literature Small Business Reference Center SmartSearch Social Sciences Full Text SPORTDiscus with Full Text Teacher Reference Center TOPICsearch Vocational & Career Collection Women's Studies International Academic Search Complete Ei Engineering Village Compendex Elsevier ScienceDirect Clinical Pharmacology Scopus Gale 19th Century U.S.
Newspapers 19th Century UK Periodicals Academic OneFile Archives Unbound Artemis Primary Sources British Literary Manuscripts Online Business Insights: Essentials Economist Historical Archive Educator's Reference Complete Eighteenth Century Collections Online (ECCO) Expanded Academic ASAP Gale Databases Gale Digital Collections Gale Virtual Reference Library General OneFile GREENR (Global Reference on the Environment, Energy, and Natural Resources) Health & Wellness Resource Center (with Alternative Health Module) Health Reference Center Academic Indigenous Peoples: North America Informe Academico InfoTrac Newsstand Kansas History, Territorial through Civil War Years, 1854–1865 LegalTrac Literature Resource Center Making of the Modern World Nineteenth Century Collections Online (NCCO) Opposing Viewpoints In Context Sabin Americana, 1500–1926 Slavery and Anti-Slavery Collection Smithsonian Collections Online: Evolution of Flight 1784–1991 Testing & Education Reference Center: TERC Times (London) Google Google Scholar Guidestar Guidestar HathiTrust HathiTrust HeinOnline HeinOnline: Government, Politics and Law HeinOnline: Slavery in America and the World: History, Culture & Law IBISWorld IBISWorld IEEE IEEE - MIT Press eBooks Library IEEE Xplore Digital Library IEEE-Wiley eBooks Library Infobase Learning Films On Demand Infogroup ReferenceUSA Institute of Physics IOPscience InterDok Directory of Published Proceedings JSTOR JSTOR Kanopy Kanopy Streaming Knovel Knovel LexisNexis LexisNexis Academic Nexis Uni Library of Congress Congress.gov (formerly THOMAS Legislative) Mergent Key Business Ratios Mergent Archives Mergent Intellect Mergent Online National Academies Press National Academies Press Publications National Library of Medicine PubMed (Medline) Naxos Naxos Music Library Naxos Sheet Music Library NCJRS National Criminal Justice Reference Service Abstracts Newsbank Access World News Newsbank OCLC ArchiveGrid ArticleFirst CAMIO: Catalog for Art Images Online Clase and Periodica ECO (Electronic Collections Online) FirstSearch OAIster OCLC Electronic Books PapersFirst ProceedingsFirst WorldCat (OCLC) WorldCat Dissertations and Theses WorldCat.org Ovid OvidSP Oxford University Press Oxford Art Online Oxford English Dictionary Oxford History of Western Music Oxford Medicine Online Oxford Music Online Oxford Reference Online: Premium ProjectMUSE Project MUSE ProQuest ABI/INFORM Collection Aerospace Database Agricultural & Environmental Science Database American Periodicals Series (1741–1988) Annual Register (1758–2016) Art and Architecture Archive (1845–2005) Biological Science Database Chicago Defender (1910–1975) (ProQuest Historical Black Newspapers) Cleveland Call & Post (1934–1991) (ProQuest Historical Black Newspapers) ComDisDome Design and Applied Arts Index (DAAI) Digital National Security Archive (DNSA) Dissertations and Theses @ Wichita State University Earth, Atmospheric & Aquatic Science Database EBL Ebook Library (now Ebook Central) EBook Central Ebrary (now Ebook Central) ERIC (ProQuest) Fold3 Harper's Bazaar Archive HeritageQuest Online Literature Online (LION) Los Angeles Sentinel (1934–2005) (ProQuest
Historical Black Newspapers) Materials Science & Engineering Database MEDLINE (ProQuest) National Criminal Justice Reference Service Abstracts (ProQuest) New York Amsterdam News (1922–1993) (ProQuest Historical Black Newspapers) New York Times (1851–3 years ago) with Index (1851–1993) (ProQuest Historical Newspapers) New York Tribune/Herald Tribune (1841–1962) (ProQuest Historical Newspapers) PAIS Index Periodicals Archive Online PILOTS Pittsburgh Courier (1911–2002) (ProQuest Historical Black Newspapers) Pittsburgh Post-Gazette (1786–2003) (ProQuest Historical Newspapers) ProQuest Civil War Era 1840–1865 ProQuest Congressional Publications (including Hearings) ProQuest Databases ProQuest Digital Microfilm ProQuest Historical Newspapers ProQuest History Vault ProQuest Nursing & Allied Health Source ProQuest Research Library Research Library, ProQuest SciTech Premium Collection Social Services Abstracts Sociological Abstracts Technology Collection The Christian Science Monitor (1908–1994) (ProQuest Historical Newspapers) The Guardian & The Observer (1791–1909) (ProQuest Historical Newspapers) Ulrichsweb.com Vogue Archive Women's Magazine Archive Collection 1: 1883–2005 Women's Magazine Archive Collection 2: 1846–2015 Readex African American Newspapers (1827–1998) America's Historical Newspapers (1690–1922) American State Papers, 1789–1838 Early American Imprints Readex AllSearch Territorial Papers of the United States, Series 1 U.S. Congressional Serial Set, 1817–1994 SAGE SAGE Journals Online SAGE Reference Online SAGE Research Methods SAGE Research Methods Cases SAGE Stats Salem Press Salem History Salem Literature SBRnet Sports Market Analysis (formerly SBRnet) Springer SpringerLink State Library of Kansas Mango Languages Cloud Library Digital Books eLending Learning Express Library OneClick Digital Statista Statista Swank Swank Digital Campus Taylor & Francis CRC Press eBooks Europa World Year Book Thomson Reuters Arts & Humanities Citation Index MEDLINE (Web of Science) RIA Checkpoint Science Citation Index Social Sciences Citation Index Web of Science U.S. Department of Commerce STAT-USA U.S. Census Bureau U.S. Department of Education ERIC U.S. Government Printing Office Catalog of U.S. Government Publications GPO Monthly Catalog Homeland Security Digital Library University of Chicago GSS (General Social Survey) University of Michigan ICPSR (Inter-University Consortium for Political and Social Research) UpToDate UpToDate ValueLine ValueLine Investment Survey - Plus Wharton Research Data Services (WRDS) Compustat Eventus Wiley Cochrane Library Wiley Online Library

ENDNOTES

1 A. J. Blechner, “Improving Usability of Legal Research Databases for Users with Print Disabilities,” Legal Reference Services Quarterly 34, no. 2 (2015): 139, https://doi.org/10.1080/0270319X.2015.1048647.

2 Jennifer Horwath, “Evaluating Opportunities for Expanded Information Access: A Study of the Accessibility of Four Online Databases,” Library Hi Tech 20, no. 2 (2002): 199, https://doi.org/10.1108/07378830210432561.

3 Blechner, “Improving Usability of Legal Research Databases for Users with Print Disabilities,” 140.
4 Horwath, “Evaluating Opportunities for Expanded Information Access: A Study of the Accessibility of Four Online Databases,” 199.

5 Sarah George, Ellie Clement, and Grace Hudson, “Auditing the Accessibility of Electronic Resources,” SCONUL Focus 62 (2014): 16.

6 Blechner, “Improving Usability of Legal Research Databases for Users with Print Disabilities,” 141.

7 Suzanne L. Byerley and Mary Beth Chambers, “Accessibility and Usability of Web-based Library Databases for Non-Visual Users,” Library Hi Tech 20, no. 2 (2002): 177; Blechner, “Improving Usability of Legal Research Databases for Users with Print Disabilities,” 140.

8 Kelly Dermody and Norda Majekodunmi, “Online Databases and the Research Experience for University Students with Print Disabilities,” Library Hi Tech 29, no. 1 (2011): 150, https://doi.org/10.1108/07378831111116976.

9 Dermody and Majekodunmi, “Online Databases and the Research Experience for University Students with Print Disabilities,” 156.

10 Dermody and Majekodunmi, “Online Databases and the Research Experience for University Students with Print Disabilities,” 156.

11 Dermody and Majekodunmi, “Online Databases and the Research Experience for University Students with Print Disabilities,” 156–57.

12 Dermody and Majekodunmi, “Online Databases and the Research Experience for University Students with Print Disabilities,” 151.

13 Dermody and Majekodunmi, “Online Databases and the Research Experience for University Students with Print Disabilities,” 144.

14 Blechner, “Improving Usability of Legal Research Databases for Users with Print Disabilities,” 142.

15 Blechner, “Improving Usability of Legal Research Databases for Users with Print Disabilities,” 139.

16 Blechner, “Improving Usability of Legal Research Databases for Users with Print Disabilities,” 145.

17 Blechner, “Improving Usability of Legal Research Databases for Users with Print Disabilities,” 138.

18 Blechner, “Improving Usability of Legal Research Databases for Users with Print Disabilities,” 140.

19 Jonathan Lazar, Daniel F. Goldstein, and Anne Taylor, Ensuring Digital Accessibility through Process and Policy (Amsterdam: Morgan Kaufmann/Elsevier, 2015), 150.

20 Lazar, Goldstein, and Taylor, Ensuring Digital Accessibility through Process and Policy, 153.

21 Jennifer Tatomir and Joan C. Durrance, “Overcoming the Information Gap: Measuring the Accessibility of Library Databases to Adaptive Technology Users,” Library Hi Tech 28, no. 4 (2010): 581.

22 Tatomir and Durrance, “Overcoming the Information Gap: Measuring the Accessibility of Library Databases to Adaptive Technology Users,” 581.

23 Laura DeLancey, “Assessing the Accuracy of Vendor-supplied Accessibility Documentation,” Library Hi Tech 33, no. 1 (2015): 104, https://doi.org/10.1108/LHT-08-2014-0077.

24 Blechner, “Improving Usability of Legal Research Databases for Users with Print Disabilities,” 168.

25 Blechner, “Improving Usability of Legal Research Databases for Users with Print Disabilities,” 147.

26 Lazar, Goldstein, and Taylor, Ensuring Digital Accessibility through Process and Policy, 155.
27 Nondiscrimination on the Basis of Disability; Accessibility of Web Information and Services of State and Local Government Entities, 81 Fed. Reg. 28,658 (May 9, 2016) (to be codified at 28 CFR pt. 35).

28 Christina Mune and Ann Agee, “Are E-books for Everyone? An Evaluation of Academic E-book Platforms’ Accessibility Features,” Journal of Electronic Resources Librarianship 28, no. 3 (2016): 181, https://doi.org/10.1080/1941126X.2016.1200927.

12041 ---- At the Click of a Button: Assessing the User Experience of Open Access Finding Tools

ARTICLES

At the Click of a Button: Assessing the User Experience of Open Access Finding Tools

Elena Azadbakht and Teresa Schultz

INFORMATION TECHNOLOGY AND LIBRARIES | JUNE 2020
https://doi.org/10.6017/ital.v39i2.12041

Elena Azadbakht (eazadbakht@unr.edu) is Health Sciences Librarian, University of Nevada, Reno. Teresa Schultz (teresas@unr.edu) is Social Sciences Librarian, University of Nevada, Reno.

ABSTRACT

A number of browser extension tools have emerged in the past decade aimed at helping information seekers find open versions of scholarly articles when they hit a paywall, including Open Access Button, Lazy Scholar, Kopernio, and Unpaywall. While librarians have written numerous reviews of these products, no one has yet conducted a usability study of these tools. This article details a usability study involving six undergraduate students and six faculty at a large public research university in the United States. Participants were tasked with installing each of the four tools as well as trying them out on three test articles. Both students and faculty tended to favor simple, clean design elements and straightforward functionality that enabled them to use the tools with limited instruction. Participants familiar with other browser extensions gravitated towards tools like Open Access Button, whereas those less experienced with other extensions preferred tools that load automatically, such as Unpaywall.

INTRODUCTION

While the open access (OA) movement seeks to make scholarly output freely accessible to a wide number of people, finding the OA versions of scholarly articles can be challenging. In recent years, several tools have emerged to help individuals retrieve an OA copy of articles when they hit a paywall. Some of the most familiar of these—Lazy Scholar, Open Access Button, Unpaywall, and Kopernio—are all free browser extensions. However, poor user experience can hamper the adoption of even free tools. Usability studies, particularly of academic websites and search tools, are prevalent in the literature, but as of yet no one has compared the user-friendliness of these extensions.

How Open Access Tools Work

All of the tools can be installed for free as a Google Chrome browser extension, and all four also work in Firefox. The idea is that when a user hits a paywall for an article, they can use the tool to search for an open version. Each works slightly differently:

Open Access Button (https://openaccessbutton.org/)—The OA icon will appear to the right of the browser’s search bar (see figure 1).
When a user clicks it, a new page will open that is either the open version of the article, if one is found, or a message saying the tool was not able to find an open version. The user is then given the option to write an email to the author asking that the article be made open.

Figure 1. The OAB icon appears as an orange padlock in the browser’s toolbar.

Lazy Scholar (http://www.lazyscholar.org/)—A horizontal bar will appear at the top of the page for any scholarly article (see figure 2). Along with other information, such as how many citations an article has and the ability to generate a citation for that article, PDF and/or file icons will appear in the middle of the bar if an open version is found. Users can then click on any of the icons to be taken to that open version. If no open version is found, no icons will appear; there is no text message indicating nothing has been found. A browser button is also installed, and users can click it to make the bar disappear and reappear.

Figure 2. The Lazy Scholar toolbar appears just below the browser’s search bar.

Kopernio (https://kopernio.com/)—A tab will appear in the bottom left corner of the screen for any scholarly article (see figure 3). If there is an open version, the tab will be dark green. If no article is found, the tab will be shorter and grey. If a user hovers over it, they will see a message indicating whether an open version was found. When a user clicks on the dark green tab, Kopernio automatically opens the article in its own viewer, called a locker, instead of the browser’s viewer. Unlike the other three tools, Kopernio requires users to register, and they can add their institution so Kopernio can check whether their institution has access to the article.

Figure 3. The Kopernio icon appears on the bottom left of the screen.

Unpaywall (https://unpaywall.org/)—A short tab will appear in the middle right of the screen for a scholarly article. When it has found an open version, the tab will turn bright green (see figure 4). When an open version has not been found, it will turn a light grey. Clicking on the grey tab will also open a message indicating an open version could not be found.

Figure 4. Unpaywall’s green padlock icon appears halfway down on the right side of the screen.
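Unpaywall is also backed by a public REST API that can be queried directly by DOI, which is useful for understanding what the browser tab is reporting. A minimal sketch of such a lookup follows (our illustration, reflecting the API as documented at the time of writing; the DOI and email address are placeholders):

```python
import requests

def find_oa_copy(doi, email):
    """Ask the Unpaywall API whether it knows of an open version of a DOI."""
    resp = requests.get(f"https://api.unpaywall.org/v2/{doi}",
                        params={"email": email}, timeout=30)
    resp.raise_for_status()
    record = resp.json()
    location = record.get("best_oa_location")
    if record.get("is_oa") and location:
        # Prefer a direct PDF link when one is recorded.
        return location.get("url_for_pdf") or location.get("url")
    return None  # mirrors the grey "no open version found" tab

print(find_oa_copy("10.1234/example.doi", "you@example.edu"))  # placeholders
```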
LITERATURE REVIEW

The Need for Open Access Finding Tools

Although OA helps take down financial barriers to accessing the scholarly literature, there is no one place to deposit content in order to make it OA. The Registry of Open Access Repositories, a database of both institutional and subject repositories, shows 4,725 repositories.1 No central database exists that searches every possible location for OA material, which means discovery of OA content remains difficult. Willi Hooper noted that “making repository content findable is a major challenge facing libraries.”2 Nicholas et al. found in their study of international early-career researchers that most rely on Google and Google Scholar to find scholarly articles and that one of their main goals is to find the full text as fast as possible.3 Google Scholar does include OA versions of articles, but this is not always readily obvious; users may have to click through each version of an article until they find an OA one. Dhakal also notes that search engines do not always aggregate content in institutional repositories on a consistent basis.4 Joe McArthur, one of the founders of the Open Access Button, said he decided to invent it after hitting paywalls after graduating.5

OAFT Reviews and Other Research

Most of the scholarly literature on OAFTs has focused on reviews of specific tools. Unpaywall has received a number of positive reviews,6 and both Open Access Button7 and Kopernio8 have received several as well. Dhakal has noted that Unpaywall “helps take the guesswork out of accessing OA articles.”9 Reviewers have generally found the tools easy to use, although some have included criticism. For instance, Rodriguez found that Open Access Button can result in odd error messages and false negatives (that is, not finding an open access version that actually does exist), although he liked the tool overall.10 Little research has looked at how well the tools work and how usable they are, however. Regier informally investigated why Unpaywall and similar tools do not always find articles that are open and noted that one likely problem is that publishers of OA journals do not always upload their license information to Crossref, one of the sites that Unpaywall relies on.11 Schultz et al. looked at how many OA versions the tools found in comparison to Google Scholar. None of the tools found as many as Google Scholar, although Lazy Scholar, Unpaywall, and Open Access Button all compared favorably to it, and each tool found at least some open versions that no other tool did.12
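Regier’s observation about missing license metadata can be checked directly: Crossref exposes whatever license information a publisher has deposited for a DOI through its public works API. A minimal sketch follows (our illustration, not drawn from the studies above; the DOI is a placeholder):

```python
import requests

def crossref_licenses(doi):
    """Return the license entries Crossref holds for a DOI, if any."""
    resp = requests.get(f"https://api.crossref.org/works/{doi}", timeout=30)
    resp.raise_for_status()
    # An empty list here is one reason a finding tool may miss an OA article:
    # without a deposited license, there is no machine-readable OA signal.
    return resp.json()["message"].get("license", [])

for entry in crossref_licenses("10.1234/example.doi"):  # placeholder DOI
    print(entry.get("URL"), entry.get("content-version"))
```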
Overduin describes a think-aloud, task-based study of the California State University Bakersfield Walter W. Stiern Library website that concluded it was important for libraries to consider the preferences of both new and returning users when redesigning their websites.16

While most of the literature involves usability studies of library websites, online catalogs, and discovery layers, librarians have also evaluated other academic products and tools. In 2015, Imler, Garcia, and Clements investigated pop-up chat reference widgets, such as those available through SpringShare's LibChat software.17 Librarians at Penn State University interviewed thirty students across three campuses, asking them to interact with a chat widget. The vast majority of students did not find the pop-up widget annoying, and many agreed that they would be more likely to use chat reference if they encountered it on the website. In addition, the participants preferred to have at least a ten-second delay between the loading of the webpage and the appearance of the pop-up, with an average ideal time of about fourteen seconds.18

Haggerty and Scott evaluated the usability of an academic library search box and its specialized search features through task-based interviews with twenty participants, most of whom were students.19 Most of the study's participants indicated a preference for a simplified search box, though some were reluctant to lose access to the specialized search tabs. At around the same time, Beisler, Bucy, and Medaille conducted a primarily task-based usability study of three streaming video databases to determine how patrons were using them.20 The students showed a preference for intuitive interfaces, whereas the faculty were concerned with the videos' metadata and descriptions as well as the accessibility and shareability of the content. The databases' advanced features were used less successfully. The results suggest that vendors would benefit from making navigation simpler and terminology clearer while enhancing search functionality.

METHODOLOGY

In keeping with usability testing best practices, this study involved twelve subjects in total: six students and six faculty members at the University of Nevada, Reno (UNR).21 The authors sought subjects from a diverse set of science and social science disciplines. Recruitment efforts consisted of fliers and targeted emails, some sent directly by the authors to faculty members in their liaison areas and others distributed to students and faculty by liaison librarian colleagues. Interested students were directed to a simple Qualtrics form that asked for their name, major, class standing, contact information, and whether they had ever used any of the four tools before. The student participants each received a $15 Amazon gift card; faculty did not receive any compensation. The study was approved by UNR's Institutional Review Board (project number 1452303-2).

Faculty interviews took place in September 2019, and student interviews took place in November 2019. The usability testing took place in three private conference rooms within the main library on a university-owned laptop running Microsoft Windows 10 and the Chrome browser. Participants were asked if they were regular users of Chrome, and all were to some degree, with several indicating that they use it exclusively. Both authors were present at all of the tests, alternating who walked the participants through the various tasks and who took notes.
The Screencast-O-Matic screen capture software recorded the participants' audio and video as well as their movements on the computer screen. Referring to a script, the authors asked the participants to install each browser extension and use them, in turn, to find three scholarly articles from journals not available to the UNR community but recently requested through the Libraries' interlibrary loan service. The authors varied the order in which participants installed the four tools: half of the participants started by installing Open Access Button, followed by Lazy Scholar, Unpaywall, and Kopernio, whereas the other half installed them in reverse order, beginning with Kopernio. The authors uninstalled the tools between usability tests.

The three journal articles were selected with the assistance of UNR Libraries' Coordinator of Course Reserves and Document Delivery Services. The study purposely included articles that the university did not have access to in order to ensure that none of the tools found "open versions" simply because of the libraries' subscriptions. Two were findable by all four OA finding tools and one was a "planned fail" that could not be retrieved by any of them, allowing the authors to observe how participants responded to this failure on the part of the tools. Participants decided how long they were willing to spend figuring out a particular tool or finding the full text of an article. If, after a certain point, they deemed an article unfindable or realized that, in another setting, they would have given up on a tool, they could tell the interviewer that they wished to move on to the next task. Finally, the authors ended each interview by asking the participants to expand upon any stray observations, share any final thoughts, and name which of the tools, if any, they would consider using in the future.

Each of the two authors reviewed half of the recordings and any notes, documenting key, de-identified information in a shared Google Sheet. They kept track of how long it took participants to install each of the four OA finding tools and whether they succeeded in locating the three articles—or, in the case of the planned fail, whether they successfully determined it was inaccessible. They also noted any issues the participants experienced and any comments they made. The authors then met and coded all the information gleaned from the twelve usability tests together.

Limitations

This study included only twelve participants, all from the same institution. The authors know the faculty participants, as they recruited faculty directly from the disciplines with which they routinely work. Moreover, these faculty are all considered early or mid-career. While this was intentional, as the authors wanted to focus on that sub-population of researchers, it may have affected the faculty members' impressions of the tools. Likewise, the participants' comfort with technology, particularly their ability to learn new technology on the fly, and their prior experience using other browser extensions or research productivity tools were not formally assessed prior to testing. These skills may have influenced how quickly participants were able to figure out a particular tool and how long they tried to find the full text of an article before giving up.
RESULTS

Installation

All participants successfully installed each of the four tools, and most took around the same time to install each tool, with none taking more than 90 seconds. The longest installation, 84 seconds for Kopernio, was connected to a technical issue that occurred during the installation. Most participants seemed to have an easy time installing the tools, although some noted that they found certain tools easier to download than others. For instance, Faculty 1 thought Lazy Scholar was easier to install than Open Access Button, and Faculty 3 said both Open Access Button and Unpaywall "were pretty smooth." Student 2 liked it when there was an obvious "Install now"–type button, saying, "That's pretty convenient."

When participants did struggle, it was usually with Open Access Button and Lazy Scholar. Participants did not always realize right away which button on Open Access Button's website would download the tool. Other times, participants were not sure whether the tool had installed, not noticing the new button on the Chrome browser bar. For Lazy Scholar, one participant, Student 4, noted that it seemed to take more clicks to install, and two participants received an error message, although both were able to install the tool successfully on a second try. Kopernio also produced several error messages for one participant when creating an account, and the authors had to use one of their own accounts to allow the participant to continue with the study.

Ability to Use the Tools

When looking at whether participants were able to successfully use each tool on each of the three sample articles, we determined that participants were most successful using Unpaywall. All participants successfully used the tool on Article 1 and Article 3. Because Article 2 was the planned fail, in that no tool was able to locate it, participants who did not realize this and continued to try to use a tool to find it were deemed to have "failed" that particular task. By this measure, one faculty member failed on the second article while using Unpaywall. Lazy Scholar and Open Access Button each had a total of eight fails, with two participants—a faculty member and a student—failing on all three articles, and another student failing on the first two before successfully using the tool on the third article. Kopernio had a total of ten fails, with two participants failing on all three articles and two others failing on the first two. All four of these participants were faculty members.

In some cases of failure, participants would either try to find instructions for the tools or try clicking around on the screen and then following various links to see if they could successfully use the tools. In other cases, participants gave a cursory search for the tool but stopped after a short period of time. Article 2, the planned fail, also caused confusion for participants. For instance, one faculty participant seemed to think that Open Access Button had a technical glitch and looked to the instructions to see if they could troubleshoot it. Others never seemed certain whether the tool was working incorrectly or the article just was not available.

Another issue came with the article version that Lazy Scholar returned for Article 3.
Unlike the other instances, when the tool took users directly to the article file, in this case Lazy Scholar took participants to the record page for the article in a scholarly repository. Participants could then click on a link for the full text, and it took several participants a few tries of clicking around on the page before finding the correct link. Student 2 noted, "I expected it to pull up the PDF like all the others did." Another student stopped at the record page, not realizing they could click one more link to get the full text, which was considered a fail.

Themes

Several themes emerged during usability testing. A major one was the design of the various extensions, encompassing their aesthetics and on-screen behavior. Other themes include the usefulness of each tool's instructions and additional features, as well as how participants' experience with other browser extensions shaped their expectations of the four tools.

Design

As with most usability studies, certain design choices determined how successful students and faculty were at finding the three test articles and how they felt about the experience. Participants gravitated toward simple, clean designs and faltered or expressed displeasure whenever they encountered extension elements that appeared overloaded with information or details. Several participants, for instance, thought that Lazy Scholar's toolbar was clunky or too cluttered looking, even when they successfully used it to find the test articles. There were too many options embedded in the toolbar, which caused confusion, and its small font also proved problematic for a majority of the students and faculty. Conversely, several participants said that they appreciated Unpaywall's minimalism, and many turned to it first when instructed to use the four tools to find the three articles. "This one is the most obvious one," Faculty 1 stated. Many also responded positively to Open Access Button's neat-looking icon and simplicity. Kopernio's design led to a mix of user experiences. While participants seemed to appreciate its clear-cut, dark green icon, some of its other features—the search box and the storage "locker"—created unnecessary clutter.

Throughout testing, participants also expressed mixed views of tools that featured an automatic pop-up as a means of indicating that a free version of an article was or was not available. Lazy Scholar, Unpaywall, and Kopernio all involve some version of this design choice. Open Access Button behaves like other commonly used browser extensions, such as Zotero, in that the extension remains inactive until clicked. The participants' stated preferences did not always align with their behavior. Some participants did not like that the pop-ups appeared without prompting and that they blocked parts of the computer screen. "I like things that go away," explained Faculty 3. Faculty 6 noted that they preferred Open Access Button because it did not load automatically and because it opened in a new, separate window. What's more, those participants who had experience using other browser extensions were not expecting the pop-ups and first tried clicking on the tools' icons embedded in the browser bar. This happened more often with Lazy Scholar and Kopernio than it did with Unpaywall, but all three tools experienced it.
However, several who said that they found pop-ups "annoying" or "distracting" were nevertheless able to use the tools to quickly find free versions of the test articles. This discrepancy was especially evident in the case of Unpaywall, which almost everyone used successfully and with apparent ease.

A tool's placement on the screen was likewise one of the key aspects of design that made it either easier or more difficult to use during usability testing. Unpaywall's tab sits on the middle-right side of the computer screen, whereas Kopernio's green "K" tab appears toward the bottom of the screen. Sometimes the Kopernio icon would disappear entirely after a few seconds, reappearing only after the page had been reloaded. Kopernio's location was especially problematic because most participants are not accustomed to needing to scroll or look to the bottom of a webpage. Moreover, needing to scroll is "not convenient," explained Student 3. This appeared related to at least some of the failures that participants had with the tool. Kopernio's design did improve somewhat midway through usability testing: the icon became highlighted when the webpage first loaded and stopped dropping to the bottom of the page. However, some participants still missed the icon on their initial use of Kopernio. Student 2 said afterward that "Unpaywall is definitely easier to use, because its pop-up button stayed up." Lazy Scholar's toolbar also proved a stumbling block for several participants. Some did not notice it at first, whereas others were not sure where within the toolbar they needed to click to retrieve the article, even though this is indicated by a standard PDF or HTML icon.

The use of color also affected participants' success with the tools, particularly Unpaywall. Unpaywall's lock icon turns bright green when the tool has found an open version of the article and grey when it has not. Both faculty and students appreciated this simple status indicator. "I recognized the little green button," said Student 4. For users with color vision deficiency, however, this favorite feature could be problematic. Users can click on the icon whether or not they can differentiate between its two states, but some convenience is lost. The Kopernio icon's darker green is likewise an issue for those with some forms of color blindness. Open Access Button's and Lazy Scholar's color choices garnered less comment.

Prior Experience with Browser Extensions

Another factor that influenced how participants interacted with a particular tool, and how intuitive they ultimately found it, was their prior experience with other browser extensions. Several participants indicated that they used other browser extensions in their everyday lives. Specifically, this knowledge appeared to affect their success with Open Access Button, which behaves like most browser extensions in that it does not launch automatically. Faculty 2 said that using Open Access Button "felt the most natural," and Faculty 3 said, "Most other browser extensions I've used, when you want it to do the thing, you click it." Some participants who had less experience with other browser extensions still managed to use Open Access Button successfully, though it took them slightly longer to do so.
However, a few participants failed to use the tool at all during testing, having given up when they could not determine how it worked.

Instructions

Participants expressed a desire for simple, straightforward instructions and were more likely to read instructions that seemed succinct and easy to follow. They were also more likely to try out a tool just after installing it if the tool's instructions were clear and provided an example they could use to see the tool in action. Unpaywall's instructions do this particularly well, as they consist of minimal text on a large image of how the tool works. Open Access Button and Kopernio both provided instructions and examples that helped mitigate some of the issues participants had with them. For example, those who tried out Open Access Button's example before attempting to find the test articles—or who referred back to the instructions when they encountered a problem—were more likely to use it successfully, even if their prior experience with traditionally designed browser extensions was limited. Kopernio's instructions highlight where the icon appears, which primed the participants to later look toward the bottom of the screen for it. Although this did not prevent confusion when using Kopernio (as noted previously), it did reduce it. Lazy Scholar's instructions, on the other hand, are quite detailed and are written in a very small font. This combination intimidated the participants, many of whom chose to quickly move on to the next task. Some scanned the instructions, but none read through them.

Additional Features

Three of the four tools—Lazy Scholar, Kopernio, and, to a limited extent, Open Access Button—offer additional features, including a way to contact article authors, integration with citation management tools, article metrics, and file storage space. However, many participants did not take note of the tools' ability to do things other than find open versions of scholarly articles, and their enthusiasm for these options varied. This is likely due in part to the usability tests' focus on the tools' core function. More of the students responded positively to the tools' extra features than did the faculty.

Lazy Scholar's and Kopernio's extra features received the most attention. Two students responded positively to Lazy Scholar's cite option (a Mendeley integration) in particular; for Student 5, it made Lazy Scholar stand out. Participants also tried out Kopernio's Google Scholar–powered search box when they had trouble locating and using its pop-up tab. A few students indicated that they would consider using this feature again to find related articles. However, those participants who came across mentions of Kopernio's article storage tool, known as a "locker," either expressed confusion over its purpose—"Locker? What locker?" wondered Faculty 5—or were simply not interested in learning more about it. Others said they did not need storage space of this kind. "I don't get the metaphor. My hard drive is my locker," noted Faculty 2.

Favorites

When asked which, if any, of the tools they preferred and would consider using, eight of the participants said Unpaywall, followed by seven who said Open Access Button (see table 1). Four said they liked Lazy Scholar, and two said they liked Kopernio, although two participants said they specifically would not use Kopernio, and two said the same of Lazy Scholar.
It is important to note that many of the participants named multiple tools, suggesting that they saw the need to rely on more than just one.

Table 1. Breakdown of preference for OA finding tool by faculty and students.

Participant Group    Open Access Button    Unpaywall    Lazy Scholar    Kopernio
Faculty              3                     4            1               1
Students             4                     4            3               1

DISCUSSION

Keep it Simple

The results show that users most preferred simplicity, including in the instructions for downloading and using the tools. For example, participants seemed to have the easiest time downloading and trying out Unpaywall because of how large and obvious its download button was, and how minimal and large its instructions were. In comparison, participants also seemed to like Lazy Scholar's large and easy-to-see download button but disliked its long instructions, which were in a smaller font. As most participants did look at the instructions for Unpaywall, it is clear they do find instructions helpful, as long as the instructions can be read and understood in just a few seconds. This also seemed to be the reason why some participants struggled to find Open Access Button's icon. Although the site does provide an instructional image similar to Unpaywall's, it is smaller and does not do as good a job of pointing out the button's location. Some participants took a moment to look at the image but failed to notice what it was trying to highlight.

Likewise, participants, especially faculty, seemed to prefer the tools with a simple and clean design. The added features of Lazy Scholar were not worth the space the toolbar took up on the page. A few participants even remarked negatively on the size of the Kopernio pop-up tab, saying it blocked too much of the screen. Although a few at first remarked negatively on the Unpaywall tab, several said that by the end of the study the tab no longer bothered them and that its usefulness outweighed its obtrusiveness.

Do Not Assume Prior Experience

Most participants who figured out how to use Open Access Button seemed to like it; however, several struggled to find it to begin with. Part of this might be because the other tools, all of which use a pop-up tab, had conditioned them to look for something similar. However, several participants noted that they were not familiar with browser extensions, which likely affected their ability to find the tool in the browser bar; they would try clicking directly on the article homepage screen. Providing clear and obvious instructions would likely help ameliorate this issue.

Extra Features Not Always Worthwhile

Overall, participants did not seem interested in the extra features, especially Kopernio's locker and Open Access Button's option to email the author. And for faculty, the additional features of Lazy Scholar, including citation information and similar articles, proved to be a negative. However, some students did seem interested in these features, meaning this tool might be better suited to those who are still new to information discovery.

CONCLUSION

Although participants' reactions to certain design elements, such as pop-ups and the finding tools' additional features, were mixed, most participants were able to use the four browser extensions successfully. The tools' location on the computer screen and their similarity (or dissimilarity) to other browser extensions influenced success rates.
Likewise, clean, simple design elements and straightforward instructions enhanced participants' experience with the four tools. Even though more of the students and faculty said they preferred Unpaywall and Open Access Button, each of the four tools appealed to at least some of the participants. Both students and faculty were excited to find out about these tools, and some even expressed surprise that they are freely available. Many seemed open to the idea of using more than one tool, which can be helpful given each extension's distinctive approach to finding and retrieving articles. However, having participants use four tools at once also appeared to create issues, as at least some of them confused which tool was which. Librarians and other OA advocates can use the information from this study to help guide potential users to the tools that best suit their individual preferences and comfort level with similar technologies. Increased promotion will ramp up adoption of the tools by a more diverse pool of users, which will ultimately generate the feedback needed to make the extensions more intuitive overall.

ENDNOTES

1 Registry of Open Access Repositories, "Welcome to the Registry of Open Access Repositories," 2019, http://roar.eprints.org/.

2 Michaela D. Willi Hooper, "Product Review: Unpaywall [Chrome & Firefox Browser Extension]," Journal of Librarianship & Scholarly Communication 5 (January 2017): 1–3, https://doi.org/10.7710/2162-3309.2190.

3 David Nicholas et al., "Where and How Early Career Researchers Find Scholarly Information," Learned Publishing 30, no. 1 (January 1, 2017): 19–29, https://doi.org/10.1002/leap.1087.

4 Kerry Dhakal, "Unpaywall," Journal of the Medical Library Association 107, no. 2 (April 15, 2019): 286–88, https://doi.org/10.5195/jmla.2019.650.

5 Eleanor I. Cook and Joe McArthur, "What Is Open Access Button? An Interview with Joe McArthur," The Serials Librarian 73, no. 3–4 (November 17, 2017): 208–10, https://doi.org/10.1080/0361526X.2017.1391152.

6 Terry Ballard, "Two New Services Aim to Improve Access to Scholarly PDFs," Information Today 34, no. 9 (November 2017): Cover–29; Chris Bulock, "Delivering Open," Serials Review 43, no. 3–4 (October 2, 2017): 268–70, https://doi.org/10.1080/00987913.2017.1385128; Dhakal, "Unpaywall"; E. E. Gering, "Review: Unpaywall," May 24, 2017, https://eegering.wordpress.com/2017/05/24/review-unpaywall/; Barbara Quint, "Must Buy? Maybe Not," Information Today 34, no. 5 (June 2017): 17; Michael Rodriguez, "Unpaywall," Technical Services Quarterly 36, no. 2 (April 3, 2019): 216–17, https://doi.org/10.1080/07317131.2019.1585002; Willi Hooper, "Product Review."

7 Quint, "Must Buy?"; Michael Rodriguez, "Open Access Button," Technical Services Quarterly 36, no. 1 (January 2, 2019): 101–2, https://doi.org/10.1080/07317131.2018.1532043.

8 Ballard, "Two New Services Aim to Improve Access to Scholarly PDFs"; Matthew B. Hoy, "Kopernio," Journal of the Medical Library Association 107, no. 4 (October 1, 2019): 632–33, https://doi.org/10.5195/jmla.2019.805.

9 Dhakal, "Unpaywall."

10 Rodriguez, "Open Access Button."
11 Ryan Regier, "How Much Are We Undercounting Open Access? A Plea for Better and Open Metadata," A Way of Happening (blog), May 1, 2019, https://awayofhappening.wordpress.com/2019/05/01/how-much-are-we-undercounting-open-access-a-plea-for-better-and-open-metadata/.

12 Teresa Auch Schultz et al., "Assessing the Effectiveness of Open Access Finding Tools," Information Technology and Libraries 38, no. 3 (September 2019): 82–90, https://doi.org/10.6017/ital.v38i3.11009.

13 Barbara A. Blummer, "A Literature Review of Academic Library Web Page Studies," Journal of Web Librarianship 1, no. 1 (June 21, 2007): 45–64, https://doi.org/10.1300/J502v01n01_04.

14 Blummer, "A Literature Review," 49–51.

15 Sarah Guay, Lola Rudin, and Sue Reynolds, "Testing, Testing: A Usability Case Study at University of Toronto Scarborough Library," Library Management 40, no. 1/2 (January 1, 2019): 88–97, https://doi.org/10.1108/LM-10-2017-0107.

16 Terezita Overduin, "'Like a Robot': Designing Library Websites for New and Returning Users," Journal of Web Librarianship 13, no. 2 (April 3, 2019): 112–26, https://doi.org/10.1080/19322909.2019.1593912.

17 Bonnie Brubaker Imler, Kathryn Rebecca Garcia, and Nina Clements, "Are Reference Pop-up Widgets Welcome or Annoying? A Usability Study," Reference Services Review 44, no. 3 (2016): 282–91, https://doi.org/10.1108/RSR-11-2015-0049.

18 Imler, Garcia, and Clements, "Are Reference Pop-up Widgets Welcome or Annoying," 287–89.

19 Kenneth C. Haggerty and Rachel E. Scott, "Do, or Do Not, Make Them Think?: A Usability Study of an Academic Library Search Box," Journal of Web Librarianship 13, no. 4 (October 2, 2019): 296–310, https://doi.org/10.1080/19322909.2019.1684223.

20 Amalia Beisler, Rosalind Bucy, and Ann Medaille, "Streaming Video Database Features: What Do Faculty and Students Really Want?," Journal of Electronic Resources Librarianship 31, no. 1 (January 2, 2019): 14–30, https://doi.org/10.1080/1941126X.2018.1562602.

21 Ritch Macefield, "How to Specify the Participant Group Size for Usability Studies: A Practitioner's Guide," Journal of Usability Studies 5, no. 1 (2009): 34–35; Nielsen Norman Group, "How Many Test Users in a Usability Study?," accessed December 24, 2019, https://www.nngroup.com/articles/how-many-test-users/; Robert A. Virzi, "Refining the Test Phase of Usability Evaluation: How Many Subjects Is Enough?," Human Factors 34, no. 4 (August 1, 1992): 457–68, https://doi.org/10.1177/001872089203400407.
Measuring the Impact of Digital Heritage Collections Using Google Scholar

Ángel Borrego

INFORMATION TECHNOLOGY AND LIBRARIES | JUNE 2020
https://doi.org/10.6017/ital.v39i2.12053

Ángel Borrego (borrego@ub.edu) is Associate Professor, Universitat de Barcelona (Spain).

ABSTRACT

This study aimed to measure the impact of digital heritage collections by analysing the citations received in scholarly outputs. Google Scholar was used to retrieve the scholarly outputs citing Memòria Digital de Catalunya (MDC), a cooperative, open-access repository containing digitized collections related to Catalonia and its heritage. The number of documents citing MDC has grown steadily since the creation of the repository in 2006. Most citing documents are scholarly outputs in the form of articles, proceedings and monographs, and academic theses and dissertations. Citing documents mainly pertain to the humanities and the social sciences and are in local languages. The most cited MDC collection contains digitized ancient Catalan periodicals. The study shows that Google Scholar is a suitable tool for providing evidence of the scholarly impact of digital heritage collections. Google Scholar indexes the full text of documents, facilitating the retrieval of citations inserted in the text or in sections other than the final list of references. It also indexes document types, such as theses and dissertations, that contain a significant share of the citations to digital heritage collections.

INTRODUCTION

In recent years, many libraries have been devoting a large amount of resources, in terms of staff, equipment, and infrastructure, to digitizing their special collections and making them available on the web. The European Union "has invested €265 million in research and innovation for advanced digitisation technologies, digital curation and innovative cultural projects."1 In most cases, the purpose of these initiatives is twofold: to facilitate access for scholars, and for the wider public, to rare and important materials, and to enhance long-term preservation. Despite the benefits and value derived from these initiatives, there are few examples of evaluation of their impact.
Much of the existing evidence remains anecdotal, and the culture of assessment "has not yet penetrated digitization and digital collection building activities to nearly the same extent as many other areas of research library activity."2 Most previous research aimed at assessing the results of digitization projects has focused on issues such as interface design, usability, or users' information behaviour, whereas "studies about the impact of digital collections have not been conspicuous in the field."3 A meta-analysis of 41 evaluations of Europeana revealed that system-centered evaluations prevailed over user-centered evaluations and "only a marginal number of studies tried to assess the impact of Europeana on different stakeholders."4 More recently, a survey conducted by LIBER's working group on Digital Humanities and Digital Cultural Heritage showed that "digital humanities work within libraries is currently undergoing limited evaluation," with over half of the respondents not conducting any specific assessment.5 The report recommended that research libraries measure their achievements and impact, not only to make decisions, prove success, and support arguments for resources, but also to provide new ways for academics to value the library.

According to Shaw, comprehensive assessment of digital collections requires a combination of methodological approaches, including statistics and surveys, user and usability studies, and web-based analytics.6 The latter would encompass citation analysis, a method that could be employed to assess the reach and impact of digitized collections in a similar fashion to its use as a metric of the impact of scholarly works. Unfortunately, citation information for digitized collections is hard to capture due to the lack of specific guidelines in standard citation formats for these materials, resulting in inconsistent citation practices.

The impact of digitization projects is sometimes measured in terms of visiting statistics, which are used as a proxy to evaluate the effectiveness of the resources devoted to digitizing heritage collections. Biswas and Marchesoni, at the Hunter Library, were among the first authors to employ web analytics to obtain usage data from digital collections.7 Although enlightening, usage figures alone do not illuminate the reasons for usage and its impact, since download statistics do not indicate whether users find digital collections useful for learning, teaching, research, or leisure purposes.

In a different approach, Sinn conducted a bibliometric study aimed at determining the relationship between digital resources and historical research.8 She analyzed references and figures in articles published in American Historical Review to observe how frequently and widely digital collections were used. She found that secondary materials were the most frequently employed digitized resource, with archival materials coming in second place. Digital archival materials were more frequently mentioned in figures than in citations, proving the difficulty of compiling citation information for digitized collections.

The present study aimed to emulate Sinn's pioneering study on the use of citation analysis to measure the impact of digitized heritage collections, expanding our understanding of how digital heritage collections are used for academic and scholarly purposes.
To do this, we introduced two changes to Sinn's design, relating to the population of citing documents and the tool employed to retrieve the citations.

First, instead of analyzing the references in a sample of journals in a given field to identify citations to digital collections, we retrieved all the citations to a specific digital collection that was used as a case study. Therefore, we did not measure how digital collections are cited in journals in a given discipline, but how scholarly outputs in different formats and disciplines cite a specific digital heritage collection. The collection used as a case study is Memòria Digital de Catalunya (MDC, http://mdc1.csuc.cat/en), a cooperative open-access repository containing digitized collections related to Catalonia and its heritage. The project is promoted by the universities of Catalonia and the Biblioteca de Catalunya, with the participation of other Catalan institutions.

Second, we used Google Scholar to retrieve the citations to MDC. The rationale for this choice was to obtain an accurate picture of the diversity of research outputs and disciplines in which digitized collections are used. Previous research has shown Google Scholar to be reliable and to have good coverage of the diversity of disciplines, languages, and document types in the humanities and the social sciences, where usage of heritage collections is expected to be highest.9

The study aimed to address two questions:

1. To what extent are MDC collections being used in the creation of scholarly outputs?
2. Is Google Scholar a useful tool to measure the impact of digital heritage collections?

METHODOLOGY

On February 23, 2019, we searched Google Scholar for documents including a web reference to MDC in the full text. The server hosting MDC has changed its URL several times since the inauguration of the service in 2006, so we used six queries to retrieve as many records as possible: mdc.cbuc.cat, mdc.csuc.cat, mdc1.cbuc.cat, mdc2.cbuc.cat, mdc1.csuc.cat, and mdc2.csuc.cat. All queries except the last retrieved some records, giving a total of 366 results.
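For reproducibility, the six queries can be expressed as ordinary Google Scholar search URLs. The sketch below is a hypothetical illustration of how such query strings might be assembled; Google Scholar offers no official API, so the searching itself was done manually through the web interface.

    from urllib.parse import quote_plus

    # The six host names under which MDC has been served since 2006.
    HOSTS = ["mdc.cbuc.cat", "mdc.csuc.cat", "mdc1.cbuc.cat",
             "mdc2.cbuc.cat", "mdc1.csuc.cat", "mdc2.csuc.cat"]

    for host in HOSTS:
        # Quoting the host name searches for it as an exact phrase.
        query = quote_plus('"%s"' % host)
        print("https://scholar.google.com/scholar?q=" + query)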
For each record, we accessed the full text of the document. At this stage, we removed 42 duplicates, i.e., copies of the same document hosted on two servers that Google Scholar had not been able to match. Additionally, two records were no longer available at the URL listed in Google Scholar and were removed from the analysis, leaving 322 citing documents.

In order to download the records, we used the "My library" feature in Google Scholar. This service allows users to export records in four formats: BibTeX, EndNote, RefMan, and CSV. Exported records had eight fields, although the level of completion differed by field. "Authors" and "title" were provided for all the records, but the level of completion was lower for the "publication" (53 percent), "volume" (24 percent), "number" (33 percent), "pages" (44 percent), "year" (85 percent), and "publisher" (34 percent) fields. In order to analyse the evolution in the number of citations by year of publication, we manually retrieved the year of publication for 46 additional documents, thus covering 99 percent of the records.

For each of the 322 citing documents, we searched the full text for the citation to MDC. However, 48 documents were behind a paywall and we were unable to access them. As a result, the population of citing documents for the second part of the analysis, on cited MDC collections, was reduced to 274 documents.

Most citing documents included a single reference to MDC but, in some cases, the number of citations to MDC was higher, with an extreme case of an analysis of medical cartoons citing 323 resources in MDC. In order to analyse the results, when one document cited different resources in a single MDC collection, we counted this as a single citation. This was, for instance, the case for the medical cartoons example, since all references cited the same magazine. However, when a single document cited different MDC collections, we counted as many citations as there were collections cited. For instance, if a document cited a digitized resource in a parchment collection plus an article belonging to a collection of digitized magazines, we counted two citations. In total, the 274 citing documents contained 313 citations.
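This counting rule amounts to tallying distinct pairs of citing document and cited collection. A minimal sketch of the logic, using invented identifiers for illustration:

    # One citation per distinct (citing document, MDC collection) pair.
    # Document and collection names below are invented examples.
    citations = [
        ("doc1", "ARCA"), ("doc1", "ARCA"),        # same collection twice -> 1 citation
        ("doc2", "Parchments"), ("doc2", "ARCA"),  # two collections -> 2 citations
    ]
    print(len(set(citations)))  # 3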
Citations were made at very different levels. In some cases, authors referred to the whole MDC website, citing the generic URL of the platform. In other cases, they cited a specific collection in MDC such as "Incunabula" or "Manuscripts." Some references cited a certain digitized magazine whereas, in other cases, they referred to a specific issue or an individual article in a magazine. Additionally, many links to MDC were broken, forcing us to gather much information manually.

In order to compare the coverage of Google Scholar with that of Scopus, we also searched Scopus using the "Reference website" option in the advanced search. This option retrieves the URL of a website in a cited reference. We used the same six queries previously employed in Google Scholar. We found only twelve records when searching for mdc.cbuc.cat, whereas the other five queries did not retrieve any results.

RESULTS

Citing documents: chronological evolution, document types, disciplines, languages, and coverage of Google Scholar compared to Scopus

The number of documents citing MDC has grown steadily since its creation in 2006, reaching a maximum of 61 documents published in 2016 citing the repository (see figure 1).

Figure 1. Documents citing MDC by year of publication, January 2006–February 2019 (n = 319).

Documents citing MDC can be broadly classified into two large categories: published scholarly outputs, such as journal articles, proceedings, or monographs (68 percent of the citing documents), and academic theses and dissertations (25 percent), with the remaining 8 percent of the citing documents including reports, open educational resources (OERs), course syllabuses, conferences, etc. (see table 1).

Table 1. Typologies of Documents Citing MDC (n = 322)

                                                                 n (%)
Scholarly outputs          Journal articles                      167 (52%)
                           Proceedings                           22 (7%)
                           Book chapters                         20 (6%)
                           Books                                 9 (3%)
Theses and dissertations   Undergraduate                         32 (10%)
                           Postgraduate                          25 (8%)
                           PhD                                   22 (7%)
Other                      Academic essays, conferences,
                           datasets, open educational resources
                           (OERs), reports, syllabuses, etc.     25 (8%)

More than half of the documents citing MDC were journal articles. These articles had been published in 126 different journals. In order to determine the disciplines of the journals citing MDC, we searched for them in Ulrich's Periodicals Directory. Figure 2 shows the subjects of the 148 articles published in journals indexed in Ulrich's Periodicals Directory that we were able to retrieve; the remaining 19 articles had been published in journals not indexed in Ulrich's Periodicals Directory.

Figure 2. Disciplines of the Articles Citing MDC (n = 148)

As shown in figure 2, half of the articles citing MDC were published in two humanities disciplines: linguistics and literature (26 percent) and history (25 percent). Additionally, a third of the articles (34 percent) had been published in journals in different fields in the social sciences and humanities, including education (6 percent), music (4 percent), art (3 percent), religion (2 percent), anthropology (1 percent), philosophy (1 percent), and political science (1 percent), among others. There were also a few cases of citing articles published in journals in the experimental, health, and natural sciences (3 percent), usually in studies employing a historical approach to these disciplines. In all cases, the authors of the articles had used MDC as a primary source for their research. Finally, 15 percent of the articles citing MDC had been published in library and information science journals. In most cases, these articles had been written by librarians who described the features of MDC to a professional audience in their field.

Consistent with the high number of outputs in the humanities and the social sciences, most citing documents were in local languages: 48 percent had a title in Catalan and 35 percent in Spanish. The remaining outputs were in English (11 percent) or in other languages (7 percent).

When we looked at the coverage of the citing journals in Scopus, we observed that 37 (29 percent) of the 126 citing journals were indexed by the database. These 37 journals had published 49 articles citing MDC. These results were inconsistent with those obtained when searching Scopus for MDC references, when we were able to retrieve just twelve records. We followed up these discrepancies, i.e., the articles published in journals indexed in Scopus whose MDC citations were retrieved in Google Scholar but not in Scopus, and in most cases, the reason for the discrepancy was that the MDC citations were not in the list of references at the end of the article. The references to MDC were inserted in the text or located in footnotes, annexes, lists of websites, etc. In some cases, MDC was cited in document types not indexed by Scopus, such as book reviews.

Figure 3 shows three examples of these discrepancies, i.e., MDC references cited in articles indexed in Scopus but not retrievable when searching the database. In the first example, the reference to MDC was in a footnote but not in the final reference list. In the second example, the reference to MDC was included in a list of "sources" placed before the reference list that starts at the bottom of the image. Finally, in the third example, the reference to MDC was included in an annex, but not in the list of references. In all three cases, the articles were indexed in Scopus, but the records did not include the references to MDC. Conversely, all three documents were retrieved in Google Scholar since it had indexed the full text.
Figure 3. Examples of MDC citations in articles indexed in Scopus not retrievable in the database. (Top: Antonio López Estudillo, "Especialización olivarera, cambios institucionales y desigualdad agraria en la Alta Campiña de Córdoba (siglos XVIII-XX)," Historia Agraria 73 (December 2017): 185–220, https://doi.org/10.26882/HistAgrar.073E07l; middle: Paloma Fernández Pérez and Ferran Sabaté Casellas, "Entrepreneurship and management in the therapeutic revolution: The modernisation of laboratories and hospitals in Barcelona, 1880–1960," Investigaciones de Historia Económica – Economic History Research 15, no. 2 (June 2019): 91–101, https://doi.org/10.1016/j.ihe.2017.09.001; bottom: Maria-Rosa Lloret, "La sufixació apreciativa del català: creacions lèxiques i implicacions morfològiques," Caplletra: Revista Internacional de Filologia 58 (2015): 55–89, https://doi.org/10.7203/caplletra.58.7137.)

In other cases, although the citations to MDC were in the final list of references, Scopus failed to retrieve them. Figure 4 shows an example of a complete reference in an original document (including the URL pointing to MDC), whereas the Scopus record, shown at the bottom of the image, included just the name of the author and the title of the magazine. The MDC reference, therefore, was not retrieved when searching Scopus.

Figure 4. Differences between a reference in the original document and the record in Scopus. (Source of the example: Ramon Farrés, "La recepción del poeta catalán Joan Brossa en Brasil," Meta: journal des traducteurs 60, no. 1 (April 2015): 158–72, https://doi.org/10.7202/1032404ar.)

Leaving aside scholarly outputs published in the form of journal articles, proceedings, or monographs, the second category of MDC-citing documents (25 percent) was that of dissertations and theses. This included dissertations at all academic levels: undergraduate, postgraduate, and PhD. In nearly all cases, dissertations were hosted in university institutional repositories and belonged to disciplines similar to those recorded for scholarly outputs, such as language and literature or history and, more generally, to the humanities and the social sciences.

Finally, 25 citing documents (8 percent) were classified in other typologies (the figure of 75 given in an earlier draft is inconsistent with table 1). As in the case of journal articles, this group included several examples of reports and other outputs prepared by librarians explaining the features of MDC to a professional audience.

Cited collections in MDC

The population of 274 citing documents available in open access sources included 313 references to MDC. One-fifth of the citations did not refer to any specific collection or document, but to the whole MDC portal (see table 2).

Table 2. Cited Sources in MDC (n = 313)

                                                   n (%)
Digital Memory of Catalonia (MDC)                  67 (21%)
Archive of Ancient Catalan Periodicals (ARCA)      111 (35%)
Other collections                                  135 (43%)

The most cited collection in MDC was ARCA, a collection of digitized ancient Catalan periodicals, which received more than one-third of the citations. These citations, however, were not strictly comparable, since four citations referred to the whole ARCA collection, 33 referred to a specific magazine, six referred to a specific issue, and 68 referred to a specific article.
Thirty different digitized magazines were cited.

Finally, 135 citations referred to 49 different collections. Just three collections received more than ten citations each: a collection of digitized posters from the Spanish Second Republic and Civil War (15 citations), a photographic collection of the Hiking Club of Catalonia (11 citations), and a collection of manuscripts from the Biblioteca de Catalunya (11 citations).

DISCUSSION AND CONCLUSIONS

The results of the study provide evidence of the academic impact of MDC collections among scholars and students. Academics make use of the digitized resources hosted by MDC and find them useful for building their scholarship, as evidenced by the citations made in their scholarly outputs. However, the impact goes beyond academic publications such as journal articles, proceedings, and monographs: a significant number of dissertations and theses also cite MDC resources. Professional scholars accessing MDC collections online may save time and money compared to the travel costs required to consult the resources in situ. In the case of students, however, it is possible that in many cases they would have been unable to access the collections had they not been digitized.

When considering this type of impact, it should be borne in mind that the actual number of academic essays, dissertations, and theses citing MDC is presumably higher than the number recorded in this study. While all PhD theses in Catalonia are posted online, only bachelor and master dissertations with high grades are archived online in institutional repositories and, therefore, could be retrieved in our study.

As expected, given the characteristics of the digitized collections, most citations come from academic documents in local languages in the fields of the humanities and social sciences. Most cited resources are located in a collection of digitized ancient Catalan periodicals, although citations are spread among a wide range of collections. In addition to the use of MDC by scholars and students, a significant proportion of professional publications written by librarians also cite MDC.

The results of the study show that Google Scholar is possibly the most suitable tool for conducting citation studies on the impact of digital heritage collections. Given the characteristics of the resources contained in these collections, most citations are retrieved from local journals, which are frequently not indexed in large citation indexes such as Web of Science or Scopus. Even when they are indexed, our results show that Scopus frequently fails to properly index citations to digital heritage collections, since most citations are not included in the final list of references but are inserted in the text, in footnotes, or in lists of resources employed. Conversely, Google Scholar indexes the full text of documents and therefore allows these citations to be retrieved.
These results are consistent with previous research showing that digital archival materials are more frequently mentioned in figures than in citations, and that citation information for digitized collections is hard to capture due to the lack of specific guidelines for citing digitized collection materials, resulting in inconsistent citation practices.10 In addition, as mentioned previously, a significant proportion of citations to digital heritage collections originates from documents indexed by Google Scholar but not by Scopus, especially theses and dissertations, but also reports, educational resources, etc.

Although our study provides evidence of the scholarly impact of MDC, it also has some limitations. Our results do not reflect any other kind of impact, such as on learning, teaching, or leisure. Similarly, our study did not consider how MDC contributes to the long-term preservation of heritage collections. Further research could explore these issues using participatory research methods, including surveys and interviews with users.

ENDNOTES

1 "EU Member States Sign Up to Cooperate on Digitising Cultural Heritage," European Commission, last updated April 24, 2020, https://ec.europa.eu/digital-single-market/en/news/eu-member-states-sign-cooperate-digitising-cultural-heritage.

2 Emily Frieda Shaw, "Making Digitization Count: Assessing the Value and Impact of Cultural Heritage Digitization," Archiving 2016 Final Program and Proceedings (Springfield, VA: Society for Imaging Science and Technology, 2016), 197, https://doi.org/10.2352/issn.2168-3204.2016.1.0.197.

3 Donghee Sinn, "Impact of Digital Archival Collections on Historical Research," Journal of the American Society for Information Science and Technology 63, no. 8 (August 2012): 1521, https://doi.org/10.1002/asi.22650.

4 Vivien Petras and Juliane Stiller, "A Decade of Evaluating Europeana—Constructs, Contexts, Methods & Criteria," in Research and Advanced Technology for Digital Libraries: 21st International Conference on Theory and Practice, eds. Jaap Kamps et al. (Cham: Springer, 2017), 241, https://doi.org/10.1007/978-3-319-67008-9_19.

5 LIBER, "Europe's Digital Humanities Landscape: A Report from LIBER's Digital Humanities & Digital Cultural Heritage Working Group" (2017), 28, https://doi.org/10.5281/zenodo.3247286.

6 Shaw, "Making Digitization Count," 198.

7 Paromita Biswas and Joel Marchesoni, "Analyzing Digital Collections Entrances: What Gets Used and Why It Matters," Information Technology and Libraries 35, no. 4 (2016): 19–34, https://doi.org/10.6017/ital.v35i4.9446.

8 Sinn, "Impact of Digital Archival Collections," 1525.

9 Alberto Martín-Martín et al., "Google Scholar, Web of Science, and Scopus: A Systematic Comparison of Citations in 252 Subject Categories," Journal of Informetrics 12, no. 4 (November 2018): 1160–77, https://doi.org/10.1016/j.joi.2018.09.002.

10 Sinn, "Impact of Digital Archival Collections," 1533; Shaw, "Making Digitization Count," 199.
12067 ---- Tackling the Big Projects: Do it Yourself or Contract with a Vendor?

EDITORIAL BOARD THOUGHTS
Tackling the Big Projects: Do it Yourself or Contract with a Vendor?
Laurie Willis

INFORMATION TECHNOLOGY AND LIBRARIES | MARCH 2020
https://doi.org/10.6017/ital.v39i1.12067

Laurie Willis (Laurie.Willis@sjlibrary.org) is Web Services Manager, San Jose Public Library, and a member of the Information Technology and Libraries editorial board. Copyright © 2020.

Everyone who works with library technology sooner or later faces a major project to tackle. Sometimes we contract with a vendor to do the bulk of the work; sometimes we do the project ourselves. There are advantages and disadvantages to both methods.

Here at San Jose Public Library we were faced with two large projects at once—a website migration/redesign and a new catalog discovery layer. We considered BiblioCommons as the vendor for both projects; they offer both a website product (BiblioWeb) and a discovery layer (BiblioCore). We opted to complete the website migration/redesign ourselves using open source software, migrating from our previous Drupal 7 platform to Drupal 8, and to contract with BiblioCommons to provide our new discovery layer.

This put us in an unusual position: we were implementing a website migration/redesign ourselves while simultaneously working, on the catalog discovery layer, with the vendor we would likely have chosen for the website project. This gave us the opportunity to compare the experience of implementing the website project ourselves with what the same project might have been like with a vendor.

WHAT WE LEARNED

Timing

Not surprisingly, completing the website project on our own took longer than expected.
• Learning curve—We expected a learning curve, but it turned out to be significantly steeper than anticipated.
• Unknowns—In addition to basic learning, we also came across functionality that didn’t work as expected.
• Failures—There were times when what we tried to do didn’t work at all and we had to backtrack.

Timing for the vendor-led project, on the other hand, kept to the planned timeline.
• Prescribed timeline—As part of their contract, the vendor provided a timeline at the outset. We made small adjustments, but for the most part the project stayed on time.
• Predictability—The vendor has completed many similar projects, so they had a solid idea of what to expect and how long it would take.
• Problem solving—Some challenges unique to our situation did arise and caused some delays.

Control

The ability to have more control over the project results was a significant factor in our decision to complete the website project ourselves.
We had the opportunity to make choices and also faced the challenge of a sometimes-overwhelming number of options.
• Options—Many options were available to us. We had choices regarding structure (website platform and theme), design, and content.
• Overwhelm—The plethora of options encouraged a tendency to spend a lot of time (too much?) “shopping”—researching and evaluating options.
• We completed a thorough audit of our content and created a new site based on our needs.
• User experience (UX) testing—We were able to perform testing with our users and adapt our website to better fit their needs.

Working with a vendor, on the other hand, limited what we were able to do, but the decision-making process was easier and faster.
• We had the option to select colors, but otherwise the structure and design were fixed.
• We had some control over textual content within the parameters given; e.g., we could add links to the footer, but the number of links allowed was limited.
• Little time was spent making these decisions.
• It’s a challenge fitting unique content into a predetermined format.
• User experience (UX) testing—The vendor is able to include a wider sampling of people while testing, but they’re not able to specifically consider our local users.

Implementation

For the website project, implementation turned out to be more complex than expected.
• Learning—As mentioned above, there were many new things to learn that came up as the project progressed.
• Consultant—We came up with technical questions that were beyond the scope of our knowledge. We found it extremely helpful to contract with a consultant for guidance.
• Conflicting responsibilities—We worked on this project while continuing with our normal workload and maintaining the current website. We were also simultaneously working on the discovery layer implementation.

The vendor-led implementation went more smoothly.
• Learning—The vendor assigned a project manager, who was available to guide us through the process. The vendor also provided documentation that walked us through the process.
• Expertise—When challenges did arise, the vendor had an experienced staff to help us work through them.
• Staff time—Although the vendor did most of the work, the project did consume significantly more staff time than expected as we worked through every detail.

Training and Marketing

• Staff—For the website, we had to create our own training for staff. For the catalog, the vendor offered webinars for staff and sent a trainer to do in-person training.
• Public—The vendor offered samples of materials from other libraries to both inform and educate the public. Since both projects were launching at the same time, we were able to adapt some of these materials to include both.

Cost

The cost of hiring a vendor initially seems steep, but staff time is also expensive. Considering the unexpected additional staff time spent, it likely would have been less expensive to choose the vendor option.

Conclusion

There are pros and cons to both methods—completing a project on your own or working with a vendor. Whether your project is a new website or catalog or something else entirely, learn as much as you can about what will be involved before you decide on an approach. Weigh your options by looking at your needs and the resources and time available to you.
The primary aspects to consider are:
• Do staff have the necessary expertise to complete the project? Will there be a learning curve? Are staff prepared and willing to learn new things and figure things out? If you are considering a vendor, do they have a training plan for your staff?
• How much time is available? Is there a deadline? If there is a deadline, what will be the costs if it needs to be extended? If you are considering a vendor, how committed are they to achieving the prescribed deadline?
• Which is more important to you—control and flexibility or ease of implementation?
• What resources are available if you have questions? If you work on your own, are there people and online resources you will be able to turn to? If you are considering a vendor, will you be assigned a representative to walk you through the process?

For our particular situation, I believe we made the right choice to complete the website project on our own. Staff had enough expertise that they were willing and able to learn the necessary skills, calling upon a consultant when needed without outsourcing the entire project. While we had an expected timeline, we were able to extend it with only minor consequences (paying for additional web hosting while the project was under construction). We maintained the control and flexibility we needed in order to present some of the unique services and spaces that our library offers, which might have been lost using a vendor package. We had some knowledge of consultants working in the field and were able to hire one to show us how to proceed when we were over our heads. We also relied heavily on tutorials and other training resources posted online. Whatever you decide, taking time to think things through before beginning will help make your project a success.

12089 ---- Google Us! Capital Area District Libraries Gets Noticed with Google Ads Grant

PUBLIC LIBRARIES LEADING THE WAY
Google Us! Capital Area District Libraries Gets Noticed with Google Ads Grant
Sheryl Cormicle Knox and Trenton M. Smiley

INFORMATION TECHNOLOGY AND LIBRARIES | MARCH 2020
https://doi.org/10.6017/ital.v39i1.12089

Sheryl Cormicle Knox (knoxs@cadl.org) is Technology Director for Capital Area District Libraries. Trenton M. Smiley (smileyt@cadl.org) is Marketing & Communications Director for Capital Area District Libraries.

Increased choices in the marketplace are forcing libraries to pay much more attention to how they market themselves. Libraries can no longer simply employ an inward marketing approach that speaks to current users through printed materials and promotional signage plastered on the walls. Nor can they rely on occasional mentions by the local media as the primary driver of new users. That’s why in 2016, Capital Area District Libraries (CADL), a 13-branch library system in and around Lansing, Michigan, began using more digital tactics as a cost-effective way to increase our marketing reach and to have more control over promoting the right service, at the right time, to the right person. One example of these tactics is ad placement on the Weather Channel app, which allows ads about digital services like OverDrive and hoopla to appear when certain weather conditions, such as a snowstorm, occur in the area.
In 2017, while attending the Library Marketing and Communications Conference in Dallas, our Marketing & Communications Director had the good fortune of sitting in on a presentation by Trey Gordner and Bill Mott from Koios (www.koios.co) on how to receive up to $10,000 of in-kind advertising every month from Google Ad Grants (www.google.com/grants). During this presentation, Koios offered participants a 60-day trial of their services to help secure the Google Ad Grant and create a few starter campaigns.

Google Ads are text-based and appear in the top section of Google's search results, along with the ads of paying advertisers. Nonprofits in the Google Ad Grants program can set up various ad campaigns to promote whatever they like—the overall brand of the library, the collection, various events, meeting room offerings, or any other product or service. The appearance of each Google Ad is triggered by keywords chosen for each campaign.

After CADL's trial period expired, we decided to retain Koios to oversee the Google Ad Grants project. While the library had used Google Ads for the sharing of video, we had not done much with keyword advertising, so we were excited to learn more about the process of using keywords and the funding available through the grant. We viewed this as a great new tool to add to our marketing toolbox. It would help us achieve a few of our marketing goals: expanding our overall marketing reach and digital footprint by 50 percent; increasing the library's digital advertising budget by 300 percent (by using alternative funding); and promoting the right service at the right time.

GETTING STARTED

Koios coached us through the slalom course of obtaining accounts and setting them up. To secure the monthly ad grant, we first obtained a validation key from TechSoup (www.techsoup.org), the nonprofit that makes technology accessible to other nonprofits and libraries. That, in turn, pre-qualified us for a Google for Nonprofits account. (At the time, we were able to get a validation token from our existing TechSoup account, but Koios currently recommends starting by registering a 501(c)(3) Friends organization or library foundation with TechSoup whenever possible.) After creating our Google for Nonprofits account, we used the same account username to create a Google Ads account. Finally, to work efficiently with Koios, we provided them access to our Google Analytics property (which we have configured to scrub patron-identifying information) and our Google Tag Manager account (with the ability to create tags that we in turn review and approve). If you are taking the do-it-yourself approach, Google has a step-by-step Google Ad Grants activation guide and extensive help online.

DESIGNING CAMPAIGNS

Spending money well is hard work, and that holds true for keyword search ads as well. There are performance and ad quality requirements in the grant program that must be observed to retain your monthly allotment. Understanding these guidelines and implementing campaigns that respect them, while working well enough to spend your grant allocation, requires study and patience. Again, we relied on Koios to guide us. They helped us create campaigns and ad groups within those campaigns that were effective within the grant program.
Figure 1. Example of Minecraft title keyword landing page created by Koios.

In August 2018, we started with campaigns for general branding awareness that included ads aimed at people actively searching for local libraries and our core services. These ads funnel users to our homepage and our online card signup, and they are configured to display only to searchers who are geographically located in our service area. This campaign has been grown and perfected over 18 months into one of our most successful campaigns, garnering over 2,300 impressions and 650 clicks in January 2020, yet it spends just $450 of our grant funds. Another consistent performer for us has been our Digital Media campaign, with ads targeting users searching for ebooks and audiobooks. By June 2019 we had grown our grant spend to $1,500 a month using 27 different campaigns.

The game changer for us has been working with Koios to create campaigns based on an export of MARC records from our catalog. We worked with Koios to massage this data into a very simple pseudo-catalog of landing pages based on item titles. Each landing page is very simple and SEO-friendly, so that it ranks well in the split-second ad auction that determines whether your ad will be displayed. It has cover images and clear calls to action, loads fast, is mobile friendly, and communicates the breadth of formats held by the library (see figure 1). Clicking the item title or the borrow button sends users straight into our full catalog to get more information, request the item, or link to the digital version.

Figure 2. A user search in Google for “dad jokes” showing a catalog campaign ad. Grant program ads are displayed below paid ads. The format of the ad may vary as well. This version shows several extensions, like phone number, site links, and directions links.

Figure 3. The landing page displayed to the searcher after they click on the ad, and the resulting catalog page if the searcher clicks the Borrow button.

In Google Ads, Koios created 14 catalog campaigns out of the roughly 250,000 titles we sent them. Each campaign has keywords (single words and phrases from titles) derived from roughly 18,000 titles ranked by how frequently they are used in Google search. Again, these ads are limited geographically to our service area. Figures 2 and 3 illustrate what a Google searcher in Ingham County, Michigan, potentially encounters when searching for “dad jokes.” Since their inception in September 2019, these catalog campaigns have been top performers for us, generating clickthrough rates of 8-15% and a couple thousand additional ad clicks monthly, the aggregation of a small number of clicks on any one ad from our “long tail” of titles. We are now spending over $5,000 of our grant funds and garnering nearly 23,000 impressions and 3,000 ad clicks monthly.
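For readers curious what a title-based pseudo-catalog pipeline can look like, the sketch below shows one minimal way to pull titles from a MARC export, tally candidate keywords, and write bare-bones landing pages. It is only an illustration of the general technique, not Koios's actual (and more sophisticated) pipeline; the file names, catalog URL pattern, and choice of the pymarc library are assumptions made for the example.

    import collections
    import html
    import os
    import re
    from urllib.parse import quote

    from pymarc import MARCReader  # third-party MARC parsing library

    # Pull display titles (MARC field 245, subfields a and b) from a binary export.
    titles = []
    with open("catalog_export.mrc", "rb") as fh:  # hypothetical file name
        for record in MARCReader(fh):
            if record is None:
                continue  # skip records the reader could not parse
            for field in record.get_fields("245"):
                parts = field.get_subfields("a", "b")
                if parts:
                    titles.append(" ".join(p.strip(" /:,") for p in parts))

    # Rank single words from titles by frequency as candidate ad keywords.
    counts = collections.Counter()
    for title in titles:
        counts.update(w for w in re.findall(r"[a-z0-9']+", title.lower()) if len(w) > 2)
    top_keywords = [word for word, _ in counts.most_common(500)]

    # Write one small, fast-loading landing page per title; the catalog URL
    # pattern below is invented for the example.
    os.makedirs("landing", exist_ok=True)
    for i, title in enumerate(titles):
        link = "https://catalog.example.org/search?q=" + quote(title)
        safe = html.escape(title)
        page = ("<html><head><title>" + safe + " | Your Library</title></head>"
                "<body><h1>" + safe + "</h1>"
                "<p><a href=\"" + link + "\">Borrow it from the library</a></p>"
                "</body></html>")
        with open(os.path.join("landing", "title-%d.html" % i), "w") as out:
            out.write(page)

In a real deployment, each page would also need the cover image, format list, and fast mobile-friendly styling described above; the point here is only how little structure a title-based landing page requires.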
RESULTS

In general, we find that our Google Ads have succeeded in drawing additional new visitors to our website. Using our long-established Google Analytics implementation, which measures visits to our website and catalog combined, we compared the third quarter of 2018, when we were ramping up our Google Ad Grants campaigns, to the third quarter of 2019, after our catalog campaign was firmly established. The summary numbers are encouraging: the number of users is up 17%, and the number of sessions is up 4%. Within the overall rise in users, returning users are up 9%, but new users are up 25%. Therefore, we are getting more of those coveted, elusive “non-library users” to visit us online. When comparing the behavior of new and returning visitors, we also see that the overall increase in sessions was achieved despite the headwind of a 4% decline in returning-visitor sessions.

However, are the new visitors engaging? Perhaps the most tangible measure of engagement for a public library catalog is placing holds, and we have a Google Analytics conversion goal that measures those holds. The rate of conversion on the hold goal among new visitors rose 7%, while dropping 13% among returning visitors. From other analysis, we know that our highly engaged members are migrating to our mobile app and to digital formats, so the drop for returning users is explainable and the rise among new visitors is hopeful. We are working on ways to study these new visitors more closely so that we can discover and remove more barriers in the way of them becoming highly engaged members of their public library.

FUTURE PLANS

With the help of Koios, new campaigns will be created to promote our blogs and podcasts. We will also link a campaign to our Demco events database. Finally, in partnership with Koios, we will work with Patron Point to incorporate our automated email marketing system into Google Ads campaigns. We will add campaigns for pop-up ads that encourage library card signup through our online registration system. Once someone signs up for a library card online, the system will trigger a welcome email that promotes some of our core services. This onboarding setup will also include an opportunity for the new cardholder to fill out a form to tailor content in future emails to their interests. Through all these means, CADL leads the way in delivering the right service, at the right time, to the right person.

12105 ---- LITA President's Message: A Framework for Member Success

LITA PRESIDENT’S MESSAGE
A Framework for Member Success
Emily Morton-Owens

INFORMATION TECHNOLOGY AND LIBRARIES | MARCH 2020
https://doi.org/10.6017/ital.v39i1.12105

Emily Morton-Owens (egmowens.lita@gmail.com) is LITA President 2019-20 and the Assistant University Librarian for Digital Library Development & Systems at the University of Pennsylvania Libraries.

This column represents my final venue to reflect on our potential merger with ALCTS and LLAMA before the vote. After a busy Midwinter Meeting with lots of intense discussions about the recommendations of the Steering Committee on Organizational Effectiveness (SCOE), the divisions, the merger, ALA finances, and more, my thoughts keep turning in a particularly wonkish direction: towards our organization. So many of the challenges before us hinge on one particular dilemma. For those of us who are most involved in ALA and LITA, the organization (our committees, offices, processes, bylaws, etc.) may be familiar and supportive. But for new members looking for a foothold, or library workers who don't see themselves in our association, our organization may look like a barrier. Moreover, many of our financial challenges are connected to our organization. The organization must evolve, but we must achieve this without losing what makes us loyal members.
While ALA and LITA have specific audiences of library workers and technologists, we have a lot in common with other membership organizations. One of the responsibilities of the LITA vice-president is attendance at a workshop put on by the American Society of Association Executives, where we learn how to steward an organization. Representatives from many different groups attended this workshop, where I had a chance to discuss challenges with leaders from medical and manufacturing associations, and I learned that these challenges are often orthogonal to the subject matter at hand. Everyone was dealing with the need to balance membership cost and value, how to give members a voice while allowing for agile decision-making, and how to put on events that are great for attendees without becoming the only way to get value from membership.

Hearkening back even further, I worked as a library-school intern at a library with a long run of German- and French-language serials that I retrospectively cataloged. One batch that has always stuck in my mind is the planning materials for international congresses that were held in the early 20th century by the international societies for horticulture and botany. These events were massive undertakings held at multi-year intervals, gradually planned by international mail. Interested parties would receive a lavish printed prospectus, with registration and travel arrangements starting several years in advance. The most interesting documents pertained to the events planned for the mid to late 1930s in Europe. These events were cancelled or fell short of intentions because of pre-World War II political pressures. The congress schedules did not resume until 1950 or later, with some radical changes—for example, German was no longer used as the language of science, and the geographic distribution of events increased significantly in the later 20th century. When I first encountered this material, I was intrigued by how the war affected science. Looking back now, I see a dual case study in organizations weathering a crisis whose magnitude we can only imagine, and then reinventing themselves on the other side. Both of these organizations still exist and continue to meet, by the way—and I can't help but feel that reinvention is the key to survival.

Our organizational framework is a key part of the challenge for both ALA and LITA. I have no doubt that members remain excited about our key issues for advocacy, our subjects for continuing education, and our opportunities for networking. But we have concerns about how we make those things happen. In LITA, for example, continuing education requires a massive effort on the part of both member volunteers and staff to organize. We need to brainstorm relevant topics, recruit qualified instructors, schedule and promote the events, and finally run the sessions, collect feedback, and arrange payment for the instructors. This takes the time of the same people we'd like to have creating newsletters and booking conference speakers. Meanwhile, right across the hall at ALA headquarters, we have staff from ALCTS and LLAMA doing the same things. These inefficiencies hit at the heart of our financial problems. At the ALA level, SCOE has proposed ideas like a single dues structure for divisions and a single set of policies and procedures for all round tables.
These changes would reduce the overhead required to operate these groups as unique entities, a financial benefit, while also making it easier for members to afford, join, and move between them, a membership benefit. That framework also offers us an opportunity to improve our associations. Members have been asking how the association can act more responsively on issues of diversity, equity, and inclusion—for example, how we can have incident response that is proactive and sensitive to member needs while recognizing the complexities of navigating that space as a member-based organization. This is a chance to live up to our aspirations as a community. The actions LITA has taken to extend all forms of participation to members who can only participate remotely/online are a way to make us more accessible to library workers regardless of finances or home circumstances. Bylaws and policies may not be the most glamorous part of associations, but they are the levers we can employ to change the character of our community.

Coming back to Core, we can observe elements of the plan that respond to both threats and opportunities. Members of ALCTS, LLAMA, and LITA know that financial pressures are a major impetus for the merger effort. But, in the hope of achieving a positive reinvention, the merger planning steering committee put most of its emphasis on the opportunity side. The diagram of intersecting interests for Core's six proposed sections (https://core.ala.org/core-overlap/) is a demonstration of the new frontiers of collaboration that Core will offer members. The proposed structure of Core retains committees while also offering a more nimble way to instantiate interest groups. Moreover, the process of creating Core reflects the kind of transparent process we want to see in the future. The steering committee and the communications subcommittee crossed not just the three divisions but also different levels of experience and types of prior participation in the divisions. The communications group answered freeform questions, held Twitter AMAs, and held numerous forums to collect ideas and feelings about the project. Zoom meetings and Twitter are not new media, but the sustained effort that went into soliciting and responding to feedback through these channels is a new mode for our divisions.

The LITA Board recently issued a statement (https://litablog.org/2020/02/news-regarding-the-future-of-lita-after-the-core-vote/) explaining that if the Core vote does not succeed, we don't see a viable financial path forward and will be spending the latter half of 2020 and the beginning of 2021 working toward an orderly dissolution of LITA. It is tempting to approach this crossroads from a place of disappointment or fear. We cannot yet say precisely what it will be like to be a member of Core. But when I look at the organizational structure Core offers us, I feel hopeful that it will be a framework in which members find their home and flourish. The new division includes what we need for a rich member experience, coupled with a streamlined structure that makes it easier to be involved in the ways, and to the extent, that make sense for you.
In fifty years, perhaps a future member of Core will be writing a letter to their members, looking back at this moment of technological and organizational disruption and reflecting on how we reinvented our organization at the moment it needed it most.

12123 ---- Navigation Design and Library Terminology: Findings from a User-Centered Usability Study on a Library Website

COMMUNICATION
Navigation Design and Library Terminology: Findings from a User-Centered Usability Study on a Library Website
Isabel Vargas Ochoa

INFORMATION TECHNOLOGY AND LIBRARIES | DECEMBER 2020
https://doi.org/10.6017/ital.v39i4.12123

Isabel Vargas Ochoa (ivargas2@csustan.edu) is Web Services Librarian, California State University, Stanislaus. © 2020.

ABSTRACT

The University Library at California State University, Stanislaus is undergoing not only a library building renovation but a website redesign as well. The library conducted a user-centered usability study to collect data to guide the library website “renovation.” A prototype was created to assess an audience-based navigation design, a homepage content framework, and heading terminology. The usability study consisted of 38 student participants. It was determined that a topic-based navigation design will be implemented instead of an audience-based navigation, a search-all search box will be integrated, and the headings and menu links will be modified to avoid ambiguous library terminology. Different navigation and content designs, and other usability study approaches, will be explored in future studies.

INTRODUCTION

The University Library at California State University, Stanislaus is currently undergoing a much anticipated and necessary redesign of the library website. Website redesigns are a crucial part of website maintenance, keeping pace with modern technology and meeting accessibility standards. “If librarians are expected to be excellent communicators at the reference desk and in the classroom, then the library website should complement the work of a librarian.”1 In this case, a library website prototype was created, using a Springshare LLC product, LibGuides CMS, as the testing subject for our user-centered usability study. The usability study was completed with 38 student participants from different academic years and areas of study.

The library website prototype was designed using a user-based design framework and an audience-based navigation. This study surfaced user-reported issues with the navigation design and with ambiguous library terminology. An audience-based navigation was chosen to organize and group the information and services offered in the way that seemed most accessible to users; however, an audience-based navigation directly affects users and their search behaviors.2 The prototype, like the current library website, did not have a search-all search box during the study. A catalog search box was used to test whether the catalog alone was enough for student participants to find information; this also forced the participants to use the menu navigation.

LITERATURE REVIEW

The design and approach of usability studies, preferences for types of search boxes, navigation design, and library terminology evolve over time in parallel with technology changes. Most recent usability studies use screen and audio recording tools as opposed to written observation notes.
Participants in recent studies are also more adept at navigating websites than participants in usability studies twenty years ago. Regardless, it's crucial to compare the results of previous usability studies to analyze differences and similarities.

Types of usability studies include user-centered usability studies and heuristic usability studies. This study chose a user-centered approach because of the library's desire to collect data and feedback from student users. The way in which a usability study is presented to participants is also critical to the approach: website usability studies are meant to test the website, although participants may unconsciously believe they are being tested. In Tidal's library website case study (2012), researchers assured the participants that “the web site was being tested and not the participants themselves.”3 This unconscious belief may also affect the data collected from the participants and “influence user behavior, including number of times students might attempt to find a resource or complete a given task.”4

The features tested were the navigation design and homepage elements. The navigation design in the prototype was developed to test an audience-based navigation design (see figure 1). An audience-based navigation design organizes the navigation content by audience type.5 That is to say, the user begins their search by identifying themselves first. Although this design can organize content in a more efficient manner, especially for organizations that have specific, known audiences, critics argue that it forces users to identify themselves before searching for information, thus taking them out of their task mindset.6 For this usability study, I wanted to test this navigation design and compare the results to our current navigation design, which is a topic-based navigation design. A topic-based navigation design presents topics as navigation content.7 This design is our current library website navigation design (see figure 2).

Figure 1. Screenshot of the audience-based navigation design developed for the library website prototype.

Figure 2. Screenshot of the current topic-based navigation design in the library website.
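To make the contrast concrete, the two designs can be thought of as different groupings of largely the same links. The sketch below is purely illustrative; the menu labels are invented and are not the actual headings of the prototype or the current site.

    # Hypothetical menus contrasting the two navigation designs.

    # Audience-based: the user must self-identify before seeing tasks.
    audience_based_nav = {
        "Students": ["Find Articles", "Request a Book", "Study Spaces"],
        "Faculty": ["Course Reserves", "Suggest a Purchase"],
        "Community & Alumni": ["Visitor Access", "Borrowing Policies"],
    }

    # Topic-based: the user starts from the task or topic itself.
    topic_based_nav = {
        "Find": ["Articles", "Books", "Research Guides"],
        "Services": ["Request a Book", "Course Reserves", "Study Spaces"],
        "Help": ["Citation Help", "Ask a Librarian"],
    }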
Designing the navigation and homepage also means choosing accessible terms that are relevant to all users. Unfortunately, over the course of many decades, library terminology has been a hindrance for student users. Terms such as “catalog,” “reference,” and “research guides” are still difficult for users to understand. As Conrad states (2019), “students are not predisposed to think of a ‘research guide’ as a useful tool to help them get started.”8 “Research guide” isn't a self-explanatory term; in many ways, the phrase is ambiguous. Augustine's 2002 case study found similar difficulties: students' “lack of understanding and awareness of library resources impacted their ability more than the organization of the site did.”9 It's unsettling to know that our own terminology has been deterring users from accessing library resources for decades. Librarians use library terminology to such an extent that it's part of our everyday language, but what is common knowledge to us may be completely alien to our very own audience.

Not only should libraries be aware of confusing library terms, but content should also not overwhelm the user with an abundance of information. Most students who visit the library are looking for something specific and easy to find. It's important for librarians to condense the information on guides and website pages so as not to frustrate users or send them searching elsewhere, like Google. “Students scan . . . rather than [read] material.”10 Our Crazy Egg statistics confirm this: heatmaps of our website's pages show that users are not scrolling to the bottom of pages. The same caution applies to large images and unnecessarily flashy or colorful content that covers most of the desktop or mobile screen; these should be reduced in size so that users can find information swiftly. For this reason, any large design element on the homepage should also be duplicated in menu links, in case the large, flashy content is ignored.11

The search box is another fundamental element I analyzed. In this case study, our search box was the catalog search box for Ex Libris Primo. If a page, particularly the homepage, has two search boxes—search-all and catalog search—the user can be confused. Search boxes are primarily placed at the center of the page, and depending on how these search boxes are labeled and identified, users may not know which one to use. Students approach library search boxes as if searching Google.12 In our case, neither the current website nor the prototype has a general search-all box; we have a catalog search box placed at the top center of the homepage on both sites.

METHODOLOGY

The usability study was conducted by the author, the web services librarian at California State University, Stanislaus, who worked with a computer science instructor to recruit participants. Not only is the University Library redesigning its website, but the library building is also undergoing a physical renovation. Because of this project, the library has relocated to the library annex, a collection of modular buildings providing library services to the campus community. The usability study was conducted in a quiet study room in one of these modular sites. I reserved this study area and borrowed eight laptops for the sessions.

The usability study employed two different methods to recruit participants. The first offered an extra credit incentive, arranged in collaboration with the computer science instructor, who was teaching a course on human-centered design for websites; she offered her students extra credit since several of her learning objectives centered on website design and usability studies. The second approach was informal: students who were already at the library annex during the scheduled sessions could participate without having to sign up or remember to attend. These students were recruited in person during the sessions and through flyers posted in study rooms on the days of the study. An incentive of snacks for students to take home was also included.

I created questions and seven tasks to be handed out to the participants during the study.
The tasks were created to test the navigation design of the main menu and the content on the homepage. I also added a task to test the research skills of the student. After these tasks, students were asked to rate the ease of access, answer questions about their experience navigating the prototype, and provide feedback. All students were given the same tasks; however, students taking the human-centered design course were also given specific web design questions for feedback (see Appendices A and B). The tasks were piloted before the study with three library student workers, who provided feedback on how to better word the tasks for students. The following are the final seven tasks used for the usability study:

1. Find research help on citing legal documents—a California statute—in APA style citation.
2. Find the library hours during spring break.
3. Find information on the library study spaces hours and location.
4. You're a student at the Stan State campus and you need to request a book from Turlock to be sent to Stockton. Fill out the request-a-book form.
5. You are a graduate student and you need to submit your thesis online. Fill out the thesis submission form.
6. For your history class, you need to find information on the university's history in the University Archives and Special Collections. Find information on the University Archives and Special Collections.
7. Find any article on salmon migration in Portland, Oregon. You need to print it, email it to yourself, and you also need the article cited.

The usability study sessions took place from 11 a.m. to 2 p.m. on February 10, 12, and 14, 2020. These days and times were chosen because the snack incentive would attract students during the lunch hour, and I wanted to accommodate the start and end times of the human-centered design course on Mondays, Wednesdays, and Fridays. The total time it took students to complete the seven tasks averaged 15 minutes. In total, there were 38 student participants, whose experiences were recorded anonymously. I asked students to provide their academic year and major. Students ranged from freshman (5), sophomore (2), junior (12), senior (17), graduate (1), and unknown (1). Areas of study included computer science (16), criminal justice (2), business (2), psychology (3), communications (1), sociology (1), English (3), nursing (1), Spanish (1), biology (3), geology (1), history (2), math (1), gender studies (1), and undeclared (1).

The subject tested was the library website prototype created and executed using a Springshare LLC product, LibGuides CMS. The tools I used were eight laptops and a screen recording tool, Snagit, made accessible through a campus subscription. The laptops were borrowed from the library for the duration of the sessions. During the sessions, students navigated and completed the tasks on their own with no direct interference, including no direct observation. I planned to create a space where my presence didn't directly influence or intimidate their experience with the website. My findings were based solely on their written responses and screen recordings. I also explained to the students that their screen-recorded videos would not be linked to their identities, even though they had to sign in to the laptops using their campus student IDs.
I did, however, occasionally walk around the tables in the room in case a student was navigating the current website or using a separate site to complete the tasks. Once the students completed the tasks and answered the questions, I collected the handouts and the screen-capture videos by copying them to a flash drive.

LIMITATIONS

During the usability study sessions, two technical issues hindered the initial process. On the first day, there were difficulties accessing the campus Wi-Fi in the room as well as difficulties accessing the Snagit video recording application. This limitation affected some of the students' experiences and feedback. These issues were resolved and were not present on the second and third days of the study.

RESULTS AND OBSERVATIONS

The results and observations collected from this study mirror results from the studies conducted by Azadbakht and Swanson.13 I found that students searched the catalog search box for library collections, citations, and other library terms they didn't understand, even though it was a catalog search box labeled with the keywords “find articles, books, and other materials.” Another finding was that the navigation design can detrimentally affect a user's experience with the website; the audience-based navigation design received mixed reviews. The study also found that students are adept at finding research materials. For example, most students knew how to search, find, print, email, and cite an article. Students in general are also familiar with book requests, ILL accounts, and filling out book request webforms. This indicates that, in terms of utilizing library services, students are well aware of how to find, request, and acquire resources using the website on their own. What was most difficult for students was interpreting library terminology. This was explicitly shown in their attempts to complete tasks 1 and 6: finding how to cite a legal document in APA style and finding information on Special Collections and the University Archives.

The following results and observations are divided into three categories: written responses, video recording observations, and data collected. Data was collected based on observations from the video recordings and the written responses, and was then input into eight separate charts.

Written Responses Observations

Comments from both non-human-centered website design students and human-centered design students included mixed reviews on the navigation layout, an overall positive outlook on the page layout design, suggestions to add a search-all “search bar,” and frustrations with tasks 1 and 6.

Video Recording Observations

The Ex Libris Primo search box was constantly mistaken for a search-all search box. This occurred during students' searches for tasks 1 and 6: citation help and university archives, respectively. Students also used the research guides search box in LibGuides as a search-all search box. Students found the citation style guides easily because of this feature; however, on the proposed new website, it was difficult to find citation help. Students were also using research guides to complete other tasks, such as task 6. A search bar for the entire website was continuously mentioned by student participants as a solution.

Tasks 2 and 3, regarding library hours and study spaces, were easily completed. Tasks 4 and 5 were also easily accessible.
After completing task 4 (book request form), it was easier for participants to complete task 5 (thesis submission form), because both tasks required students to search the top main navigation menu. To complete task 4, several students immediately signed in to their ILL account or logged in to Primo for CSU+, which was expected, as signing in to these accounts is an alternate way to request a book. An additional observation regarding task 4 is that confusion over the library term “call number” was resolved by adding an image reference pointing to the call number in the catalog; the call number image reference was opened several times for assistance.

Most students completed task 7 (find a research article), but not all students used the catalog search box on the homepage to complete it. Several students searched the top main navigation and clicked on the “Research Help” link. Others utilized research guides and the research guides search box on the homepage.

A notable observation concerned the computer science students. Most computer science students were quicker to give up on a task than non-computer science students, and some did not scroll down when browsing pages. These students failed to complete several tasks because they didn't scroll down the page after being on it for less than ten seconds.

DATA COLLECTED
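Before turning to the individual figures, a note on tabulation. The short sketch below, with invented sample values, shows one way observations like these could be bucketed into the click and duration categories used in figures 7 through 10; it is an illustration only, not the study's actual data or tooling.

    import pandas as pd

    # Invented sample observations (one row per student per task); the real data
    # were transcribed from the written responses and screen recordings.
    obs = pd.DataFrame({
        "task":      [1, 1, 2, 6, 7],
        "completed": [False, True, True, False, True],
        "clicks":    [7, 4, 2, 5, 6],
        "minutes":   [3.5, 1.6, 0.7, 2.2, 2.5],
    })

    # Bucket clicks and durations into the categories used in the figures.
    obs["click_bucket"] = pd.cut(obs["clicks"], bins=[0, 2, 5, float("inf")],
                                 labels=["1-2", "3-5", "6+"])
    obs["time_bucket"] = pd.cut(obs["minutes"], bins=[0, 1, 3, float("inf")],
                                labels=["0-1 min", "1-3 min", "3+ min"])

    # Count outcomes per task, split by completion status (two charts per measure).
    for done, group in obs.groupby("completed"):
        label = "complete" if done else "did not complete"
        print("Clicks (%s):" % label)
        print(pd.crosstab(group["task"], group["click_bucket"]))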
Figure 3. Ease of navigation (overall).

Figure 3 illustrates the overall ease-of-navigation ratings from all student participants. Students were asked to rate the ease of access of the website (see Appendices A and B). Other than the keywords “ease of navigation (1 difficult; 10 easy),” students were given the freedom to define what “easy” and “difficult” meant to them individually. The mean ease-of-access rating for all student participants was 7.7. The lowest rating was 3 and the highest was 10.

Figure 4. Ease of navigation (computer science major).

Figure 4 illustrates the ease-of-access ratings based on whether the student was a computer science major. The lowest ease-of-access ratings came from computer science majors; overall, non-computer science majors gave higher ratings than computer science majors.

Figure 5. Ease of navigation (human-centered design).

Figure 5 illustrates the ease-of-access ratings based on whether the student was taking the human-centered design course. The human-centered design students' learning outcomes include website user-interface design and an assignment on how to create a usability study. Similar to the pattern in figure 4, human-centered design students gave lower ease-of-access ratings.

Figure 6. Tasks – Status of completion.

Figure 6 illustrates whether each task was completed or not. A task was counted as complete if the student found the page(s) that provided the solution to the task, and as not complete if the student was unable to find those pages. “Not applicable” was recorded if the student did not use the website prototype (e.g., followed a link that led elsewhere or opted to use Google search instead). Most students completed tasks 2, 3, 4, 5, and 7. The task most often left incomplete was task 1, which 64 percent of student participants did not complete. Task 6 had a middling completion rate of 63 percent, while 86 percent of students completed tasks 2 and 4, and 90 percent completed tasks 3, 5, and 7. It is evident that task 1 was difficult to complete regardless of the student's area of study. Task 1 required students to find APA legal citation help, and the phrase “APA legal citation” confused users. Likewise, for task 6 (Special Collections), students did not understand what “collections” referred to or where to search for them.

Figure 7. Tasks – Number of clicks (complete).

Figure 7 illustrates how many clicks students needed to complete each task. The clicks were separated into three categories: 1-2 clicks, 3-5 clicks, and 6+ clicks. This figure only illustrates data collected from tasks that were completed. Counting began at the website prototype's homepage or at the main menu navigation in the prototype's header, when it was evident that the student was starting a new task. Tasks 2 and 3 were completed in 1-2 clicks, whereas tasks 1, 4, 5, 6, and 7 required an average of 3-5 clicks. Based on experience helping students find articles at the research help desk, task 7 (find a research article) was expected to require 6+ clicks. Task 1 may show a pattern of needing a high number of clicks because it was generally a difficult task to complete.

Figure 8. Tasks – Number of clicks (did not complete).

Figure 8 illustrates how many clicks a student participant made before deciding to skip a task, or before mistakenly believing they had completed it. The same three click categories and counting rules apply; this figure only illustrates data from tasks that were not completed. Tasks 1 and 6 show the clearest patterns in this figure: task 1 (citation help) was generally skipped after 6+ clicks, and task 6 (Special Collections) was generally skipped after three or more clicks.

Figure 9. Tasks – Question duration (complete).

Figure 9 illustrates the duration to complete each task. The duration was separated into three categories: 0-1 minutes, 1-3 minutes, or more than 3 minutes. This figure only illustrates data for tasks that were completed. Timing began when the student started a new task, determined by observing that the student started to use the main menu navigation or directed their screen back to the prototype's homepage. There are parallels between the number of clicks and the duration of tasks. For tasks 2, 3, and 5, the duration to complete the task was less than 1 minute. Task 5 was similar to task 4 (both are forms, linked once on the website), but the duration for task 5 may have averaged lower because task 5 came after task 4: having already completed a form may have influenced how students searched for forms.
Tasks 1, 6, and 7 averaged 1-3 minutes to complete.

Figure 10. Tasks – Question duration (did not complete).

Figure 10 illustrates the duration of each task that wasn't completed, that is, how long a student spent before deciding to skip the task or mistakenly believing they had completed it. The duration was separated into the same three categories: 0-1 minutes, 1-3 minutes, or more than 3 minutes. Timing began when the student started a new task, determined as in figure 9. Similar to the observations for figure 7, there are parallels between the number of clicks and the duration of tasks. For task 1, the average time before students skipped the task varied; however, most students who didn't complete it skipped it after more than 3 minutes of trying. For task 6, the average duration before skipping the task was 1-3 minutes.

CONCLUSION AND RECOMMENDATIONS

This study was primarily designed to test the user-centered study approach and the navigational redesign of the library website; the results, however, provided the library with a variety of outcomes. Based on suggestions and comments on the website prototype's navigation design, menus, and page content, several elements will be integrated to help lead the redesign of the library's website. Students found the navigation design of the website clear and simple, but it also required some getting used to. Because of this, and in light of the navigation design literature, a topic-based menu navigation is recommended over an audience-based navigation. Our findings also highlighted the effects of library terminology: to make menu links exceptionally user-friendly, clear and common terms are recommended. Student participants also voiced that a search-all search box for the website was necessary; this will enable users to access information efficiently. Library website developers should also map more than one link to a specific page, especially if the only link to the page is on an image or slideshow.

The user-centered usability approach for this case study worked well, both in collaboration with campus faculty and as informal recruitment, and it provided relevant and much-needed data and feedback for the University Library. For future usability studies, a heuristic approach may be effective, enabling moderators to gather feedback and analysis from library web development experts.14 Moreover, the usability study could be conducted over a full semester and include focus groups to acquire consistent feedback.15 Overall, website usability studies are evolving and require constant improvement and research.

APPENDIX A

Major: ___________ Year (freshman, sophomore, etc.): ______________
Link to site: URL - please do NOT use URL

Please complete the following situations. For some of these, you don't need to actually submit/send, but pretend as if you are.
1. Find research help on citing legal documents - a California statute - in APA style citation.
2. Find the library hours during spring break.
3. Find information on the library study spaces hours and location.
4. You're a student at the Stan State campus and you need to request a book from Turlock to be sent to Stockton. Fill out the request-a-book form.
5. You are a graduate student and you need to submit your thesis online. Fill out the thesis submission form.
6. For your history class, you need to find information on the university's history in the University Archives and Special Collections. Find information on the University Archives and Special Collections.
7. Find any article on salmon migration in Portland, Oregon. You need to print it, e-mail it to yourself, and you also need the article cited.

Complete the following questions.
1. Rate the ease of access of the website (1 = really difficult to navigate, 10 = easy to navigate)
1 2 3 4 5 6 7 8 9 10
2. Did you ever feel frustrated or confused? If so, during what question?
3. Do you think the website provides enough information to answer the above questions? Why or why not?

APPENDIX B

CS 3500
Major: ___________ Year (freshman, sophomore, etc.): ______________
Link to site: URL - please do NOT use URL

Please complete the following situations. For some of these, you don't need to actually submit/send, but pretend as if you are.
1. Find research help on citing legal documents - a California statute - in APA style citation.
2. Find the library hours during spring break.
3. Find information on the library study spaces hours and location.
4. You're a student at the Stan State campus and you need to request a book from Turlock to be sent to Stockton. Fill out the request-a-book form.
5. You are a graduate student and you need to submit your thesis online. Fill out the thesis submission form.
6. For your history class, you need to find information on the university's history in the University Archives and Special Collections. Find information on the University Archives and Special Collections.
7. Find any article on salmon migration in Portland, Oregon. You need to print it, e-mail it to yourself, and you also need the article cited.

Then, complete the following questions.
1. Rate the ease of access of the website (1 = really difficult to navigate, 10 = easy to navigate)
1 2 3 4 5 6 7 8 9 10
2. What did you think of the overall web design?
3. What would you change about the design? Please be specific.
4. What did you like about the design? Please be specific.

ENDNOTES

1 Mark Aaron Polger, “Student Preferences in Library Website Vocabulary,” Library Philosophy and Practice, no. 1 (June 2011): 81, https://digitalcommons.unl.edu/libphilprac/618/.

2 Jakob Nielsen, “Is Navigation Useful?,” NN/g Nielsen Norman Group, https://www.nngroup.com/articles/is-navigation-useful/.

3 Junior Tidal, “Creating a User-Centered Library Homepage: A Case Study,” OCLC Systems & Services: International Digital Library Perspectives 28, no. 2 (May 2012): 95, https://doi.org/10.1108/10650751211236631.

4 Suzanna Conrad and Christy Stevens, “‘Am I on the Library Website?’: A LibGuides Usability Study,” Information Technology and Libraries 38, no. 3 (September 2019): 73, https://doi.org/10.6017/ital.v38i3.10977.

5 Eric Rogers, “Designing a Web-Based Desktop That’s Easy to Navigate,” Computers in Libraries 20, no. 4 (April 2000): 36, ProQuest.
6 Katie Sherwin, "Audience-Based Navigation: 5 Reasons to Avoid It," NN/g Nielsen Norman Group, https://www.nngroup.com/articles/audience-based-navigation/.

7 Rogers, "Designing a Web-Based Desktop That's Easy to Navigate," 36.

8 Conrad and Stevens, "'Am I on the Library Website?': A LibGuides Usability Study," 71.

9 Susan Augustine and Courtney Greene, "Discovering How Students Search a Library Web Site: A Usability Case Study," College & Research Libraries 63, no. 4 (July 2002): 358, https://doi.org/10.5860/crl.63.4.354.

10 Conrad and Stevens, "'Am I on the Library Website?': A LibGuides Usability Study," 70.

11 Kate A. Pittsley and Sara Memmott, "Improving Independent Student Navigation of Complex Educational Web Sites: An Analysis of Two Navigation Design Changes in LibGuides," Information Technology and Libraries 31, no. 3 (September 2012): 54, https://doi.org/10.6017/ital.v31i3.1880.

12 Elena Azadbakht, John Blair, and Lisa Jones, "Everyone's Invited: A Website Usability Study Involving Multiple Library Stakeholders," Information Technology and Libraries 36, no. 4 (December 2017): 43, https://doi.org/10.6017/ital.v36i4.9959.

13 Azadbakht, Blair, and Jones, "Everyone's Invited," 43; Troy A. Swanson and Jeremy Green, "Why We Are Not Google: Lessons from a Library Web Site Usability Study," The Journal of Academic Librarianship 37, no. 3 (February 2011): 226, https://doi.org/10.1016/j.acalib.2011.02.014.

14 Laura Manzari and Jeremiah Trinidad-Christensen, "User-Centered Design of a Web Site for Library and Information Science Students: Heuristic Evaluation and Usability Testing," Information Technology and Libraries 25, no. 3 (September 2006): 164, https://doi.org/10.6017/ital.v25i3.3348.

15 Tidal, "Creating a User-Centered Library Homepage: A Case Study," 97.

12137 ---- Letter from the Editor: The Core Question Kenneth J. Varnum INFORMATION TECHNOLOGY AND LIBRARIES | MARCH 2020 https://doi.org/10.6017/ital.v39i1.12137

As I write this, the members of the Association for Library Collections and Technical Services (ALCTS), the Library Leadership and Management Association (LLAMA), and LITA are voting to merge into a new consolidated division, Core: Leadership, Infrastructure, Futures. This merger is essential to continuing the activities that we library technologists rely on: the LITA Board has indicated that if the merger does not go through, LITA will be forced to dissolve over the coming year.

The merger will enrich LITA members' opportunities. LITA has long focused on the library technology practitioner. That has been our core competency, born of a time when technology was the new thing in libraries. We technologists know—the entire information profession knows—that technology is no longer an addition to a library, but is the way society operates, for a huge portion of our work and life.
Core reflects this evolutionary change. Similar evolutions have taken place in the technical services and collection development areas; those functions have been forever changed by the wave of technologies that we have implemented over the past half century. Core brings together the practitioners and technologies that make libraries run, and combines them with the library leadership areas that many of us aspire to, or end up taking on, as our careers develop. When I joined LITA over a decade ago, I was myself moving from "doer" to "manager." Now that my role is largely project and personnel management, the skills and conversations I seek for personal growth are often found in other parts of ALA, and beyond. Yet the focus—the core, if I may—of what I do is still in the center of the Venn diagram of technology, people, and data. I voted to support Core and hope that all of you who belong to LITA, ALCTS, and/or LLAMA will do the same.

Sincerely,
Kenneth J. Varnum, Editor
varnum@umich.edu
March 2020

12163 ---- Tending to an Overgrown Garden: Weeding and Rebuilding a LibGuides v2 System Rebecca Hyams INFORMATION TECHNOLOGY AND LIBRARIES | DECEMBER 2020 https://doi.org/10.6017/ital.v39i4.12163

Rebecca Hyams (rhyams@bmcc.cuny.edu) is Web and Systems Librarian, Borough of Manhattan Community College/CUNY. © 2020.

ABSTRACT

In 2019, the Borough of Manhattan Community College's library undertook a massive cleanup and reconfiguration of the content and guides in its LibGuides v2 system, which had been allowed to grow out of control over several years because no one was in charge of its maintenance. This article follows the process from identifying issues and getting departmental buy-in through all of the necessary cleanup work for links and guides. The aim of the project was to make the guides easier for students to use and understand and easier for librarians to maintain. At the same time, work was done to improve the look and feel of the guides and to implement the built-in A-Z database list, both of which are also discussed.

INTRODUCTION

In early 2019, the A. Philip Randolph Library at the Borough of Manhattan Community College (BMCC), part of the City University of New York (CUNY) system, hired a new web and systems librarian. The position itself was new to the library, though some of its functions had previously been performed by a staff member who had left more than a year prior. It quickly became apparent to the newest member of the library's faculty that, while someone had at one point managed the website, the same could not really be said for the library's LibGuides system and the mass of content contained within. The library's LibGuides system was first implemented in January 2013, and over time the system came to be used primarily by instruction librarians to serve their teaching efforts. Not long after BMCC implemented LibGuides, Springshare announced LibGuides version 2 (v2), a new version of the system that included several enhancements and features not present in the earlier version.1 These features included the ability to mix content types in a single box (in the earlier version, for example, boxes could have either rich text or links but not both), a centrally managed asset library, and an automatically generated A-Z database list designed to make it easy to manage a public-facing display.
BMCC moved to LibGuides v2 around early 2015, but few of those who worked with the system ever took advantage of the newer features for quite some time, if at all. At the time the web and systems librarian came aboard, the BMCC LibGuides system contained over 400 public guides and an unwieldy asset library filled with duplicates and broken widgets and links. Many of the guides essentially duplicated others, with only the name of the classroom instructor differing. There were, for example, 69 separate guides just for English 101, some of which had not been updated in three or four years. There were no local guidelines for creating or maintaining guides, and in theory, each librarian was responsible for their own. However, it was apparent that in practice no one was actively managing the guides or their related assets, as the lists of both were overwhelming. The creators of existing guides were primarily reference and instruction librarians whose other responsibilities left little time for guide upkeep, and because there was no single person in charge of the guides, there was no one to ensure any maintenance took place.

In addition to the unwieldy guide list and asset library, the BMCC Library was also effectively maintaining two separate A-Z database lists: one on the library's website, a homegrown SQL database built by a previous staff member, and another running on LibGuides to provide links to databases via the guides. The lists were not in sync with one another, and several of the librarians were unaware that the LibGuides version of the list even existed, leading to database links appearing both on the database list and as link assets. And while the LibGuides A-Z list was not linked from the library's website, it was still accessible from points within LibGuides, meaning that patrons could encounter an incorrect list that was not being maintained.

GETTING STARTED

Before any work could be done on our system, there needed to be buy-in from the rest of the library faculty. With the library director in agreement, agenda items were added to department meetings between March and May 2019 for discussion and department approval. The various aspects of the project were pitched to emphasize the following goals:

• Removing outdated material, broken links, etc.
• Streamlining where information could be found
• Decluttering guides to make everything easier for students to use and understand
• Improving the infrastructure to make maintenance and new guide creation easier and more manageable
• Standardizing layouts and content

The aim of all of this was to increase guide usability and accessibility and to make the guides overall a more consistent resource for our students. For the sake of transparency (as well as to have a demo of some of the aesthetic changes discussed in more detail below), a project guide was created and shared with the rest of the library department to share preliminary data as well as detailed updates as tasks were completed.2

PROCESS

The Database List

While the LibGuides A-Z database list, a feature built into v2 of the platform, contained information about our databases, it was essentially only serving to provide links to databases when creating guide content.
There was some indication, in the form of a dormant A-Z Database "guide," that someone had tried to create a list in LibGuides by manually adding assets to a guide. While that was a common practice in LibGuides v1 sites, as the built-in list was not yet a part of the system, the built-in list itself was never properly put into use. The links on our website all pointed to a homegrown list which, while powered by an SQL database, was essentially a manual list. Because of its design, it had proved impossible for anyone in the library to update without extensive web programming knowledge. It seemed a no-brainer to work on the database list first: this way we would have both the infrastructure to update database-related content on the guides and a single, up-to-date list of resources with enhanced functionality that could benefit the library's users almost immediately.3

To begin, the two lists were compared to find any discrepancies, of which there were many. As the e-resources librarian was on leave at the time, the library director was consulted to determine which of the databases missing from the LibGuides list were active subscriptions (and which of the ones missing from the homegrown list were previously cancelled, so they could be removed). Once the database list reflected current holdings, the metadata entries for the databases on the LibGuides side were updated to include resource type, related subjects, and related icons. These updates would enhance the functionality of the LibGuides list, as it could be filtered or searched using that additional information, something that was missing from the homegrown list.

In addition to updating content and adding useful metadata, some slight visual changes were made to improve the look and usability of the list using custom CSS. Most of this was done because, as the list was being worked on, several librarians (of those who were even aware of it in the first place) mentioned that one reason they disliked the LibGuides list was its font size and spacing, which they felt were too small and hard to read. With the list updated, it was presented at the March 2019 department meeting and quickly won over all in attendance, especially when it was pointed out that the list could be very easily maintained because it required no special coding knowledge. While the homegrown list would remain live on the server for the rest of the semester (so as to not disrupt any classes that might have been using it), it was agreed that the web and systems librarian could go ahead with switching all of the links pointing to the homegrown list to point to the Springshare list instead.

The Asset Library

Because of how guides were typically created over the years since adopting LibGuides (many appeared to have been copied from another existing guide each time), the asset library had grown immense and unmanageable. For example, there were 149 separate links to our "Databases by Subject" page on the library's website, the overwhelming majority of which were only used once. There were also 145 separate widgets for the same embedded Scribd-hosted keyword worksheet, which was in fact broken and displayed no content. This is to say nothing of the broken-link report that no one had reviewed in quite some time.
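Duplication on this scale is easier to see in a script than by scrolling the admin interface. As a minimal sketch of the kind of check involved, assuming the asset report has been exported to a spreadsheet, the following Python snippet groups link assets by URL; the file name and column names ("Name", "URL", "Mappings") are illustrative assumptions, not Springshare's actual export schema.

```python
# Minimal sketch: flag duplicate link assets in an exported asset report.
# File and column names are assumptions for illustration only.
import pandas as pd

assets = pd.read_excel("asset_report.xlsx")

# Any URL that appears on more than one asset row is a candidate for
# consolidation into a single designated asset.
dupes = (
    assets.groupby("URL")
    .agg(copies=("Name", "size"), total_mappings=("Mappings", "sum"))
    .query("copies > 1")
    .sort_values("copies", ascending=False)
)
print(dupes.head(20))
```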
Tackling the cleanup of duplicates and the fixing of broken links and embeds was a large piece of the invisible work taken on behind the scenes to make maintaining the guides easier in the future. In order to analyze the data, the asset library report was exported to an Excel file to make it easier to identify issues that needed correction. To start this process, we requested that Springshare technical support wipe out all assets (other than documents) that were not mapped to anything and were just cluttering up the asset library (this ended up being just under 2,000 assets).4 Most of those items had been removed from the guides they were originally included on but were never removed from the asset library. They served no real function other than to clutter up the backend. The guide authors had given the web and systems librarian permission to remove anything broken that could not be easily fixed. This included the aforementioned broken worksheet (and other similar items), as well as an assortment of YouTube video embeds where the video had since been taken down, resulting in a "this video is unavailable" error message. It was felt that since those were already not working and seriously hurt the reliability of our guides, no further permission was needed.

Then came the much more tedious task of standardizing (where possible) which assets were in use. This involved going into guides listed as containing known-duplicate assets, replacing them with a single, designated asset, and then removing the resulting unmapped items.5 It was decided that, while many of the guides would likely be deleted after the spring semester, only assets appearing on currently active guides would be standardized. In hindsight, since many of the links that were fixed were on guides that would soon be deleted, it would have been better to hold off until the guides could be deleted first. However, doing at least some of this work in advance helped find other issues, including instances where our proxy prefix was included directly in the URL (an issue as we were also in the process of changing our EZproxy hosting) and where custom descriptions or link names were unclear.

"Books from the Catalog" assets had their own issues that also needed to be addressed. With a pending migration of the library's ILS, it was already apparent that the links to any books in the library's catalog would need updating so they could have a shot at continuing to function post-migration.6 We had been told at the time that the library's Primo instance would remain through the migration (though this changed during the migration process), so at the time we felt it important to ensure that all links were pointing to Primo, as some had been pointing to the soon-to-be decommissioned OPAC. For consistency, the URLs were structured as ISBN searches instead of ones relying on internal system numbers that would soon change. However, it became obvious very early on that some of the links to books were either pointing to materials that were no longer in the library's collection or pointing to a previously decommissioned OPAC server, both of which resulted in errors. Because the domain of the previously decommissioned OPAC server had been whitelisted in the link checker report settings, these items had not appeared on the broken link list.
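Two of the URL problems described above (hard-coded proxy prefixes and catalog links keyed to internal record numbers) lend themselves to the same kind of scripted audit. The sketch below is illustrative only: the proxy prefix, the Primo search pattern, and the column names are hypothetical stand-ins, not the library's actual values.

```python
# Minimal sketch: audit exported asset URLs for a hard-coded proxy prefix,
# and build catalog links as ISBN searches rather than record-number links.
# All URLs, file names, and column names here are hypothetical.
import pandas as pd

PROXY_PREFIX = "https://ezproxy.example.edu/login?url="  # hypothetical prefix
ISBN_SEARCH = ("https://example.primo.exlibrisgroup.com/discovery/"
               "search?query=isbn,exact,{isbn}")  # hypothetical pattern

assets = pd.read_excel("asset_report.xlsx")

# Assets with the proxy baked into the stored URL will break when the
# EZproxy host changes, even if the link checker reports them as fine.
proxied = assets[assets["URL"].str.startswith(PROXY_PREFIX, na=False)]
print(f"{len(proxied)} assets embed the proxy prefix directly")

def isbn_link(isbn: str) -> str:
    """Catalog link keyed on ISBN, which survives an ILS migration better
    than an internal system number."""
    return ISBN_SEARCH.format(isbn=isbn.replace("-", ""))
```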
Using the filtered list of "Books from the Catalog" assets, all titles were checked, which allowed the web and systems librarian to remove items that were no longer in the collection and make other adjustments as needed. As a result of the asset cleanup process, the asset library went from an unwieldy total of more than 5,000 items to just over 2,000. It also simplified the process of reusing assets in new guides, as there was now only one choice per item, and made it much easier to find and fix broken links and embeds.

The Guides

The cleanup of the guides themselves was by far the most complex task. Before starting the cleanup work itself, the web and systems librarian performed a content analysis to identify which guides to recommend for deletion and which could be converted into general subject area guides. Because a common practice was to create a "custom" guide for each class that came in for a library instruction session, there was an overrepresentation of guides for the classes that had regular sessions: English 101 (English Composition), English 201 (Introduction to Literature), Speech 100 (Public Speaking), and Introduction to Critical Thinking. Those four courses accounted for 187 guides, or over 40 percent of the total number in our system. The majority of them had not been updated directly in over three years and, in some cases, were designed for instructors who no longer taught at the college. Perhaps more telling was that the content of these guides differed more across the librarians who created them than across the courses they were designed for. This meant that while there might be three or four different iterations of the English 101 guide, the guides created by the same librarian for different introductory courses were essentially the same.
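A content analysis like this can also be partly scripted from an exported guide report. A minimal sketch follows; the column names are assumptions about the export rather than LibGuides' actual schema, and the thresholds mirror the "dead guide" criteria listed below (no updates in more than two years, no views in at least one year).

```python
# Minimal sketch: flag "dead" guides from an exported guide report.
# Column names are assumed, not LibGuides' actual export schema.
from datetime import datetime, timedelta

import pandas as pd

guides = pd.read_excel("guide_report.xlsx", parse_dates=["Last Updated"])

cutoff = datetime.now() - timedelta(days=2 * 365)
dead = guides[
    (guides["Last Updated"] < cutoff) & (guides["Views (past year)"] == 0)
]
print(dead[["Title", "Owner", "Last Updated"]].to_string(index=False))
```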
Before the arrival of the web and systems librarian, one of the other librarians had occasionally maintained guide groups for "current courses" and "past courses," but it was unclear if anyone was still actively maintaining these groupings, as guides for current instructors were sometimes under "past courses" and vice versa. Because these groups did not actually hide the guides from view on the master list of guides and appeared to be unnecessary work, it was decided to remove the groupings. Instead, the web and systems librarian planned to revisit the guides on a regular basis to unpublish or remove anything for courses that were no longer taught. However, since the philosophy behind the guides was to move from "custom" guides for each instructor's section to a general guide for the course as a whole in the overwhelming majority of cases, the need for maintaining these groupings was essentially eliminated anyway.

In May 2019, a preliminary list of guides to be deleted was presented to the librarians at the monthly department meeting. The list was broken down as follows:

• Duplicates to be deleted: This portion consisted primarily of course guides like those mentioned above, where multiple guides existed for the same course, most of which used the exact same content.
• Guides to be "merged": While merging guides is not actually possible in the LibGuides platform, there were cases where we had two or three guides for the same course. They could be condensed into a single guide, with the rest deleted.
• Guides to convert to subject area guides: These were guides that were essentially already structured as a subject guide but were titled for a specific course, and in many cases a guide for the subject area did not already exist (for example, a course-specific guide for business would become the business subject area guide).
• Dead guides: These were guides that had not been updated in more than two years and had not been viewed in at least one year.

Librarians were given an opportunity in the department meeting to comment on the list, as well as to contact the web and systems librarian with any comments. Additionally, as some of the classroom faculty on campus had connections to specific guides, the library director also sent out a message to classroom faculty to let them know of our general plan to revamp the guides and that many would be removed over the summer. Surprisingly, there were few objections among either the librarians or the classroom faculty once they understood the rationale and process. Of the few classroom faculty members who did respond to the library director's message, most were more concerned with content or specific links they felt strongly about than with the guides themselves. In those cases, we noted the content requests to make sure they appeared on the new guides. Most of these instructors were satisfied when we further explained our process and, if needed, assured them that the content they requested would be worked into the new guide. Only one instructor who responded, whose assignment was related to a grant they had received, made a strong case for keeping a separate guide for their sections of English 101.

With the project approval out of the way, it was then time to begin removing all of the to-be-deleted guides and start the process of revamping those that would be kept. The goal was to complete the project by the start of the fall semester so that faculty and students would come back to a new (and, hopefully, much improved) set of guides.

Removing Debris

To be cautious, a few preliminary steps were taken before the guides selected for deletion were removed. For starters, the selected guides had their status changed to "unpublished," meaning that they no longer appeared on the public-facing list of guides. This gave everyone a chance to say something if a guide they were actively using suddenly went "missing." These unpublished guides were then downloaded using the LibGuides HTML Backup feature and saved to the department's network share drive. While the HTML Backup output is not a full representation of the guide (the file generated displays as a single page and is missing any formatting or images that were included in the guide), it does include all of a guide's content, meaning that a link or block of text can be retrieved from the backup in those moments of "I know I had this on my guide before but...."

Because of the somewhat haphazard nature of our guides, deleting unwanted ones turned out to present interesting and unexpected challenges. Over the years, some of the librarians had, from time to time, reused individual boxes between guides, but there was no consistency to the practice. While there was a repository guide for reusable content, not everyone used it, or used it consistently.
Thankfully, LibGuides runs a pre-delete check, which proved to be invaluable in this process, as it showed whether any of the boxes displayed on one guide were reused on any others. In most cases where boxes were reused, they were reused on guides that were also on the "to be deleted" list, but that was not always the case. By having that check, we could find the other guides listed and make copies of the boxes that would have otherwise been deleted. If a box was reused on multiple guides that were being kept, it was copied to the Reusable Content guide and then remapped from there.

Cosmetic Improvements

In conjunction with the work being done to improve the content of our guides, the web and systems librarian felt it was the perfect opportunity to update the guide templates and overall aesthetics to make the guides more visually appealing, especially considering that little had been done in this area system-wide apart from setting the default color scheme. Using the project guide as an initial sandbox, several changes were put into motion that would eventually be worked into new templates and pushed out to all of the reworked guides.

The first, and perhaps biggest, change was the move from tab navigation to side navigation (an option first made available with the release of LibGuides v2). While several usability studies have debated using one over the other, in this case side navigation was chosen both for the streamlined nature of the layout as a whole (by default there is only one full content column) and because enabling the box-level navigation could serve as a quick index for anyone looking to find specific content on a page.7 Side navigation also avoided the issue of long lists of tabs spilling into a second row, which further complicated page navigation.

Several changes to the look and feel of the guides were also put into place, with many of the changes coming from suggestions given on various LibGuides style or best-practice guides or from more general web usability guidelines.8 Perhaps most importantly, all of the font sizes were increased for improved readability, especially on box titles and headers, to better facilitate visual scanning. The default fonts were also replaced with two commonly used fonts from the Google Fonts library: Roboto (for headings and titles) and Open Sans (for body text). Additionally, the navigation color scheme was changed because the orange of the college's blue-and-orange color scheme regularly failed accessibility contrast checks and was described by some colleagues as "harsh on the eyes." Instead, two analogous lighter shades of blue (one of which was taken from the college's branding documentation) were selected for the navigation and box titles respectively, both of which allowed the text in those areas to be changed from white to black (again, for improved readability). Figure 1 shows a typical "before" guide navigation design, and figure 2 shows a typical "after" design.

Figure 1. A sample of guide navigation and content frequently found on guides before the start of cleanup

Figure 2. Navigation and content after revisions

Additionally, the web and systems librarian took this opportunity to go through the remaining guides to ensure they were all consistent. Most of this work fell in the area of text styling, or rather, undoing text styling.
It was clear from several of the guides that, over the years, librarians had not been happy with the default font sizes or styles, which led to a lot of customizing using the built-in WYSIWYG text editor. Not only did this create a nightmare in the code itself (as the WYSIWYG editor adds a lot of extraneous tags and style markup), but it also meant that the changes coming from the new stylesheet were not being applied universally, as any properties assigned on a page overrode the global CSS. There was also the issue of paragraph text (<p>) that was sometimes styled as fake headings (made larger or bolder to look like headings, but not using the proper tags), which needed to be corrected for consistency and accessibility purposes.
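The cleanup described here was done by hand in the editor, but the same fixes can be expressed in code. A minimal sketch using Python's BeautifulSoup library follows, stripping inline styles, unwrapping presentational tags, and promoting bold-only paragraphs to real headings; the choice of h3 as the target heading level is an illustrative assumption.

```python
# Minimal sketch: undo WYSIWYG styling in a guide's exported HTML.
from bs4 import BeautifulSoup

def clean_guide_html(html: str) -> str:
    soup = BeautifulSoup(html, "html.parser")

    # Drop inline style attributes so the global stylesheet applies again.
    for tag in soup.find_all(style=True):
        del tag["style"]

    # Unwrap presentational tags left behind by the WYSIWYG editor.
    for tag in soup.find_all(["font", "span"]):
        tag.unwrap()

    # Promote "fake headings" (paragraphs whose only content is bold text)
    # to real heading elements; h3 is an assumed target level.
    for p in soup.find_all("p"):
        kids = [c for c in p.contents if getattr(c, "name", None) or str(c).strip()]
        if len(kids) == 1 and getattr(kids[0], "name", None) in ("b", "strong"):
            kids[0].unwrap()
            p.name = "h3"

    return str(soup)
```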
Replanting and Sprucing Up

With the overwhelming majority of the guides (and their associated assets) deleted, it was finally time to rework the remaining guides into clear, easy-to-use resources that would benefit our students. At this point the guides fell into three categories:

• Guides that just needed to be pruned and updated.
• Guides that should be combined into a single subject area guide.
• Guides that should be created to fill an unmet need.

Pruning and updating tasks were generally the least arduous, as many of the guides included content that was also housed on discrete guides (citations, resource evaluation, etc.). Instead of duplicating, for example, citation formats on every guide, those pages were replaced with navigation-level links out to the existing citation guide. This was also the point at which we could do more extensive quality control, such as switching to a single content column, which further emphasized the extraneous information on many of our guides. Infographics, videos, and long blocks of links or text were scrutinized to determine whether they were helping to enhance students' understanding of the core content or merely providing clutter that would make it more difficult to understand the important information.9

In some cases, going from guide to guide made it apparent that there were guides for multiple courses in a subject area where the resources were basically identical. This was most noticeable in the criminal justice and health education subject areas. In these cases, it made little sense to keep separate course guides when the content was basically the same across them. To remedy this duplication, one of the course guides for each subject was transformed into the subject area guide, and resources were added to ensure it covered the same materials the separate course guides had covered. The remaining course guides were then marked for future deletion, as they were no longer needed.

Lastly, subject areas without guides were identified so that work could be done later to create them. As we had discussed moving toward using the "automagic" integration of guide content into our Blackboard learning management system (LMS), this step will be key in ensuring that all subject areas have at least some resources students can use. However, as of this time we have yet to finish creating these additional guides, and several subject areas (including computer science, nursing, and gender studies) have no guides at all.

NEXT STEPS

Now that all of the work to clean and update our LibGuides is done, the most important next step is coming up with a workflow to ensure that the guides stay relevant and useful. The web and systems librarian mostly left the guides alone for the Fall 2019 semester to allow their colleagues time to use them and report back any issues. To the web and systems librarian's surprise, few issues were reported, but that does not mean there is no room for future improvement. As a department, it is clear that we need a formal plan for maintaining the guides, including update frequency, content review, and guidelines for when guides should be added or deleted.
Additionally, immediately following the conclusion of this cleanup project, the library's website was forced into a server migration and full rebuild for reasons outside the scope of this article. As a result, changes were made to the look and feel of pages on the library's site that will need to be carried through into our guides and associated Springshare platforms. While most of this work is relatively simple, mimicking changes developed in WordPress so that they work properly on external services will take time and effort.

CONCLUSION

Overall, while this project was a massive undertaking (done almost entirely by a single person), the end result, at least on the surface, has made our guides much easier to use and understand. There were several things that, if the project were to be done over, should have been done differently, mostly involving the cleaning of the asset library. However, it is now much easier to refer students to guides for their courses, and the feelings about the guides among the library faculty have become much more positive.

ENDNOTES

1 "LibGuides: The Next Generation!," Springshare Blog (blog), June 26, 2013, https://blog.springshare.com/2013/06/26/libguides-the-next-generation/.

2 The guide can be viewed at https://bmcc.libguides.com/guidecleanup.

3 Though the author only learned of it after finishing this project, a similar project undertaken at UNC was outlined in Sarah Joy Arnold, "Out with the Old, in with the New: Migrating to LibGuides A-Z Database List," Journal of Electronic Resources Librarianship 29, no. 2 (April 2017): 117–20, https://doi.org/10.1080/1941126X.2017.1304769.

4 Because there was no way to view the documents before a bulk deletion, documents were manually reviewed and deleted as needed.

5 It was only long after this process that Springshare promoted that they could do this on the backend by request.

6 However, it turned out that due to the differences in URL structure between classic Primo and Primo VE, this change was unnecessary effort, as the URLs actually needed to be changed again post-migration. At least they were consistent, which meant a systemwide find-and-replace could take care of most of the links.

7 Several studies have been done since the rollout of LibGuides v2, including: Sarah Thorngate and Allison Hoden, "Exploratory Usability Testing of User Interface Options in LibGuides 2," College and Research Libraries 78, no. 6 (2017): 844–61, https://doi.org/10.5860/crl.78.6.844; Kate Conerton and Cheryl Goldenstein, "Making LibGuides Work: Student Interviews and Usability Tests," Internet Reference Services Quarterly 22, no. 1 (January 2017): 43–54, https://doi.org/10.1080/10875301.2017.1290002.

8 Of the many guides the author consulted, the following were the most informative: Stephanie Jacobs, "Best Practices for LibGuides at USF," https://guides.lib.usf.edu/c.php?g=388525&p=2635904; Jesse Martinez, "LibGuides Standards and Best Practices," https://libguides.bc.edu/guidestandards/getting-started; Carrie Williams, "Best Practices for Building Guides & Accessibility Tips," https://training.springshare.com/libguides/best-practices-accessibility/video.

9 There is a very detailed discussion of cognitive overload in LibGuides in Jennifer J. Little, "Cognitive Load Theory and Library Research Guides," Internet Reference Services Quarterly 15, no.
1 (March 1, 2010): 53–63, https://doi.org/10.1080/10875300903530199.

12191 ---- Making Disciplinary Research Audible: The Academic Library as Podcaster Drew Smith, Meghan L. Cook, and Matt Torrence INFORMATION TECHNOLOGY AND LIBRARIES | SEPTEMBER 2020 https://doi.org/10.6017/ital.v39i3.12191

Drew Smith (dsmith@usf.edu) is Associate Librarian, University of South Florida. Meghan L. Cook (mlcook3@usf.edu) is Coordinator of Library Operations, University of South Florida. Matt Torrence (torrence@usf.edu) is Associate Librarian, University of South Florida. © 2020.

ABSTRACT

Academic libraries have long consulted with faculty and graduate students on ways to measure the impact of their published research, which now include altmetrics. Podcasting is becoming a more viable method of publicizing academic research to a broad audience. Because individual academic departments may lack the ability to produce podcasts, the library can serve as the most appropriate academic unit to undertake podcast production on behalf of researchers. The article identifies what library staff and equipment are required, describes the process needed to produce and market the published episodes, and offers preliminary assessments of the podcast's impact.

INTRODUCTION

The academic library has always had an essential role in the research activities of university faculty and graduate students, but until the last several years, that role primarily focused on assisting university researchers with obtaining access to all relevant published research in their fields, making it possible for those researchers to complete a thorough literature review. More recently, that role has evolved to encompass other aspects of research and publication, including consulting on copyright-related issues, advising researchers on the most appropriate places to publish, preserving publications and data in institutional repositories, helping tenure-track faculty evaluate their research impact as part of the tenure and promotion process, and hosting open-access journals.

Meanwhile, libraries of all types have experimented over the last ten to fifteen years with using social media to promote library collections, services, and events. Many libraries have taken advantage of Facebook, Twitter, and YouTube as part of these efforts. Increasingly, libraries have incorporated makerspaces so that library patrons can create and edit video and audio files, meaning that this same equipment and software is now available to librarians and other library staff for their own purposes. This has resulted in libraries producing promotional videos and podcasts.
The dramatic increase in mobile technology (smartphone and tablet) ownership and usage over the last decade has resulted in an increase in the consumption of podcasts wherever the listener happens to be when their ears are not otherwise fully occupied, such as while commuting, exercising, or doing household chores. As a result, academic libraries now find themselves in an excellent position to use podcasting for instructional and promotional purposes in an effort to reach a broad audience.

What happens when the university library combines its inherent interest in supporting the promotion of faculty and graduate student research with its ability to create podcasts that quickly and inexpensively reach an international audience? This paper documents the efforts of an academic library at a high-level research university to partner with one of the university's academic departments to use podcasting to promote the research done by that department's faculty and doctoral candidates. We will describe which library staff were involved, how the podcast was planned, the execution of the podcasting process, the issues that were encountered throughout the process, and how the impact of the podcast was assessed. Calling: Earth, the podcast produced by the University of South Florida (USF) Libraries, can be found at http://callingearth.lib.usf.edu/.

LITERATURE REVIEW

Podcasting as a means of promoting scholarly communication is a relatively new and uncommon idea in a library setting; therefore, the extant literature on the subject is scarce. A high percentage of contemporary articles on the topic focus on the use of podcasts to satisfy a wide array of student learning needs. While knowledge of pedagogical best practices is useful, the existing literature is not an exact match for the concept of promoting scholarly communication, which offers subject specificity, faculty and graduate interaction, marketing of libraries, and research visibility as aggregate goals. What follows in this literature review is a summary of a slice of the literature related to podcasting, academia, and/or libraries.

The researchers chose as a starting point the general use of podcasting, as well as social media, in various academic and library environments. In a recent article on the use of social media and altmetrics, for example, the increased use of these tools is outlined, but with numerous caveats regarding the initial non-probabilistic methods of gathering information on the how and why of their adoption.1 To further examine the use of podcasts and, in a related way, social marketing, an article related to Association of Research Libraries (ARL) efforts in this vein was reviewed. A comprehensive study of ARL member libraries published in 2011, with not much on this topic published since that date, demonstrated in figure 1 of their research that five of the 37 respondents' podcasts contained recorded interviews and only one included scholarly publishing content.2 This ten-year vacuum in further research was unexpected but indicates an opportunity for a new type of podcast focusing on academic production. Scholars in academic libraries have long examined student preferences for new technologies and types of information transfer, including the use of podcasts.
A study from Sam Houston State University found that 36 percent of users in 2011 were using podcasts for recreational purposes, with much lower use for academic and scholarly communication purposes.3 In the future, academic creation and utilization of podcasts for scholarly communication will be ripe for a hearty statistical and qualitative analysis. Specific to this inquiry, literature on the application of podcasts for scholarly communication within a subject discipline appears to be lacking. Furthermore, this literature review emphasizes the dearth of research related to promoting the research efforts of geosciences faculty and graduate students.

In terms of recent literature, there are also a number of publications that deal with the history and evolution of podcasting in education and, specifically, higher education. One such current work provides an excellent outline of this growth in use, as well as outlining several major types, or genres, of podcasting in these environments. Following a strong and succinct overview of the technology and its use in college and university settings, the author effectively defines, with examples, the three main genres they have identified: the "Quick Burst," the "Narrative," and the "Chat Show."4 The model that most closely matches USF's Calling: Earth program is the "Narrative," as this genre includes a subcategory of "Storytelling." This work is truly beneficial for any group or individual developing, or improving, an educational podcast effort.

In 2011, Peoples and Tilley outlined the emergence of podcasts to disseminate information in academic libraries. One of the excellent questions that arises from this work deals with the access, advancement, and archiving of the content: is this content to be archived, or cataloged, as more permanent material, or is it electronic ephemera?5 This is a question for the USF Calling: Earth podcast group going forward as the level and quality of content and, ideally, use are expanded. Additionally, educators are studying more about the limitations of podcasts, not to rule them out as academic tools, but to inspire and enhance the best possible outcomes. One excellent warning to be heeded by any library hoping to utilize podcasts for education and dissemination of research is summed up well in this quote: "If students do not utilize or do not realize the benefits of the self-pacing multimedia characteristics of podcasting, then the resource becomes a more likely contributor to cognitive overload."6

There have been a small number of studies of the quantitative elements of podcast use in academic libraries. An article in VINE: The Journal of Information & Knowledge Management Systems outlined, via content analysis and other methods, various unique and shared characteristics of existing academic podcasts, while also furthering the concept of podcasting as a "library service."7 This may not have been the first publication to make this assertion, but it is a view also held by these authors, and it shapes the development and advancement of the USF Libraries' podcasting efforts. Librarians of all types must be wary, however, as there are numerous articles that focus on better understanding student learning preferences.
As a recent article on the success of satellite and distance learners showed, though, these tools often hit the spot on the delivery preferences of these types of students.8

Switching gears to a bit more topic specificity, a number of news and academic articles were identified on the use of podcasts in areas of the geosciences. One such effort is the Geology Flannelcast. The development and implementation of this combination of education and entertainment, which is also a goal of these authors, is outlined in the creators' poster presentation at a recent Geological Society of America conference. With a focus on the increasing ease of podcasting technology, the reduced cost of equipment, and the use of a "conversational atmosphere" within a pedagogical framework, this model stood out as one worth studying.9 Furthermore, the geosciences are, or can be, interesting and exciting. A recent podcast on communicating geosciences with fun and flair was just the encouragement this research group needed to go all-in on this project, and a reminder that the geosciences are far from boring!10

As is evidenced by an examination of current and historical literature on this topic, there are multiple opportunities for further exploration and library efforts, especially as one of the main points of this work is to emphasize faculty and graduate research efforts, scholarly communication, and original content creation. In addition to the focus on these publication and presentation efforts, the results will be measured by initial assessment projects, including download and utilization data and, hopefully, positive feedback from participants and library administration. Further measurement is expected to demonstrate increased citation counts and downloads of the publications of the faculty and graduate student interviewees. It will be correlation and not causation, of course, but the team hopes to have positive feedback for participants and the library.

STAFFING

As with any successful project, a project to produce a podcast focused on academic research had to begin with individuals who had either the interest or the expertise, ideally both, to initiate the work. One was an associate librarian with more than 13 years of experience producing regular podcasts, while the other was a library staff member who was a doctoral candidate serving on the USF Libraries Research Platform Team (RPT) for the USF School of Geosciences. The RPT was already tasked with assisting the Geosciences faculty and graduate students in maximizing the impact of their work and had been using various means to accomplish this, such as an institutional repository for research output and tools to measure the impact of previously published work. During a conversation in late 2018, the librarian suggested to the RPT staff person that podcasting could be used to promote research to a variety of audiences, including USF faculty and students, faculty and students at other universities, K-12 science teachers, and members of the general public (both local and beyond). The librarian offered to initiate the podcast and train the RPT staff on how to continue the podcast after a number of episodes had been produced.
The librarian brought to the project the needed expertise with launching and maintaining a podcast, while the RPT doctoral candidate was already familiar with the Geosciences faculty and other doctoral candidates and could identify those who would make good candidates for being interviewed about their research.

PLANNING

The initial planning for the podcast began approximately two months before the first episode release. The original project managers and podcast creators met a number of times to discuss logistics, equipment, and staffing needs, and to agree upon a podcast name (Calling: Earth). Since the notion of podcasting for researcher promotion was unexplored territory, support from higher administration was cautious. However, after production of the first episodes, traction behind the podcast grew and additional support for future endeavors was received.

The podcasters acquired handheld recording equipment, a Tascam DR-05 Linear PCM Recorder, from the USF Libraries Digital Media Commons and tested it in multiple environments (for instance, a quiet office versus a recording studio) to find the optimal location to record the interviews. We found that the handheld recording equipment worked well in a quiet office and allowed for travel to the researcher's office if they requested it.

The podcast creation team discussed how to add intro and outro music that would not violate any copyright restrictions but would fit the mood of the podcast. The RPT staff person knew of a local Tampa-based band, The Growlers, as a potential source for music, because the bass guitarist was an adjunct professor and alumnus of the USF School of Geosciences. The alumnus gave permission to use a portion of the band's recorded music for the podcast.

A hosting service was needed to host and publish the podcast. The librarian suggested using Libsyn because of their 13 years of previous experience with the platform, Libsyn's inexpensive hosting plan, and the ability to acquire statistics, including the geographic locations (countries and states) where the podcast was being downloaded.

EXECUTION

Potential interviewees were contacted via email and invited to be interviewed. Once a potential interviewee agreed, a time and a place to conduct the interview were arranged. The RPT staff person determined what the most recent research was for each interviewee and then provided that content to the librarian host for review. The host then prepared interview questions based on the research content. The host went over the questions with the interviewee before the interview began, both to clear the content with the interviewee and to make sure the interview would cover everything the interviewee wished to cover. The interviews took approximately 30 minutes to an hour.

Editing of the podcast was done using GarageBand, allowing for the addition of the music at the beginning and end, as well as the host introducing both the general podcast and the specific episode, identifying the academic units involved in the podcast, indicating how listeners might provide feedback, and thanking the music group for allowing the use of their music. In a few rare cases, small interview segments were removed, usually because the interviewee felt they did not represent them well.
CHALLENGES

As with any new endeavor, challenges were faced at all stages in the process of getting the podcast to production and beyond.

Buy-in from Library Administration

An early challenge was to gain buy-in from the library administration. This began with requesting that the library fund the hosting service; the administrator felt it was a worthwhile experiment, at least in the short term. Once a number of episodes had been produced, the library administration had a better sense of the quality of the production and how it would serve the interests of the library in its academic support role.

Lack of Budget

With no budget for this project (beyond the administration's monthly payment for the hosting service), the podcasters were at the mercy of the quality of the recorders available for library checkout. If the recorders did not produce a high-quality recording, the podcast would lack the sophistication needed for production. High-quality graphics work was also needed, which required looking to other library units for help with creating a logo.

Getting the Podcast into Apple Podcasts

Once content was being produced and published, it was time to submit the podcast to Apple Podcasts. Apple initially rejected the submission because the first logo looked very similar to an iPhone. It should be noted that Apple did not supply a specific explanation of what copyright was being infringed, so the podcasters were faced with making a best guess as to what the problem was. Based on that assumption, we changed the logo and resubmitted the podcast. A further problem arose when Apple required that the new submission use a different RSS feed than the original submission. Eventually the podcasters sought assistance from Libsyn, who explained how to make a minor change to the URL of the RSS feed so that the podcast could be successfully resubmitted.

New Logo Creation

The first logo continued to be used for the entire first season, but before the second season was released, the library's new communications and marketing coordinator assisted with the creation of a new logo that looked more sophisticated and more in line with other podcast logos. Having an in-house graphics designer was extremely helpful in rolling out a new logo (see figures 1 and 2).

Figure 1. Season 1 logo

Figure 2. Current logo

Setting Up Interviews

Identifying potential interviewees, requesting interviews, and setting good times and locations for the interviews brought another batch of challenges. The USF School of Geosciences is composed of geologists, geographers, and environmental scientists, so when planning the schedule of potential interviewees, an effort was made to involve a wide range of researchers. Some potential interviewees declined the request altogether, while others were not available during the needed time period. Given that the podcast was released every two weeks, there was a little wiggle room for scheduling hiccups, but once or twice a last-minute request to a new potential interviewee was made to ensure production stayed on schedule. Where and when the interview would be held required a lot of back-and-forth emails between the RPT staff person and the interviewee.
Preference on time and location was given to the interviewee, but it was requested that, if they did not want to come to the library to be interviewed, their own office or lab space be used only if it was a sufficiently quiet environment for recording purposes.

Comfort of the Interviewee

Once an interview began, the challenges of engagement from the host and the comfort of the interviewee became apparent. The host had to engage the researcher at a level appropriate for a general audience, which was challenging given that the research done by the USF School of Geosciences often involves a high level of critical thinking and problem-solving. Adding to the complexity of the research being explained, the comfort level of the interviewee had the potential to dampen the interview. One researcher was so uncomfortable speaking in an interview that they typed up in advance what they wanted to say.

ASSESSMENT

Libsyn Statistics

According to Libsyn statistics (as of July 17, 2020), there were a total of 3,593 unique downloads, from 48 different countries, of the published 35 episodes of Calling: Earth. Table 1 shows the 48 countries where Calling: Earth has been downloaded and how many times the podcast has been downloaded in each. It is worth noting that 105 downloads do not have a location specified, so the total of the downloads in table 1 does not equal the total number of downloads reported by Libsyn.

Table 1. Downloads by Country

Country               Downloads    Country                 Downloads
United States         2,729        Chile                   3
United Kingdom        103          Denmark                 3
India                 98           Romania                 3
Australia             88           South Africa            3
France                62           Yemen                   3
Ireland               50           Argentina               2
Bangladesh            43           Ecuador                 2
Spain                 37           Poland                  2
Russian Federation    36           Taiwan                  2
Norway                30           Turkey                  2
Portugal              30           Belgium                 1
Germany               20           Bulgaria                1
Japan                 19           Colombia                1
Mexico                18           Costa Rica              1
Italy                 14           Estonia                 1
Netherlands           12           Greece                  1
New Zealand           11           Latvia                  1
Brazil                9            Macedonia               1
Korea, Republic of    9            Nigeria                 1
Czech Republic        7            Pakistan                1
Ukraine               7            Saudi Arabia            1
China                 6            United Arab Emirates    1
Hong Kong             5            Vietnam                 1
Sweden                4            Without a location      105
Canada                3

Preliminary Survey and Scholarly Impact

A survey was sent to the interviewees to gauge their impressions of the podcast and to see if they had noticed any impact on their citations or document downloads. Our goal for the survey was to find out whether the podcast was accomplishing the intention behind starting it, which was to increase researcher impact through research dissemination, as well as to inform the podcast's processes and procedures. The questions asked were:

1. In what ways do you view the Calling: Earth podcast as a way to positively affect your research impact?
2. What evidence do you have, if any, to suggest your research has been positively impacted because of being an interviewee on the Calling: Earth podcast?
3. What would you have liked to be different about your interview process for the Calling: Earth podcast?
4. What suggestions do you have for future seasons of the Calling: Earth podcast? For example, should the format change, the focus be different, the length of the interview change, etc.?
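Returning briefly to the download statistics above: tallies like those in table 1 can be reproduced from a statistics export rather than read off the hosting dashboard each time. The following is a minimal sketch in Python, assuming a hypothetical CSV export with "country" and "downloads" columns; Libsyn's actual export format may differ.

```python
# Minimal sketch: aggregate podcast download statistics by country.
# The file name and column names are hypothetical stand-ins for whatever
# the hosting service's export actually provides.
import pandas as pd

stats = pd.read_csv("libsyn_export.csv")

by_country = (
    stats.groupby("country")["downloads"]
    .sum()
    .sort_values(ascending=False)
)
print(by_country.head(10))          # top ten countries by downloads
print("total:", by_country.sum())   # should match the dashboard total
```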
Furthermore, each interviewee was asked to contribute their scholarship to the library's institutional repository, Scholar Commons, to allow for the archiving of their research publications and to serve as a means of tracking scholarship impact resulting from the podcast. Once an interviewee's scholarship was placed in Scholar Commons, a Selected Works profile was created so that a direct link to the scholar's work could be disseminated through the podcast notes.

Impact on faculty has also been noteworthy. The download totals for faculty interview participants (comparing roughly the same amount of time just prior to and following their published interview) showed an average increase of 30 percent, suggesting a strong correlative link between the podcast and researcher impact. Furthermore, anecdotal evidence from interviewees, such as "puts my name out there to a wider audience," "enhances the visibility of my work," and "allow[s] others to hear about [my research] in a more passive way," indicates the potential impact a researcher can see from being a part of the podcast.

A second survey was sent to the faculty, students, and staff of the entire School of Geosciences to determine who was listening to the podcast and, conversely, who was not, as well as their reasons for listening or not listening. The survey contained five questions in total, but depending on how a participant answered, not all of the questions were presented (figure 3). The first question asked their status in the School of Geosciences (faculty, staff, undergraduate, graduate, or other). The second question asked whether they had heard of the podcast and whether they had or had not listened to it. If a participant chose the option that they had never heard of the podcast, the survey ended for them. If a participant chose the option that they had heard of the podcast but had not listened to it, the survey directed them to a question asking them to provide reasons they had not listened. If a participant chose the option that they had heard of the podcast and had listened to at least one episode, the survey directed them to a question asking how many episodes they had listened to and why they were listening to the podcast. This data was collected to inform the future direction of the podcast.

Figure 3. Flow Chart for the Entire School of Geosciences Survey

CHECKLIST FOR PODCAST PLANNING/EXECUTION

Based on our experiences in the production of the Calling: Earth podcast, we recommend that academic librarians and library staff use the following list to help with planning and executing the production of their own podcasts:

• Get general buy-in from library staff and administration, and provide updates as the planning progresses and budgeting is needed.
• Decide on goals, audience, content, format, frequency of production, and methods of assessment.
• Work with media staff to design marketing, including podcast title (avoiding duplication with other podcasts) and logo development.
• Choose a podcast hosting service.
• Identify relevant staff for hosting, recording, editing, and publishing, and train as needed.
• Evaluate existing hardware and software and make additional purchases as needed.
• Contact potential interviewees and create a schedule.
• Prepare customized interview questions and share as appropriate with interviewees.
• Record interviews.
• Edit and publish episodes.
• Submit the podcast to Apple Podcasts, Spotify, and other popular podcast directories.
• Monitor statistics.
• Continue to engage in marketing and assessment activities.

CONCLUSIONS AND FUTURE DIRECTIONS

Enthusiasm and anecdotal positive feedback are enough fuel for current activities, and the future of podcasting in libraries appears open and exciting. At the USF Libraries, Calling: Earth is currently in its third season, and with each new episode, new ideas and increased archival content become a permanent part of the library's legacy and collections. This is another area ripe for future exploration, as this type of original content is archived, cataloged, and disseminated, becoming another part of regular academic impact measures. In this vein, the USF Libraries podcasting group plans to further codify cyclical assessment tools, including the receipt of IRB clearance for future surveys and data collection. In addition to cleaning up and refining these assessment practices, this will provide the opportunity to publish and present publicly on more specific data. Ideally, the group will be able to correlate the show's presence with positive citation or metrics levels among show participants. The USF Libraries Geosciences RPT is currently collecting baseline aggregate information, which could then be compared following further maturation and dissemination of the podcast. Causality may never be within reach, but any positive impacts will be exciting and beneficial. It is also the hope of those involved with Calling: Earth that it might provide a model or template for other RPT or library podcasts or media efforts. One of the current benefits is the strong and effective support from the Development and Communication directors at the USF Libraries, and their partnerships in the future will certainly be key to the success of this and any other potential projects of this type. In closing, the academic library podcasting landscape is wide open for further exploration and examination, and the USF Libraries plans to lead and learn.

ENDNOTES

1 Cassidy R. Sugimoto et al., "Scholarly Use of Social Media and Altmetrics: A Review of the Literature," Journal of the Association for Information Science and Technology 68, no. 9 (2017): 2037-62.
2 James Bierman and Maura L. Valentino, "Podcasting Initiatives in American Research Libraries," Library Hi Tech 29, no. 2 (May 2011): 349, https://doi.org/10.1108/07378831111138215.

3 Erin Dorris Cassidy et al., "Higher Education and Emerging Technologies: Student Usage, Preferences, and Lessons for Library Services," Reference & User Services Quarterly 50, no. 4 (2011): 380-91, https://doi.org/10.5860/rusq.50n4.380.

4 Christopher Drew, "Educational Podcasts: A Genre Analysis," E-Learning and Digital Media 14, no. 4 (2017): 201-11, https://doi.org/10.1177/2042753017736177.

5 Brock Peoples and Carol Tilley, "Podcasts as an Emerging Information Resource," College & Undergraduate Libraries 18, no. 1 (January 2011): 44, https://doi.org/10.1080/10691316.2010.550529.

6 Stephen M. Walls et al., "Podcasting in Education: Are Students as Ready and Eager as We Think They Are?," Computers & Education 54, no. 2 (January 2010): 372, https://doi.org/10.1016/j.compedu.2009.08.018.

7 Tanmay De Sarkar, "Introducing Podcast in Library Service: An Analytical Study," Vine 42, no. 2 (2012): 191-213, https://doi.org/10.1108/03055721211227237.

8 Lizah Ismail, "Removing the Road Block to Students' Success: In-Person or Online? Library Instructional Delivery Preferences of Satellite Students," Journal of Library & Information Services in Distance Learning 10, no. 3-4 (2016): 286-311, https://doi.org/10.1080/1533290X.2016.1219206.

9 Jesse Thornburg, "Podcasting to Educate a Diverse Audience: Introducing the Geology Flannelcast," in Innovative and Multidisciplinary Approaches to Geoscience Education (Posters) (Boulder, CO: Geological Society of America, 2015).

10 Catherine Pennington, "PODCAST: Geology Is Boring, Right? What?! NO! Why Scientists Should Communicate Geoscience...," n.d., https://britgeopeople.blogspot.com/2018/10/PODCAST-geology-is-boring-right.html.

12197 ---- Using the Harvesting Method to Submit ETDs into ProQuest: A Case Study of a Lesser-Known Approach

COMMUNICATIONS

Using the Harvesting Method to Submit ETDs into ProQuest
A Case Study of a Lesser-Known Approach

Marielle Veve

INFORMATION TECHNOLOGY AND LIBRARIES | SEPTEMBER 2020
https://doi.org/10.6017/ital.v39i3.12197

Marielle Veve (m.veve@unf.edu) is Metadata Librarian, University of North Florida. © 2020.

ABSTRACT
The following case study describes an academic library's recent experience implementing the harvesting method to submit electronic theses and dissertations (ETDs) into the ProQuest Dissertations & Theses Global database (PQDT).
In this lesser-known approach, ETDs are deposited first in the institutional repository (IR), where they get processed, to be later harvested for free by ProQuest through the IR's Open Archives Initiative (OAI) feed. The method provides a series of advantages over some of the alternative methods, including students' choice to opt in or out of ProQuest, better control over the embargo restrictions, and more customization power without having to rely on overly complicated workflows. Institutions interested in adopting a simple, automated, post-IR method to submit ETDs into ProQuest, while keeping the local workflow, should benefit from this method.

INTRODUCTION

The University of North Florida (UNF) is a midsize public institution established in 1972, with the first theses and dissertations (TDs) submitted in 1974. Since then, copies have been deposited in the library, where bibliographic records are created and entered in the library catalog and the Online Computer Library Center (OCLC). During the period of 1999 to 2012, some TDs were also deposited in ProQuest by the graduate school on behalf of students who chose to do so. This practice, however, was discontinued in the summer of 2012, when the institutional repository, Digital Commons, was established and submission to it became mandatory.

Five years later, in the summer of 2017, interest in getting UNF TDs hosted in ProQuest resurfaced. This renewed interest grew out of a desire among some faculty and graduate students to see the institution's electronic theses and dissertations (ETDs) posted there, as well as a recent library subscription to the ProQuest Dissertations & Theses Global database (PQDT). A month later, conversations between the library and graduate school began on the possibility of resuming hosting UNF ETDs in ProQuest. Consensus was reached that the PQDT database would be a good exposure point for our ETDs, in addition to the institutional repository (IR), yet some concerns were raised. One of the concerns was the cost of the service and who would be paying for it; neither the library nor the graduate school had allocated funds for this. The next concern was the possibility of ProQuest imposing restrictions that could prevent students, or the university, from posting ETDs in other places. It was important to make sure there were no such restrictions. Another concern was expressed over students entering embargo dates in ProQuest that do not match the embargo dates selected for the IR, a common problem encountered by other libraries.1 For that reason, we wanted to keep the local workflow. The last concern expressed during the conversations was preserving students' right to opt in or out of distributing their theses in ProQuest. This is something both the graduate school and library have been adamant about. In higher education, requiring students to submit to ProQuest is a controversial issue that has raised ethical concerns and has been highly debated over the years.2 Once conversations between the library and graduate school were held and concerns were gathered, the library moved ahead to investigate the available options to submit ETDs into ProQuest.
LITERATURE REVIEW

Currently, there are three options to submit ETDs into ProQuest: (1) submission through the ProQuest ETD Administrator tool, (2) submission via File Transfer Protocol (FTP), and (3) submission through harvests performed by ProQuest.3

ProQuest ETD Administrator Submission Option

In this option, a proprietary submission tool called ProQuest ETD Administrator is used by students, or assigned administrators, to upload ETDs into ProQuest. Inside the tool, a fixed metadata form is completed with information on the degree, subject terms are selected from a proprietary list, and keywords are provided. The whole administrative and review process gets done inside the tool. Afterwards, zip packages with the ETDs and ProQuest's Extensible Markup Language (XML) files are sent to the institution via FTP transfers, or through direct deposits to the IR using the Simple Web-service Offering Repository Deposit (SWORD) protocol.

The ETD Administrator submission method presents several shortcomings. First, the ProQuest XML metadata that is returned to the institutions must be transformed into IR metadata for ingest in the IR, a process that can be long and labor intensive.4 Second, the subject terms supplied in the returned files come from a proprietary list of categories maintained by ProQuest, which does not match the Library of Congress Subject Headings (LCSH) used by libraries.5 Third, control over the metadata provided is lost because the metadata form cannot be altered, and customizations to other parts of the system can be difficult to integrate.6 Fourth, there have been issues with students indicating different embargo periods in the ProQuest and IR publishing options, with instances of students choosing to embargo ETDs in the IR but not in ProQuest.7 Lastly, this method does not allow for student choice unless the ETDs are submitted separately in two systems, a process that can be burdensome. Ultimately, for these reasons, we found the ETD Administrator not a suitable option for our institution.

FTP Submission Option

In this option, an administrator sends zip packages with the institution's ETD files and ProQuest XML metadata to ProQuest via FTP.8 At the time of this investigation, there was a $25 charge per ETD submitted through this method.9 We did not want to pursue this option because of the charge and the tedious metadata transformations that would be needed between IR and ProQuest XML schemas. Another way around this would have been to submit the ETDs through the VIREO application. VIREO is an open-source ETD management system used by libraries to freely submit ETDs into ProQuest via FTP.10 This alternative, however, was not an option for us, as our IR, Digital Commons, does not support the VIREO application.

Harvesting Submission Option

This is the latest method available to submit ETDs into ProQuest. In this option, ETDs are submitted first into an IR, or other internal system, where they get processed to be later harvested by ProQuest through the IR's existing Open Archives Initiative (OAI) feed.11 At the time of this writing, we were not able to find a single study that documents the use of this method. This option looked appealing and worth pursuing, as it met most of our desired criteria. First, with this option, students' choice would not be compromised, as ETDs would be submitted to ProQuest after being posted in the IR.
Second, because the ETD Administrator would not be used, issues with conflicting embargo dates and unalterable metadata forms would be avoided. In addition, the local workflow would be retained, thus eliminating the need for tedious metadata transformations between ProQuest and IR schemas. From the available options, this one seemed the most feasible solution for our institution.

IMPLEMENTATION OF THE HARVESTING METHOD AT UNF

After research on the different submittal options was performed, the library approached ProQuest to express interest in depositing our future ETDs into their system by using a post-IR option. In the first communications, ProQuest suggested we use the ETD Administrator to submit ETDs because it is the most commonly used method. When we expressed interest in the harvesting option, they said "we have not been harvesting from BePress sites" (BePress being the company that makes Digital Commons) and suggested we use the FTP option instead.12 Ten months later, they clarified that the harvests could be performed from BePress sites and that the option is free, with the only requirement being a non-exclusive agreement between the university and ProQuest. The news eased both the library's and the graduate school's previous concerns, as we would be able to adopt a free method that would not compromise on students' choice nor restrict students from posting in other places, while keeping the local workflow. After agreement on the submittal method was established, planning and testing of the harvesting method began. The library worked with ProQuest and BePress to customize the harvesting process while the university's Office of the General Counsel worked with ProQuest on the negotiation process.

Negotiation Process

Before ProQuest could harvest UNF ETDs, two legal documents needed to be in place. The first document was the Theses and Dissertations Distribution Agreement, which specifies the conditions under which ETDs can be obtained, reproduced, and disseminated by ProQuest. The document had to be signed by UNF's Board of Trustees and ProQuest. The agreement stipulated the following conditions:

• The agreement must be non-exclusive.
• The university must make the full-text Uniform Resource Locators (URLs) and abstracts of ETDs available to ProQuest.
• ProQuest must harvest the ETDs from the university's IR.
• The university and students have the option to elect not to submit individual works or to withdraw them.
• No fees are due from the university or students for the service.
• ProQuest must include the ETDs in the PQDT database.

The second document that needed to be in place was the Theses and Dissertations Availability Agreement, which grants the university the non-exclusive right to reproduce and distribute the ETDs. This agreement between students and UNF specifies the places where ETDs can be hosted and the embargo restrictions, if any. UNF had already been using this document as part of its ETD workflow, but the document needed to be modified to include the additional option to submit ETDs into ProQuest. Beginning with the spring 2019 semester, the revised version of the agreement provided students with two hosting alternatives: posting in the IR only, or in the IR and ProQuest.
Local Steps Performed Before the Harvesting

The workflow begins when students upload their ETDs and supplemental files (Certificate of Approval and Availability Agreements) directly into the Digital Commons IR. There, students complete a metadata template with information on the degree and provide keywords related to the thesis. After this, the graduate school reviews the submitted ETDs and approves them inside the IR platform.

Next, the Library Digital Projects staff downloads the native PDF files of the ETDs, processes them, and creates public and archival versions for each ETD. Availability Agreements are reviewed to determine which students chose to embargo their ETDs and which ones chose to host them in ProQuest in addition to the IR. If students choose to embargo their ETDs, the embargo dates are entered in the metadata template. If students choose to publish their ETDs in ProQuest, a "ProQuest: Yes" option is checked in their metadata template, while students who choose not to host in ProQuest get a "ProQuest: No" in their template. (The ProQuest field is a new administrative field that was added to the ETD metadata template, starting with the spring 2019 semester, to assist with the harvesting process. It was designed to alert ProQuest to the ETDs that were authorized for harvesting. More detail on its functionality will be provided in the next section.) The reason library staff enter the ProQuest and embargo fields on behalf of students is to avoid having students enter incorrect data on the template. Following this review, the Metadata Librarian assigns Library of Congress Subject Headings to each ETD and creates authority files for the authors. These are also entered in the metadata template.

Afterwards, the ETDs get posted in the Digital Commons public display, with the full-text PDF files available only for the non-embargoed ETDs. Information that appears in the public display of Digital Commons will also appear immediately in the OAI feed for harvesting. At this point, two separate processes take place:

1. The Metadata Librarian harvests the ETDs' metadata from the OAI feed and converts it into MARC records that are sent to OCLC, with the IR's URL attached. The workflow is described at https://journal.code4lib.org/articles/11676.
2. On the seventh of each month, ProQuest harvests the full-text PDF files, with some metadata, of the non-embargoed ETDs that were authorized for harvesting from the OAI feed.

Harvesting Process (Customized for Our Institution)

To perform the harvests, ProQuest creates a customized robot for each institution that crawls OAI-PMH compliant repositories to harvest metadata and full-text PDF files of ETDs.13 The robot performs a date-limited OAI request to pull everything that has been published or edited in an IR's publication set during a specific timeframe. Information to formulate the date-limited request is provided to ProQuest by the institution for the first harvest only; subsequently, the process gets done automatically by the robot. The request contains the following elements:

• Base URL of the OAI repository
• Publication set
• Metadata prefix or type of metadata
• Date range of titles to be harvested
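A minimal sketch of such a date-limited request follows. The verb and parameter names (ListRecords, metadataPrefix, set, from, until) are standard OAI-PMH; the base URL, set name, and metadata prefix shown here are hypothetical placeholders, since the actual values vary by repository:

https://digitalcommons.example.edu/do/oai/?verb=ListRecords&metadataPrefix=qdc&set=publication:etd&from=2019-04-07&until=2019-05-07

Here ListRecords asks for full metadata records, the set parameter restricts the response to the ETD publication set, and from/until bound the request to items published or edited within the given date range.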
In the particular case of our institution, we needed to customize the robot to limit the harvests to authorized ETDs only. To achieve this, we worked with BePress to add a new, hidden field at the bottom of our Digital Commons ETD metadata template. The field, called ProQuest, consisted of a dropdown menu with two alternatives: "ProQuest Yes" or "ProQuest No" (see figure 1). The field was mapped to a Qualified Dublin Core (QDC) element in the OAI feed that displays the value of "ProQuest: Yes" or "ProQuest: No," thus alerting the robot to the ETDs that were authorized for harvesting and the ones that were not (figure 2). For that reason, the robot needs to perform the harvests from the QDC OAI feed in order to see this field.

Figure 1. Display of the ProQuest Field's Dropdown Menu in the Metadata Template
Figure 2. Display of the ProQuest Field in the QDC OAI Feed

After the ETDs authorized for harvesting have been identified with help from the "ProQuest: Yes" field, the robot narrows down the ones that can be harvested at the present moment by checking the availability-date element, which provides the date when the full-text file of an ETD becomes available. It also displays in the QDC OAI feed (see figure 3). If the date is on or before the monthly harvest day, the ETD is currently available for harvesting. If the date is in the future, the robot identifies that ETD as embargoed and adds its title to a log of embargoed ETDs with some basic metadata (including the ETD's author and the last time it was checked). The log of embargoed ETDs is then pulled up in the future to identify the ETDs that come out of embargo so the robot can retrieve them.

Figure 3. Display of the Availability-Date Element in the QDC OAI Feed

After the ETDs that are currently available for harvesting have been identified (because they have the "ProQuest: Yes" field and a present or past availability date), the robot harvests their full-text PDF files by using the third identifier element, which displays at the bottom of records in the OAI feed (figure 4). For ETDs that are currently not embargoed, this element contains a URL with direct access to the complete PDF file. ETDs that are currently on embargo instead contain a URL that redirects the user to a webpage with the message: "The full-text of this ETD is currently under embargo. It will be available for download on [future date]" (see figure 5).

Figure 4. Display of the Third Identifier Element at the Bottom of Records in the QDC OAI Feed
Figure 5. Message that Displays in the URL of Embargoed ETDs

Once the metadata and full-text PDF files of authorized, non-embargoed ETDs have been obtained by the robot, they get queued for processing by the ProQuest editorial team, who then assigns them International Standard Book Numbers (ISBNs) and ProQuest's proprietary terms. It takes an average of four to nine weeks for the ETDs to display in the PQDT database after being harvested. Records in the PQDT come with the institutional repository's original cover page and a copyright statement that leaves copyright with the author. Afterwards, the process gets repeated once a month. This frequency can be set to quarterly or semi-annually if desired.
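The robot's selection logic described above can be summarized in a few lines of code. The following is a minimal sketch, not ProQuest's actual implementation; the dictionary keys and function name are hypothetical stand-ins for the QDC fields discussed above:

from datetime import date

def classify(record, harvest_day):
    # Skip ETDs whose students did not opt in to ProQuest.
    if record["proquest"] != "ProQuest: Yes":
        return "skip"
    # Availability date on or before the harvest day: pull the full-text PDF.
    if record["available"] <= harvest_day:
        return "harvest"
    # Future availability date: log the title for a later monthly pass.
    return "log as embargoed"

# Example: an opted-in ETD still under embargo on the October 7 harvest day.
etd = {"proquest": "ProQuest: Yes", "available": date(2020, 1, 15)}
print(classify(etd, harvest_day=date(2019, 10, 7)))  # -> log as embargoed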
ADDITIONAL POINTS ON THE HARVESTING METHOD

Handling of ETDs that come out of embargo. When the embargo period of an ETD expires, its full-text PDF becomes automatically available on the IR's webpage and, consequently, in the third identifier element of the OAI record. Each month, when the robot prepares to crawl the OAI feed, it first checks the titles in the log of embargoed ETDs to determine whether any of them have become fully available through that element. The ones that have become available are then pulled by the robot.

Handling of metadata edits performed after the ETDs have been harvested and published in PQDT. Edits performed to the metadata of ETDs trigger a change of date in the datestamp element of the OAI records. This change of date alerts the robot that a record has been updated; the record is then manually edited or re-harvested, depending on the type of update that took place.

Sending MARC records to OCLC. As part of the harvesting process, ProQuest provides free MARC records for the ETDs hosted in their PQDT database. These can be delivered to OCLC on behalf of the institution on an irregular basis. Records are machine-generated "K" level and come with URLs that link to the PQDT database and with ProQuest's proprietary subject terms. We requested to be excluded from these deliveries and continue our local practice of sending MARC records to OCLC with LCSH, authority file headings, and the IR's URLs.

Notifications of harvests performed by ProQuest and imports to the PQDT database. When harvests or imports to the PQDT have been performed by ProQuest, institutions do not get automatically notified. Still, they can request to receive scheduled monthly reports of the titles that have been added to the PQDT. UNF requested to receive these monthly reports.

Usage statistics of ETDs hosted in PQDT. Usage statistics for an institution's ETDs hosted in the PQDT can be retrieved from a tool called Dissertation Dashboard. This tool is available to the institution's ETD administrators and provides the number of times some aspect of an ETD (e.g., citations, abstract viewings, page previews, and downloads) has been accessed through the PQDT database.

Royalty payments to authors. Students who submit ETDs through this method are also eligible to receive royalties from ProQuest.

OBSTACLES FACED

During the planning phase, we encountered some obstacles that hindered progress on the implementation. These were:

• Amount of time it took to get the ball rolling. Initially, we were misled into assuming we would not be able to use the harvesting method to submit ETDs into ProQuest because we were BePress users, as we were originally told; that ended up not being the case. Ten months later, we were notified by the same source that the harvesting option for BePress sites would in fact be possible. Those ten months delayed the implementation process.
• Amount of time it took to get the paperwork finalized and signed before the harvesting. From the moment first contact was initiated with ProQuest to the moment the last agreement was finalized and signed by both parties, 21 months went by. There was a lot of back and forth in the negotiation process and paperwork between the university and ProQuest.
• Inconsistent lines of communication. There were multiple parties involved in the communication process, and some email threads began with one person only to be transferred later to someone else.
This lack of consistency in the communication lines made it difficult to determine who was in charge of particular tasks at certain stages of the process.

CONCLUSION AND RECOMMENDATIONS

Although problems were encountered at the beginning, implementation of the harvesting process at UNF was a complete success. Once the process started, it ran smoothly without complications. Harvests were performed on schedule, and no issues with unauthorized content being pulled from the OAI feed were encountered. The fields used to alert the robot in the OAI feed to the ETDs authorized for harvesting worked as planned, and so did the embargo log used to identify and pull the ETDs that came out of embargo.

It should be noted that Digital Commons users who want to exclude embargoed ETDs from displaying in the OAI feed can do so by setting up an optional yes/no button in their submission form. This button prevents the metadata of particular records from displaying in the OAI feed. We did not pursue this option because we have been using the ETD metadata that displays in the OAI feed to generate the MARC records we send to OCLC. In addition, we took the necessary precautions to avoid exposing the full content of the embargoed ETDs in the OAI feed. Institutions planning to use this method should be very careful with the content they display in the OAI feed so as to prevent embargoed ETDs from being mistakenly pulled by ProQuest. Access restrictions can be set by either suppressing the metadata of embargoed ETDs from displaying in the OAI feed or by suppressing the URLs with full access to the embargoed ETDs. The same precaution should be taken if planning to provide students with the choice to opt in or out of ProQuest.

Altogether, the harvesting option proved to be a reliable solution to submit ETDs into ProQuest without having to compromise on students' choice or rely on complicated workflows with metadata transformations between IR and ProQuest schemas. Institutions interested in adopting a simple, automated, post-IR method, while keeping the local workflow, should benefit from this method.

ENDNOTES

1 Dan Tam Do and Laura Gewissler, "Managing ETDs: The Good, the Bad, and the Ugly," in What's Past Is Prologue: Charleston Conference Proceedings, eds. Beth R. Bernhardt et al. (West Lafayette, IN: Purdue University Press, 2017), 200-04, https://doi.org/10.5703/1288284316661; Emily Symonds Stenberg, September 7, 2016, reply to Wendy Robertson, "Anything to watch out for with etd embargoes?," Digital Commons Google Users Group (blog), https://groups.google.com/forum/#!searchin/digitalcommons/embargo$20dates%7Csort:date/digitalcommons/RNInGtRarNY/6byzT9apAQAJ.

2 Gail P. Clement, "American ETD Dissemination in the Age of Open Access: ProQuest, NoQuest, or Allowing Student Choice," College & Research Libraries News 74, no. 11 (December 2013): 562-66, https://doi.org/10.5860/crln.74.11.9039; FUSE, 2012-2013, Graduate Students Re-FUSE!, https://oaktrust.library.tamu.edu/bitstream/handle/1969.1/152270/Graduate%20Students%20Re-FUSE.pdf?sequence=25&isAllowed=y.

3 "PQDT Submissions Options for Universities," ProQuest, http://contentz.mkt5049.com/lp/43888/382619/PQDTsubmissionsguide_0.pdf.

4 Meghan Banach Bergin and Charlotte Roh, "Systematically Populating an IR With ETDs: Launching a Retrospective Digitization Project and Collecting Current ETDs," in Making Institutional Repositories Work, eds. Burton B.
Callicott, David Scherer, and Andrew Wesolek (West Lafayette, IN: Purdue University Press, 2016), 127-37, https://docs.lib.purdue.edu/purduepress_ebooks/41/.

5 Cedar C. Middleton, Jason W. Dean, and Mary A. Gilbertson, "A Process for the Original Cataloging of Theses and Dissertations," Cataloging and Classification Quarterly 53, no. 2 (February 2015): 234-46, https://doi.org/10.1080/01639374.2014.971997.

6 Wendy Robertson and Rebecca Routh, "Light on ETD's: Out from the Shadows" (presentation, Annual Meeting for the ILA/ACRL Spring Conference, Cedar Rapids, IA, April 23, 2010), http://ir.uiowa.edu/lib_pubs/52/; Yuan Li, Sarah H. Theimer, and Suzanne M. Preate, "Campus Partnerships Advance both ETD Implementation and IR Development: A Win-win Strategy at Syracuse University," Library Management 35, no. 4/5 (2014): 398-404, https://doi.org/10.1108/LM-09-2013-0093.

7 Do and Gewissler, "Managing ETDs," 202; Banach Bergin and Roh, "Systematically Populating," 134; Donna O'Malley, June 27, 2017, reply to Andrew Wesolek, "ETD Embargoes through ProQuest," Digital Commons Google Users Group (blog), https://groups.google.com/forum/#!searchin/digitalcommons/embargo$20proquest%7Csort:date/digitalcommons/Gadwi8INfgA/sg7de7SdCAAJ.

8 Gail P. Clement and Fred Rascoe, "ETD Management & Publishing in the ProQuest System and the University Repository: A Comparative Analysis," Journal of Librarianship and Scholarly Communication 1, no. 4 (August 2013): 8, http://doi.org/10.7710/2162-3309.1074.

9 "U.S. Dissertations Publishing Services: 2017-2018 Fee Schedule," ProQuest.

10 "Support: ProQuest Export Documentation," Vireo Users Group, https://vireoetd.org/vireo/support/ProQuest-export-documentation/.

11 "PQDT Global Submission Options, Institutional Repository + Harvesting," ProQuest, https://media2.proquest.com/documents/dissertations-submissionsguide.pdf.

12 Marlene Coles, email message to author, January 19, 2018.

13 "ProQuest Dissertations & Theses Global Harvesting Process," ProQuest.
12207 ---- Intro to Coding Using Python at the Worcester Public Library

PUBLIC LIBRARIES LEADING THE WAY

Intro to Coding Using Python at the Worcester Public Library

Melody Friedenthal

INFORMATION TECHNOLOGY AND LIBRARIES | JUNE 2020
https://doi.org/10.6017/ital.v39i2.12207

Melody Friedenthal (mfriedenthal@mywpl.org) is a Public Services Librarian, Worcester Public Library.

ABSTRACT
The Worcester Public Library (WPL) offers several Digital Learning courses to our adult patrons, and among them is "Intro to Coding Using Python". This 6-session class teaches basic programming concepts and the vocabulary of software development. It prepares students to take more intensive, college-level classes. The Bureau of Labor Statistics predicts a bright future for software developers, web developers, and software engineers. WPL is committed to helping patrons increase their "hireability" and we believe our Python class will help patrons break into these lucrative and gratifying professions… or just have fun.

HISTORY AND DETAILS OF OUR CLASS

I came to librarianship from a long career in software development, so when I joined the Worcester Public Library in January 2018 as a Public Services Librarian, my manager proposed that I teach a class in programming. She asked me to research what language would be best. Python got high marks for ease of use, flexibility, growing popularity, and a very active online community.

Once I selected a language, I had to choose an environment to teach it in – or so I thought. I had absolutely no experience in front of a classroom, and few pedagogical skills, so I sought out an online Python course within which to teach. I decided to use the Code Academy (CA) website as our programming environment. CA has self-guided classes in a number of subjects, and the free Beginning Python course seemed to be just what we needed. I went through the whole class myself before using it as courseware. My intent was to help students register for CA, then, each day, teach them the concepts in that day's CA lesson. They would then be set to do the online lesson and assignments.

We first offered Python in June 2018. Problems with CA came up right from the start: students registered for the wrong class (despite the handout explicitly naming the correct class), and CA frequently tried to upsell us to a not-free Python class. Since CA's classes are MOOCs (Massive Open Online Courses), the developers built in an automated way of correcting student code: embedded behind each web page of the course, there's code that examines the student's code and decides whether it is acceptable or not. Good in theory, not so good in practice.
CA's "code-behind" is flawed and sometimes prevented students from advancing to the next lesson. Moreover, some of the CA tasks were inane. For example, one lesson incorporated a kind of Mad Libs game. This is where the instructions ask, for example, for 13 nouns and 11 adjectives, and these are combined with set sentences to generate a silly story. This assignment turned out to be too long and difficult to fulfill, preventing students from advancing. Although I used CA the first few times I offered the class, I subsequently abandoned it and wrote my own classroom material.

After determining that CA wasn't appropriate, I chose an online IDE where the students could code independently. This platform worked well when I tested it ahead of time, but when the whole class tried to log on at once, we received denial-of-service error messages. Hurriedly moving on to Plan C, I chose Thonny, a free Python IDE which we downloaded to each PC in the Lab (see https://thonny.org/).

Each student receives a free manual (see figure 1), which I wrote. Every time I've offered this class I've edited the manual, clarifying those topics the students had a hard time with. I've also added new material, including commands students have shown me. It is now 90 pages long, written in Microsoft Word, and printed in color. We use soft binders with metal fasteners.

Figure 1. Intro to Coding Using Python manual developed for the course.

The manual consists of the following sections:

• Cover: course name, dates we meet, time class starts and ends, location, instructor's name, manual version number, and a place for the student to write their own name.
• Syllabus: goals for each of the six sessions. This is aspirational.
• Basic information about programming, including an online alternative to Thonny, for students who don't have a computer at home and wish to use our public computers for homework.
• Lessons 1-17: "Hello World" and beyond.
• Lesson 18: Object Oriented Design, which I consider to be advanced, optional material. Skipped if time is pressing or the class isn't ready for it.
• Lesson 19: Wrap-up:
  o How to write good code.
  o How to debug.
  o List of suggested topics for further study.
  o Online resources for Python forums and community.
• List of WPL's print resources on Python and programming.
• Relevant comic strips and cartoons.

In March 2019, my manager asked me to start assigning homework. If a student attends all six sessions and makes a decent attempt at each assignment, at the sixth session they receive a Certificate of Completion. The certificate has the WPL name & logo, the student's name, and my signature. Typically three or four students earn a certificate.

Homework is emailed to me as an attachment. This class meets on Tuesday evenings and I tell students to send me their homework as soon as possible. Inevitably, several students don't email me until the following Monday. While I don't give out grades, I do spend considerable time reviewing homework, line by line, and I email back detailed feedback.

When the January 2020 course started, I found that between October's class and January, Outlook had implemented a security protocol which removes certain file extensions from incoming email.
And – you can see where this is going – the .py Python extension was one of them. I told students to rename their Python code files from xxxx.py to xxxx.py.doc, where "xxxx" is their program name. This fools Outlook into thinking the file is a Microsoft Word document, and the email is delivered to me intact. When it arrives, I remove the .doc extension from the attachment and save it to a student-specific file. Then I open the file in Thonny and review it.

Physically, our Computer Lab contains an instructor's computer and twelve student computers (see figure 2). It also has a projector which projects the active window from the instructor's computer onto a screen: usually the class manual. I use dry-erase markers in a variety of colors to illustrate the concepts on a whiteboard. There is also a supply of pencils on hand for student note-taking use.

The class is offered once per season. Although the classroom can accommodate twelve students, we set our maximum registration to fourteen, which allows us to maximize attendance even if patrons cancel or don't show up. And if all fourteen do attend the first class, we have two laptops I can bring into the Lab. We also maintain a small waitlist, usually of five spots. We've offered this class seven times, and the registration and waitlists have been full every time. Sometimes we have to turn students away.

Figure 2. Classroom at Worcester Public Library.

However, we had a problem with registered patrons not showing up, so last spring we implemented a process where, about a week before class starts, I email each student, asking them to confirm their continued interest in the class. I tell them that if they are no longer interested, or don't respond, I will give the seat we reserved for them to another interested patron (from the waitlist). In this email I also outline how the course is structured and that they can each earn a Certificate of Completion. I tell them class starts promptly at 5:30 and to please plan accordingly. Some students don't check their email. Some patrons show up without ever registering; they are told registration is required and to try again in a few months. I keep track of attendance on an Excel spreadsheet. Here in Worcester, MA, weather is definitely a factor for our winter sessions.

Over time I've made the class more dynamic. I have a student read a paragraph in the manual aloud. I've switched around the order of some lessons in response to student questions. I have them play a game to teach Boolean logic: "If you live in Worcester And you love pizza, stand up!"… then: "If you live in Worcester Or you love pizza, stand up!"
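In Python, the same game can be written in a few lines. This is a minimal sketch, not taken from the class manual, with hypothetical answers filled in for one player:

lives_in_worcester = True   # this player lives in Worcester
loves_pizza = False         # ...but does not love pizza

# "and" requires both conditions to be true; this player stays seated.
if lives_in_worcester and loves_pizza:
    print("Stand up!")

# "or" requires at least one condition to be true; this player stands.
if lives_in_worcester or loves_pizza:
    print("Stand up!")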
I'm thankful that the program is free.” • “… coding is fun and I learned a new skill.” • “This made me think critically and helped me understand where my errors in the programs were.” WPL is proud to offer classes that make a difference in our patrons’ lives. ABSTRACT History and Details of Our Class 12209 ---- Applying Gamification to the Library Orientation: A Study of Interactive User Experience and Engagement Preferences ARTICLES Applying Gamification to the Library Orientation A Study of Interactive User Experience and Engagement Preferences Karen Nourse Reed and A. Miller INFORMATION TECHNOLOGY AND LIBRARIES | SEPTEMBER 2020 https://doi.org/10.6017/ital.v39i3.12209 Karen Nourse Reed (karen.reed@mtsu.edu) is Associate Professor, Middle Tennessee State University. A. Miller (a.miller@mtsu.edu) is Associate Professor, Middle Tennessee State University. © 2020. ABSTRACT By providing an overview of library services as well as the building layout, the library orientation can help newcomers make optimal use of the library. The benefits of this outreach can be curtailed, however, by the significant staffing required to offer in-person tours. One academic library overcame this issue by turning to user experience research and gamification to provide an individualized online library orientation for four specific user groups: undergraduate students, graduate students, faculty, and community members. The library surveyed 167 users to investigate preferences regarding orientation format, as well as likelihood of future library use as a result of the gamified orientation format. Results demonstrated a preference for the gamified experience among undergraduate students as compared to other surveyed groups. INTRODUCTION Background Newcomers to the academic campus can be a bit overwhelmed by their unfamiliar environment: there are faces to learn, services and processes to navigate, and an unexplored landscape of academic buildings to traverse. Whether one is an incoming student or recently hired employee of the university, all need to become quickly oriented to their surroundings to ensure productivity. In the midst of this transition, the academic library may or may not be on the list of immediate inquiries; however, the library is an important place to start. Newcomers would be wise to familiarize themselves with the building and its services so that they can make optimal use of its offerings. Two studies found that students who used the library received better grades and had higher retention rates. 1 Another study regarding university employees revealed that untenured faculty made less use of the library than tenured faculty, a problem attributed to lack of familiarity with the library.2 Researchers have also found that faculty will often express interest in different library services without realizing that these services are in fact available.3 It is safe to say that libraries cannot always rely on newcomers to discover the physical and electronic services on their own; they need to be shown these items in order to mitigate the risk of unawareness. In consideration of these issues, the Walker Library at Middle Tennessee State University (MTSU) recognized that more could be done to welcome its new arrivals to campus. The public university enrolls approximately 21,000 students, the majority of whom are undergraduates. 
However, with a Carnegie classification of doctoral/professional and over one hundred graduate degree programs, there was a strong need for specialized research among the university's graduate students and faculty. Other groups needed to use the library too: non-faculty employees on campus as well as community users who frequently used Walker Library for its specialized and general collections. The authors realized that when new members of these different groups arrived on campus, few opportunities were available for acclimation to the library's services or building layout. Limited orientation experiences were conducted within library instruction classes, but these sessions primarily taught research skills and targeted freshman general-education classes as well as select upper-division and graduate classes. In short, it appeared that students, employees, and visitors to the university would largely have to discover the library's services on their own through a search on the library website or an exploration of the physical library. It was very likely that, in doing so, the newcomers might miss out on valuable services and information.

As MTSU librarians, the authors felt strongly that library orientations were important to everyone at the university so that they might make optimal use of the library's offerings. The authors based this opinion on their knowledge of relevant scholarly literature as well as their own anecdotal experiences with students and faculty.4 The authors defined the library orientation differently from library instruction: in their view, an orientation should acquaint users with the services and physical spaces of the library, as compared to instruction, which would teach users how to use the library's electronic resources such as databases. The desired new approach would structure orientations in response to the different needs of the library's users. For example, the authors found that undergraduates typically had distinct library interests compared to faculty. It was recognized that library orientations were time-consuming for everyone: library patrons at MTSU often did not want to take the time for a physical tour, nor did the library have the staffing to accommodate large-scale requests.

The authors turned to the gamification trend, and specifically interactive storytelling, as a solution. Interactive storytelling has previous applications in librarianship as a means of creating an immersive and self-guided user experience.5 However, no previous research appears to have been conducted to understand the different online, gamified orientation needs of various library groups. To overcome this gap, the authors developed an online, interactive, game-like experience via storytelling software to orient four different groups of users to the library's services. These groups were undergraduate students, graduate students, faculty members (which included both faculty and staff at the university), and community members (i.e., visitors to the university or alumni); see figure 1 for an illustration of each group's game avatars. These groups were invited to participate in the gamified experience called LibGO (short for library game orientation). After playing LibGO, participants gave feedback through an online survey.
This paper will give a brief explanation of the creation of the game, as well as describe the results of research conducted to understand the impact of the gamified experience across the four user groups.

Figure 1. LibGO players were allowed to self-select their user group upon entering the game. Each of the four user groups was assigned an avatar and followed a logic path specified for that group.

LITERATURE REVIEW

Traditional Orientation

Searches for literature on library orientation yield very broad and yet limited details about users of the traditional library orientation method. It is important to note that the terms "library tour" and "library orientation" can be somewhat vague, because this terminology is not interchangeable, yet is frequently treated as such in the literature.6 These terms are often included among library instruction materials, which predominately influence undergraduate students.7 Kylie Bailin, Benjamin Jahre, and Sarah Morris define orientation as "any attempt to reduce library anxiety by introducing students to what a college/university library is, what it contains, and where to find information while also showing how helpful librarians can be."8 Their book is a culmination of case studies of academic library orientation in various forms worldwide; the common theme across most chapters is the need to assess, revise, and change library orientation models as needed, especially in response to feedback, staff demands, and the evolving trend of libraries and technology.9 Furthermore, the majority of these studies are undergraduate-focused, and often freshman-focused, while only a few studies are geared towards graduate students. Other traditional orientation problems discussed in the literature include students lacking intrinsic motivation to attend library orientation, the library staff time required to execute the orientation, and lack of attendance.10 Additionally, among librarians there seems to be consensus that traditional library tours are the least effective means of orientation, yet they are the most highly used, with attention predominately focused on the undergraduate population alone.11

In 1997, Pixey Anne Mosely described the traditional guided library tour as ineffective, and documented the trend of libraries discontinuing it in favor of more active learning options.12 Her study surveyed 44 students who took a redesigned library tour, all of whom were undergraduates (with freshmen as the target population). Although Mosely's study only addressed one group of library users, it does attempt to answer a question on library perception: 93 percent of surveyed students indicated feeling more comfortable using the library after the more active learning approach.13 A comparison study by Marcus and Beck looked at traditional versus treasure-hunt orientations, and ultimately discovered that perception of the traditional method is limited by the selective user population and lack of effective measurements. They cited the need for continued study of alternative approaches to academic library orientation.14

A study by Kenneth Burhanna, Tammy Eschedor Voelker, and Julie Gedeon looked at the traditional library tour from the physical and virtual perspective.
Confronted with a lack of access to the physical library, these researchers at Kent State University decided to add an online option for the required traditional freshman library tour.15 Their study compared the efficacy of learning and affective outcomes between face-to-face library tours and online library tours. Of the 3,610 students who took the required library tour assignment, 3,567 chose the online tour method and 63 opted or were required to take the in-person, librarian-led tour. Surveys were later sent to a random list of 250 students who did not take the in-person tour and to the 63 students who did take the in-person tour. Of the 46 usable responses, all but one were undergraduates and 39 (85 percent) of them were freshmen.16 This is a small sample size with a ratio of slightly greater than 2:1 for online versus in-person tour participation. Although results showed that an instructor's recommendation on format selection was the strongest influencing factor, convenience was also significant for those who selected the online option (81.5 percent). In contrast, only 18.5 percent of the students who took the face-to-face tour rated it as convenient. The authors found that regardless of tour type, students were more comfortable using the library (85 percent) and more likely to use library resources (80 percent) after having taken a library tour. Interestingly, students who took the online tour seemed slightly more likely to visit the physical library than those who took the in-person tour. Ultimately the analysis of both tours showed that this method of library orientation encourages library resource use, and the "online tour seems to perform as well, if not slightly better than the in-person tour."17

Gamification Use in Libraries

An alternative format to the traditional method is gamification. Gamification has become a familiar trend within academic libraries in recent years and most often refers to the use of a technology-based game delivery within an instructional setting. Some users find gamified library instruction to be more enjoyable than traditional methods. For these people, gamification can potentially increase student engagement as well as retention of information.18 The goal of gamification is to create a simplified reality with a defined user experience. Kyle Felker and Eric Phetteplace emphasized the importance of user interaction over "specific mechanics or technologies" in thinking about the gamification design process.19 Proponents of gamifying library instructional content indicate that it connects to the broader mission of library discovery and exploration as exemplified through collaboration and the stimulation of learning.20 Additional benefits of gamification are its teaching, outreach, and engagement functions.21

Many researchers have documented specific applications of online gaming as a means of imparting library instruction. Mary J. Broussard and Jessica Urick Oberlin described the work of librarians at Lycoming College in developing an online game as one approach to teaching about plagiarism.22 Melissa Mallon offered summaries of nine games produced for higher education, several of which were specifically created for use by academic libraries.23 Many of the online library games reviewed used Flash or required players to download the game before playing.
By contrast, J. Long detailed an initiative at Miami University to integrate gamification into library instruction, a project which utilized Twine.24 Twine runs in the browser and therefore avoids the problem of requiring users to download additional software prior to playing the game.

Other libraries have used online gamification specifically as a tool for library orientations. Although researchers have demonstrated that the library orientation is an important practice in establishing positive first impressions of the library and counteracting library anxiety among new users, the differences between in-person and online delivery formats are unclear.25 Several successful instances have been documented in which the orientation was moved to an online game format. Nancy O'Hanlon, Karen Diaz, and Fred Roecker described a collaboration at Ohio State University Libraries between librarians and the Office of First Year Experience; for this project, they created a game to orient all new students to the library prior to arrival on campus.26 The game was called "Head Hunt" and was cited among the games listed in the article by Mallon.27 Anna-Lise Smith and Lesli Baker reported on the "Get a Clue" game at Utah Valley University, which oriented new students over two semesters.28 Another orientation game, developed at California State University-Fresno, was noteworthy for its placement in the university's learning management system (LMS).29

In reviewing the literature regarding online library gamification efforts, several best practices emerge. Several studies cite initial student assessment to understand student knowledge and/or perceptions of the content, followed by an iterative design process with a team of librarians and computer programmers.30 Felker and Phetteplace reinforced the need for this iterative process of prototyping, testing, deployment, and assessment as one key to success; however, they also stated that the most prevalent reason for failure is that the games are not fun for users.31 Librarians are information experts and are not necessarily trained in fun game design. Some libraries have solved this problem by partnering with or hiring professional designers; however, for many under-resourced libraries, this is not an option.32 Taking advantage of open-source tools, as well as the documented trial-and-error practices of others, can be helpful to newcomers who wish to break into new library engagement methods utilizing gamification.

As the literature has shown, a traditional library tour may have a place in the list of library services, but for whom and at what cost are questions with limited answers in studies done to date. Gamification has offered an alternative perspective, but with narrow accounts of its success in the online storytelling format and for users outside of the heavily studied freshman group. Across the literature of library orientation studies, there is little reference to other library user populations such as faculty, staff, community users, distance students, or students not formally part of a class that requires library orientation.

DEVELOPMENT OF THE LIBRARY GAME ORIENTATION (LIBGO)

LibGO was developed by the authors with not only a consideration for the Walker Library user experience, but also specific attention to the differing needs of the multiple user groups served by the library.
This user-focused concern led to exploring creative methodologies such as user experience research and human-centered design thinking, a process of overlapping phases that produces a creative and meaningful solution in a non-linear way. The three pillars of design thinking are inspiration, ideation, and iteration.33 Defining the problem and empathizing with the users (inspiration) led into the ideation phase, whereby the authors created low- and high-fidelity prototypes. The prototypes were tested and improved (iteration) through beta testing in which playtesters interacted with the gamified orientation. The authors were novice developers of the gamified orientation, and this entailed a learning curve for not only the design thinking mindset but also the technical achievability. The development started with design thinking conversations and quickly turned to low-fidelity prototypes designed on paper. The development soon advanced to actual coding so that the authors could get early designs tested before launching the final version. Prior to deployment on the library's website, LibGO underwent a series of playtesting sessions with library faculty, staff, and student employees. This testing was invaluable and led to improvements such as streamlined processes and less ambiguous text.

LibGO was developed with the Twine open-source software (https://twinery.org), a product which is primarily used for telling interactive, non-linear stories with HTML. Twine was an excellent application for this project as it allowed the creation of an online and interactive "choose your own adventure" styled library orientation game, in which users could explore the library based upon their selection of one of multiple available plot directions. With a modest learning curve and as open-source software, Twine is highly accessible for those who are not accustomed to coding. For those who know HTML, CSS, JavaScript, variables, and conditional logic, Twine's capabilities can be extended.

The library's interactive orientation adventure requires users to select one of the four available personas: undergraduate student, graduate student, faculty, or community member. Users subsequently follow that persona through a non-linear series of places, resources, and points of interest built with the HTML output of Twee (Twine's programming language); a minimal sketch of this structure appears after the figure captions below. See figure 2 for an example point of interest page and figure 3 for an example of a user's final score after completing the gamified experience. Once the Twine story went through several iterations of design and testing, the HTML file was placed on the library's website for the gamified orientation to be implemented with actual users.

Figure 2. This instructional page within LibGO explains how to reserve different library spaces online. Upon reading this content, the user will progress by clicking on one of the hypertext lines in blue font at the bottom.

Figure 3. Based upon the displayed avatar, this LibGO page is representative of a graduate student's completion of LibGO. The page indicates the player's final score and gives additional options to return to the home page or complete the survey.
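To make that structure concrete, the following is a minimal Twee sketch of the pattern described above, written for Twine's default Harlowe story format. It is illustrative only and is not LibGO's actual source (which the authors do not reproduce): the passage names, link text, and the $score variable are all hypothetical, and the graduate branch is elided.

```
:: Start
Welcome to the library orientation game! Choose your player:
(set: $score to 0)
[[Undergraduate student->UndergradMenu]]
[[Graduate student->GradMenu]]

:: UndergradMenu
<!-- Each persona gets its own menu passage, which is how one
     Twine file can branch into four separate logic paths. -->
Where would you like to go first?
[[Reserve a study room->StudyRooms]]

:: StudyRooms
(set: $score to $score + 1)
Study rooms can be reserved online from the library home page.
[[Back to your map->UndergradMenu]]
```

Each [[text->PassageName]] link renders as clickable hypertext like the blue links shown in figure 2, and a variable such as $score can accumulate points toward a final tally like the one shown in figure 3. Compiling the story produces a single HTML file that can be posted to a library website, just as the authors describe.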
Purpose of Study

LibGO utilized the common "choose your own adventure" format whereby players progress through a storyline based upon their selection of one of multiple available plot directions. Although the literature suggests that other technology-based methods are an engaging and instructive mode of content delivery, little prior research exists regarding this specific approach to library outreach. Furthermore, no previous research appears to have been conducted to understand the different online, gamified orientation needs of various library groups. The researchers wanted to understand the potential of interactive storytelling as a means to educate a range of users about library services as well as make the library more approachable from a user perspective. The study was designed to understand the user experience of each of the four groups. The researchers hoped to discern which users, if any, found the gamified experience to be a helpful method of orientation to the library's physical and electronic services. Another area of inquiry was to determine whether this might be an effective delivery method by which to target certain segments of the campus for outreach. Finally, the study intended to determine whether this method of orientation might incline participants toward future use of the library.

METHODOLOGY

Overview

The authors selected an embedded mixed methods design approach in which quantitative and qualitative data were collected concurrently through the same assessment instrument.34 The survey instrument primarily collected quantitative data; however, a qualitative open-response question was embedded at the end of the survey. This question gathered additional data by which to answer the research questions. Each data set (one quantitative and one qualitative) was analyzed separately for each participant group, and then the groups were compared to develop a richer understanding of participant behavior.

Research Questions

The data collection and subsequent analysis attempted to answer the following questions:

1. Which group(s) of library users prefer to be oriented to library services and resources through the interactive storytelling format, as compared to other formats?
2. Which group(s) of library users are more likely to use library services and resources after participating in the interactive storytelling format of orientation?
3. What are user impressions of LibGO, and are there any differences in impression based on the characteristics of the unique user group?

Participants

Participants for the study were recruited in person and via the library website. In-person recruitment entailed the distribution of flyers and use of signage to recruit participants to play LibGO in a library computer lab during a one-day event. Online recruitment lasted approximately ten weeks and simply involved the placement of a link to LibGO on the home page of the library's website. A total of 167 responses were gathered through both methods, and participants were distributed as shown in table 1.
Table 1. Composition of Study's Participants

Group number   Affiliation              Number of responses
1              Undergraduate students   55
2              Graduate students        62
3              Faculty                  13
4              Staff                    28
5              Community members        9
TOTAL                                   167

For the purposes of statistical data analysis, groups 3 and 4 were combined to produce a single group of 41 university employee respondents; also, group 5's data was not included in the statistical analysis due to the low number of participants. Qualitative data for all groups, however, was included in the non-statistical analysis.

Survey Instrument

A survey with twelve total questions was developed for this study and was administered online through Qualtrics. After playing LibGO, participants were asked to voluntarily complete the survey; if they agreed, they were redirected to the survey's website. Before answering any survey questions, participants were presented with an informed consent statement. All aspects of the research, including the survey instrument, were approved through the university's Institutional Review Board (protocol number 18-1293).

The first part of the survey (see appendix A) consisted of ten questions, each with a ten-point Likert-scaled response. The first five questions were each designed to measure a Preference construct, and the next five questions each measured a Likelihood construct. The Preference construct referred to the participant's preference for a library orientation: did they prefer LibGO's online interactive storytelling format, or did they prefer another format such as in-person talks? The Likelihood construct referred to the participant's self-perceived likelihood of more readily engaging with the library in the future (both in person and online) after playing LibGO. The second part of the survey gathered the participant's self-reported affiliation (see table 1 for the list of possible group affiliations) and offered participants an open-ended response area for optional qualitative feedback.

Data Collection

The study's data was collected in two stages. In stage one, LibGO was unveiled to library visitors during a special campus-wide week of student programming events. On the library's designated event day, the researchers held a drop-in event at one of the library's computer labs (see figure 4 for an example of event advertisement). Library visitors were offered a prize bag and snacks if they agreed to play LibGO and complete the survey. During the three-hour-long drop-in session, 58 individual responses were collected: the vast majority of these came from undergraduate students (51 responses), with additional responses from graduate students (n = 4), university staff employees (n = 2), and one community member. Community members were defined as anyone not currently directly affiliated with the university; this group may have included prospective students or alumni. Stage two began the day after the library drop-in event and simply involved the placement of a link to LibGO on the home page of the library's website. Any visitor to the library's website could click on the advertisement to be taken to LibGO. This link remained active on the library website for ten weeks, at which point the final data was gathered. A total of 167 responses were gathered during both stages, and participants were distributed as previously shown in table 1.
Figure 4. Example of student LibGO event advertisement.

RESULTS

Quantitative Findings

Statistical analysis of each of the ten quantitative questions used a one-way ANOVA in SPSS. A post hoc test (Hochberg's GT2) was run in each instance to account for the different sample sizes. For all statistical analysis, only the data from undergraduates, graduate students, and university employees (a group which combined both faculty and staff results) were used. A listing of mean comparisons by group, for each of the ten survey questions, may be found in table 2. The one-way ANOVAs yielded statistically significant results for three of the ten individual questions in the first part of the survey: questions 2, 3, and 6 (see table 3).

Table 2. Descriptive Statistics for Survey Results (10-point scale, with 10 as most likely). Means are listed in the order undergraduate students / graduate students / university employees.

1. In considering the different ways to learn about Walker Library, do you find this library orientation game to be more or less preferable as compared to other orientation options (such as in-person tours, speaking with a librarian, or clicking through the library website on your own)? 7.02 / 6.39 / 6.02
2. In your opinion, was the library orientation game a useful way to get introduced to the library's services and resources? 8.13 / 6.94 / 7.12
3. If your friend needed a library orientation, how likely would you be to recommend the game over other orientation options (such as in-person tours, speaking with a librarian, or clicking through the library website on your own)? 7.38 / 5.94 / 5.98
4. Please indicate your level of agreement with the following statement: "As compared to playing the game, I would have preferred to learn about the library's resources and services by my own exploration of the library website." 6.11 / 6.50 / 5.88
5. Please indicate your level of agreement with the following statement: "As compared to playing the game, I would have preferred to learn about the library's resources and services through an in-person orientation tour." 6.11 / 5.08 / 5.76
6. After playing this orientation game, are you more or less likely to visit Walker Library in person? 8.27 / 6.94 / 6.90
7. After playing this library orientation game, are you more or less likely to use the Walker Library website to find out about the library (such as hours of operation, where to go to get different materials/services, etc.)? 7.82 / 6.97 / 7.20
8. After playing this library orientation game, are you more or less likely to seek help from a librarian at Walker Library? 6.95 / 6.58 / 6.63
9. After playing this library orientation game, are you more or less likely to use the library's online resources (such as databases, journals, e-books)? 7.67 / 7.15 / 6.90
10. After playing this library orientation game, are you more or less likely to attend a library workshop, training, or event? 6.96 / 6.73 / 6.24

Table 3. Overall Statistically Significant Group Differences

              df    F       p      ω²
Question 2    2     3.714   .027   .03
Question 3    2     4.508   .012   .04
Question 6    2     7.178   .001   .07
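As a reminder of what the F values in table 3 represent (this is the standard textbook formulation of the one-way ANOVA, not notation taken from the article): the test partitions the variance in a question's ratings into between-group and within-group components and compares their mean squares,

\[
F = \frac{MS_{\text{between}}}{MS_{\text{within}}} = \frac{SS_{\text{between}}/(k-1)}{SS_{\text{within}}/(N-k)}
\]

With k = 3 groups (undergraduates, graduate students, and university employees) and N = 55 + 62 + 41 = 158 participants, the degrees of freedom are k − 1 = 2 and N − k = 155, which matches the F(2, 155) values reported below.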
Question 2 asked, "In your opinion, was the library orientation game a useful way to get introduced to the library's services and resources?" The one-way ANOVA found a statistically significant difference between groups (F(2, 155) = 3.714, p = .027, ω² = .03). The post hoc comparison using Hochberg's GT2 test revealed that undergraduates rated LibGO statistically significantly more useful in this respect (M = 8.13, SD = 1.94, p = .031) than graduate students did (M = 6.94, SD = 2.72). There was no statistically significant difference between undergraduates and university employees (p = .145). According to criteria suggested by Roger Kirk, the effect size of .03 indicates a small effect in perceived usefulness of LibGO as an introduction among undergraduates.35

Question 3 asked, "If your friend needed a library orientation, how likely would you be to recommend the game over other orientation options (such as in-person tours, speaking with a librarian, or clicking through the library website on your own)?" The one-way ANOVA found a statistically significant difference between groups (F(2, 155) = 4.508, p = .012, ω² = .04). The post hoc comparison using Hochberg's GT2 test found that undergraduates were statistically significantly more likely to prefer LibGO over other orientation options (M = 7.38, SD = 2.49, p = .021) as compared to graduate students (M = 5.94, SD = 3.06). There was no statistically significant difference between undergraduates and university employees (p = .053). The effect size of .04 indicates a small effect regarding undergraduate preference for LibGO versus other orientation options.

Question 6 asked, "After playing this library orientation game, are you more or less likely to visit Walker Library in person?" The one-way ANOVA found a statistically significant difference between groups (F(2, 155) = 7.178, p = .001, ω² = .07). The post hoc comparison using Hochberg's GT2 test revealed that undergraduates were statistically significantly more likely to visit the library after playing LibGO (M = 8.27, SD = 2.09, p = .003) as compared to graduate students (M = 6.94, SD = 2.20). Additionally, the test found that undergraduates were statistically significantly more likely to visit the library after playing LibGO (p = .007) as compared to university employees (M = 6.90, SD = 2.08). According to criteria suggested by Kirk, the effect size of .07 indicates a medium effect regarding undergraduate potential to visit the library in person after playing LibGO.36

In addition to testing each individual survey question, tests were run to understand possible group differences by construct (Preference and Likelihood). The Preference construct was an aggregate of survey questions 1-5, and the Likelihood construct was an aggregate of survey questions 6-10. For both constructs, the one-way ANOVA results were not statistically significant. In all, the quantitative findings indicated three areas in which the experience of playing LibGO was more helpful for the surveyed undergraduates than for the other surveyed groups (i.e., graduate students or university employees). At this point, the analysis turned to the qualitative data so as to better understand participant views of LibGO.
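For context on the Kirk criteria cited above: ω² (omega squared) is an effect-size estimate of the proportion of population variance explained by group membership, and a common textbook formula (again a general formulation, not the authors' own notation) is

\[
\omega^2 = \frac{SS_{\text{between}} - (k-1)\,MS_{\text{within}}}{SS_{\text{total}} + MS_{\text{within}}}
\]

Unlike a p value, ω² does not shrink merely because the sample is large, which is why the article reports both: all three questions reached significance, but only question 6 (ω² = .07) crosses the threshold of .059 that Kirk's benchmarks are commonly cited as setting for a medium effect.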
Qualitative Findings

Analysis of the qualitative results was limited to the data collected in the survey's final question. Question 12 was an open-response area and was intentionally prefaced with an open-ended prompt: "Do you have any final thoughts for the library (suggestions, additions, modification, comments, criticisms, praise, etc.)?" Of the 167 total survey responses, 67 individuals chose to answer this question. Preliminary analysis showed that the feedback derived from this question covered a spectrum of topics, ranging from remarks on the LibGO experience itself to broader concerns regarding other library services.

Open coding strategies were used to interpret the content of participant responses. Under this methodology, the responses were evaluated for general themes and then coded and grouped under a constant comparative approach.37 NVivo 12 software was used to code all 67 participant responses. Initial coding yielded eight open codes, but these were later consolidated into six final codes (see table 4). One code (LibGO Improvement Tip) was particularly nuanced and yielded five axial codes (see table 5). Axial codes denoted secondary concerns which fell under a larger category of interest. Although some participants gave longer feedback that addressed multiple concerns, care was taken to assign each distinct concern to a specific code. It is therefore important to note that because some comments addressed multiple concerns, the total number of concerns (n = 76) is greater than the total number of individuals responding to the prompt (n = 67).

Table 4. Distribution of Qualitative Codes by User Group

Code                        Undergraduate  Graduate  Faculty  Staff  Community member  Total # concerns
Positive feedback           7              7         1        4      2                 21
Negative feedback           1              2         0        3      0                 6
In-person tour preference   2              3         0        1      0                 6
LibGO improvement tip       5              11        1        3      3                 23
Library services feedback   2              4         3        0      0                 9
Library building feedback   1              7         1        2      0                 11
Total                       18             34        6        13     5                 76

Discussion of Qualitative Themes

Positive Feedback (21 separate concerns). Affirmative comments regarding LibGO were primarily split between undergraduate and graduate students, with a small number of comments coming from the other groups. Although all groups stated that the game was helpful, one undergraduate wrote, "I wish I would've received this orientation at the very beginning of the year!" A graduate student declared, "This was a creative way to engage students, and I think it should be included on the website for fun." Both community members commented on the utility of LibGO in providing an orientation without having to physically come to the library; for example, "Interactive without having to actually attend the library in person which I liked." Additionally, a community member pointed out the instructional capability of LibGO, writing, "I think I learned more from the game than walking around in the library."

Negative Feedback (6 separate concerns). Unfavorable comments regarding LibGO primarily challenged the orientation's characterization as a "game," citing its lack of fun. One graduate student wrote a comment representative of this concern: "The game didn't really seem like a game at all." A particularly searing comment came from a university staff member who wrote, "Calling this collection of web pages an 'interactive game' is a stretch, which is a generous way of stating it."
In-person Tour Preference (6 separate concerns). A small number of concerns indicated a preference for in-person orientations over online ones. One undergraduate cited the ability to ask questions during an in-person tour as an advantage of that delivery medium. A graduate student mentioned a desire for kinesthetic learning over an online approach, writing, "I prefer hands-on exploration of the library."

LibGO Improvement Tip (23 separate concerns). Suggested improvements to LibGO were the largest area of qualitative feedback and produced five axial themes (subthemes); see table 5 for a breakdown of the five axial themes by group.

1. Design issues were the largest cited area of improvement, and the most commonly mentioned design problem was the inability of the user to go back to previously seen content. Although this functionality did in fact exist, it was apparently not intuitive to users; design modifications in future iterations are therefore critical. Other users made suggestions about the color scheme and the ability to magnify image sizes.

2. User experience was another area of feedback and primarily included suggestions on how to make LibGO a more fun experience. One graduate student offered a role-playing game alternative. Another graduate student expressed interest in a game with side missions, in addition to the overall goals, where tokens could be earned for completed missions; the student justified these changes by stating, "I feel that incorporating these types of idea will make the game more enjoyable." In suggesting similar improvements, one undergraduate stated that LibGO "felt more like a quiz than a game."

3. Technology issues primarily addressed two related problems: images not loading and broken links. Images not loading could depend on many factors, including the user's browser settings, internet traffic (volume) delaying load time, or broken image links, among others. Broken links could be the root issue, since the images used in LibGO were taken from other areas of the library website. This method of gathering content exposed a design vulnerability of relying on existing image locations (controlled by non-LibGO developers) rather than images hosted exclusively for LibGO; a minimal illustration follows table 5.

4. Content issues were raised exclusively by graduate students. One student felt that LibGO emphasized physical spaces in the library and did not give a deep enough treatment to library services. Another graduate student asked for "an interactive map to click on so that we physically see the areas" of the library, thus making the interaction more user-friendly with a visual.

5. Didn't understand purpose is a subtheme indicating where improvement is needed, based on comments made by the two university staff members. One wrote that "An online tour would have been better and just as informative," although LibGO was designed to be not only an online tour of the library but also an orientation to the library's services. The other staff member wrote, "I read the rules but it was still unclear what the objective was." In all, it is clear that LibGO's purpose was confusing for some.

Table 5. LibGO Improvement Tip Axial Codes by User Group

Axial code                  Undergraduate  Graduate  Faculty  Staff  Community member  Total # concerns
Design                      4              3         0        0      1                 8
User experience             1              2         1        0      1                 5
Tech issue                  0              1         0        1      0                 2
Content                     0              5         0        0      1                 6
Didn't understand purpose   0              0         0        2      0                 2
Total                       5              11        1        3      3                 23
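To illustrate the image-hosting vulnerability described in the technology subtheme: a Twine passage that hotlinks an image from elsewhere on a library website breaks whenever those pages are reorganized by other site editors, while an image bundled with the game does not. The passage name and both paths below are hypothetical and are not taken from LibGO.

```
:: FloorMap
<!-- Fragile: the image lives at a URL maintained by other site
     editors; if they reorganize their pages, this link breaks. -->
<img src="https://library.example.edu/about/photos/floor2.jpg" alt="Second-floor map">

<!-- Sturdier: a copy of the image stored alongside the game's
     own published HTML file. -->
<img src="images/floor2.jpg" alt="Second-floor map">
```

Bundling copies of images with the published game is one low-cost mitigation; another would be periodic automated link checking of the deployed HTML file.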
Library Services Feedback (9 separate concerns). Several participants took the opportunity to provide feedback on general library services rather than on LibGO itself. Undergraduates simply gave general positive feedback about the value of the library, but many graduate students gave recommendations regarding specific electronic resource improvements. Additionally, one graduate student wrote, "I think it is critical to meet with new graduate students before they start their program," something the library used to do but had not pursued in recent years. Although these comments did not directly pertain to LibGO, the authors accepted all of them as valuable feedback to the library.

Library Building Feedback (11 separate concerns). This was another theme in which graduate students dominated the comments. Feedback included requests for microwave access, additional study tables, and better temperature control in the building. Several participants asked for greater enforcement of quiet zones. As with the library services feedback, the authors again took these comments as helpful to the overall library rather than to LibGO.

DISCUSSION

The results of this study indicated that some groups of library visitors received the gamified library orientation experience better than other groups. Undergraduate students indicated the greatest appreciation for a library orientation via LibGO. Specifically, they demonstrated a statistically significant difference over the other groups in supporting LibGO's usefulness as an orientation tool, a preference for LibGO over other orientation formats, and a likelihood of future use of the physical library after playing LibGO. These encouraging results provide evidence for the efficacy of alternative means of library orientation.

The qualitative results provided additional helpful insight regarding the user impressions from each of the five surveyed groups. This feedback demonstrated that a variety of groups benefited from the experience of playing LibGO, including some community members who appreciated LibGO as a means of becoming acclimated to the library without having to enter the building. A virtual orientation format was not ideal for a few players who indicated a preference for a face-to-face orientation due to the ability to ask questions. Many people identified areas of improvement for LibGO. Graduate students in particular offered a disproportionate number of suggestions as compared to the other groups. While they provided a great deal of helpful feedback, it is possible that graduate students were so distracted by the perceived problems that they could not fully take in the experience or gain value from LibGO's orientation purpose. It is also very likely that LibGO simply was not very fun for these players: several players noted that it did not feel like a game but rather a collection of content. The review of literature indicated that this amusement issue is a common pitfall of educational games. Although the authors tried to design an enjoyable orientation experience, it is possible that more work is needed to satisfy user expectations.

The mixed-methods design of this study was instrumental in providing a richer understanding of user perceptions. While the statistical analysis of participant survey responses was very helpful in identifying clear trends between groups, the qualitative analysis helped the authors draw valuable conclusions.
Specifically, the open-response data demonstrated that additional groups such as graduate students and community members appreciated the experience of playing LibGO; this information was not readily apparent through the statistical analysis. Additionally, the qualitative analysis demonstrated that many groups had concerns regarding areas of improvement that may have impaired their user experience. These important findings could help guide future directions of the research. In all, the authors concluded this phase of the research satisfied that LibGO showed great promise for library orientation delivery but could benefit from continued development and future user assessment. Although undergraduate students seemed most receptive overall to a virtual orientation experience, other groups appeared to have benefited from the resource.

STUDY LIMITATIONS

A primary limitation of this study was its small sample size. Although the entire university campus was targeted for participation in the study, the number of respondents was far too small to generalize the results. Despite this limitation, however, the study's population reflected many different groups of library patrons on campus. The findings are therefore valuable as a means of stimulating future discussion regarding the value of alternative library orientation methods utilizing gamification.

Another limitation is that the authors did not pre-assess the targeted groups for their prior knowledge of Walker Library services and building layout, nor for their interest in learning about these topics. It is possible that various groups did not see the value in learning about the library for a variety of reasons. Faculty members, in particular, may have considered their prior knowledge adequate for navigating the electronic holdings or building layout without recognizing the value of the many other services offered physically and electronically by the library. All groups may have experienced a level of "library anxiety" that prevented them from being motivated to learn more about the library.38 It is difficult to understand the range of covariate factors without a pre-assessment.

Finally, there was qualitative evidence supporting the limitation that LibGO did not properly convey its stated purpose of orientation rather than imparting research skills. Without understanding LibGO's focus on library orientation, users could have been confused or disappointed by the experience. Although care was taken to make this purpose explicit, some users indicated their confusion in the qualitative data. This observed problem points to a design flaw that undoubtedly had some bearing on the study's results.

CONCLUSION & FUTURE RESEARCH

Convinced of the importance of the library orientation, the authors sought to move this traditional in-person experience to a virtual one. The quantitative results indicated that the gamified orientation experience was useful to undergraduate students in its intended purpose of acclimating users to the library, as well as in encouraging their future use of the physical library.
At a time when physical traffic to the library has shown a marked decline, new outreach strategies should be considered.39 The results were also helpful in showing that this particular iteration of the gamified orientation was preferred over other delivery methods by undergraduate students, as compared to other groups, to a statistically significant level. This is an important finding, as it demonstrates that a diversified outreach strategy is necessary: different groups of library patrons want their orientation information in different formats.

The next logical question to ask, however, is: Why did the other groups examined through the statistical data analysis (graduate students and faculty) not appreciate the gamified orientation to the same level as undergraduates? The answers to this question are complicated and may be explained in part by the qualitative analysis. Based upon those findings, it is possible that the game did not appeal to these groups on the basis of fun or enjoyment; this concern was specifically mentioned by graduate students. Faculty members, including staff, provided a smaller amount of qualitative feedback; it is therefore difficult to speculate as to their exact reasons for disengagement with LibGO.

With this concern in mind, the authors would like to concentrate their next iteration of research on the specific library orientation needs of graduate students and faculty. Both groups present different, but critical, needs for outreach. Graduate students were the largest group of survey respondents, presumably indicating a high level of interest in learning more about the library. Many graduate programs at MTSU are delivered partially or entirely online; as a result, these students may be less likely to come to campus. Due to graduate students' relatively infrequent visits to campus, a virtual library orientation could be even more meaningful for them in meeting their need for library services information. Faculty are another important group to target because if they lack a full understanding of the library's offerings, they are unlikely to design assignments that fully utilize the library's services. Although it is possible that faculty prefer an in-person orientation, many new faculty have indicated limited availability for such events. A virtual orientation seems conducive to busy schedules. However, it is possible that the issue is simply a matter of marketing: faculty may not know that a virtual option is available, nor do they necessarily understand all that the library has to offer. In all, future research should begin with a survey to understand what both groups already know about the library, as well as the library services they desire.

Another necessary step in future research would be the expansion of the development team to include computer programmers. Although the authors feel that LibGO holds great promise as a virtual orientation tool, more needs to be done to enhance the user's enjoyment of the experience. Twine is user-friendly software that other librarians could pick up without having to be computer programmers; however, programmers (professional or student) could bring design expertise to the project. Future iterations of this project should incorporate the skills of multiple groups, including expertise in libraries, user research, visual design, interaction design, programming, and marketing, as well as testers from each type of intended audience.
Collectively, such a group would have the greatest impact on improving the user experience and ultimately the usefulness of a gamified orientation experience.

This project in gamification, and specifically interactive storytelling, was a valuable experience for Walker Library. These results should encourage other libraries seeking an alternate delivery method for orientations. The authors hope to build upon the lessons learned from this mixed methods research study of LibGO to find the correct outreach medium for their range of library users.

ACKNOWLEDGMENTS

Special thanks to our beta playtesters and student assistants who worked the LibGO event, which was funded, in part, by MT Engage and Walker Library at Middle Tennessee State University.

APPENDIX A: SURVEY INSTRUMENT

[The survey instrument is reproduced as page images in the published article.]

ENDNOTES

1 Sandra Calemme McCarthy, "At Issue: Exploring Library Usage by Online Learners with Student Success," Community College Enterprise 23, no. 2 (January 2017): 27–31; Angie Thorpe et al., "The Impact of the Academic Library on Student Success: Connecting the Dots," Portal: Libraries and the Academy 16, no. 2 (2016): 373–92, https://doi.org/10.1353/pla.2016.0027.

2 Steven Ovadia, "How Does Tenure Status Impact Library Usage: A Study of LaGuardia Community College," Journal of Academic Librarianship 35, no. 4 (January 2009): 332–40, https://doi.org/10.1016/j.acalib.2009.04.022.

3 Chris Leeder and Steven Lonn, "Faculty Usage of Library Tools in a Learning Management System," College & Research Libraries 75, no. 5 (September 2014): 641–63, https://doi.org/10.5860/crl.75.5.641.

4 Kyle Felker and Eric Phetteplace, "Gamification in Libraries: The State of the Art," Reference and User Services Quarterly 54, no. 2 (2014): 19–23, https://doi.org/10.5860/rusq.54n2.19; Nancy O'Hanlon, Karen Diaz, and Fred Roecker, "A Game-Based Multimedia Approach to Library Orientation" (paper, 35th National LOEX Library Instruction Conference, San Diego, May 2007), https://commons.emich.edu/loexconf2007/19/; Leila June Rod-Welch, "Let's Get Oriented: Getting Intimate with the Library, Small Group Sessions for Library Orientation" (paper, Association of College and Research Libraries Conference, Baltimore, March 2017), http://www.ala.org/acrl/sites/ala.org.acrl/files/content/conferences/confsandpreconfs/2017/LetsGetOriented.pdf.

5 Kelly Czarnecki, "Chapter 4: Digital Storytelling in Different Library Settings," Library Technology Reports, no. 7 (2009): 20–30; Rebecca J. Morris, "Creating, Viewing, and Assessing: Fluid Roles of the Student Self in Digital Storytelling," School Libraries Worldwide, no. 2 (2013): 54–68.
6 Sandra Marcus and Sheila Beck, "A Library Adventure: Comparing a Treasure Hunt with a Traditional Freshman Orientation Tour," College & Research Libraries 64, no. 1 (January 2003): 23–44, https://doi.org/10.5860/crl.64.1.23.

7 Lori Oling and Michelle Mach, "Tour Trends in Academic ARL Libraries," College & Research Libraries 63, no. 1 (January 2002): 13–23, https://doi.org/10.5860/crl.63.1.13.

8 Kylie Bailin, Benjamin Jahre, and Sarah Morris, Planning Academic Library Orientations: Case Studies from Around the World (Oxford, UK: Chandos Publishing, 2018): xvi.

9 Bailin, Jahre, and Morris, Planning Academic Library Orientations.

10 Marcus and Beck, "A Library Adventure"; A. Carolyn Miller, "The Round Robin Library Tour," Journal of Academic Librarianship 6, no. 4 (1980): 215–18; Michael Simmons, "Evaluation of Library Tours," EDRS, ED 331513 (1990): 1–24.

11 Marcus and Beck, "A Library Adventure"; Oling and Mach, "Tour Trends"; Rod-Welch, "Let's Get Oriented."

12 Pixey Anne Mosley, "Assessing the Comfort Level Impact and Perceptual Value of Library Tours," Research Strategies 15, no. 4 (1997): 261–70, https://doi.org/10.1016/S0734-3310(97)90013-6.

13 Mosley, "Assessing the Comfort Level Impact and Perceptual Value of Library Tours."

14 Marcus and Beck, "A Library Adventure," 27.

15 Kenneth J. Burhanna, Tammy J. Eschedor Voelker, and Julie A. Gedeon, "Virtually the Same: Comparing the Effectiveness of Online Versus In-Person Library Tours," Public Services Quarterly 4, no. 4 (2008): 317–38, https://doi.org/10.1080/15228950802461616.

16 Burhanna, Voelker, and Gedeon, "Virtually the Same," 326.

17 Burhanna, Voelker, and Gedeon, "Virtually the Same," 329.

18 Felker and Phetteplace, "Gamification in Libraries."

19 Felker and Phetteplace, "Gamification in Libraries," 20.

20 Felker and Phetteplace, "Gamification in Libraries."

21 Felker and Phetteplace, "Gamification in Libraries"; O'Hanlon et al., "A Game-Based Multimedia Approach."

22 Mary J. Broussard and Jessica Urick Oberlin, "Using Online Games to Fight Plagiarism: A Spoonful of Sugar Helps the Medicine Go Down," Indiana Libraries 30, no. 1 (January 2011): 28–39.

23 Melissa Mallon, "Gaming and Gamification," Public Services Quarterly 9, no. 3 (2013): 210–21, https://doi.org/10.1080/15228959.2013.815502.

24 J. Long, "Chapter 21: Gaming Library Instruction: Using Interactive Play to Promote Research as a Process," Distributed Learning (January 1, 2017): 385–401, https://doi.org/10.1016/B978-0-08-100598-9.00021-0.

25 Rod-Welch, "Let's Get Oriented."

26 O'Hanlon et al., "A Game-Based Multimedia Approach."

27 Mallon, "Gaming and Gamification."

28 Anna-Lise Smith and Lesli Baker, "Getting a Clue: Creating Student Detectives and Dragon Slayers in Your Library," Reference Services Review 39, no. 4 (November 2011): 628–42, https://doi.org/10.1108/00907321111186659.
29 Monica Fusich et al., "HML-IQ: Fresno State's Online Library Orientation Game," College & Research Libraries News 72, no. 11 (December 2011): 626–30, https://doi.org/10.5860/crln.72.11.8667.

30 Broussard and Oberlin, "Using Online Games"; Fusich et al., "HML-IQ"; O'Hanlon et al., "A Game-Based Multimedia Approach."

31 Felker and Phetteplace, "Gamification in Libraries."

32 Felker and Phetteplace, "Gamification in Libraries"; Fusich et al., "HML-IQ."

33 "Design Thinking for Libraries: A Toolkit for Patron-Centered Design," IDEO (2015), http://designthinkingforlibraries.com.

34 John W. Creswell and Vicki L. Plano Clark, Designing and Conducting Mixed Methods Research (Thousand Oaks, CA: Sage Publications, 2007).

35 Roger Kirk, "Practical Significance: A Concept Whose Time Has Come," Educational and Psychological Measurement 56, no. 5 (1996): 746–59.

36 Kirk, "Practical Significance."

37 Sandra Mathison, Encyclopedia of Evaluation (Thousand Oaks, CA: SAGE, 2005), https://doi.org/10.4135/9781412950558.

38 Rod-Welch, "Let's Get Oriented."

39 Felker and Phetteplace, "Gamification in Libraries."

12211 ---- Likes, Comments, Views: A Content Analysis of Academic Library Instagram Posts

ARTICLES

Likes, Comments, Views: A Content Analysis of Academic Library Instagram Posts

Jylisa Doney, Olivia Wikle, and Jessica Martinez

INFORMATION TECHNOLOGY AND LIBRARIES | SEPTEMBER 2020
https://doi.org/10.6017/ital.v39i3.12211

Jylisa Doney (jylisadoney@uidaho.edu) is Social Sciences Librarian, University of Idaho. Olivia Wikle (omwikle@uidaho.edu) is Digital Initiatives Librarian, University of Idaho. Jessica Martinez (jessicamartinez@uidaho.edu) is Science Librarian, University of Idaho. © 2020.

ABSTRACT

This article presents a content analysis of academic library Instagram accounts at eleven land-grant universities. Previous research has examined personal, corporate, and university use of Instagram, but fewer studies have used this methodology to examine how academic libraries share content on this platform and the engagement generated by different categories of posts. Findings indicate that showcasing posts (highlighting library or campus resources) accounted for more than 50 percent of posts shared, while a much smaller percentage of posts reflected humanizing content (emphasizing warmth or humor) or crowdsourcing content (encouraging user feedback).
Crowdsourcing posts generated the most likes on average, followed closely by orienting posts (situating the library within the campus community), while a larger proportion of crowdsourcing posts, compared to other post categories, included comments. The results of this study indicate that libraries should seek to create Instagram posts that include various types of content while also ensuring that the content shared reflects their unique campus contexts. By sharing a framework for analyzing library Instagram content, this article will provide libraries with the tools they need to more effectively identify the types of content their users respond to and enjoy as well as make their social media marketing on Instagram more impactful.

INTRODUCTION

Library use of social media has steadily increased over time; in 2013, 86 percent of libraries reported using social media to connect with their patron communities.1 The ways in which libraries use social media tend to vary, but common themes include marketing services, content, and spaces to patrons, as well as creating a sense of community.2 Even with this wealth of research, fewer studies have examined how libraries use Instagram, and those that do often utilize a formal or informal case study methodology.3 This research seeks to fill that gap by examining the types of content shared most frequently by a subset of academic library Instagram accounts. Although this research focused on academic libraries, its methods and findings could be leveraged by educational institutions and non-profits in their own investigations of Instagram usage and impact.

LITERATURE REVIEW

Since its inception in 2010, Instagram's number of account holders has steadily increased. By 2019, more than one billion user accounts were active each month, making it the third most popular social media network in the world, and the Pew Research Center has reported that Instagram is the second most used social media platform among people ages 18-29 in the United States, after Facebook.4 Instagram has estimated that 90 percent of user accounts follow at least one business account.5 Previous research has also shown that individuals who use Instagram to follow specific brands have the highest rates of engagement with, and commitment to, those brands when compared to users of other social media platforms.6 Though businesses are fundamentally different in the products or services they are trying to market, academic libraries share a desire to provide information to, and engage with, their followers. As such, in the past decade, libraries have begun to adopt Instagram as a way to market their libraries and interact with patrons.7 However, methods and parameters for libraries' use of Instagram vary across types of libraries and even within specific library types.8

Research has demonstrated that academic libraries' use of social media, including Instagram, is often for the purpose of increasing the sense of community among librarians and patrons by marketing the library's services and encouraging student feedback and interaction.9
Similarly, Harrison et al. discovered that academic library social media posts reflected three main themes: "community connections, inviting environment, and provision of content."10 Chatten and Roughley have also reported that libraries' use of social media ranges from providing customer service to promoting the library and building a community of users.11 Indeed, when comparing modern social networking systems, such as Instagram, to older platforms, such as Myspace, Fernandez posited that today's popular social media sites encourage networking and are especially suited to creating community.12 Ideally, community engagement in the virtual social media environment would encourage more patrons to enter the library and thus engage in more face-to-face encounters.13

Libraries' methods for measuring the success of their social media engagement are as varied as the ways in which they use social media. Assessment of libraries' social media efficacy is tricky and highly variable from institution to institution. Hastings has cautioned that librarians should recognize that patrons both actively and passively interact with social media content.14 For this reason, while a large number of comments or likes may be identified as positive markers of active engagement, passive forms of engagement, such as the number of times a post appeared in users' Instagram feeds, may also be relevant.15 Therefore, when librarians measure the success of an Instagram post by examining only the number of likes and comments, they should be aware that they are measuring a very specific type of engagement: one which, on its own, may not determine a post's full reach or effectiveness. Other ways to measure engagement include monitoring how the number of people subscribed to an account changes over time, evaluating reach and impressions,16 or analyzing the content of comments (a type of qualitative measure that may indicate the type of community developing around the library's social media).

Despite, or perhaps because of, the general excitement surrounding the possibilities that libraries' engagement with social media can produce, very little has been written about how different types of libraries (such as academic libraries, law libraries, public libraries, etc.), or libraries in general, use these platforms.17 Additionally, many librarians may lack expertise in marketing, including those who are managing social media accounts.18 As social media culture continues to evolve, librarians should move toward a more targeted and pragmatic approach to their Instagram practices. This refinement in social media practices may enable libraries to develop more structure, so that they may create and share the type of content that would achieve their desired result at a given time. However, in order to develop this kind of measured approach, it is necessary for researchers to first analyze libraries' current Instagram practices to determine how posts are being used and the outcomes they generate.

One effective method of analyzing Instagram content centers on coding and classifying images. While many such schemas have been developed for analyzing images posted by Instagram users and businesses, transferring these schemas to academic contexts has been difficult.19 To address this gap, Stuart et al. adapted a schema that had been used to examine how "news media [and] non-profits," as well as businesses, used Instagram.20
to classify Instagram posts produced by academic institutions in the UK and measure the effect of these universities' attempts to engage with students via Instagram.21 Stuart et al.'s schema, which classified Instagram images into six categories (orienting, humanizing, interacting, placemaking, showcasing, and crowdsourcing), was the basis for the present study.22

METHODS

Research Questions

The impetus for this study was to learn more about how academic libraries use Instagram to connect with their campus communities and promote their services and events. The authors of the present study adapted the research questions posed by Stuart et al. to reflect academic library contexts:23

• RQ1: Which type of post category is used most frequently by libraries on Instagram?
• RQ2: Is the number of likes or the existence of comments related to the post category?

Identifying a Sample Population

This study investigated a small subset of academic institutions: the University of Idaho's sixteen peer institutions. These peers have similar "student profiles, enrollment characteristics, research expenditures, [or] academic disciplines and degrees"; each is designated as a land-grant institution; and the University of Idaho considers three to be "aspirational peers."24 After selecting this population, the authors investigated the library websites of each of the sixteen peer institutions to determine whether or not they had a library-specific Instagram account. When a link was not available on the library websites, the authors conducted a search within Instagram as well as a general Google search in an attempt to identify these Instagram accounts. Of the University of Idaho's sixteen peer institutions, eleven had active, library-specific Instagram accounts.

Data Collection

The authors undertook manual data collection between November and December 2018 for these eleven library Instagram accounts. Initial information about each Instagram account was gathered prior to the study on October 23, 2018: the date of the first post, the total number of posts shared by the account, the total number of followers, and the total number of accounts followed. For each account, the authors identified posts shared from January 1, 2018, to June 30, 2018. The "print to PDF" function available in the Chrome browser was used to preserve a record of the content, in case the accounts were later discontinued while research was underway. If a post included more than one image, only the first image was captured in the PDF and analyzed. To organize the 377 Instagram posts shared within this timeframe, the authors assigned each institution a unique, five-digit identifier; file names included this identifier as well as the date of the post (e.g., 00004_IGpost_20180423). This file naming convention ensured that posts were separated based on institution and that future studies could use the same file naming convention, even if the sample size increased significantly. The authors added the file names of all 377 Instagram posts to a shared Google Sheet, and for each post they reported the kind of post (photo or video), the number of likes, and whether comments existed.

Research Data Analysis

Content Analysis

This project adapted the coding schema Stuart et al. employed to investigate the ways in which UK universities used Instagram.25 Expanding on research by McNely, Stuart et al.
employed six Instagram post categories: orienting, humanizing, interacting, placemaking, showcasing, and crowdsourcing.26 For the purposes of the present study, the authors used the same category names when coding library Instagram posts. However, they updated and adapted the descriptions of each category over the course of two rounds of coding to better reflect academic library contexts (see table 1). Within this coding schema, the authors elected to apply only a single category name (i.e., a code) to each library Instagram post.

Interrater Reliability

During the first round of coding, the authors selected two or three institutions every month, independently coded the posts based on the initial adapted schema, met to discuss discrepancies, and identified the final code based on consensus.27 However, during these discussions, it became evident that there was substantial disagreement concerning how specific categories were interpreted. To examine the impact of this disagreement, the authors calculated Fleiss' kappa, which can be used to assess interrater reliability when two or more coders categorically evaluate data.28 Although this project's Fleiss' kappa (0.683554901) was relatively close to a score of 1.0, demonstrating moderate agreement between each of the three coders, the authors recognized that additional fine-tuning of the adapted coding schema would allow for a more accurate representation of the types of content shared by academic libraries. After updating the schema (table 1), a small sample of collected Instagram posts (20 percent, or 76 posts) was randomly selected for independent recoding by each of the authors. Again, after coding this random sample individually, the authors met to seek consensus. Anecdotal feedback from the coders, as well as an increase in the project's Fleiss' kappa (0.795494117), demonstrated that the updated coding schema was more robust and representative. Based on this evidence, the authors randomly distributed the remaining 301 posts amongst themselves; each post was coded by one author.
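For readers unfamiliar with the statistic, the following is the standard definition of Fleiss' kappa (a general formula, not specific to this study's data). For N items rated by n coders into k categories, where n_ij is the number of coders who assigned item i to category j:

$$\kappa = \frac{\bar{P} - \bar{P}_e}{1 - \bar{P}_e}, \qquad \bar{P} = \frac{1}{N}\sum_{i=1}^{N}\frac{1}{n(n-1)}\left(\sum_{j=1}^{k} n_{ij}^{2} - n\right), \qquad \bar{P}_e = \sum_{j=1}^{k}\left(\frac{1}{Nn}\sum_{i=1}^{N} n_{ij}\right)^{2}$$

Here $\bar{P}$ is the mean observed agreement across items and $\bar{P}_e$ is the agreement expected by chance; a kappa of 1 indicates perfect agreement and a kappa of 0 indicates chance-level agreement. In this study, n = 3 coders and k = 6 categories.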
Table 1. Coding Schema for Library Instagram Posts [Adapted from: Emma Stuart, David Stuart, and Mike Thelwall, "An Investigation of the Online Presence of UK Universities on Instagram," Online Information Review 41, no. 5 (2017): 588, https://doi.org/10.1108/OIR-02-2016-0057.] The Example column of the original table contained sample images from the University of Idaho Library's Instagram account.

Crowdsourcing: Posts that were created with the intention of generating feedback within the platform. If the content of the post itself fits within a different classification category, but the image is accompanied by text that explicitly asks for viewer feedback, then the post should be classified as crowdsourcing. Includes requests for followers to like, comment on, or tag others in a particular post.

Humanizing: Posts that aim to emphasize human character or elements of warmth, humor, or amusement. This includes historic/archival photos used to convey these sentiments. This code is only used if both the text and the photo or video can be categorized as humanizing because many library posts contain a "humanizing" element.

Interacting: Posts with candid photographs or videos at library and library-associated events. Includes events within or outside the library.

Orienting: Posts that situate the library within its larger community, especially regarding locations, artifacts, or identities. Text often includes geographic information.

Placemaking: Posts that capture the atmosphere of the library through its physical space and attributes. Includes permanent murals, statues, etc.

Showcasing: Posts that highlight library or campus resources, services, or future events. Can include current or on-going events if people are not the focus of the image (e.g., exhibit, highlight of collection, etc.). These posts can also present information about library operations, such as hours and fundraising. Posts can also entice their audience to do something outside of Instagram, such as visit a specific website.

RESULTS

General Data about the Library Instagram Accounts

As of October 23, 2018 (the date this initial information was gathered), the eleven academic library Instagram accounts had shared a combined 3,124 posts. Most libraries created their Instagram accounts and started posting between 2013 and 2016, but one library shared a post in 2012 and one created their account in April 2018. Since the date of their first post, each account had shared 284 posts on average, while the actual number of posts shared across accounts ranged from 62 to 520. The number of followers and accounts followed across these eleven accounts ranged from 115 to 1,390 and 65 to 2,717, respectively. Between January 1, 2018, and June 30, 2018, these eleven library Instagram accounts shared a total of 377 posts. The number of posts shared by each account during this time period ranged from four to 57, with an average of 34 posts.

RQ1: Which Type of Post Category Is Used Most Frequently by Libraries on Instagram?

Of the 377 posts analyzed, 359 included photos and 18 included videos. More than 50 percent of posts shared were coded as showcasing, with humanizing (18 percent) and crowdsourcing (9.8 percent) being the next most common categories (see table 2), although data demonstrated that individual libraries differed in their use of specific post categories (see table 3). When examining frequency based on category of post, the authors identified slight differences between video and photo posts. As with photos, the majority of videos (55.6 percent) were still coded as showcasing; however, the second most common post category for videos was interacting (16.7 percent).

Table 2. Number and Percentage of Posts by Category for Posts with Photos or Videos

Category       Number of Posts   Percentage of Posts
Crowdsourcing  38                10.1%
Humanizing     68                18.0%
Interacting    16                4.2%
Orienting      28                7.4%
Placemaking    33                8.8%
Showcasing     194               51.5%
Total          377               100%
Table 3. Percentage of Posts by Category and Library for Posts with Photos or Videos

Library   Crowdsourcing   Humanizing   Interacting   Orienting   Placemaking   Showcasing
Lib 1     7.7%            15.4%        0%            23.1%       30.8%         23.1%
Lib 2     4.2%            50.0%        0%            4.2%        0%            41.7%
Lib 3     56.1%           10.5%        1.8%          3.5%        7.0%          21.1%
Lib 4     0%              4.1%         4.1%          4.1%        2.0%          85.7%
Lib 5     0%              24.4%        2.2%          20.0%       26.7%         26.7%
Lib 6     7.5%            18.9%        3.8%          11.3%       11.3%         47.2%
Lib 7     0%              20.0%        0%            0%          10.0%         70.0%
Lib 8     0%              21.6%        9.8%          5.9%        0%            62.7%
Lib 9     0%              25.0%        25.0%         0%          0%            50.0%
Lib 10    0%              16.1%        6.5%          0%          9.7%          67.7%
Lib 11    0%              15.0%        5.0%          5.0%        5.0%          70.0%

RQ2: Is the Number of Likes or the Existence of Comments Related to the Post Category?

Number of Likes by Category

The results of the coding process also indicated that the number of likes differed based on the category of post. When examining photo posts, the authors noted that every post received at least five likes, with most posts receiving between 20 and 39 likes (see table 4). On average, crowdsourcing photo posts generated the highest average number of likes across all categories, followed by orienting and placemaking posts (see table 5). However, it is important to recognize that crowdsourcing posts often asked visitors to participate in a post by "liking" it, often with the chance to win a library-sponsored contest, which may partially explain the higher average number of likes.

Table 4. Number of Posts by Category and Range of Likes for Posts with Photos (does not include posts with videos)

Category       5-19   20-39   40-59   60-79   80-99   100-119   120-140
Crowdsourcing  0      11      16      6       1       1         1
Humanizing     16     26      10      9       5       0         1
Interacting    5      5       3       0       0       0         0
Orienting      2      7       9       8       0       1         0
Placemaking    3      10      12      3       2       1         1
Showcasing     67     83      27      5       1       0         1
Total          93     142     77      31      9       3         4

Table 5. Average Number of Likes by Category for Posts with Photos (does not include posts with videos)

Category       Average Number of Likes   Number of Posts
Crowdsourcing  53.6                      36
Humanizing     39.9                      67
Interacting    27.8                      13
Orienting      50.0                      27
Placemaking    46.9                      32
Showcasing     27.6                      184

Existence of Comments by Category

The authors also examined the existence of comments, another metric for engagement with Instagram posts. Data demonstrated that 78.9 percent of crowdsourcing posts included comments, while a much lower percentage of placemaking (30.3 percent), orienting (28.6 percent), and humanizing (26.5 percent) posts generated this type of engagement (see table 6). As with the data on the number of "likes," many crowdsourcing posts encouraged visitors to comment on a particular post, at times with an incentive connected to this type of engagement.

Table 6. Presence of Comments by Category for Posts with Photos or Videos

Category       Posts with Comments   Posts without Comments   Total Posts   Percentage with Comments
Crowdsourcing  30                    8                        38            78.9%
Humanizing     18                    50                       68            26.5%
Interacting    3                     13                       16            18.8%
Orienting      8                     20                       28            28.6%
Placemaking    10                    23                       33            30.3%
Showcasing     40                    154                      194           20.6%
Total          109                   268                      377           28.9%
DISCUSSION

As noted previously, the post category used most frequently by these eleven libraries on Instagram was showcasing (51.5 percent). The fact that libraries were more likely to share this type of content—which highlighted library resources, events, or collections—is understandable, as library promotion is one of the foundational reasons libraries spend the time and effort required to maintain social media accounts.29 This finding differs substantially from previous research with UK universities, which classified only 28.8 percent of posts as showcasing.30 When examining other post categories, it also became clear that UK universities shared humanizing posts more frequently (31 percent) than the eleven libraries (18 percent) included in this study.31

Although the results of this study demonstrated that showcasing posts were shared most often, the data also indicates that showcasing posts were neither the category with the most likes on average nor the category that received comments most often. Crowdsourcing posts were the category with the highest average number of likes (53.6), with orienting posts coming in at a close second (50), followed by placemaking (46.9) and humanizing (39.9) posts. Showcasing posts, along with interacting posts, only generated slightly more than half the number of likes on average, when compared to the other categories (27.6 and 27.8, respectively). The category with the largest proportion of comments was crowdsourcing posts, with 78.9 percent of posts in this category generating comments from visitors. However, this result is likely skewed, as one of the library Instagram accounts had exceptionally successful crowdsourcing posts, which often included a giveaway or other incentive for participation. In fact, when this institution was removed from the data set, only six crowdsourcing posts remained, two of which generated comments. To better determine whether crowdsourcing posts are always this effective at generating engagement, it would be necessary to code a larger sample of Instagram posts.

It is clear that while showcasing posts were the most common among the Instagram accounts analyzed, they also received the lowest number of likes, on average, and generated comments less frequently than all but one post category. While this may seem disheartening, it is important to remember that the showcasing category includes informational posts that convey library hours, services, or closures; this information may be effectively relayed to users without necessitating an active response in the form of likes and comments. Therefore, one might use different criteria to determine the success of showcasing posts, perhaps examining Instagram data related to reach (the total number of unique visitors that view a post) and impressions (the total number of times a post is viewed).32 Data on reach and impressions are only available to Instagram account "owners." In the current study, the authors did not quantify these types of engagement, as their goal was to evaluate the content and metrics available to all Instagram users, rather than the data that was only available to the "owners" of these library Instagram accounts.

In addition to answering the research questions, coding these Instagram posts prompted several new questions regarding the types of information libraries and other institutions share online. One such question includes: With both universities and academic libraries working with students, why did academic libraries share a smaller percentage of interacting posts than UK universities?33
Additional research is needed to answer this question, but anecdotally, this difference may be related to the fact that universities, as a whole, have a larger number of opportunities to promote and share instances of interaction via Instagram than libraries. For example, general university Instagram accounts often include photos of students and affiliates interacting at large-scale events such as sports games, musical performances, and other student gatherings that take place across campus. Library-specific accounts, on the other hand, have fewer opportunities to post photos that capture individuals "interacting" candidly. Further, the fact that libraries tend to be proponents of privacy rights may inhibit library staff from taking photos of their users and sharing them online without first getting permission. Therefore, differences related to the number of events and the organization type may contribute to whether or not universities and libraries share interacting posts; more research is needed to examine this hypothesis.

Another issue that arose during coding was that, if not for their inclusion of a request to comment, many crowdsourcing posts could have been classified under other categories. If an account follower looked only at the photos included in many of the crowdsourcing posts without reading the captions, they may not interpret those posts as crowdsourcing. Therefore, a future research project might examine whether applying secondary categories to crowdsourcing posts, as a means of further classifying images and not just their captions, could generate a more comprehensive picture of what libraries are sharing on their Instagram accounts.

The authors also discovered that a majority of the library Instagram posts included in this sample contained humanizing elements. Almost all posts attempted to convey warmth, humor, or assistance, and therefore had the potential to be classified as humanizing. To successfully adapt Stuart et al.'s coding schema for academic library Instagram accounts, the authors specified that a post had to have both a humanizing caption and a humanizing photo to be coded as such.34 As with crowdsourcing posts, adding secondary categories to humanizing posts could better reflect the dual nature of this content and help future coders more accurately interpret the types of content shared by academic libraries.

LIMITATIONS AND FUTURE RESEARCH

The number of library Instagram accounts selected as well as the use of a six-month timeframe were limitations of the current study. In the future, selecting a larger sample size and a different group of academic libraries would serve to advance the discipline's understanding of the types of content shared by academic libraries and how users interact with these Instagram posts. Additionally, collecting Instagram posts shared during an expanded timeframe could allow researchers to explore whether library Instagram accounts consistently share the same types of content at various points throughout the year. As mentioned in the Discussion section, future research could also include adding secondary categories to posts, which would allow researchers to gather more granular information about the types of content shared and the relationships between post category, comments, and likes.
Lastly, to better understand the post categories that generate the greatest engagement, collaborative research between institutions could allow researchers to gather and analyze metrics that are only available to account owners, such as impressions and reach. With this type of collaboration, researchers could also investigate how social media outreach goals influence the types of content shared on library Instagram accounts. For example, researchers could conduct interviews or surveys with libraries and ask questions such as: what does your library hope to accomplish with its Instagram account, who are you attempting to reach, how do you define a successful post, what metrics do you use to evaluate your Instagram presence, and do your social media outreach goals influence the types of content shared on Instagram? Pursuing these types of questions, in addition to examining the actual content shared, would allow researchers to gain a more complete picture of what a successful social media presence looks like for an academic library.

CONCLUSION

This research provides initial insight into the Instagram presence of a subset of academic libraries at land-grant institutions in the United States. Expanding on the research of Stuart et al., this project used an adapted coding schema to document and analyze the content and efficacy of academic libraries' Instagram posts.35 The results of this study suggest that social media accounts, including those used by academic libraries, perform better when they reflect the community the library inhabits by highlighting content that is unique to their particular constituents, rather than simply functioning as another platform through which to share information. This study's findings also demonstrate that academic libraries should strive to create an Instagram presence that encompasses a variety of post categories to ensure that their online information sharing meets various needs.

ENDNOTES

1 Nancy Dowd, "Social Media: Libraries are Posting, but is Anyone Listening?," Library Journal 138, no. 10 (May 7, 2013), 12, https://www.libraryjournal.com/?detailStory=social-media-libraries-are-posting-but-is-anyone-listening.

2 Marshall Breeding, Next-Gen Library Catalogs (London: Facet Publishing, 2010); Zelda Chatten and Sarah Roughley, "Developing Social Media to Engage and Connect at the University of Liverpool Library," New Review of Academic Librarianship 22, no. 2/3 (2016), https://doi.org/10.1080/13614533.2016.1152985; Amanda Harrison et al., "Social Media Use in Academic Libraries: A Phenomenological Study," The Journal of Academic Librarianship 43, no. 3 (2017), https://doi.org/10.1016/j.acalib.2017.02.014; Nicole Tekulve and Katy Kelly, "Worth 1,000 Words: Using Instagram to Engage Library Users," Brick and Click Libraries Symposium, Maryville, MO (2013), https://ecommons.udayton.edu/roesch_fac/20; Evgenia Vassilakaki and Emmanouel Garoufallou, "The Impact of Twitter on Libraries: A Critical Review of the Literature," The Electronic Library 33, no. 4 (2015), https://doi.org/10.1108/EL-03-2014-0051.

3 Yeni Budi Rachman, Hana Mutiarani, and Dinda Ayunindia Putri, "Content Analysis of Indonesian Academic Libraries' Use of Instagram,"
Webology 15, no. 2 (2018), http://www.webology.org/2018/v15n2/a170.pdf; Catherine Fonseca, "The Insta-Story: A New Frontier for Marketing and Engagement at the Sonoma State University Library," Reference & User Services Quarterly 58, no. 4 (2019), https://www.journals.ala.org/index.php/rusq/article/view/7148; Kjersten L. Hild, "Outreach and Engagement through Instagram: Experiences with the Herman B Wells Library Account," Indiana Libraries 33, no. 2 (2014), https://journals.iupui.edu/index.php/IndianaLibraries/article/view/16633; Julie Lê, "#Fashionlibrarianship: A Case Study on the Use of Instagram in a Specialized Museum Library Collection," Art Documentation: Bulletin of the Art Libraries Society of North America 38, no. 2 (2019), https://doi.org/10.1086/705737; Danielle Salomon, "Moving on from Facebook: Using Instagram to Connect with Undergraduates and Engage in Teaching and Learning," College & Research Libraries News 74, no. 8 (2013), https://doi.org/10.5860/crln.74.8.8991.

4 "Our Story," Instagram, https://business.instagram.com/; Chloe West, "17 Instagram Stats Marketers Need to Know for 2019," Sprout Blog, April 22, 2019, https://web.archive.org/web/20191219192653/https://sproutsocial.com/insights/instagram-stats/; Pew Research Center, "Social Media Fact Sheet," last modified June 12, 2019, http://www.pewinternet.org/fact-sheet/social-media/.

5 "Our Story," Instagram.

6 Joe Phua, Seunga Venus Jin, and Jihoon Jay Kim, "Gratifications of Using Facebook, Twitter, Instagram, or Snapchat to Follow Brands: The Moderating Effect of Social Comparison, Trust, Tie Strength, and Network Homophily on Brand Identification, Brand Engagement, Brand Commitment, and Membership Intention," Telematics and Informatics 34, no. 1 (2017), https://doi.org/10.1016/j.tele.2016.06.004.

7 Fonseca, "The Insta-Story;" Hild, "Outreach and Engagement;" Lê, "#Fashionlibrarianship;" Rachman, Mutiarani, and Putri, "Content Analysis;" Salomon, "Moving on from Facebook;" Tekulve and Kelly, "Worth 1,000 Words."

8 Vassilakaki and Garoufallou, "The Impact of Twitter."

9 Breeding, Next-Gen Library Catalogs; Hild, "Outreach and Engagement;" Rachman, Mutiarani, and Putri, "Content Analysis;" Vassilakaki and Garoufallou, "The Impact of Twitter."

10 Harrison, Burress, Velasquez, and Schreiner, "Social Media Use," 253.
11 Chatten and Roughley, "Developing Social Media."

12 Peter Fernandez, "'Through the Looking Glass: Envisioning New Library Technologies' Social Media Trends that Inform Emerging Technologies," Library Hi Tech News 33, no. 2 (2016), https://doi.org/10.1108/LHTN-01-2016-0004.

13 Robin M. Hastings, Microblogging and Lifestreaming in Libraries (New York: Neal-Schuman Publishers, 2010).

14 Hastings, Microblogging.

15 Robert David Jenkins, "How Are U.S. Startups Using Instagram? An Application of Taylor's Six-Segment Message Strategy Wheel and Analysis of Image Features, Functions, and Appeals" (MA thesis, Brigham Young University, 2018), https://scholarsarchive.byu.edu/etd/6721.

16 Lucy Hitz, "Instagram Impressions, Reach, and Other Metrics you Might be Confused About," Sprout Blog, January 22, 2020, https://sproutsocial.com/insights/instagram-impressions/.

17 Vassilakaki and Garoufallou, "The Impact of Twitter."

18 Mark Aaron Polger and Karen Okamoto, "Who's Spinning the Library? Responsibilities of Academic Librarians who Promote," Library Management 34, no. 3 (2013), https://doi.org/10.1108/01435121311310914.

19 Yuheng Hu, Lydia Manikonda, and Subbarao Kambhampati, "What We Instagram: A First Analysis of Instagram Photo Content and User Types," Eighth International AAAI Conference on Weblogs and Social Media (2014), https://www.aaai.org/ocs/index.php/ICWSM/ICWSM14/paper/viewPaper/8118; Jenkins, "How Are U.S. Startups Using Instagram?;" Brian J. McNely, "Shaping Organizational Image-Power Through Images: Case Histories of Instagram," Proceedings of the 2012 IEEE International Professional Communication Conference, Piscataway, NJ (2012), https://doi.org/10.1109/IPCC.2012.6408624; Emma Stuart, David Stuart, and Mike Thelwall, "An Investigation of the Online Presence of UK Universities on Instagram," Online Information Review 41, no. 5 (2017): 584, https://doi.org/10.1108/OIR-02-2016-0057.

20 Stuart, Stuart, and Thelwall, "An Investigation of the Online Presence;" McNely, "Shaping Organizational Image-Power," 3.

21 Stuart, Stuart, and Thelwall, "An Investigation of the Online Presence."

22 Stuart, Stuart, and Thelwall, "An Investigation of the Online Presence," 588.

23 Stuart, Stuart, and Thelwall, "An Investigation of the Online Presence," 585.

24 "University of Idaho's peer institutions," University of Idaho, accessed October 8, 2019.

25 Stuart, Stuart, and Thelwall, "An Investigation of the Online Presence," 588.

26 McNely, "Shaping Organizational Image-Power," 4; Stuart, Stuart, and Thelwall, "An Investigation of the Online Presence," 588.

27 Johnny Saldaña, The Coding Manual for Qualitative Researchers (Los Angeles: Sage Publications, 2013), 27.

28 "Fleiss' Kappa," Wikipedia, https://en.wikipedia.org/wiki/Fleiss%27_kappa.

29 Chatten and Roughley, "Developing Social Media."

30 Stuart, Stuart, and Thelwall, "An Investigation of the Online Presence," 590.

31 Stuart, Stuart, and Thelwall, "An Investigation of the Online Presence," 590.
32 Hitz, "Instagram Impressions, Reach, and Other Metrics."

33 Stuart, Stuart, and Thelwall, "An Investigation of the Online Presence," 590.

34 Stuart, Stuart, and Thelwall, "An Investigation of the Online Presence," 588.

35 Stuart, Stuart, and Thelwall, "An Investigation of the Online Presence."

12219 ---- Analytics and Privacy: Using Matomo in EBSCO's Discovery Service

ARTICLES Analytics and Privacy: Using Matomo in EBSCO's Discovery Service Denise FitzGerald Quintel and Robert Wilson INFORMATION TECHNOLOGY AND LIBRARIES | SEPTEMBER 2020 https://doi.org/10.6017/ital.v39i3.12219

Denise FitzGerald Quintel (denise.quintel@mtsu.edu) is Discovery Services Librarian and Assistant Professor, Middle Tennessee State University. Robert Wilson (robert.wilson@mtsu.edu) is Systems Librarian and Assistant Professor, Middle Tennessee State University. © 2020.

ABSTRACT

When selecting a web analytics tool, academic libraries have traditionally turned to Google Analytics for data collection to gain insights into the usage of their web properties. As the valuable field of data analytics continues to grow, concerns about user privacy rise as well, especially when discussing a technology giant like Google. In this article, the authors explore the feasibility of using Matomo, a free and open-source software application, for web analytics in their library's discovery layer. Matomo is a web analytics platform designed around user-privacy assurances. This article details the installation process, makes comparisons between Matomo and Google Analytics, and describes how an open-source analytics platform works within a library-specific application, EBSCO's Discovery Service.

INTRODUCTION

In their 2016 article from The Serials Librarian, Adam Chandler and Melissa Wallace summarized concerns with Google Analytics (GA) by reinforcing how "reader privacy is one of the core tenets of librarianship."1 For that reason alone, Chandler and Wallace worked to implement and test Piwik (now known as Matomo) on the library sites at Cornell University. Taking a cue from Chandler and Wallace, the authors of this paper sought out an analytics solution that was robust and private, that could easily work within their discovery interface, and that could provide the same data as their current analytics and discovery service implementation. This paper will expand on some of the concerns from the 2016 Wallace and Chandler article, make comparisons, and provide installation details for other libraries.

Libraries typically use GA to support data-informed decisions or build discussions on how users interact with library websites. The goal of this pilot project was to determine the similarities between Google Analytics and Matomo, to assess how viable Matomo might be as a Google Analytics replacement, and to bring awareness to privacy concerns in the library. Matomo could easily be installed on multiple websites.
However, this project looked into a specific instance of monitoring, that of the library's discovery layer, EBSCO Discovery Service (EDS).

LITERATURE REVIEW

Google Analytics

The 2005 release of Google Analytics was a massive boon to libraries who had long searched for an easy-to-implement and budget-friendly tool for analytics. Shortly after its release, academic libraries were quick to adopt the platform and install its JavaScript code into their library web pages.2 In a little over a decade, nearly forty scholarly articles were published that discuss the ways in which Google Analytics is used for libraries' websites; these articles not only introduced the service but also discussed the various ways libraries utilize the platform.3 In fact, in their survey of 279 libraries, O'Brien et al.'s 2018 research found that 88 percent of libraries surveyed had implemented Google Analytics or Google Tag Manager.4 In contrast, during that same period, the authors found Matomo, or its earlier name, Piwik, discussed in a total of five scholarly articles, with only three libraries writing about using it as a web analytics tool.5

In addition to measuring website use, libraries found that Google Analytics allowed for several different assessments. In using Google Analytics, libraries could provide immediate feedback for projects, indicate website design change possibilities, create key performance indicators, and determine research paths and user behaviors.6 Convenience of implementation and use, minimal cost, and a user-friendly interface were all reasons cited for the widespread and fast adoption.7

Although the early literature covers a lot of ground about the reporting possibilities and the coverage of Google Analytics, there is rarely a mention of user privacy. Early articles that mention privacy provide a cursory discussion, reiterating that the data collected by Google is anonymous and, therefore, protects the privacy of the user. Recently, there has been a shift in the literature, with articles that now provide more in-depth discussions about user privacy and the concerns libraries have with third parties that collect and host user data. O'Brien et al. discussed the problematic ways that libraries adopted and implemented GA, by either overlooking front-facing policies or implementing it without the consent of their users.8 In their webometrics study, O'Brien et al. found that very few libraries (1 percent) had implemented HTTPS with the GA tracking code, only 14 percent had used IP anonymization, and not a single site utilized both features.9 The concern is not solely Google's control of the data, but Google's involvement with third-party trackers. Third parties, as Pekala remarks, are rarely held accountable.10

With an advertisement revenue of $134 billion in 2019, representing 84 percent of its total revenue, it is important to remember that Google is an advertising company.11 Google's search engine monetization transformed it into one of the world's most recognizable brands. As the most visited site in the world, Google is firmly committed to security, especially when it comes to data theft.
Google offers protection from unwanted access into user accounts, even providing ways for high-risk users, such as journalists or political campaigns, to purchase additional security keys for advanced protection.12 But while Google keeps data breaches and hackers at bay, the user data that Google collects and stores for advertising revenue tells a different story. Google stores user data for months on end; only after nine months is advertisement data anonymized by removing parts of IP addresses. Then, after 18 months, Google will finally delete stored cookie information.13

Recent surveys are reporting an increase in users who want to know how companies are collecting information to provide data-driven services. In a 2019 Pew Research survey, 62 percent of respondents believe it is impossible to go through their daily lives untracked by companies. Additionally, even with the ease that certain data-driven services bring, "81 [percent] of the public reported that the potential risks they face because of data collection by companies outweigh the benefits."14 CISCO Technologies, in a 2019 personal data survey, found a segment of the population (32 percent) that not only cares about data privacy and wants control of their data, but has also taken steps to switch providers, or companies, based on their data policies.15 Additionally, in Pew Research survey results published as recently as April 2020, Andrew Perrin reports that an even larger number of U.S. adults (52 percent) are now choosing not to use products or services out of concerns for their privacy and the personal information companies collect.16 With a growing population of users who make inquiries about who, or what, is in control of their data, a web analytics tool that can easily answer those questions might serve libraries, and their users, well.

COMPARISONS

Google Analytics had been the library's only web analytics tool until the start of the pilot project. During the pilot period, the authors simultaneously ran both analytics tools. Once Matomo was installed, the authors found several similarities between the two products, and discovered that nearly identical analyses could occur, given the quality and quantity of the data collected. The pilot study focused only on one analytics project, which would be the library's discovery layer, EBSCO's Discovery Service. The authors worked with their dedicated EBSCO engineer to replicate the Google Analytics EDS widget and have it configured to send output to Matomo instead.

In making comparisons, one of the common statements about GA and Matomo is that the numbers will never be exact matches, oftentimes with much higher counts presented in GA than in Matomo. Several forums and blogs, even Matomo themselves, admit that there are several possible reasons why there is a noticeable difference between the two.17 Those involved in the discussion theorize that this is due to GA spam hits, bot hits, and Matomo's ability for users to limit tracking. Beyond the counts, both products measure the same kinds of metrics for websites.18 For this project, the authors only wanted to look at specific metrics within EDS, those measurements that look more closely at the user, rather than the larger aggregate data. For the sake of the analysis, it is important to note that although both products have many features, what follows reflects the specific ways the researchers use those features for analytics.
The analytics we collect for EDS strive to answer specific questions:

• Are users searching for known-item or exploratory searches? How often?
• Are users utilizing the facets and limiters? How often?

Although you can use both products to count page views or set events for your website, when looking at meaningful metrics for our discovery system, we focus more on the user level. In Google Analytics, the best way to capture these is by going through the User Explorer tool, which breaks up a user journey into search terms, events, and actions that occur during sessions. In the same way, Matomo provides anonymized user profiles that include search terms, events, and actions in its Visits Log report. In GA, you can export this User Explorer data in JSON format, but only one user at a time, as seen in figure 1. This restriction also means you cannot see data from multiple users, with those details, on a single page. In contrast, in Matomo's Visits Log, you can export the same data (search terms, events, actions) from multiple users in CSV, XML, PHP, TSV, JSON, or HTML formats. As seen in figure 2, Matomo offers a snapshot of this data in an easy-to-read single page, versus Google's one-user-at-a-time option, which requires clicking through to see a user report.

Figure 1. Screenshot of the Google Analytics User Explorer Tool

Figure 2. Screenshot of the Matomo Visits Log Report

In summary, libraries using either of these analytics tools can measure usage and users with page views, visits, and unique visitors. Looking at how users navigate a site is possible with the available user paths, from the initial search, to events as seen in figures 3 and 4, to an exit page URL. Goals can be set and maintained with conversion metrics tied to referrers, visits, user location, devices, or user attributes. Like Google Analytics, Matomo can run reports on engagement and performance, and share customizable, user-friendly graphs or other visual representations.

Figure 3. Peer Reviewed Limiter as Event Action in Google Analytics

Figure 4. Peer Reviewed Limiter Use as Event Name in Matomo

Comparisons on Privacy

Both Google Analytics and Matomo offer ways to protect the privacy of your users. Both offer IP anonymization and the option for data deletion after a certain time, and both provide a Do Not Track feature for users. It is important to note the way Google offers these adjustments to the user. For Matomo, Do Not Track is a default behavior, meaning that the tracker automatically honors a browser's settings for all sites, which is sometimes not the case, as respecting the Do Not Track browser setting is voluntary for websites, not mandatory.19 Google Analytics offers the same service, as long as it is implemented by the user through a browser extension.20 IP anonymization and data deletion are all features that Matomo users can adjust easily from the dashboard, whereas Google Analytics users will need to make those adjustments programmatically.21
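As an illustration of what "programmatically" means here, the analytics.js documentation cited above has sites enable IP anonymization in the page tag itself. A minimal sketch follows; the property ID is a placeholder, not the authors' account:

```javascript
// analytics.js page tag with IP anonymization enabled.
// 'UA-XXXXXXX-Y' is a placeholder Google Analytics property ID.
ga('create', 'UA-XXXXXXX-Y', 'auto');
ga('set', 'anonymizeIp', true); // must be set before any hits are sent
ga('send', 'pageview');
```

In Matomo, by comparison, the equivalent setting is a toggle in the administration dashboard's privacy settings, with no change to the page tag required.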
In Matomo, you can choose to automatically delete your old visitor logs from the database; Matomo recommends keeping detailed logs for three to six months and then deleting the older log data.22 Google Analytics, by contrast, has the user submit a data deletion request to Google, which then creates a report for review before the request is finalized. Even after submitting a request, Google still allows for seven days to reverse that decision. In terms of data retention, Google Analytics gives you the option to retain user data anywhere from 14 months to 50 months, with the option to never expire. Fourteen months is the shortest amount of time you can retain user data for, nothing less.23 IP anonymization is the default for Matomo analytics but is an opt-in feature for Google Analytics. Again, like data retention, any adjustments to IP anonymization in Matomo can occur in the dashboard, with options to have two or three bytes removed from the address. Google Analytics will adjust the last octet to zero.24

Both products are similar in several ways, but the standout feature of Matomo is that the data belongs only to your institution. In his interview with Katherine Schwab for Fast Company, Mathieu Aubry, Matomo's founder, states it clearly:

When [Google] released Google Analytics, [it] was obvious to me that a certain percent of the world would want the same technology, but decentralized, where it's not provided by a centralized corporation and you're not dependent on them… If you use it on your own server, it's impossible for us to get any data from it.25

IMPLEMENTATION AND INSTALLATION

Originally released as Piwik in 2007, Matomo was designed as a replacement for phpMyVisites.26 It is an open-source software application licensed under GNU GPL v3.27 It is designed as a PHP/MySQL application, allowing the server operating system (OS) and web service to best match a user's needs or institutional preferences and expertise.28 To match the organization's preferences and expertise, this Matomo instance was set up as a Linux-Apache-MySQL/PHP (LAMP) stack server (CentOS 7 in our case) with Apache 2.4.6 and MySQL-MariaDB 5.5.60.

The required configurations needed to run Matomo are well documented on the Matomo documentation site as well as the download and documentation area. Depending on the version of Matomo, the mileage a user gets with the documentation may vary. For example, on the recent upgrade to 3.11.0, the instance displayed a warning notification that PHP v7.0 had reached end of life and recommended updating to PHP v7.1 or greater to accommodate future Matomo versions. However, at the time of this writing, the minimum PHP version stated in Matomo's documentation is 5.5.9 or greater.29

Like many PHP applications, once the prerequisite applications are installed (PHP, MySQL, and the selected web service, Apache in this case), the Matomo install is completed by browsing to the server's URL or IP address on port 80. Browsing to the index.php path in a web browser will guide a user through the install process. The installer will also review file directories on the server and inform a user of any permissions problems that will need to be addressed for correct install and use. Compared to other PHP application install experiences, installing Matomo was straightforward and easier to follow than many.
Within a few minutes, the admin user was created and the first website was added. The web-based administration area is also more robust and easier to use than many comparable applications. Many features that might typically require configuration file changes directly on the server, including Matomo upgrades, can be configured through the administration area. While the administration page has many options relating to paid-for premium features, there are several particularly helpful free configuration cards in the interface. Most notable is the "System Summary" card, which displays the current version of Matomo, PHP, and MySQL as well as total users, segments, goals, tracking failures, total websites configured, and a few other metrics. There is a "Tracking Failures" card that notifies of issues with websites, and a "Need Help?" card that links to the Matomo Community forums. Finally, the "System Check" card displays any warnings or errors as well as a link to the full system check report. This is extremely helpful when Matomo has been installed but the instance still needs additional configuration changes or follow-up tasks on upgrades. If there are warnings or errors, the full system report will often have recommendations of changes to make, either in the administration page or on the server in the configuration files. These administration features make maintenance a straightforward process.

Since setting up the server, two upgrades have been completed. In both cases, an email notification was received indicating a new stable release was available. On login to Matomo, this information also appeared as a banner. Simply clicking on the download update option automatically updated the service without any need to access the server directly or via SSH. In both cases the updates ran smoothly, with one exception. In that case, several files were created or overwritten with the root user as the owner. As a result, Matomo indicated an issue with the files and/or path not being found. In actuality, the files did exist, but Matomo no longer had permission to read them. Resolution of the problem required browsing to the directory path indicated in a warning on the server and changing ownership from the root user to the apache user to match other files. Despite this issue, the update process is much more user-friendly than similarly structured applications.

Standalone implementation and installation of Matomo is made simple by the installation documentation that is readily available on the Matomo.com website, especially if one is familiar with PHP/MySQL applications. Adding one or two websites whose architectures a new Matomo user is well acquainted with is a good way for new users to pilot the service and get introduced to Matomo's overall functions without being so overwhelmed that the more granular functions are never learned. A system admin may find maintenance and updates to this service less problematic, with less interruption of the service, than similarly structured applications, while users may find the overall functionality of Matomo easier to use and the finer points of reporting and analytics more transparent and easier to understand than Google Analytics.

Once installed, the authors then tested Matomo on a low-traffic library site. After tracking proved successful, EDS was entered as a new website in the Matomo dashboard and the JavaScript tracking tag was placed in the bottom branding of EDS.
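The tag in question is Matomo's standard JavaScript tracking snippet, which the admin interface generates for each configured site. A representative sketch is shown below; the hostname analytics.example.edu and the site ID 1 are placeholders standing in for the local Matomo server and the ID Matomo assigned when EDS was added:

```javascript
// Matomo's standard tracking snippet (placed inside a <script> element).
// "analytics.example.edu" and the site ID "1" are placeholders, not the
// authors' actual server or site ID.
var _paq = window._paq = window._paq || [];
_paq.push(['trackPageView']);
_paq.push(['enableLinkTracking']);
(function () {
  var u = 'https://analytics.example.edu/';
  _paq.push(['setTrackerUrl', u + 'matomo.php']);
  _paq.push(['setSiteId', '1']);
  // Load matomo.js asynchronously so the tag never blocks page rendering.
  var d = document, g = d.createElement('script'),
      s = d.getElementsByTagName('script')[0];
  g.async = true;
  g.src = u + 'matomo.js';
  s.parentNode.insertBefore(g, s);
})();

// Once the snippet is loaded, a widget such as the EDS one described below
// can report searches and limiter clicks through Matomo's public JS API.
// The search term and category/action names here are hypothetical,
// modeled on the events shown in figure 4:
_paq.push(['trackSiteSearch', 'climate change', false, 42]);
_paq.push(['trackEvent', 'Limiters', 'Peer Reviewed']);
```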
The process of adding EDS as a new site to Matomo was as easy as expected, and the data collection was almost immediate. To mirror the EDS and Google Analytics integration, the authors worked with their EBSCO Library Service Engineer to create a Matomo widget. Luckily, another engineer had previously worked on an integration when Matomo was known as Piwik. Instead of building from the ground up, the Piwik widget only needed clean and updated code to match the Google Analytics widget, which would allow for the tracking of events and site searches. Adding a user outside of the organization to Matomo was necessary for the EBSCO engineer to fine-tune the widget. Matomo admins can set up users with specific permissions within the system, with access to only a specific site. Each Matomo user has their own email address and password (not domain-specific) and settings, and users can even customize their dashboard. After testing proved successful, the new Matomo widget moved into the live profile of EDS, and data collection commenced.

SECURITY

Though the service is in a pilot stage with limited data collection, the authors wanted to ensure an SSL certificate was in place for login to Matomo. With EFF's Certbot (https://certbot.eff.org/), the authors installed a Let's Encrypt (https://letsencrypt.org/) SSL certificate. The SSL certificate is automatically renewed every three months via a cronjob on their server. Because of the power of the administration interface, caution should be used when assigning the "Super User" role to user accounts. It would also be wise to require two-factor authentication (2FA) on the service. Turning on 2FA is a very simple process, and Matomo works with multiple third-party authentication utilities including Authy, LastPass, and 1Password. While each user can choose to activate 2FA, an admin can require it for all users if desired.

CONCLUSION

As the amount of research and rate of adoption testify, since 2005 GA has set the benchmark for assessment of library web asset success and has made possible a completely new understanding of the library user experience and overall assessment of library services. Matomo's earliest iteration appeared shortly after, in 2007, and is a viable alternative to proprietary web analytics applications, with a few notable advantages over GA. From a long-term perspective, the two biggest advantages of Matomo are that it is licensed under a copyleft GPL free and open source software (FOSS) license and that it is designed with user privacy at heart. For libraries, using FOSS applications whenever possible allows them to practice what they preach. FOSS does not mean cost-free. In fact, free in the FOSS sense is more akin to freedom (freedom to download, modify, distribute, and change the code) rather than free of charge. Budgeting for a hosted subscription, support, or the costs of a library running and maintaining the application itself or through an Infrastructure as a Service (IaaS) provider like Amazon Web Services (AWS) or Microsoft's Azure is necessary, but the freedom Matomo provides, by ensuring the library is in control of its patron data, that the data is protected, and that it is not at risk of becoming a product in and of itself, may well be worth the cost.
Like other initiatives in the open-access movement or open-education resources, and as third-party data collection and privacy on the web become more mainstream concerns, opting to use Matomo to protect patron privacy allows libraries to be leaders on issues relating to privacy and intellectual freedom. As noted earlier, there are other feature-based advantages Matomo provides that impact the day-to-day aspects of monitoring web asset use and assessment, like export options and viewing the full log of visits. Lastly, by focusing on EDS in this pilot, the authors were able to demonstrate and verify that Matomo rises to the challenge not just with traditional web asset analytics requirements, but also with library-specific applications like proprietary discovery layer services.

ENDNOTES

1 Adam Chandler and Melissa Wallace, "Using Piwik Instead of Google Analytics at the Cornell University Library," Serials Librarian 71, no. 3 (October 2016): 174, https://doi.org/10.1080/0361526X.2016.1245645.

2 Tabatha Farney and Nina McHale, "Introducing Google Analytics for Libraries," Library Technology Reports 49, no. 4 (May 2013): 5, https://journals.ala.org/ltr/article/download/4269/4881.

3 Paul Betty, "Assessing Homegrown Library Collections: Using Google Analytics to Track Use of Screencasts and Flash-Based Learning Objects," Journal of Electronic Resources Librarianship 21, no. 1 (2009): 75–92, https://doi.org/10.1080/19411260902858631; Jason D. Cooper and Alan May, "Library 2.0 at a Small Campus Library," Technical Services Quarterly 26, no. 2 (2009): 89–95, https://doi.org/10.1080/07317130802260735; Stephan Spitzer, "Better Control of User Web Access of Electronic Resources," Journal of Electronic Resources in Medical Libraries 6, no. 2 (2009): 91–100, https://doi.org/10.1080/15424060902931997; Julie Arendt and Cassie Wagner, "Beyond Description: Converting Web Site Usage Statistics into Concrete Site Improvement Ideas," Journal of Web Librarianship 4, no. 1 (2010): 37–54, https://doi.org/10.1080/19322900903547414; Steven J. Turner, "Website Statistics 2.0: Using Google Analytics to Measure Library Website Effectiveness," Technical Services Quarterly 27, no. 3 (2010): 261–78, https://doi.org/10.1080/07317131003765910; Gail Herrera, "Measuring Link-Resolver Success: Comparing 360 Link with a Local Implementation of WebBridge," Journal of Electronic Resources Librarianship 23, no. 4 (2011): 379–88, https://doi.org/10.1080/1941126X.2011.627809; Wayne Loftus, "Demonstrating Success: Web Analytics and Continuous Improvement," Journal of Web Librarianship 6, no. 1 (2012): 45–50, https://doi.org/10.1080/19322909.2012.651416; Tabatha A. Farney, "Click Analytics: Visualizing Website Use Data," Information Technology & Libraries 30, no. 3 (2011): 141–8, https://doi.org/10.6017/ital.v30i3.1771.

4 Patrick O'Brien et al., "Protecting Privacy on the Web: A Study of HTTPS and Google Analytics Implementation in Academic Library Websites," Online Information Review 42, no. 6 (2018): 734–51, https://doi.org/10.1108/OIR-02-2018-0056.

5 Junior Tidal, "Using Web Analytics for Mobile Interface Development,"
5 Junior Tidal, "Using Web Analytics for Mobile Interface Development," Journal of Web Librarianship 7, no. 4 (2013): 451–64, http://doi.org/10.1080/19322909.2013.835218; Ramiro Federico Uviña, "Bibliotecas Y Analítica Web: Una Cuestión De Privacidad = Libraries and Web Analytics: A Privacy Matter," Información, Cultura Y Sociedad no. 33 (2015): 105–12, http://revistascientificas.filo.uba.ar/index.php/ICS/article/view/1906; Sukumar Mandal, "Site Metrics Study of Koha OPAC through Open Web Analytics and Piwik Tools," Library Philosophy and Practice (2019), https://digitalcommons.unl.edu/libphilprac/2835; Mohammad Azim and Nabi Hasan, "Web Analytics Tools Usage among Indian Library Professionals," 2018 5th International Symposium on Emerging Trends and Technologies in Libraries and Information Services (2018): 31–35, https://doi.org/10.1109/ETTLIS.2018.8485212.

6 Ian Barba et al., "Web Analytics Reveal User Behavior: TTU Libraries' Experience with Google Analytics," Journal of Web Librarianship 7, no. 4 (2013): 389–400, https://doi.org/10.1080/19322909.2013.828991.

7 Betty, "Assessing Homegrown Library Collections."

8 O'Brien et al., "Protecting Privacy on the Web," 734.

9 O'Brien et al., "Protecting Privacy on the Web," 741.

10 Shayna Pekala, "Privacy and User Experience in 21st Century Library Discovery," Information Technology & Libraries 36, no. 2 (2017): 50, https://doi.org/10.6017/ital.v36i2.9817.

11 J. Clement, "Advertising Revenue of Google from 2001 to 2019," Statista, February 5, 2020, https://www.statista.com/statistics/266249/advertising-revenue-of-google; Lily Hay Newman, "The Privacy Battle to Save Google From Itself," Wired, November 1, 2018, https://www.wired.com/story/google-privacy-data/; Ben Popken, "Google Sells the Future, Powered by Your Personal Data," NBC News, May 10, 2018, https://www.nbcnews.com/tech/tech-news/google-sells-future-powered-your-personal-data-n870501; Richard Graham, "Google and Advertising: Digital Capitalism in the Context of Post-Fordism, the Reification of Language, and the Rise of Fake News," Palgrave Communications 3, no. 45 (2017): 2–4, https://doi.org/10.1057/s41599-017-0021-4.

12 "Google Advanced Protection Program," Google, https://landing.google.com/advancedprotection/.

13 "Google Privacy and Terms, Advertising," Google, https://policies.google.com/technologies/ads?hl=en-US.

14 Brooke Auxier et al., "Americans and Privacy: Concerned, Confused and Feeling Lack of Control Over Their Personal Information," Pew Research, November 15, 2019, https://www.pewresearch.org/internet/wp-content/uploads/sites/9/2019/11/Pew-Research-Center_PI_2019.11.15_Privacy_FINAL.pdf.

15 "Consumer Privacy Survey," Cisco, November 2019, https://www.cisco.com/c/dam/en/us/products/collateral/security/cybersecurity-series-2019-cps.pdf.
16 Andrew Perrin, "Half of Americans Have Decided Not to Use a Product or Service Because of Privacy Concerns," Pew Research, April 14, 2020, https://www.pewresearch.org/fact-tank/2020/04/14/half-of-americans-have-decided-not-to-use-a-product-or-service-because-of-privacy-concerns/.

17 "Matomo vs. Google Analytics 360," Matomo.org, https://matomo.org/matomo-vs-google-analytics comparison; Lemon, "A Comparison of Data: Piwik vs. Google Analytics," The FPlus (blog), November 30, 2016, https://thefpl.us/wrote/about-piwik; Himanshu Sharma, "Best Google Analytics Alternatives in 2020—Matomo & Piwik Pro," OptimizeSmart (blog), March 30, 2020, https://www.optimizesmart.com/introduction-to-piwik-best-google-analytics-alternative.

18 "Matomo vs. Google Analytics 360," Matomo.org.

19 Ryan Singel, "Google Holds Out Against 'Do Not Track' Flag," Wired, April 15, 2011, https://www.wired.com/2011/04/chrome-do-not-track; Kieren McCarthy, "Do Not Track Is Back in the US Senate," The Register, May 20, 2019, https://www.theregister.co.uk/2019/05/20/do_not_track; "How Do I Turn on the Do Not Track Features?," Mozilla, https://support.mozilla.org/en-US/kb/how-do-i-turn-do-not-track-feature.

20 "Google Analytics Opt-Out Browser Add-On," Google, https://support.google.com/analytics/answer/181881.

21 "IP Anonymization," Google, https://developers.google.com/analytics/devguides/collection/analyticsjs/ip-anonymization.

22 "Managing Your Database's Size," Matomo.org, https://matomo.org/docs/managing-your-databases-size/#deleting-old-unprocessed-data.

23 "Data Retention," Google, https://support.google.com/analytics/answer/7667196?hl=en&ref_topic=2919631.

24 "IP Anonymization," Google.
25 Katherine Schwab, "It's Time to Ditch Google Analytics," Fast Company, February 1, 2019, https://www.fastcompany.com/90300072/its-time-to-ditch-google-analytics.

26 "Matomo and phpMyVisites," Matomo.org, https://matomo.org/faq/general/faq_437.

27 "Licenses," Matomo.org, https://matomo.org/licences.

28 "Matomo (software)," Wikipedia, https://en.wikipedia.org/wiki/Matomo_(software).

29 "Matomo Requirements," Matomo.org, https://matomo.org/docs/requirements.

12235 ---- Evaluating the Impact of the Long-S upon 18th-Century Encyclopedia Britannica Automatic Subject Metadata Generation Results

ARTICLES

Evaluating the Impact of the Long-S upon 18th-Century Encyclopedia Britannica Automatic Subject Metadata Generation Results

Sam Grabus

INFORMATION TECHNOLOGY AND LIBRARIES | SEPTEMBER 2020 https://doi.org/10.6017/ital.v39i3.12235

Sam Grabus (smg383@Drexel.edu) is an Information Science PhD Candidate at Drexel University's College of Computing and Informatics, and Research Assistant at Drexel's Metadata Research Center. This article is the 2020 winner of the LITA/Ex Libris Student Writing Award. © 2020.

ABSTRACT

This research compares automatic subject metadata generation when the pre-1800s Long-S character is corrected to a standard < s >. The test environment includes entries from the third edition of the Encyclopedia Britannica and the HIVE automatic subject indexing tool. A comparative study of metadata generated before and after correction of the Long-S demonstrated an average of 26.51 percent potentially relevant terms per entry omitted from results if the Long-S is not corrected. Results confirm that correcting the Long-S increases the availability of terms that can be used for creating quality metadata records. A relationship is also demonstrated between shorter entries and an increase in omitted terms when the Long-S is not corrected.

INTRODUCTION

The creation of subject metadata for individual documents is long known to support standardized resource discovery and analysis by identifying and connecting resources with similar aboutness.1 In order to address the challenges of scale, automatic or semi-automatic indexing is frequently employed for the generation of subject metadata, particularly for academic articles, where the abstract and title can be used as surrogates in place of indexing the full text. When automatically generating subject metadata for historical humanities full texts that do not have an abstract, anachronistic typographical challenges may arise.
One key challenge is that presented by the historical "Long-S" < ſ >. In order to account for these idiosyncrasies, there is a need to understand the impact that they have upon the automatic subject indexing output. Addressing this challenge will help librarians and information professionals to determine whether or not they will need to correct the Long-S when automatically generating subject metadata for full-text pre-1800s documents.

The problem of the Long-S in Optical Character Recognition (OCR) for digital manuscript images has been discussed for decades.2 Many scholars have researched methods for correcting the Long-S through the use of rule-based algorithms or dictionaries.3 While the problem of the Long-S is well known in the digital humanities community, automatic subject metadata generation for a large corpus of pre-1800s documents is rare, as is research about the application and evaluation of existing automatic subject metadata generation tools on 18th-century documents in real-world information environments. The impact of the Long-S upon automatic subject metadata generation results for pre-1800s texts has not been extensively explored. The research presented in this paper addresses this need. The paper reports results from basic statistical analysis and visualization of the HIVE (Helping Interdisciplinary Vocabulary Engineering) tool's automatic subject indexing results, before and after the correction of the historical Long-S in the 3rd edition of the Encyclopedia Britannica. Background work was conducted over the summer and fall of 2019, and the research presented was conducted during winter 2020. The work was motivated by current work on the "Developing the Data Set of Nineteenth-Century Knowledge" project, a National Endowment for the Humanities collaborative project between Temple University's Digital Scholarship Center and Drexel University's Metadata Research Center. The grant is part of a larger project, Temple University's "19th-Century Knowledge Project," which is digitizing four historical editions of the Encyclopedia Britannica.4

The next section of this paper presents background covering the historical Encyclopedia Britannica data, the automatic subject metadata generation tool used for this project, a brief background of "the Long-S Problem," and the distribution of encyclopedia entry lengths in the 3rd edition. The background section is followed by the research objectives and the methods supporting the analysis. Next, the results are presented, demonstrating the prevalence of terms omitted from the automatic subject metadata generation results if the Long-S is not corrected to a standard small < s > character, as well as the impact of encyclopedia entry length upon these results. The results are followed by a contextual discussion, and a conclusion that highlights key findings and identifies future research.

BACKGROUND

Indexing for the 19th-Century Knowledge Project

The 19th-Century Knowledge Project, an NEH-funded initiative at Temple University, is fully digitizing four historical editions of the Encyclopedia Britannica (the 3rd, 7th, 9th, and 11th). The long-term goal of the project is to analyze the evolving conceptualization of knowledge across the 19th century.5 The 3rd edition of the Encyclopedia Britannica (1797) is the earliest edition being digitized for this project.
The 3rd edition consists of 18 volumes, with a total of 14,579 pages, and individual entries ranging from four to over 150,000 words. For each individual entry, researchers at Temple have created individual TEI-XML files from the OCR output. In order to enrich accessibility and analysis across this digital collection, the Knowledge Project will be adding controlled vocabulary subject headings into the TEI headers of each encyclopedia entry XML file. Considering the size of this corpus, both in terms of entry length and number of entries, automatic subject metadata generation will be required for the creation of this metadata. The Knowledge Project will employ controlled vocabularies to replace or complement naturally extracted keywords for this process. Using controlled vocabularies adheres to metadata semantic interoperability best practices, ensures representation consistency, and helps to bypass linguistic idiosyncrasies of these 18th- and 19th-century primary source materials.6 We selected two versions of the Library of Congress Subject Headings (LCSH) as the controlled vocabularies for this project. LCSH was selected due to its relational thesaurus structure, multidisciplinary nature, and continued prevalence in digital collections due to its expressiveness and status as the largest general indexing vocabulary.7 In addition to the headings from the 2018 edition of LCSH, headings from the 1910 LCSH are also implemented in order to provide a more multi-faceted representation, using temporally relevant terms that may have been removed from the contemporary LCSH.

The tool applied for this process is HIVE, a vocabulary server and automatic indexing application.8 HIVE allows the user to upload a digital text or URL and select one or more controlled vocabularies; it then performs automatic subject indexing by mapping naturally extracted keywords to the available controlled vocabulary terms. HIVE was initially launched as an IMLS linked open vocabulary and indexing demonstration project in 2009. Since that time, HIVE has been further developed, with the addition of more controlled vocabularies, user interface options, and the RAKE keyword extraction algorithm. The RAKE keyword extraction algorithm was selected for this project after a comparison of topic relevance precision scores for three keyword extraction algorithms.9
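To illustrate the kind of keyword extraction RAKE performs, the following minimal sketch uses the third-party rake-nltk Python package. This is an assumed stand-in chosen for demonstration only, not the implementation embedded in HIVE, and the sample text is invented.

# A minimal illustration of RAKE-style keyword extraction.
# rake-nltk is an assumed stand-in; HIVE's internal implementation differs.
import nltk
nltk.download("stopwords", quiet=True)  # one-time NLTK data downloads
nltk.download("punkt", quiet=True)

from rake_nltk import Rake

text = ("Rum is distilled from fermented sugar. "
        "Yeast is added to the sugar before distillation.")

rake = Rake()  # uses English stopwords and punctuation delimiters by default
rake.extract_keywords_from_text(text)
print(rake.get_ranked_phrases())  # candidate keyword phrases, highest-scoring first

Because RAKE scores candidate phrases using word frequency and co-occurrence, a corrupted token such as "fugar" would surface in place of "sugar" and would fail to map to any controlled vocabulary heading downstream.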
The Long-S Problem

Early in our metadata generation efforts, we discovered that the 3rd edition of the Encyclopedia Britannica employs the historical Long-S. Originating in early Roman cursive script, the Long-S was used in typesetting up through the 18th century, both with and without a left crossbar. By the end of the 18th century, the Long-S fell out of use with printers.10 As outlined by lexicographers of the 17th and 18th centuries, the rules for using the Long-S were frequently vague, complicated, inconsistent over time, and varied according to language (English, French, Spanish, or Italian).11 These rules specified where in a word the Long-S should be used instead of a short < s >, whether it is capitalized, where it may be used in proximity to apostrophes, hyphens, and the letters < f >, < b >, < h >, and < k >; and whether it is used as part of a compound word or abbreviation.12 This is further complicated by the inclusion of the half-crossbar, which occasionally results in two consequences: (a) the Long-S may be interpreted by OCR as an < f >, and (b) an < f > may be interpreted by OCR as a Long-S. Figure 1 shows an example from the 3rd edition entry on Russia, in which the original text specifies "of" (line 1 in top figure), yet the OCR output has interpreted the character as a Long-S. The Long-S may also occasionally be interpreted by the OCR as a lowercase < l >, such as the "univerlity of Dublin" in the 3rd edition entry on Robinson (The most Rev Sir Richard). These complications and inconsistencies are challenges when developing Python rules for correcting the Long-S in an automated way, and even preexisting scripts will need to be adapted for individual use with a particular corpus (a simplified sketch of such rules appears at the end of this section).

Figure 1. Example from the 3rd edition entry on Russia, comparing the original use of a letter < f > in "of" to the OCR output of the same passage, which mistakenly interprets the character as a Long-S.

Despite the transition away from the Long-S towards the end of the 18th century, the 3rd edition of the Encyclopedia Britannica (published in 1797) implements the Long-S throughout, with approximately 100,594 instances of the Long-S in the OCR output. When performing metadata generation with the HIVE tool on the OCR output for an entry, the Long-S is most often interpreted by the automatic metadata generation tool as an < f >, which can result in (a) inaccurate keyword extraction (e.g., Russians → Ruffians), and (b) essential topics becoming unidentifiable when mapping extracted keywords to controlled vocabulary terms, so that HIVE subsequently omits them from the results because they cannot be mapped. Figure 2 provides a truncated view of Long-S words in the 3rd edition entry on Rum, which are subsequently removed from the pool of automatically extracted keywords when performing the automatic subject indexing sequence in HIVE. Because keyword extraction algorithms are largely dependent upon term frequencies, automatic subject indexing for an entry on Rum may be substantially hindered when meaningful and frequently occurring words such as sugar and yeast are removed.

Figure 2. Examples of the Long-S in the 3rd edition Encyclopedia Britannica entry on Rum.

Using this example entry, the automatic subject indexing results were compared using Python to determine which terms only appear when the Long-S has been corrected to the standard < s >. The comparison showed that 16 total terms no longer appeared in the results when the Long-S was not corrected to a standard < s >: ten terms using the 2018 LCSH, and six terms using the 1910 LCSH. These omitted results included the terms sugar and yeast.
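The sketch below illustrates the rule-plus-dictionary approach described above. It is a simplified, hypothetical example: the project's actual correction script is not reproduced in this article, and the small stand-in lexicon is invented for demonstration.

# Simplified, hypothetical sketch of rule- and dictionary-based Long-S repair.
# The project's actual Python rules are more extensive; LEXICON stands in for
# a full English wordlist.
from itertools import combinations

LEXICON = {"russians", "sugar", "yeast", "of", "sort"}  # stand-in dictionary

def normalize_long_s(token):
    # Rule 1: a surviving long s character (U+017F) is always a short s.
    return token.replace("\u017f", "s")

def repair_f_misreads(token, lexicon=LEXICON):
    # Rule 2: if a token is unknown, try turning one or more of its f's into
    # s's (the most common OCR confusion described above) and keep the first
    # variant the lexicon recognizes.
    if token.lower() in lexicon:
        return token
    f_positions = [i for i, ch in enumerate(token) if ch in "fF"]
    for size in range(1, len(f_positions) + 1):
        for combo in combinations(f_positions, size):
            chars = list(token)
            for i in combo:
                chars[i] = "s" if chars[i] == "f" else "S"
            candidate = "".join(chars)
            if candidate.lower() in lexicon:
                return candidate
    return token  # leave unknown tokens untouched for manual review

print(repair_f_misreads(normalize_long_s("Ruffians")))  # -> Russians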
The next section will discuss the encyclopedia entry word count for this corpus, and the possible impact that this may have upon automatic subject indexing between corrected and uncorrected Long-S instances.

Encyclopedia Entry Lengths

Consistent with other Encyclopedia Britannica editions in the 18th and 19th centuries, the encyclopedia entries in the 3rd edition vary substantially in length. A convenience sample of 3,849 3rd edition entries ranging in length from 2 to 202,848 words demonstrated an arithmetic mean of 826.60 words and a median word count of 71. As shown in figure 3, this indicates a significant skew towards shorter entry lengths. For the vast majority of encyclopedia entries in this corpus, a low total word count may affect the degree to which the Long-S impacts automatic subject indexing results, given the importance of term availability and frequency for keyword extraction algorithms.

Figure 3. Scatterplot of word count for a convenience sample of 3,849 3rd edition Encyclopedia Britannica entries.

Large-scale metadata generation requires time, labor, and resources, and it becomes more costly when accounting for the complications of correcting the Long-S for a particular corpus. Library and information professionals working with digital humanities resources will need to understand the impact of correcting or not correcting the Long-S in the corpus before designating resources and developing a protocol for generating the automatic or semi-automatic metadata for full-text resources. This includes understanding whether or not the length of each individual document will affect the degree of Long-S impact upon the results. This challenge, and the issues reviewed above, are addressed in the research presented below.

OBJECTIVES

The overriding goal of this work is to determine the prevalence of omitted terms in automatic subject indexing results when the Long-S is not corrected in the 3rd edition entries of the Encyclopedia Britannica. Research questions:

1. What is the average number of terms that are omitted from automatic subject indexing results when the Long-S is not corrected to a standard < s >?

2. How does the encyclopedia entry length affect the number of terms that are omitted when the Long-S is not corrected to a standard < s >?

This study approaches these goals by performing a comparative analysis of automatic subject indexing results to determine the number of terms that are omitted from the results when the Long-S is not corrected to a standard letter < s >. Basic descriptive statistics are generated to determine central tendency. The quantity of terms omitted is then compared with encyclopedia entry word counts. These objectives were shaped by collaboration between Drexel University's Metadata Research Center and Temple University's Digital Scholarship Center. The next section of this paper will report on methods and steps taken to address these objectives.

METHODS

We approached this research by performing a comparative analysis of subject metadata generated both before and after the correction of the historical Long-S in the 3rd edition of the Encyclopedia Britannica. The HIVE tool was used to automatically generate the subject metadata. Descriptive statistics were applied, and visualizations produced from the results were also examined to identify trends.

Figure 4. The 30 3rd edition Encyclopedia Britannica entries randomly selected for this study, sorted in ascending order by their word counts.

The protocol for performing this research involved the following steps:
1. Compile a sample for testing:

1.1. A random sample of 30 encyclopedia entries was identified from a convenience sample of entries that comprise the letter S volumes of the 3rd edition. The entries range, in length, from 6 to 6,114 words. The median word count for entries in this sample is 99 words.

1.2. The sample of terms selected for this study and their respective word counts are visualized in figure 4.

1.3. For each entry, the Long-S terms in the original XML file were extracted to a list.

2. Perform the automatic subject indexing sequence upon the entries to generate lists of terms:

2.1. Using the 2018 and 1910 versions of the LCSH.

2.2. With fixed maximum subject heading results set to 40: 20 maximum terms returned with the 2018 LCSH, and 20 maximum terms returned with the 1910 LCSH.

2.3. Before Long-S correction and after Long-S correction, using the Oxygen XML Editor TEI-to-TXT transformation.

3. Perform an outer join on Python DataFrames, between terms generated when the Long-S has been corrected vs. terms generated when the Long-S has not been corrected (see the sketch following these steps). The resulting left outer join list displays terms that are omitted from the automatic indexing results if the Long-S is not corrected to a standard small < s >. The quantity of terms omitted is recorded for comparison.

4. Analysis: Descriptive statistics were generated to determine the central tendency of the number and percentage of terms omitted when the Long-S is not corrected. The quantity of terms omitted is also visualized in a continuous scatterplot with the corresponding word counts, to demonstrate that the quantity of terms omitted when the Long-S is not corrected seems to relate to the length of the document being automatically classified.
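The following sketch illustrates the DataFrame comparison described in step 3. The column name and sample values are hypothetical, invented for demonstration; the project's actual script is not reproduced here.

# Illustrative version of step 3: find terms present only in the corrected run.
# The "term" column and the sample values are hypothetical.
import pandas as pd

corrected = pd.DataFrame({"term": ["Sugar", "Yeast", "Rum", "Distillation"]})
uncorrected = pd.DataFrame({"term": ["Rum", "Distillation"]})

# Left outer join with an indicator column; rows marked "left_only" are the
# terms omitted when the Long-S is left uncorrected.
merged = corrected.merge(uncorrected, on="term", how="left", indicator=True)
omitted = merged.loc[merged["_merge"] == "left_only", "term"]
print(len(omitted), sorted(omitted))  # 2 ['Sugar', 'Yeast']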
RESULTS

The results report the prevalence of omitted terms when the Long-S is not corrected to a standard < s >, as well as a visualization of the number of terms omitted as it relates to encyclopedia entry length. For each of the 30 sample entries automatically indexed with HIVE, a fixed maximum number of 40 terms were returned: a maximum of 20 terms using the 2018 LCSH, and a maximum of 20 terms using the 1910 LCSH. As seen in table 1, central tendency is measured using the arithmetic mean and median, along with the standard deviation and range. The average number of terms omitted from an entry's results is 6.73, and the average percentage of terms omitted from an entry's results is 26.51 percent, with the 2018 and 1910 editions of LCSH performing at similar rates. The full results are displayed in appendix A.

Table 1. Measures of centrality, standard deviation, range, and percentage for the quantity of terms omitted when the Long-S is not corrected to a standard < s >, rounded to the hundredth. For each entry, a maximum of 40 terms were returned: 20 using 2018 LCSH and 20 using 1910 LCSH. The total number of results returned varies according to the entry length; these totals are reported in appendix B. (N = 30 entries.)

Measure | Both Vocabularies | 2018 LCSH | 1910 LCSH
Average, Terms Omitted | 6.73 | 3.67 | 3.07
Median, Terms Omitted | 5 | 3 | 2
Standard Deviation | 6.53 | 3.84 | 3.17
Range, Terms Omitted | 0–24 | 0–13 | 0–11
Average Percentage, Omitted Terms | 26.51% | 27.51% | 24.28%
Median Percentage, Omitted Terms | 22.36% | 20.00% | 19.09%

For each entry in the sample, the results in appendix A display the total terms omitted when the Long-S is not corrected, the number of 2018 LCSH terms omitted, the number of 1910 LCSH terms omitted, and the encyclopedia entry word count. Figure 5 visualizes the total number of terms omitted for each entry when the Long-S is not corrected, demonstrating an increase in terms omitted for entries with lower word counts. These results are broken down by vocabulary in figure 6, demonstrating that both vocabularies used to generate these results show a significant increase in omitted terms for shorter entries.

Figure 5. Number of automatic subject indexing terms that are omitted when the Long-S is not corrected to a standard < s >, as compared by encyclopedia entry word count.

Figure 6. Number of automatic subject indexing terms that are omitted when the Long-S is not corrected to a standard < s >, as compared by encyclopedia entry word count, separated by controlled vocabulary version.

DISCUSSION

The analysis above presents measures of centrality for the quantity of terms omitted if the Long-S is not corrected to a standard < s > prior to automatic subject indexing using HIVE, as well as a visualization to represent the relationship between encyclopedia entry word count and number of terms omitted. Although researchers have identified challenges with the Long-S and have focused a great deal on the technologies and methods used to correct it, there is still limited work examining the results of not correcting the Long-S character when performing an automatic subject indexing sequence. This research demonstrated an average of 6.73 potentially relevant terms omitted from automatic indexing results when the Long-S is not corrected, accounting for an average of 26.51 percent of the total results, with an approximately equal distribution of omitted terms across the two controlled vocabulary versions used. When the quantity of terms omitted is visualized using a continuous scatterplot, the results also demonstrate a significant increase in omitted terms for shorter entries, with longer entries less affected. These results reflect the impact of term frequency and total word count in keyword extraction and automatic subject indexing, with longer documents having a greater pool of total terms from which to identify key terms.

Considering the complexities and similarities of the typographical characters in the original manuscript, the OCR output process for this corpus occasionally mistakes the letters < s >, < f >, < r >, and < l >. As a result, an occasional Long-S word in this study did not originally contain an < s > (e.g., sor instead of for). Correction of these Long-S OCR errors requires the development of a dictionary-based script. An additional complication of this research is that the corrected OCR output for the encyclopedia entries still contains a few errors not related to the Long-S, which will prevent the mapping of the term to any controlled vocabulary term (e.g., in the entry on Sepulchre, the OCR output for the term Palestine was Palestinc). These results are specific to this particular corpus of 3rd edition Encyclopedia Britannica entries, but it is very likely that testing another set of pre-1800s documents containing the Long-S would also illustrate that, for best results with any algorithm or tool, the Long-S needs to be corrected.
The results are also specific to the two versions of the LCSH used, the 1910 LCSH and the 2018 LCSH, which are available in the HIVE tool. The 1910 version is key for the time period being studied, and the 2018 version, contemporary to today, supported additional analysis of the impact of the Long-S. Both of these vocabularies are important to the larger 19th-Century Knowledge Project. It should be noted that while the LCSH is updated weekly, we were limited to what is available via the HIVE tool, and any discrepancies that may be found with the 2020 LCSH will very likely have a minimal effect upon metadata generation results. The 2020 LCSH will be incorporated into HIVE soon and can be explored in future research.

CONCLUSION AND NEXT STEPS

The objective of this research was to determine the impact of correcting the Long-S in pre-1800s documents when performing an automatic metadata generation sequence using keyword extraction and controlled vocabulary mapping. This was accomplished by performing an automatic subject indexing sequence using the HIVE tool, followed by a basic statistical analysis to determine the quantity of terms omitted from the results when the Long-S is not corrected to a standard < s >. The number of omitted terms was also compared with the encyclopedia entry word count and visualized to demonstrate a significant increase in omitted terms for shorter encyclopedia entries. The study was conclusive in confirming that the correction of the Long-S is a critical part of our workflow. The significance of this research is that it demonstrates the necessity of correcting the Long-S prior to performing automatic subject indexing on historical documents.

Beyond the correction of the Long-S, the larger next steps for this project are to continue to explore automatic metadata generation for this corpus. These next steps include the comparison of results using contemporary vs. historical vocabularies, streamlining a protocol for bulk classification procedures, and integrating terms into the TEI-XML headers. The research presented here can inform other digital humanities and even science-oriented projects, where researchers may not be aware of the impact of the Long-S on automatic metadata generation not only for subjects but also for named entities, particularly when automatic approaches with controlled vocabularies are desired.

ACKNOWLEDGEMENTS

The author thanks Dr. Jane Greenberg and Dr. Peter Logan for their guidance. The author acknowledges the support of NEH grant #HAA-261228-18.
APPENDIX A

Entry Term | Total Terms Omitted | 2018 LCSH Terms Omitted | 1910 LCSH Terms Omitted | Encyclopedia Entry Word Count
SARDIS | 24 | 13 | 11 | 381
SUCTION | 24 | 13 | 11 | 38
STYLITES, PILLAR SAINTS | 19 | 13 | 6 | 199
SHADWELL | 14 | 10 | 4 | 211
SALICORNIA | 13 | 6 | 7 | 254
SEPULCHRE | 11 | 3 | 8 | 348
SITTA NUTHATCH | 9 | 5 | 4 | 620
SPRAT | 9 | 3 | 6 | 475
SERAPIS | 8 | 5 | 3 | 587
STRADA | 8 | 1 | 7 | 189
SHOAD | 7 | 4 | 3 | 463
SIGN | 7 | 5 | 2 | 68
SHOOTING | 6 | 3 | 3 | 6114
STRATA | 6 | 3 | 3 | 2920
STEWARTIA | 5 | 4 | 1 | 72
SUBCLAVIAN | 5 | 3 | 2 | 20
SCHWEINFURT | 4 | 2 | 2 | 84
SCROLL | 4 | 2 | 2 | 45
SPALATRO | 4 | 3 | 1 | 99
SPECIAL | 4 | 3 | 1 | 24
SAMOGITIA | 3 | 2 | 1 | 112
SHAKESPEARE | 3 | 0 | 3 | 3855
SINAPISM | 2 | 1 | 1 | 25
SECT | 1 | 1 | 0 | 20
SEVERINO | 1 | 1 | 0 | 38
SHADDOCK | 1 | 1 | 0 | 6
SCARLET | 0 | 0 | 0 | 65
SHALLOP, SHALLOOP | 0 | 0 | 0 | 42
SOLDANELLA | 0 | 0 | 0 | 56
SPOLETTO | 0 | 0 | 0 | 99

APPENDIX B

(N = 30 entries)

 | Average Terms Returned | Median Terms Returned
Corrected | 24.77 / 40 possible | 28 / 40 possible
Uncorrected | 26.47 / 40 possible | 29 / 40 possible
2018 LCSH Corrected | 14.10 / 20 possible | 19 / 20 possible
2018 LCSH Uncorrected | 13.47 / 20 possible | 18.5 / 20 possible
1910 LCSH Corrected | 11.27 / 20 possible | 11 / 20 possible
1910 LCSH Uncorrected | 10.13 / 20 possible | 9 / 20 possible

ENDNOTES

1 Liz Woolcott, "Understanding Metadata: What is Metadata, and What is it For?," Cataloging & Classification Quarterly (November 17, 2017), https://doi.org/10.1080/01639374.2017.1358232; Koraljka Golub et al., "A framework for evaluating automatic indexing or classification in the context of retrieval," Journal of the Association for Information Science and Technology 67, no. 1 (2016), https://doi.org/10.1002/asi.23600; Lynne C. Howarth, "Metadata and Bibliographic Control: Soul-Mates or Two Solitudes?," Cataloging & Classification Quarterly 40, no. 3-4 (2005), https://doi.org/10.1300/J104v40n03_03.

2 A. Belaid et al., "Automatic indexing and reformulation of ancient dictionaries" (paper presented at the First International Workshop on Document Image Analysis for Libraries, Palo Alto, CA, 2004), https://doi.org/10.1109/DIAL.2004.1263264.

3 Beatrice Alex et al., "Digitised Historical Text: Does it have to be mediOCRe?" (paper presented at KONVENS 2012 (LThist 2012 workshop), Vienna, September 21, 2012); Ted Underwood, "A half-decent OCR normalizer for English texts after 1700," The Stone and the Shell, December 10, 2013, https://tedunderwood.com/2013/12/10/a-half-decent-ocr-normalizer-for-english-texts-after-1700/.

4 "Nineteenth-century knowledge project" (GitHub repository), 2020, https://tu-plogan.github.io/.

5 "Nineteenth-century Knowledge Project."

6 Marcia Lei Zeng and Lois Mai Chan, "Metadata Interoperability and Standardization - A Study of Methodology, Part II," D-Lib Magazine 12, no. 6 (2006); G. Bueno-de-la-Fuente, D. Rodríguez Mateos, and J. Greenberg, "Chapter 10 - Automatic Text Indexing with SKOS Vocabularies in HIVE" (Elsevier Ltd, 2016); Sheila Bair and Sharon Carlson, "Where Keywords Fail: Using Metadata to Facilitate Digital Humanities Scholarship," Journal of Library Metadata 8, no. 3 (2008), https://doi.org/10.1080/19386380802398503.

7 John Walsh, "The use of Library of Congress Subject Headings in digital collections," Library Review 60, no. 4 (2011), https://doi.org/10.1108/00242531111127875.
8 Jane Greenberg et al., "HIVE: Helping interdisciplinary vocabulary engineering," Bulletin of the American Society for Information Science and Technology 37, no. 4 (2011), https://doi.org/10.1002/bult.2011.1720370407.

9 Sam Grabus et al., "Representing Aboutness: Automatically Indexing 19th-Century Encyclopedia Britannica Entries," NASKO 7 (2019): 138–48, https://doi.org/10.7152/nasko.v7i1.15635.

10 Karen Attar, "S and Long S," in Oxford Companion to the Book, eds. Michael Felix Suarez and H. R. Woudhuysen (Oxford: Oxford University Press, 2010); Ingrid Tieken-Boon van Ostade, "Spelling systems," in An Introduction to Late Modern English (Edinburgh University Press, 2009).

11 Andrew West, "The Rules for Long-S," TUGboat 32, no. 1 (2011).

12 Attar, "S and Long S."

12367 ---- Seeing through Ontologies

EDITORIAL BOARD THOUGHTS

Seeing through Vocabularies

Kevin Ford

INFORMATION TECHNOLOGY AND LIBRARIES | JUNE 2020 https://doi.org/10.6017/ital.v39i2.12367

Kevin Ford (kevinford@loc.gov) is Librarian, Linked Data Specialist in the Library of Congress's Network Development and MARC Standards Office. He works on the Library's Bibframe Initiative, and similar projects, such as MADS/RDF, and is a member of the ITAL Editorial Board. The ideas and opinions expressed here are those of the author and do not necessarily reflect those of his employer.

"Ontologies" are popular in library land. "Vocabularies" are popular too, but it seems that the library profession prefers "ontologies" over "vocabularies" when it comes to defining classes and properties that attempt to encapsulate some realm of knowledge. Bibframe, MADS/RDF, BIBO, PREMIS, and FRBR are well-known "ontologies" in use in the library community.1 They were defined either by librarians or to be used mainly in the library space, or both. SKOS, FOAF, Dublin Core, and Schema are well-known "vocabularies."2 They are used widely by libraries, though none were created by librarians or specifically for library use. In all cases, those ontologies and vocabularies were created for the very purpose of publication for broader use, which is one of the primary objectives behind creating one: to define a common set of metadata elements to facilitate the description and sharing of data within a group or groups of users.

Ontologies and vocabularies are common when working with RDF (Resource Description Framework), a very simple data model in which information is expressed as a series of triple statements, each consisting of three parts: a subject, a predicate, and an object. The types of ontologies and vocabularies referred to here are in fact defined using RDF—Thing A is a Class and Thing Z is a Property.
Those using any given ontology or vocabulary employ the defined classes and properties to further describe their Things, for lack of a better word. It is useful to provide an example. The first block of triples below represents Class and Property definitions in RDF Schema (RDFS), which provides some very basic means to define classes and properties and some relationships between them, such as the domains and ranges for properties. The second block is instance data.

ontovoc:Book rdf:type rdfs:Class
ontovoc:authoredBy rdf:type rdf:Property
ontovoc:authorOf rdf:type rdf:Property

ex:12345 rdf:type ontovoc:Book
ex:12345 ontovoc:authoredBy ex:abcde

ontovoc:Book is defined as a Class and ontovoc:authoredBy is defined as a Property. Using those declarations, it is possible to then assert that ex:12345, which is an identifier, is of type ontovoc:Book and was authored by ex:abcde, an identifier for the author. Is the first block—the definitions—an "ontology" or a "vocabulary?" Putting aside the question for now, air quotes—in this case, literal quotes—have been employed around "ontologies" and "vocabularies" to suggest that these are more terms of art than technical distinctions, though it must also be acknowledged that there is a technical distinction to be made.

Ontologies in the RDF space frequently, if not always, use classes and properties from the Web Ontology Language (known as OWL) to define a specific realm's classes and properties and how they relate to each other within that realm of knowledge. This is because OWL is a more expressive definition language than basic RDFS. Using OWL, and considering the example above, ontovoc:authoredBy could be defined as an inverse of ontovoc:authorOf.

ontovoc:authoredBy owl:inverseOf ontovoc:authorOf

In this way, and given the little instance data above (the two triples that begin ex:12345), it is then possible to infer the following bit of knowledge:

ex:abcde ontovoc:authorOf ex:12345

Now that the owl:inverseOf triple/declaration has been added to the definitions, it's worth re-asking: Do the definitions represent an "ontology" or a "vocabulary?" A purist might answer "not an ontology," but only because those statements have not been combined in a document, which itself has been given a URI and declared to be an owl:Ontology. That's the actual OWL Class that says, "This is an OWL Ontology." But let's say those statements had been added to a document published at a URI and declared to be an owl:Ontology. Is it an ontology now? Perhaps in a strict sense the answer is "yes." But in a practical sense few would view those four declarations, wrapped neatly in a document that has been given a URI and called an Ontology, as an "ontology." It doesn't quite rise to the occasion—"ontologies" almost always have a broader scope and employ more formal semantics—making its use a term of art, often, rather than a real technical distinction. Yet, based on the same narrow definition (a published document declaring itself to be an owl:Ontology) combined with a far more extensive set of class and property definitions with defined relationships between them, it is possible to describe FOAF as an ontology.3 But it is widely known as, and understood as, a "vocabulary." (There is also an experimental version of Schema as OWL.4) And that gets to the crux of the issue in many ways.
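The inverse-property inference just described can be reproduced with off-the-shelf tools. The sketch below uses the Python rdflib and owlrl libraries, an illustrative tooling choice rather than anything this column prescribes, and the namespace URIs are invented.

# Materializing the owl:inverseOf inference from the example above.
# rdflib and owlrl are assumed tooling; the namespace URIs are invented.
from rdflib import Graph, Namespace, RDF
from rdflib.namespace import OWL
import owlrl

ONTOVOC = Namespace("http://example.org/ontovoc/")
EX = Namespace("http://example.org/ex/")

g = Graph()
g.add((ONTOVOC.authoredBy, OWL.inverseOf, ONTOVOC.authorOf))
g.add((EX["12345"], RDF.type, ONTOVOC.Book))
g.add((EX["12345"], ONTOVOC.authoredBy, EX.abcde))

# Compute the OWL-RL closure; the inferred triple is added to the graph.
owlrl.DeductiveClosure(owlrl.OWLRL_Semantics).expand(g)
print((EX.abcde, ONTOVOC.authorOf, EX["12345"]) in g)  # True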
Putting aside the technical distinction that can be argued to identify something as an "ontology" versus a "vocabulary," there are non-technical semantics at work here—what was earlier described as a "term of art"—about when, how, and why something is deemed an "ontology" versus a "vocabulary." The library community appears to think of their creations as "ontologies" and not "vocabularies," even when the documentation tends to avoid the word "ontology." For example, the opening sentence of the Bibframe and MADS/RDF documentation very clearly introduces each as a "vocabulary," as does FRBR in RDF.5 On the surface they may be presented as "vocabularies," which they are of course, but despite this prominent self-declaration they are not seen in the same light as FOAF or Schema but instead as something more exacting, which they also are. It is worth contemplating why they are viewed principally as "ontologies" and to examine whether this has been beneficial. Perhaps the ideas behind designating something a "vocabulary" are, in fact, more in line with the way libraries operate, whereas "ontologies" represent an ideal (and who doesn't set their sights on the ideal?), striving toward which only exposes shortcomings and sows confusion.

The answer to "why" is historical and probably derives from a combination of lofty thinking, traditional standards practices, and good ol' misunderstanding. Traditional standards practices favor more formal approaches.
Libraries' decades-long experience with XML and XML Schema contributed significantly to this mindset. XML Schema provides a way to describe the precise construction of an XML document, and it can then be used to validate the XML document. XML Schema defines what elements and attributes are permitted in the XML document and frequently dictates their order. It can further constrain the values of an element or attribute to a select list of options. In many ways, XML Schema was the very expression of metadata quality control. Librarians swooned. With the right controls and technology in place, it was impossible to produce poor, variable metadata.

In the case of semantic modelling, OWL is certainly a more formal approach. It's founded in description logics, whose expressions take the form of occult-like mathematics, at least as viewed by a librarian with a humanities background. OWL can be used to declare domains and ranges for properties. One can also designate a property as a Datatype Property, meaning it takes a value such as a string or a date as its value, or an Object Property, which means it will reference another RDF resource as its object. But these declarations are actually more about inferencing—deriving information by applying the ontology against some instance data—and not about restrictions, constraints, or validation. To be clear, there are ways to apply restrictions in OWL—"wine can be either red or white"—but this is a form of advanced OWL modelling that is not well understood and not often implemented, and virtually never in ontologies designed by librarians. Conversely, indicating a domain for a property, for example, is easy, relatively straightforward, and seductive because it gives the appearance that the property can only be used with resources of a specific class. Consider: the domain of ontovoc:authoredBy is ontovoc:Book. That does not mean that ontovoc:authoredBy can only be used with an ontovoc:Book resource. It means that whatever resource uses ontovoc:authoredBy must therefore be an ontovoc:Book. Defining that domain for that property is not restricting its use only to books; it allows one to derive the additional knowledge that the thing it is used with must be a book even if it doesn't identify itself as one (a short sketch at the end of this passage makes this concrete). This may seem like a subtle distinction and/or it may seem like tortured logic, but if it does it may suggest that one's point of view, one's mindset, favors constraints, restrictions, and validations. And that's OK. That's library training and conditioning, completely reinforced in our daily work. It's what has been taught in library schools for decades and practiced by library professionals even longer. Names should be entered "last name, first name" and any middle initial, if known, included. The data in this field should only be a three-character language code from this approved list of language codes. These rules, and the consistency resulting from them, are what make library data so often very high quality. Google loves MARC records from our community for this very reason.

Wishing to exert strong control at the definition level when creating a model or metadata scheme with an eye to data quality, it is a natural inclination for librarians to gravitate to a more formal means of defining a model, especially one that seems to promise constraints. So, despite these models self-describing at a high level as vocabularies, the models themselves employ a considerable amount of OWL at the technical level, which becomes the focus of any users wishing to implement the model. Users comprehend these models as something more than a vocabulary and therefore view the model through this more complex lens. Unfortunately, because OWL is poorly understood (sometimes by creators and sometimes by users, and sometimes by both), this leads to various problems. On the one hand, creators and users believe there are technical restrictions or constraints where there are, in fact, none. When this happens, the "constraint" is either identified as a problem ("Consider removing the range for this property") or—and this is more damaging—the property (read: model/vocabulary/ontology) is avoided. Even when it is recognized that the "constraint" is not a real restriction (just a means to infer knowledge), forging ahead can generate new issues. When faced with a domain and range declaration, for example, forging ahead can result in inaccurate, imprecise, or simply undesirable inferences. Most of the currently open "issues" (about 50 at the time of writing) about Bibframe follow a basic pattern: 1) there is a declaration about this Property or this Class that makes it difficult to use because of how it has been defined with OWL; 2) we cannot really use it presently because it would cause potential inferencing issues; 3) consider altering the OWL definitions.6 Pursuing an (OWL) ontology, while formal and seemingly comforting because it feels a little like constraining the metadata schema, can result in confusion and a lack of adoption. Given that vocabularies and ontologies are developed and published to encourage users to describe their data in a way that fosters wide consumption by others, this is unfortunate to say the least. It is notable that SKOS, FOAF, Dublin Core, and Schema have very different scopes and potentially much wider user bases than the more library-specific ontologies (Bibframe, MADS/RDF, BIBO, etc.).
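Continuing with the same assumed tooling (rdflib plus owlrl), the sketch below makes the domain point concrete: a resource that is never declared to be a book is simply inferred to be one, and nothing is rejected or validated.

# rdfs:domain yields an inference, not a restriction. Tooling and URIs are
# illustrative assumptions, as in the previous sketch.
from rdflib import Graph, Namespace, RDF, RDFS
import owlrl

ONTOVOC = Namespace("http://example.org/ontovoc/")
EX = Namespace("http://example.org/ex/")

g = Graph()
g.add((ONTOVOC.authoredBy, RDFS.domain, ONTOVOC.Book))
# ex:99999 is a hypothetical resource that is never typed as a Book...
g.add((EX["99999"], ONTOVOC.authoredBy, EX.abcde))

owlrl.DeductiveClosure(owlrl.OWLRL_Semantics).expand(g)
# ...yet the closure concludes it must be one; no statement is rejected.
print((EX["99999"], RDF.type, ONTOVOC.Book) in g)  # True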
There is something to be learned here: the smaller the domain, the more effective an ontology might be; the larger the universe, the better a more general approach may be. It is further true that FOAF, Dublin Core, and Schema define specific domains and ranges for many of their properties, but they have strived for clarity and simplicity. The creators of Schema, for example, eschewed the formal semantics behind RDFS and OWL and redefined domain and range to better match their needs and (perhaps unexpectedly) most users' automatic understanding.7 What is generally true is that each of the "vocabularies" approached the creation and defining of their models so as to minimize the use of formal semantics, and promoted this as a feature. In this way, they limited or removed altogether the actual or psychological barriers to adoption. Their offering was more accessible, less fussy. Bearing in mind the differences in scale and scope, they have been rewarded with a wider adopter base and passionate advocates.

The decision to create a "vocabulary" or an "ontology" is a technical one and a political one, both of which must be in alignment. It's a mindset and it is a statement. It is entirely possible to define the model at a technical level using OWL, making it by definition an ontology, but to have it be perceived, and used, as a vocabulary because it is flexible and not strictly defined. Likewise, it is not enough to call something a vocabulary when in reality it is a model burdened with formal semantics that is then expected to be adopted and used widely. If the objective is to fashion a (pseudo?) restrictive metadata set with rules that inform its use, and which is strongly bonded with a specific community, develop an "ontology," but recognize that this may result in confusion and lack of uptake. If, however, the desire is to cultivate a metadata element set that is flexible, readily usable, and positioned to grow in the future because it employs fewer rules and formal semantics, create a "vocabulary." That's really what is being communicated when we encounter ontologies and vocabularies. Interestingly, the political difference between "vocabulary" and "ontology" appears, in fact, to be understood by librarians: library models self-identify as "vocabularies." But once past those introductory remarks, the truth is exposed quickly in the widespread use of OWL, revealing beyond doubt that it is not a flexible, accommodating vocabulary but a strictly defined model. To dispense with the air quotes: as librarians we're creating ontologies and calling them vocabularies. We really want to be creating vocabularies that are ontologies in name only.

ENDNOTES

1 "Bibframe Ontology," Library of Congress, accessed May 21, 2020, http://id.loc.gov/ontologies/bibframe.html; "MADS/RDF (Metadata Authority Description Schema in RDF)," Library of Congress, accessed May 21, 2020, http://id.loc.gov/ontologies/madsrdf/v1.html; "Bibliographic Ontology Specification," The Bibliographic Ontology, accessed May 21, 2020, http://bibliontology.com/; "PREMIS 3 Ontology," Premis Editorial Committee, accessed May 21, 2020, http://id.loc.gov/ontologies/premis3.html; Ian Davis and Richard Newman, "Expression of Core FRBR Concepts in RDF," accessed May 21, 2020, https://vocab.org/frbr/.
2 Alistair Miles and Sean Bechhofer, editors, "SKOS Simple Knowledge Organization System Reference," W3C, accessed May 21, 2020, https://www.w3.org/TR/skos-reference/; Dan Brickley and Libby Miller, "FOAF Vocabulary Specification 0.99," accessed May 21, 2020, http://xmlns.com/foaf/spec/; "DCMI Metadata expressed in RDF Schema Language," Dublin Core™ Metadata Initiative, accessed May 21, 2020, https://www.dublincore.org/schemas/rdfs/; "Welcome to Schema.org," Schema.org, accessed May 21, 2020, http://schema.org/.

3 "FOAF Ontology," xmlns.com, accessed May 21, 2020, http://xmlns.com/foaf/spec/index.rdf.

4 See "OWL" at "Developers," Schema.org, accessed May 21, 2020, https://schema.org/docs/developers.html.

5 See "Bibframe Ontology" and "MADS/RDF (Metadata Authority Description Schema in RDF)" above.

6 "Issues," Bibframe Ontology at GitHub, accessed May 21, 2020, https://github.com/lcnetdev/bibframe-ontology/issues.

7 R. V. Guha, Dan Brickley, and Steve Macbeth, "Schema.org: Evolution of Structured Data on the Web," acmqueue 15, no. 9 (December 15, 2015): 14, https://dl.acm.org/ft_gateway.cfm?id=2857276&ftid=1652365&dwn=1.

12383 ---- Facing What's Next, Together

LITA PRESIDENT'S MESSAGE

Facing What's Next, Together

Emily Morton-Owens

INFORMATION TECHNOLOGY AND LIBRARIES | JUNE 2020 https://doi.org/10.6017/ital.v39i2.12383

Emily Morton-Owens (egmowens.lita@gmail.com) is LITA President 2019-20 and the Acting Associate University Librarian for Library Technology Services at the University of Pennsylvania Libraries.

When I wrote my March editorial, I was optimistically picturing some of the changes that we are now seeing for LITA—while being scarcely able to imagine how the world and our profession would need to adapt quickly to the impacts on library services as a result of COVID-19. It is a momentous and exciting change for us to turn the page on LITA and become Core, yet this suddenly pales in comparison to the challenges we face as professionals and community members.

Libraries' rapid operational changes show how important the ingenuity and dedication of technology staff are to our libraries. Since states began to shut down, our listserv, lita-l, has hosted discussions on topics like how to provide person-to-person reference and computer assistance remotely, how to make computer labs safe for re-occupancy, how to create virtual reading lists to share with patrons, and how to support students with limited internet access. There has been an explosion in practical problem-solving (ILS experts reconfiguring our systems with new user account settings and due dates), ingenuity (repurposing 3D printers and conservation materials to make masks), and advocacy (for controlled digital lending). Sometimes the expense of library technologies feels heavy, but these tools have the ability to scale services in crucial ways—making them available to more people at the same time, available to people who can only take advantage after hours, available across distances.
Technologists are focused on risk, resilience, and sustainability, which makes us adaptable when the ground rules change. Our websites communicate about our new service models and community resources; ILL systems regenerate around increased digital delivery; reservation systems for laptops now allocate the use of study seating. Our library technology tools bridge past practices, what we can do now, and what we'll do next. One of our values as ALA members is sustainability. (We even chose this as the theme for LITA's 2020 team of Emerging Leaders.) Sustainability isn't about predicting the future and making firm plans for it; it's about planning for an uncertain future, getting into a resilient mindset, and including the community in decision-making. Although the current crisis isn't climate-related per se, this way of thinking is relevant to helping libraries serve their communities. We will need this agile mindset as we confront new financial realities. Our libraries and ALA itself are facing difficult budget challenges, layoffs, reorganizations, and fundamental conversations about the vitalness of the services we provide. My favorite example from my own library of a COVID-19 response is one where management, technical services, and IT innovated together. Our leadership negotiated an opportunity for us to gain access to digitized, copyrighted material from HathiTrust that corresponds to print materials currently locked away in our library building. Thanks to decades of careful effort by our technical services team, we had accurate data to match our print records with records for the digital versions. Our IT team had processes for loading the new links into our catalog almost instantaneously. The result was a swift and massive bolstering of our digital access precisely when our users needed it most.
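As an illustration of the matching step just described (a sketch under stated assumptions, not the Penn Libraries' actual process), the snippet below pairs print catalog records with digital surrogates by OCLC number. The file names and column layout (a catalog.csv with record_id and oclc_number; a hathitrust.csv with oclc_number and hathi_url) are hypothetical; a production job would work from MARC exports and HathiTrust's published Hathifiles instead.

import csv

def match_print_to_digital(catalog_path, hathi_path):
    """Pair local print records with digital surrogates by OCLC number."""
    with open(hathi_path, newline="") as f:
        digital = {row["oclc_number"]: row["hathi_url"] for row in csv.DictReader(f)}
    links = []
    with open(catalog_path, newline="") as f:
        for row in csv.DictReader(f):
            url = digital.get(row["oclc_number"])
            if url:
                # Each pair can be bulk-loaded into the ILS as an access link.
                links.append((row["record_id"], url))
    return links

if __name__ == "__main__":
    for record_id, url in match_print_to_digital("catalog.csv", "hathitrust.csv"):
        print(record_id, url)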
This collaboration perfectly illustrates how natural our merger with ALCTS and LLAMA is. As threats to our profession and the ways we've done things in the past gather around us, I am heartened by the strengths and opportunities of Core. It is energizing to be surrounded by the talent of our three organizations working together. I hope more of our members experience that over the summer and fall, as we convene working groups and hold events together, including a unique social hour at ALA Virtual and an online fall Forum. I close out my year serving as the penultimate LITA president in a world with more sadness and uncertainty than we could have foreseen. We are facing new expectations and new pressures, especially financial ones. As professionals and community members, we are animated by our sense of purpose. While LITA has been transformed by our vote to continue as Core, the support and inspiration we provide each other in our association will carry on.

PUBLIC LIBRARIES LEADING THE WAY LibraryVPN: A New Tool to Protect Patron Privacy Chuck McAndrew INFORMATION TECHNOLOGY AND LIBRARIES | JUNE 2020 https://doi.org/10.6017/ital.v39i2.12391 Chuck McAndrew (chuck.mcandrew@leblibrary.com) is Information Technology Librarian, Lebanon (NH) Public Libraries.

Due to increased public awareness of online surveillance, a rise in massive data breaches, and spikes in identity theft, there is high demand for privacy-enhancing services. VPN (virtual private network) services are a proven way to protect online security and privacy. VPNs' effectiveness and ease of use have led to a boom in VPN service providers globally. VPNs protect privacy and security by offering an encrypted tunnel from the user's device to the VPN provider. VPNs ensure that no one who is on the same network as the user can learn anything about their traffic except that they are connecting to a VPN. This prevents surveillance of data from any source, including commercial snooping, such as your ISP trying to monetize your browsing habits by selling your data; malicious snooping, such as a fake wifi hotspot in an airport hoping to steal your data; or government-level surveillance that can target political activists and reporters in repressive countries. Some people might ask why we need a VPN as HTTPS becomes more ubiquitous and provides end-to-end encryption for your web traffic. HTTPS will encrypt the content that goes over the network, but metadata such as the site you are connecting to, how long you are there, and where you go next are all unprotected. Additionally, some very important network protocols, such as DNS, are unencrypted, and anyone on the path can read them. A VPN eliminates all of those issues.
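To see just how exposed DNS is, here is a small self-contained Python sketch, an illustration added to this column rather than part of the LibraryVPN software, that hand-builds a DNS query. Every byte of it, including the hostname being looked up, crosses the network in cleartext unless a VPN (or encrypted DNS) wraps it. The resolver address 8.8.8.8 is just an example.

import socket
import struct

def build_dns_query(hostname):
    """Build a bare-bones DNS A-record query by hand to show that every
    byte, including the hostname, travels as plaintext."""
    # Header: ID, flags (recursion desired), QDCOUNT=1, then three zero counts.
    header = struct.pack(">HHHHHH", 0x1234, 0x0100, 1, 0, 0, 0)
    # Question: length-prefixed labels, a zero terminator, QTYPE=A, QCLASS=IN.
    question = b"".join(
        bytes([len(label)]) + label.encode("ascii")
        for label in hostname.split(".")
    ) + b"\x00" + struct.pack(">HH", 1, 1)
    return header + question

query = build_dns_query("example.org")
print(query)  # the hostname is plainly visible in the raw datagram

with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as s:
    s.settimeout(3)
    s.sendto(query, ("8.8.8.8", 53))  # any on-path observer sees this request
    response, _ = s.recvfrom(512)
print(f"received {len(response)} bytes of (equally unencrypted) response")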
However, there are two major problems with current VPN offerings. First, all reliable VPN solutions require a paid subscription. This puts them out of reach of economically vulnerable populations who often have no access to the internet in their homes. In order to access online services, they may rely on public internet connections such as those provided by restaurants, coffee shops, and libraries. Using publicly accessible networks without the security benefits of a VPN puts people's security and privacy at great risk. This risk could be eliminated by providing free access to a high-quality VPN service. The second problem is that using a VPN requires people to place their trust in whatever VPN company they use. Some (especially free solutions) have proven not to be worthy of that trust by containing malware or by leaking, and even outright selling, customer data. Companies that abuse customer data are taking advantage of vulnerable populations who are unable to afford more expensive solutions or who do not have the knowledge to protect themselves. Together, these two problems create a situation where security and privacy are only available to those who can afford them and have the knowledge to protect themselves. Libraries are ideally positioned to help with this situation. Libraries work to provide privacy and security to people every day. This can mean teaching classes, making privacy resources available, and even advocating for privacy-friendly laws. Libraries are also located in almost every community in the United States and enjoy a high level of trust from the public. Librarians can be thought of as being a physical VPN. People who come into libraries know that what they read and the information they seek out will be protected by the library. In fact, libraries have helped to get laws protecting the library records of patrons passed in all 50 US states. People know that when a library offers a service to their community it isn't because they want to sell their information or show them advertisements. With libraries, our patrons are not the product. Libraries also already provide many online services to all members of their community, regardless of financial circumstances. Examples include access to online databases, language-learning software, and online access to periodicals such as the New York Times or Consumer Reports. Many of these services would cost too much for patrons to access individually. By pooling their resources, communities are able to make more services available to all of their citizens. To help address the above issues, the Lebanon Public Libraries, in partnership with the Westchester (New York) Library System, the LEAP Encryption Access Project (https://leap.se/), and TJ Lamanna (Emerging Technology Librarian from Cherry Hill Public Library and Library Freedom Institute graduate), started the LibraryVPN project. This project will allow libraries to offer a VPN to their patrons. Patrons will be able to download the LibraryVPN application on a device of their choosing and connect to their library's VPN server from wherever they are. LibraryVPN was first conceived a number of years ago, but the real start of the project was when it received an IMLS National Leadership Grant (LG-36-19-0071-19) in 2019. This grant was to develop integrations between LEAP's existing VPN solution and integrated library systems using SIP2, which will allow library patrons to sign in to LibraryVPN using their library card. This grant also included development of a Windows client (there was already a Mac and Linux client) and alpha testing at the Lebanon Public Libraries and Westchester Library System.
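For readers unfamiliar with SIP2, the sketch below shows roughly what a patron-verification exchange looks like: a Patron Status Request (message 23) goes to the ILS and a Patron Status Response (message 24) comes back. This is an added illustrative sketch only, not LibraryVPN code; the host, port, institution ID, barcode, and PIN are all placeholders, and field support (such as the BL valid-patron and CQ valid-PIN flags) varies by ILS.

import socket

HOST, PORT = "ils.example.org", 6001  # hypothetical SIP2 endpoint

def sip2_patron_status(barcode, pin):
    """Send a SIP2 Patron Status Request (23) and return the raw response (24)."""
    # Fixed fields: a 3-character language code and an 18-character timestamp,
    # followed by AO (institution), AA (patron ID), AC (terminal password),
    # and AD (patron password), each pipe-terminated.
    message = (
        "23001"
        + "20200601    120000"
        + "AOmain|"
        + f"AA{barcode}|"
        + "AC|"
        + f"AD{pin}|"
        + "\r"
    )
    with socket.create_connection((HOST, PORT), timeout=5) as conn:
        conn.sendall(message.encode("ascii"))
        return conn.recv(4096).decode("ascii", errors="replace")

response = sip2_patron_status("21234000012345", "9999")
# Many servers report BLY for a valid patron and CQY for a valid password.
print("valid patron:", "BLY" in response, "| valid PIN:", "CQY" in response)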
We are currently working on moving into the testing phase of the software and planning phase two of this project. Phase two of LibraryVPN will involve expanding our testing to up to 12 libraries and conducting end-user testing with patrons and library staff. We have submitted an application for IMLS funding for phase two and are actively looking for libraries that are excited about protecting patron privacy and would like to help us beta test this software. If you work for a library that would be interested in participating, you can reach us via email at libraryvpn@riseup.net or @libraryvpn on Twitter. If you would like to help out with this project in another way, we would love to have more help. Please reach out. We are currently thinking about three deployment models for libraries in phase two. First would be an on-premises deployment. This would be for larger library systems with their own servers and IT staff. LibraryVPN is free and open-source software and can be deployed by anyone. Since it uses SIP2 to connect to your ILS, it should work with any ILS that supports the SIP2 protocol. This deployment model has the advantage of not requiring any hosting fees but does require the library system to have staff who can deploy and manage public-facing services. Drawbacks to this approach would include higher bandwidth use and dealing with abuse complaints. Phase two testing should give us better data about how much of an issue this will be, but our experience hosting a Tor exit node at the Lebanon Public Libraries suggests that it won't be too bad to deal with. Our second deployment model would be cloud hosting. If a library has IT staff who can deploy services to the cloud, they could host their own LibraryVPN service without needing their own hardware. However, when deploying to the cloud, there will be ongoing costs for running the servers and for bandwidth used. Figuring out how much bandwidth an average user will consume is part of the data we are hoping to get from our phase two testing, so we can offer guidelines to libraries that choose to deploy their own LibraryVPN service. Finally, we are looking at a hosted version of LibraryVPN. We anticipate that smaller systems that do not have dedicated servers or IT staff will be interested in this option. In this case, there would be ongoing hosting and support costs, but managing the service would not be any more complicated than subscribing to any other service the library hosts for their patrons. LibraryVPN is a new project that is pushing library services outside of the library to wherever the user is. We want to make sure that all of our patrons are protected, not just those with the financial ability and technical know-how to get their own VPN service. As librarians, we understand that privacy and intellectual freedom are joined, and we want to maximize both. As the American Library Association's Code of Ethics (http://www.ala.org/tools/ethics) says, "We protect each library user's right to privacy and confidentiality."

LETTER FROM THE EDITOR A Blank Page Kenneth J. Varnum INFORMATION TECHNOLOGY AND LIBRARIES | JUNE 2020 https://doi.org/10.6017/ital.v39i2.12405

Nothing is as daunting as a blank page, particularly now. As I sat down to write this issue's letter, I was struck by how much fundamental uncertainty is in our lives, so much trauma. A blank page can emphasize our concerns: whether the old familiar will return at all, or whether a new, better normal will emerge. At the same time, a blank page can be liberating at a time when so much of our social, professional, and personal lives needs to be reconceptualized and reactivated in new, healthier, more respectful and inclusive ways. We are collectively faced with two important societal ailments. The first is the literal disease of the COVID-19 pandemic that has been with us for only months. The other is the centuries-long festering disease of racial injustice, discrimination, and inequality that typifies (particularly, but not uniquely) American society. While some of us may be in better positions to help heal one or the other of these two ailments, we can all do something about both, as different as they are. Lend emotional support to those in need of it, take part in rallies if your personal health and circumstances allow, and advocate for change to government officials at all levels from local to national. Learn about the issues and explore ways you can make a difference on either or both fronts. I hope I am not being foolish or naive when I say I believe the blank page before us as a society will be liberating: an opportunity to shift ourselves toward a better, more equitable, more just path.

* * * * * *

To rephrase Humphrey Bogart's Rick Blaine in Casablanca, "it doesn't take much to see that the problems of three little library association divisions don't amount to a hill of beans in this crazy world." But despite the small global impact of our collective decision, I am glad our ALCTS, LLAMA, and LITA colleagues chose a united future as Core: Leadership, Infrastructure, Futures.
Watch for more information about what the merged division means for our three divisions and this journal in the months to come. Sincerely, Kenneth J. Varnum, Editor (varnum@umich.edu), June 2020.

ARTICLE The Role of the Library in the Digital Economy Serhii Zharinov INFORMATION TECHNOLOGY AND LIBRARIES | DECEMBER 2020 https://doi.org/10.6017/ital.v39i4.12457 Serhii Zharinov (serhii.zharinov@gmail.com) is Researcher, State Scientific and Technical Library of Ukraine. © 2020.

ABSTRACT The gradual transition to a digital economy requires all business entities to adapt to new environmental conditions through their digital transformation. These tasks are especially relevant for scientific libraries, as digital technologies are changing the main subject field of their activities: the processes of creating, storing, and disseminating information. In order to find directions for the transformation of scientific libraries and determine their role in the digital economy, this paper studies the features of digital transformation and the experience of the digital transformation of foreign libraries. Management of research data, implemented through the creation of Current Research Information Systems (CRIS), was found to be one of the most promising areas of the digital transformation of libraries. The problem areas of this direction and ways of engaging libraries in it are also analyzed.

INTRODUCTION The transition to a digital economy contributes to the even greater penetration of digital technologies into our lives and to the emergence of new conditions of competition and new trends in organizations' development. Big data, machine learning, and artificial intelligence are becoming common tools implemented by the pioneers of digital transformation.1 Significant changes in the main functions of libraries (the storage and dissemination of information) caused by the development of digital technologies affect the operational activities of libraries, the requests users and partners bring to the library, and the ways those requests are met. In the process of adapting to these changes, the role of libraries in the digital economy is changing. This study is designed to find current areas of library development and to determine the role of the library in the digital economy. Achieving this goal requires study of the "digital economy" concept and the peculiarities of the digital transformation of organizations in order to better understand the role of the library in it; research on the development of libraries to determine what best fits the new role of the library in the digital economy; and identification of obstacles to the development of this area and of ways to engage libraries in it.

THE CONCEPT OF THE "DIGITAL ECONOMY" The transition to an information society and digital economy will gradually change all industries, and all companies must change accordingly.2 Taking advantage of the digital economy is the main driving force of innovation, competitiveness, and economic development of a country.3 The transition to a digital economy is not instant but occurs over many years. The topic emerged at the end of the twentieth century but has experienced rapid growth in recent years. In the Web of Science (WoS) citation database, publications with this term in the title began to appear in 1996 (figure 1).
Figure 1. The number of publications in the WoS citation database for the query "digital economy." (The chart itself is not reproduced here; its vertical axis runs from 0 to 700 publications and its horizontal axis from 1995 to 2020.)

One of the first books devoted entirely to the study of the digital economy concept is the work of Don Tapscott, published in 1996. In this book, the author understands the digital economy as an economy in which the use of digital computing technologies becomes the dominant component of economic activity.4 In 2000, the American statistician and economist Thomas Mesenbourg identified the three main components of the digital economy: e-business, e-commerce, and e-business infrastructure.5 A number of works on the development of indicators to assess the state of the digital economy, in particular the work of Philippe Barbet and Nathalie Coutinet, are based on the analysis of these components.6 Alnoor Bhimani, in his 2003 paper "Digitization and Accounting Change," defined the digital economy as "the digital interrelationships and dependencies between emerging communication and information technologies, data transfers along predefined channels and emerging platforms, and related contingencies within and across institutional and organizational entities."7 Bo Carlsson's 2004 article described the digital economy as a dynamic state of the economy characterized by the constant emergence of new activities based on the use of the internet and on new forms of communication between different authors of ideas, whose communication allows them to generate new activities.8 In 2009, John Hand defined the digital economy as the new design or use of information and communication technologies that help transform the lives of people, society, or business.9 Carmen Nadia Ciocoiu, in her 2011 article, explained the digital economy as a state of the economy in which, thanks to technology, knowledge and networking begin to play a more important role than capital in a post-industrial society.10 In a 2014 article, Lesya Kit defined the digital economy as an element of the network economy, characterized by the transformation of all spheres of the economy through the transfer of information resources and knowledge to a computer platform for further use.11 Ukrainian scientists Mykhailo Voinarenko and Larysa Skorobohata, in a 2015 study of network tools, gave the following definition: "The digital economy, unlike the Internet economy, assumes that all economic processes (except for the production of goods) take place independently of the real world. Goods and services do not have a physical medium but are 'electronic.'"12 Yurii Pivovarov, director of the Ukrainian Association for Innovation Development (UAID), gives the following definition: "Digital economy is any activity related to information technology. And in this case, it is important to separate the terms: digital economy and IT sphere. After all, it is not about the development of IT companies, but about the consumption of services or goods they provide—online commerce, e-government, etc.—using digital information technology."13 Taking the above into account, in this study the digital economy is defined as a digital infrastructure that encompasses all business entities and their activities.
The transition to the digital economy is the process of creating conditions for the digital transformation of organizations, of building digital infrastructure, and of gradually involving various economic entities and sectors of the economy in that infrastructure. One of the first practical and political manifestations of the transition to the digital economy was the European Commission's Digital Economy and Society Index (DESI), first published in 2014. The main components of the index are connectivity, human capital, use of the internet, integration of digital technology, and digital public services. Among European countries in 2019, there was significant progress in the digitalization of business and in the interaction of society with the state.14 For Ukraine, the first step towards the digital economy was the Concept of the Development of the Digital Economy and Society of Ukraine, which defines the understanding of the digital economy and the directions and principles of the transition to it.15 For active representatives of the public sector, this concept is a signal that the development of structures and organizations should be based not on improving operational efficiency but on transformation in accordance with the requirements of Industry 4.0. Confirmation of the seriousness of the Ukrainian government's intentions in this direction is the creation of the Ministry of Digital Transformation in 2019 and the delivery of the newest public services through online channels.16 One of the priority challenges to be solved at the stage of transition to the digital economy is the development of skills in working with digital technologies across the entire population. This is relevant not only for Ukraine but also for the European Union. In Europe, a third of the active workforce lacks basic digital skills; in Ukraine, 15.1 percent of Ukrainians have no digital skills, and the share of the working population with below-average digital skills is 37.9 percent.17 Part of the solution to this challenge in Ukraine is entrusted to the "Digital Education" project, implemented by the Ministry of Digital Transformation (osvita.diia.gov.ua), which, through mini-series created for different target audiences, is intended to build digital literacy among the population of Ukraine.

FEATURES OF DIGITAL TRANSFORMATION Developed digital skills in the population make the digital transformation of organizations not just a competitive advantage but a prerequisite for their survival. The more accustomed the target audience is to the benefits of the digital economy, the more actively the organization must adapt to new requirements and customer needs and to the new competitive environment. Digital transformation of the organization is a complex process that is not limited to the implementation of software or the automation of certain components of production. It includes changes to all elements of the company, including methods of manufacturing and customer service, the organization's strategy and business model, and management approaches and methods.
According to a study by McKinsey, the integration of new technologies into a company's operations can reduce profits in 45 percent of cases.18 Therefore, it is extremely important to take a comprehensive approach to digital transformation: understanding the changes being implemented, choosing the method of their implementation, and gradually involving all structural units and business processes in the transformation. A Boston Consulting Group study identified six factors necessary for the effective use of the benefits of modern technologies:19

• connectivity of analytical data;
• integration of technologies and automation;
• analysis of results and application of conclusions;
• strategic partnership;
• competent specialists in all departments; and
• a flexible structure and culture.

McKinsey consultants draw attention to the low percentage of successful digital transformation efforts and, based on the successful experience of 83 companies, form five categories of recommendations that can contribute to successful digitalization:20

• involvement of leaders experienced in digitalization;
• development of staff digital skills;
• creating conditions for the use of digital skills by staff;
• digitization of the company's tools and working procedures; and
• establishing digital communication and ensuring the availability of information.

Experts at the Institute of Digital Transformation identify four main stages of digital transformation in the company:21

1. Research, analysis, and understanding of customer experience.
2. Involvement of the team in the process of digital transformation and implementation of a corporate culture that contributes to this process.
3. Building an effective operating model based on modern systems.
4. Transformation of the business model of the organization.

The "Integrated Model of Digital Transformation" study identifies focusing on priority digital projects, developed and implemented by dedicated organizational teams, as one of the key factors of successful digital transformation. The authors identify three main functional activities for digital transformation teams, the implementation of which provides a gradual, comprehensive renewal of the company: the creation and implementation of digital strategy, digital activity management, and the digitization of operational activities.22 In their study, Ukrainian scientists Natalia Kraus, Oleksandr Holoborodko, and Kateryna Kraus determine that the general pattern for all digital economy projects is their focus on a specific consumer and the comprehensive use of available information about that consumer, which is a condition of project effectiveness.23 Initially, a project is pre-tested on a small scale, and only after obtaining satisfactory results from testing the new principles of activity on a narrow target audience is the project scaled to a wider range of potential users. All this reduces the risks associated with digital transformation. Eliminating unnecessary changes and false hypotheses on a small scale makes it possible to avoid overspending at the stage of a comprehensive transformation of the entire enterprise.
Therefore, the process of effective digital transformation should begin with the involvement of leaders experienced in the field of digital transformation, analysis of the weaknesses of the organization, and the building of a plan for its comprehensive transformation, divided into individual projects implemented by qualified teams, with a gradual increase in the scope of these projects as their effectiveness is confirmed on a small scale. The process of digital transformation should be accompanied by constant training of employees in digital skills. The goal of digital transformation is to build an efficient, high-performing company that can quickly adapt to new environmental conditions, which is achieved through the introduction of digital technologies and of new methods and tools of organizational management.

DIRECTIONS OF LIBRARY DEVELOPMENT IN THE DIGITAL ECONOMY Based on the study of the digital economy concept and the peculiarities of digital transformation, a review of library development in the digital economy was conducted to find the library's place in digital infrastructure and to identify potential projects that can be implemented in an individual library as part of its comprehensive transformation plan. The main task is to determine the new role of the library in the digital economy and the areas that best meet it. The search for directions in the development of the library in response to the spread of digital technology began at the end of the last century. One of the first concepts to reflect the impact of the internet on the library sector is the concept of the digital library, published in 1999.24 In 2006, the concept of "library 2.0" emerged, based on the use of Web 2.0 technologies: dynamic sites, users who become data authors, open-source software, and API interfaces through which data added to one database is immediately fed to partner databases.25 The spread of social networks and mobile technologies, and their successful use in library practice, led to the formation of the concept of "library 3.0."26 The development of open source, cloud services, big data, augmented reality, context-aware computing, and other technologies influenced library activities, as reflected in "library 4.0."27 Researchers, scholars, and the professional community continued to develop concepts of the modern library, drawing on the experience of implementing changes in library activities and taking into account the development of other areas, and in 2020 articles began to appear describing the concept of "library 5.0," based on a personalized approach to students: support of each student during the whole period of study, development of the skills necessary for learning, and a set of other supporting actions integrated into the educational process.28 In determining the current role of the library in the digital economy, it is necessary to pay attention to a study by Denys Solovianenko, who identifies research and educational infrastructure as one of the key elements of scientific libraries of the twenty-first century.29 Olga Stepanenko considers libraries part of the information and communication infrastructure, the development of which is one of the main tasks of transforming the socioeconomic environment in accordance with the needs of the digital economy; this infrastructure ensures high efficiency for stakeholders and sets the pace of digitalization of the state economy, which occurs
through the development of its constituent elements.30 The importance of digital services replacing traditional library services is demonstrated, using the example of the Moravian Library, in a study by Michal Indrák and Lenka Pokorná published in April 2020.31 Projects that contribute to the library's adaptation to the conditions of the digital economy, implemented in the environment of public libraries, include: digitization of library collections (including historical heritage) and the creation of databases of full-text documents; provision of free access to the internet via library computers and Wi-Fi; organization of online customer service and development of services that do not require a physical presence in the library; and organization of events to develop users' digital and information skills.32 Under such conditions, the role of the librarian as a specialist in the field of information changes from custodian to intermediary and distributor.33 One of the main objectives of library activity in the digital economy becomes overcoming the digital divide: disseminating knowledge about modern technologies and innovations, assisting the community in their use, and developing digital skills in all users of the library.34 An example of the digital public library is the Digital Library North project in Canada, which resulted in the creation of the Inuvialuit Digital Library (https://inuvialuitdigitallibrary.ca). The project lasted four years, bringing together researchers from different universities and the community in the region, who together digitized cultural heritage documents and created metadata. The library now has more than 5,200 digital resources collected in 49 catalogues. The implementation of this project provides access to library services and information for a significant number of people living in remote areas of Northern Canada who are unable to visit libraries (https://sites.google.com/ualberta.ca/dln/home?authuser=0).35 Other representatives of modern digital libraries, one of whose main tasks is the preservation of cultural heritage and the spread of national culture, are the British Library (https://www.bl.uk), the Hispanic Digital Library of the Biblioteca Nacional de España (http://www.bne.es), the Gallica digital library in France (https://gallica.bnf.fr), the German Digital Library (Deutsche Digitale Bibliothek, https://www.deutsche-digitale-bibliothek.de), and the European Library (https://www.europeana.eu). Another direction has been the development of analytical skills in information retrieval. Academic libraries, applying their competencies in information retrieval and information technology and refining the results of analysis, were able to better identify trends in academia and to expand cooperation with teachers to update their curricula.36 Libraries become active participants in the processes of teaching, learning, and assessment of acquired knowledge in educational institutions.
T. O. Kolesnikova, in her research on models of library development, substantiates the expediency of creating information intelligence centers for implementing the latest scientific advances in training and production processes, involving libraries in the educational activities of higher educational establishments, and creating centralized repositories as directions of development for the university libraries of Ukraine.37 One of the advantages of the development and dissemination of digital technologies is the possibility of forming individual curricula for students; the involvement of university libraries in this work is one of the new areas of their activity in the digital economy.38 One of the important areas of operation for departmental and scientific-technical libraries that contributes to increasing the innovative potential of the country is activity in the area of intellectual property. Consulting services in the field of intellectual property, information support for scientists, the creation of openly accessible electronic patent-information databases, and other related services are important components of library work in many countries.39 Another important component of libraries' transformation is the deepening of their role in scientific communication: expanding the boundaries of the use of information technology in order to integrate scientific information into a single network, and creating and managing the information technology infrastructure of science.40 The presence of libraries on social networks has become an important component of their digital transformation. On the one hand, libraries have thus created another channel of information dissemination and expanded the number of service delivery channels, developing online training videos and interactive help services.41 On the other hand, social networks have become a marketing tool for engaging the audience with the library's digital collections and online services. An additional benefit of the presence of libraries on social networks has been the establishment of contacts and exchange of ideas with other professional organizations, which contributes to the further expansion of the network of library partners.42 Another area of activity that libraries take on in the digital economy is the management of research data, as confirmed by the significant number of publications on this topic in professional scientific and research journals in 2017–18.43 Joining this area allows libraries to become part of the scientific digital information and communication infrastructure, the creation of which is one of the main tasks of digital transformation on the way to the digital economy.44 The development of this area contributes to the digitalization of the scientific information sphere; the systematization and structuring of research data has a positive effect on the effectiveness of research and on the scientific novelty of the results of intellectual activity. The Ukrainian Institute of the Future, together with the Digital Agency of Ukraine, considers digital transformation to be the integration of modern digital technologies into all spheres of business.
The introduction of modern technologies (artificial intelligence, blockchain, cobots, digital twins, IIoT platforms, and others) into the production process will lead to the transition to Industry 4.0. According to their forecasts, the key competence in Industry 4.0 should be data processing and analytics.45 Research information is an integral part of this competence, so the development of this area is one of the most promising for the library in the digital economy. The tools used in the management of research data are called Current Research Information Systems (CRIS). In Ukraine, there is no such system connected to the international community.46 The change of the library's role from a repository of information to its manager, the alignment of the functions and tasks of a CRIS with the key requirements of the digital economy, and the advantages of such systems, together with the fact that they are still not used in Ukraine, make this area extremely relevant for research and a promising direction for the work of scientific libraries, so we will consider it more thoroughly.

PROBLEMS IN RESEARCH DATA MANAGEMENT The global experience of research information management reveals several problems in the process of research data management. Some of them are related to the processes of workflow organization, control, and reporting. This is due to the use of several poorly coordinated systems to organize the work of scientists. Data sets from different systems without metadata are very difficult to combine into a single system, and it is almost impossible to automate the process. All this manifests itself in a lack of information to support the decision-making process in the field of science, both at the state level and at the level of individual institutions. This situation can lead to wrong management decisions: overspending on similar, duplicate projects; increased costs in recruiting and finding scientists with relevant research experience; and difficulty in finding the equipment needed for research. CRIS, which began to appear in Europe in the 1990s, are designed to overcome these shortcomings and to promote the effective organization of scientific work. Such systems are now widespread throughout the world, with a total of about five hundred, mainly concentrated in Europe and India. However, there is currently no research information management system in Ukraine that meets international standards and integrates with international scientific databases. This omission slows down Ukraine's integration into the international scientific community. The solution to this problem may be the creation of the national electronic scientific information system URIS (Ukrainian Research Information System).47 The development of this system is an initiative of the Ministry of Education and Science of Ukraine. It is based on combining data from Ukrainian scientific institutions with data from Crossref and other organizations, as well as on ensuring integration with other international CRIS systems through the use of the CERIF standard (a small sketch of this kind of metadata retrieval follows below). Future developers of the system face a number of challenges, some specific to Ukraine and some already studied by foreign scientists. A significant number of studies in this area are designed to overcome the problem of lack of access to research data, as well as to solve problems of data standardization and openness.
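As a concrete illustration of the kind of aggregation a system like URIS would perform, the sketch below pulls the registered metadata for a single DOI from the public Crossref REST API (https://api.crossref.org). This is an added illustration under stated assumptions, not URIS code; a real CRIS would harvest in bulk and map the result into CERIF entities, and the contact address in the User-Agent header is a placeholder.

import json
import urllib.request

def fetch_crossref_metadata(doi):
    """Fetch the registered metadata record for one DOI from Crossref."""
    url = f"https://api.crossref.org/works/{doi}"
    # Crossref asks "polite" clients to identify themselves.
    req = urllib.request.Request(
        url, headers={"User-Agent": "cris-sketch/0.1 (mailto:admin@example.org)"}
    )
    with urllib.request.urlopen(req, timeout=10) as resp:
        return json.load(resp)["message"]

# The DOI of the present article, as an example.
record = fetch_crossref_metadata("10.6017/ital.v39i4.12457")
print(record.get("title"))
print([f'{a.get("given", "")} {a.get("family", "")}' for a in record.get("author", [])])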
In the global experience, the management of collection processes and the development of structured data sets, their distribution on a commercial basis, and ways of gaining advantage by providing them in open access have all been investigated. The mechanisms of financing these processes are studied; in particular, effective ways of attracting patronage funds are analyzed. The possibilities of licensing the received data sets and their distribution, and the approaches and tools that can be most effective for the library, are determined. In particular, Alice Wise describes the experience of settling some legal aspects by clarifying the use of the site in the license agreement, which covers the conditions of access to information and search within it while maintaining a certain level of anonymity.48 The problem of data consistency is related to the lack of uniform standards for information retention covering the format of the data, the metadata itself, and the methods of their generation and use. The use of different standards and formats in repositories and archives leads to problems with data consistency for researchers, which in turn affects the quality of service delivery and makes it impossible to use multiple data sets together.49 Another important problem for the dissemination of research data is the lack of tools and components in the libraries and repositories of higher educational establishments and scientific institutions. It is worthwhile to develop the infrastructure so that at the end of a project, in addition to the research results, scientists publish the research data they used and generated. This approach will be convenient both for authors (in case they need to reuse the research data) and for other scientists (because they will have access to data that can be used in their own research).50 The development of the necessary tools is quite relevant, especially because, according to international surveys, researcher-practitioners are in favor of sharing the data they create with other researchers and of the licensed use of other people's datasets in conducting their own research.51 Another reason for the low prevalence of research data is that datasets have less impact on a researcher's reputation and rating than publications.52 This is partly due to the lack of citation-tracking infrastructure for datasets, in contrast to publications, and to the lack of standards for storing and publishing data. Prestigious scientific journals have been struggling with this problem for several years. For example, the American Economic Review requires authors whose articles contain empirical, modelling, or experimental work to provide information about their research data in sufficient detail for replication.53 Nature and Science require authors to preserve research data and to provide them at the request of the journals' editors.54 One of the reasons for the underdeveloped infrastructure in research data management is the weak policy of disseminating free access to this data, as a result of which even the small share of otherwise usable scientific data remains closed by license agreements and cannot be used by other scientists.55 Open science initiatives related to publications have been operating in the scientific field for a long time, but their extension to research data remains insufficient.
The development of the URIS system will provide management of scientific information and solve the problems highlighted in the scientific works cited above; it will promote the efficient use of funds, simplify the process of finding data for research, and discipline research, and it will therefore have a positive impact on the entire economy of Ukraine.

LIBRARY AND RESEARCH INFORMATION MANAGEMENT Library involvement in the development of scientific information management systems will be an important future direction of their work. Such systems, which could include all the necessary information about scientific research, will contribute to the renewal and development of the library sphere of Ukraine and will promote the transition of the state to a digital economy. The creation of the URIS system is designed to provide access to research data generated by both Ukrainian and foreign scientists. Such a system can ensure the development of cooperation in the field of research, the intensification of knowledge exchange and interaction through the open exchange of scientific data, and the integration of Ukrainian scientific infrastructure into the world scientific and information space. According to surveys conducted by the international organizations euroCRIS and OCLC, of the 172 respondents working in the field of research information management, 83 percent said that libraries play an important role in the development of open science, copyright, and the deposit of research results, and for 90 percent of them this role was a major one. Almost 68 percent of respondents noted the significant contribution of libraries in filling in the metadata needed to correctly identify the work of researchers in various databases; 60 percent noted the important role of libraries in verifying the correctness of metadata filled in by researchers; and almost 49 percent of respondents assess the role of libraries as the main one in the management of research data (figure 4).

Figure 4. The proportion of organizations among 172 users of CRIS systems that assess the role of libraries in the management of research information as basic or supporting.56 (The bar chart itself is not reproduced here; its categories, on a 0–90 percent scale, run from financial support for RIM, project management, and technical operations through impact assessment, strategic planning, internal reporting, system configuration, outreach, initiating RIM adoption, and research data management, to metadata validation, metadata entry, training and support, and open access, copyright, and deposit.)

One of the main directions for libraries that cooperate with CRIS users, or are themselves the organizers of such systems, is the introduction and support of open science.
Historically, libraries have supported open science by providing access to scientific papers, but they can expand their activities further: using open data resources and promoting them among the scientific community; involving scientific users in disseminating their own research results on the principles of open science; supporting users in disseminating their publications; creating conditions for increasing the citation of scientific papers; tracking information about user publications; and creating and supporting public profiles of scientists in scientific and professional resources and scientific social networks. All this will help involve researchers in open science and let them take advantage of this area. The analysis of world experience shows a significant intensification of scientific libraries' support for the strategic goals of the structures that finance their activities and to which they are subordinated. Libraries are moving away from routine customer service and are expanding their activities through the use of their own assets and the introduction of modern tools. Such libraries try to promote the development of their parent structures and to build the modern competencies needed to better meet the needs and goals of these institutions. By introducing and implementing various management tools, libraries synchronize their strategy with the strategy of the parent structure to achieve a synergistic effect. The next important direction of library development is socialization. Wanting to shed the antiquated understanding of the word library, many libraries conduct campaigns aimed at changing the image of the library in the imagination of users, communities, and society. An important component of this systematic step is building relationships with the target audience and creating user communities around the library whose members are not only its users but also supporters, friends, and promoters. Building relationships with members of the scientific community allows libraries to reduce resistance to the changes that come with the introduction of scientific information management systems and to encourage users to adopt new tools in their usual activities, receive the benefits, and become an active part of the process of structuring the scientific space. Recently, work with metadata has undergone some changes. The need for identification and structuring of data in the world scientific space means that metadata are now filled in not only by libraries but also by other organizations that produce, issue, and publish scientific results and scientific literature. Scientists are beginning to make more active use of modern information standards in order to promote their own work. Libraries, in turn, take on the role of consultant or contractor with many years of experience working with metadata and sufficient knowledge in this area. On the other hand, the filling in of metadata by users frees up librarians' time and creates conditions for them to perform other functions, such as information management and the creation of automated data collection and management systems integrated with scientific databases, both Ukrainian and international (a small validation sketch follows below).
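One small, concrete example of the metadata-verification role described above: ORCID iDs, widely used in CRIS systems to identify researchers, carry a built-in ISO 7064 (MOD 11-2) check digit, so obviously mistyped identifiers can be caught before they enter a system. The sketch below is an added illustration, not part of URIS; the sample iD is ORCID's own documented test value.

def orcid_check_digit(base_digits):
    """Compute the ISO 7064 MOD 11-2 check character for an ORCID iD."""
    total = 0
    for digit in base_digits:
        total = (total + int(digit)) * 2
    result = (12 - total % 11) % 11
    return "X" if result == 10 else str(result)

def is_valid_orcid(orcid):
    """Validate an ORCID iD of the form 0000-0000-0000-000X."""
    digits = orcid.replace("-", "")
    if len(digits) != 16:
        return False
    return orcid_check_digit(digits[:15]) == digits[15]

# ORCID's documented sample identifier; the final character is the check digit.
print(is_valid_orcid("0000-0002-1825-0097"))  # True
print(is_valid_orcid("0000-0002-1825-0098"))  # False (corrupted check digit)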
Another area of research information management is the direct management of this process: CRIS systems are developed and implemented with the contribution of scientific libraries in different countries of the world. This allows libraries to combine disparate data obtained from different sources, compile scientific reports, evaluate the effectiveness of the scientific activities of an institution, create profiles of scientific institutions and scientists, develop research networks, and more. Scientists and students can find the results of scientific research and look for partners and sources of research funding. Research managers gain access to up-to-date scientific information, which allows them to assess more accurately the productivity and influence of individual scientists, research groups, and institutions. Business representatives get access to up-to-date information on promising scientific developments, and the public gains an effective way to oversee how research is conducted.

CONCLUSIONS Ukraine is on the path to a digital economy, characterized by the penetration of new technologies into all areas of human activity, simplified access to information, goods, and services, the blurring of the geographical boundaries of companies, an increasing share of automated and robotic production units, and a strengthened role for the creation and use of databases. These changes affect all sectors of the economy, and all organizations, without exception, need to adapt accordingly. Rapid response to these changes helps to increase competitiveness both at the level of individual organizations and at the level of the state economy. Adaptation to the conditions of the digital economy occurs through digital transformation, a complex process that requires a review of all business processes of the organization and radically changes its business model. The digital transformation of the organization takes place through the involvement of management competent in digitization, the updating of management methods, the development of digital skills, the establishment of efficient production and services, the implementation of digital tools, the building of digital communication, the implementation of individual development projects, and adaptation to new user needs. The digital transformation of the economy occurs through the transformation of its individual sectors, creating conditions for the transformation of their representatives. One of the first steps in the process of transition to the digital economy is the establishment of a digital information and communication infrastructure. Libraries are representatives of the information sphere and were the main operators of information in the analogue era. Significant changes in the subject area of their activities require the search for a new role for libraries. Modern projects and directions of library development are integral elements of transformation to the conditions of the digital economy. Completing this complex implementation will allow libraries to update their management methods, their range of services, and the channels through which services are provided; to change their fixed assets through digitization, structuring data and creating metadata; to rethink approaches to communication with users and to cooperation with both domestic and international partners; to change the functions and positioning of the library; and to become effective information operator-managers. In the digital economy, the role of the library is changing from passively collecting and storing information to actively managing it.
One of the areas of development that most comprehensively meets this role is the management of research data, implemented through the creation of CRIS systems. In this model, the main asset of libraries is a digital, structured, automatically and regularly updated database whose main purpose is to support the decision-making process. The library becomes an assistant in conducting research and in finding funding, partners, equipment, and information; it becomes a partner in the strategic management both of scientific organizations and of the state at the level of committees and ministries. The development of this area in Ukraine requires solving a number of technical, administrative, and managerial questions that are relevant not only in Ukraine but also around the world. In particular, libraries need to address the issues of data integration and consistency, accessibility and openness, copyright, and personal data. Solving the problems of the creation and operation of CRIS systems in Ukraine is a promising area for future research.

ENDNOTES

1 Andriy Dobrynin, Konstantin Chernykh, Vasyl Kupriyanovsky, Pavlo Kupriyanovsky, and Serhiy Sinyagov, "Tsifrovaya ekonomika—razlichnyie puti k effektivnomu primeneniyu tehnologiy (BIM, PLM, CAD, IOT, Smart City, BIG DATA i drugie)," International Journal of Open Information Technologies 4, no. 1 (2016): 4–10, https://cyberleninka.ru/article/n/tsifrovaya-ekonomika-razlichnye-puti-k-effektivnomu-primeneniyu-tehnologiy-bim-plm-cad-iot-smart-city-big-data-i-drugie.

2 Jurgen Meffert, Volodymyr Kulagin, and Alexander Suharevskiy, Digital @ Scale: nastolnaya kniga po tsifrovizatsii biznesa (Moscow: Alpina, 2019).

3 Victoria Apalkova, "Kontseptsiia rozvytku tsyfrovoi ekonomiky v Yevrosoiuzi ta perspektyvy Ukrainy," Visnyk Dnipropetrovskoho universytetu, Seriia "Menedzhment innovatsii" 23, no. 4 (2015): 9–18, http://nbuv.gov.ua/UJRN/vdumi_2015_23_4_4.

4 Don Tapscott, The Digital Economy: Promise and Peril in the Age of Networked Intelligence (New York: McGraw-Hill, 1996).

5 Thomas L. Mesenbourg, Measuring the Digital Economy (Washington, DC: Bureau of the Census, 2001).

6 Philippe Barbet and Nathalie Coutinet, "Measuring the Digital Economy: State-of-the-Art Developments and Future Prospects," Communications & Strategies, no. 42 (2001): 153, http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.576.1856&rep=rep1&type=pdf.

7 Alnoor Bhimani, "Digitization and Accounting Change," in Management Accounting in the Digital Economy, ed. Alnoor Bhimani (London: Oxford University Press, 2003), 1–12, https://doi.org/10.1093/0199260389.003.0001.

8 Bo Carlsson, "The Digital Economy: What Is New and What Is Not?," Structural Change and Economic Dynamics 15, no. 3 (September 2004): 245–64, https://doi.org/10.1016/j.strueco.2004.02.001.

9 John Hand, "Building Digital Economy—The Research Councils Programme and the Vision," Lecture Notes of the Institute for Computer Sciences, Social Informatics and Telecommunications Engineering 16 (2009): 3, https://doi.org/10.1007/978-3-642-11284-3_1.

10 Carmen Nadia Ciocoiu, "Integrating Digital Economy and Green Economy: Opportunities for Sustainable Development," Theoretical and Empirical Researches in Urban Management 6, no. 1 (2011): 33–43, https://www.researchgate.net/publication/227346561.
12483 ---- Automated Fake News Detection in the Age of Digital Libraries

ARTICLE

Automated Fake News Detection in the Age of Digital Libraries
Uğur Mertoğlu and Burkay Genç

INFORMATION TECHNOLOGY AND LIBRARIES | DECEMBER 2020
https://doi.org/10.6017/ital.v39i4.12483

Uğur Mertoğlu (umertoglu@hacettepe.edu.tr) is a PhD Candidate, Hacettepe University. Burkay Genç (bgenc@cs.hacettepe.edu.tr) is Assistant Professor, Hacettepe University. © 2020.

ABSTRACT

The transformation of printed media into the digital environment and the extensive use of social media have changed the concept of media literacy and people's habits of news consumption. While online news is faster, easier, and comparatively cheaper to access, this convenience also speeds up the dissemination of fake news. Due to the free production and consumption of large amounts of data, fact-checking systems powered by human effort are not enough to question the credibility of the information provided or to prevent its rapid, virus-like dissemination. Libraries, long known as sources of trusted information, are facing challenges caused by misinformation, as mentioned in studies about fake news and libraries.1 Considering that libraries are undergoing digitization processes all over the world and are providing digital media to their users, it is very likely that unverified digital content will be served by the world's libraries. The solution is to develop automated mechanisms that can check the credibility of digital content served in libraries without manual validation.
For this purpose, we developed an automated fake news detection system based on Turkish digital news content. Our approach can be modified for any other language if labelled training material is available. This model can be integrated into libraries' digital systems to label served news content as potentially fake whenever necessary, preventing uncontrolled falsehood dissemination via libraries.

INTRODUCTION

Collins Dictionary, which chose the term "fake news" as its Word of the Year 2017, describes news as the actual and objective presentation of a current event, information, or situation that is published in newspapers and broadcast on radio, television, or online.2 We are in an era where everything goes online, and news is no exception. Many people today prefer to read their daily news online because it is a cost-effective and convenient way to remain up to date. Although this convenience has clear benefits for society, it can also have harmful side effects. Having access to news from multiple sources, anytime, anywhere, has become an irresistible part of our daily routines. However, some of these sources may provide unverified content which can easily be delivered right to your mobile device. Most importantly, potential fake news content delivered by these sources may mislead society and cause social disturbances, such as triggering violence against ethnic minorities and refugees, causing unnecessary fear related to health issues, or even resulting in crises, devastating riots, and strikes.

Unlike news, fake news has no settled definition and is often defined according to the data used or the limited perspective of a given study. For example, DiFranzo and Gloria-Garcia defined fake news as "false news stories that are packaged and published as if they were genuine."3 On the other hand, Guess et al. see the term as "a new form of political misinformation" within the domain of politics, whereas Mustafaraj is more direct and defines it as "lies presented as news."4 A comprehensive list of 12 definitions can be found in Egelhofer and Lecheler.5 In simplified terms, news which is created to deceive or mislead readers can be called fake news. However, the concept of fake news is quite a broad one that needs to be specified meticulously. Fake news is created for many purposes and emerges in many different types. Having an interwoven structure, most of these types are shown in figure 1. Although it is not easy to cluster these types into separate groups, they can be categorized according to information quality, or according to intention, i.e., whether an item was created to deceive deliberately or not, as Rashkin et al. did.6 We propose the following classification, in which the two dimensions represent the potential impact and the speed of propagation.

Figure 1. The volatile distribution of the fake news types (clustered in four regions: sr, sR, Sr, SR) with respect to two dimensions: speed of propagation and potential impact.

The four regions visualized are clustered according to their dangerousness. First of all, it should be noted that ordering the types of fake news with stable precision is quite a challenging task. The variations within the field depend highly on dynamic factors such as timespan, actors, and the echo-chamber effect.
Hence, this figure should be considered a clustering effort; there are possible intersecting areas of types within the regions. We will now give examples for two regions, "sr" and "SR."

The SR grouping, for example, shows characteristics of high risk levels and fast dissemination. It includes varieties of fake news such as propaganda, manipulation, misinformation, hate news, provocative news, etc. We usually encounter this in the domain of politics. This kind of news may cause critical and irreversible consequences in politics, the economy, etc., in a short period of time. The rise of the term fake news itself can also be attributed to this kind of news.

On the other hand, the relatively less severe group (sr) of fake news, comprising satire, hoaxes, clickbait, etc., has low risk levels and a slow speed of dissemination. A frequently used type in this group, clickbait, is a sensational headline or link that urges the reader to click on a post, link, article, image, or video. These kinds of news have a repetitive style, and it can be said that readers become aware of the falsehood after encountering it a few times; so the risk level is lower and dissemination is slower.

Vosoughi et al. found that "falsehood diffuses significantly farther, faster, deeper, and more broadly than the truth."7 So indeed, just one piece of fake news may affect many more people than thousands of true news items do, because of the dramatic circulation of fake news. In their recent survey of fake news, Zhou and Zafarani highlighted that fake news is a major concern for many different research disciplines, especially information technology.8

Having long been trusted sources of information, libraries will play an important role in the fight against the fake news problem. Kattimani et al. claim that the modern librarian must be equipped with the necessary digital skills and tools to handle both printed collections and newly emerging digital resources.9 Similarly, we foresee that digital libraries, which can be defined as collections of digital content licensed and maintained by libraries, can be a part of the solution as an authority service with a collective effort. Connaway et al. point to the key role of information professionals such as librarians, archivists, journalists, and information architects in helping society use the products and services related to news in a convenient way.10 As libraries all over the world are transitioning into digital content delivery services, they should implement mechanisms, under the guidance of information professionals, to avoid fake and misleading content being disseminated through them.

To lay out proper future directions for the solution strategy, a clear understanding of the interaction between the library and information science (LIS) community and fake news must be established. Sullivan states that the LIS community was affected deeply in the aftermath of the 2016 US presidential election.11 Moreover, he quotes many other scientists emphasizing libraries' and librarians' role in the fight against fake news. For example, Finley et al.
say that libraries are the direct antithesis of fake news; the American Library Association (ALA) called fake news an anathema to the ethics of librarianship in 2017; Rochlin emphasizes the role of librarians in this fight and the need to adopt fake news as a central concern of librarianship; and many other researchers place librarians on the front lines of the fight against fake news.12

Today, the struggle to detect fake news and prevent its spread is so prominent that competitions are being organized (e.g., http://www.fakenewschallenge.org/) and conferences are being held (e.g., Bobcatsss 2020). The struggle against fake news can be classified under three main venues:

• Reader awareness
• Fact-checking organizations and websites
• Automated detection systems

The first item requires awareness of individuals against fake news and a collective conscience within society against spreading fake news. To this end, visual and textual checklists, frameworks, and guidance lists are being published by official organizations, such as IFLA's (International Federation of Library Associations) infographic,13 which contains eight steps to spot fake news. The RADAR framework and the Currency, Relevance, Authority, Accuracy, and Purpose (CRAAP) test are some of the efforts trying to increase reader awareness of fake news.14 Unfortunately, due to the nature of fake news and the clever way it is created, triggering people's hunger to spread sensational information, it is very difficult to achieve full control via this strategy. Some studies have explicitly shown that humans are prone to confusion when it comes to spotting lies or deciding whether a news item is fake or not.15 Furthermore, people often overlook facts that conflict with their current beliefs, especially in politics and controversial social issues.16

The second strategy focuses on third-party, manually driven systems for checking and labelling content as fake or valid. Recently, we have seen many examples of offline and online organizations trying to work according to this strategy, such as a growing body of fact-checking organizations, start-ups (Storyzy, Factmata, etc.), and other projects with similar purposes.17 Unfortunately, these manually powered systems cannot cope with the huge amounts of digital content being steadily produced. Therefore, they focus only on a subset of digital content that they classify as having higher priority. Even for this subset of content, their reaction speed is much slower than the fake information's spread speed. Therefore, automated and verified systems emerge as an inevitable last option.

The third strategy offers automated fact-checking systems which, once trained, can deliver content labelling at unprecedented speeds. Today, many researchers are investigating automated solutions and building models with different methodologies.18 Notwithstanding the latest studies, there is still a lot to do in the realm of automated fake news detection. Automated fact-checking systems will be detailed in the rest of the paper.

Thanks to the internet, the collections of digital content served by digital libraries can be accessed by a great number of users without distance and time limits.
Therefore, we propose a solution to the problem by positioning digital libraries as automated fact-checking services which label digital news content as fake or valid as soon as, or before, it is served through library systems. The main reason we associate this approach with digital libraries is their access to a wide variety of digital content which can be used to train the proposed mathematical models, as well as their role in society as publishers of trusted information. To this end, we develop a mathematical model that is trained using existing news content served by digital libraries and is capable of labelling news content as fake or valid with high accuracy. The proposed solution uses machine learning techniques with an optimized set of extracted features and annotated labels of existing digital news content. Our study mainly contributes (a) a new set of features highly applicable to agglutinative languages, (b) the first hybrid model combining a lexicon/dictionary-based approach with machine learning methods to detect fake news, and (c) a benchmark dataset prepared in Turkish for fake news detection.

LITERATURE REVIEW

Contemporary studies have indicated that social, economic, and political events of recent years, especially after the 2016 US presidential election, are increasingly associated with the concept of fake news.19 Since then, fake news has begun to be used as a tool in many domains. Meanwhile, researchers motivated by finding automated solutions have started to make use of machine learning, deep learning, hybrid models, and other methodologies.

Although computational deception detection studies applying NLP (natural language processing) operations are not new, textual deception in the context of text-based news is a new topic for the field of journalism.20 Accordingly, we believe that there is a hidden body language of news text, with linguistic clues indicating whether the news is fake or not. Thus, lexical, syntactic, semantic, and rhetorical analysis, when used with machine learning and deep learning techniques, offers encouraging directions. Textual deception spreads over a wide spectrum, and studies have utilized many different techniques. There are some prominent studies which took the problem as a binary classification problem utilizing linguistic clues.21 Although it is still too early to say that the linguistic characteristics of fake news are fully understood, research into fake news detection in English-language texts is relatively advanced compared to that in other languages. In contrast, agglutinative languages such as Turkish have been little researched when it comes to fake news detection. Agglutinative languages enable the construction of words by adding various morphemes, which means that words that are not in practical use may exist theoretically. For example, "gerek-siz-leş-tir-ebil-ecek-leri-miz-den-dir" is a theoretically possible word that means "it is one of the things that we will be able to make redundant," but it is not a practical one.

Shu et al. classified the models for the detection of fake news in their study.22 According to this study, automated approaches can focus on four types of attributes to detect fake news: knowledge based, style based, stance based, or propagation based.
Among these, it can be said that the most useful approaches are the ones which focus on the textual news content. The textual content can be studied by an automated process to extract features that can be very helpful in classifying content as fake or valid.

Many scholars have tried to build models for the automatic detection and prediction of fake news using machine learning algorithms, deep learning algorithms, and other techniques. These scholars approach the detection of fake news from many different perspectives and domains. For example, one study used scientific news and conspiracy news.23 In Shu et al.'s study based on the credibility of news, headlines were used to determine whether an article was clickbait or not. In another study, Reis et al. worked on Buzzfeed articles linked to the 2016 US election using machine learning techniques with a supervised learning approach.24 Studies which try to detect satire and sarcasm can be regarded as subcategories of fake news detection.25 Our observation, in line with the general view, is that satire is not always recognizable and can be mistaken for real news.26 For this reason, we included satirical news in our dataset. It should be noted that although satire or sarcasm can be classified by automated detection systems, experts should still evaluate the results of the classification.

While some scholars used specific models focusing on unique characteristics, others, such as Ruchansky et al., proposed hybrid deep models for fake news detection that make use of multiple kinds of features, such as temporal engagement between users and news articles over time, and generated a labelling methodology based on those features.27 In related studies, researchers have applied many kinds of features: automatically extracted features, hand-crafted features, social features, network information, visual features, and others such as psycholinguistic features.28 In this work, we focused on news content features; however, social context features can also be incorporated using different tiers, such as user activity patterns,
Digital libraries work together with fact-checking organizations and ADSs to present clean and valid news to the public. Moreover, search engines use digital libraries systems to label their results as fake or valid. INFORMATION TECHNOLOGY AND LIBRARIES DECEMBER 2020 AUTOMATED FAKE NEWS DETECTION IN THE AGE OF DIGITAL LIBRARIES | MERTOĞLU AND GENÇ 7 Fact-checking organizations should also benefit from the output of ADSs, as instead of manually checking heaps of news content, they could now focus on news labeled as potentially fake by an ADS. Through GLIS, ADSs make the life of fact-checking organizations and digital libraries much easier, all the while increasing the quality of news served to the public. Considering this is a high-level overview of a structure given in figure 2, there may be many other components, mechanisms, or layers, but the key elements of this structure are automated detection systems and the digital libraries. A critical approach to this framework can be why we need such an authority mechanism. The answer will be quite simple, technological progress is not the only solution. On the contrary, tech giants have already been subject to regulatory scrutiny for how they handle personal information.29 Also, their policy related to political ads has been questioned. Furthermore, they are often blamed for failing to fight fake news. Indeed, there is an urgent need for a global action more than ever. Digital libraries are much more than a technological advancement. Hence, they should be considered as institutions or services which can be a great authority service to provide news to society since the printed media disappears day by day. The threats caused by fake news are real and dangerous, but only recently have researchers from different disciplines been trying to find possible solutions such as educational, technological, regulatory, or political. Digital librarianship can be the intersection of all these solutions for promoting information/media literacy. Hence, digital librarianship will make use of many automated detection systems (ADS) to serve qualified news. In the following section, we discuss ADS in detail. Model An overview of our model of automated detection system solution which is very critical for the framework is shown in figure 3. Our fake news detection model consists of two phases. First is the Language Model/Lexicon Generation and the second is Machine Learning Integration. In this work, we used machine learning algorithms via supervised learning techniques which learn from labeled news data (training) and helps us to predict outcomes for unforeseen news data (test). Dataset We collected our data from three sources: • The primary source is the GDELT (Global Database of Events, Language and Tone) Project (https://www.gdeltproject.org/), a massive global news media archive offering free access to news text metadata for researchers worldwide. It can almost be considered a digital library of news in its own right. However, GDELT does not provide the actual news text and only serves processed metadata along with the URL of the news item. GDELT normally does not check for the validity of any news items. However, we have only used news from approved news agencies and completely ignored news from local and lesser-known sources to maximize the validity of the news we have automatically obtained through GDELT. Moreover, we have post-processed the obtained texts by cross-validating with teyit.org data to clean any potential fake news obtained through GDELT links. 
Dataset

We collected our data from three sources:

• The primary source is the GDELT (Global Database of Events, Language and Tone) Project (https://www.gdeltproject.org/), a massive global news media archive offering free access to news text metadata for researchers worldwide. It can almost be considered a digital library of news in its own right. However, GDELT does not provide the actual news text and only serves processed metadata along with the URL of the news item. GDELT normally does not check the validity of any news items. However, we have only used news from approved news agencies and completely ignored news from local and lesser-known sources to maximize the validity of the news we automatically obtained through GDELT. Moreover, we post-processed the obtained texts by cross-validating with teyit.org data to clean any potential fake news obtained through GDELT links.

• The second source is teyit.org, a fact-checking organization based in Turkey, compliant with the principles of the IFCN (International Fact-Checking Network), which aims to prevent the spread of false information through online channels. Manually analyzing each news item, they tag it as fake, true, or uncertain. We used their results to automatically download and label each news text.

• Lastly, our team collected manually curated and verified fake and valid news obtained from various online sources and named this set MVN (Manually Verified News). It includes fake and valid news that we manually accumulated over time during our studies and that did not overlap with the news obtained from the GDELT and teyit.org sources.

We named our dataset TRFN. In Phase 2, the data is very similar to the data we used in Phase 1. However, to see the effectiveness of the model, we excluded news older than 2017 and added new items from 2019. The news items in our dataset span the time frame 2017–2019 and are uniformly distributed. Table 1 outlines the dataset statistics, namely where the news text comes from, its class (fake or valid), the number of distinct texts, and the corresponding data collection method. It can be seen from the table that most of our valid news comes from the GDELT source, whereas teyit.org, a fact-checking organization, contributes only fake news.

Table 1. TRFN Dataset Summary after cleaning and duplicate removal.

Dataset   | Class    | Size of Processed Data | Collection Method
GDELT     | NON-FAKE | 82708                  | Automated
Teyit.org | FAKE     | 1026                   | Automated
MVN       | NON-FAKE | 1049                   | Manual
MVN       | FAKE     | 400                    | Manual

All news items were processed through Zemberek (http://code.google.com/p/zemberek), the Turkish NLP engine, to extract different morphological properties of the words within the texts (a small illustration follows at the end of this subsection). After this processing phase, all obtained features were converted into tabular format and made available for future studies. This dataset is now available for scholarly studies upon request.

In a study of this nature, the verifiability of the data used is important. As we have already mentioned, most of the data we used comes from verified sources, such as mainstream news agencies accessed through GDELT and teyit.org archives verified by teyit.org staff. All data used in training the mathematical models, which are explained in the rest of the paper, are either directly or indirectly verified.

Another important issue is the generalizability of the dataset, which determines whether the results of the study are applicable only to specific domains or to all available domains. Although focusing on a specific news domain would clearly improve our accuracies, we preferred to work in the general domain and included news from all specific domains. The distribution of domains in our dataset is visualized in figure 4. This distribution closely matches the distribution one would experience reading daily news in Turkey. Hence, we have no domain-specific bias in our training dataset.
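For readers who want to experiment with the same kind of morphological processing, the sketch below analyzes one agglutinated word. It assumes the community Python port of Zemberek (zemberek-python) and its TurkishMorphology.create_with_defaults() and analyze() calls; the study itself used the Zemberek engine linked above, so treat this exact API as an assumption.

```python
# Sketch of per-word morphological analysis, assuming the zemberek-python port.
# The analyses expose root and suffix morphemes, the raw material for the
# root-based and raw-based lexicon scores used later in this paper.
from zemberek import TurkishMorphology  # assumed import path of the Python port

morphology = TurkishMorphology.create_with_defaults()

word = "gereksizleştirebileceklerimizdendir"
print(morphology.analyze(word))  # prints the candidate morphological analyses
```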
Figure 4. The distribution of domains in the dataset. (SciTechEnvWetNatLife = Science, Technology, Environment, Weather, Nature, Life. EduCultureArtTourism = Education, Culture, Art, Tourism.)

During exploratory data analysis, we also obtained evidence of syntactic similarity with other Turkish NLP studies. For example, the most common words identified in a study by the Zemberek developers (http://zembereknlp.blogspot.com/2006/11/kelime-istatistikleri.html), which experimented with over five million words, are compatible with the most common words in our corpus. This evidence speaks to the representativeness of our dataset.

The last issue worth discussing is the imbalanced nature of the dataset. An imbalanced dataset occurs in a binary classification study when the frequency of one of the classes dominates the frequency of the other class. In our dataset, the amount of fake news is far surpassed by the amount of valid news. This generally results in difficulties in applying conventional machine learning methods to the dataset. However, it is a frequently observed phenomenon in these kinds of problems, due to the real-world disparity of the classes. To avoid potential problems due to the imbalanced nature of the dataset, we used SMOTE (Synthetic Minority Over-sampling Technique), which is an over-sampling method.30 It creates synthetic samples of the minority class that are relatively close in the feature space to the existing observations of the minority class, as the short sketch following this paragraph illustrates.
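Here is a minimal sketch of this over-sampling step with the imbalanced-learn implementation of SMOTE, on synthetic stand-in data rather than the TRFN feature matrix itself:

```python
# Balancing fake / non-fake classes with SMOTE (imbalanced-learn) before
# training; make_classification stands in for the real TRFN features.
from collections import Counter
from sklearn.datasets import make_classification
from imblearn.over_sampling import SMOTE

# Toy data skewed 95% / 5%, mimicking the valid-vs-fake imbalance (1 = fake).
X, y = make_classification(n_samples=2000, weights=[0.95], random_state=42)

X_res, y_res = SMOTE(random_state=42).fit_resample(X, y)

print("before:", Counter(y))      # heavily skewed class counts
print("after: ", Counter(y_res))  # classes balanced with synthetic samples
```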
Features

In this study, we discarded some features because of their relatively low impact on overall performance during the exploratory data analysis and subsequently in the training phase. The most effective features we decided on are shown in table 2.

Table 2. Main Features

Feature         | Group                   | Definition
nRootScore      | Language Model Features | The news score calculated according to the Root Model
nRawScore       | Language Model Features | The news score calculated according to the Raw Model
SpellErrorScore | Extracted Features      | Spell errors per sentence
ComplexityScore | Extracted Features      | The score of the complexity/readability of the news
Source          | Labels                  | The URL or identifier of the news
MainCategory    | Labels                  | The category of the news
NewsSite        | Labels                  | The unique address of the news

The language model features nRootScore and nRawScore are borrowed from our earlier study on fake news detection.31 In that study, we focused on constructing a fake news dictionary/lexicon based on different morphological segments of the words used in news texts. These two scores were found to be the most successful in determining the fakeness/validity of a news text, one considering the raw form of the words, the other considering the root form.

The extracted features are ComplexityScore and SpellErrorScore. ComplexityScore basically represents the readability of the text. Studies on determining a good readability metric exist for the Turkish language.32 We used a modified version of the Gunning-Fog metric, which is based on word length and sentence length.33 Since Turkish is an agglutinative language, we used word length instead of the syllable count. We also made some modifications to normalize the scores. The average number of syllables per word in Turkish is 2.6, so we defined a word as a long word if it has more than 9 letters.34 For a given news text T, the complexity score (CS) can be computed by equation 1:

(1)  T_{CS} = \left( \frac{Word_{count}}{Sentences_{count}} + \frac{LongWord_{count} \times 100}{Word_{count}} \right) / 10

The second extracted feature is SpellErrorScore. We foresee that there may be many more errors in fake news than in valid news. We calculated the spell error counts making use of the Turkish spell checker class of Zemberek. Since the text length of news items varies, we calculate the ratio per sentence. For a given news text T, the spell error score (SE) is calculated as shown in equation 2:

(2)  T_{SE} = \frac{SpellError_{count}}{Sentences_{count}}

Finally, we included the metadata categories Source, MainCategory, and NewsSite as additional identifiers for the learning process.
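Equations 1 and 2 translate directly into Python; the whitespace word splitter and punctuation-based sentence splitter below are simplifying assumptions, not the study's actual tokenization:

```python
import re

LONG_WORD_LETTERS = 9  # a word with more than 9 letters counts as "long"

def complexity_score(text: str) -> float:
    """Equation 1: the modified Gunning-Fog complexity score T_CS."""
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    words = text.split()
    if not sentences or not words:
        return 0.0
    long_words = [w for w in words if len(w) > LONG_WORD_LETTERS]
    return (len(words) / len(sentences)
            + len(long_words) * 100 / len(words)) / 10

def spell_error_score(spell_errors: int, sentence_count: int) -> float:
    """Equation 2: spell errors per sentence (counts come from a spell checker)."""
    return spell_errors / sentence_count if sentence_count else 0.0

# Example with a two-sentence toy text.
print(complexity_score("Bu kisa bir cumle. Gereksizlestirebileceklerimizdendir."))
```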
Then, we combined features extracted from text representation techniques with the features shown in table 2 and trained the model with different classifiers. For text representation, we followed two directions in the experiments. First, we converted text into structured features with the Bag of Words (BOW) approach, in which text data is represented as the multiset of its words. Second, we experimented with N-grams, i.e., sequences of n words; in other words, splitting the text into chunks of n words.

In the BOW model, documents in TRFN are represented as a collection of words, ignoring grammar and even word order but preserving multiplicity. In a classic BOW approach, each document can be represented as a fixed-length vector with length equal to the vocabulary size, meaning that each dimension of this vector corresponds to the occurrence of a word in a news item. We customized the generic approach by reducing variable-length documents to fixed-length vectors so that they could be used with many machine learning models.

Figure 5. An overview of the BOW (Bag of Words) approach.

Because we ignore word order, we reduce documents to fixed-length histograms of counts, as seen in figure 5. Assuming N is the number of news documents and W is the number of possible words in the corpus, it should be noted that the N × W count matrix is generally large but sparse: we have many news documents, but most words do not occur in any given document, and this rarity of terms is a drawback of the approach. Therefore, we modified the model to compensate for the rarity problem by weighting the terms using the TF-IDF measure, which evaluates how important a word is to a document in a collection. The other technique we used, the N-gram model, is the generic term for a string of words in computational linguistics, and it is used extensively in text mining and NLP tasks. The prefix that replaces the n part indicates the number of consecutive words in the string: a unigram refers to one word, a bigram to two words, and an n-gram to n words.

EXPERIMENTAL RESULTS AND DISCUSSION

In this section, the experimental process and the results are presented. All experiments were performed using the scikit-learn library. To evaluate the performance of the model and the proposed features, we employed the precision, recall, F1 score (the harmonic mean of precision and recall), and accuracy metrics. We ran many experiments using different combinations of features.

Several classification models were trained: K-Nearest Neighbor, Decision Trees, Gaussian Naive Bayes, Random Forest, Support Vector Machine, ExtraTrees Classifier, and Logistic Regression. To be effective, a classifier should be able to correctly classify previously unseen data. To this end, we tuned the parameter values for all the classification models used. Then, the models were trained and evaluated on the TRFN dataset using 10-fold cross-validation. In table 3, we present the ultimate best scores of the proposed model. The results are highly motivating and exemplify how useful automated detection systems can be as a key component of the integrated solution framework in figure 2. We compared the algorithms on the three feature sets that gave the most consistent results among the feature set combinations: Set1 stands for bigram+FOpt (optimized features), Set2 stands for BOWModified+FOpt, and Set3 stands for unigram+bigram+FOpt.

The results show a relative consistency in performance across the models. In almost all models, the combination of unigram+bigram and the optimized feature set (FOpt) gives better results than the other combinations. The ExtraTrees Classifier model is chosen as the best due to its higher performance. This model, also known as the Extremely Randomized Trees classifier, is a type of ensemble learning technique that aggregates the results of multiple decision trees collected in a "forest" to output its classification result. It is very similar to the Random Forest classifier and differs only in the manner of construction of the decision trees, so we also see close results between these two classifiers.

Table 3. Evaluation results of all combinations of feature sets and classification models.

Model                  | Feature Set | Precision % (0, 1) | Recall % (0, 1) | Accuracy | F1 Score
Gaussian Naive Bayes   | Set1        | 93.32, 93.96       | 93.92, 93.36    | 93.64    | 93.62
Gaussian Naive Bayes   | Set2        | 93.37, 94.02       | 93.98, 93.42    | 93.70    | 93.68
Gaussian Naive Bayes   | Set3        | 93.95, 94.21       | 94.19, 93.97    | 94.08    | 94.07
K-Nearest Neighbour    | Set1        | 93.70, 93.50       | 93.52, 93.69    | 93.60    | 93.61
K-Nearest Neighbour    | Set2        | 93.66, 94.05       | 94.03, 93.68    | 93.85    | 93.84
K-Nearest Neighbour    | Set3        | 94.42, 94.21       | 94.22, 94.41    | 94.31    | 94.32
ExtraTrees Classifier  | Set1        | 94.15, 94.92       | 94.88, 94.19    | 94.53    | 94.51
ExtraTrees Classifier  | Set2        | 94.09, 94.94       | 94.90, 94.14    | 94.51    | 94.49
ExtraTrees Classifier  | Set3        | 97.90, 95.72       | 95.81, 97.86    | 96.81    | 96.85
Support Vector Machine | Set1        | 89.61, 88.92       | 88.99, 89.54    | 89.26    | 89.30
Support Vector Machine | Set2        | 89.70, 88.96       | 89.04, 89.62    | 89.33    | 89.37
Support Vector Machine | Set3        | 90.85, 91.26       | 91.22, 90.89    | 91.05    | 91.03
Logistic Regression    | Set1        | 91.56, 92.28       | 92.23, 91.62    | 91.92    | 91.89
Logistic Regression    | Set2        | 91.50, 92.28       | 92.22, 91.56    | 91.89    | 91.86
Logistic Regression    | Set3        | 92.25, 92.90       | 92.86, 92.30    | 92.57    | 92.55
Random Forest          | Set1        | 93.71, 94.44       | 94.40, 93.75    | 94.07    | 94.05
Random Forest          | Set2        | 93.87, 95.00       | 94.94, 93.94    | 94.44    | 94.41
Random Forest          | Set3        | 94.77, 95.14       | 95.12, 94.79    | 94.96    | 94.95
Decision Trees         | Set1        | 93.95, 94.59       | 94.56, 93.99    | 94.27    | 94.25
Decision Trees         | Set2        | 94.05, 95.08       | 95.03, 94.11    | 94.57    | 94.54
Decision Trees         | Set3        | 94.94, 95.24       | 95.23, 94.95    | 95.09    | 95.08
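To make the experimental setup reproducible in spirit, here is a compact sketch that combines unigram+bigram TF-IDF text representation with two of the classifiers above and scores them by 10-fold cross-validation; the toy texts and labels are placeholders, not TRFN, and no hyperparameter tuning is shown.

```python
# Sketch of the evaluation setup: TF-IDF over unigrams+bigrams feeding a
# classifier, scored with 10-fold cross-validation (toy data, not TRFN).
from sklearn.ensemble import ExtraTreesClassifier, RandomForestClassifier
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.model_selection import cross_validate
from sklearn.pipeline import make_pipeline

texts = (["bu haber dogru gorunuyor %d" % i for i in range(50)]
         + ["bu haber sahte gorunuyor %d" % i for i in range(50)])
labels = [0] * 50 + [1] * 50  # 0 = non-fake, 1 = fake (placeholder labels)

for name, clf in [("ExtraTrees", ExtraTreesClassifier(random_state=42)),
                  ("RandomForest", RandomForestClassifier(random_state=42))]:
    pipeline = make_pipeline(
        TfidfVectorizer(ngram_range=(1, 2)),  # unigram+bigram, TF-IDF weighted
        clf,
    )
    scores = cross_validate(pipeline, texts, labels, cv=10,
                            scoring=("accuracy", "f1"))
    print(name,
          "accuracy=%.4f" % scores["test_accuracy"].mean(),
          "f1=%.4f" % scores["test_f1"].mean())
```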
Every ADS in the GLIS_1.0 framework may use its own method to detect fake news, and an open-source ADS may improve with feedback. Hybrid models and other techniques, such as neural networks with a deep learning methodology, can also be used, depending on the data, the language of the news, and the news features related to both social context and news content.

CONCLUSION AND FUTURE WORK

In this study we presented a novel framework which offers a practical architecture for an integrated system for identifying fake news. We have tried to illustrate how digital libraries can be a service authority that promotes media literacy and fights fake news. Because librarians are trained to critically analyze information sources, their contributions to our proposed model are critical. Accordingly, we see this work as an encouraging effort toward future collaborative studies between the LIS and CS (computer science) communities. We think that there is an immediate need for LIS professionals to participate in and contribute to automated solutions that can help detect inaccurate and unverified information. In the same manner, we believe the collaboration of LIS professionals, computer scientists, fact-checking organizations, and pioneering technology platforms is the key to providing qualified news within a real-time framework that promotes information literacy. Moreover, we put the reader, in the position of a feed reader consuming news, at the core of the framework.

In terms of automated detection systems, we proposed a fake news detection model integrating a dictionary-based approach and machine learning techniques, offering optimized feature sets applicable to agglutinative languages. We comparatively analyzed the findings with several classification models. We demonstrated that machine learning algorithms, when used together with dictionary-based findings, yield high scores for both precision and recall. Consequently, we believe that, once operational in the field, the proposed workflow can be extended in the future to support other news elements such as photographs and videos. With the help of Social Network Analysis (SNA), it may be possible to stop or slow down the spread of fake news as it emerges. This work also highlighted several tasks as future research directions:

• The studies can be deepened to mathematically categorize the fake news types, and the dissemination characteristics of each type can be analyzed.
• The workflow has the potential to provide an automated verification platform for all news content existing in digital libraries to promote media literacy.

ENDNOTES

1 M. Connor Sullivan, "Why Librarians Can't Fight Fake News," Journal of Librarianship and Information Science 51, no. 4 (December 2019): 1146–56, https://doi.org/10.1177/0961000618764258.

2 "Definition of 'News'," Collins Dictionary, available at: https://www.collinsdictionary.com/dictionary/english/news.

3 Dominic DiFranzo and Kristine Gloria-Garcia, "Filter Bubbles and Fake News," XRDS: Crossroads, The ACM Magazine for Students 23, no. 3 (April 2017): 32–35, https://doi.org/10.1145/3055153.

4 Andrew Guess, Brendan Nyhan, and Jason Reifler, "Selective Exposure to Misinformation: Evidence from the Consumption of Fake News during the 2016 US Presidential Campaign," European Research Council 9, no. 3 (2018): 4; Eni Mustafaraj and P.
Takis Metaxas, "The Fake News Spreading Plague: Was It Preventable?," Proceedings of the 2017 ACM on Web Science Conference (June 2017): 235–39, https://doi.org/10.1145/3091478.3091523.

5 Jana Laura Egelhofer and Sophie Lecheler, "Fake News as a Two-Dimensional Phenomenon: A Framework and Research Agenda," Annals of the International Communication Association 43, no. 2 (2019): 97–116, https://doi.org/10.1080/23808985.2019.1602782.

6 Hannah Rashkin et al., "Truth of Varying Shades: Analyzing Language in Fake News and Political Fact-Checking," Proceedings of the 2017 Conference on Empirical Methods in Natural Language Processing (2017): 2931–37.

7 Soroush Vosoughi, Deb Roy, and Sinan Aral, "The Spread of True and False News Online," Science 359, no. 6380 (2018): 1146–51, https://doi.org/10.1126/science.aap9559.

8 Xinyi Zhou and Reza Zafarani, "A Survey of Fake News: Fundamental Theories, Detection Methods, and Opportunities," ACM Computing Surveys (CSUR) 53, no. 5 (2020): 1–40, https://doi.org/10.1145/3395046.

9 S. F. Kattimani, Praveenkumar Kumbargoudar, and D. S. Gobbur, "Training of the Library Professionals in Digital Era: Key Issues" (2006), https://ir.inflibnet.ac.in:8443/ir/handle/1944/1234.

10 Lynn Silipigni Connaway et al., "Digital Literacy in the Era of Fake News: Key Roles for Information Professionals," Proceedings of the Association for Information Science and Technology 54, no. 1 (2017): 554–55, https://doi.org/10.1002/pra2.2017.14505401070.

11 Matthew C. Sullivan, "Libraries and Fake News: What's the Problem? What's the Plan?," Communications in Information Literacy 13, no. 1 (2019): 91–113, https://doi.org/10.15760/comminfolit.2019.13.1.7.

12 Wayne Finley, Beth McGowan, and Joanna Kluever, "Fake News: An Opportunity for Real Librarianship," ILA Reporter 35, no. 3 (2017): 8–12; American Library Association, "Resolution on Access to Accurate Information," 2018; Nick Rochlin, "Fake News: Belief in Post-Truth," Library Hi Tech 35, no. 3 (2017): 386–92, https://doi.org/10.1108/LHT-03-2017-0062; Linda Jacobson, "The Smell Test: In the Era of Fake News, Librarians Are Our Best Hope," School Library Journal 63, no. 1 (2017): 24–29; Angeleen Neely-Sardon and Mia Tignor, "Focus on the Facts: A News and Information Literacy Instructional Program," The Reference Librarian 59, no. 3 (2018): 108–21, https://doi.org/10.1080/02763877.2018.1468849; Claire Wardle and Hossein Derakhshan, "Information Disorder: Toward an Interdisciplinary Framework for Research and Policy Making," Council of Europe report 27 (2017).

13 IFLA, "How to Spot Fake News," 2017.

14 Jane Mandalios, "RADAR: An Approach for Helping Students Evaluate Internet Sources," Journal of Information Science 39, no. 4 (2013): 470–78, https://doi.org/10.1177/0165551513478889; Sarah Blakeslee, "The CRAAP Test," LOEX Quarterly 3, no. 3 (2004): 4.

15 Victoria L. Rubin and Niall Conroy, "Discerning Truth from Deception: Human Judgments and Automation Efforts," First Monday 17, no.
5 (2012), https://doi.org/10.5210/fm.v17i3.3933; Verónica Pérez-Rosas et al., “Automatic Detection of Fake News,” arXiv preprint arXiv:1708.07104 (2017). 16 Justin P. Friesen, Troy H. Campbell, and Aaron C. Kay, “The Psychological Advantage of Unfalsifiability: The Appeal of Untestable Religious and Political Ideologies,” Journal of Personality and Social Psychology 108, no. 3 (2015): 515–29, https://doi.org/10.1037/pspp0000018. 17 Tanja Pavleska et al., “Performance Analysis of Fact-Checking Organizations and Initiatives in Europe: A Critical Overview of Online Platforms Fighting Fake News,” Social Media and Convergence 29 (2018). 18 Yasmine Lahlou, Sanaa El Fkihi, and Rdouan Faizi, “Automatic Detection of Fake News on Online Platforms: A Survey,” (paper, 2019 1st International Conference on Smart Systems and Data Science (ICSSD), Rabat, Morocco, 2019), https://doi.org/10.1109/ICSSD47982.2019.9002823; Christian Janze, and Marten Risius, “Automatic Detection of Fake News on Social Media Platforms,” (paper, Pasific Asia Conference on Information Systems (PACIS), 2017); Torstein Granskogen, “Automatic Detection of Fake News in Social Media Using Contextual Information” (master’s thesis, Norwegian University of Science and Technology (NTNU), 2018). 19 Jacob L. Nelson and Harsh Taneja, “The Small, Disloyal Fake News Audience: The Role of Audience Availability in Fake News Consumption,” New Media & Society 20, no. 10 (2018): 3720–37, https://doi.org/10.1177/1461444818758715; Philip N. Howard et al., “Social Media, News and Political Information During the US Election: Was Polarizing Content Concentrated in Swing States?,” arXiv preprint arXiv:1802.03573 (2018); Alexandre Bovet and Hernán A. Makse, “Influence of Fake News in Twitter During the 2016 US Presidential Election,” Nature Communications 10, no. 7 (2019): 1–14, https://doi.org/10.1038/s41467-018-07761-2. 20 Lina Zhou et al., “Automating Linguistics-Based Cues for Detecting Deception in Text-Based Asynchronous Computer-Mediated Communications,” Group Decision and Negotiation 13, no. 1 (2004): 81–106, https://doi.org/10.1023/B:GRUP.0000011944.62889.6f; Myle Ott et al., “Finding Deceptive Opinion Spam by Any Stretch of the Imagination,” arXiv preprint arXiv:1107.4557 (2011); Rada Mihalcea and Carlo Strapparava, “The Lie Detector: Explorations in the Automatic Recognition of Deceptive Language,” (paper, Proceedings of the ACL-IJCNLP 2009 Conference Short Papers, (2009): Association for Computational Linguistics, 309–12); Julia B. Hirschberg et al., “Distinguishing Deceptive from Non-Deceptive Speech,” (2005), https://doi.org/10.7916/D8697C06. 21 Victoria L. Rubin, Yimin Chen, and Nadia K. Conroy, “Deception Detection for News: Three Types of Fakes,” Proceedings of the Association for Information Science and Technology 52, no. 1 (2015): 1–4, https://doi.org/10.1002/pra2.2015.145052010083; David M. Markowitz, and Jeffrey T. Hancock, “Linguistic Traces of a Scientific Fraud: The Case of Diederik Stapel,” PloS https://doi.org/10.1177/0165551513478889 https://doi.org/10.5210/fm.v17i3.3933 https://psycnet.apa.org/doi/10.1037/pspp0000018 https://doi.org/10.1109/ICSSD47982.2019.9002823 https://doi.org/10.1177%2F1461444818758715 https://doi.org/10.1038/s41467-018-07761-2 https://doi.org/10.1023/B:GRUP.0000011944.62889.6f https://doi.org/10.7916/D8697C06 https://doi.org/10.1002/pra2.2015.145052010083 INFORMATION TECHNOLOGY AND LIBRARIES DECEMBER 2020 AUTOMATED FAKE NEWS DETECTION IN THE AGE OF DIGITAL LIBRARIES | MERTOĞLU AND GENÇ 18 one 9, no. 
8 (2014): e105937, https://doi.org/10.1371/journal.pone.0105937; Jing Ma et al., “Detecting Rumors from Microblogs with Recurrent Neural Networks,” (paper, Proceedings of the 25th International Joint Conference on Artificial Intelligence (IJCAI 2016), (2016): 3818–24), https://ink.library.smu.edu.sg/sis_research/4630. 22 Kai Shu et al., “Fake News Detection on Social Media: A Data Mining Perspective,” ACM SIGKDD Explorations Newsletter 19, no. 1 (2017): 22–36, https://doi.org/10.1145/3137597.3137600. 23 Eugenio Tacchini et al., “Some Like It Hoax: Automated Fake News Detection in Social Networks,” arXiv preprint arXiv:1704.07506 (2017). 24 Julio C.S. Reis et al., “Supervised Learning for Fake News Detection,” IEEE Intelligent Systems 34, no. 2 (2019): 76–81, https://doi.org10.1109/MIS.2019.2899143. 25 Victoria L. Rubin et al., “Fake News or Truth? Using Satirical Cues to Detect Potentially Misleading News,” (paper, Proceedings of the Second Workshop on Computational Approaches to Deception Detection, (2016): 7–17); Francesco Barbieri, Francesco Ronzano, and Horacio Saggion, “Is This Tweet Satirical? A Computational Approach for Satire Detection in Spanish,” Procesamiento del Lenguaje Natural, no. 55 (2015): 135-42; Soujanya Poria et al., “A Deeper Look into Sarcastic Tweets Using Deep Convolutional Neural Networks,” arXiv preprint arXiv:1610.08815 (2016). 26 Lei Guo and Chris Vargo, “’Fake News’ and Emerging Online Media Ecosystem: An Integrated Intermedia Agenda-Setting Analysis of the 2016 Us Presidential Election,” Communication Research 47, no. 2 (2020): 178–200, https://doi.org/10.1177/0093650218777177. 27 Natali Ruchansky, Sungyong Seo, and Yan Liu, “CSI: A Hybrid Deep Model for Fake News Detection,” Proceedings of the 2017 ACM on Conference on Information and Knowledge Management, (November 2017): 797–806, https://doi.org/10.1145/3132847.3132877. 28 Yaqing Wang et al., “EANN: Event Adversarial Neural Networks for Multi-Modal Fake News Detection,” Proceedings of the 24th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining, (2018): 849–57, https://doi.org/10.1145/3219819.3219903; James W. Pennebaker, Martha E. Francis, and Roger J. Booth, “Linguistic Inquiry and Word Count: LIWC 2001”, Mahway: Lawrence Erlbaum Associates 71, no. 2001 (2001). 29 “Facebook, Twitter May Face More Scrutiny in 2019 to Check Fake News, Hate Speech,” accessed May 17, 2020, available: https://www.huffingtonpost.in/entry/facebook-twitter-may-face- more-scrutiny-in-2019-to-check-fake-news-hate-speech_in_5c29c589e4b05c88b701d72e. 30 Nitesh V. Chawla et al., “Smote: Synthetic Minority Over-Sampling Technique,” Journal of Artificial Intelligence Research 16, (2002): 321–57, https://doi.org/10.1613/jair.953. 31 Uğur Mertoğlu and Burkay Genç, “Lexicon Generation for Detecting Fake News,” arXiv preprint arXiv:2010.11089 (2020). 
PUBLIC LIBRARIES LEADING THE WAY

A Collaborative Approach to Newspaper Preservation

Ana Krahmer and Laura Douglas

INFORMATION TECHNOLOGY AND LIBRARIES | SEPTEMBER 2020
https://doi.org/10.6017/ital.v39i3.12596

Ana Krahmer (ana.krahmer@unt.edu) oversees the Digital Newspaper Unit at UNT. Through this work, she manages the Texas Digital Newspaper Program collection on The Portal to Texas History, a gateway to historic research materials freely available worldwide. Laura Douglas (laura.douglas@cityofdenton.com) is the librarian in charge of Special Collections at the Denton Public Library, which houses the genealogy, Texana, and local Denton history collections as well as the Denton municipal archives. In her work, she regularly assists patrons with newspaper research questions, specifically those related to Denton newspapers. © 2020.

INTRODUCTION

When we first proposed this column in January 2020, we had no idea how much the world would change between then and the July deadline. While we have collaborated for many years on a variety of projects, the value of our collaboration has never proven itself more than in this COVID-19 reality: collaboration leverages the strengths and resources of partners to form something stronger than each partner alone. In this world of COVID-19, the collaboration between the Denton Public Library (DPL) and the University of North Texas Libraries (UNT) has allowed us to build open, online access to the first 16 years of the Denton Record-Chronicle (DRC). This newspaper is the city's daily newspaper of record, and the collaboration between DPL and UNT resulted in free, worldwide research access via The Portal to Texas History. The project was funded by a $24,820.00 grant through the IMLS Library Services and Technology Act (LSTA), awarded for September 2019 to August 2020 by the Texas State Library and Archives Commission (TSLAC) as part of their TexTreasures program, to digitize 24,000 newspaper pages.
This project has also resulted in a follow-up collaboration to build open access to further years of this daily newspaper title, through a 2021 TexTreasures award to digitize an additional 24,000 newspaper pages. The real question, though, is what recipe made this a successful collaboration.

BACKGROUND

The DRC has been the community newspaper in Denton for over 100 years. Due to the sheer amount of material, digitizing a daily newspaper with such an extensive publication run is a long-term project that requires a great deal of planning, time, and funding. Since the DPL's inception in 1937, the library has endeavored to collect items related to Denton and Texas history. With community support, the library has developed a well-rounded collection of local history, Texana, and genealogical materials, all of which are housed in the Special Collections Research Area at the Emily Fowler Central Library. These materials support research, projects, and exhibits. One major research resource is the archival collection of local newspapers, mainly the DRC, maintained on 752 rolls of microfilm containing issues from 1908 to 2018. Before this project, access to these newspapers was available only in the Special Collections Research Area, through microfilm readers or paid subscription services. In addition, although steps had been taken to preserve the film, many of the rolls show wear from years of use, while others have developed vinegar syndrome and soon will no longer be usable. In 2018, UNT obtained publisher permission to make the DRC run freely accessible on The Portal to Texas History.

Laura had been exploring different avenues to digitize this microfilm and make it freely available to the public when Ana contacted her with information about TSLAC's annual grant programs, which are supported by Library Services and Technology Act funds. LSTA funding is provided annually to all fifty states through the Institute of Museum and Library Services, and each state library determines how this funding is expended. In Texas, LSTA funding supports a number of grant programs, including TexTreasures, a competitive grant program open to any Texas library. As described by TSLAC, the "TexTreasures grant is designed to help libraries make their special collections more accessible for the people of Texas and beyond. Activities considered for possible funding include digitization, microfilming, and cataloging." Libraries can apply to fund the same type of project up to three years in a row, and the DRC project applied for $24,820.00 in 2019 to digitize 24,000 newspaper pages, representing the earliest years of microfilm available at the Denton Public Library. To create a viable grant application, DPL partnered with the Texas Digital Newspaper Program (TDNP), available through UNT's Portal to Texas History, and decided to start by digitizing as many early years of microfilm as grant funding could cover. TDNP is the largest single-state, open-access digital newspaper preservation repository in the U.S., hosting just under 8 million newspaper pages at the time of this writing.
In late 2018, UNT received permission from the owner of the DRC to include the newspaper run in the TDNP collection, which represented a very exciting opportunity for city and county researchers, as well as for the DPL. As thanks to the publisher for granting permission, UNT built access to the 2014 to 2018 PDF ePrint editions, which TDNP preserves as a service to Texas Press Association member publishers. After this, UNT contacted the DPL to discuss applying for grant funding. Once Laura learned that the DPL had received the 2019 award, she prepared the local planning steps necessary to collaborate with the university.

THE PROJECT BECOMES REAL

The Denton Record-Chronicle Digitization Project grant contract and resolution for adoption went before the Denton City Council on October 8, 2019. The City of Denton issued a press release that day, and the DRC published an article announcing the project. Over the next few days the DRC article appeared across social media, including the City of Denton's accounts, as well as in library-associated email newsletters. After the first newspapers became available on the Portal, both DPL and UNT prepared blog posts about the project, which have also appeared on social media. These blog posts fulfilled the publicity requirements specified by the grant while also training researchers to work with the online newspaper collection.

One major convenience of this collaboration is that both organizations are in the same city. Transfer of materials was arranged by email and accomplished by a trip across town. We completed the digitization process in batches, with the first 10 microfilm rolls going to UNT on October 10, 2019, and UNT uploading the first 854 issues in December 2019. The newspapers from the first microfilm set covered 1908-1916. DPL transferred the last set of microfilm in April 2020, with dates ranging from 1917 through September 1924, shortly after which UNT completed and uploaded the grant-funded count of 24,000 newspaper pages. The grant proposal estimated that the scans would reach 1938, but the page count of this newspaper proved much higher than originally estimated; as a result, the funding covered only through September 1924. DPL and UNT will continue their partnership by digitizing further years of the DRC through a variety of methods. As we were in the midst of preparing this column, TSLAC contacted Laura to inform her that DPL had received a second grant award, in the amount of $24,820.00, to digitize 24,000 additional newspaper pages, which will move the newspapers through 1954.

As of July 23, 2020, the Denton Record-Chronicle Collection on The Portal to Texas History hosts 6,168 items and has been used 16,397 times. This includes 1,743 items that are PDF ePrint editions of the paper from 2014 to 2018, which UNT uploaded for long-term preservation and access. UNT uploads ePrint editions without charge and digitally preserves them through an agreement with the Texas Press Association; these PDFs were not part of the funded grant, but they enhance access to the collection and helped build community interest in seeing earlier years available on the Portal. Usage of the collection skyrocketed after the early editions became available: January 2020 saw the highest monthly total, with 3,105 collection uses.
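The shortfall is easy to reproduce from the figures above. A back-of-the-envelope sketch, assuming a flat per-page fee (the column does not give UNT's actual fee schedule) and the roughly 200,000-page full run mentioned below:

    # Back-of-the-envelope cost math from the grant figures above.
    # Assumes a flat per-page fee; the actual UNT fee schedule may differ.
    grant_award = 24_820.00   # 2019 TexTreasures award, in dollars
    pages_funded = 24_000     # pages covered by the award
    total_pages = 200_000     # approximate full run of the DRC

    per_page = grant_award / pages_funded
    print(f"effective rate: ${per_page:.3f} per page")           # ~$1.034
    print(f"full-run estimate: ${per_page * total_pages:,.0f}")  # ~$206,833

At that rate, each additional award advances the collection by a fixed number of pages rather than a fixed span of years, which is why a denser-than-expected paper exhausted the 2019 funding at September 1924 instead of the projected 1938.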
Once this project is complete, it will include over 200,000 newspaper pages. Neither DPL nor UNT could tackle this project alone, but through collaboration it is possible.

RECIPE FOR YOUR OWN COLLABORATION SUCCESS

These are planning recommendations for your own collaboration, drawn from what we learned while working on this project together.

1. Communicate Early and Often: Communicating needs enables partners to identify each other's strengths, and each partner brings those strengths to the project; in this case they included the archival materials from DPL and the technological expertise of UNT. In addition, be prepared to communicate with local groups who need to endorse or sign off on the project, possibly including the city council, the historical commission, or the city manager.

2. Partner to Write the Grant: Partnering on the grant application achieves two goals: first, it establishes a communication flow that carries through the whole collaboration; second, it ensures that partners know what each can realistically accomplish within the grant timeline. In this case, Laura wrote most of the grant application herself, but she had very specific questions that Ana had to answer, and she needed key elements from UNT, including the project budget, technological infrastructure, and a commitment letter. Communicating early and partnering on the application ensured that there were no unexpected surprises within either partner's control.

3. Work Together to Explain Your Partnership: With a grant of this size, we always spoke in advance to ensure we weren't over-promising when newspapers would appear online. This also gave both Laura and Ana lead time for promoting the project: Laura would share the years covered by the physical microfilm before sending the rolls over, and Ana would walk Laura through the years that would be uploaded in a given month. This allowed them to plan publicity, training, and outreach efforts based on the dates of newspapers going online. In addition, Laura regularly communicated with Ana before submitting grant reports, which was critical in preventing miscommunication with the funding agency.

4. Pad Enough Time for the Unexpected: Of course, we had no way of knowing a pandemic would occur when we began this project. What saved us was that we started planning as soon as we learned about receiving the grant, rather than when the grant period started in September 2019. Planning two months in advance put us two months ahead of schedule, and we were able to start exchanging materials as soon as the grant period began. This lead time meant we successfully completed the project by the end of April 2020, at which point the microfilm page count had been scanned and UNT staff could work remotely to complete the digitization processes. Extra time is only a benefit: even without the COVID-19 pandemic, we might have had to address technological or film deterioration problems, and the extra weeks of lead time would have let us resolve them earlier rather than later.
5. Don't Be Afraid to Explain Changes to Your Granting Agency: Your project may change due to unforeseen circumstances; in our project, for example, the uploaded total reached 24,000 pages before we had digitized the entire planned date range. UNT charges a per-page digitization fee, and these newspaper issues proved to contain more pages than expected. Laura contacted the representative at TSLAC to explain the situation and offer an alternative approach to cover the digitization of the remaining years. The important thing is to keep the granting agency informed of any changes, delays, or hiccups in the project.

We are both proud of having completed this project three months before the end of the grant period, but we know that without solid communication, planning, and flexibility, the COVID-19 pandemic would have made the situation extremely difficult if not impossible. Leveraging the Portal's technical infrastructure and TDNP's newspaper expertise alongside the volume of material and the collection expertise provided by DPL has given us a model for success we plan to build on in future projects. Best of all, in the world of COVID-19, our patrons can access these newspapers from the comfort of their own couches, without even taking off their pajamas!

EDITORIAL BOARD THOUGHTS

What More Can We Do to Address Broadband Inequity and Digital Poverty?

Lori Ayre

INFORMATION TECHNOLOGY AND LIBRARIES | SEPTEMBER 2020
https://doi.org/10.6017/ital.v39i3.12619

Lori Ayre (lori.ayre@galecia.com) is Principal, The Galecia Group, and a member of the ITAL Editorial Board. © 2020.

We are now almost seven months into our new lives with the novel coronavirus, and over 190,000 Americans have died of COVID-19. Library administrators have been struggling to balance their commitment to providing services to their communities with keeping their staff safe. Initially, libraries relied on their online offerings, so more e-books and other online resources were acquired. Staff learned that they could do quite a bit of their work from home. They could still respond to email and phone messages. They could evaluate and order new material. They could deliver online programs like summer reading and story time. They could interact with people on social media. They could put together key resources for patrons and post them on the website.1

A lot of what the library was doing while the buildings were closed was not obvious. Most people associate the library with the building, and since the building was closed, it seemed like nothing was happening at the library. Yet library workers were busy.

Once it became possible for library staff to enter the building (per local health ordinances), the first thing libraries started to do was accept returns. That was a little fraught, considering how little we knew about the virus and how long contaminants might live on returned library material. Eventually, with the long-awaited testing results from the REALM Project and Battelle Labs (https://www.webjunction.org/explore-topics/COVID-19-research-project.html), libraries standardized on a three-day quarantine of returns. Then more testing of stacked material was done, leading some libraries to quarantine returns for four days. As of early September, we have learned that even five days isn't enough to quarantine delivery totes and some other plastic materials.
Curbside pick-up was born in these early days of being allowed back into the buildings. If someone had mapped who was offering curbside pick-up, it would have looked like popcorn popping across the country: the number of libraries offering the service slowly increased, and pretty soon nearly everyone was doing it.2 Many library directors will say that curbside pick-up is here to stay; people love the convenience too much to take the service away.

Rolling out curbside pick-up has had some challenges: how to safely make the handoff between library staff and patrons; whether to accept returns; whether to charge fines; how to modify circulation policies to fit current needs; and how to select books for people who want them but lack the skills to negotiate the library catalog's requesting system. Some libraries started putting together grab bags of materials selected by staff for specific patrons, a kind of homebound service on the fly. Curbside helped get material into circulation again.

Importantly, also during this period, libraries started finding creative ways to get Wi-Fi hotspots out into communities. They began lending them if they weren't already, and those already circulating hotspots increased their inventory. They took their bookmobiles into neighborhoods and created temporary hotspot connections around town. Many libraries made sure Wi-Fi from their buildings was available in their own parking lots.3

But one thing everyone has learned during this pandemic is that libraries alone cannot be the solution to the digital divide. This isn't news to librarians, who have long argued that Internet access should be as readily available as electricity and water. Librarians understand that information cannot be free and accessible unless everyone has Internet access and knows how to use it. Public access computers, Wi-Fi hotspots, and media literacy are all staple services in our libraries today.4 However, these services are not enough to bridge a digital divide that only seems to be getting worse. The coronavirus that closed libraries and schools has made it painfully clear that something much bigger has to happen to address the problem. As Gina Millsap stated in a recent Facebook post:

I think it's become obvious that the COVID-19 crisis is shining a spotlight on the flaws we have in our broadband infrastructure and on our failure to make the investments that should have been made for equitable access to what should be a basic utility, like water or electricity.5

According to BroadbandNow, the number of people who lack broadband Internet access could be as high as 42 million.6 The FCC reports that at least "18 million people lacked access to broadband Internet at the end of 2018."7 Even if all the libraries were open and circulating hundreds of Wi-Fi hotspots, we'd still have a very serious access problem.

THINKING DIFFERENTLY ABOUT ADDRESSING THE DIGITAL DIVIDE

In a paper published March 28, 2019, by the Urban Libraries Council (ULC), the author suggested three specific actions libraries can take to address race and social equity and the digital divide:

1. Assess and respond to the needs of your community through meaningful conversation (including considering different partners for your work)
2. Optimize funding opportunities to support your efforts (e.g., E-rate)

3. Think outside the box to create effective solutions that are informed by those in need (e.g., lending Wi-Fi hotspots)8

While we know libraries have been heeding this advice when it comes to Wi-Fi hotspots, let's look at what can be done when we consider ULC's suggestion to consider different partners for your work.

Community Partners

An excellent example of what can be done with a coalition of community partners comes from Detroit, where a mesh wireless network was put in place to provide permanent broadband in a low-income neighborhood.9 The project is called the Detroit Community Technology Project. With a community-based mesh network, only one Internet account is needed to provide access for multiple homes. The network also enables participants to share resources with one another (a calendar, files, a bulletin board), and that data lives on their network, not in the cloud. One of the sponsors of the Detroit Community Technology Project is the Allied Media Project (https://www.alliedmedia.org/), which also sponsors CassCoWifi and the Equitable Internet Initiative to bring broadband and digital literacy training to several underserved areas.

Community Networks (https://muninetworks.org/), a project of the Institute for Local Self-Reliance (https://ilsr.org/), describes several innovative projects in which communities partner with electric utilities. Surry County, Virginia, expects to extend broadband access to 6,700 households through a first-of-its-kind partnership between the utility Dominion Energy Virginia and a local electric cooperative. A similar project is underway with the Northern Neck Cooperative and Dominion Energy.10 These initiatives were made possible by regulatory changes in Virginia (SB 966).

According to Community Networks, there are 900 communities providing broadband connectivity locally (https://muninetworks.org/communitymap). But nineteen states still have barriers in place that discourage, if not outright prevent, local communities from investing in broadband. Libraries in states where community networks are a viable option should be at the table, or perhaps setting the table, for discussions about how to bring broadband to the entire community, not just into the library or dispatched one at a time via Wi-Fi hotspots. This is an opportunity to convene community conversations focused on broadband. Library staff have been doing more and more of this type of outreach, acting as facilitators, and ALA has even produced a Community Conversation Workbook (http://www.ala.org/tools/sites/ala.org.tools/files/content/LTC_ConvoGuide_final_062414.pdf) to support libraries just getting started.
State Partners

In California, the Governor recently issued Executive Order N-73-20 (https://www.gov.ca.gov/wp-content/uploads/2020/08/8.14.20-EO-N-73-20.pdf), directing state agencies to pursue a goal of 100 Mbps download speed and outlining actions across state agencies and departments to accelerate mapping and data collection, funding, deployment, and adoption of high-speed internet.11 This will undoubtedly create fertile ground for libraries to partner with other agencies and community organizations to advance the initiative. Libraries are specifically called out to raise awareness of low-cost broadband options in their local communities.

Every state has some kind of broadband task force, commission, or advisory council (https://www.ncsl.org/research/telecommunications-and-information-technology/state-broadband-task-forces-commissions.aspx). This is another instance where libraries should be at the table. In my state, our State Librarian is on the California Broadband Council. But many of these commissions do not have a representative from the library world, which means they probably are not hearing from us. Whether through your local library, your state library, or your state library association, it is important for librarians to build relationships with people on these commissions, if not get a seat on the commissions themselves.

National Partners

Unless your community is blanketed with affordable broadband connectivity, it will be important that we continue to advocate nationally for the needs we see. In addition to helping the patron standing right in front of us checking out a hotspot, we also need to address the needs of the people who aren't able to get to the library but are equally in need of access. Our job is to make sure that any new initiatives undertaken by a new administration provide for free and equitable access to the Internet for every household. Extending E-rate (the Federal Communications Commission's program for making Internet access more affordable for schools and libraries) isn't enough. Free, or at least affordable, broadband needs to be brought to every home.

The Electronic Frontier Foundation (EFF) argues that fiber-to-the-home is the best option for consumers today because it will be easily upgradeable without touching the underlying cables and will support the next generation of applications (see https://www.eff.org/wp/case-fiber-home-today-why-fiber-superior-medium-21st-century-broadband). Libraries have worked with the EFF on issues related to privacy and government transparency. Maybe it's time to team up with them on broadband.

Global Partners

Low Earth Orbit (LEO) satellites could potentially bring broadband to everyone on Earth.12 Starlink (https://www.starlink.com/) is Elon Musk's initiative, and Project Kuiper (https://blog.aboutamazon.com/company-news/amazon-receives-fcc-approval-for-project-kuiper-satellite-constellation) is Amazon's Jeff Bezos' project. A private Starlink beta is due, or perhaps it is already happening. If it works as Musk has envisioned, it could be a game-changer.
Or it might just make the digital divide worse if it isn't affordable to everyone who needs it. How might we lobby Musk to roll out this service in a way that is equitable and fair?

SPEAK UP, SPEAK OUT, AND GET IN THE WAY

These are just a few avenues that we, as professionals committed to free access to information, might pursue. I worry that we have not made enough noise about the problems we see in our communities that result from broadband inequity and digital poverty. And although virtually every library is doing something to address the problem, our efforts are no match for its magnitude.

In a blog post on the Brookings Institution's website, authors Lara Fishbane and Adie Tomer argue for a new agenda focused on comprehensive digital equity that includes (among other things) "building networks of local champions, ensuring community advocates, government officials, and private network providers share intelligence, debate priorities, and deploy new programming."13 There are no better local champions and advocates for communities than city and county librarians and their staffs. Let's treat this problem with the seriousness it deserves and at a scale that will be meaningful.

To quote John Lewis (as so many of us have since his death on July 17, 2020), it's time for us to "speak up, speak out, and get in the way."14 We have to make it painfully clear to policymakers that libraries cannot bridge the digital divide with public access computers and hotspots. We need to tell our communities' stories, convene conversations, and agitate for equitable broadband that is as readily available as water and electricity.

ENDNOTES

1 "Libraries Respond: COVID-19 Survey," American Library Association, accessed August 25, 2020, http://www.ilovelibraries.org/sites/default/files/MAY-2020-COVID-Survey-PDF-Summary-of-Results-web-2.pdf.

2 Erica Freudenberger, "Reopening Libraries: Public Libraries Keep Their Options Open," Library Journal, June 25, 2020, https://www.libraryjournal.com/?detailStory=reopening-libraries-public-libraries-keep-their-options-open.

3 Lauren Kirchner, "Millions of Americans Depend on Libraries for Internet. Now They're Closed," The Markup, June 25, 2020, https://themarkup.org/coronavirus/2020/06/25/millions-of-americans-depend-on-libraries-for-internet-now-theyre-closed.

4 Jim Lynch, "The Gates Library Foundation Remembered: How Digital Inclusion Came to Libraries," TechSoup, accessed August 24, 2020, https://blog.techsoup.org/posts/gates-library-foundation-remembered-how-digital-inclusion-came-to-libraries.
5 Gina Millsap, "This was in April. Q. We're starting a new school year and what has changed? A. Not much. It's past time to get serious about universal broadband in the U.S.," Facebook, August 16, 2020, 5:37 a.m., accessed September 14, 2020, https://www.facebook.com/gina.millsap.7/posts/10218986781485855.

6 "Libraries are Filling the Homework Gap as Students Head Back to School," BroadbandUSA, last modified September 4, 2018, https://broadbandusa.ntia.doc.gov/ntia-blog/libraries-are-filling-homework-gap-students-head-back-school.

7 James K. Willcox, "Libraries and Schools Are Bridging the Digital Divide During the Coronavirus Pandemic," Consumer Reports, last modified April 29, 2020, https://www.consumerreports.org/technology-telecommunications/libraries-and-schools-bridging-the-digital-divide-during-the-coronavirus-pandemic/.

8 Sarah Chase Webber, "The Library's Role in Bridging the Digital Divide," Urban Libraries Council, last modified March 28, 2019, https://www.urbanlibraries.org/blog/the-librarys-role-in-bridging-the-digital-divide.

9 Cecilia Kang, "Parking Lots Have Become a Digital Lifeline," The New York Times, May 20, 2020, https://www.nytimes.com/2020/05/05/technology/parking-lots-wifi-coronavirus.html.

10 Ry Marcattilio-McCracken, "Electric Cooperatives Partner with Dominion Energy to Bring Broadband to Rural Virginia," last modified August 6, 2020, https://muninetworks.org/content/electric-cooperatives-partner-dominion-energy-bring-broadband-rural-virginia.

11 "Newsom Issues Executive Order on Digital Divide," CHEAC (Improving the Health of All Californians), last modified August 14, 2020, https://cheac.org/2020/08/14/newsom-issues-executive-order-on-digital-divide/.

12 Tyler Cooper, "Bezos and Musk's Satellite Internet Could Save Americans $30B a Year," Podium: Opinion, Advice, and Analysis by the TNW Community, last modified August 24, 2019, https://thenextweb.com/podium/2019/08/24/bezos-and-musks-satellite-internet-could-save-americans-30b-a-year/.

13 Lara Fishbane and Adie Tomer, "Neighborhood Broadband Data Makes It Clear: We Need an Agenda to Fight Digital Poverty," Brookings Institution, last modified February 6, 2020, https://www.brookings.edu/blog/the-avenue/2020/02/05/neighborhood-broadband-data-makes-it-clear-we-need-an-agenda-to-fight-digital-poverty/.

14 Rashawn Ray, "Five Things John Lewis Taught Us About Getting in 'Good Trouble,'" Brookings Institution, last modified July 23, 2020, https://www.brookings.edu/blog/how-we-rise/2020/07/23/five-things-john-lewis-taught-us-about-getting-in-good-trouble/.
PUBLIC LIBRARIES LEADING THE WAY

Harnessing the Power of OrCam

Mary Howard

INFORMATION TECHNOLOGY AND LIBRARIES | SEPTEMBER 2020
https://doi.org/10.6017/ital.v39i3.12637

Mary Howard (mhoward@sccl.lib.mi.us) is Reference Librarian, Library for Assistive Media and Talking Books (LAMTB), at the St. Clair County Library, Port Huron, Michigan. © 2020.

Library for Assistive Media and Talking Books (LAMTB) services are located at the main branch of the St. Clair County Library System. LAMTB provides resources and technologies for residents of all ages who have visual, physical, and/or reading limitations that prevent them from using traditional print materials. Operating out of Port Huron, Michigan, we encounter many instances where we need to provide assistance above and beyond what a basic library may offer. We host Talking Book services, which provide free players, cassettes, braille titles, and downloads to users who are vision or mobility impaired. We also have a large, stationary Kurzweil reading machine that converts print to speech, video-enhanced magnifiers, and large print books, and we provide home delivery service for patrons who are unable to travel to branches. The library had been searching for a more technology-forward focus for our patrons, and in 2018 the state's Talking Books center in Lansing set up an educational meeting at the Library of Michigan to see a live demonstration of the OrCam MyEye reader.
This was the innovation we were seeking, and I was thoroughly impressed with the compact and powerful design of the reader, the ease of use, and the stunningly accurate feedback provided by this AI reading assistive device. Users are able to read with minimal setup and total control.

OrCam readers are lightweight, easily maneuverable assistive technology devices for users who are blind, visually impaired, or have a reading disability, including children, adults, and the elderly. The device automatically reads printed text of nearly any kind: newspapers, books, menus, labels on consumer products, money, and text on screens or smartphones. It reads the text back immediately and is fit for all ages and abilities. OrCam works with English, Spanish, and French and can identify money and other business and household items. The device attaches with a magnetic mount to either the left or right temple of the user's glasses, so it sits near either ear, and users can easily adjust the volume and speed of the read text. Letting a diverse group of users with different needs use the reader however they like is one of its more impressive qualities; most settings can be changed with just a finger swipe on the device.

The mission of OrCam is to develop a "portable, wearable visual system for blind and visually impaired persons, via the use of artificial computer intelligence and augmented reality." By offering these devices to our sight-, mobility-, or otherwise-impaired patrons, we open up the world of literacy, discovery, and education. Some of our users are not able to read in any other fashion, and the OrCam provides a much-needed boost to their learning.

We secured a grant from the Institute of Museum and Library Services (IMLS) for the purchase of the readers (CFDA 45.310). We also worked with OrCam to get lower pricing for these units: normally they retail for $3,500, but we were able to negotiate a lower price point of $3,000. We were also awarded a $22,106 Improving Access to Information grant from the Library of Michigan to fund the entire purchase. Without this funding stream we would not have been able to acquire the OrCam readers. However, if you have veterans in your service area, please contact the company, since low-vision or legally blind veterans may qualify to receive an OrCam device fully paid for by the VA. Please visit https://orcam.com/en/veterans for more information.

Figure 1. Close-up of the OrCam device.

The grant was initially set to run from September 2019 to September 2020. We purchased six OrCam readers for our library users, and they were to be rotated among our twelve branches throughout the grant cycle. However, due to the pandemic and out of safety concerns for staff and visitors, our library was closed from March 23 to June 15, and we were only able to offer the readers to the public at six branches. As of July 14, 2020, we are projecting that we may open to the public in September, but COVID-19 issues could halt that. We have made arrangements with the grantor to extend the usage period for the OrCam from September to December. This will make up for some of the lost time and open a path for the other six libraries to have their turn offering the OrCam to their patrons.
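OrCam's onboard software is proprietary, but the general print-to-speech pipeline described above (recognize printed text, then speak it at a user-chosen rate and volume) can be sketched with common open-source tools. The sketch below illustrates the technique only, not OrCam's implementation; it assumes the Tesseract OCR engine and its language data are installed, and the image filename is hypothetical.

    # Rough sketch of a print-to-speech pipeline with open-source tools.
    # Illustrative only; this is not OrCam's proprietary software.
    # Requires the Tesseract engine plus pytesseract, Pillow, and pyttsx3.
    import pytesseract          # OCR wrapper around the Tesseract engine
    import pyttsx3              # offline text-to-speech engine
    from PIL import Image

    def read_aloud(image_path, lang="eng"):
        # Recognize printed text; "eng", "spa", and "fra" mirror the three
        # languages the column mentions (each needs Tesseract language data).
        text = pytesseract.image_to_string(Image.open(image_path), lang=lang)
        engine = pyttsx3.init()
        engine.setProperty("rate", 150)    # speaking speed, user-adjustable
        engine.setProperty("volume", 0.9)  # 0.0 to 1.0
        engine.say(text)
        engine.runAndWait()

    read_aloud("menu_photo.jpg")  # hypothetical image of a printed menu

A dedicated device like the OrCam adds what this sketch lacks: a camera triggered by gesture, onboard processing with no network dependency, and controls simple enough to adjust by touch.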
The interesting aspect of this is that we now have to take our technology profile even further by offering remote training to prospective OrCam users. Thankfully, the design and rugged housing of the reader make it easy to clean and maintain, but social distancing complicates training. Setting up a new user normally means working within a foot or two of them to guide positioning and to get them used to how the OrCam reads; there is a lot of directing involved and close contact between user and instructor. Our workaround will be distance instruction, combining limited in-person help with remote training. OrCam also has a vast array of instructional videos that we will have cued up for users.

We have had over 150 residents attend presentations, demonstrations, and talks on the OrCam. I anticipate that this number will not be reached in the second round; however, we may be more successful with our online presence, since we can add the instruction to our YouTube page, offer segments on Facebook and other social media, and provide film clips for our webpage. The situation has been difficult, but it has pushed LAMTB to think about how to provide better and more remote service to our users. Since we cover over 800 square miles in the county, becoming more adaptable in serving our patrons has become a paramount area of work for the library. The OrCam will bring a new mode of remote training to our patrons, which in turn will raise awareness of the reader and how it can benefit users.

The St. Clair County Library System would like to thank the Institute of Museum and Library Services for supporting this program. The views, findings, conclusions or recommendations expressed in this article do not necessarily represent those of the Institute of Museum and Library Services.

LITA President's Message

In the Middle of Difficulty Lies Opportunity: Hope Floats

Evviva Weinraub Lajoie

INFORMATION TECHNOLOGY AND LIBRARIES | SEPTEMBER 2020
https://doi.org/10.6017/ital.v39i3.12687

Evviva Weinraub Lajoie (evviva@gmail.com) is Vice Provost for University Libraries and University Librarian, University at Buffalo, and the last LITA President. © 2020.

If quarantine has illustrated anything to me, it's that time is merely a construct. While my approximately two-month term as President may be the shortest in LITA history, it has been filled with meetings, reports, protests, and preparations for our metamorphosis into Core. My thoughts have been consumed with the myriad financial, health, and societal issues that have also filled my news feed. I spend a lot of time thinking and worrying about what their impact will be on our work and our institutions, how it affects me and the people I work with personally, and what role Core may play for many of us in the future.

I imagine all of us are thinking about health and safety. We are all balancing those parts of ourselves that want to aid, to help, to teach and guide with the parts of ourselves that are anxious and scared. Many of us have responsibilities where we need to protect our loved ones and ourselves. We are seeing the health and safety of our BIPOC colleagues disproportionately harmed. Balancing our crucial role within our communities is complicated, and there are no right answers.
I imagine many of us have been spending a lot of time thinking about money, whether it be personal concerns, institutional and organizational concerns, or their intersection. We're thinking a lot more about where our money comes from, how it is invested, how we pay for things, how we prioritize paying for things, who decides what gets purchased, and whose voice gets centered when we make that purchase. We're thinking carefully about the institutions and infrastructures that have existed and how they will, and should, look different in a post-COVID landscape.

I imagine most of us are thinking about societal connections. We are interacting with our professional colleagues differently, and many of us are, perhaps for the first time, perceiving the deep imbalances that permeate our personal, social, and professional lives. We are all trying to figure out how to do the work we need to do when we are uncomfortable, the world is uncertain, and the demands for change are coming from all angles and in a variety of forms.

LITA remained my professional home through the years because I found it to be a place where, no matter who you were or where you worked, there was a place for you. That feeling of connection is vital to all of us, pandemic and social unrest or not. Knowing there is a network I can depend on to be there when I'm working through the difficult and uncomfortable makes the work just a little bit easier and significantly more meaningful. Our professional organizations and affiliations have the ability to be an anchor in uncertain times, whether through a change in career, a financial crisis, an environmental catastrophe, or a global health emergency.

On August 31, 2020, LITA officially dissolved, and on September 1 our home became Core. At our last LITA Board meeting, Margaret Heller and Amanda L. Goodman presented a history of LITA (http://hdl.handle.net/11213/14823). What became clear to me in the retelling is that this is not LITA's first reorganization. Nor is it our second or our third. LLAMA, LITA, and ALCTS have always been dancing with each other. Our merger is an acknowledgement that we "...play a central role in every library, shaping the future of the profession by striking a balance between maintenance and innovation, process and progress, collaboration and leading" (https://core.ala.org/our-mission-vision-and-values/).

Collectively, we have had a year that is beyond comprehension: it has been filled with loss, anger, frustration, grief, anxiety, depression, horror. We have all been weathering the same storm, but our ships are not all equally prepared for the task laid ahead of them. That has been, for so many of us, the hardest part of all of this. We may have always known that inequities existed, that the system was structured to make sure that some folks were never able to get access to the better goods and services, but for many, this pandemic is the first time we have had those systemic inequities held up to our noses and been asked, "what are you going to do to change this?" Balancing those priorities will require us to lean on our professional networks and organizations to be more and to do more. I believe that together, we can make Core stand up to that challenge.

It has been an honor to serve as the last LITA President.
For the brief time I have served, to have the chance to hold an office so many people I truly admire have held... it is a legacy I am proud to have had a moment to uphold. I am gratified to transition LITA into a partnership that will take all that we have loved about LITA and make something new, something Core.

LETTER FROM THE EDITOR

September 2020

Kenneth J. Varnum

INFORMATION TECHNOLOGY AND LIBRARIES | SEPTEMBER 2020
https://doi.org/10.6017/ital.v39i3.12xxx

With "unprecedented" rising to first place on my personal list of words I would prefer never to need to use again, let alone hear used, I find it eminently satisfying that some activities and events from before COVID continue in their usual, predictable ways. For me, the quarterly rhythm of publication of Information Technology and Libraries is one of those activities. It is helping keep me grounded. While it is certainly not much in the scope of what is happening all around me, it is at least something.

One thing that is changing is that this journal, along with Library Resources and Technical Services and Library Leadership & Management, is now a publication of ALA's newest division: Core: Leadership, Infrastructure, Futures. You'll notice a new logo at the top of our site, reflecting the new organizational structure. I am excited about the possibilities of richer cross-Core cooperation and collaboration as we explore our new structure.

This issue includes the first, and last, LITA President's Message from incoming and outgoing LITA President Evviva Weinraub Lajoie. Evviva assumed the LITA presidency this summer, just before the merger of LITA, LLAMA, and ALCTS into the new Core division took place on September 1. Members of those three merged divisions should watch for information about elections for the new Core president in October.

I am pleased that this issue includes the 2020 LITA/Ex Libris Student Writing Award winning article, "Evaluating the Impact of the Long-S upon 18th-Century Encyclopedia Britannica Automatic Subject Metadata Generation Results," by Sam Grabus of Drexel University (https://doi.org/10.6017/ital.v39i3.12235). Julia Bauder, the chair of this year's selection committee (I was also a member, as ITAL editor), said, "This valuable work of original research helps to quantify the scope of a problem that is of interest not only in the field of library and information science, but that also, as Grabus notes in her conclusion, could affect research in fields from the digital humanities to the sciences."

Before closing, I would like to express my appreciation to Breanne Kirsch, who ably served on the editorial board from 2018 to 2020.

Sincerely,

Kenneth J. Varnum, Editor
varnum@umich.edu
September 2020

EDITORIAL BOARD THOUGHTS

Public Libraries Respond to the COVID-19 Pandemic, Creating a New Service Model

Jon Goddard

INFORMATION TECHNOLOGY AND LIBRARIES | DECEMBER 2020
https://doi.org/10.6017/ital.v39i4.12847

Jon Goddard (jgoddard@northshorepubliclibrary.org) is a Librarian at the North Shore Public Library, and a member of the ITAL Editorial Board. © 2020.

During the COVID-19 pandemic, public libraries have demonstrated, in many ways, their value to their communities.
They have enabled their patrons not only to resume their lives but also to learn and grow. Additionally, electronic resources offered to patrons through their library cards have allowed people to be educated and entertained. The credit must go to the librarians, who initially fueled and have since maintained this level of service by rewriting the rules and creating a new service model.

Once libraries closed, librarians promoted ebooks and other important platforms available to patrons with their library cards. The result: checkouts of ebooks and the use of these platforms rose exponentially. Community engagement became completely virtual, with librarians, and those who provide library programs to the public, providing services on platforms that they may never have used before, such as Zoom and Discord. As libraries reopened, many offered real-time reference services as well as seamless, contactless curbside service, providing a sense of control and continuity amid the chaos.

EXPONENTIAL INCREASES IN ELECTRONIC RESOURCE USAGE

OverDrive, which is currently used by nearly 90% of public libraries in the United States to manage both ebook and audiobook collections, saw an exponential increase in its usage. Since the lockdown began in mid-March, the daily average for ebook checkouts has been consistently 52% above pre-COVID periods. Additionally, the number of new users on the platform has consistently been double and triple 2019 highs.1 Library staff have been helping readers during this time to ensure they obtain access with their devices. In Suffolk County, New York, where new patron registration for OverDrive is up 72% from last year (as of August 2020), there has been no shortage of requests for help.2

With kids being home from school and learning virtually, it is no surprise that ebook readership skyrocketed among YA and juvenile readers, with an 87% increase from last year.3 To help them with their homework and studies, families turned to online tutoring. In Suffolk County, New York, the usage of the Brainfuse online live tutoring service has been consistently up by nearly 50% during the school closures.4 Gale, a Cengage company, which offers Miss Humblebee's Academy, a virtual learning program for preschoolers, saw its user sessions increase by 100% from the previous year.5

Adults, also eager to learn new skills, took to online courses as well. Gale Courses saw a 50% increase in enrollments from March-July over the previous year. Likewise, Gale Presents: Udemy, which offers on-demand video courses, saw just over 21,000 course enrollments from March-June.6

To help those who did not have sufficient broadband access to use these necessary resources and platforms, many libraries left their Wi-Fi on even when the building was closed, allowing access to those in the vicinity of the building. In addition, many libraries purchased Wi-Fi hotspots to lend to their patrons. According to Pew Research, approximately 25% of households do not have a broadband internet connection at home.7 While public libraries cannot provide the only local solution to this gap, here are other steps libraries have been taking during the shutdown:

• Strengthening wireless signals so people can access wireless from outside library buildings.
• Hosting drive-up Wi-Fi hotspot locations.
• Partnering with schools to obtain and distribute Wi-Fi hotspots to families in need.
COMMUNITY ENGAGEMENT - VIRTUALLY

Community engagement has been vital since the COVID-19 lockdown. Both librarians and those who provide library programs to the public had to adjust quickly to the virtual world in which we were suddenly living. Using a mixture of social media platforms, including Facebook Live and Stories, Discord, Instagram, YouTube, Zoom, and GoToMeeting, librarians flocked to the internet, providing a wide range of programming. Even those libraries that did not previously have any virtual programs managed to provide quality programs to their patrons very quickly.

Virtual programming was not available at the San José Public Library (SJPL) prior to the shutdown. Librarians quickly started to move programs online, including story time, and created a program called Spring Into Reading, similar to the summer reading program, to continue to encourage families to read together. They also started a weekly recorded story time, so patrons could call the library and use their phones to hear a story. To date, SJPL has hosted over 2,000 virtual events since the lockdown began on March 17.8

Some libraries, like the Oceanside Library in New York, were offering virtual programs before the pandemic. When the library closed on March 13, the team started planning to move completely virtual. Two days later, the library was offering four programs a day, including story times, book chats, and book clubs. By the end of the week, they were offering eight programs a day.9 In April, May, and June, they found book discussions and story times were the most popular programs. They then started to open their programs to people from out of state, partnering with other libraries. The result? Program attendance has increased and several Zoom meeting rooms have been maxed out.10

Through the lockdown, library patrons have been exercising, listening to concerts, taking virtual vacations, learning new skills, cooking, playing games, and reducing stress. This incredible adaptation was only possible due to library workers' quick thinking and never-ending determination to help.

DELIVERING INFORMATION AND MATERIALS WITH A NEW SERVICE MODEL

At SJPL, which has over 500,000 library members, staff had to shift quickly to a new online reality just after the shutdown. To help patrons get the most from their electronic resources, SJPL used LibAnswers to post FAQs and email responses to their issues and questions. When a librarian was available, patrons could use LibChat to ask questions in real time. Because no one was in the library buildings to answer phones, LibAnswers and LibChat became the only way the public could communicate with staff. Chat reference conversations nearly quadrupled, from approximately 40 chat sessions per day to 160 per day. The chat service was also made available in Spanish, Vietnamese, and Chinese. When the library implemented its Express Pickup service, SJPL utilized the Spaces functionality in LibCal to allow patrons to create pickup appointments. When patrons arrived at the library for their appointment, the SMS functionality in LibAnswers allowed them to text staff upon arrival.
Through the City of San José's SJ Access initiative, which aims to help bridge the digital divide in the city, SJPL worked closely with other city departments and the Santa Clara County Office of Education to purchase approximately 16,000 high-speed AT&T hotspots for students and the public.11

WORKING TOWARDS THE NEW NORMAL

The American Library Association (ALA) is committed to advocating strongly for libraries on several different fronts. Thanks to thousands of advocacy communications with Congress, libraries secured $50 million for the Institute of Museum and Library Services (IMLS) in the Coronavirus Aid, Relief, and Economic Security (CARES) Act. This enabled libraries and museums to apply for grants during this time of need.12 In addition, the ALA is currently advocating for the passage of the Library Stabilization Fund Act (S.4181 / H.R.7486) to allow libraries to retain staff, maintain services, and safely keep communities connected and informed. The legislation calls for $2 billion in emergency recovery funding for America's libraries through the IMLS.13

While the ALA is rightly advocating for these emergency funds, public librarians and administrators should take advantage of this time to strategically review what has been put into place to react to the COVID-19 pandemic, and plan for the long term. While it is true that libraries are physical spaces, they are also technology-driven services for learning and connection for all ages. Additionally, they have shown that, through this new service model, access has exponentially expanded to new patrons, demonstrating tremendous value when it comes to education and engagement.

This new service model should be preserved. Programs that engage our communities should be both physical and virtual. Physical media and books should be provided both at the circulation desk and through a contactless service. Reference services should be provided both at the reference desk and through chat reference services. This must be our new normal.

ENDNOTES

1 David Burleigh, Director, Brand Marketing & Communication at OverDrive, phone conversation with author, October 9, 2020.
2 Maureen McDonald, Special Projects Supervisor at the Suffolk Cooperative Library System, phone conversation with author, September 14, 2020.
3 Burleigh.
4 McDonald.
5 Kayla Siefker, Head of Media & Public Relations; Brian Risse, VP of Sales, Public Libraries; and Muna Sharif, Product Manager, Discovery & Analytics, at Gale, a Cengage Company, phone conversation with author, October 16, 2020.
6 Siefker.
7 Pew Research Center, "Internet/Broadband Fact Sheet," June 12, 2019, accessed October 13, 2020, https://www.pewresearch.org/internet/fact-sheet/internet-broadband/.
8 Laurie Willis, Web Services at SJPL, phone conversation with author, October 14, 2020.
9 Erica Freudenberger, "Programming Through the Pandemic," Library Journal, May 22, 2020, https://www.libraryjournal.com/?detailStory=Programming-Through-the-Pandemic-covid-19.
10 Tony Iovino, Assistant Director for Community Services at the Oceanside Library, phone conversation with author, October 19, 2020.
11 Willis.
12 American Library Association, "Advocacy & Policy," accessed October 15, 2020, http://www.ala.org/tools/covid/advocacy-policy.
13 Ibid.
12857 ---- Journey with Veterans: Virtual Reality Program using Google Expeditions PUBLIC LIBRARIES LEADING THE WAY Journey with Veterans: Virtual Reality Program using Google Expeditions Jessica Hall INFORMATION TECHNOLOGY AND LIBRARIES | DECEMBER 2020 https://doi.org/10.6017/ital.v39i4.12857 Jessica Hall (jessica.hall@fresnolibrary.org) is Community Librarian, Fresno County Public Library. © 2020.

"Where would you like to go?" is the question of the day. We have stood atop the Great Wall of China, swum with sea lions in the Galapagos Islands, and walked along the vast red sands of Mars. Each journey was unique and available through the library.

As a community librarian in charge of outreach to seniors and veterans, I first learned about the virtual tour idea from a colleague who returned from a conference excited to tell me about a workshop she had attended, which described a program that used Google Expeditions to take seniors on virtual tours. This idea stayed with me for months, until Fresno County Public Library obtained the $3,000 Value of Libraries grant, which was funded by the California Library Services Act. As part of this grant, $2,905 went to purchase a Google Expeditions kit and supplies to create a virtual reality program called Journey with Veterans.

The kit includes 5 viewers and 1 tablet. A viewer is basically a Google Cardboard, except the case is plastic and there is a smartphone inside the case. During the program, I use the tablet to select and run each tour. The tour I select on the tablet is projected to the 5 viewers so participants can experience it. In this manner, veterans can explore places without physically having to travel anywhere.

The Journey with Veterans program took the technology to the veterans instead of requiring them to come into the library. The two locations chosen were the Veterans Home of California-Fresno and the Community Living Center at the VA Medical Center in Fresno, CA. From the time the program began in September 2019 to March 2020, when the pandemic shutdown brought a halt to the program, the library hosted 26 sessions at these two locations with 182 veterans. In sessions where more than 5 people were in attendance, the viewers were shared between the participants.

The tablet and the smartphones inside the viewers have an app installed on them called Google Expeditions, which is the software that runs the tours. One hotspot, which was already owned by the library, was used for this program. It is a requirement that all the viewers and the tablet are connected to the same Wi-Fi network. Having a portable Wi-Fi connection was necessary to run this program in locations where there was no access to a strong internet connection.

Each tour is a selection of still 360-degree views. The landscape does not move. Instead, the participant turns their head around, up, and down to look at the entire scene. The control tablet included additional menu items not seen by participants.
These items included scripts I could read about the landscape we were looking at and suggested points of interest I could highlight for participants. When I selected a point of interest on the tablet, the participant would see arrows pointing to that area of their screen. The participant would follow the arrows by turning their head in the direction indicated. Participants knew they were looking at the area of interest when the arrows disappeared and were replaced by a white circle surrounding the relevant portion of the screen.

The viewers did not have straps attached to them, and there was no way to attach any, so a viewer could not be strapped to a participant's head. Instead, the participant had to hold up the viewer the entire time they wished to look through it. This presented a challenge for participants who did not have the ability to hold the viewer on their own. At the locations I went to, there were staff available to help, and they would hold the viewer up to a participant's eyes. In some cases, one staff person held the viewer up for the participant while another turned the participant's wheelchair in a circle so they could see the entire image.

Each program lasted 30-45 minutes, but the amount of time looking through the viewer was kept to around 15-20 minutes. The rest of the time was filled with talking about the location we were viewing. For the veterans in memory care at the Veterans Home of California-Fresno, this program was designed with the hope that it would allow the veterans to reminisce about places they had visited and lived in and encourage them to talk about their experiences. Some of the participants had been to the countries we visited virtually, and they reminisced about their time there. At every session, the participants shared their enthusiasm and eagerness to continue the program.

The program was once tried with music. On one of my first visits to the Community Living Center at the VA Medical Center, a participant asked if he could play music in the background. Since I had thought about incorporating music into the program, I agreed, and the participant played some classical music from his own device. Though it was a good idea, the execution did not work well. The music was coming from one location, which made it too loud when one stood near it but too quiet once one walked too far away. I found the music difficult to talk over while giving the tour. I believe that incorporating sounds of the location we visit, such as the sounds of the countryside or a big city, would make the experience more immersive. However, I have yet to find a way to do so successfully.

After the grant ended, I continued the program at both locations. The partnership I had created at the Veterans Home of California-Fresno grew into a second program, Storytime with Veterans, which was requested specifically by the residents. I alternated my visits so that some weeks we did a virtual reality program and some weeks I read to them. One time there was a miscommunication: the activity coordinator thought I had come to read a story, but I was under the impression that it was a virtual reality week, so I had brought the Google Expeditions kit with me. The solution was to do both.
One of the Google Expeditions tours is a very short and much abridged virtual reality version of Twenty Thousand Leagues Under the Sea by Jules Verne. The tour uses artwork to represent scenes from the book, and each scene tells a different part of the story. The Veterans Home's residents were treated to both a story and a virtual reality tour at the same time.

Up until the library's shutdown in mid-March due to COVID-19, I was in the process of expanding the use of Google Expeditions but was unable to continue. Since then, the equipment has not been used. Restarting the program now involves multiple challenges, not the least of which is sanitizing the devices. Sanitation was a consideration even before COVID-19, and sanitary virtual reality masks were acquired using grant funds as part of the initial program. These masks look like strips of cloth that line the eyes, with strings that hook around the ears to hold them in place. Cleaning products were also purchased and used to clean the devices after each program.

Before COVID-19, a viewer could be handled by multiple people before it was cleaned. I always handled them first to prepare them for use. Then I handed each one to the participant. Occasionally they were also handled by staff. I always cleaned the viewers right after the program ended, but not during the program. With the current COVID-19 restrictions, the sanitation practices previously used are inadequate. I do not know the future of the program in a post-COVID-19 world, but I intend to begin the program again once it becomes safe to do so, and I will incorporate all required precautions and restrictions. I look forward to once more being able to take veterans on exciting virtual journeys.

13027 ---- Leadership and Infrastructure and Futures…Oh my! LETTER FROM THE CORE PRESIDENT Leadership, Infrastructure, Futures Christopher Cronin INFORMATION TECHNOLOGY AND LIBRARIES | DECEMBER 2020 https://doi.org/10.6017/ital.v39i4.13027 Christopher Cronin (cjc2260@columbia.edu) is Core President and Associate University Librarian for Collections, Columbia University. © 2020.

I am so pleased to be able to welcome all ITAL subscribers to Core: Leadership, Infrastructure, Futures! This issue marks the first of ITAL since the election of Core's inaugural leadership. A merger of what was formerly three separate ALA divisions—the Association for Library Collections & Technical Services (ALCTS), Library & Information Technology Association (LITA), and the Library Leadership & Management Association (LLAMA)—Core is an experiment of sorts. It is, in fact, multiple experiments: in unification, in collaboration, in compromise, in survival. While initially born out of a sheer fight-or-flight response to financial imperatives and the need for organizational effectiveness, developing Core as a concept and as a model for an enduring professional association very quickly became the real motivation for those of us deeply embedded in its planning.

Core is very deliberately not an all-caps acronym representing a single subset of practitioners within the library profession. It is instead an assertion of our collective position at the center of our profession. It is a place where all those working in libraries, archives, museums, historical societies—information and cultural heritage broadly—will find reward and value in membership and a professional home.
All organizations need effective leaders, strong infrastructure, and a vision for the future. And that is what Core strives to build with and for its members. While I welcome ITAL's readers into Core, I also welcome Core's membership into ITAL. No longer publications of their former divisions, all three journals within Core have an opportunity to reconsider their mandates. As with all things, audience matters. ITAL's readership has now expanded dramatically, and those new readers must be invited into ITAL's world just as much as ITAL has been invited into theirs.

As we embark on this first year of the new division, we do so with a sense of altogether newness more than of a mere refresh, and a sense of still becoming more than a sense of having always been. And who doesn't want to reinvent themselves every once in a while? Start over. Move away from the bits that aren't working so well, prop up those other bits that we know deserve more, and venture into some previously uncharted territory. How will being part of this effort, and of an expanded division, reframe ITAL's mandate?

The importance of information technology has never been more apparent. It is not lost on me that we do this work in Core during a year of unprecedented tumult. In 2020, a murderous global pandemic was met with unrelenting political strife, pervasive distribution of misinformation and untruths, devastating weather disasters, record-setting unemployment, heightened attention on an array of omnipresent social justice issues, and a racial reckoning that demands we look both inward and outward for real change. Individually and collectively, we grieve so many losses—loss of life, loss of income, loss of savings, loss of homes, loss of dignity, loss of certainty, loss of control, loss of physical contact.

And throughout all of these challenges, what have we relied on more this year than technology? Technology kept us productive and engaged. It provided a focal point for communication and connection. It provided venues for advocacy, expression, and inspiration, and, as a counterpoint to that pervasive distribution of misinformation, it provided mechanisms to amplify the voices of the oppressed and marginalized. For some, but unfortunately not all, technology also kept us employed. And as the physical doors of our organizations closed, technology provided us with new ways to invite our users in, to continue to meet their information needs, and to exceed all of our expectations for what was possible even with closed physical doors.

And yet our reliance on and celebration of technology in this moment has also placed a critical spotlight on the devastating impact of digital poverty on those who continue to lack access, and by extension a spotlight on our privilege. In her parting words to you in the final issue of ITAL as a LITA journal (https://doi.org/10.6017/ital.v39i3.12687), Evviva Weinraub Lajoie, the last President of LITA, wrote:

We may have always known that inequities existed, that the system was structured to make sure that some folks were never able to get access to the better goods and services, but for many, this pandemic is the first time we have had those systemic inequities held up to our noses and been asked, "what are you going to do to change this?" Balancing those priorities will require us to lean on our professional networks and organizations to be more and to do more.
I believe that together, we can make Core stand up to that challenge.

I believe we will do this, too, and with a spirit of reinvention that is guided by principles and values that don't just inspire membership but also improve our professional lives and experience in tangible ways. It was a privilege to have served as the final President of ALCTS, and it is a humbling and daunting responsibility to now transition into serving as Core's first. It is a responsibility I do not take lightly, particularly in this moment when so much is demanded of us. As we strive for equity and inclusion, we do so knowing that we are only as strong as every member's ability to bring their whole selves to this work. We must work together to make our professional home everything we need it to be and to help those who need us. It is yours, it is theirs, it is ours.

13051 ---- Letter from the Editor: Farewell 2020 LETTER FROM THE EDITOR Farewell 2020 Kenneth J. Varnum INFORMATION TECHNOLOGY AND LIBRARIES | DECEMBER 2020 https://doi.org/10.6017/ital.v39i4.13051

I don't think I've ever been so ready to see a year in the rear-view mirror as I am with 2020. This year is one I'd just as soon not repeat, although I nurture a small flame of hope. Hope that as a society what we have experienced this year will exert a positive influence on the future. Hope that we recall the critical importance of facts and evidence. Hope that we don't drop the effort to be better members of our local, national, and global communities and treat everyone equitably. Hope that as a global populace we continue to get into "good trouble" and push back against institutionalized policies and practices of racism and discrimination and strive to be better.

Despite the myriad challenges this year has brought, it is welcome to see so many libraries continuing to serve their communities, adapting to pandemic restrictions, and providing new and modified access to books and digital information. And equally gratifying, from my perspective as ITAL's editor, is that so many library technologists continue to generously share what they have learned through submissions to this journal.

Along those lines, I'm extending my annual invitation to our public library colleagues to propose a contribution to our quarterly column, "Public Libraries Leading the Way" (https://ejournals.bc.edu/index.php/ital/pllw). Items in this series highlight a technology-based innovation from a public library perspective. Topics we are interested in could include any way that technologies have helped you provide or innovate service to your communities during the pandemic, but could touch on any novel, interesting, or promising use of technology in a public library setting. Columns should be in the 1,000-1,500 word range and may include illustrations. These are not intended to be research articles. Rather, Public Libraries Leading the Way columns are meant to share practical experience with technology development or uses within the library. If you are interested in contributing a column, please submit a brief summary of your idea (https://docs.google.com/forms/d/e/1FAIpQLSd7c0-g-LxeTkJ2uKJoKD7OYT-VPrTOizdm1Fs8XuHKotCtug/viewform).

Wishing you the best for 2021, Kenneth J.
Varnum, Editor (varnum@umich.edu), December 2020

5704 ---- Fulfill Your Digital Preservation Goals with a Budget Studio Yongli Zhou INFORMATION TECHNOLOGY AND LIBRARIES | MARCH 2016 https://doi.org/10.6017/ital.v35i1.5704 Yongli Zhou (yongli.zhou@colostate.edu) is Digital Repositories Librarian, Colorado State University Libraries, Fort Collins, Colorado.

ABSTRACT

To fulfill digital preservation goals, many institutions use high-end scanners for in-house scanning of historical print and oversize materials. However, high-end scanner prices do not fit in many small institutions' budgets. As digital single-lens reflex (DSLR) camera technologies advance and camera prices drop quickly, a budget photography studio can help to achieve institutions' preservation goals. This paper compares images delivered by a high-end overhead scanner and a consumer-level DSLR camera, discusses pros and cons of using each method, demonstrates how to set up a cost-efficient shooting studio, and presents a budget estimate for a studio.

INTRODUCTION

Colorado State University Libraries (CSUL) are regularly engaged in a variety of digitization projects. Materials for some projects are digitized in-house, while items from selected projects are sometimes outsourced. Most fragile materials that require professional handling are digitized in-house using an expensive overhead scanner. However, the overhead scanner has been occasionally unstable since it was purchased, and this has delayed some of our digitization projects. As digital photography technologies advance, image quality delivered by digital single-lens reflex (DSLR) cameras is improving, and camera prices have dropped to an affordable level. In this paper, I will compare images produced by a scanner and a camera side by side, list pros and cons of using each method, illustrate how to establish a shooting studio, and present a budget estimate for that studio.

LITERATURE REVIEW

There are many online guidelines and manuals for digitizing print materials, and some universities and museums have information about their digitization equipment online. Most articles focus on either high-end scanners or customized scanning stations. These articles are very helpful for universities and museums that are relatively well funded. However, there is almost no literature discussing how to use inexpensive digital cameras and photography equipment to produce high-quality digitized images. This article uses a case study to show that a low-budget studio can produce high-quality digitized images.

COMPARISON OF SCANNED AND PHOTOGRAPHED IMAGES

The test camera set was chosen because it was the one the author used for general purposes. The camera is also chosen by many professional photographers because of its quality and affordability. To avoid dispute, the overhead scanner's make and model are not revealed.
Test Equipment

Budget Studio:
• Nikon D800
• Nikon AF Micro-Nikkor 60mm f/2.8D lens
• Manfrotto 055CXPRO3 3-section carbon fiber tripod legs
• Really Right Stuff BH-40 LR II ballhead
• Nonreflective glass
• Book cradles
• X-Rite Original ColorChecker Card
• Natural daylight
• Total cost: $4,500, with no maintenance fees (priced in 2014)

Overhead Scanner:
• Our overhead scanner
• Nonreflective glass
• Book cradles
• Purchase price: $55,000 (purchased in 2007)
• $8,000 annual maintenance (2013 price)

Table 1. Test Equipment

Focus and Sharpness

A quality digitized image needs to have good focus. A well-focused image shows details better and can produce better optical character recognition (OCR) results for text-based documents. At CSUL, we have no control over the automatic focus on our overhead scanner and have noticed that sometimes one page is sharply focused but the next page is slightly out of focus. During the scanning process, our overhead scanner does not indicate whether a shot is focused or not. A DSLR camera can beep or display a flashing dot in the viewfinder when in focus.

Illustration

The following two figures compare images produced by our test DSLR and overhead scanner. Both images are unretouched originals that have not been enhanced by software. In addition to this image, we tested nine other illustrations. Following our comparison study, we concluded that a semiprofessional DSLR camera produces sharper images than our expensive overhead scanner. In figure 1, at 100 percent zoom, the left image has better focus, contains more detail, and has colors closer to the original. The left image was taken using a Nikon D800 and Nikkor 60mm macro lens under natural lighting. The right image was produced by our overhead scanner. In figure 2, at 200 percent zoom, the left image (taken using the DSLR) shows much more detail than the image on the right (taken with the overhead scanner).

Figure 1. Comparative Images from DSLR (Left) and Overhead Scanner (Right), at 100 Percent Zoom. Image from Samuel M. Janney, The Life of William Penn; with Selections from His Correspondence and Auto-Biography (Philadelphia: Hogan Perkins & CO, 1852), plate between pages 296 and 297.

Figure 2. Comparative Images from DSLR (Left) and Overhead Scanner (Right), at 200 Percent Zoom. Image from Samuel M. Janney, The Life of William Penn; with Selections from His Correspondence and Auto-Biography (Philadelphia: Hogan Perkins & CO, 1852), frontispiece, print.

At CSUL, the process of digitizing a text document includes scanning pages, converting them into Portable Document Format (PDF) files, and applying an OCR process. In general, a well-focused image of text produces better OCR results, although software such as Adobe Acrobat can tolerate fuzzy images and produce reasonably accurate OCR text. Our OCR tests on a slightly out-of-focus image and a well-focused image showed no significant difference; however, from preservation and usability standpoints, we prefer well-focused images.

Figure 3. The left image was produced by our test DSLR camera and has better focus. The right image was produced by our overhead scanner. Samuel M. Janney, The Life of William Penn; with Selections from His Correspondence and Auto-Biography (Philadelphia: Hogan Perkins & CO, 1852), 300, print.

Figure 4. We ran the OCR process on the above two images. The top image was produced by our test DSLR camera and the bottom image was produced by our overhead scanner. Samuel M. Janney, The Life of William Penn; with Selections from His Correspondence and Auto-Biography (Philadelphia: Hogan Perkins & CO, 1852), 300, print.

Generated from the Image by Camera:
" On one or two points of high importance, he had notions more correct than were, in his day, common, even among men of e1~larged minds, and he had the rare good fortune of being able to carry his theories into practice without any compromise." Yet, "he was not a man of stron sense."

Generated from the Image by Scanner:
" On one or two points of high importance, he bad notions more correct than were, in his day, common, even arnong men of e1~larged minds, and he had the rare good fortune of being able to carry his theories into practice without any compromise." Yet, "he was not a man of strong sense."

Table 2. OCR Results Comparison (the OCR errors are reproduced verbatim)

These test results are very close because of the forgiveness of the Adobe Acrobat software. However, we have seen that for some other pages, a better-focused image generates improved OCR results.
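Readers who want to rerun this kind of comparison outside Acrobat can do so with the open-source Tesseract engine. The following is a minimal sketch, not the Acrobat workflow used at CSUL; it assumes the pytesseract wrapper and Pillow are installed, a Tesseract binary is on the system path, and the two page images use the hypothetical file names camera.tif and scanner.tif.

# A minimal sketch: OCR the same page captured two different ways.
# Assumes: pip install pillow pytesseract, plus a local Tesseract install.
# The file names are placeholders, not files from this article.
from PIL import Image
import pytesseract

for name in ("camera.tif", "scanner.tif"):
    text = pytesseract.image_to_string(Image.open(name))
    print(f"--- OCR output for {name} ---")
    print(text)

Diffing the two outputs (for example with Python's difflib module) makes the kinds of single-character errors shown in table 2 easy to spot.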
Photograph

A 6.5 inches by 4.5 inches silver print was used for this test. Our tests show that the test DSLR camera produced a sharper image of this historic photograph.

Figure 5. Tested 6.5 inches by 4.5 inches Photograph. The red square indicates the enlarged area for figure 6. Historical photograph from Colorado State University Archives and Special Collections.

Figure 6. Screen View at 100 Percent Zoom of a Silver Print. The top image was produced by the test DSLR camera and the bottom one was produced by our overhead scanner. Historical photograph from Colorado State University Archives and Special Collections.

Oversize Materials

For oversized materials, overhead scanners and DSLR cameras both have drawbacks, so we do not think either option is ideal. Our library uses a map scanner to scan oversize maps and posters. However, a map scanner is expensive and may not fit many libraries' budgets. A map scanner also is not suitable for fragile maps or posters.

Our overhead scanner's maximum scanning area is 24 inches by 17 inches, and the test map's size is 25 inches by 26 inches. We had to scan the map in four sections and stitch them together using Adobe Photoshop. Each section image has a file size of 313 MB. Because of the large file sizes, the stitching process is extremely slow. Stitching images is also not recommended because there is always some degree of mismatch error created by lens distortion (see the sketch after figure 7).

A camera can capture any material size, but the detail in the photographed image diminishes as the material's size increases. The photo of the entire map taken by our test DSLR has a file size of 35.8 MB. The image produced by the camera has a lower resolution and less detail.

Figure 7. Oversized Materials Screen View at 100 Percent Zoom. The top image was photographed by the test DSLR. The bottom image was scanned by our overhead scanner. Historical map from Colorado State University Archives and Special Collections.
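When stitching cannot be avoided, automated tools can at least reduce the manual effort, although they cannot remove the underlying lens-distortion mismatch. Below is a minimal sketch using OpenCV's scan-mode stitcher instead of Photoshop; this is an alternative we did not test on this map, and the file names are placeholders.

# A minimal sketch: stitch four scanned map sections with OpenCV.
# Assumes: pip install opencv-python; file names are placeholders.
import cv2

sections = [cv2.imread(f"map_section_{i}.tif") for i in range(1, 5)]
stitcher = cv2.Stitcher_create(cv2.Stitcher_SCANS)  # SCANS mode suits flat originals
status, panorama = stitcher.stitch(sections)
if status == cv2.Stitcher_OK:
    cv2.imwrite("map_stitched.tif", panorama)
else:
    print("Stitching failed; the sections may need more overlap.")

Note that 313 MB sections may need to be downsampled first; memory use is another practical obstacle to the stitching approach.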
Small Prints

One big advantage of a DSLR camera is that it can be set farther away to take pictures of oversized materials or moved very close to smaller objects to take close-up pictures. Comparatively, the distance between the lens and the scanning platform on our overhead scanner is fixed, so no close-up images can be produced, and everything is reproduced at a scale of 1:1. For the following example, we used a 5.5 inches by 3.5 inches drawing as our test subject.

Figure 8. A 5.5 inches by 3.5 inches Fine Drawing. A historical booklet from Colorado State University Archives and Special Collections.

Figure 9. Small Prints Screen View at 100 Percent Zoom. The left image was produced by a DSLR with a macro lens and the right image was scanned by our overhead scanner. A historical booklet from Colorado State University Archives and Special Collections.

The image produced by our overhead scanner has a resolution of 3,427 pixels by 2,103 pixels. The camera produces a 6,776 pixels by 4,240 pixels image. The higher pixel count allows users to see more detail at the same zoom level (the short calculation after figure 11 converts these pixel counts into effective capture resolutions). The image produced by the camera is not only sharper but also contains more detail. It is also good for making enlarged prints for promotional materials.

For smaller maps, a DSLR camera also produces superior images. For the following sample, we tested a 15 inches by 9.5 inches map.

Figure 10. A 15 inches by 9.5 inches Map. The blue square indicates the enlarged area for figure 11. Historical map from Colorado State University Archives and Special Collections.

Figure 11. Small Map Screen Views at 100 Percent Zoom. The left image was photographed by a DSLR camera with a macro lens and the right image was produced by our overhead scanner. Historical map from Colorado State University Archives and Special Collections.
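Those pixel counts translate directly into effective capture resolution. Here is a short worked example for the 5.5 inches by 3.5 inches drawing, assuming each capture fills the frame along the long edge:

# Effective capture resolution (pixels per inch) for the 5.5 x 3.5 in. drawing.
long_edge_inches = 5.5
scanner_ppi = 3427 / long_edge_inches  # about 623 PPI, fixed by 1:1 reproduction
camera_ppi = 6776 / long_edge_inches   # about 1,232 PPI with the macro lens close up
print(f"Scanner: {scanner_ppi:.0f} PPI; Camera: {camera_ppi:.0f} PPI")

In other words, moving the camera closer roughly doubles the sampling rate on a small original, which is exactly what figures 9 and 11 show on screen.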
Post-Processing

Use of a Sharpening Filter

Our tests showed that a main drawback of our overhead scanner is that the images it produces are out of focus. Some digitization guidelines recommend minor post-processing of delivered image files to improve image quality. One might argue that sharpening could be applied to fix our overhead scanner's out-of-focus problem. Technical Guidelines for Digitizing Cultural Heritage Materials: Creation of Raster Image Master Files recommends doing minor post-scan adjustment to optimize image quality and bring all images to a common rendition.1 This is good advice, but it is hard to follow in real-world practice. To get the best result, each image would need to be evaluated and have a sharpening filter applied separately, because an improper sharpening setting often creates haloing artifacts and an unnatural look. Applying a sharpening filter image by image would be extremely time-consuming.

The haloing artifact resembles the chromatic aberration (CA) effect: unsightly color fringes near high-contrast edges. True CA originates in the lens, but oversharpening introduces similar fringes in software. Either way, the fringes are typically only visible when viewing the image on screen at higher zoom levels or on large prints.

The following example shows that the fringing may not appear at lower zoom levels, such as 50 percent or 100 percent. The left image has no sharpening filter applied and the right image has a sharpening filter applied. At 100 percent zoom, the fringing is barely identifiable, and the right image appears to be superior in terms of sharpness.

Figure 12. Sharpening Filter Comparison Sample at 100 Percent Zoom. The left image has no sharpening filter applied and the right image has had a sharpening filter applied. Historical map from Colorado State University Archives and Special Collections.

At a higher zoom level, the fringing is clearly visible in the right image of figure 13. The extra colors are introduced by the software.

Figure 13. Sharpening Filter Comparison at 500 Percent Zoom. The left image has no sharpening filter applied and the right image has had a sharpening filter applied. Historical map from Colorado State University Archives and Special Collections.

We recommend not applying sharpening filters to original scanned images; instead, attempt to obtain well-focused images from the beginning. For this reason, the test DSLR camera outperformed our overhead scanner for most materials.
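The trade-off is easy to reproduce. The following minimal sketch uses Pillow's unsharp-mask filter to write a conservative and an aggressive version of the same scan for side-by-side inspection; the file name and the filter settings are illustrative placeholders, not recommended production values.

# A minimal sketch: mild vs. aggressive unsharp masking with Pillow.
# Assumes: pip install pillow; the file name is a placeholder.
from PIL import Image, ImageFilter

scan = Image.open("scan.tif")
mild = scan.filter(ImageFilter.UnsharpMask(radius=1, percent=80, threshold=3))
harsh = scan.filter(ImageFilter.UnsharpMask(radius=5, percent=400, threshold=0))
mild.save("scan_mild.tif")
harsh.save("scan_harsh.tif")  # inspect edges at 500 percent zoom for halos

Viewed at 500 percent zoom, the aggressive version shows the same kind of edge halos as the right image in figure 13.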
Color Balance

Have you seen a scanned color image or color photograph with colors very different from the original? For example, a white area appears bluish or has an orange cast. When scanning or photographing an image under different lighting, the output image can have very different colors. In the following figure, the left image was shot at a correct white balance (WB) setting. WB is the process of removing unrealistic color casts so that objects that appear white in person are rendered white in your photo.2 The center image has a blue color cast, which was caused by a lower Kelvin setting, and the right image was shot at a higher Kelvin setting. A camera may create images with the wrong colors, but so will a scanner if it is not calibrated correctly.

Figure 14. Images Shot under Different White Balance Settings.

We pay an $8,000 annual service fee for overhead scanner maintenance, which includes scanner color calibration. In general, image colors rendered by the machine are close to the original colors but not exact. We have noticed that some images have a very light green cast and others are overly yellow; sometimes images appear darker than they should be. Because we are not certified to calibrate the overhead scanner, we only use the prescribed settings set by technicians. Also, we have no control over maintaining a fading light bulb, which will affect correct exposure.

WB adjustment on photographs taken in a studio can be very precise. Most DSLRs contain a variety of white balance presets. In general, auto WB works well but does not deliver the best results. Custom WB allows fine-tuning of colors. If a shooting studio is set up properly, the lighting should be consistent, so ideally the one setting found most desirable could be used repeatedly. However, professional photographers do test shots at the beginning of each shooting session. Once they find the optimal test shot, they use those exact settings for the batch. Later, they do minor color adjustment on the chosen test shot to ensure precise color representation, and then apply the adjustment settings to all other photos of the same batch. Because many small variations can be present in each shooting session, they do not reuse the settings from the previous shooting. It may seem arduous to do test shots for each shooting, but it ensures accurate color reproduction.

Many professional photographers use ColorChecker Passport,3 a commercial product that helps with quick and easy capture of accurate colors. I will briefly demonstrate a useful trick I learned at a professional photography seminar: how to use ColorChecker Passport to apply correct white balance to a group of images.4

Step 1: Place an 18 percent gray card or a ColorChecker Passport card on top of a page. Choose the correct exposure and take the photo. Use the same exposure setting to take additional photos. For demonstration purposes, we deliberately used very low and very high Kelvin settings for the sample images. The low Kelvin setting created cool, blue tones, and the high Kelvin setting created a tone that was too warm. Note that the test shot with the ColorChecker card was not taken with exactly the correct white balance setting.

Figure 15. Sample Images for White Balance Adjustment. Rocky Mountain Collegian 3–4 (1893), 118, Colorado State University Archives and Special Collections.

Step 2: In Adobe Lightroom, select the test target image and switch to "Develop" mode. Select the White Balance tool, move the cursor over a gray area, and try to find a spot where the red, green, and blue (RGB) values are close; a spot with equal RGB values is ideal. A single click there sets the test image's white balance almost perfectly.

Figure 16. Applying a White Balance in Adobe Lightroom 4.

Step 3: Synchronize the other images' settings with the target image. Select the target image and all other images, click the Sync button, and select the settings you would like to synchronize. Make sure the WB button is checked.

Figure 17. Synchronize Settings in Adobe Lightroom 4.

Figure 18. Synchronized Images with Correct White Balance. Rocky Mountain Collegian 3–4 (1893), 118, Colorado State University Archives and Special Collections.

Recently, I had the opportunity to visit the Spencer Museum of Art's digitization lab. They have a different workflow to ensure even more scientifically correct colors. If you are interested in their approach, you can contact their information technology manager or photographer.
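The gray-card correction in steps 1-3 can also be computed directly, outside Lightroom. The following is a minimal sketch, not the Lightroom workflow above; it assumes a RAW frame has already been decoded to a linear floating-point RGB array (for example with the rawpy library), and the gray-patch coordinates are placeholders.

# A minimal sketch: derive white-balance gains from a gray-card patch.
# Assumes a linear RGB image loaded as a float NumPy array in [0, 1];
# the patch coordinates below are placeholders.
import numpy as np

def gray_card_gains(image, top, left, size=50):
    # Average the gray patch, then scale red and blue to match green.
    patch = image[top:top + size, left:left + size, :].reshape(-1, 3)
    r, g, b = patch.mean(axis=0)
    return np.array([g / r, 1.0, g / b])

def apply_white_balance(image, gains):
    return np.clip(image * gains, 0.0, 1.0)

# balanced = apply_white_balance(linear_rgb, gray_card_gains(linear_rgb, 100, 100))

This is essentially what the one-click White Balance tool does: it finds per-channel multipliers that make a known-neutral area neutral.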
Color Space

One very important thing to understand when you use a DSLR camera is color space. Many DSLR cameras support Adobe RGB and sRGB. sRGB reflects the characteristics of the average cathode ray tube (CRT) display. This standard space is endorsed by many hardware and software manufacturers, and it has become the default color space for many scanners, low-end printers, and software applications. It is the ideal space for web work but is not recommended for prepress work because of its limited color gamut.

Adobe RGB (1998) was designed to encompass most of the colors achievable on CMYK printers, but only by using RGB primary colors on a device such as your computer display.5 This color space is recommended if you need to do print production work with a broad range of colors. Many scanning vendors deliver images in the Adobe RGB color space.

ProPhoto RGB contains all the colors that are in Adobe RGB, and Adobe RGB contains nearly every color that is in sRGB. ProPhoto RGB covers more colors than the human eye can see. It should only be used for images captured in RAW format and edited in 16-bit mode. Common file formats that support 16-bit images are TIFF and PSD. Most printers do not support 16-bit formats. This color space is normally used by photographers who have a specific workflow and who print on specific high-end inkjet printers. When converting from 16-bit to 8-bit, some images will have banding or posterization problems. Banding is a digital imaging artifact; a picture with a banding problem shows abrupt horizontal or vertical bands where a smooth gradient should be.

Figure 19. An Example of Colour Banding, Visible in the Sky in This Photograph.6

Posterization of an image entails conversion of a continuous gradation of tone to several regions of fewer tones, with abrupt changes from one tone to another.7

Figure 20. An Example of Posterization.8

While it is a good idea to capture images using Adobe RGB to preserve a wide range of colors, you should convert images to sRGB when delivering them to unknown users or displaying them on the web. Currently, sRGB is the only appropriate choice for images uploaded to the web, since most web browsers don't support any color management. Adobe RGB images that are uploaded to websites without conversion to sRGB generally appear dark and muted.9 If they are printed on printers that do not support Adobe RGB, colors will be dull too.
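That conversion can be scripted for a whole batch rather than performed image by image. Here is a minimal sketch using Pillow's ImageCms module; it assumes a local copy of the Adobe RGB (1998) ICC profile is available, and the profile path and file names are placeholders.

# A minimal sketch: batch-convert Adobe RGB TIFFs to sRGB for web delivery.
# Assumes: pip install pillow, plus a local Adobe RGB (1998) ICC profile file;
# the profile path and file names are placeholders.
from PIL import Image, ImageCms

adobe_rgb = ImageCms.getOpenProfile("AdobeRGB1998.icc")
srgb = ImageCms.createProfile("sRGB")

for name in ("page001.tif", "page002.tif"):
    img = Image.open(name)
    converted = ImageCms.profileToProfile(img, adobe_rgb, srgb, outputMode="RGB")
    converted.save(name.replace(".tif", "_srgb.tif"))

Keep the Adobe RGB masters for preservation and print work; the sRGB derivatives are only for web delivery.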
SETTING UP A BUDGET STUDIO

Commercial Approach

BookDrive Pro is a commercially available digitization unit. It uses two digital cameras and built-in flash lights. It may be the optimal solution for your projects, but it also may not fit your library's budget. The unit is also not suitable for oversized materials such as large maps and posters. For more information about this product, please visit http://pro.atiz.com/.

Sample Budget Studio Setup

A digitization lab can have three rooms or areas: one for oversized materials, one for smaller prints or 3-D objects, and one for computers. The area for shooting oversized materials should have black walls and floor. You can either use one flash light to bounce light off the ceiling or use two flash lights to shine light directly onto the materials. For fragile materials, the first approach is more appropriate. The area for shooting smaller prints or 3-D objects should have a stable table and black or white background paper. For this room or area, black walls and floor are not required. For shooting equipment, I will use the set chosen by the photographer from the University of Kansas Spencer Museum of Art as my example. Each item below lists the item name, a sample product, its price, and a purchasing URL.

• DSLR camera: Nikon D810, $2,996.95 (http://www.bhphotovideo.com/c/search?atclk=Camera+Model_Nikon+D810&ci=6222&N=4288586280+3907353607)
• Macro lens: Nikon AF Micro-Nikkor 60mm f/2.8D Lens, $429.00 (http://www.bhphotovideo.com/c/product/66987-GREY/Nikon_1987_AF_Micro_Nikkor_60mm_f_2_8D.html)
• Heavy-duty mono stand: Arkay 6JRCW Mono Stand Jr with Counter Weight, 6', $678.50 (http://www.bhphotovideo.com/c/product/2727-REG/Arkay_605138_6JRCW_Mono_Stand_Jr.html)
• Strobe: Broncolor G2 Pulso, 1,600 watt/second focusing lamphead with 16' cord, $3,053.68 (http://www.bhphotovideo.com/c/product/259745-REG/Broncolor_32_115_07_G2_Pulso_with_16.html)
• Power pack: Broncolor Senso A4 2,400W/s Power Pack, $3,629.92 (http://www.bhphotovideo.com/c/product/745060-REG/Broncolor_31_051_07_Senso_A4_2_400W_s_Power.html)
• Reflector: Broncolor P65 Reflector, 65 degrees, 11" diameter, for Broncolor Pulso 8, Twin, and HMI, $513.52 (http://www.bhphotovideo.com/c/product/7162-REG/Broncolor_33_106_00_P65_Reflector_65_Degrees.html)
• Reflector: Broncolor Softlight Reflector, 20" diameter, for Broncolor Primo, Pulso 2/4 & HMI heads, $501.76 (http://www.bhphotovideo.com/c/product/7167-REG/Broncolor_33_110_00_Softlight_Reflector_20_for.html)
• Light stand: Impact Air-Cushioned Light Stand, $44.99 (http://www.bhphotovideo.com/c/product/253067-REG/Impact_LS10AB_Air_Cushioned_Light_Stand.html)
• Light meter: Sekonic L-308S Flashmate digital incident, reflected, and flash light meter, $199.00 (http://www.bhphotovideo.com/c/product/368226-REG/Sekonic_401_309_L_308S_Flashmate_Light_Meter.html)
• Book cradle: Book Exhibition Cradles, $30.00 (http://www.universityproducts.com/cart.php?m=product_list&c=1115&primary=1&parentId=1271&navTree[]=1115)
• Background paper: Savage Seamless Background Paper (both white and black), $45.00 x 2 = $90.00 (http://www.bhphotovideo.com/c/product/45468-REG/Savage_1_12_107_x_12yds_Background.html)
• Nonreflective glass: 1/4" Optiwhite Starphire purified tempered single lite clear glass, $75.00 (can be purchased at a local glass store)
• White balancing accessory: X-Rite Original ColorChecker Card, $69.00 (http://www.bhphotovideo.com/c/product/465286-REG/X_Rite_MSCCC_Original_ColorChecker_Card.html)
• Software: Adobe Lightroom 5, $150.00 (http://www.adobe.com/products/photoshop-lightroom.html)

Table 3. List of Items Needed to Prepare for a Budget Studio

The total cost for a "budget" shooting studio ranges from $10,000 to $15,000, and there is no annual maintenance expense.
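The quoted range is easy to verify from the list itself:

# Sanity check: sum the Table 3 prices against the quoted $10,000-$15,000 range.
prices = [2996.95, 429.00, 678.50, 3053.68, 3629.92, 513.52, 501.76,
          44.99, 199.00, 30.00, 90.00, 75.00, 69.00, 150.00]
print(f"Total: ${sum(prices):,.2f}")  # prints Total: $12,461.32

The listed configuration comes to $12,461.32, comfortably inside the stated range, with the camera, strobe, and power pack accounting for most of the cost.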
Figure 21. The University of Kansas Spencer Museum of Art Digitization Lab Setup for Oversized Materials

Figure 22. Steelworks Museum of Industry and Culture's Digitization Lab Setup for Oversized Materials

Figure 23. The University of Kansas Spencer Museum of Art Digitization Lab Setup for Smaller Prints and 3-D Objects

Figure 24. Steelworks Center of the West's Digitization Lab Setup for 3-D Objects

Functions of Some Elements in the Sample Shooting Studio

1. Macro lens: It allows close-up shooting of objects. It is especially useful when photographing small prints and small 3-D objects, and it can also be used to photograph regular and oversized materials.
2. Heavy-duty mono stand: It replaces a traditional tripod. It is very stable and allows quick adjustment of camera height and location.
3. Strobe, power pack, and reflector: Together they generate consistent and homogeneous light distribution. Recommended further reading: "Introduction to Off-Camera Flash: Three Main Choices in Strobe Lighting."10
4. Light stand: It holds the strobe and reflector.
5. Light meter: Hand-held exposure meters measure the light falling onto a light-sensitive cell and convert it into a reading that enables the correct shutter speed and/or lens aperture settings to be made.11
6. Book cradles: They help to minimize the stress on bookbindings and reduce page curvature.
7. Nonreflective glass: It helps to flatten a photographed page and reduce reflection, although it does not eliminate glass reflection completely. One very useful trick for reducing glass reflection is to place a black board with a hole in it above the page and shoot through the hole. This approach does not actually eliminate the reflection; instead, the glass reflects the black board, so when the photograph is reviewed on a computer it appears as if no reflection occurred.

Figure 25. The University of Kansas Spencer Museum of Art Digitization Lab Setup for Materials That Need to Be Pressed Down by Glass

Many librarians believe that digitizing print materials using a digital camera requires a professional photographer, but this is not necessarily true. A professional photographer or even an art student can act as a consultant to help set up a shooting studio and provide basic training. Also, many museums have professional photographers and have set up shooting studios for digitization; they are often very willing to share their experience and even provide training. I believe the learning curve for operating a shooting studio is no greater than that for operating an overhead scanner and its software.

PROS AND CONS

No digitization equipment or system is perfect. Each involves trade-offs in image quality, speed, convenience of use, quality of accompanying software, and cost. Our tests show that for most archival materials a DSLR camera will do a better job than an overhead scanner.

Pros of Overhead Scanner

• The scanner is a complete scanning station. It can be connected to a computer and start scanning immediately.
• Materials can be placed directly on the scanning surface, so no equipment adjustments are required while scanning.
• It can scan and save images in bitmap format directly, while a DSLR camera can only shoot in grayscale or color.
• Built-in book cradles help to scan thick books and those that cannot be fully opened.
• Book curve correction functionality is provided by the accompanying software.

Cons of Overhead Scanner

• High cost. The overhead scanner we have cost more than $50,000, with an annual maintenance contract of $8,000.
• High replacement cost. When a scanner is outdated or broken, the entire machine has to be replaced.
• Instability. Our overhead scanner is unstable even when placed on a sturdy table and handled only by professionals. From April 2010 to October 2010, the scanner was down for a total of forty-two working days (sixty calendar days). The company fixed the machine onsite many times, but it continues to have minor problems and has not been completely reliable.
• The autofocus feature does not work consistently.
• Special training is needed to operate the machine and associated software.
• The file formats supported are limited. Most scanners support only TIFF, JPEG, JPEG 2000, Windows BMP, and PNG.
• Unsupported, outdated software. Our overhead scanner's software can only be run on an older operating system (Windows XP) because there is no updated software for this model.

Pros of Budget Studio

• Stable. Under normal use, DSLR cameras are much less likely to break down than scanners. For example, I have had an older DSLR, a Nikon D200, for seven years. It has survived numerous backpacking trips, multiple drops, and extreme weather conditions, and it still functions as needed.
• Fast and accurate focus. DSLR cameras are designed to focus quickly, and their focus indicators provide instant feedback so operators know the image is in focus. If operated properly, DSLR cameras can deliver sharper images than scanners.
• Less expensive. A good-quality DSLR camera and a lens can be purchased for less than $4,000 and will last for years. As technology advances, DSLR camera prices will continue to drop.
• Ability to save files in more formats. In addition to TIFF and JPEG, most DSLR cameras can save photos in a RAW file format. Some cameras can save images directly in Digital Negative (DNG) format, and others deliver images in proprietary formats that can be converted into DNG using a computer program. Editing RAW images is nondestructive, while editing TIFF and JPEG images is irreversible.
• Accurate WB and exposure. With the right shooting and post-processing techniques, photographs can achieve exact color reproduction. By contrast, calibrating an overhead scanner can most likely be performed only by the company's trained technician, and proper exposure and WB are not guaranteed.
• The RAW file format usually provides more dynamic range. Overexposed and underexposed images can be fixed by adjusting exposure compensation in software; thus lost shadow or highlight detail can be restored.
• Can photograph 3-D objects. Archival collections often contain materials other than books, such as art pieces. These materials are better photographed than scanned.
• Versatile. Cameras can perform on-site digitization, while overhead scanners are too bulky to be moved around.
• Faster and better preview.
Images can be viewed instantly on a computer when proper software, such as Adobe Lightroom, is used. Operators can compare multiple shots side by side on a screen and decide which photo to retain.
• More accessible technical support. There are far more DSLR camera users than overhead scanner users, so technical questions can often be answered through online forums.
• Easy to find replacement parts. When a piece of shooting-studio equipment breaks down, staff can easily find a replacement and install it themselves.
• Easy software updates. The software used in a studio is independent of the equipment.

Cons of Budget Studio

• There is a learning curve for setting up a shooting studio, operating it, and mastering new image-processing techniques.
• A DSLR camera with a lower pixel count will not be sufficient for scanning large-format materials, such as posters and maps.
• No built-in book curve correction is provided by Adobe Photoshop or Lightroom. However, our experience shows that automatic book curve correction does not always work well anyway. We normally use a homemade book cradle to help lay a page flat and use one or two weights to hold down the other side of the book. For some books, if flatness is hard to achieve, we place a piece of glass on top to ensure flatness.
• Security concern: since a DSLR camera is highly portable, it can be stolen easily.

Figure 26. Scanning Setup Using a Book Cradle

CONCLUSION

The technology of DSLR cameras has advanced very quickly in the past ten years. Newer DSLR cameras can handle higher resolutions and produce very little image noise even at high ISO settings. The higher demand for DSLR cameras and accompanying image-editing software results in more rapid technological advances compared to low-demand, high-end overhead scanners. High consumer demand also drives DSLR camera prices much lower than prices for overhead scanners. In addition, the wide range of consumers purchasing DSLR cameras and software prompts companies to offer more user-friendly interfaces. As our tests show, for most library materials a DSLR camera can produce superior images. If you do not have a budget for a high-end overhead scanner, you can still fulfill your digitization preservation goals with a budget studio.

ACKNOWLEDGEMENT

I would like to thank Robert Hickerson and Ryan Waggoner, the University of Kansas Spencer Museum of Art, Tim Hawkins, and Steelworks Center of the West for showing me their digitization labs and sharing their experience.

REFERENCES

1. Federal Agencies Digitization Guidelines Initiative, "Technical Guidelines for Digitizing Cultural Heritage Material: Creation of Raster Image Master Files," August 2010, http://www.digitizationguidelines.gov/guidelines/digitize-technical.html.
2. "Tutorials: White Balance," Cambridge in Colour, accessed March 9, 2016, http://www.cambridgeincolour.com/tutorials/white-balance.htm.
3. "ColorChecker Passport User Manual," X-Rite Incorporated, accessed March 9, 2016, http://www.xrite.com/documents/manuals/en/ColorCheckerPassport_User_Manual_en.pdf.
4. Scott Kelby, "Scott Kelby's Editing Essentials: How to Develop Your Photos," Pearson Education, Peachpit, accessed March 9, 2016, http://www.peachpit.com/articles/article.aspx?p=2117243&seqNum=3.
5. "sRGB vs. Adobe RGB 1998," Cambridge in Colour, accessed March 9, 2016, http://www.cambridgeincolour.com/tutorials/sRGB-AdobeRGB1998.htm.
6. "Colour Banding," Wikipedia, accessed March 9, 2016, http://en.wikipedia.org/wiki/Colour_banding.
7. "Posterization," Wikipedia, accessed March 9, 2016, http://en.wikipedia.org/wiki/Posterization.
8. "Image Posterization," Cambridge in Colour, accessed March 9, 2016, http://www.cambridgeincolour.com/tutorials/posterization.htm.
9. Richard Anderson and Peter Krogh, "Color Space and Color Profiles," American Society of Media Photographers, accessed March 9, 2016, http://dpbestflow.org/color/color-space-and-color-profiles.
10. Tony Roslund, "Introduction to Off-Camera Flash: Three Main Choices in Strobe Lighting," Fstoppers (blog), accessed March 9, 2016, https://fstoppers.com/originals/introduction-camera-flash-three-main-choices-strobe-lighting-40364.
11. "Introduction to Light Meters," B & H Foto & Electronics Corp., accessed March 9, 2016, http://www.bhphotovideo.com/find/Product_Resources/lightmeters1.jsp.

8652 ----

Identifying Key Steps for Developing Mobile Applications and Mobile Websites for Libraries

Devendra Dilip Potnis, Reynard Regenstreif-Harms, and Edwin Cortez

INFORMATION TECHNOLOGY AND LIBRARIES | SEPTEMBER 2016

ABSTRACT

Mobile applications and mobile websites (MAMW) represent information systems that are increasingly being developed by libraries to better serve their patrons. Because of a lack of in-house IT skills and the knowledge necessary to develop MAMW, a majority of libraries are forced to rely on external IT professionals who may or may not help libraries meet patron needs but instead may deplete libraries' scarce financial resources. This paper applies a system analysis and design perspective to analyze the experience and advice shared by librarians and IT professionals engaged in developing MAMW. This paper identifies key steps and precautions to take while developing MAMW for libraries. It also advises library and information science graduate programs to equip their students with the specific skills and knowledge needed to develop and implement MAMW.
INTRODUCTION

The unprecedented adoption and ongoing use of a variety of context-specific mobile technologies by diverse patron populations, the ubiquitous nature of mobile content, and the increasing demand for location-aware library services have forced libraries to "go mobile." Mobile applications and mobile websites (MAMW), the latter being web portals running on mobile devices, represent information systems that are increasingly being developed and used by libraries to better serve their patrons. However, a majority of libraries lack the in-house human resources necessary to develop MAMW. Because of a lack of staff equipped with the requisite IT skills and knowledge, libraries are often forced to partner with and rely on external IT professionals, potentially losing control over the process of developing MAMW.1 Partnerships with external IT professionals do not always help libraries meet the information needs of their patrons but instead can deplete their scarce financial resources. It then becomes necessary for librarians to understand the process of developing MAMW so that they can better evaluate MAMW and better serve library patrons.

Devendra Dilip Potnis (dpotnis@utk.edu) is Associate Professor, School of Information Sciences; Reynard Regenstreif-Harms (reynardrh@gmail.com) is Project Archives Technician, Great Smoky Mountains National Park, Gatlinburg, Tennessee; and Edwin Cortez (ecortez@utk.edu) is Professor, School of Information Sciences, University of Tennessee at Knoxville.

One possibility is for librarians to re-educate themselves through continuing education or other professional development activities. Another solution would be for library and information science (LIS) schools to strengthen their curricula in the management, evaluation, and application of MAMW and related emerging technologies. Issues, challenges, and strategies for providing librarians with these opportunities are abundant and have been debated for more than thirty years, especially since libraries started experiencing the impact of microchip and portable technologies.2

Any practical and immediate guidance could help librarians in charge of developing MAMW.3 However, a majority of the practical guidance available for developing MAMW for libraries is limited to specific settings or patron populations. In addition, this guidance is not theoretically validated, which curtails its generalizability to diverse library settings. For instance, a number of librarians and IT professionals share their experiences of developing MAMW to serve a specific patron population in a specific library setting.4,5 These accounts typically describe successes in developing MAMW, lessons learned during development, or advice for developing MAMW. This paper applies a system analysis and design perspective from the information systems discipline to examine the experience and advice shared by librarians and IT professionals and to identify the key steps and precautions to be taken when developing MAMW for libraries.
System analysis and design, a branch of the information systems discipline, is the most widely used theoretical knowledge base available for developing information systems.6 According to the system analysis and design perspective, planning, analysis, design, implementation, and maintenance are the phases of developing any information system.7 The next section describes our method for this secondary research. The following section discusses the key steps we identified for planning, analyzing, designing, implementing, and maintaining MAMW. The concluding section presents the implications of this study for libraries and LIS graduate programs.

METHOD

We began this study with a practitioner's handbook guiding libraries in the use of mobile technologies for delivering services to diverse patron populations.8 To search the literature relevant to our research, we devised many key phrases, including but not limited to "mobile technolog*," "mobile applications for libraries," and "mobile websites for libraries." As part of our active information-seeking process, we applied a snowball sampling technique to collect more than seventy-five scholarly research articles, handbooks, ALA library technology reports, and books hosted on the EBSCO and Information Science Source databases. Our passive information seeking was aided by article suggestions from Emerald Insight and Elsevier Science Direct, two of the most widely used journal hosting sites, in response to the journal articles we accessed there. We applied the following four criteria to establish the relevancy of publications to our research: accuracy of facts; date of publication (i.e., from 2000 to 2014); credibility of authors; and content focused on problems, solutions, advice, and tips for developing MAMW. Several research articles published by Information Technology and Libraries and Library Hi Tech, two top-tier journals covering the development of MAMW for libraries, built the foundation of this secondary research.

We analyzed the collected literature using the qualitative data presentation and analysis method proposed by Miles and Huberman.9 We developed Microsoft Excel summary sheets to code the experience and advice shared by librarians and IT professionals. The coded data was read repeatedly to identify and name patterns and themes. Each relevant publication was analyzed individually and then compared across subjects to identify patterns and common categories. The inter-coder reliability between the two authors who analyzed the data was 85 percent. The data analysis helped us identify the key steps needed for planning, analyzing, designing, implementing, and maintaining MAMW for libraries.
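For readers unfamiliar with simple percent agreement, the arithmetic behind a reliability figure of this kind is the share of coding decisions on which both coders assigned the same code. A minimal sketch in JavaScript; the function name and the sample codes are our own illustrative assumptions, not the authors' actual coding scheme:

```javascript
// Percent agreement: the share of items to which two coders
// assigned the same code. The codes below are invented examples.
function percentAgreement(coderA, coderB) {
  var matches = 0;
  for (var i = 0; i < coderA.length; i++) {
    if (coderA[i] === coderB[i]) {
      matches++;
    }
  }
  return (100 * matches) / coderA.length;
}

// Three agreements out of four coded items yields 75 percent;
// 17 agreements out of 20 items would yield the 85 percent reported above.
console.log(percentAgreement(
  ['planning', 'design', 'testing', 'training'],
  ['planning', 'design', 'testing', 'maintenance']
)); // 75
```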
FINDINGS AND DISCUSSION

Key Steps for Planning MAMW

Forming and Managing a Team

Building a team of people with the appropriate skills, knowledge, and experience is one of the first steps suggested by the existing literature for planning MAMW. It is essential for team members to be aware of new developments and trends in the market.10 For instance, developers should be aware of print resources on relevant technologies such as Apache, ASP, JavaScript, PHP, Ruby on Rails, and Python; online resources such as detectmobilebrowser.com and the W3C mobileOK Checker for testing catalogs, design functionality, and accessibility on mobile devices; and the various online communities of developers who can provide peer support when needed.11 Team members are also expected to keep up with new developments in mobile devices, platforms, operating systems, digital rights management terms and conditions, and emerging standards for content formats.12 Periodic delegation of various tasks could help libraries develop MAMW effectively.13 Libraries should also form productive, financially feasible partnerships with external stakeholders such as Internet service providers and network administrators for hosting MAMW on Internet servers that meet desired safety and security standards.14,15

Requirements Gathering

Requirements for developing MAMW can be collected through empirical research and secondary research. Typically, the goal of empirical research is to help libraries

• gather patron preferences for and expectations of MAMW,16,17
• stay abreast of the continual evolution of patron needs,18
• periodically (e.g., quarterly, annually, biannually, etc.) gather and evaluate user needs,19
• index the content of MAMW,20
• investigate patrons' acceptance of the library's use of MAMW,21 and
• understand user needs and identify the top library services requested by patrons.

Empirical research in the form of usability testing, functional validation, user surveys, etc., should be carried out before developing MAMW to inform the development process and/or after developing MAMW to study their adoption by library patrons. Empirical research typically involves identifying the patrons and other stakeholders who are going to be affected by MAMW. This step is followed by developing data-collection instruments, collecting data from patrons and other stakeholders, and analyzing the qualitative and quantitative data using appropriate techniques and software.22

Secondary research mainly focuses on scanning and assessing the existing literature. For instance, using appropriate datasets on mobile use, librarians may be able to identify the factors responsible for the adoption of mobile technologies.23 Typically, such factors include but are not limited to the cognitive, affective, social, and economic conditions of potential users. MAMW developers could also scan the environment by examining existing MAMW and reviewing the literature to create sets of guidelines for replacing old information systems with new, well-functioning MAMW.24 Librarians could also scan the market for free software options to conserve financial resources.25

Making Strategic Choices

Mobile Applications or Mobile Websites?

One of the most important strategic decisions libraries need to make during this phase is whether to use a mobile app or a mobile website—that is, a web portal running on mobile devices—for offering services to patrons.
Mobile websites are web browser-based applications that might direct mobile users to a different set of content pages, serve a single set of content to all patrons while using different style sheets or templates reformatted for desktop or mobile browsers, or use a site transcoder (a rule-based interpreter) that resides between a website and a web client and intercepts and reformats content in real time for a mobile device.26,27 Mobile apps are more challenging to build than mobile websites because they require separate, platform-specific programming for each operating system.28 Mobile apps also burden users and their devices: users are expected to remember the functionality of each menu item, and a significant amount of memory is required to store and support apps on mobile devices. However, potential profitability, better use of mobile-device functionality, and greater exposure through app stores can make mobile apps an economical option over mobile websites.29

Buy or Build?

In the planning phase, libraries also need to decide whether to buy commercial, off-the-shelf (COTS) MAMW or build customized MAMW. When making this choice, MAMW need to be evaluated in terms of customer support and service, maintenance, the ability to meet patron needs, and library needs.30 Sometimes libraries purchase COTS products and end up customizing them, benefiting from both options. For example, some libraries first purchase packaged mobile frameworks to create simple, static mobile websites and subsequently develop dynamic library apps specific to library services.31

Managing Scope

Many libraries have limited financial resources, which makes it necessary for their staff to manage the scope of MAMW development. Prioritizing tasks and identifying the mission-critical features of MAMW are some of the most common activities undertaken by libraries to manage this scope.32 For instance, it is not practical to make an entire library website mobile, because the library would then end up serving only those patrons who access its site over mobile alone. Instead, libraries should determine which parts of the website should go mobile. The growing trend of mobile-first design (designing the mobile version of a website first and then working up to a larger desktop version) could help librarians better manage the scope of MAMW development. Alternatively, Jeff Wisniewski, a leading web services librarian in the United States, advises libraries to create a new mobile-optimized homepage alone, which is faster than trying to retrofit the library's existing homepage for mobile.33 This advice is highly practical because no webmaster has any interest in trying to maintain two distinct versions of the library's webpages with details such as hours of operation and contact information. A simple script on the full homepage can then route phone-based visitors to the mobile-optimized page, as sketched below.
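A minimal sketch of such routing, using client-side user-agent detection in JavaScript; the /m/ path and the device patterns are illustrative assumptions, not a recommendation of specific devices to support:

```javascript
// Redirect patrons on common phone platforms to a separate
// mobile-optimized homepage. The /m/ path and the user-agent
// patterns below are illustrative assumptions only.
var ua = navigator.userAgent;
if (/iPhone|iPod|Android|BlackBerry|Windows Phone/i.test(ua)) {
  // replace() keeps the full-site page out of the back-button history
  window.location.replace('/m/');
}
```

Server-side detection, or a transcoder as described above, achieves the same routing without depending on JavaScript support in the patron's browser.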
Selecting the Appropriate Software Development Method

There are three key methods for developing MAMW: structured methodologies (e.g., waterfall or parallel), rapid application prototyping (e.g., phased, prototyping, or throwaway prototyping), and agile development, an umbrella term for a collection of agile methodologies including Crystal, the dynamic systems development method, extreme programming, feature-driven development, and Scrum. There is a bidirectional relationship between these MAMW development methods and the resources available for development: project resources such as funding, duration, and human resources influence, and are affected by, the type of software development method selected for developing MAMW. However, studies rarely pay attention to this important dimension of the planning phase.34

Key Steps in the Analysis Phase

Requirements Analysis

After collecting data from patrons, the next natural step is to analyze the data to inform the process of conceptualizing, building, and developing MAMW.35 The requirements-analysis phase helps libraries achieve a user-centered design for MAMW and assess the return on investment (ROI) in MAMW. The context and goals of the patrons using mobile devices, and the tasks they are likely and unlikely to perform on a mobile device, are the key considerations for developing user-centered MAMW for library patrons.36 It is critical to gather, understand, and review user needs.37 Surveys can be administered on paper or online and analyzed using advanced statistical techniques or qualitative software.38,39 The analysis allows the following questions to be answered: Which library services do patrons use most frequently on their mobile devices? How satisfied are they with those services? What types of library services and products would they like to access with their mobile phones in the future? Survey analyses can help librarians predict which mobile services patrons will find most useful;40 they can also help librarians classify users on the basis of their perceptions, experience, and habits when using mobile technologies to access library services.41 As a result, libraries can identify and prioritize functional areas for their MAMW deployment.42 MAMW developers can learn from their users' humbling and/or frustrating experiences of using mobile devices for library services. In addition, libraries can keep track of their patrons' positive and negative observations, their information-sharing practices, and how they create group experiences on the platform provided by their libraries.43

To improve existing MAMW, libraries could also use Google Analytics, a free web metrics tool, to identify the popularity of MAMW features and analyze statistics on how they are used.44 To develop operating system-specific mobile apps, Google Analytics can also be used to learn about the popularity of the mobile devices used by patrons.45
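Collecting such statistics requires only the page tag that Google Analytics generates. A minimal sketch of the classic analytics.js page tag of that era, with a placeholder property ID; the loader snippet itself should be copied from the Google Analytics admin interface rather than from this sketch:

```javascript
// The standard analytics.js loader snippet, copied from the Google
// Analytics admin console, goes here; it defines the global ga() queue.

ga('create', 'UA-XXXXXX-Y', 'auto'); // placeholder property ID
ga('send', 'pageview');              // record one pageview per page load
```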
Ideally, libraries should calculate and document ROI before investing in the development of MAMW.46 For instance, libraries can run a cost-benefit analysis of the process of developing MAMW and compare the various library services offered over mobile devices.47 Typically, the following data help libraries run the cost-benefit analysis: specific deliverables (e.g., features of MAMW), resources (e.g., resources needed, available resources, etc.), risks (e.g., types of risks, level of risks, etc.), performance requirements, and security requirements for developing MAMW. This analysis helps libraries make decisions on service provisions, such as the specific goals to be set for developing MAMW, the feasibility of introducing desired features, and how to manage available resources to meet the set goals.48 Libraries should also examine what other libraries have already done to provide mobile services.49

Communication/Liaising with Stakeholders

Effective communication between developers and stakeholders influences almost every aspect of developing information systems. However, existing studies do not emphasize the significance of communication with stakeholders. For instance, several studies refer vaguely to the translation of user needs into technology requirements,50 but few studies point out a precise modeling technique (e.g., Entity Relationship Diagrams, Unified Modeling Language, etc.) for converting user needs into a language understood by software developers. Developers should communicate best practices and suggestions for the future implementation of MAMW in libraries,51 which involves the prediction and selection of appropriate MAMW for libraries,52 the demonstration of what is possible and how services are relevant, and how new resources can help create value for libraries.53,54 Communication with users is also critical for creating value-added services for patrons who use different mobile technologies to meet their needs related to work, leisure, commuting, etc.55 However, the existing literature on MAMW development for libraries does not discuss the significance of this activity.

Key Steps for Designing MAMW

Prototyping

Prototyping refers to the modeling or simulation of an actual information system. MAMW can have paper-based or computer-based prototypes. Prototyping allows developers to communicate directly with MAMW users to seek their feedback. Developers can correct or modify the original design of MAMW until users and developers are in agreement about the system design. Building consensus between MAMW developers and potential users is another key challenge to overcome during this phase, and it may put a financial burden on MAMW development projects. It requires skilled personnel to manage the scope, time, human resources, and budget of such projects. Wireframing is one of the most prominent prototyping techniques practiced by librarians and IT professionals developing MAMW for libraries.56 This technique depicts schematic on-screen blueprints of MAMW, lacking style, color, or graphics, and focusing mainly on functionality, behavior, and the priority of content.

Selecting Hardware, Programming Languages, Platforms, Frameworks, and Toolkits

The existing literature on the development of MAMW for libraries covers the selection and management of software; software development kits; scripting languages like JavaScript; data management and representation languages such as HTML and XML, and their text editors; and AJAX for animations and transitions.
The existing literature also guides libraries in training their staff to use MAMW to better serve patrons.57 A few studies also provide guidance on selecting COTS products such as WebKit, an open source web browser engine that renders webpages on smartphones and allows users to view high-quality graphics on data networks with faster throughput.58 It might be a good idea to use licensed open source COTS products, because a license allows libraries to legally distribute the software within their organizations as covered by the licensing agreement. Libraries that use software-licensing agreements may also be able to seek expert help and advice whenever they have a concern or query.

In the authors' experience, librarians have shared a few effective strategies for designing MAMW. One key strategy is to acquire reliable device emulators and cross-compatible web editors. These technologies allow the user to work with the design at the most basic level, save documents as text, transfer the documents between web programs, and direct designers toward simple solutions.59 Sample cross-compatible web editors include, but are not limited to, NoteTab Pro (http://www.notetab.com/), CodeLobster (http://www.codelobster.com/), and Bluefish (http://bluefish.openoffice.nl).

Hybrid mobile app frameworks like Bootstrap, Ionic, Mobile Angular UI, Intel XDK, Appcelerator Titanium, Sencha, Kendo UI, and PhoneGap use a combination of web technologies like HTML, CSS, and JavaScript for developing mobile-first, responsive MAMW. A majority of these frameworks use a drag-and-drop approach and do not require any coding to develop mobile apps, and one-click API connections further simplify the process. User-interface frameworks like jQuery Mobile and Topcoat eliminate the need to design user interfaces manually. Importantly, MAMW developed using such frameworks can support many mobile platforms and devices.

Toolkits like GitHub, skyronic, crudkit, and HAWHAW enable developers to quickly build mobile-friendly CRUD (create/read/update/delete) interfaces for PHP, Laravel, and CodeIgniter apps. Such mobile apps also work with MySQL and other databases, allowing users to receive and process data and display information to users.
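At their core, such CRUD interfaces wrap four operations over a database-backed web service. A minimal sketch of the read and create operations in client-side JavaScript, against a hypothetical /api/items endpoint; the endpoint, field names, and function names are our illustrative assumptions and not part of any toolkit named above:

```javascript
// Read: fetch the current list of records from a hypothetical endpoint.
function readItems() {
  return fetch('/api/items').then(function (response) {
    return response.json();
  });
}

// Create: post a new record; update and delete follow the same pattern
// with the PUT and DELETE methods.
function createItem(item) {
  return fetch('/api/items', {
    method: 'POST',
    headers: { 'Content-Type': 'application/json' },
    body: JSON.stringify(item)
  }).then(function (response) {
    return response.json();
  });
}
```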
Table 1 categorizes the specific hardware and software features recommended for MAMW to better serve library patrons.

1. Human-Computer Interaction (HCI)
Behavioral, cognitive, motivational, and affective aspects of HCI:
• Design responsive websites for libraries to enhance user experience60
• Design a user interface meeting the expectations and needs of potential users (e.g., a menu with the following items: library catalog, patron accounts, ask a librarian, contact information, listing of hours, etc.)61
• Design meaningful mobile websites based on user needs; document and maintain mobile websites62
Usability engineering:
• Design concise interfaces with limited links, descriptive icons, and home and parent-link icons63
• Create a user-friendly site (e.g., the DOK Library Concept Center in Delft, Netherlands, offers a welcome text message to first-time visitors)64
• Effectively transition from traditional websites to mobile-optimized sites with responsive design65
• Create user-friendly interface designs66
• Present a clean, easy-to-navigate mobile version of search results67
Information visualization:
• Automatically maintain the reliable and stable fundamental information required by indoor localization systems68
• Save time by redesigning existing sites69,70

2. Web Programming
HTML, XML, etc.:
• Design sites with a complete separation of content and presentation71
• Code HTML and CSS for better user experiences72
• Create and shorten links to make them easier to input using small or virtual keyboards73
Using client-side and server-side scripting, such as JavaScript Object Notation, etc.:
• Design and develop mashups74
• Develop MAMW using client-server architecture, accessible on mobile devices75
Without scripting:
• Implement widgetization to facilitate the integration of mobile websites; develop a widget library for mobile-based web information systems76

3. Open Source
• Design mobile websites that allow users to leverage the same open source technology as the main websites77
• Design mobile websites linking to other existing services like LibraryH3lp and library catalogs with mobile interfaces such as MobileCat78

4. Networking
• Design a mobile website capable of exploiting advances in technology such as faster mobile data networks79
• Identify and address technology issues (e.g., connectivity, security, speed, signal strength, etc.) faced by patrons when using MAMW80

5. Input/Output Devices
• Use a mobile robot to determine the location of fixed RFID tags in space81
• Design MAMW capable of processing data communicated using radio frequency identification devices, near-field communication technology, and Bluetooth-based technology like iBeacons82
• Offer innovative services using augmented-reality tools83

6. Databases
• Integrate a back-end database of metadata with front-end mobile technologies84
• Integrate the front end of MAMW with the back end of standard databases and services85

7. Social Media and Analytics
• Integrate social media sites (e.g., Foursquare, Facebook Places, Gowalla, etc.) with existing checkout services for accurate and information-rich entries86
• Implement Google Voice or a free text-messaging service87
• Use Google Analytics on a mobile-optimized website by copying the free JavaScript code generated from Google Analytics and pasting it into library webpages to gain insight into what resources are used and who used them88
• Integrate a geo-location feature with mobile services89

Table 1. MAMW with specific hardware and software features

The table above, based on our analysis of the literature on developing mobile applications and mobile websites for libraries, makes clear that web programming and HCI are the two leading technology areas that shape the development of MAMW and, consequently, the services offered through them.

Designing User Interfaces of MAMW

Librarians and IT professionals engaged in developing MAMW for libraries make the following recommendations.

Use two style sheets: CSS plays a key role in giving the user interfaces of all webpages a uniform display. Studies recommend designing two style sheets—namely, mobile.css and iphone.css—when developing MAMW, since smartphones often ignore mobile style sheets.90 In that case, iphone.css can target browsers of a specific screen width, catching those mobile devices that are not directed to the mobile website by the mobile.css style sheet;91 one way to implement this targeting is sketched after these recommendations.

Minimize use of JavaScript: JavaScript is instrumental in detecting which mobile device a patron is using and then directing them to the appropriate webpage, with options including a full website, a simple text-based site, and a touch-mobile-optimized site. However, it is critical to minimize the use of JavaScript on library mobile websites because not every smartphone offers the minimum level of support required to run it.92

Handle images intelligently: To help patrons conserve bandwidth, image files on mobile sites should be incorporated with CSS rather than HTML code; also, to ensure consistency in the appearance of mobile website user interfaces, images should be kept to the same absolute size.93
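One way to implement the screen-width targeting described above is with the matchMedia API. A minimal sketch, assuming the mobile.css/iphone.css file names from the recommendation and an illustrative 480-pixel breakpoint, and subject to the JavaScript-support caveat just noted:

```javascript
// Attach iphone.css only on narrow screens, as a fallback for devices
// that ignore the mobile.css style sheet. The 480px breakpoint is an
// illustrative assumption; a CSS media query achieves the same effect
// without relying on JavaScript support.
if (window.matchMedia && window.matchMedia('(max-width: 480px)').matches) {
  var link = document.createElement('link');
  link.rel = 'stylesheet';
  link.href = 'iphone.css';
  document.getElementsByTagName('head')[0].appendChild(link);
}
```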
Key Steps for Implementing MAMW

Programming for MAMW

Programming is at the heart of developing MAMW. As shown in Table 1 above, web programming enables developers to build MAMW with a number of value-added features for patrons. For instance, a web-application server running ColdFusion can process data communicated via web browsers on mobile devices; this feature allows MAMW users to access the search engines on library websites via smartphones.94 Also, client-side processing of classes (with a widget library) allows patrons to use their mobile devices as thin clients, thereby optimizing the use of network bandwidth.95

Testing MAMW

Past studies recommend testing the content, display/design, and functionality of MAMW in a controlled environment (e.g., a usability lab) or in the real world (i.e., in libraries).

Content: Librarians are advised to set up testing databases for testing image presentation, traditional free-text search, location-based search, barcode scanning for ISBN search, QR encapsulation, and voice search.96

Display/design: Librarians can review and test MAMW on multiple devices to confirm that everything displays and functions as intended.97 They can also test a beta version of their mobile website on varying devices to provide guidance regarding image sizing;98 beta versions are also useful for testing how mobile websites display on different browsers and devices.99

Functionality: Librarians can set up testing practices and environments for the most heavily used device platforms (e.g., HCI incubators such as eye-testing software, combined with virtual emulators and mobile devices not owned by libraries).100,101 They can also use the User Agent Switcher add-on for Firefox to test a mobile website, and use web-based services like DeviceAnywhere and BrowserCam, which offer mobile emulation, to test the functionality of MAMW.102

Training Patrons

Unless patrons realize the significance of a new information system for managing information resources, they will hardly use it. However, training patrons to use newly developed MAMW is almost completely missing from the studies describing the process of developing MAMW for libraries. Joe Murphy, a technology librarian at Yale University, identifies the significance of user training in managing the change from traditional to mobile search and advises librarians to explore the mobile literacy skills of their patrons and educate them in how to use new systems.103

Data Management

MAMW cannot function properly without clean data. Cleaning up data, curating data, and addressing other data-related issues are some of the least mentioned activities in the literature on developing MAMW. However, it is necessary for librarians engaged in developing MAMW to identify and address the common challenges of managing the data used by MAMW. For example, it might be a good strategy for librarians to study the best practices for managing data-related issues when offering reference services using SMS.104

Skills Needed for Maintaining MAMW

Documentation and Version Control of Software

Past studies recommend developing a mobile strategy for building a mobile-tracking device and evaluating mobile infrastructure to ensure the continued assessment and monitoring of mobile usage and trends among patrons.105 However, past studies do not report many details about the maintenance of MAMW, which leads us to infer that maintenance, involving documentation and version control, is a neglected aspect of MAMW development. Open source software development is increasingly becoming a common practice for developing MAMW, and implementing version-control software (e.g., Subversion and GitHub) to accommodate the needs of developers distributed across the world is a necessity for developing MAMW.
Version-control software provides a code repository with a centralized database for developers to share their code, which minimizes the errors associated with overwriting or reverting code changes and maximizes collaboration in software development.106

CONCLUSION

Various forces are driving change in the knowledge and skills required of information professionals: technologies, changing environments, and the changing role of IT in managing and providing services to patrons. These forces affect IT-based professionals at all levels, both those responsible for information processing and those responsible for information services. This paper has examined the key steps and precautions to be taken while developing MAMW so that libraries can better serve their patrons. Analyzing the existing guidance offered by librarians and IT professionals from a system analysis and design perspective, we find that some of the most neglected activities in MAMW development are selecting appropriate software development methodologies, prototyping, communicating with stakeholders, software version control, data management, and training patrons to use newly developed or revamped MAMW. The lack of attention to these activities could hinder libraries' ability to serve patrons well using MAMW, so it is necessary for librarians and IT professionals to pay close attention to them when developing MAMW.

Our study also shows that web programming and HCI are the two most widely used technology areas for developing MAMW for libraries. To conserve the scarce financial resources that would otherwise be spent on partnerships with external IT professionals, libraries could either train their existing staff or recruit LIS graduates equipped with the skills and knowledge identified in this paper to develop MAMW (see Table 2).

A. Planning Phase
1. Forming and managing a team: human resource management
2. Making strategic choices: time management; cost management; quality management; human resource management (e.g., staff capacity)
3. Requirements gathering: research (empirical and secondary)
4. Managing scope (e.g., managing financial resources, prioritizing tasks, identifying mission-critical features of MAMW, etc.): scope management
5. Selecting an appropriate software development method: time management; cost management; quality management

B. Analysis Phase
6. Requirements analysis: research (empirical and secondary)
7. Communication/liaising with stakeholders: communications management

C. Design Phase
8. Prototyping: software development (HCI)
9. Selecting hardware and programming languages and platforms: software development (web programming and HCI)
10. Designing user interfaces of MAMW: software development (HCI)

D. Implementation Phase
11. Programming for MAMW: software development (web programming—e.g., Android, iOS, Visual C++, Visual C#, Visual Basic, etc.)
12. Testing MAMW: software development (web programming and HCI)
13. Training patrons: human resource management
14. Data management (e.g., cleaning up data, curating data, etc.): data management

E. Maintenance Phase
15. Documentation and version control of software: software development (web programming and HCI)

Table 2. Skills and knowledge necessary to develop MAMW
The management of scope, time, cost, quality, human resources, and communication related to any project is known as project management.107 In addition to project-management skills and knowledge, librarians would also need to be proficient in software development (with an emphasis on HCI and web programming), data management, and the proper methods for conducting empirical and secondary research in order to develop MAMW. If LIS programs equip their graduate students with the skills and knowledge identified in this paper, the next generation of LIS graduates could develop MAMW for libraries without relying on external IT professionals, which would make libraries more self-reliant and better able to manage their financial resources.108

This paper assumes that a very small number of scholarly publications reflect the real-world scenarios of developing MAMW for all types of libraries. This assumption is one of the limitations of this study. Also, the sample of publications analyzed in this study is not statistically representative of the development of MAMW for libraries around the world. In the future, the authors plan to interview librarians and IT professionals engaged in developing and maintaining MAMW for their libraries to better understand the landscape of developing MAMW for libraries.

REFERENCES

1. Devendra Potnis, Ed Cortez, and Suzie Allard, "Educating LIS Students as Mobile Technology Consultants" (poster presented at the 2015 Association for Library and Information Science Education Annual Meeting, Chicago, January 25–27), http://f1000.com/posters/browse/summary/1097683.
2. Edwin Michael Cortez, "New and Emerging Technologies for Information Delivery," Catholic Library World no. 54 (1982): 214–18.
3. Kimberly D. Pendell and Michael S. Bowman, "Usability Study of a Library's Mobile Website: An Example from Portland State University," Information Technology & Libraries 31, no. 2 (2012): 45–62, http://dx.doi.org/10.6017/ital.v31i2.1913.
4. Godmar Back and Annette Bailey, "Web Services and Widgets for Library Information Systems," Information Technology & Libraries 29, no. 2 (2010): 76–86, http://dx.doi.org/10.6017/ital.v29i2.3146.
5. Hannah Gascho Rempel and Laurie Bridges, "That was Then, This is Now: Replacing the Mobile Optimized Site with Responsive Design," Information Technology & Libraries 32, no. 4 (2013): 8–24, http://dx.doi.org/10.6017/ital.v32i4.4636.
6. June Jamrich Parsons and Dan Oja, New Perspectives on Computer Concepts 2014: Comprehensive, Course Technology (Boston: Cengage Learning, 2013).
7. Ibid.
8. Andrew Walsh, Using Mobile Technology to Deliver Library Services: A Handbook (London: Facet, 2012).
9. Matthew B. Miles and A. Michael Huberman, Qualitative Data Analysis (Thousand Oaks, CA: Sage, 1994).
10. Bohyun Kim, "Responsive Web Design, Discoverability and Mobile Challenge," Library Technology Reports 49, no. 6 (2013): 29–39, https://journals.ala.org/ltr/article/view/4507.
11. James Elder, "How to Become the 'Tech Guy' and Make iPhone Apps for Your Library," The Reference Librarian 53, no. 4 (2012): 448–55, http://dx.doi.org/10.1080/02763877.2012.707465.
12. Sarah Houghton, "Mobile Services for Broke Libraries: 10 Steps to Mobile Success," The Reference Librarian 53, no. 3 (2012): 313–21, http://dx.doi.org/10.1080/02763877.2012.679195.
13. Pendell and Bowman, "Usability Study."
14. Lisa Carlucci Thomas, "Libraries, Librarians and Mobile Services," Bulletin of the American Society for Information Science & Technology 38, no. 1 (2011): 8–9, http://dx.doi.org/10.1002/bult.2011.1720380105.
15. Elder, "How to Become the 'Tech Guy.'"
16. Kim, "Responsive Web Design."
17. Chad Mairn, "Three Things You Can Do Today to Get Your Library Ready for the Mobile Experience," The Reference Librarian 53, no. 3 (2012): 263–69, http://dx.doi.org/10.1080/02763877.2012.678245.
18. Rempel and Bridges, "That was Then."
19. Rachael Hu and Alison Meier, "Planning for a Mobile Future: A User Research Case Study from the California Digital Library," Serials 24, no. 3 (2011): S17–25.
20. Kim, "Responsive Web Design."
21. Lorraine Paterson and Boon Low, "Student Attitudes Towards Mobile Library Services for Smartphones," Library Hi Tech 29, no. 3 (2011): 412–23, http://dx.doi.org/10.1108/07378831111174387.
22. Jim Hahn, Michael Twidale, Alejandro Gutierrez, and Reza Farivar, "Methods for Applied Mobile Digital Library Research: A Framework for Extensible Wayfinding Systems," The Reference Librarian 52, no. 1-2 (2011): 106–16, http://dx.doi.org/10.1080/02763877.2011.527600.
23. Paterson and Low, "Student Attitudes."
24. Gillian Nowlan, "Going Mobile: Creating a Mobile Presence for Your Library," New Library World 114, no. 3/4 (2013): 142–50, http://dx.doi.org/10.1108/03074801311304050.
25. Elder, "How to Become the 'Tech Guy.'"
26. Matthew Connolly, Tony Cosgrave, and Baseema B. Krkoska, "Mobilizing the Library's Web Presence and Services: A Student-Library Collaboration to Create the Library's Mobile Site and iPhone Application," The Reference Librarian 52, no. 1-2 (2010): 27–35, http://dx.doi.org/10.1080/02763877.2011.520109.
27. Stephan Spitzer, "Make That to Go: Re-Engineering a Web Portal for Mobile Access," Computers in Libraries 3, no. 5 (2012): 10–14.
28. Houghton, "Mobile Services."
29. Cody W. Hanson, "Mobile Solutions for Your Library," Library Technology Reports 47, no. 2 (2011): 24–31, https://journals.ala.org/ltr/article/view/4475/5222.
30. Terence K. Huwe, "Using Apps to Extend the Library's Brand," Computers in Libraries 33, no. 2 (2013): 27–29.
31. Edward Iglesias and Wittawat Meesangnill, "Mobile Website Development: From Site to App," Bulletin of the American Society for Information Science and Technology 38, no. 1 (2011): 18–23.
32. Jeff Wisniewski, "Mobile Usability," Bulletin of the American Society for Information Science & Technology 38, no. 1 (2011): 30–32, http://dx.doi.org/10.1002/bult.2011.1720380108.
33. Jeff Wisniewski, "Mobile Websites with Minimal Effort," Online 34, no. 1 (2010): 54–57.
34. Hahn et al., "Methods for Applied Mobile Digital Library Research."
35. J. Michael DeMars, "Smarter Phones: Creating a Pocket Sized Academic Library," The Reference Librarian 53, no. 3 (2012): 253–62, http://dx.doi.org/10.1080/02763877.2012.678236.
36. Kim Griggs, Laurie M. Bridges, and Hannah Gascho Rempel, "Library/Mobile: Tips on Designing and Developing Mobile Websites," Code4Lib Journal no. 8 (2009), http://journal.code4lib.org/articles/2055.
37. DeMars, "Smarter Phones."
38. Hahn et al., "Methods for Applied Mobile Digital Library Research."
39. Beth Stahr, "Text Message Reference Service: Five Years Later," The Reference Librarian 52, no. 1-2 (2011): 9–19, http://dx.doi.org/10.1080/02763877.2011.524502.
40. Paterson and Low, "Student Attitudes."
41. Ibid.
42. Ibid.
43. Hanson, "Mobile Solutions for Your Library."
44. Stahr, "Text Message Reference Service."
45. Spitzer, "Make That to Go."
46. Allison Bolorizadeh et al., "Making Instruction Mobile," The Reference Librarian 53, no. 4 (2012): 373–83, http://dx.doi.org/10.1080/02763877.2012.707488.
47. Maura Keating, "Will They Come? Get Out the Word About Going Mobile," The Reference Librarian 52, no. 1-2 (2010): 20–26, http://dx.doi.org/10.1080/02763877.2010.520111.
48. Paterson and Low, "Student Attitudes."
49. Hanson, "Mobile Solutions for Your Library."
50. Paterson and Low, "Student Attitudes."
51. Hanson, "Mobile Solutions for Your Library."
52. Cody W. Hanson, "Why Worry About Mobile?," Library Technology Reports 47, no. 2 (2011): 5–10, https://journals.ala.org/ltr/article/view/4476.
53. Keating, "Will They Come?"
54. Spitzer, "Make That to Go."
55. Kim, "Responsive Web Design."
56. Wisniewski, "Mobile Usability."
57. Elder, "How to Become the 'Tech Guy.'"
58. Sally Wilson and Graham McCarthy, "The Mobile University: From the Library to the Campus," Reference Services Review 38, no. 2 (2010): 214–32, http://dx.doi.org/10.1108/00907321011044990.
59. Brendan Ryan, "Developing Library Websites Optimized for Mobile Devices," The Reference Librarian 52, no. 1-2 (2010): 128–35, http://dx.doi.org/10.1080/02763877.2011.527792.
60. Kim, "Responsive Web Design."
61. Connolly, Cosgrave, and Krkoska, "Mobilizing the Library's Web Presence and Services."
62. DeMars, "Smarter Phones."
63. Mark Andy West, Arthur W. Hafner, and Bradley D. Faust, "Expanding Access to Library Collections and Services Using Small-Screen Devices," Information Technology & Libraries 25 (2006): 103–7.
64. Houghton, "Mobile Services."
65. Rempel and Bridges, "That was Then."
66. Elder, "How to Become the 'Tech Guy.'"
67. Heather Williams and Anne Peters, "And That's How I Connect to MY Library: How a 42-Second Promotional Video Helped to Launch the UTSA Libraries' New Summon Mobile Application," The Reference Librarian 53, no. 3 (2012): 322–25, http://dx.doi.org/10.1080/02763877.2012.679845.
68. Hahn et al., "Methods for Applied Mobile Digital Library Research."
Danielle Andre Becker, Ingrid Bonadie-Joseph, and Jonathan Cain, “Developing and Completing a Library Mobile Technology Survey to Create a User-Centered Mobile Presence,” Library Hi-Tech 31, no. 4 (2013): 688–99, http://dx.doi.org/10.1108/LHT-03-2013-0032. 70. Rempel and Bridges, “That was Then.” 71. Iglesias and Meesangnill, “Mobile Website Development.” 72. Elder, “How to Become the ‘Tech Guy.’” 73. Andrew Walsh, “Mobile Information Literacy: A Preliminary Outline of Information Behavior in a Mobile Environment,” Journal of Information Literacy 6, no. 2 (2012): 56–69, http://dx.doi.org/10.11645/6.2.1696. 74. Back and Bailey, “Web Services and Widgets.” 75. Ibid. 76. Ibid. 77. Spitzer, “Make That to Go.” http://dx.doi.org/10.1108/00907321011044990 http://dx.doi.org/10.1080/02763877.2011.527792 http://dx.doi.org/10.1080/02763877.2012.679845 http://dx.doi.org/10.1108/LHT-03-2013-0032 http://dx.doi.org/10.11645/6.2.1696 INFORMATION TECHNOLOGY AND LIBRARIES | SEPTEMBER 2016 61 78. Iglesias and Meesangnill, “Mobile Website Development.” 79. Bohyun Kim, “The Present and Future of the Library Mobile Experience,” Library Technology Reports 49, no. 6 (2013): 15–28, https://journals.ala.org/ltr/article/view/4506. 80. Pendell and Bowman, “Usability Study.” 81. Hahn et al., “Methods for Applied Mobile Digital Library Research.” 82. Andromeda Yelton, “Where to Go Next,” Library Technology Reports 48, no. 1 (2012): 25–34, https://journals.ala.org/ltr/article/view/4655/5511. 83. Ibid. 84. Hahn et al., “Methods for Applied Mobile Digital Library Research.” 85. Houghton, “Mobile Services.” 86. Ibid. 87. Mairn, “Three Things You Can Do Today.” 88. Ibid. 89. Tamara Pianos, “EconBiz to Go: Mobile Search Options for Business and Economics— Developing a Library App for Researchers,” Library Hi Tech 30, no. 3 (2012): 436–48, http://dx.doi.org/10.1108/07378831211266582. 90. DeMars, “Smarter Phones.” 91. Ryan, “Developing Library Websites.” 92. Pendell and Bowman, “Usability Study.” 93. Ryan, “Developing Library Websites.” 94. Michael J. Whitchurch, “QR Codes and Library Engagement,” Bulletin of the American Society for Information Science & Technology 38, no. 1 (2011): 14–17. 95. Back and Bailey, “Web Services and Widgets.” 96. Jingru Hoivik, “Global Village: Mobile Access to Library Resources,” Library Hi Tech 31, no. 3 (2013): 467–77, http://dx.doi.org/10.1108/LHT-12-2012-0132. 97. Elder, “How to Become the ‘Tech Guy.’” 98. Ryan, “Developing Library Websites.” 99. West, Hafner and Faust, “Expanding Access.” 100. Hu and Meier, “Planning for a Mobile Future.” 101. Iglesias and Meesangnill, “Mobile Website Development.” https://journals.ala.org/ltr/article/view/4506 https://journals.ala.org/ltr/article/view/4655/5511 http://dx.doi.org/10.1108/07378831211266582 http://dx.doi.org/10.1108/LHT-12-2012-0132 IDENTIFYING KEY STEPS FOR DEVELOPING MOBILE APPLICATIONS & MOBILE WEBSITES FOR LIBRARIES | POTNIS, REGENSTREIF-HARMS, AND CORTEZ |doi:10.6017/ital.v35i2.8652 62 102. Wisniewski, “Mobile Usability.” 103. Joe Murphy, “Using Mobile Devices for Research: Smartphones, Databases and Libraries,” Online 34, no. 3 (2010): 14–18. 104. Amy Vecchione and Margie Ruppel, “Reference is Neither Here nor There: A Snapshot of SMS Reference Services,” The Reference Librarian 53, no. 4 (2012): 355–72, http://dx.doi.org/10.1080/02763877.2012.704569. 105. Hu and Meier, “Planning for a Mobile Future.” 106. Wilson and McCarthy, “The Mobile University.” 107. 
8749 ---- In the Name of the Name: RDF Literals, ER Attributes, and the Potential to Rethink the Structures and Visualizations of Catalogs

Manolis Peponakis

ABSTRACT

The aim of this study is to contribute to the field of machine-processable bibliographic data that is suitable for the Semantic Web. We examine the Entity Relationship (ER) model, which IFLA selected as a "conceptual framework" for modeling the FR family (FRBR, FRAD, and RDA), and the problems the ER model causes as we move towards the Semantic Web. Subsequently, while maintaining the semantics of the aforementioned standards but rejecting ER as a conceptual framework for bibliographic data, this paper builds on the potential of RDF (Resource Description Framework) and documents how both RDF and the rationale of Linked Data can affect the way we model bibliographic data. In this way, a new approach to bibliographic data emerges in which the distinction between description and authorities is obsolete. Instead, the integration of authorities with descriptive information becomes fundamental, so that a network of correlations can be established between entities and the names by which those entities are known. Naming is a vital issue for human cultures because names are not random sequences of characters or sounds that stand merely as identifiers for entities; they also carry socio-cultural meanings and interpretations. Thus, instead of describing indivisible resources, we could describe entities that appear under a variety of names on various resources. This study proposes a method to connect names with the entities they represent and, in this way, to document the provenance of these names by connecting specific resources with specific names.

INTRODUCTION

The basic aim of this study is to contribute to the field of machine-processable bibliographic data. As to what constitutes "machine processable," we concur with the clarification of Antoniou and van Harmelen, who state, "In the literature the term machine-understandable is used quite often. We believe it is the wrong word because it gives the wrong impression. It is not necessary for intelligent agents to understand information; it is sufficient for them to process information effectively, which sometimes causes people to think the machine really understands."1 In the bibliography used here, the term "computationally processable" is used as a synonym for "machine-processable."

Manolis Peponakis (epepo@ekt.gr) is an information scientist at the National Documentation Centre, National Hellenic Research Foundation, Athens, Greece.
With regard to machine-processable bibliographic data, we have taken into consideration both the practice and theory of Library and Information Science (LIS) and of Computer Science. From LIS we have chosen the Functional Requirements for Bibliographic Records (FRBR) and the Functional Requirements for Authority Data (FRAD), while making comparisons with the Resource Description and Access (RDA) standard. From the Computer Science domain we have chosen the Resource Description Framework (RDF) as a basic mechanism for the Semantic Web. We examine the Entity Relationship (ER) model (selected by IFLA as a "conceptual framework" for the development of FRBR),2 as well as the potential problems that may arise as we move towards the Semantic Web. Having rejected the ER model as a conceptual framework for bibliographic data, we build on the potential of RDF and document how its rationale affects the modeling process.

In the context of the Semantic Web and Uniform Resource Identifiers (URIs), the identification process has been transformed. For this reason we analyze appellations and names as identifiers and explore how we could move from an era in which controlled names play the role of identifiers to one dominated by URIs: "While it is self-evident that labels and comments are important for constructing and using ontologies by humans, the OWL standard does not pay much attention to them. The standard focuses on the syntax, structure and reasoning capabilities. . . . If the Semantic Web is to be queried by humans, there will be no other way than dealing with the ambiguousness of human language."3

It is essential to build on the "library's signature service, its catalog,"4 and use it to provide added-value services. But to get there, first there has to be "a shift in perspective, from locked-up databases of records to open data shared on the Web."5 This requires a transition from descriptions aimed at human readers to descriptions that emphasize computational processing, in order to escape the rationale of the record as a condensed textual description and move towards more flexible and fruitful representations and visualizations.

BACKGROUND

FRBR and RDA

The FR family has been growing for more than a decade. The first member of the family was the Functional Requirements for Bibliographic Records (FRBR),6 the first version of which was published towards the end of the last century. Subsequently, IFLA decided to extend the model to cover authorities. During this process, the task of modeling names was separated from the task of modeling subjects. Thus two new members were added to the family: the "Functional Requirements for Authority Data: A Conceptual Model" (FRAD) and the "Functional Requirements for Subject Authority Data" (FRSAD).7,8 During the same period, the "Resource Description and Access" (RDA) standard was established as a set of cataloging rules to replace the AACR standard. According to its creators, the alignment with the FR family was crucial.
As stated, "A key element in the design of RDA is its alignment with the conceptual models for bibliographic and authority data developed by the International Federation of Library Associations and Institutions (IFLA): Functional Requirements for Bibliographic Records [and] Functional Requirements for Authority Data."9

This paper uses the FR family and RDA as a starting point but detects some problems and inconsistencies between these models. It retains the basic semantics of these standards but rejects their structural formalism, which is problematic and lacks effectiveness in expressing highly machine-processable data. The effective processability of the data is discussed in detail in the section "The Impact of the Representation Scheme's Selection: RDF versus ER."

Within the FR family the terminology is inconsistent and, as we pass from FRBR to FRAD and FRSAD, even the perspective of the general model changes. In FRBR (the first in order), there is no notion of the name as an entity. FRAD introduces this perception (FRAD also adds family as a new entity), and FRSAD takes a step further and introduces the concept of nomen instead of the concept of name. Hence, despite the fact that each member of the FR family has been represented in RDF,10 there is no established consolidated edition yet that combines the different angles using a common model and terminology (vocabulary).11 These representations (one for each model) are available at IFLA's website.12 In the context of RDA, on the other hand, there may be more consistency regarding terminology but, as is well established in the relevant literature, there are significant differences between the two models, i.e., the FR family and RDA.13,14,15 Because of these differences, the examples in this study use no URIs, not even those of the RDA registry.16

Given the above, the terms appearing in the figures are a selection from the three texts of the FR family. Thus, nomen (from FRSAD) is used instead of name (from FRAD) as a more abstract notion, and the attribute—property in the context of RDF—"has string" (from FRAD) is used to assign a specific literal to a nomen. In figures 2–5 we have used the "has appellation" (reversed "is appellation of") relationship of FRAD.17

Notes about Terminology and Graphs: How to Read the Figures

Two different sorts of figures appear in this paper. This covers the need to compare two different models and to pinpoint the differences between them and the problems that arise from selecting the ER model to express FRBR. An explanation of the two major models follows in the next subsection.

The first figure type follows the diagrams of the Entity Relationship model and is used in figure 1. In this case:

• The rectangles represent entities.
• The oval shapes represent attributes.
• The diamond-shaped boxes represent relationships.

The second figure type has been created according to RDF graphical representations and is used in figures 2–5. In these cases:

• The oval shapes represent nodes that are identified by a URI; they could serve as objects or subjects for further expansion of the network. In figures 3–5 all the names were derived from the FR entities.
• The line connectors between nodes represent the predicates (i.e., they are properties) and should also be identified by URIs.
• The rectangle shapes represent literals consisting of a lexical form. A language code could apply in these cases. With or without language codes, these are end points, and they cannot be the subject of new connections.

We follow the common modeling of language in RDF, in which the literal itself contains a language code, for example "example"@en in standard Turtle syntax, or the equivalent in RDF/XML coding. We must note that this kind of modeling is a rather simplistic way of handling language, because there is no mechanism to declare more information about the language, such as multiple scripts, which could apply in the context of the same language.
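To make the conventions of figures 2–5 concrete, the following is a minimal sketch in Turtle of the node-arc-node pattern just described; all URIs are invented for illustration, and the property names merely echo the FRAD terms.

@prefix ex: <http://example.org/> .

ex:work1   ex:hasAppellation  ex:nomen1 .               # oval to oval: both nodes are URIs
ex:nomen1  ex:hasString       "History of London"@en .  # rectangle: a language-tagged literal

# The literal is an end point: nothing further can be stated about the string itself,
# whereas ex:nomen1, being a URI node, can take additional properties and links.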
The Impact of the Representation Scheme's Selection: RDF versus ER

Nowadays, all the information in library catalogs is created through and stored in computers. This technological infrastructure provides specific methods and dictates limitations for the management of the catalog's data. Hence, every model must take into consideration the basic rationale of the technological infrastructure that will curate and process the data. Depending on the syntactic capabilities of the representation model, expressing what we want to express becomes reasonably easy and accurate, since "semantics is always going to have a close relationship with the field of syntax."18 This establishes a vital relationship between what we want to do and how computers can do it. In this section we emphasize the limitations of the Entity Relationship (ER) implementation that FRBR proposes and show how syntax affects expressiveness and, accordingly, functionality. Finally, we demonstrate how the selection of one implementation or another (in our case ER vs. RDF) has serious implications both for cataloging rules and for cataloging practice.

Why do we compare these two specific models? The ER model is the base that IFLA selected as a "conceptual framework"19 for the development of FRBR, while FRBR is the conceptual model upon which RDA has been founded. Consequently, RDA is also affected by the choice of the ER model. On the other hand, RDF is the current conceptualization for resource description in the web of data. So, what kind of problems and conflicts arise from the implementations of each of these models?

The basic rationale of ER comprises three fundamental elements: there are entities; entities have attributes; and there are relationships between entities. It is also possible to declare cardinality constraints, upon which the FR family builds. RDF, on the other hand, implies quite a different model. "The core structure of the abstract syntax is a set of triples, each consisting of a subject, a predicate and an object. A set of such triples is called an RDF graph. An RDF graph can be visualized as a node and directed-arc diagram, in which each triple is represented as a node-arc-node link. . . . There can be three kinds of nodes in an RDF graph: IRIs, literals, and blank nodes."20 "Linking the object of one statement to the subject of another, via URIs, results in a chain of linked statements, or linked data. This avoids the ambiguity of using natural language strings as headings to match statements. As a result, a literal object terminates a linked data chain, and literals are generally used for human-readable display data such as labels, notes, names, and so on."21

As a representative example of the differences between the two models, let us consider "place of publication." Peponakis counts nine attributes of place and notices that, because the ER model does not allow links between attributes, there is no way to define explicitly whether these attributes refer to the same place or not.22 Taking this problem into consideration, we demonstrate the transition from the ER attributes approach to RDF implementations in figures 1–2. Let us assume that there is Person (X), who was born in London, is named John Smith, and works at Publisher (Y). This publisher is located in London, where Book (1), entitled History of London, has been published. For this specific book, Person X was the lithographer. If we create a strict mapping to FRBR entities, attributes, and relations, then we have the situation illustrated in figure 1. Because there is no way to link the four occurrences of London (inasmuch as there is no option to define relations between attributes in the ER model), there is no way to be certain that London is the same in all cases. Judging only by the name, it could stand for London in England, in Ontario, in Ohio, or elsewhere.

Figure 1. Example of "Place" as an attribute of several entities

The IFLA working group has faced the problem with place and noted the following:

The model does not, however, parallel entity relationships with attributes in all cases where such parallels could be drawn. For example, "place of publication/distribution" is defined as an attribute of the manifestation to reflect the statement appearing in the manifestation itself that indicates where it was published. Inasmuch as the model also defines place as an entity it would have been possible to define an additional relationship linking the entity place either directly to the manifestation or indirectly through the entities person and corporate body which in turn are linked through the production relationship to the manifestation. To produce a fully developed data model further definition of that kind would be appropriate. But for the purposes of this study it was deemed unnecessary to have the conceptual model reflect all such possibilities.23

Finally, they seem to avoid the problem and repeat their position in FRAD as well:

In certain instances, the model treats an association between one entity and another simply as an attribute of the first entity. For example, the association between a person and the place in which the person was born could be expressed logically by defining a relationship ("born in") between person and place. However, for the purposes of this study, it was deemed sufficient to treat place of birth simply as an attribute of person.24

For some reason the creators of the FR family have chosen not to "upgrade" the attributes of place into one and only one entity. Furthermore, the same problem exists for many attributes, not only for place. Thus, the problem has to do with the selection of ER as a "conceptual framework" and not with the specific entity of place.
If we accept that "Place of Publication" need not be recorded as it appears on the resource, an RDF-based approach makes things clearer, as figure 2 shows. In this case, all attributes of place are promoted to the same RDF node and, instead of four repetitions of an attribute with the value "London," we reduce them to one and only one node with four connections to it. Then, as illustrated by figure 2, we can be sure that all instances refer to the same London.

Figure 2. RDF-based representation of figure 1

In figure 2, it is assumed that there is no need to transcribe the literal of "Place of Publication" from the resource; i.e., we did not follow rule 2.8.1.4 of RDA: "Transcribe places of publication and publishers' names as they appear on the source of information." For cataloging rules that demand recording the place as it appears on the resource, readers can consult the subsection "Place Names" in this study.

Last but not least, RDF has another significant advantage compared to the ER model: data coded in RDF are packed ready for use in the Semantic Web. On the contrary, data coded in ER must undergo conversion—with all its implications—in order to be published in the Semantic Web.
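As a hypothetical sketch of what figure 2 expresses, the statements below promote the four attribute values of figure 1 to links pointing at a single place node; the URIs and property names are invented for illustration.

@prefix ex: <http://example.org/> .

ex:personX     ex:placeOfBirth       ex:london .
ex:publisherY  ex:locatedIn          ex:london .
ex:book1       ex:placeOfPublication ex:london .
ex:work1       ex:hasSubject         ex:london .   # e.g., the subject of History of London
ex:london      ex:hasAppellation     ex:nomenLondon .
ex:nomenLondon ex:hasString          "London"@en .

# One node with four incoming links: every statement provably concerns the same
# London, which four independent ER attributes named "place" cannot guarantee.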
NAMES, ENTITIES, AND IDENTITIES

In this section, the significance of names as carriers of meaning is outlined, and the importance of documenting the relations of names to the entities and identities they refer to is established. Additionally, the basic approaches to metadata generation for managing names are presented. These approaches resulted in the dissociation of authorities from the bibliographic records, which in turn led (in both FRBR/FRAD and RDA) to the lack of any potential for linking—in an explicit way—the entity with the names it goes by. This linking, as presented later in this text, is fundamental for the description and interpretation of the entity.

In everyday communication, the usage of a name in a sentence plays the role of an identifier for the entity that the name indicates. If the speakers share a common background, no qualifiers other than the name are needed to disambiguate information such as whether Nick is Person X or Person Y, or whether the word "London" indicates the city in Ohio or in England. The common background leads to a very limited context in which the interpretation of the name and its assignment to the appropriate entity is sufficient and accurate. The context of the Internet, however, extends into a variety of possibilities, so a more precise way to identify specific entities is needed.

In this regard, a very essential issue is the distinction between the properties of the name and the properties of the entity that is represented by the specific name. The word "John" could be recognized as an English name, but we commit a logical flaw if we assume that John knows English. A representative example of this kind of inference (syllogism) can be found in Rayside and Campbell:25 "Man is a species of animal. Socrates is a man. Therefore, Socrates is a species of animal. . . . 'Man' is a three-lettered word. Socrates is a man. Therefore, Socrates is a three-lettered word." Therefore the authorities of a catalog should embody a two-level modeling of the information they represent: the first level has to do with the entities, and the second with the names of these entities. Consequently, there is the need to find a way to pass from names to the entities they indicate and, from entities, to the various appellations that these entities have.

In catalogs, it is somewhat vague whether the change of a name signifies a new identity. Niu states: "For example: the maiden name and the married name of an agent are normally not considered two separate identities, yet one pseudonym used for writing fiction and another pseudonym used for writing scientific works are often considered two different identities of an agent."26 So there can be one individual with many identities. But there can also be one identity that incorporates many individuals: for example, a shared pseudonym for a group of authors. To deal with these problems, FRAD introduces the notion of persona, rejecting at the same time the idea that a person is equal to an individual. FRAD defines a person as an "individual or a persona or identity established or adopted by an individual or group."27 The question that arises here is when the persona must be conceived as a new identity. Yet FRAD does not make a sufficient judgment; instead, it refers to cataloging rules: "Under some cataloguing rules, for example, authors are uniformly viewed as real individuals, and consequently specific instances of the bibliographic entity person always correspond to individuals. Under other cataloguing rules, however, authors may be viewed in certain circumstances as establishing more than one bibliographic identity, and in that case a specific instance of the bibliographic entity person may correspond to a persona adopted by an individual rather than to the individual per se."28 So there is no specific guidance whether, for example, in the case of a "religious relationship,"29 one identity must be created with two alternative names or two different identities must be created. Rule 9.2.2.8 in RDA does not elaborate further.

Still, even with the problem of identities solved, the matter of appellations itself can be extremely complicated, and this is widely addressed in the relevant literature.30,31,32 The VIAF project confirms this with an extremely large data set.33 Assigning all appellations as attributes is an easy way to model the variants of a name, but it is very simplistic because it "does not allow these appellations to have attributes of their own and neither does it allow the establishing of relationships among the appellations. . . . FRAD makes a big step forward: all appellations are defined as entities in their own right, thus allowing full modeling."34 Of course, FRAD's approach is not a novelty in the domain of LIS, since library catalogs have been modeling names since the era of MARC. In UNIMARC Authorities,35 the control subfield $5 contains a coded value to indicate the relations between names, with values such as "k = name before the marriage," "i = name in religion," "d = acronym," etc., and in MARC 21 there is the corresponding subfield $w.36 FRAD puts these values on a more consistent and abstract level.
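As a rough sketch of this kind of modeling, consider the Thomas Merton example cited in note 29; every URI and property here is illustrative, with the last property standing in for the "name in religion" relation that UNIMARC encodes in subfield $5.

@prefix ex: <http://example.org/> .

ex:personA     ex:hasAppellation  ex:nomenMerton , ex:nomenLouis .
ex:nomenMerton ex:hasString       "Thomas Merton"@en .
ex:nomenLouis  ex:hasString       "Father Louis"@en .

# Because the appellations are nodes rather than attribute values, they can carry
# properties of their own and can be related to one another:
ex:nomenLouis  ex:isNameInReligionOf  ex:nomenMerton .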
FRAD also defines "Relationships between Persons, Families, Corporate Bodies, and Works" in section 5.3 and "Relationships between their Various Names" in section 5.4.37

The Distinction between Authorities and Descriptive Information

Since the days of card catalogs, and for as long as MARC and AACR have been used, bibliographic records have been grounded in the dichotomy between descriptive information and controlled access points. The various types of headings stand for controlled access points. The original terminus of headings was alphabetical sorting. With the advent of computers, headings were used as string identifiers to cluster and retrieve relevant bibliographic records. These bibliographic records had a body of descriptive information that was transcribed from the resource and remained unchanged. So the headings were the keys to the records, and the records were surrogates for documents. "The elements of a bibliographic record . . . were designed to be read and comprehended by human beings, not by machines";38 established headings are no exception. One of their basic characteristics was the precondition that they be unique in the context of a specific catalog, thereby avoiding ambiguity. In every case of synonymy, qualifiers (such as date of birth or profession) were added to disambiguate, and the names thus also played the role of unique identifiers.

From this process an issue emerges: the information that appears on the document has been changed, and the controlled name may be completely different from the name on the resource. This means that the cataloger performs a transformation of the information, and this transformation carries two dangers. First, by changing the name, there is the possibility of assigning the name to the wrong entity. Second, by disturbing the correspondence between the information on the resource and the information in the record of the resource, the record becomes a problematic surrogate for the resource. To surpass this obstacle, traditional catalogs split the information into two different areas: one with the established forms, i.e., the headings, and the second with the purely descriptive information, i.e., the information that must be transcribed from the resource. This is the reason why traditional library catalogs put much effort into transcribing information from resources, and very detailed guidelines have been developed for doing so. On the other hand, current approaches to metadata creation (such as Dublin Core) seem to underestimate the importance of descriptive information while concentrating on the established forms of names. But how can we be sure that different literals communicate the same meaning? Does this kind of simplification, perhaps, cause problems regarding the integrity of the information?

Names are not just sequences of characters (i.e., strings); they carry latent information. It is known that there are women who wrote using male names (for example, Mary Ann Evans wrote as George Eliot) and men who wrote using female names. There are also nicknames for groups (e.g., "Richard Henry" is a pseudonym for the collaborative works of Richard Butler and Henry Chance Newton), etc. Therefore, it is important not to ignore names and the forms in which they appear on resources, but to model them in such a way that integration between authorities and descriptive information is feasible and the names are efficiently machine-processable.
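A minimal sketch of such an integration, using the George Eliot example above (all identifiers invented):

@prefix ex: <http://example.org/> .

ex:evans      ex:hasAppellation            ex:nomenEvans , ex:nomenEliot .
ex:nomenEvans ex:hasString                 "Mary Ann Evans"@en .
ex:nomenEliot ex:hasString                 "George Eliot"@en .
ex:manif1     ex:statementOfResponsibility ex:nomenEliot .   # the form found on the resource

# The transcribed form is preserved yet remains machine-processable: the catalog can
# answer both "what did this person write?" (via ex:evans) and "under which name
# did the text appear on this resource?" (via ex:nomenEliot).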
INTEGRATING AUTHORITIES WITH DESCRIPTIVE INFORMATION

As we have already stated, traditional library catalogs are built on the dichotomy between description and access points. This analysis aims to bring descriptive information and authorities closer, i.e., to connect the access points of catalogs with the description of the resource. The basic principle of the model presented in this section is to promote each verbal (lexical) representation of a name to a nomen, whether this form of the name derives from a controlled vocabulary or not. In cases where the form appears in a specific vocabulary, appropriate properties could be used to indicate such a relation. In this section, some representative examples are presented. It is important to note, once again, that every node and relation in the following figures could (and, in the context of the Semantic Web, must) be identified by a URI, except for the values in rectangles, which are plain RDF literals and therefore cannot be the subjects of further expansion. Thus, the concatenation is the following: every individual (instance of the relevant class) acquires a URI; every individual is connected through the "has appellation" property (which acquires a URI) to a nomen (which also acquires a URI); and these nomens end up connected to a plain RDF literal, which is a natural-language wording and cannot be subjected to further analysis.

Place Names

The problem of place as an attribute in FRBR and FRAD has been analyzed in the Background section of this paper, specifically in the subsection "The Impact of the Representation Scheme's Selection: RDF versus ER." Here, a solution to this problem that remains compatible with the FRBR/RDA approach is proposed. By promoting every nomen of a place to an RDF node, there is the option of referring either to the entity of place as a whole or to a specific appellation of this entity. So, the relation (a property in the context of RDF) between a work and its subject could be indicated by connecting Work X with Place Z. On the other hand, according to rule 2.8.1.4 of RDA, the place of publication of the manifestation must be transcribed as it appears on the source of information. But by following the connections presented in figure 3, it is easy to establish that this specific nomen corresponds to the same entity, i.e., to the same place.

Figure 3. Place
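The figure can be approximated in Turtle as follows; the URIs are invented, and "Londres" is a hypothetical transcribed form.

@prefix ex: <http://example.org/> .

ex:placeZ       ex:hasAppellation  ex:nomenLondon , ex:nomenLondres .
ex:nomenLondon  ex:hasString       "London"@en .
ex:nomenLondres ex:hasString       "Londres"@fr .

ex:workX   ex:hasSubject          ex:placeZ .        # a relation to the place as a whole
ex:manifM  ex:placeOfPublication  ex:nomenLondres .  # the form transcribed per RDA 2.8.1.4

# Following the links from ex:manifM through ex:nomenLondres back to ex:placeZ shows
# that the transcribed form corresponds to the same place entity.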
Personal Names

In the section "Names, Entities, and Identities," we analyzed many of the problems associated with personal names. Here, a model is presented in which the work (and the expression) is connected directly with the author, whereas the manifestation is connected with a specific appellation, i.e., nomen, of this author.

Figure 4. Statements of responsibility

RDA rule 2.4.1.4 states, "Transcribe a statement of responsibility as it appears on the source of information." But occasionally the statement of responsibility may contain phrases and not just names. In these cases, a solution similar to the Metadata Object Description Schema (MODS) could be implemented, where, if needed, the statement of responsibility is included in the note element using the attribute type="statement of responsibility".

Titles

The management of titles in FRBR and RDA indicates a different point of view between the two standards. According to RDA there is no title for the expression,39 and, as Taniguchi states, this is a "significant difference between FRBR and RDA."40 BIBFRAME abides by the same principle of downgrading the expression, since it entangles expression with work in an indivisible unit. In this regard, BIBFRAME is closer to RDA than to FRBR. The notion of work has nothing to do with specific languages, even when the work is a written text. Therefore the assignment of the title of a work to a specific appellation is an unnecessary limitation. On the contrary, the title of a manifestation is derived from a specific resource. We argue that between these two poles there is the title of the expression, which could stand as a uniform title per language.

Figure 5. Titles
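A sketch of this arrangement follows, under the same illustrative conventions; the property names merely paraphrase FRBR relationships, and all URIs are invented.

@prefix ex: <http://example.org/> .

# The work is language-neutral; its titles are nomens in particular languages.
ex:work1   ex:hasAppellation  ex:titleFr , ex:titleEn .
ex:titleFr ex:hasString       "Le Petit Prince"@fr .
ex:titleEn ex:hasString       "The Little Prince"@en .

# The expression carries a uniform title per language.
ex:exprEn  ex:realizes        ex:work1 ;
           ex:hasAppellation  ex:titleEn .

# The manifestation carries the title transcribed from the resource itself.
ex:manif1  ex:embodies          ex:exprEn ;
           ex:hasAppellation    ex:titleOnResource .
ex:titleOnResource ex:hasString "The Little Prince"@en .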
VISUALIZATION OF BIBLIOGRAPHIC RECORDS AND CATALOGING RULES

Resource description in the domain of LIS—from Cutter's era to the present day—has emphasized static, linear, textual representations. According to RDA "0.1 Key Features," "In RDA, there is a clear line of separation between the guidelines and instructions on recording data and those on the presentation of data. This separation has been established in order to optimize flexibility in the storage and display of the data produced using RDA. Guidelines and instructions on recording data are covered in chapters 1 through 37; those on the presentation of data are covered in appendices D and E." But the tables in the relevant appendices (D and E) contain guidelines that mainly concentrate on punctuation issues, and they do not take into consideration the dynamics of current interactive user interfaces. As Coyle and Hillmann comment, "there are instructions for highly structured strings that are clearly not compatible with what we think of today as machine-manipulable data."41 It is rather like producing high-tech cards: RDA is faithful to the classical text-centric approaches that produce bibliographic records as a linear enumeration of attributes; thus, RDA can be likened to a new suit that is quite old-fashioned.

Traditional catalogs (from card catalogs to OPACs and repository catalogs) were built upon the principle of creating autonomous records. FRBR placed this principle, i.e., one record for each resource, under dispute, while Linked Data abolishes it. In this way a gigantic graph of statements is created, while a certain part of these statements (not always the same part) responds to or describes the desired information. Thus, a more sophisticated method for presenting results emerges, if it does not impose itself outright. The issue is no longer to present a record that describes a specific resource, since this conceptualization tends to become obsolete altogether. Consequently, the visualization has to be different, depending on the data structure as well as on the searcher's available interface. In this context, the analysis of this study tries to keep in balance the machine-processable character of RDF, which builds on identifiers (URIs), while paying attention to the linguistic representation of entities. We argue that the balance between them will result in highly accurate and efficient representations for both humans and software agents.

Let us consider the model for titles introduced in this study. According to FRBR, "if the work has appeared under varying titles (differing in form, language, etc.), a bibliographic agency normally selects one of those titles as the basis of a 'uniform title' for purposes of consistency in naming and referencing the work."42 RDA treats the case in a very similar way: rule 5.1.3 states, "The term 'title of the work' refers to a word, character, or group of words and/or characters by which a work is known. The term 'preferred title for the work' refers to the title or form of title chosen to identify the work. The preferred title is also the basis for the authorized access point representing that work." In this study, we consider the aforementioned statements a projection that springs from the days when records were static textual descriptions independent of interfaces. Nowadays we are moving towards a much clearer distinction between the entity and its names. This is reflected in figure 5, in which the connection between a work and its author has nothing to do with specific names (appellations) but is based on URIs. The selection of the appropriate name as a title for the specific work could be based on criteria such as the language of the interface: in this case, the title of the work will be the title in the user interface language and, if this is not possible (i.e., there is no title label in this language), it could be the title in the catalog's default language.

Following the kind of modeling proposed in the current study, the visualizations of data become more flexible and efficient in a variety of dynamic ways. Hence, we can isolate and display nodes and their connections, correlate them with the interface language or screen size (i.e., mobile phone or PC), create levels relative to the desired depth of analysis, personalize them upon the user's request or habits, and so on. It also becomes possible to display the data in forms other than textual. "As a result, humans, with their great visual pattern recognition skills, can comprehend data tremendously faster and more effectively through visualization than by reading the numerical or textual representation of the data."43

As we have already mentioned, syntax and semantics are always going to have a close relationship, but it is crystal clear that, now more than ever, the current Semantic Web standards allow for greater flexibility. As Dunsire et al. put it,

The RDF approach is very different from the traditional library catalog record exemplified by MARC21, where descriptions of multiple aspects of a resource are bound together by a specific syntax of tags, indicators, and subfields as a single identifiable stream of data that is manipulated as a whole. In RDF, the data must be separated out into single statements that can then be processed independently from one another; processing includes the aggregation of statements into a record-based view, but is not confined to any specific record schema or source for the data. Statements or triples can be mixed and matched from many different sources to form many different kinds of user-friendly displays.44

In this framework, cataloging rules must reexamine their instructions in light of the new opportunities offered by technological advancements.
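For instance, two hypothetical sources can make independent statements about the same URI, and the triples merge into one graph simply by sharing that identifier; dct:creator below is from the real Dublin Core terms vocabulary, while everything in the ex: namespace is invented.

@prefix ex:  <http://example.org/> .
@prefix dct: <http://purl.org/dc/terms/> .

# From source A, e.g., a library catalog:
ex:work1 ex:hasAppellation ex:titleEn .

# From source B, e.g., an external data set:
ex:work1 dct:creator ex:personA .

# An application may aggregate such statements into a record-like display,
# but no fixed record boundary exists in the data itself.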
DISCUSSION

Naming is a vital issue for human cultures. Names are not random sequences of characters or sounds that stand merely as identifiers for entities; they also have socio-cultural meanings and interpretations. Recently, out of "political correctness" and fear of triggering racism, Sweden changed the names of bird species whose names could potentially offend, such as "gypsy bird" and "negro."45 Therefore we cannot treat names just as arbitrary identifiers.

In this study we examined how, instead of describing indivisible resources, we could describe entities that appear under a variety of names on various resources. We proposed a method for connecting names to the entities they represent and, at the same time, we documented the provenance of these names by connecting specific resources with specific names. We illustrated how to establish connections between entities, connections between an entity and a specific name of another entity, as well as connections between one name and another name concerning one or two entities. In the proposed framework, we maintain the linguistic character of naming while modeling the names in a machine-processable way. This formalism allows for a high level of expressiveness and flexible descriptions that do not have a static, text-centric orientation, since the central point is not the establishment of text values (i.e., headings) but the meaning of our statements.

This study has shown that it is important to be able to establish relationships both between entities and between specific appellations (nomens, in the context of this study) of these entities. To achieve this we promoted every appellation to an RDF node. This is not something unheard of in the domain of RDF, since the same approach has been adopted by W3C for the development of SKOS-XL.46 FRBRoo, another interpretation of increasing influence in the wider context of the FR family, adopts the same perspective.47 FRBRoo also gives the option to connect a specific name with a resource through the property "R64 used name (was name used by)," or to connect a name with someone who uses this specific name through the property "R63 named (was named by)."
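A rough sketch of these two options follows; the identifiers are placeholders echoing the FRBRoo property labels just cited, not the official FRBRoo URIs.

@prefix ex: <http://example.org/> .

ex:personA  ex:R63_named     ex:nomenEliot .  # "named (was named by)": the agent and a name used
ex:manif1   ex:R64_usedName  ex:nomenEliot .  # "used name (was name used by)": the resource and the name on it

# Linking the resource to the very name that appears on it documents the provenance
# of the name, while the entity ex:personA remains the point of collocation.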
Murray and Tillett state that "cataloging is a process of making observations on resources";48 hence, the production of records is the result of the judgments made during this process. But in the context of traditional descriptive cataloging, the cataloger was not required to judge information in any way other than its category, i.e., to characterize whether the X set of characters corresponded to the name of an author, publisher, or place and so on. There was no obligation to assign a particular name to a specific author, publisher, or place. In our approach, the cataloger interprets the information and supports the catalog's potential to deliver added-value information. Moreover, the initial information remains undifferentiated; hence, there is always the option of going back in order to generate new interpretations or validate existing ones.

In recent years, there has been a significant increase in the attention given to multi-entity models of resource description.49 In this new environment, "the creation of one record per resource seems a deficient simplification."50 RDF allows the transformation of universal bibliographic control into a giant global graph.51 In this manner, current approaches to resource description "cannot be considered as simple metadata describing a specific resource but more like some kind of knowledge related to the resource."52 Indeed, this knowledge can be computationally processable and exploitable. Yet, to achieve this, "catalogers can only begin to work in this way if they are not held bound by the traditional definitions and conceptualizations of bibliographic records."53 One critical issue is the isolation of parts (sets of statements) of this "giant graph" and the linking of these parts with something else; indeed, theory on this topic is starting to emerge.54 This is essential because it allows for the creation of ad hoc clusters (in our context, the usage of a specific identity for an entity together with all the names that have been assigned to this identity), which could be used as a set to link to some other entity.

As a final remark, we could say that authorities manage controlled access points. In the Semantic Web, every URI is a controlled access point; hence, the discrimination between description and authorities acquires a new meaning. In the context of machine-processable bibliographic data, the aim is to connect the two, i.e., the authorities with the description, and to examine how one can support the other. Since the emphasis is no longer on their individual management, we are drawn away from a mentality of "descriptive information versus access points" and towards one of "descriptive information as an access point."

ACKNOWLEDGEMENT

The author wishes to thank Henry Scott, who assisted in the proofreading of the manuscript.

REFERENCES AND NOTES

1. Grigoris Antoniou and Frank van Harmelen, A Semantic Web Primer, 2nd ed. (Cambridge, MA: MIT Press, 2008), 3.

2. IFLA, Functional Requirements for Bibliographic Records: Final Report, as amended and corrected through February 2009, IFLA Series on Bibliographic Control, vol. 19 (Munich: K.G. Saur, 1998), 6.

3. Daniel Kless et al., "Interoperability of Knowledge Organization Systems with and through Ontologies," in Classification & Ontology: Formal Approaches and Access to Knowledge: Proceedings of the International UDC Seminar 19–20 September 2011, The Hague, the Netherlands, edited by Aida Slavic and Edgardo Civallero (Würzburg: Ergon, 2011), 63–64.

4. Karen Coyle and Diane Hillmann, "Resource Description and Access (RDA): Cataloging Rules for the 20th Century," D-Lib Magazine 13, no. 1/2 (January 2007): para. 2, doi:10.1045/january2007-coyle.

5. Cory K. Lampert and Silvia B. Southwick, "Leading to Linking: Introducing Linked Data to Academic Library Digital Collections," Journal of Library Metadata 13, no. 2–3 (2013): 231, doi:10.1080/19386389.2013.826095.

6. IFLA, Functional Requirements for Bibliographic Records.

7. IFLA, Functional Requirements for Authority Data: A Conceptual Model, edited by Glenn E. Patton, IFLA Series on Bibliographic Control (Munich: K.G. Saur, 2009).
8. IFLA, "Functional Requirements for Subject Authority Data (FRSAD): A Conceptual Model" (IFLA, 2010), http://www.ifla.org/files/assets/classification-and-indexing/functional-requirements-for-subject-authority-data/frsad-final-report.pdf.

9. ALA, "RDA Toolkit: Resource Description and Access," sec. 0.3.1, accessed June 18, 2014, http://access.rdatoolkit.org/.

10. Gordon Dunsire, "Representing the FR Family in the Semantic Web," Cataloging & Classification Quarterly 50, no. 5–7 (2012): 724–41, doi:10.1080/01639374.2012.679881.

11. While this paper was under review, IFLA released the draft "FRBR-Library Reference Model" (FRBR-LRM), which is a consolidated edition of the FR family standards. It is developed from the respective individual standards following the principles of entity relationship modeling, which is challenged in this paper. Taking into account the ER modeling and the statement (on p. 5 of the standard) that "the model is comprehensive at the conceptual level, but only indicative in terms of the attributes and relationships that are defined," this consolidated edition could not be perceived as a standard that could be implemented directly as a property vocabulary qualifying for use in the RDF environment.

12. Main page (for all FR models) at http://iflastandards.info/ns/fr/; "FRBR Model" available at http://iflastandards.info/ns/fr/frbr/frbrer/; "FRAD Model" available at http://iflastandards.info/ns/fr/frad/; "FRSAD Model" available at http://iflastandards.info/ns/fr/frsad/. In addition to the previous there is FRBRoo; the element set is available at http://iflastandards.info/ns/fr/frbr/frbroo/.

13. Manolis Peponakis, "Conceptualizations of the Cataloging Object: A Critique on Current Perceptions of FRBR Group 1 Entities," Cataloging & Classification Quarterly 50, no. 5–7 (2012): 587–602, doi:10.1080/01639374.2012.681275.

14. Pat Riva and Chris Oliver, "Evaluation of RDA as an Implementation of FRBR and FRAD," Cataloging & Classification Quarterly 50, no. 5–7 (2012): 564–86, doi:10.1080/01639374.2012.680848.

15. Shoichi Taniguchi, "Viewing RDA from FRBR and FRAD: Does RDA Represent a Different Conceptual Model?," Cataloging & Classification Quarterly 50, no. 8 (2012): 929–43, doi:10.1080/01639374.2012.712631.

16. The RDA registry is available at http://www.rdaregistry.info/.

17. The nomen entity and the "has appellation" (reversed "is appellation of") property are also used by the FRBR-LRM.

18. Paul H. Portner, What Is Meaning?: Fundamentals of Formal Semantics (Malden, MA: Blackwell, 2005), 34.

19. IFLA, Functional Requirements for Bibliographic Records, 19:6.

20. W3C, "RDF 1.1 Concepts and Abstract Syntax: W3C Recommendation," February 25, 2014, http://www.w3.org/TR/2014/REC-rdf11-concepts-20140225/.

21. Gordon Dunsire, Diane Hillmann, and Jon Phipps, "Reconsidering Universal Bibliographic Control in Light of the Semantic Web," Journal of Library Metadata 12, no. 2–3 (2012): 166, doi:10.1080/19386389.2012.699831.
22. Manolis Peponakis, "Libraries' Metadata as Data in the Era of the Semantic Web: Modeling a Repository of Master Theses and PhD Dissertations for the Web of Data," Journal of Library Metadata 13, no. 4 (2013): 333, doi:10.1080/19386389.2013.846618.

23. IFLA, Functional Requirements for Bibliographic Records, 19:32.

24. IFLA, Functional Requirements for Authority Data: A Conceptual Model, 36–37.

25. Derek Rayside and Gerard T. Campbell, "An Aristotelian Understanding of Object-Oriented Programming," in Proceedings of the 15th ACM SIGPLAN Conference on Object-Oriented Programming, Systems, Languages, and Applications, OOPSLA '00 (New York: ACM, 2000), 350, doi:10.1145/353171.353194.

26. Jinfang Niu, "Evolving Landscape in Name Authority Control," Cataloging & Classification Quarterly 51, no. 4 (2013): 405, doi:10.1080/01639374.2012.756843.

27. IFLA, Functional Requirements for Authority Data: A Conceptual Model, 24.

28. Ibid., 20.

29. A "religious relationship" is the "relationship between a person and an identity that person assumes in a religious capacity"; for example, the "relationship between the person known as Thomas Merton and that person's name in religion, Father Louis" (IFLA, 2009, 61–62).

30. Junli Diao, "'Fu hao,' 'fu hao,' 'fuHao,' or 'fu Hao'? A Cataloger's Navigation of an Ancient Chinese Woman's Name," Cataloging & Classification Quarterly 53, no. 1 (2015): 71–87, doi:10.1080/01639374.2014.935543.

31. On Byung-Won, Sang Choi Gyu, and Jung Soo-Mok, "A Case Study for Understanding the Nature of Redundant Entities in Bibliographic Digital Libraries," Program: Electronic Library and Information Systems 48, no. 3 (July 1, 2014): 246–71, doi:10.1108/PROG-07-2012-0037.

32. Neil R. Smalheiser and Vetle I. Torvik, "Author Name Disambiguation," Annual Review of Information Science and Technology 43, no. 1 (2009): 1–43, doi:10.1002/aris.2009.1440430113.

33. Thomas B. Hickey and Jenny A. Toves, "Managing Ambiguity in VIAF," D-Lib Magazine 20, no. 7/8 (2014), doi:10.1045/july2014-hickey.

34. Martin Doerr, Pat Riva, and Maja Žumer, "FRBR Entities: Identity and Identification," Cataloging & Classification Quarterly 50, no. 5–7 (2012): 524, doi:10.1080/01639374.2012.681252.

35. IFLA, UNIMARC Manual: Authorities Format, 2nd revised and enlarged edition, UBCIM Publications—New Series, vol. 22 (Munich: K.G. Saur, 2001).

36. Library of Congress, "MARC 21 Format for Authority Data" (Library of Congress, April 18, 1999), http://www.loc.gov/marc/authority/.

37. IFLA, Functional Requirements for Authority Data: A Conceptual Model.

38. Martha M. Yee, "FRBRization: A Method for Turning Online Public Findings Lists into Online Public Catalogs," Information Technology and Libraries 24, no. 2 (2005): 81, doi:10.6017/ital.v24i2.3368.
39. See the FRBR-RDA mapping from the Joint Steering Committee for Development of RDA, available at http://www.rda-jsc.org/docs/5rda-frbrrdamappingrev.pdf.

40. Taniguchi, "Viewing RDA from FRBR and FRAD," 934.

41. Coyle and Hillmann, "Resource Description and Access (RDA): Cataloging Rules for the 20th Century," sec. 8.

42. IFLA, Functional Requirements for Bibliographic Records, 19:33.

43. Leonidas Deligiannidis, Amit P. Sheth, and Boanerges Aleman-Meza, "Semantic Analytics Visualization," in Intelligence and Security Informatics, edited by Sharad Mehrotra et al., Lecture Notes in Computer Science 3975 (Berlin: Springer, 2006), 49, http://link.springer.com/chapter/10.1007/11760146_5.

44. Dunsire, Hillmann, and Phipps, "Reconsidering Universal Bibliographic Control in Light of the Semantic Web," 166.

45. Rick Noack, "Out of Fear of Racism, Sweden Changes the Names of Bird Species," Washington Post, February 24, 2015, http://www.washingtonpost.com/blogs/worldviews/wp/2015/02/24/out-of-fear-of-racism-sweden-changes-the-names-of-bird-species/.

46. W3C, "SKOS eXtension for Labels (SKOS-XL) Namespace Document—HTML Variant," 2009, http://www.w3.org/TR/2009/REC-skos-reference-20090818/skos-xl.html.

47. Chryssoula Bekiari et al., FRBR Object-Oriented Definition and Mapping from FRBRER, FRAD and FRSAD, version 2.0 (draft), 2013, http://www.cidoc-crm.org/docs/frbr_oo//frbr_docs/FRBRoo_V2.0_draft_2013May.pdf.

48. Robert J. Murray and Barbara B. Tillett, "Cataloging Theory in Search of Graph Theory and Other Ivory Towers," Information Technology and Libraries 30, no. 4 (2011): 171, doi:10.6017/ital.v30i4.1868.

49. Thomas Baker, Karen Coyle, and Sean Petiya, "Multi-Entity Models of Resource Description in the Semantic Web," Library Hi Tech 32, no. 4 (2014): 562–82, doi:10.1108/LHT-08-2014-0081.

50. Peponakis, "Libraries' Metadata as Data in the Era of the Semantic Web," 343.

51. Kim Tallerås, "From Many Records to One Graph: Heterogeneity Conflicts in the Linked Data Restructuring Cycle," Information Research 18, no. 3 (2013), http://informationr.net/ir/18-3/colis/paperC18.html.

52. Peponakis, "Conceptualizations of the Cataloging Object," 599.

53. Rachel Ivy Clarke, "Breaking Records: The History of Bibliographic Records and Their Influence in Conceptualizing Bibliographic Data," Cataloging & Classification Quarterly 53, no. 3–4 (2015): 286–302, doi:10.1080/01639374.2014.960988.

54. Gianmaria Silvello, "A Methodology for Citing Linked Open Data Subsets," D-Lib Magazine 21, no. 1/2 (2015), doi:10.1045/january2015-silvello.
8923 ---- March_ITAL_Kuglitsch_proof Facilitating Research Consultations Using Cloud Services: Experiences, Preferences, and Best Practices

Rebecca Zuege Kuglitsch, Natalia Tingle, and Alexander Watkins

INFORMATION TECHNOLOGY AND LIBRARIES | MARCH 2017

ABSTRACT

The increasing complexity of the information ecosystem means that research consultations are increasingly important to meeting library users' needs. Yet librarians struggle to balance escalating demands on their time. How can we embrace this expanded role and maintain accessibility to users while balancing competing demands on our time? One tool that allows us to better navigate this balance is Google Appointment Calendar, part of Google Apps for Education. It makes it easier than ever for students to book a consultation with a librarian, while at the same time allowing the librarian to better control their schedule. Our experience suggests that both students and librarians felt it was a useful, efficient system.

INTRODUCTION

The growing complexity of the information ecosystem means that research consultations are increasingly important to meeting library users' needs. Although reference interactions in academic libraries have declined overall, in-depth research consultations have not followed that trend.1 These research consultations represent an increasingly large proportion of academic librarians' reference interactions, and they offer important opportunities to follow up on information literacy instruction, support student academic success, and relieve library anxiety. The library literature has demonstrated a need for and appreciation of these services.2 Moreover, students value face-to-face consultations because they provide an opportunity to talk through complex problems and questions while offering affective benefits such as relationship building and reassurance.3 It is evident that students seek out and value these services. But even as these services become increasingly important, librarians struggle to balance escalating demands on their time.
How can we embrace this expanded role and maintain accessibility to users while managing competing priorities? We found little guidance in the literature on the most efficient technological tools for offering these services to undergraduates, so we began to explore options. One tool that allows us to better navigate this shifting landscape is Google Appointment Calendar, part of Google Apps for Education. It makes it easier for students to book a consultation with a librarian while allowing the librarian to better control their schedule; consequently, it is being adopted by many librarians at the University of Colorado Boulder. There are several other options available for librarians interested in calendar applications, such as YouCanBook.me.4 However, on campuses using Google Apps for Education, it may be easier to use a tool students are already familiar with and commonly use as part of their daily academic routines. Moreover, the integration with Apps for Education solves some of the problems Hess noted in the public version of Google Calendar Appointments (which is no longer available), such as appointments booked without identifying information and the extra step of logging in just for an appointment. Because students are often already logged in to Google Apps for word processing, group work, and more, there is no extra step to log in for a simple appointment.5

Our exploration of this tool suggests that it is helpful to librarians and can benefit students as well. Research has proposed that students may hesitate to ask questions due to library anxiety. Would scheduling an appointment using a calendaring system be less intimidating than emailing a librarian directly, for example? We set out to apply this technology in an environment of changing student preferences and expectations, explore how students received it, and establish effective practices for using it in an academic setting. Since we are liaisons to science, social science, and humanities subject areas, we were able to work with a wide range of undergraduate students to see what might be most effective for us, and also for students from a variety of backgrounds.

Rebecca Zuege Kuglitsch (rebecca.kuglitsch@colorado.edu) is Head, Gemmill Library of Engineering, Mathematics & Physics, University of Colorado Boulder. Natalia Tingle (natalia.tingle@colorado.edu) is Business Collections & Reference Librarian, University of Colorado Boulder. Alexander Watkins (alexander.watkins@colorado.edu) is Art & Architecture Librarian, University of Colorado Boulder.

Why Google Calendar

We selected appointment booking via Google Calendar because of its ease of use and because the University of Colorado Boulder has Google Apps for Education. This means that every student has a Google ID and the option of using Google Calendar as part of their normal routine. In December 2012, Google discontinued appointment calendars for general users and limited claimable appointment slots to Google Apps for Education. For institutions that do not subscribe, it may be worth investigating third-party Google Calendar apps, some of which are free or freemium, such as Calendly (https://calendly.com/), or SpringShare's similar subscription service, LibCal (https://www.springshare.com/libcal/).
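Appointment slots themselves are created through the Calendar web interface, as the next section describes, and are not, to our knowledge, exposed by the public Google Calendar API. For librarians who want to script the adjacent task of putting a confirmed consultation on both parties' calendars, though, the v3 API can create an ordinary event with the student as an attendee. The sketch below is a minimal illustration under those assumptions, not part of our workflow; the credentials object, times, and location are placeholders.

```python
# A minimal sketch, not our workflow: appointment slots are a Calendar web
# interface feature, so this only illustrates creating an ordinary
# consultation event via the Google Calendar API v3. Assumes OAuth
# credentials (`creds`) were already obtained, e.g., with google-auth-oauthlib.
from googleapiclient.discovery import build

def create_consultation(creds, start_iso, end_iso, student_email):
    """Put a consultation on both calendars by inviting the student."""
    service = build("calendar", "v3", credentials=creds)
    event = {
        "summary": "Research consultation",
        "location": "Norlin Library",  # hypothetical location
        "start": {"dateTime": start_iso, "timeZone": "America/Denver"},
        "end": {"dateTime": end_iso, "timeZone": "America/Denver"},
        "attendees": [{"email": student_email}],
    }
    # Inviting the student mirrors the confirmation email both parties
    # receive when a slot is claimed through the shared calendar link.
    return service.events().insert(calendarId="primary", body=event).execute()
```

Because the student is added as an attendee, the event appears on both calendars, mirroring the built-in reminder benefit discussed below.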
Setting up Google Calendar

One of the benefits of Google Calendar is its ease of use. Setting up the calendar for appointment slots is as simple as creating a new Google Calendar event and selecting appointment slots as the type of event. Next, you can give your appointment slots a name that corresponds with the language your institution uses for research consultations and schedule them for the desired length of time. It is possible to schedule blocks of appointments that Google will automatically break into shorter appointments of predetermined lengths. The authors created appointments lasting 30 minutes, 60 minutes, or a mix of both, depending on the expectations of our disciplines. It is also possible to create several simultaneous appointment slots if you would like to accommodate small groups. As well as indicating time, each appointment also has a space to indicate location, which is particularly useful for librarians who might work in several branches or combine office hours in academic buildings with in-library consultations. Once the events are named and saved, the calendar can be shared.

Figure 1. Create a new event, selecting "Appointment slots."

Appointment calendars are given a unique shareable URL to direct users to available appointments; however, these URLs are necessarily long and complicated, so we recommend using a link shortener. To obtain the long URL for an appointment calendar, click on "edit details" in an appointment event. From there, it is possible to copy the link and use a link shortener to make a brief, understandable link.

Figure 2. Obtain the shareable link.

When a student uses the link to make an appointment, both the librarian and the student receive an email with the student's login name, email, appointment time, and other details. The slot immediately appears as taken on the calendar, so it is no longer available to other students, reducing confusion and double booking. Receiving the student's email allows the librarian to initiate the reference interview and establish expectations.

Figure 3. Google calendar showing a variety of available appointments.

Student Impressions

We received positive feedback about the appointment calendars from students. Students commented:

● "I like the ability to see all of the possible openings."
● "I already bookmarked that bit.ly, so you'll probably hear from me" (which we did, shortly thereafter).
● "I like to be able to 'schedule' a consultation, not request one. It seems more useful and immediate."

We kept track of how many students who made calendar appointments over two semesters kept them, and we sent a short, informal survey to students who made appointments. No student who made a calendar appointment failed to attend their consultation. Though our survey does not permit large-scale generalizations due to a very low number of responses (4) from a small sample (15), all of the students who responded and used the calendar found the experience of booking an appointment that way easy, convenient, and unintimidating.
Everyone who used the calendar indicated that they would prefer to use it again, and about half of the respondents who set up their appointments via email told us that they would prefer to book a consultation through an appointment calendar in the future. Our anecdotal evidence in succeeding semesters aligns with this perception. We found that using appointment calendars can have many benefits for students:

● They can reduce student anxiety by removing the need to compose and send an email.
● Booking appointments can take less of their time. Students book immediately, without back-and-forth emailing. This also means there is no time to rethink the appointment and either never send the email or back out later.
● The appointment is placed on their calendar, meaning they automatically have a built-in reminder and don't need to search through their email to find the date and time of their appointment.
● Since the appointment calendars eliminate back-and-forth scheduling and reduce email fatigue, students may be more willing to use email to discuss their topic or question with the librarian.

Librarian Impressions

Our experience has been equally positive. We found that using the calendars radically streamlines the typical back-and-forth email exchanges for setting appointments. We emailed each student to confirm the appointment, but this single email still significantly reduces the claim on the librarian's attention, from a minimum of three emails to schedule an appointment (which often realistically becomes five or more when negotiating a time) to two. Additionally, librarians can put appointment slots between meetings and at other times when they might have only a spare hour, slots that are often too tedious to list when emailing. Using appointment calendars lets librarians use their time efficiently even when it is fragmented. As well as facilitating efficient use of small amounts of time, appointment calendars also allow librarians to gently create boundaries. Rather than having to decline appointments requested for late nights or weekends, students are guided to viable times.

While the use of Google Calendar is entirely voluntary at the University of Colorado Boulder, we presented the tool at several reference librarian meetings with success, and several other librarians have happily adopted it. One librarian who adopted the tool said: "Sending a student a calendar that they can use to request a meeting eliminates the twelve messages back and forth on when to schedule a meeting. I also like that it puts the meeting on both our calendars, reducing the number of no-shows."

BEST PRACTICES

Our experiences and verbal feedback from students and librarians provided a foundation for developing best practices that minimize both librarian and student confusion. For students, confusion often centered on accessing the calendar, identifying which time slots were available, and identifying acceptable locations for appointments. The following best practices can help solve these difficulties.

Use a link shortener and a consistent naming convention so the links are similar for multiple librarians. Using a link shortener makes it easy for students to jot down the calendar URL, either to manually enter into a browser later or to quickly reach the link and bookmark it. This makes it easy for students to file the link and return to it at point of need.
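Any shortener works for this. As a concrete illustration, the minimal sketch below assumes a service that exposes a simple HTTP GET endpoint returning the short URL as plain text (TinyURL's api-create.php behaves this way; bit.ly and similar services require an API key instead), and the long calendar URL shown is a placeholder, not a real booking link.

```python
# A minimal sketch, assuming a shortener with a plain-text GET endpoint
# (TinyURL's api-create.php). The appointment-calendar URL is a placeholder.
import requests

LONG_APPOINTMENT_URL = "https://www.google.com/calendar/selfsched?sstoken=EXAMPLE"

def shorten(long_url):
    """Return a shortened form of the given URL."""
    resp = requests.get("https://tinyurl.com/api-create.php",
                        params={"url": long_url}, timeout=10)
    resp.raise_for_status()
    return resp.text.strip()

print(shorten(LONG_APPOINTMENT_URL))  # e.g., https://tinyurl.com/xxxxxxx
```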
Using a consistent naming convention makes it intuitive for students to transfer the appointment method over to other librarians for future research needs.

If your link shortener is case-sensitive, create capitalized and lowercase versions of the link. Many link shorteners are case-sensitive, unlike most URLs, which can confuse students and lead to frustration when they try to access a link later. While this could be solved to some extent by using only lowercase letters for the shortened link, that solution can create a cumbersome and difficult-to-read short URL. Simply creating two forms of the link efficiently solves this.

Develop a naming convention so available appointment slots are obvious. We found that when time slots were named simply "Consultation," students sometimes assumed that all appointments were booked when, in fact, every appointment was open. Using a term like "Available consultation" made it clear to students that the appointments were not already booked. Google Calendar automatically makes booked appointments unavailable, eliminating the opposite frustration.

Carefully consider the location in the bookable appointment form. Google Calendar allows librarians to enter a location or leave the field empty. If the field is left empty, users can specify a location, and students often filled in a location when none was indicated. If a librarian is not mobile, or is available in certain places only at certain times, it is key to identify a location. For example, in our study, one librarian held weekly office hours in two academic buildings; it was particularly important to identify which times the librarian was available in the library versus the academic buildings. On the other hand, it may also make sense not to designate a location. Another of the authors, serving a population that used the main library, one branch library, and a research area of the campus with no onsite library services, chose not to enter any location in order to accommodate this extremely dispersed population. Users frequently indicated the location where they would be willing to meet, an option the librarian wanted to support in order to underscore the availability of services wherever users were located on campus.

Schedule two weeks of availability. We found that students could almost always find a time that worked for them within two weeks of available appointments. Moreover, other than recurring office hours, it was difficult for librarians to predict their schedules more than a few weeks into the future.

Librarian concerns centered on keeping calendars synchronized, providing enough lead time for users to book appointments, and publicizing the service. We found several best practices that eased these concerns.

Designate a day each week to update hours and clear conflicts on the calendar. If Google Calendar is not the primary calendaring software for the library, it can be challenging to keep calendars synchronized. Google Calendar sends a calendar invitation to the librarian when an appointment is claimed, which they can accept on their primary calendaring system, but conflicts that arise on the primary calendaring system are not automatically sent to Google Calendar.
By selecting a day and habitually updating the Google Calendar and quickly checking for conflicts with unclaimed slots, librarians can avoid forgetting to add slots or to remove those that conflict with late-arising obligations.

Advertise the link on the library website, give out the calendar link during class sessions, and give it to professors to embed in course management systems. While appointment calendars still benefit librarian workflows without advertising, students need easy access to the calendar. For maximum user uptake, it is important to put the calendar link anywhere a librarian's contact information can be found. We found it helpful to promote the link in classes, and it was particularly effective when professors agreed to place the link on the class website. This positions library research assistance next to assignments when they are given out and drafts when they are returned, hopefully reminding students that the library is available for assistance at the moments they are most likely to seek it.

REFLECTIONS AND CONCLUSIONS

Our experiences support the idea that online appointment calendars are appreciated by students, streamline work for librarians, and are easily adopted by both parties. More use of this technology, whether via Google Apps for Education or another service, can be mutually beneficial to librarians and students. Students using the calendar indicated that it was not more intimidating than emailing a librarian, and by removing the waiting period for a response, a calendar can prevent student distraction or students persuading themselves in the interim that they do not actually need help. By providing a calendar where students can quickly and simply book an appointment with a librarian for research assistance, librarians can support students seeking assistance, and thus ultimately bolster student success and increase the library's relevance.

REFERENCES

1. Naomi Lederer and Louise Mort Feldmann, "Interactions: A Study of Office Reference Statistics," Evidence Based Library and Information Practice 7, no. 2 (2012): 5–19.

2. Ramirose Attebury, Nancy Sprague, and Nancy J. Young, "A Decade of Personalized Research Assistance," Reference Services Review 37, no. 2 (2009): 207–20, https://doi.org/10.1108/00907320910957233; Trina J. Magi and Patricia E. Mardeusz, "What Students Need from Reference Librarians: Exploring the Complexity of the Individual Consultation," College & Research Libraries News 74, no. 6 (2013): 288–91.

3. Trina J. Magi and Patricia E. Mardeusz, "Why Some Students Continue to Value Individual, Face-to-Face Research Consultations in a Technology-Rich World," College & Research Libraries 74, no. 6 (November 1, 2013): 605–18, https://doi.org/10.5860/crl12-363.

4. Amanda Nichols Hess, "Scheduling Research Consultations with YouCanBook.Me: Low Effort, High Yield," College & Research Libraries News 75, no. 9 (October 1, 2014): 510–13.

5. Hess, "Scheduling Research Consultations with YouCanBook.Me," 511.

8930 ---- September_ITAL_Ullah_for_proofing Bibliographic Classification in the Digital Age: Current Trends and Future Directions

Asim Ullah, Shah Khusro, and Irfan Ullah

INFORMATION TECHNOLOGY AND LIBRARIES | SEPTEMBER 2017

ABSTRACT

Bibliographic classification is among the core activities of Library & Information Science, bringing order and proper management to the holdings of a library.
Compared to printed media, digital collections present numerous challenges regarding their preservation, curation, organization, and resource discovery and access. Therefore, a truly native perspective needs to be adopted for bibliographic classification in digital environments. In this research article, we investigate and report different approaches to the bibliographic classification of digital collections. The article also contributes two evaluation frameworks that evaluate existing classification schemes and systems. The article presents a bird's-eye view for researchers working toward a generalized and holistic approach to bibliographic classification research, and it identifies new research avenues.

INTRODUCTION

Classification is the primary instinct of human beings in arranging, understanding, and relating knowledge artifacts. Bibliographic classification provides a framework for arranging and organizing knowledge artifacts preserved in the form of books, magazines, newspapers, and other holdings to explore new avenues of knowledge management. Today several classification schemes are in use, ranging from conventional schemes, including Library of Congress Classification (LCC), Dewey Decimal Classification (DDC), Colon Classification (CC), and Universal Decimal Classification (UDC), to classification for digital environments, including the Association for Computing Machinery (ACM) digital library (http://dl.acm.org/), the Institute of Electrical and Electronics Engineers (IEEE) digital library (http://ieeexplore.ieee.org/Xplore/home.jsp), and the Online Computer Library Center (OCLC) cooperative catalogue (https://www.oclc.org/).

Besides the difficulties that lie in devising a classification scheme (it is time- and resource-consuming), it is necessary either to revise and extend the existing schemes or to devise a new classification scheme that could act as a common platform for representing knowledge artifacts belonging to different contexts. Such a classification scheme should also resolve the challenges in digital preservation and curation and support the precise and accurate search and retrieval of digital collections. The first step, in this connection, is to properly analyze and evaluate the existing bibliographic classification schemes and to identify their strengths and limitations in classifying digital collections accurately and appropriately. Therefore, the objectives of this research article include:

• To investigate and evaluate the available approaches to bibliographic classification from the perspective of devising a classification scheme that can act as a common platform for classifying any type of digital collection.
• To devise evaluation frameworks that compare the available bibliographic classification schemes and approaches.
• To present issues, challenges, and research opportunities in state-of-the-art bibliographic classification research.

Asim Ullah (asimullah@upesh.edu.pk), Shah Khusro (khusro@upesh.edu.pk), and Irfan Ullah (cs.irfan@upesh.edu.pk) are researchers at the Department of Computer Science, University of Peshawar, Peshawar, Pakistan.

The rest of the paper is organized as follows: Section 2 presents current trends in the classification of digital collections. Section 3 presents two evaluation frameworks for comparing and evaluating the existing solutions.
Section 4 presents research challenges and opportunities in bibliographic classification research. Finally, Section 5 concludes our discussion. References are presented at the end of the paper.

Classifying Digital Collections – A Mixed Trend

Bibliographic classification has been the focus of several researchers seeking to properly classify, catalogue, and describe digital collections. In this regard, two approaches have been adopted: the former supports the use of conventional classification schemes, including CC, DDC, and LCC, in describing and classifying digital documents, while the latter recommends devising new ways of classification, such as the ACM computing classification (http://dl.acm.org). However, in most digital environments a mixed trend has been observed, where categorization is used as a complementary solution alongside new classification schemes. For example, ACM presents its classification system as a poly-hierarchical ontology for describing Computer Science literature and for use in Semantic Web applications. It has replaced the 1998 ACM classification system, serves as a de facto model for the classification of Computer Science literature, and offers a visual topic display along with search services. It serves as a semantic vocabulary for categorizing concepts and a foundation for the computing disciplines ("The 2012 ACM Computing Classification System").

Similarly, the IEEE digital library categorizes its holdings into directories according to its own rules of cataloguing and categorization. It categorizes articles and standards into several subject areas and clusters documents by year of publication, author name, content type, affiliation, publication title, publisher, country of publication, and alphabetical, numeric, and alphanumeric values (http://ieeexplore.ieee.org/browse/standards/ics/ieee/). The document collection can be navigated by collection name, number of documents, topic, and the International Classification for Standards (ICS).

The DMOZ directory (http://www.dmoz.org) is the largest human-made directory of web pages. Since its inception in 1998, it has categorized 3,861,137 websites available in 90 languages into 1,031,719 categories and sub-categories, maintained by 91,928 editors and volunteers. In addition, its DMOZ RDF dumps are available on the Linked Open Data (LOD) cloud. According to the World Wide Web Consortium (W3C), LOD enables data integration and reasoning at a large scale ("Linked data"). It establishes links among data, enabling machines and users to explore the web of data rather than the web of documents and to find related data (Berners-Lee, 2006; Bizer, Heath, & Berners-Lee, 2009). However, DMOZ lacks semantic (meaningful) search, which affects precision and accuracy in exploring the required resources. Also, the categories under which the websites are kept need to be revised, because there can be faceted and intra-hierarchical links among web pages. In addition, its content management needs to be upgraded with respect to updating the directory with new entries and the way it reviews and categorizes websites (Boykin, 2016).

Institutional repositories use a mixed approach toward creating, collecting, and managing metadata for printed and digital collections using several sources, both conventional and digital. This mixed trend introduces challenges for metadata managers (Chapman, Reynolds, & Shreeves, 2009).
To deal with these challenges, subject classification systems can be very beneficial in providing Web-oriented services, including searching content through search patterns, browsing, and filtering content by subject area. At the same time, however, a cognitive overload arises for the authors and depositors of the institutional repository (Cliff, 2008), which needs further attention.

To handle the information overload in retrieving digital collections, several controlled methods have been proposed in the literature, ranging from manual techniques (e.g., web directories) to automatic techniques, including clustering and classification. Several classification schemes, including sentiment and subject classification, have been developed for classifying (and categorizing) web pages. Classification is used in focused crawling, in searching and ranking results, and in classifying queries. Clustering also classifies web resources, but it differs slightly from classification, which is based on a rigid predefined taxonomy and rules for interpreting the meaning of the classification order; clustering, by contrast, allows a flexible classification (categorization) of web documents (Zhu, 2011). However, a mixed trend has been observed, where classification and categorization are intermingled to facilitate the organization, description, exploration, and retrieval of digital collections.

The Semantic Web brings meaningful connections to the web of data so that not only humans but also machines can understand the content of documents and retrieve the most relevant ones; this way, other related documents can also be easily connected and retrieved (Berners-Lee, 2006). To understand, describe, and relate concepts within documents, ontologies are used. Therefore, researchers have been working on bringing semantics, through Semantic Web and related technologies, to the automatic classification of digital collections. For example, Beghtol (1986) argues that a semantic axis makes a syntactical classification structure more meaningful and provides the platform for developing relationships among knowledge artifacts through several warrants in classification systems. Similarly, classification ontology is used in automatic classification (Wijewickrema & Gamage, 2013) to minimize ambiguity in vocabulary. To obtain a single subject for an input document, several weight functions, including term frequency-inverse document frequency (TF-IDF), and filtering methods are applied.

Semantic Web and LOD technologies have also been used in dealing with bibliographic data. For example, BibBase (https://bibbase.org/), a bibliographic data publishing and management tool (Xin, Hassanzadeh, Fritz, Sohrabi, & Miller, 2013), publishes bibliographic data on the user's website according to LOD principles. However, such tools are limited by the lack of interoperability among natural languages when translating classification records from a source language to a target language (Kwaśnik & Rubin, 2003). Classification schemes are also being converted into ontologies: Giunchiglia, Marchese, and Zaihrayeu (2007) have applied the reasoning capabilities of OWL ontologies to classification schemes. These ontologies serve machines as interfaces to human knowledge, whereas classification schemes are interfaces to knowledge for humans.
However, there is limited support for cross-disciplinary searching and for accommodating multiple views and interpretations of knowledge (Albrechtsen, 2000).

Supervised and unsupervised machine learning techniques are used for automatic text classification. Supervised machine learning techniques use models including the multinomial Naïve Bayes model and the Bernoulli model (Manning, Raghavan, & Schütze, 2008). Yelton (2011) applies probabilistic classification of important words (and therefore of documents), especially by considering Amazon's Statistically Improbable Phrases (SIPs) (http://www.amazon.com/gp/search-inside/sipshelp.html) and Google phrase search inside a book. For subject analysis, he mentions simplistic, content-based, and requirements-based methods for understanding text classification and the manipulation of books. The Wikipedia page structural hierarchy is exploited in automatic harvesting, classification, categorization, clustering, and metadata enrichment (Yelton, 2011). Information Extraction (IE) is also applied in classifying books automatically. For example, Betts, Milosavljevic, and Oberlander (2007) use IE methods for the automatic labeling of books with LCC classes. They used the bag-of-words (BOW) model, a bag-of-named-entities (NER) model, and a generalizing-named-entities (GAZ) model in automatic text classification, and they combined the results of these models to achieve better accuracy. However, automatic classification may lead to limited search and retrieval because of the missing semantics associated with phrases or keywords. To overcome this issue, a fundamental and practical theoretical model of classification is required (Jones, 1970).
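As a concrete illustration of the TF-IDF weighting and multinomial Naïve Bayes model named above, the following minimal sketch uses scikit-learn; the three training snippets and their subject labels are invented toy data, not drawn from any catalogue.

```python
# A minimal sketch of TF-IDF weighting feeding a multinomial Naive Bayes
# classifier, using scikit-learn. Training texts and labels are toy data.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.naive_bayes import MultinomialNB
from sklearn.pipeline import make_pipeline

docs = [
    "postage stamps philately covers postal history",
    "ontology semantic web linked data rdf vocabularies",
    "monetary policy banking inflation interest rates",
]
labels = ["Philately", "Knowledge organization", "Economics"]

# tf-idf(t, d) = tf(t, d) * idf(t), where idf grows as a term appears in
# fewer documents, so distinctive subject terms receive higher weight.
model = make_pipeline(TfidfVectorizer(), MultinomialNB())
model.fit(docs, labels)

print(model.predict(["faceted classification with rdf and linked data"]))
# -> ['Knowledge organization'] on this toy data
```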
Table 1 categorizes the bibliographic classification approaches into three broader categories, namely theoretical approaches, practical approaches, and approaches used in digital environments. Theoretically, researchers have discussed different viewpoints on classification, whereas we get a different view when these schemes are applied in practice. Practically, the syntactic structure is valued by using faceted and enumerative techniques. In digital environments like the Web and digital libraries, strict boundaries of classification are often compromised by categorization.

Theoretical approaches:
1. Biasness (Mai, 2009) (Mai, 2010)
2. Subjectivity and objectivity (Hjørland, 2016)
3. Epistemological and semiotic approaches (Hjørland, 2013) (Lee, 2012; Mai, 2011) (Tennis, 2008)
4. Empiricism, rationalism, historicism, and pragmatism (Hjørland, 2013)
5. Multidisciplinarity approach (Beghtol, 1998)
6. Scientific approaches (Hjørland, 2008)
7. Positivistic and pragmatic approaches (Dousa, 2009) (Mai, 2011)
8. Interdisciplinary and evidence-based practice classification (Hjørland, 2016)
9. Social and cultural context (J.-E. Mai, 2004)
10. Tracking the universe of knowledge
11. Universal order (Smiraglia & Van den Heuvel, 2011)
12. Integrative levels in classification (Dousa, 2009)
13. Literary warrant (Rodriguez, 1984)
14. Education warrant (Hjørland, 2007) (Beghtol, 1986)
15. Semantic warrant (Beghtol, 1986)
16. Syntactic warrant (Beghtol, 1986)
17. Domain and user requirements (Mai, 2005)
18. Pluralism and human interpretations

Practical approaches:
1. Enumerative and faceted (Batley, 2014)
2. General-purpose approach (Mai, 2003) and special-purpose approach (Mancuso, 1994), e.g., classification schemes for general classes of knowledge areas or for a special class of knowledge area
3. Syntactic axis (Beghtol, 1986) (Beghtol, 2001)
4. Semantic axis (Beghtol, 1986) (Beghtol, 2001)

Classification in digital environments:
1. Document similarity (Hamming-distance and Euclidean geometric approaches) (Losee, 1993)
2. Fuzzy approach (Jacob, 2004)
3. Clustering (Nizamani, Memon, & Wiil, 2011)
4. Categorization (Koshman, 1993)
5. TF-IDF weighting (Dorji et al., 2011)
6. Unsupervised machine learning techniques (k-means clustering, hierarchical clustering) (Joorabchi & Mahdi, 2011)
7. Supervised machine learning techniques (multinomial Naïve Bayes, Bernoulli model, support vector machine, random forest, k-NN) (Wang, 2009)
8. Information Extraction methods (Gilchrist, 2015)
9. Probabilistic text and document classification (Maron, Kuhns, & Ray, 1959)
10. Ontologies (Campbell, 2002)

Table 1. Categorization of approaches towards bibliographic classification

Evaluating Classification Schemes & Approaches

In this section, we present two evaluation frameworks to compare and evaluate the existing classification and categorization systems and well-known bibliographic classification ontologies. We have chosen CC, DDC, LCC, and UDC on the basis of their structural properties and wide usage in both conventional and digital libraries ("Subject classification schemes," 2015) ("Library of Congress Classification," 2014) ("About Universal Decimal Classification (UDC)") (Press, 2002) (Encyclopedia, 1 August 2014). Some of these properties include citation and filing order; notational expressiveness; flexibility in classification principles, rules, and notations; coverage of knowledge areas; the structure of classification schedules and notations; notational brevity and simplicity; notational mnemonics; notational hospitality; updateable schedules with a comprehensive subject order; and knowledge coverage (Batley, 2014). UDC, LCC, and DDC are universal, multidisciplinary, and widely used systems (Koch & Day, 1997), whereas CC has seminal and inspirational value for the faceted structure of bibliographic classification. Therefore, the evaluation framework mainly targets these classification schemes as our natural choice for evaluation and comparison. Similarly, we evaluate ACM (http://www.acm.org/about/class), IEEE (http://www.ieee.org/about/today/at_a_glance.html), and DMOZ (https://www.dmoz.org/docs/en/about.html) using the evaluation framework, as these are well-known and widely used document classification and categorization systems for digital libraries. Table 2 presents the 22 metrics used in the evaluation framework. These evaluation metrics are extracted from the existing literature (Kaosar, 2008) (Painter, 1974) (Encyclopedia, 1 August 2014) (Buchanan, 1979) (Koch et al., 1997) (Reiner, 2008) (Gnoli, Merli, Pavan, Bernuzzi, & Priano, 2008) (Francu, 2007) (Chan, Intner, & Weihs, 2016).
These metrics include: (i) structural complexity; (ii) notational brevity; (iii) predefined structure; (iv) rules complexity; (v) theoretical laws; (vi) mnemonics; (vii) hospitality; (viii) search complexity; (ix) usability; (x) precision and accuracy; (xi) multilinguality; (xii) interoperability; (xiii) semantic search; (xiv) bias in subject representation; (xv) enumerative structure; (xvi) faceted structure; (xvii) faceted search; (xviii) consistency; (xix) LOD datasets; (xx) Linked Open Vocabularies (LOV) support; (xxi) platform; and (xxii) warrants of classification. These metrics, their rationale, and their use in rating the classification systems are discussed in the following paragraphs. In Table 2, these bibliographic systems are evaluated against these metrics: ✓ indicates that the system supports the metric, ✗ that it has no or minimal support for it, and N/A that the metric is not applicable. In addition, each classification system has been evaluated and rated based on these metrics (Table 3), and Figure 1 graphically demonstrates the rankings and ratings of these classification systems.

Metric                   CC    UDC   DDC   LCC   ACM   IEEE  DMOZ
Structural Complexity    ✓     ✓     ✗     ✗     ✗     ✗     ✗
Notational Brevity       ✗     ✗     ✓     ✓     ✓     ✓     N/A
Predefined Structure     ✓     ✗     ✓     ✓     ✓     ✓     ✓
Rules Complexity         ✓     ✓     ✗     ✓     ✗     ✗     ✗
Theoretical Laws         ✓     ✓     ✓     ✓     ✗     ✗     ✗
Mnemonics                ✓     ✓     ✓     ✓     ✓     ✓     ✗
Hospitality              ✓     ✓     ✓     ✓     ✓     ✓     ✓
Search Complexity        ✓     ✓     ✗     ✗     ✗     ✗     ✓
Usability                ✓     ✓     ✓     ✓     ✓     ✓     ✗
Accuracy and Precision   ✓     ✓     ✓     ✓     ✓     ✓     ✗
Multilinguality          ✓     ✓     ✓     ✓     ✗     ✗     ✓
Interoperability         ✗     ✓     ✓     ✓     ✓     ✓     ✗
Semantic Search          ✓     ✓     ✓     ✓     ✓     ✓     ✗
Bias in Representation   ✓     ✓     ✓     ✓     ✓     ✓     ✗
Enumerative Structure    ✗     ✓     ✓     ✓     ✗     ✗     ✗
The notational brevity means how brief are the notations in describing and understanding the holdings with minimum number of symbols and minimal cognitive load. DDC uses well-organized short notations and their mnemonic value is also greater (Comaromi & Satija, 1983) (Hyman, 1980). LCC has notational brevity (Chan et al., 2016). UDC uses lengthy notations (Kaosar, 2008) as compared to DDC, whereas CC also uses lengthy and complex notations (Chatterjee, 2016). ACM notations are shorter than IEEE, whereas DMOZ do not use any notations at all. Using this metric, these classification systems can be ranked as ACM, IEEE, DDC, LCC, UDC, CC, and DMOZ at the last with no usage of symbols at all. The predefined structure means that the classification scheme follows rigid pre-assumed subject categorization along with classification class marks. In this regard, UDC and LCC are enumerative INFORMATION TECHNOLOGY AND LIBRARIES | SEPTEMBER 2017 56 and impose subjectivity viewpoint of classification by following a predefined structure (Goh, Giess, McMahon, & Liu, 2009). Being faceted, CC arranges basic concepts in few predefined categories (Satija & Martínez-Ávila, 2015). DDC also has the predefined hierarchical structure of classification (Press, 2002) (Jonassen, 2004). Among these schemes, CC has minimal predefined structure because of using facets; UDC is both enumerative and analytico-synthetic. LCC is enumerative but possesses weaker predefined rules for the structural design. Because of the rigid enumerative hierarchies and predefined class structure, DDC comes at first position. DMOZ has the most rigid predefined structure as compared to that of IEEE and ACM. The classification system with most rigid and predefined structure is ranked lower, and therefore, the ranking could be CC, ACM, IEEE, UDC, DDC, LCC and DMOZ. The complexity in rules determines the difficulty level in applying classification rules on knowledge artifacts. CC presents a complex set of rules and classification theory, which is comparatively difficult to implement and understand (Tennis, 2011). LCC is also complex ("Library of Congress Subject Headings: Pre- vs. Post-Coordination and Related Issues," March 15, 2007 ) in implementing Library of Congress Subject Headings (LCSH) in pre-coordinated subject strings. DDC’s rules and principles are comprehensive and complete (Press, 2002) and easier than those of CC and LCC. UDC is also easy to understand and implement (Piros, 2014). ACM, IEEE, and DMOZ are simple to use and understand, and therefore, bears no such complexity. A classification system with greater complexity is ranked lower, therefore, based on this metric, the rankings could be ACM, IEEE, DMOZ are on top with similar rankings followed by UDC, DDC, LCC, and CC. Theoretical laws are considered as a metric to analyze the foundations of classification systems to understand whether they are based on certain theoretical laws and principles of classification or not. UDC combines the enumerative and faceted approaches gathered from DDC and CC (Kaosar, 2008). The synthetic principle of UDC contributes to its widespread use but it is not enough at the intellectual level for making the relations between the subject facets (Kyle & Vickery, 1961). UDC lacks standard rules for its application for making facets, but there are rules for its structural representation (McIlwaine, 1997). Therefore, the structural and synthetic rules are good enough for its applicability but it should be refined further at the intellectual level. 
The theoretical laws of CC are based on the faceted approach of managing knowledge artifacts. CC has sound rules and principles, which include different postulates, laws, principles and canons (Batley, 2014) (Arashanapalai Neelameghan & Parthasarathy, 1997). On the other hand, LCC has weaker theoretical foundations. There also exists some intellectual and structural limitations due to its enumerative structure (San Segundo Manuel, 2008). DDC has the hierarchical and the enumerative structure which is based on the knowledge philosophy of hierarchical division (Hjorland, 1999). Because of the strong theoretical foundations, CC is at the top of this list, DDC is second because of its universal theory of knowledge division, UDC is third for being exploiting the theories of DDC and CC, LCC is at fourth position for comparatively weak theory of classification, whereas ACM, IEEE, and DMOZ present no or very limited theoretical laws or philosophical rules of classification. BIBLIOGRAPHIC CLASSIFICATION IN THE DIGITAL AGE | ULLAH, KHUSRO, AND ULLAH | doi:10.6017/ital.v36i3.8930 57 The support for using mnemonics enables human classifiers to easily memorize the symbols and notations of classification scheme. The systematic and literal mnemonics are used in UDC (Satija, 2013) (Kaosar, 2008). The mnemonics are increased through mnemonic devices, which are described through the canons of mnemonics (Kaula, 1965). LCC uses literal mnemonics (Satija, 2013), whereas DDC uses systematic and literal mnemonics but its systematic mnemonics are not consistent (Satija, 2013). There are several seminal mnemonics in CC (Rahman & Ranganathan, 1962). These mnemonic devices increase mnemonics in CC, but the formation and length of the notations affects this mnemonic quality. ACM has greater support for mnemonics in comparison with IEEE, whereas DMOZ is the collection of web pages under specific categories. Based on this metric, the rankings of classification systems could be DDC, UDC, LCC, ACM, IEEE, CC, whereas DMOZ lacks in using any mnemonic devices or notations. Hospitality means the ability of a classification scheme to incorporate new knowledge areas expressed in different multilingual contexts. Hospitality is present in UDC (Kaosar, 2008). CC is also hospitable for new subjects (De Grolier, 1962). LCC is hospitable for expressing the new subjects and knowledge areas (Satija, 2013). DDC is hospitable for new subject areas (Satija, 2013). By applying this metric, a classification scheme with faceted approach is naturally more hospitable than others. Therefore, CC is more hospitable and at the top in this list followed by UDC. DDC is at third position for being following enumerative approach. LCC is at fourth position because of it’s of pure enumerative structure. IEEE and ACM are at fifth position by covering short span of knowledge areas, faceted structure, and efficient search. DMOZ is covering only web pages in already specified categories therefore it is at seventh position. Search complexity measures the difficulty in searching artifacts using a classification scheme. It describes that which classification scheme is worth in searching a specific document. Search complexity is minimal in UDC because of its syntactic-analytico and enumerative nature (Kaosar, 2008), which can contribute in search applications in both Web based and in-house searching applications e.g., Online Public Access Catalog (OPAC). 
The theory and philosophy of CC set the trend for knowledge management and for resource discovery and access; however, according to Raghavan (2016), searching through CC is comparatively weaker than through other bibliographic classification schemes. According to Chan (2000), LCC and LCSH have the potential to ease searching because of their rich vocabulary with greater subject coverage, synonym and homograph capabilities, pre-coordinated system, browsing capability in a multi-faceted structure, multilingual support, and MARC format support with semantic interoperability. However, they are limited in easing the search and retrieval process by the complexity of their syntax and application rules, a lack of training for personnel, and overly lengthy and complex search strings. DDC and LCC are aggregated in the Classify project (http://www.oclc.org/research/themes/data-science/classify.html) initiated by OCLC; with the Classify application, the search experience of catalogers and patrons becomes much easier. Using this metric, DDC stands at the top with less complexity than LCC, UDC, and CC, while IEEE is more complex than ACM and DMOZ. The classification scheme with less search complexity is ranked higher. Therefore, ACM, IEEE, DDC, and LCC stand first with the least search complexity, followed by UDC and CC. DMOZ stands at the last position with greater search complexity, having loose boundaries of categorization.

Usability analyzes the difficulty of using a classification scheme for classifying and searching documents. This metric reflects ease of learning and effective usage; usability measures user satisfaction, user understanding of the system, and precision with minimal recall in a smaller amount of time (Singapore, 2016). OCLC has included structural changes to improve usability and simplify classification tasks ("Dewey Services: Dewey Decimal Classification System"). The Classify project (http://classify.oclc.org/classify2/) aims at finding books through a web interface that is easy to use and understand, using DDC and LCC. UDC is extensively used in Web-based search and retrieval applications (Kaosar, 2008) and in several institutions' OPAC systems ("Library OPACs containing UDC codes"). The UDC notations support usability (Slavic-Overfield, 2005), though the user interfaces of these OPAC search systems could be further improved (Slavic, 2006) (Pollitt, 1998) (Schallier, 2005). CC is the source of inspiration and a standardized model for the usability of the faceted structure of bibliographic classification in electronic and web-based environments (Thelwall, 2009). In Rosenfeld and Morville (2002), the philosophy and methodology of CC are considered at the abstract and theoretical level. This assessment of CC leads us to argue that the faceted structure supports precise retrieval, but at considerably high cognitive cost to the user compared to DDC and LCC, with their simple enumerative structures. The Library of Congress uses LCC in its catalog (https://catalog.loc.gov/vwebv/searchBasic) and Classification Web (https://www.loc.gov/cds/classweb/classwebfeatures.html) applications, which exploit LCSH and LCC in a user-friendly manner. Looking at the usability of these classification schemes, DDC ranks at the top for its easy enumerative structure and notational simplicity, along with its easy-to-use Web applications.
LCC is at second position because of its enumerative structure and adaptability to web applications. Being both enumerative and faceted, UDC stands at third position. CC, being a pure faceted scheme with complex notations and rules, is ranked fourth. Similarly, IEEE and ACM are faceted and easy to use, and therefore share first position with DDC. DMOZ, with its loose boundaries of categorization, is the least usable, offering only limited browsing and search.

The accuracy and precision metric measures how accurately and precisely a classification system can identify the exact locations of holdings in the given knowledge space. UDC shows accuracy and precision in finding the required knowledge artifact (Kaosar, 2008). The accuracy and precision of CC are compromised because its lengthy notations introduce complexity into searching and discovering documents (Satija, 2015). LCC and DDC were investigated for accuracy and precision using a prototype model (Gnoli, Pusterla, Bendiscioli, & Recinella, 2016) for the automatic text classification of electronic documents, using classification metadata of library holdings from LCC and DDC datasets. It was observed that precision requires increasing the DDC and LCC bibliographic data on the Web, introducing search capabilities for bibliographic data at the micro level of any document, and increasing the efficiency of user interfaces for navigation using a DDC-based browsing structure (Joorabchi & Mahdi, 2009) (Joorabchi & Mahdi, 2011). Therefore, CC, because of its pure faceted approach, has high-level precision in search and resource discovery. UDC stands second for being both enumerative and analytico-synthetic. DDC is third, as OCLC maintains and updates its structure regularly along with state-of-the-art search applications. LCC shares third position with DDC, being regularly updated and maintained by the Library of Congress for precision in its search application. IEEE and ACM also show high precision in their search and retrieval and therefore share third position with DDC and LCC. DMOZ consists of manually created and updated categories of web pages, with limited keyword search and very low precision.

In connection with the evaluation framework, multilinguality means classifying and describing knowledge artifacts written and expressed in a variety of natural languages, and the availability of a classification scheme in different natural languages. DMOZ supports 72 different languages of the world and therefore stays at the top. UDC is multilingual, supporting French, Portuguese, Spanish, and Russian (Slavic, 2008) (Koch & Day, 1997), and has been translated into many languages ("Universal Decimal Classification summary," 2017). LCC supports works in 19 language subclasses ("Library of Congress Classification Outline: Class P - Language and Literature"), including Germanic, Slavic, Oriental, and Romance languages. The translations of DDC help localize the scheme for different languages of the world (Vizine-Goetz, 2009); DDC has been translated into 30 different languages but covers different languages in only seven classes, i.e., class numbers 420 to 490 ("Dewey Decimal Classification summaries").
CC shows minimal multilingual support because of its subcontinental origin (A Neelameghan & Lalitha, 2013; Raghavan, 2016). ACM and IEEE are in English only and therefore show no multilinguality at all. Using this metric, we can conclude that DMOZ is at first position, followed by UDC, DDC, LCC, and then CC.

Consistency measures the level of uniformity with which a classification system classifies subjects. According to Batty (1967), CC showed no consistency in its earlier stages, but with the addition of canons of consistency it has gradually become consistent. LCC seems less consistent in expressing different subject areas (Madge, 2011). DDC and LCC have been found wanting in defining and classifying religious holdings, especially Jewish content, and these schemes also show bias toward certain religious and regional content (Maddaford & Briefing). Although DDC is somewhat inconsistent, it can still classify complex subjects (Gnoli et al., 2016). UDC also shows inconsistency, which can be sorted out by introducing specific UDC classes to the database in an online system (Kaosar, 2008). DDC shows comparatively greater consistency in classifying new subjects with constant uniformity; CC is ranked second because of the introduction of the canons of consistency; LCC and UDC are ranked at third position. Being limited to scientific research articles, IEEE
This metric can be better analyzed in the digital environment, especially by examining these bibliographic classifications' ontologies. LCC can be ranked first because of its expressive ontology with an efficient semantic search application. DDC is at the second position because of efficient search but limited usage of its ontology. ACM is at the third position because of its expressive ontology and efficient search but limited coverage of the scientific domain. IEEE is at the fourth position because of its faceted semantic search. UDC comes at the fifth position because of its ontological presence but limited usage. CC has no application in the digital environment that could demonstrate its capability for semantic search, although it provides the basis for the semantic level of all bibliographic classification systems. DMOZ lacks semantic search, as it is based only on keywords.

Bias in subject representation means an inclination for or against certain subjects, which results in subjects being treated unfairly, partially neglected, or ignored entirely. DDC and LCC are biased in representing different knowledge and regional information, e.g., an Anglo-American bias (Tomren, 2003), while UDC is biased towards European culture (Fandino, 2008). CC is biased towards certain knowledge areas (Satija & Singh, 2010). A classification system with the least bias is ranked higher. In this connection, DMOZ is ranked first for showing no or minimal bias; CC is ranked second, followed by DDC, which shows comparatively less bias towards religious and regional subjects. LCC comes at the fourth position, followed by IEEE and ACM, which show greater bias towards certain domains.

An enumerative structure exhibits rigid hierarchies. LCC is enumerative (Goh et al., 2009; Perles, 1995; Bryant, 1993). UDC is nearly enumerative as well as faceted (Kaosar, 2008; Bryant, 1993), and DDC is both analytico-synthetic and enumerative (Hallows, 2014). CC is faceted (Chatterjee, 2016; Dawson, Brown, & Broughton, 2006). Comparing these systems, LCC most fully supports an enumerative structure, followed by DDC, whereas UDC is only nearly enumerative and CC shows no enumerative structure at all. The trend is towards semantic and faceted structures, so an enumerative structure is not a desirable characteristic in a classification system, and systems of an enumerative nature are ranked lower. Based on this metric, CC and DMOZ are the least enumerative and are therefore ranked highest, followed by IEEE and ACM at the second position, then UDC at the third position, with DDC and LCC last.

A faceted structure means a semantically interlinked structure of categories that can be merged and combined to generate an expression for existing or new concepts (Svenonius, 2000). CC is faceted (Chatterjee, 2016; Dawson et al., 2006). UDC is analytico-synthetic (Kaosar, 2008) and follows the faceted method of CC, using different connecting symbols in mixed notations and subject facets including time and space (Chatterjee, 2016). IEEE and ACM possess faceted structures. DMOZ has only a hierarchical structure with predefined categories. Based on this metric, we rank CC first, UDC second, and ACM and IEEE third, while DDC and LCC, being enumerative structures, cannot be included in the list.
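To make the idea of facet synthesis concrete, the following toy sketch mimics how a faceted scheme like CC composes a class number from PMEST facets (Personality, Matter, Energy, Space, Time). The connector symbols follow CC's well-known conventions, but the base class, facet values, and simplified composition rules are invented for illustration and do not reproduce actual CC notation:

```python
# Toy sketch of faceted class-number synthesis in the spirit of CC's PMEST
# formula. Connector symbols follow CC convention; the base class and the
# facet values below are invented placeholders, not actual CC notation.

FACET_ORDER = ["personality", "matter", "energy", "space", "time"]
CONNECTORS = {"personality": ",", "matter": ";", "energy": ":",
              "space": ".", "time": "'"}

def build_class_number(base_class: str, facets: dict) -> str:
    """Combine the supplied facets with the base class in PMEST order."""
    number = base_class
    for facet in FACET_ORDER:
        if facet in facets:
            number += CONNECTORS[facet] + facets[facet]
    return number

# A compound subject is expressed by synthesis rather than by locating a
# ready-made class in a rigid enumeration:
print(build_class_number("L", {"personality": "45", "energy": "4",
                               "space": "44", "time": "N5"}))
# -> L,45:4.44'N5
```

The point of the sketch is that new compound subjects can be expressed on demand by combining facets, which is exactly the hospitality that rigid enumerative hierarchies lack.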
Faceted search means navigating or browsing through the faceted structure of a faceted classification scheme. Faceted search is also applied by selecting different ranges and choices from the facets that a faceted system provides in order to find the required content. It differs from search complexity in that it looks at the pattern and criteria of search that exist in a classification scheme's OPACs or web applications. The theory and philosophy of CC support faceted search and browsing (Kong, 2016); however, to the best of our knowledge, no real-world application demonstrates its usefulness. UDC is based on the faceted approach, which supports faceted search (Tunkelang, 2009). LCC supports faceted search with the help of LCSH (McGrath, 2007); it also provides faceted search through the Faceted Application of Subject Terminology (FAST) application ("Faceted Application of Subject Terminology," 2017). DDC provides faceted search through the OCLC Classify application (http://classify.oclc.org/classify2/). Using this metric, these classification schemes can be ranked as follows. DDC is at the first position because it adopts the faceted approach alongside its native enumerative structure, with state-of-the-art web-based search applications developed by OCLC. LCC is at the second position because of its web-based search applications and its adoption of a comparatively restricted faceted approach. IEEE, for providing an extensive choice of searching patterns, stands at the third position. ACM has a poly-hierarchical and multi-faceted classification structure along with a robust search mechanism and is therefore at the fourth position. There are very few faceted search applications of UDC, so it stands at the fifth position. DMOZ has a hierarchical structure in which the required element can be accessed through a keyword search, so it provides no faceted search. CC has no search applications that could confirm its support for faceted search.

The LOD datasets metric means the availability of a classification system's datasets in the LOD cloud. Among our chosen classification systems, UDC, LCC, DDC, IEEE, ACM, and DMOZ have datasets in the LOD cloud, whereas CC has none. Definitions of classes and properties are gathered in Linked Open Vocabularies (LOV), which are used for describing the different types of objects in the LOD cloud; these definitions provide the vocabularies for linking linked data (Foundation, 2017). CC, UDC, DDC, LCC, IEEE, and DMOZ have no LOV, whereas ACM has LOV vocabularies.

The "platforms" metric in the evaluation framework considers the applicability of a given classification system in real-world web applications and other digital environments. In this regard, UDC is supported by the UDC Consortium, DDC by OCLC, LCC by the Library of Congress, ACM by the ACM Digital Library, IEEE by the IEEE Xplore digital library, and DMOZ by the Open Directory Project. To the best of our knowledge, CC has not been used by any online application.

The warrants of classification work as authoritative grounds on which classificationists perform the cognitive practice of designing the classes and concepts of a classification system and their structural properties, and then placing subjects in the specified classes (Beghtol, 1986). CC and UDC use literary warrant; DDC and LCC use literary and scientific warrants; ACM and IEEE use scientific research warrant, while DMOZ exhibits no warrant of classification.
In the preceding paragraphs, we compared and evaluated the selected classification systems using the evaluation metrics (shown in Table 2) and discussed how these systems can be ranked on each metric. To give a holistic view of this comparison and evaluation, we introduce ranking levels ranging from 1 (meaning low ranking, not applicable, or not available) to 7 (meaning high ranking) to indicate how well a classification scheme performs among its counterparts. For a given metric, multiple systems may share the same ranking level. Using these ranking levels, Table 3 compares the systems on 20 metrics, excluding platforms and warrants of classification.

| Metric | CC | UDC | DDC | LCC | ACM | IEEE | DMOZ |
|---|---|---|---|---|---|---|---|
| Structural Complexity | 1 | 2 | 4 | 3 | 6 | 5 | 7 |
| Notational Brevity | 2 | 3 | 5 | 4 | 7 | 6 | 1 |
| Predefined Structure | 7 | 4 | 3 | 2 | 6 | 5 | 1 |
| Rules Complexity | 1 | 4 | 3 | 2 | 5 | 5 | 5 |
| Theoretical Laws | 5 | 3 | 4 | 2 | 1 | 1 | 1 |
| Mnemonics | 2 | 6 | 7 | 5 | 4 | 3 | 1 |
| Hospitality | 6 | 5 | 4 | 3 | 2 | 2 | 1 |
| Search Complexity | 2 | 3 | 4 | 4 | 4 | 4 | 1 |
| Usability | 2 | 3 | 5 | 4 | 5 | 5 | 1 |
| Precision and Accuracy | 4 | 3 | 2 | 2 | 2 | 2 | 1 |
| Multilinguality | 2 | 5 | 4 | 3 | 1 | 1 | 6 |
| Interoperability | 1 | 3 | 3 | 3 | 3 | 3 | 2 |
| Semantic Search | 7 | 3 | 6 | 2 | 5 | 4 | 1 |
| Bias | 5 | 3 | 4 | 1 | 2 | 2 | 6 |
| Enumerative Structure | 4 | 2 | 1 | 1 | 3 | 3 | 4 |
| Faceted Structure | 4 | 3 | 1 | 1 | 2 | 2 | 1 |
| Faceted Search | 1 | 2 | 6 | 5 | 3 | 4 | 1 |
| Consistency | 4 | 5 | 3 | 3 | 2 | 2 | 1 |
| LOD Datasets | 1 | 2 | 2 | 2 | 2 | 2 | 2 |
| LOV Support | 1 | 1 | 1 | 1 | 1 | 2 | 1 |
| Average Ranking | 3.1 | 3.25 | 3.6 | 2.65 | 3.3 | 3.15 | 2.25 |

Table 3. Ranking and Average Ranking of Classification Schemes

Table 3 also reports the average ranking of these classification systems, showing DDC at the top with an average ranking of 3.6, followed by ACM at 3.3 and UDC at 3.25. It can be concluded that DDC and UDC are among the best classification schemes for describing printed as well as digital collections, whereas ACM is best for classifying digital collections belonging to the computer science domain; the ACM classification system could, however, be extended to include other domains as well. Figure 1 illustrates the comparison and evaluation of these systems graphically.

Figure 1. Comparison and Ranking of Classification Systems
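Because the average ranking in Table 3 is simply the arithmetic mean of the 20 per-metric ranking levels, it is straightforward to recompute and verify. A minimal Python sketch, using the values transcribed from Table 3:

```python
# Recomputing the average rankings reported in Table 3. Each list holds a
# scheme's ranking levels (1 = low, 7 = high) over the 20 metrics, in the
# order the metrics appear in the table.

rankings = {
    "CC":   [1, 2, 7, 1, 5, 2, 6, 2, 2, 4, 2, 1, 7, 5, 4, 4, 1, 4, 1, 1],
    "UDC":  [2, 3, 4, 4, 3, 6, 5, 3, 3, 3, 5, 3, 3, 3, 2, 3, 2, 5, 2, 1],
    "DDC":  [4, 5, 3, 3, 4, 7, 4, 4, 5, 2, 4, 3, 6, 4, 1, 1, 6, 3, 2, 1],
    "LCC":  [3, 4, 2, 2, 2, 5, 3, 4, 4, 2, 3, 3, 2, 1, 1, 1, 5, 3, 2, 1],
    "ACM":  [6, 7, 6, 5, 1, 4, 2, 4, 5, 2, 1, 3, 5, 2, 3, 2, 3, 2, 2, 1],
    "IEEE": [5, 6, 5, 5, 1, 3, 2, 4, 5, 2, 1, 3, 4, 2, 3, 2, 4, 2, 2, 2],
    "DMOZ": [7, 1, 1, 5, 1, 1, 1, 1, 1, 1, 6, 2, 1, 6, 4, 1, 1, 1, 2, 1],
}

# Print the schemes sorted by mean ranking level, highest first.
for scheme, levels in sorted(rankings.items(),
                             key=lambda kv: -sum(kv[1]) / len(kv[1])):
    print(f"{scheme:5s} {sum(levels) / len(levels):.2f}")
```

Sorted by mean, this reproduces the ordering reported above: DDC (3.60), ACM (3.30), UDC (3.25), IEEE (3.15), CC (3.10), LCC (2.65), DMOZ (2.25).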
Table 4 presents state-of-the-art bibliographic classification ontologies, including the Bibliographic Ontology and the LCC, DDC, UDC, and DMOZ ontologies. Some of these ontologies were designed for specific target applications, e.g., the ACM ontology for the ACM Digital Library and the LCC ontology for the Library of Congress, whereas others have multiple usage scenarios and have been used by several applications. An example of such a general-purpose bibliographic classification ontology is the Bibliographic Ontology, which is used by several bibliographic services and digital libraries, e.g., the digital object identifier (DOI) system, Zotero, and the Library of Congress Control Number (LCCN) Permalink service (Giasson, 2012). The evaluation framework compares these ontologies based on their size (in terms of number of classes), usage in state-of-the-art applications, LOD support, the availability of datasets on datahub (https://datahub.io), and LOV support. Looking at Table 4, the ACM ontology shows the most comprehensiveness in terms of number of classes, triples, and LOV support.

| Classification and Categorization Ontologies | No. of Classes | Applications | LOD Datasets | LOD Dataset Triples | LOV Support |
|---|---|---|---|---|---|
| Bibliographic ontology (http://purl.org/ontology/bibo) | 69 | Library of Congress and BibBase | Yes | 200,000 | Yes |
| LCC ontology (http://id.loc.gov/) | 40+ | Library of Congress | Yes | Not given | No |
| DDC ontology (http://dewey.info/) | 20+ | OCLC | Yes | 402,288 | No |
| UDC ontology (http://udcdata.info/) | 2,600 | UDC | Yes | 69,000 | No |
| ACM ontology (http://dl.acm.org/ccs/ccs.cfm) | 1,469 | ACM | Yes | 12,402,336 | Yes |
| IEEE LOM metadata ontology (Casali, Deco, Romano, & Tomé, 2013) | 9 | IEEE Xplore digital library (http://ieee.rkbexplorer.com/id/) | Yes | 91,564 | Yes |
| DMOZ ontology (https://www.dmoz.org/rdf.html) | Not given | Open Directory Project | Yes | Not given | No |

Table 4. Comparison of classification and categorization ontologies

Issues & Challenges in Classification Research

Although bibliographic classification has been practiced since the use of books and the inception of library and information science, further research and development are required to meet the classification needs of the digital age. In particular, with the arrival of digital holdings, researchers face several issues and challenges. For example, automatic text classification categorizes resources using ordinary metrics such as TF-IDF, and classification in its true sense is yet to be achieved (Yi, 2006). To address this, text classification is also carried out through semantic indexing, but adequate accuracy and precision have yet to be achieved. The semantic and structural relationships among different parts of a text corpus are still understood at only a rudimentary level and have not been fully exploited for use in text classification in more meaningful ways. Other challenges in text classification include handling the huge volumes of data that result from applying a classification scheme, dynamism in classification, and structural dissimilarity among classification schemes, even though they agree on subject as the primary characteristic. The bias in DDC and LCC needs to be resolved. Several revisions and proposals have been put forward to address the problem of systematic knowledge organization and searching through natural-language terms (Miksa, 2007). There are also various issues regarding structural updates, search and retrieval criteria, and visualization (Slavic-Overfield, 2005).
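As a concrete illustration of the TF-IDF-based automatic text classification mentioned above, the following minimal sketch trains a classifier to assign DDC-style class numbers to titles. It assumes scikit-learn is available; the training titles and labels are invented for illustration, and a real system would need far richer training data:

```python
# A minimal sketch of TF-IDF-based automatic text classification.
# Training titles and their DDC-style labels are toy examples.

from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.naive_bayes import MultinomialNB
from sklearn.pipeline import make_pipeline

titles = [
    "introduction to organic chemistry",
    "principles of inorganic chemistry",
    "a history of medieval europe",
    "european history in the nineteenth century",
]
classes = ["540", "540", "940", "940"]  # 540 Chemistry, 940 History of Europe

# TF-IDF turns each title into a weighted term vector; the naive Bayes
# classifier then assigns the most probable class.
model = make_pipeline(TfidfVectorizer(), MultinomialNB())
model.fit(titles, classes)

print(model.predict(["survey of modern european history"]))
# -> ['940'] on this toy data
```

This is exactly the kind of surface-level categorization the passage above describes: the classifier matches term statistics, not subject semantics, which is why semantic indexing is being explored as a complement.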
There are two main challenges in applying bibliographic classification principles to classifying the Web. First, the principles of bibliographic classification were formulated for printed documents but should also be applicable to digital collections; addressing this challenge requires applying and modifying bibliographic classification principles in digital environments. Second, hidden hierarchies and concepts need to be exploited so that they can be better classified by the principles of bibliographic classification for precise discovery, search, and retrieval (J. Mai, 2004). The dependence of the classification of any object on predefined criteria and principles is likewise an important issue to address if classification is to find a place in this age of search engines. This issue can be addressed by modifying the conventional principles of classification to take into account the purpose of classification and the domain of the objects. Here, the Semantic Web and ontologies can play a vital role in bibliographic classification, providing classification that is independent of predefined bibliographic classification theories (Hjørland, 2012).

Heterogeneity conflicts, which arise from inconsistencies and structural divergences, are a challenge for semantic interoperability. Semantic interoperability can be brought to bibliographic records within a bibliographic system and across systems through the phases of interlinking, evaluation, analysis, and remodeling and conversion for analyzing and restructuring the bibliographic data (Tallerås, 2013). Bibliographic data is multi-format, multi-topical, multilingual, and multi-targeted. To tackle these issues, bibliographic data must be made mutually interoperable so that it is interlinked, searchable, and presented in a harmonized way across the boundaries of datasets and data silos. The interoperability problem arises at the syntactic level in making character sets, notations, data formats, and records consistent across different systems. It also arises at the semantic level because of differences in data interpretation, differences in vocabularies, and differing precision levels in data encoding. In the Web 2.0 environment, bibliographic data is published, collected, and maintained by many organizations, each following its own established standards and best practices (Hyvönen, 2012). Given these problems, the transition of this data from the syntactic Web to the Semantic Web poses the challenges of bringing uniformity to records generated by diverse sources and encoded in multiple bibliographic systems, achieving interoperability across bibliographic systems, and visualizing bibliographic data as needed in different contexts. Addressing these problems requires coordination and collaboration between bibliographic data publishers and the technical developers of web applications (Hyvönen, 2012).

There is a variety of metadata standards and schemas for defining, managing, discovering, searching, retrieving, preserving, mapping, and cross-walking metadata and bibliographic data, and for ensuring its integrity, accuracy, and authenticity. But for these tasks to be handled with simplicity, semantic richness, and accuracy, a universal all-in-one metadata format and schema is the need of the day (Ramesh, Vivekavardhan, & Bharathi, 2015), to get out of this jungle of standards (Gartner, 2016). This would relieve metadata publishers and managers, and the job would become economical in terms of time, management, and search and retrieval.

Three main tasks were set in the Semantic Publishing Challenge 2015: (i) extracting data on workshops' quality indicators; (ii) extracting data on affiliations, citations, and funding; and (iii) interlinking. Several challenges were faced in fulfilling these tasks. They were addressed through a proposed solution composed of a text-mining pipeline, LODeXporter, and named entity recognition (NER) for extracting named entities from text and linking them to resources on the LOD cloud (Sateli & Witte, 2015).
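As a small illustration of publishing bibliographic data in an interoperable, linked-data-ready form as discussed above, the following sketch uses the rdflib library (an assumption; any RDF toolkit would do) to express one record with Dublin Core properties and serialize it as Turtle. The record URI, base namespace, and values are invented:

```python
# A sketch of exposing one bibliographic record as RDF so it can be
# interlinked and harvested across systems. Dublin Core is a standard
# vocabulary; the base namespace and record values are hypothetical.

from rdflib import Graph, Literal, Namespace
from rdflib.namespace import DC

BASE = Namespace("http://example.org/record/")  # hypothetical base URI

g = Graph()
g.bind("dc", DC)

record = BASE["b1234"]
g.add((record, DC.title, Literal("Theory of Library Classification")))
g.add((record, DC.creator, Literal("Buchanan, B.")))
g.add((record, DC.subject, Literal("025.4")))  # a DDC-style notation, for illustration

# Serializing to Turtle yields data any RDF-aware system can consume.
print(g.serialize(format="turtle"))
```

Once records are expressed this way, the syntactic-level conflicts described above (character sets, notations, record formats) are largely absorbed by the common RDF data model, leaving the harder semantic-level alignment of vocabularies.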
Peroni (2012) addresses three main issues of semantic publishing: the lack of universal metadata schemas for document publishing based on a publishing vocabulary; the lack of efficient user interfaces based on the models and theories of semantic publishing; and the need for a tool that semantically links and describes document text. These issues point to an urgent need for comprehensive ontologies for the document-publishing domain. Ferrara and Salini (2012) posed ten challenges concerning the multiple dimensions of bibliographic data analysis: (i) analyzing bibliographic data in a multidimensional pattern; (ii) discovering and integrating data coming from diverse sources; (iii) detecting multiple references to the same item and cleaning, normalizing, and disambiguating bibliographic data records; (iv) analyzing the multidimensional nature of bibliographic data through multivariate analysis for aggregating the data; (v) comparing different elements of bibliographic data and ranking them accordingly; (vi) aggregating indexes of different natures with respect to different parameters, dimensions, and elements of bibliographic data; (vii) dealing with multiple indexes for the same item with different values coming from different sources; (viii) extracting and indexing textual information from text corpora in support of text mining; (ix) analyzing textual data topic-wise and describing these topics for research and learning processes and for tracing trends; and (x) combining multidimensional information to find trends in a bibliographic data collection.

Bibliographic classification systems are being incorporated into LOD. Dewey.info (https://datahub.io/dataset/dewey_decimal_classification) is a prototype platform for DDC data on the Web, designed for linking the DDC dataset into the linked data cloud. It offers summaries of the top three levels of the classification order of the DDC 22nd edition in 11 languages encoded in RDF/SKOS, with actionable URIs for every class, representation for machines in RDF and for humans in XHTML+RDFa, serializations available in RDF/XML, Turtle, and JSON, and a SPARQL endpoint (OCLC 2011; Mitchell and Panzer 2013). However, this version of DDC on the LOD cloud is still at an infant stage in covering different subjects and in being widely used for generating and creating document metadata.

The Library of Congress Linked Data Service provides access to commonly used standards and vocabularies developed by the Library of Congress, including data values, controlled vocabularies, and preservation vocabularies. The service provides access to LCSH, LC name authority files, LCC, LC children's subject headings, LC genre/form terms, the thesaurus for graphic materials, MARC relators, MARC countries, MARC geographic areas, MARC languages, ISO 639-1, ISO 639-2, and ISO 639-5 languages, the extended date/time format, preservation events, preservation level roles, and cryptographic hash functions. The authorities and vocabularies currently included in this service are listed on the Linked Data Service (Library of Congress 2014). However, it lacks vocabularies supporting PREMIS, MARC, MODS, METS, and MIX.
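To illustrate how SKOS-encoded classification data of the kind described for Dewey.info can be queried, the following self-contained sketch builds a tiny SKOS graph in memory with rdflib and runs a SPARQL query over it. The two classes are toy stand-ins for DDC summary classes; against a live SPARQL endpoint the query itself would look the same:

```python
# Querying SKOS-encoded classification data with SPARQL, entirely in
# memory. The class URIs, notations, and labels below are toy data.

from rdflib import Graph

g = Graph()
g.parse(data="""
@prefix skos: <http://www.w3.org/2004/02/skos/core#> .
<http://example.org/class/5> a skos:Concept ;
    skos:notation "5" ;
    skos:prefLabel "Science"@en .
<http://example.org/class/6> a skos:Concept ;
    skos:notation "6" ;
    skos:prefLabel "Technology"@en .
""", format="turtle")

query = """
PREFIX skos: <http://www.w3.org/2004/02/skos/core#>
SELECT ?notation ?label WHERE {
    ?c skos:notation ?notation ;
       skos:prefLabel ?label .
} ORDER BY ?notation
"""
for row in g.query(query):
    print(row.notation, row.label)
```

Because SKOS gives every class an actionable URI, a notation, and a label, the same pattern works whether the data lives in a local file or behind a remote endpoint.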
As presented in Section 2, several ontologies have been developed for describing and sharing knowledge about bibliographic classification. However, the available ontologies are limited in several ways: they are not complete clones of the classification schemes they are deemed to represent, and they are not mature enough in terms of metadata collection. In addition, these ontologies have not yet broken the cross-classification-scheme metadata barrier, i.e., they are not interoperable enough to harvest metadata across bibliographic ontology systems. Further initiatives are therefore required to develop mature bibliographic ontologies that fully clone bibliographic schemes that are in practical use and have strong theoretical grounding. These ontologies must be interoperable and must share metadata collections with other bibliographic ontologies. In this way, a future ontology-based general bibliographic classification system could be built by fusing new and existing bibliographic ontologies for better management of knowledge artifacts.

CONCLUSIONS

With the arrival of digital collections, new challenges of preservation, curation, and resource discovery and access (retrieval) have emerged that need proper attention, and classification schemes and ontologies can play a significant role in addressing them. By comparing and evaluating the available bibliographic classification and categorization systems, we conclude that DDC is currently the best classification system, followed by ACM and UDC. The bibliographic classification ontologies are limited in one way or another: some, like UDC and ACM, are comprehensive but lack support for LOD or LOV, while others support these latter aspects but lack comprehensiveness. Keeping in view the available bibliographic classification ontologies and their limitations, we recommend that a universal bibliographic classification ontology be developed, using the classes of the available ontologies and providing support in terms of availability of datasets, interoperability, LOD, and linked data vocabularies. To develop a more meaningful classification system, equally applicable to digital environments, it is necessary to consider book structural semantics, such as the table of contents, headings, chapters, sections, subsections, figures, algorithms, mathematical equations, and quotations, and the logical connections in the contents (Khusro & Ullah, 2016; I. Ullah & Khusro, 2016), as well as information about the book, i.e., the bibliographic details of the holdings. To meet the former requirement, a comprehensive ontology like BookOnt (A. Ullah, Ullah, Khusro, & Ali, 2016) could be used, which can be mapped to a bibliographic ontology such as the Bibliographic Ontology (http://bibliontology.com). However, as the evaluation framework suggests, DDC, UDC, and the ACM classification system should be exploited in designing such a general-purpose classification system.

REFERENCES

The 2012 ACM Computing Classification System. Retrieved March 20, 2017, from http://www.acm.org/about/class/2012

About Universal Decimal Classification (UDC). Retrieved March 21, 2017, from http://www.udcc.org/index.php/site/page?view=about

Albrechtsen, H. (2000). Who wants yesterday's classifications? Information science perspectives on classification schemes in common information spaces. In K. Schmidt (Ed.), Papers. Technical University of Denmark, Center for Tele-Information.

Batley, S. (2014). Classification in theory and practice. Oxford: Chandos Publishing.
Batty, C. D. (1967). An introduction to colon classification. Archon Books.

Beghtol, C. (1986). Semantic validity: Concepts of warrant in bibliographic classification systems. Library Resources & Technical Services, 30(2), 109-125.

Beghtol, C. (1998). Knowledge domains: Multidisciplinarity and bibliographic classification systems. Knowledge Organization, 25(1-2), 1-12.

Beghtol, C. (2001). Relationships in classificatory structure and meaning. In Relationships in the organization of knowledge (pp. 99-113). Springer.

Berners-Lee, T. (2006). Linked data. Design Issues. Retrieved March 21, 2017, from https://www.w3.org/DesignIssues/LinkedData.html

Betts, T., Milosavljevic, M., & Oberlander, J. (2007). The utility of information extraction in the classification of books. In Advances in information retrieval (pp. 295-306). Springer.

Bizer, C., Heath, T., & Berners-Lee, T. (2009). Linked data: The story so far. Semantic Services, Interoperability and Web Applications: Emerging Concepts, 205-227.

Boykin, J. (2016). Assessing DMOZ: A quality review. Retrieved March 14, 2016, from https://www.seochat.com/c/a/search-engine-news/assessing-dmoz-a-quality-review/

Bryant, B. (1993, October 4). "Numbers you can count on": Dewey Decimal Classification is maintained at LC. Library of Congress Information Bulletin, 52(18). http://www.loc.gov/loc/lcib/93/9318/count.html

Buchanan, B. (1979). Theory of library classification.

Campbell, D. G. (2002). Centripetal and centrifugal forces in bibliographic classification research. Paper presented at the ASIS SIG/CR Classification Research Workshop.

Casali, A., Deco, C., Romano, A., & Tomé, G. (2013). An assistant for loading learning object metadata: An ontology based approach.

Chan, L. M. (2000). Exploiting LCSH, LCC, and DDC to retrieve networked resources: Issues and challenges.

Chan, L. M., Intner, S. S., & Weihs, J. (2016). Guide to the Library of Congress Classification. ABC-CLIO.

Chapman, J. W., Reynolds, D., & Shreeves, S. A. (2009). Repository metadata: Approaches and challenges. Cataloging & Classification Quarterly, 47(3-4), 309-325.

Chatterjee, A. (2016). Universal Decimal Classification and Colon Classification: Their mutual impact. Annals of Library and Information Studies (ALIS), 62(4), 226-230.

Cliff, P. (2008). JISC-Repositories: Subject classification thread summary.

Comaromi, J. P., & Satija, M. P. (1983). Brevity of notation in Dewey Decimal Classification. Metropolitan.

Dawson, A., Brown, D., & Broughton, V. (2006). The need for a faceted classification as the basis of all methods of information retrieval. Paper presented at the Aslib Proceedings.

De Grolier, E. (1962). A study of general categories applicable to classification and coding in documentation.

Dewey Decimal Classification summaries. Retrieved March 21, 2017, from https://www.oclc.org/en/dewey/features/summaries.html

Dewey Services: Dewey Decimal Classification System. Retrieved March 20, 2017, from https://www.oclc.org/content/dam/oclc/services/brochures/211422usb_dewey_services.pdf

Dorji, T. C., Atlam, E.-s., Yata, S., Fuketa, M., Morita, K., & Aoe, J.-i. (2011). Extraction, selection and ranking of Field Association (FA) terms from domain-specific corpora for building a comprehensive FA terms dictionary. Knowledge and Information Systems, 27(1), 141-161. doi:10.1007/s10115-010-0296-x
Dousa, T. M. (2009). Evolutionary order in the classification theories of C. A. Cutter & E. C. Richardson: Its nature and limits.

New World Encyclopedia. (2014, August 1). Library classification. Retrieved 2017, from http://www.newworldencyclopedia.org/entry/Library_classification

Faceted Application of Subject Terminology. (2017). Retrieved March 21, 2017, from http://www.oclc.org/research/themes/data-science/fast.html

Fandino, M. (2008). UDC or DDC: A note about the suitable choice for the National Library of Liechtenstein. Extensions and Corrections to the UDC.

Ferrara, A., & Salini, S. (2012). Ten challenges in modeling bibliographic data for bibliometric analysis. Scientometrics, 93(3), 765-785.

Foundation, O. K. (2017). About LOV. Retrieved from http://lov.okfn.org/dataset/lov/about

Francu, V. (2007). Multilingual access to information using an intermediate language. Doctoral dissertation in Language and Literature, Universiteit Antwerpen.

Gartner, R. (2016). Metadata. Springer.

Giasson, B. D. A. F. (2012). Projects using BIBO. Retrieved from http://www.bibliontology.com/projects.html

Giess, M. D., Wild, P., & McMahon, C. (2007). The use of faceted classification in the organisation of engineering design documents. Paper presented at the Proceedings of the International Conference on Engineering Design 2007.

Gilchrist, A. (2015). Reflections on knowledge, communication and knowledge organization. Knowledge Organization, 42(6), 456-469.

Giunchiglia, F., Marchese, M., & Zaihrayeu, I. (2007). Encoding classifications into lightweight ontologies. In Journal on Data Semantics VIII (pp. 57-81). Springer.

Gnoli, C., Merli, G., Pavan, G., Bernuzzi, E., & Priano, M. (2008). Freely faceted classification for a web-based bibliographic archive: The BioAcoustic Reference Database.

Gnoli, C., Pusterla, L., Bendiscioli, A., & Recinella, C. (2016). Classification for collections mapping and query expansion.

Goh, Y. M., Giess, M., McMahon, C., & Liu, Y. (2009). From faceted classification to knowledge discovery of semi-structured text records. In Foundations of Computational Intelligence, Volume 6 (pp. 151-169). Springer.

Green, R. (2015, October 29-30). Relational aspects of subject authority control: The contributions of classificatory structure. Paper presented at the Proceedings of the International UDC Seminar 2015: Classification & Authority Control: Expanding Resource Discovery, Lisbon.

Hallows, K. M. (2014). It's all enumerative: Reconsidering Library of Congress Classification in US law libraries. Law Library Journal, 106, 85.

Harper, C. A., & Tillett, B. B. (2007). Library of Congress controlled vocabularies and their application to the Semantic Web. Cataloging & Classification Quarterly, 43(3-4), 47-68.

Hjørland, B. (1999). The DDC, the universe of knowledge, and the post-modern library. Journal of the Association for Information Science and Technology, 50(5), 475.

Hjørland, B. (2007). Semantics and knowledge organization. Annual Review of Information Science and Technology, 41(1), 367-405.

Hjørland, B. (2008). Core classification theory: A reply to Szostak. Journal of Documentation, 64(3), 333-342.

Hjørland, B. (2012). Is classification necessary after Google? Journal of Documentation, 68(3), 299-317.

Hjørland, B. (2013). Theories of knowledge organization—theories of knowledge: Keynote, 13th Meeting of the German ISKO, Potsdam, March 19, 2013. Knowledge Organization, 40(3), 169-181.
Hjørland, B. (2016). Subject (of documents). Knowledge Organization, 44(1), 55-64.

Hyman, R. J. (1980). Shelf classification research: past, present--future? Occasional Papers (University of Illinois at Urbana-Champaign, Graduate School of Library Science), no. 146.

Hyvönen, E. (2012). Publishing and using cultural heritage linked data on the Semantic Web. Synthesis Lectures on the Semantic Web: Theory and Technology, 2(1), 1-159.

Jacob, E. K. (2004). Classification and categorization: A difference that makes a difference. Library Trends, 52(3), 515.

Jonassen, D. H. (2004). Handbook of research on educational communications and technology. Taylor & Francis.

Jones, K. S. (1970). Some thoughts on classification for retrieval. Journal of Documentation, 26(2), 89-101.

Joorabchi, A., & Mahdi, A. E. (2009). Leveraging the legacy of conventional libraries for organizing digital libraries. Paper presented at the International Conference on Theory and Practice of Digital Libraries.

Joorabchi, A., & Mahdi, A. E. (2011). An unsupervised approach to automatic classification of scientific literature utilizing bibliographic metadata. Journal of Information Science, 37(5), 499-514. doi:10.1177/0165551511417785

Kaosar, A. (2008). Merit & demerit of using Universal Decimal Classification on the Internet.

Kaula, P. (1965). Colon Classification: Genesis and development. Library Science Today: Ranganathan's Festschrift, 1, 87-93.

Khusro, S., & Ullah, I. (2016). Towards a semantic book search engine. Paper presented at the 2016 International Conference on Open Source Systems & Technologies (ICOSST'16), Lahore, Pakistan.

Koch, T., & Day, M. (1997). DESIRE: Development of a European Service for Information on Research and Education.

Koch, T., Day, M., Brümmer, A., Hiom, D., Peereboom, M., Poulter, A., & Worsfold, E. (1997). The role of classification schemes in Internet resource description and discovery. Work Package, 3.

Kong, W. (2016). Extending faceted search to the open-domain Web. University of Massachusetts Amherst.

Koshman, S. (1993). Categorization and classification revisited: A review of concepts in library science and cognitive psychology. Current Studies in Librarianship, Spring/Fall, 26.

Kwaśnik, B. H., & Rubin, V. L. (2003). Stretching conceptual structures in classifications across languages and cultures. Cataloging & Classification Quarterly, 37(1-2), 33-47.

Kyle, B., & Vickery, B. C. (1961). The Universal Decimal Classification: Present position and future developments. Unesco.

LC Linked Data Service: Authorities and Vocabularies. Retrieved February 28, 2017, from http://id.loc.gov

Lee, H.-L. (2012). Epistemic foundation of bibliographic classification in early China: A Ru classicist perspective. Journal of Documentation, 68(3), 378-401.

Library of Congress Classification. (2014, October 1). Retrieved March 20, 2017, from https://www.loc.gov/catdir/cpso/lcc.html

Library of Congress Classification Outline: Class P - Language and Literature. Retrieved from https://www.loc.gov/aba/cataloging/classification/lcco/lcco_p.pdf

Library of Congress Subject Headings: Pre- vs. post-coordination and related issues. (2007, March 15). Report for Beacher Wiggins, Director, Acquisitions & Bibliographic Access Directorate, Library Services, Library of Congress (pp. 49). Cataloging Policy and Support Office.
Library OPACs containing UDC codes. Retrieved March 21, 2017, from http://www.udcc.org/index.php/site/page?view=opacs

Linked data. Retrieved from https://www.w3.org/standards/semanticweb/data

Losee, R. M. (1993). Seven fundamental questions for the science of library classification. Knowledge Organization, 20, 65-65.

Maddaford, S., & Briefing, C. Library of Congress Classification System.

Madge, O.-L. (2011). Evidence based library and information practice. Studii de Biblioteconomie şi Ştiinţa Informării, (15), 107-112.

Mai, J.-E. (2003). The future of general classification. Cataloging & Classification Quarterly, 37(1-2), 3-12.

Mai, J.-E. (2004). Classification in context: Relativity, reality, and representation. Knowledge Organization, 31(1), 39-48.

Mai, J.-E. (2005). Analysis in indexing: Document and domain centered approaches. Information Processing & Management, 41(3), 599-611. doi:10.1016/j.ipm.2003.12.004

Mai, J.-E. (2009). The boundaries of classification.

Mai, J.-E. (2010). Classification in a social world: Bias and trust. Journal of Documentation, 66(5), 627-642.

Mai, J.-E. (2011). The modernity of classification. Journal of Documentation, 67(4), 710-730.

Mai, J. (2004). Classification of the Web: Challenges and inquiries. Knowledge Organization, 31(2), 92.

Mancuso, J. (1994, September). General purpose vs special purpose couplings. Paper presented at the 23rd Turbomachinery Symposium, Dallas, TX.

Manning, C. D., Raghavan, P., & Schütze, H. (2008). Introduction to information retrieval (Vol. 1). Cambridge University Press.

Maron, M. E., Kuhns, J. L., & Ray, L. C. (1959). Probabilistic indexing: A statistical approach to the library problem. Paper presented at the 14th National Meeting of the Association for Computing Machinery, Cambridge, Massachusetts.

McGrath, K. (2007). Facet-based search and navigation with LCSH: Problems and opportunities. Code4Lib Journal, 1.

McIlwaine, I. C. (1997). The Universal Decimal Classification: Some factors concerning its origins, development, and influence. Journal of the American Society for Information Science (1986-1998), 48(4), 331.

Miksa, S. D. (2007). The challenges of change: A review of cataloging and classification literature, 2003-2004. Library Resources & Technical Services, 51(1), 51.

Neelameghan, A., & Lalitha, S. (2013). Multilingual thesaurus and interoperability. DESIDOC Journal of Library & Information Technology, 33(4).

Neelameghan, A., & Parthasarathy, S. (1997). S. R. Ranganathan's postulates and normative principles: Applications in specialized databases design, indexing and retrieval. Sarada Ranganathan Endowment for Library Science.

Nizamani, S., Memon, N., & Wiil, U. K. (2011). Cluster based text classification model. In Counterterrorism and Open Source Intelligence (pp. 265-283). Springer.

Painter, A. F. (1974). Classification: Theory and practice. Drexel Library Quarterly, 10(4).

Panigrahi, P., & Prasad, A. (2005). Inference engine for devices of Colon Classification in AI-based automated classification system.

Perles, B. (1995). Faceted classifications and thesauri. Retrieved from Howard Besser's website: http://besser.tsoa.nyu.edu/impact/f95/Papers-projects/Papers/perles.html

Peroni, S. (2012). Semantic publishing: Issues, solutions and new trends in scholarly publishing within the Semantic Web era. alma.

Piros, A. (2014). A different approach to Universal Decimal Classification in a mechanized retrieval system. Paper presented at the Proceedings of the 9th International Conference on Applied Informatics, Eger, Hungary.
Pollitt, A. S. (1998). The key role of classification and indexing in view-based searching. Technical report, University of Huddersfield, UK. http://www.ifla.org/IV/ifla63/63polst.pdf

Press, O. F. (2002). Introduction to the Dewey Decimal Classification.

Raghavan, K. (2016). The Colon Classification: A few considerations on its future. Annals of Library and Information Studies (ALIS), 62(4), 231-238.

Rahman, A., & Ranganathan, T. (1962). Seminal mnemonics. Annals of Library Science, 9, 53-67.

Ramesh, P., Vivekavardhan, J., & Bharathi, K. (2015). Metadata diversity, interoperability and resource discovery: Issues and challenges. DESIDOC Journal of Library & Information Technology, 35(3).

Ranganathan, S. R. (1968). Choice of scheme for classification. Library Science with a Slant to Documentation, 5(1), 1-69.

Reiner, U. (2008). Automatic analysis of Dewey Decimal Classification notations. In Data analysis, machine learning and applications (pp. 697-704). Springer.

Rodriguez, R. D. (1984). Hulme's concept of literary warrant. Cataloging & Classification Quarterly, 5(1), 17-26.

Rosenfeld, L., & Morville, P. (2002). Information architecture for the World Wide Web. O'Reilly Media.

San Segundo Manuel, R. (2008). Some arguments against the suitability of Library of Congress Classification for Spanish libraries. Extensions and Corrections to the UDC.

Sateli, B., & Witte, R. (2015). Automatic construction of a semantic knowledge base from CEUR workshop proceedings. Paper presented at the Semantic Web Evaluation Challenge.

Satija, M. P. (2013). The theory and practice of the Dewey Decimal Classification system. Elsevier.

Satija, M. P. (2015). Save the national heritage: Revise the Colon Classification.

Satija, M. P., & Martínez-Ávila, D. (2015). Features, functions and components of a library classification system in the LIS tradition for the e-environment. Journal of Information Science Theory and Practice, 3(4), 62-77.

Satija, M. P., & Singh, J. (2010). Colon Classification (CC). In Encyclopedia of Library and Information Sciences (Vol. 2, pp. 1158-1168).

Schallier, W. (2005). Subject retrieval in OPACs: A study of three interfaces. Paper presented at the 7th ISKO-Spain Conference: The Human Dimension of Knowledge Organization, Barcelona.

Singapore, N. L. o. (2016). Usability on the Web. Retrieved from http://www.nlb.gov.sg/resourceguides/usability-on-the-web/

Slavic-Overfield, A. (2005). Classification management and use in a networked environment: The case of the Universal Decimal Classification. University of London.

Slavic, A. (2006). Interface to classification: Some objectives and options.

Slavic, A. (2008). Use of the Universal Decimal Classification: A world-wide survey. Journal of Documentation, 64(2), 211-228.

Smiraglia, R. P., & Van den Heuvel, C. (2011). Idea Collider: From a theory of knowledge organization to a theory of knowledge interaction. Bulletin of the American Society for Information Science and Technology, 37(4), 43-47.

Subject classification schemes. (2015). Retrieved from http://www.ifla.org/best-practice-for-national-bibliographic-agencies-in-a-digital-age/node/9042

Sukhmaneva, E. (1970). The problems of notation and faceted classification. 17(3-4), 112-116.
Svenonius, E. (2000). The intellectual foundation of information organization. MIT Press.

Tallerås, K. (2013). From many records to one graph: Heterogeneity conflicts in the linked data restructuring cycle. Information Research: An International Electronic Journal, 18(3).

Tennis, J. T. (2008). Epistemology, theory, and methodology in knowledge organization: Toward a classification, metatheory, and research framework.

Tennis, J. T. (2011). Ranganathan's layers of classification theory and the FASDA model of classification.

Thelwall, M. (2009). Introduction to webometrics: Quantitative web research for the social sciences. Synthesis Lectures on Information Concepts, Retrieval, and Services.

Tomren, H. (2003). Classification, bias, and American Indian materials. Unpublished work, San Jose State University, San Jose, California.

Tunkelang, D. (2009). Faceted search. Synthesis Lectures on Information Concepts, Retrieval, and Services, 1(1), 1-80.

Ullah, A., Ullah, I., Khusro, S., & Ali, S. (2016, December 19-21). BookOnt: A comprehensive book structural ontology for book search and retrieval. Paper presented at the 2016 International Conference on Frontiers of Information Technology (FIT), Islamabad, Pakistan.

Ullah, I., & Khusro, S. (2016). In search of a semantic book search engine on the Web: Are we there yet? In Artificial Intelligence Perspectives in Intelligent Systems (pp. 347-357). Springer.

Universal Decimal Classification summary. (2017). Retrieved from http://www.udcsummary.info/php/index.php?id=67277&lang=en#

Vizine-Goetz, J. S. M. D. (2009). The Dewey Decimal Classification. Encyclopedia of Library and Information Science.

Wang, J. (2009). An extensive study on automated Dewey Decimal Classification. Journal of the American Society for Information Science and Technology, 60(11), 2269-2286.

Wijewickrema, C. M., & Gamage, R. (2013). An ontology based fully automatic document classification system using an existing semi-automatic system.

Xin, R. S., Hassanzadeh, O., Fritz, C., Sohrabi, S., & Miller, R. J. (2013). Publishing bibliographic data on the Semantic Web using BibBase. Semantic Web, 4(1), 15-22.

Yelton, A. (2011). A simple scheme for book classification using Wikipedia. Information Technology and Libraries, 30(1), 7-15.

Yi, K. (2006). Challenges in automated classification using library classification schemes. Paper presented at the Proceedings of the World Library and Information Congress: 72nd IFLA General Conference and Council.

Zhu, Z. (2011). Improving search engines via classification. University of London.

8965 ---- Lessons Learned: A Primo Usability Study

Kelsey Brett, Ashley Lierman, and Cherie Turner

INFORMATION TECHNOLOGY AND LIBRARIES | MARCH 2016 7

ABSTRACT

The University of Houston Libraries implemented Primo as the primary search option on the library website in May 2014. In May 2015, the Libraries released a redesigned interface to improve user experience with the tool. The Libraries took a user-centered approach to redesigning the Primo interface, conducting a "think-aloud" usability test to gather user feedback and identify needed improvements. This article describes the method and findings from the usability study, the changes that were made to the Primo interface as a result, and implications for discovery-system vendor relations and library instruction.
Kelsey Brett (krbrett@ua.edu) is Discovery Systems Librarian, Ashley Lierman (arlierman@uh.edu) is Instructional Design Librarian, and Cherie Turner (ckturner2@uh.edu) is Chemical Sciences Librarian, University of Houston Libraries, Houston, Texas.

INTRODUCTION

Index-based discovery systems have become commonplace in academic libraries over the past several years, and academic libraries have invested a great deal of time and money into implementing them. Frequently, discovery platforms serve as the primary access point to library resources, and in some libraries they have even replaced traditional online public access catalogs. Because of the prominence of these systems in academic libraries and the important function that they serve, libraries have a vested interest in presenting users with a positive and seamless experience while using a discovery system to find and access library information. Libraries commonly conduct user testing on their discovery systems, make local customizations when possible, and sometimes even change products to present the most user-friendly experience possible.

University of Houston Libraries has adopted new discovery technologies as they became available in an effort to provide simplified discovery of and access to library resources. As a first step, the Libraries implemented Innovative Interfaces' Encore, a federated search tool, in 2007. When index-based discovery systems became available, the Libraries saw them as a way to provide an improved and intuitive search experience. In 2010, the Libraries implemented Serials Solutions' Summon. After three years and a thorough process of evaluating priorities and investigating alternatives, the Libraries made the decision to move to Ex Libris' Primo, which was done in May of 2014. The Libraries' intention was to continually assess and customize Primo to improve functionality and user experience. The Libraries conducted research and performed user testing, and in May 2015 a redesigned Primo search results page was released. One of the activities that informed the Primo redesign was a "think-aloud" usability test that required users to complete a set of two tasks using Primo. This article will present the method and results of the testing as well as the customizations that were made to the discovery system as a result. It will also discuss some broader implications for library discovery and its effect on information literacy instruction.

LITERATURE REVIEW

There is a substantial body of literature discussing usability testing of discovery systems. In the interest of brevity, we will focus solely on studies and overviews involving Primo implementations, from which several patterns have emerged. Multiple studies have indicated that users' responses to the system are generally positive; even in testing of very early versions by a development partner, users responded positively overall.1 Interestingly, some studies found that in many cases users rated Primo positively in post-testing surveys even when their task completion rate in the testing had been low.2 Multiple studies also found evidence that, although users may struggle with Primo initially, the system is learnable over time.
Comeaux found that the time it took users to use facets or locate resources decreased significantly with each task they performed,3 while other studies saw the use of facets per task increase for each user over the course of the testing.4 User reactions to facets and other post-limiting functions in Primo were divided. In one of the earliest studies, Sadeh found that users responded positively to facets,5 and some authors found users came to use them heavily while searching,6 while others found that facets were generally underused.7 Multiple studies found that users tended to repeat their searches with slightly different terms rather than use post-limiting options.8 Thomsett-Scott and Reese, in a survey of the literature on discovery tools, reported evidence of a trend that users reacted more positively to post-limiting in earlier discovery studies,9 while the broader literature shows more negative reactions in more recent studies. This could indicate that shifts in the software, user expectations, or both may have decreased users' interest in these options.

A few specific types of usability problems seem common across tests of Primo and other discovery systems. Across a large number of studies, it has been found that users—especially undergraduate students—struggle to understand library and academic terminology used in discovery. Some terminology changes were made after users had difficulty in the earliest usability tests of Primo,10 but users continued to struggle with terms like hold and recall in item records.11 Users also failed to understand the labels of limiters,12 and they also failed to recognize the internal names of repositories and collections.13 Literature reviews on discovery systems have found terminology to be a common stumbling block for searchers across a wide number of individual studies.14 Similarly, users often struggle to understand the scope of options available to them when searching and the holdings information in item records. Users failed in multiple tests to distinguish between the article level and the journal level,15 could not interpret bibliographic information sufficiently to determine that they had found the desired item,16 and chose incorrect options for scoping their searches.17 Many studies found that users were unable to distinguish between multiple editions of a held item when all item types or editions were listed in the record.18 In other cases, users had difficulty interpreting locations and holdings information for physical items.19

Among the needs and desires expressed by and for Primo users in the literature, two in particular stand out. First, many users expressed a desire for more advanced search options; some wanted more complexity in certain facets and the ability to search within results,20 while other users simply wanted an advanced search option to be available.21 Secondly, a large number of studies indicated that instruction on Primo or other discovery systems was needed for users to search effectively. In some cases this was the conclusion of the researchers conducting the study,22 while in other cases users themselves either suggested or requested instruction on the system.23 It is also worth noting that it has been questioned whether usability testing as a whole is a sufficient mechanism for evaluating discovery-system functionality.
Prommann and Zhang found that usability testing has focused almost exclusively on the technical functioning of the software and has not adequately revealed the ability of discovery systems like Primo to successfully complete users' desired tasks.24 They proposed hierarchical task analysis (HTA) as an alternative, to examine users' most frequent desires and the capacity of discovery systems to meet them. Prommann and Zhang acknowledged, however, that as HTA is completed by an expert on the system rather than by an actual user, some of the valuable information derived from usability testing (including terms and functions that users do not understand, however well-designed) is lost in the process; they concluded that a combination of the two systems of testing is ideal to retain the best of both.

BACKGROUND

At the University of Houston Libraries, the Resource Discovery Systems department (RDS) is responsible for the maintenance and development of Primo. However, it is important to RDS to gather feedback and foster buy-in from stakeholders in the Library before making changes to the system. To that end, RDS works with two committees to assess the system and make recommendations for its improvement. The Discovery Usability Group and the Discovery Advisory Group include members from public services, technical services, and systems; each member brings a unique perspective on discovery. The Discovery Usability Group is charged with assessing the discovery system through a variety of methods including usability testing, focus groups, and user interviews. The Discovery Advisory Group reviews the results of user testing and makes recommendations for improvement. All changes to the discovery system are reviewed by the Groups before they are released for public use.

In fall 2014, several months after the Primo implementation, the Discovery Usability Group conducted a focus group with student workers from the library's information desk (a dual reference and circulation desk) to solicit feedback about the functionality of Primo and suggestions for its improvement. In the meantime, the Discovery Advisory Group was testing Primo and evaluating Primo sites at peer and aspirational institutions. The groups used the information collected through the focus group and research on Primo to make recommendations for improvement. RDS has access to a Primo development sandbox, and many of the recommended changes were made in the sandbox environment and reviewed by the two groups prior to public release.

Changes to the search box can be seen in figure 1. Rarely used tabs were replaced with a drop-down menu to the right of the search box to allow users to limit to "everything," "books+," or "digital library." To increase visibility, links to "Advanced Search" and "Browse Search" were made larger and more spacing was added.

Figure 1. Search Box in Live Site (Above) and Development Sandbox (Below) at Time of Testing

Changes were also made to create a cleaner and less cluttered search results page (see figure 2). More white space was added, and the links (or tabs) to "View Online," "Request," "Details," etc., were redesigned and renamed for clarity. For example, the "View Online" link was renamed "Preview Online" because it opens a box within the search results page that displays the item. The groups believed "Preview Online" more accurately represents what the link does.
Figure 2. Search Results in Live Site (Above) and Development Sandbox (Below) at Time of Testing

The facets were also redesigned to look cleaner and larger to attract users' attention (see figure 3).

Figure 3. Facets in Live Site and Development Sandbox at Time of Testing

Both groups were happy with the changes to the Primo development sandbox but wanted to test the effect of the changes on user search behavior before updating the live site. The Discovery Usability Group conducted a usability test within the development sandbox. The goal of the test was to find out if users could effectively complete common research tasks using Primo. With that goal in mind, the group developed a usability test and conducted it during the spring semester of 2015.

METHODOLOGY

The Discovery Usability Group developed a usability test using a "think-aloud" methodology, where users were asked to verbalize their thought process as they completed research tasks through Primo. Four tasks were designed to mirror tasks that users are likely to complete for class assignments or for general research. To minimize the testing time, each participant completed two tasks, with the facilitators alternating between two sets of tasks from one participant to the next.

Test 1

Task 1: You are trying to find an article that was cited in a paper you read recently. You have the following citation: Clapp, E., & Edwards, L. (2013). Expanding our vision for the arts in education. Harvard Educational Review, 83(1), 5–14. Please find this article using OneSearch [the public-facing name given to the Libraries' Primo implementation].

Task 2: You are doing a research project on the effects of video games on early childhood development. Find a peer-reviewed article on this topic, using OneSearch.

Test 2

Task 1: Recently your friend recommended the book The Lighthouse by P. D. James. Use OneSearch to find out if you can check this book out from the library.

Task 2: You are writing a paper about the drug cartels' influence on Mexico's relationship with the United States. Find a newspaper article on this topic, using OneSearch.

Two facilitators set up a table with a laptop in the front entrance of the library. They alternated between the facilitator and note-taker roles. Another group member took on the role of "caller" and recruited library patrons to participate in the study. The caller set up a table visible to those passing by, with library-branded T-shirts and umbrellas to incentivize participation. The caller explained what would be expected of each potential participant and went over the informed-consent document. After signing the form, the participant performed two tasks. After the test the participant received a library T-shirt or umbrella, and snacks. The facilitators used Morae usability software to record the screen and audio of each test. Participants were asked for permission to record their sessions but could opt out. During the three-hour testing period, fifteen library patrons participated in the study, and fourteen sessions were recorded. Of the fifteen participants, thirteen were undergraduate students (four freshmen, one sophomore, seven juniors, and two seniors), one was a graduate student, and one was a post-baccalaureate student.
The majority of the participants were from the sciences, along with two students from the College of Business and two from the School of Communications. There were no participants from the humanities.

The facilitators took notes on a rubric (see table 1) that simplified the processes of coding and reviewing the recordings. After the usability testing, the facilitators reviewed the notes and recordings, coded them for common themes and breakdowns, and prepared a report of their findings and design recommendations. The facilitators sent the report, along with audio and screen recordings, to the Discovery Advisory Group, which reviewed them along with RDS. The Discovery Advisory Group made additional design recommendations, and RDS used the information and recommendations to implement additional customizations to the Primo development sandbox.

Preliminary Questions
ASK: What is your affiliation with the University of Houston? Year? Major?
ASK: How often do you use the library website? For what purpose(s)?
Task 1
Describe the steps the participant took to complete the task (S/U)
ASK: How did you feel about this task? What was simple? What was difficult?
ASK: Is there anything that would make completing this task easier?
Task 2
Describe the steps the participant took to complete the task (S/U)
ASK: How did you feel about this task? What was simple? What was difficult?
ASK: Is there anything that would make completing this task easier?
Follow-up Question
ASK: What can we do to improve the overall experience using OneSearch?

Table 1. Task Completion Rubric for Test 1

RESULTS

Test 1, Task 1: You are trying to find an article that was cited in a paper you read recently. You have the following citation: Clapp, E., & Edwards, L. (2013). Expanding our vision for the arts in education. Harvard Educational Review, 83(1), 5–14. Please find this article using OneSearch.

Participant | Time on Task | Task Completion
1 | 1m 54s | Y
2 | 4m 13s | Y
3 | 1m 26s | Y
4 | 1m 17s | Y
5 | 1m 26s | Y (required assistance)
6 | 1m 43s | Y
7 | 1m 27s | Y
8 | 1m 5s | Y

Table 2. Results for Test 1, Task 1

All eight participants successfully completed this task, although sophistication and efficiency varied between participants. Some searched by the authors’ last names, which was not specific enough to return the item in question. Four participants attempted to use advanced search or the drop-down menu to the right of the search box to pre-filter their results. Two participants viewed the options in the drop-down menu, which were “everything,” “books+,” and “digital library,” and left it on the default “everything” search. When prompted, the participants explained that they were expecting the drop-down to contain title and/or author limiters. Similarly, participants expected an author limiter in the advanced search. The citation format seemed to confuse participants, and they tended to search for the piece of information that was listed first—the authors—rather than the most unique piece of information—the title. If the first search did not return the correct item in the first few results, the participant would modify their search, searching for a different element of the citation or adding another element of the citation to the initial search, until the item they were looking for appeared as one of the first few results.
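As a side note on the raw numbers, the time-on-task values reported in tables 2 through 5 reduce to simple summary statistics. The following Python sketch of that arithmetic uses the Table 2 data; the parsing code is our own illustration, not part of the study’s instruments:

    import re

    # Times and outcomes transcribed from Table 2 ("Y" = task completed).
    results = [("1m 54s", "Y"), ("4m 13s", "Y"), ("1m 26s", "Y"), ("1m 17s", "Y"),
               ("1m 26s", "Y"), ("1m 43s", "Y"), ("1m 27s", "Y"), ("1m 5s", "Y")]

    def to_seconds(time_str):
        # Convert an "Xm Ys" string (minutes optional) to seconds.
        match = re.match(r"(?:(\d+)m\s*)?(\d+)s", time_str)
        return int(match.group(1) or 0) * 60 + int(match.group(2))

    times = [to_seconds(t) for t, outcome in results]
    rate = sum(outcome == "Y" for t, outcome in results) / len(results)
    print(f"Completion rate: {rate:.0%}")                        # 100%
    print(f"Mean time on task: {sum(times) / len(times):.0f}s")  # 109s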
Participant 5 thought they had successfully completed the task, but the facilitator had to point out that the item they chose did not match the citation exactly; on the second try they found the correct item. Participant 2 worked on the task for more than four minutes, significantly longer than the other seven participants. They immediately navigated to advanced search and filled out several fields in the advanced search form with the elements of the citation. If the search did not return their item, they added more elements until they finally found it. Simply searching the title in the citation would have returned the item as the first search result. Filling out the advanced search form with all of the information from the citation does not necessarily increase a user’s chances of finding the item in a discovery system, though it might do so when searching in an online catalog or subject database.

The Discovery Advisory and Usability Groups made two recommendations to address some of the identified issues: include an author search option in the advanced search, and add an “articles+” option to the drop-down menu on the basic search. RDS implemented both recommendations. The Discovery Usability Group identified confusion around citations as a common breakdown during this task. The groups recommended providing instructional information about searching for known items to address this breakdown; however, RDS is still working on an effective method to provide this information in a simple and visible way.

Test 1, Task 2: You are doing a research project on the effects of video games on early childhood development. Find a peer-reviewed article on this topic, using OneSearch.

Participant | Time on Task | Task Completion
1 | 3m 44s | Y
2 | 2m 21s | Y
3 | 5m 23s | Y (required assistance)
4 | 2m 5s | Y
5 | 3m 32s | Y
6 | 2m 45s | Y
7 | 3m 8s | Y
8 | 3m 1s | Y (required assistance)

Table 3. Results for Test 1, Task 2

All eight participants successfully found an article on this topic but were less successful in determining whether the article was peer-reviewed. Only one participant used the “Peer-reviewed Journals” facet without being prompted. Three users noticed the “[Peer-reviewed Journal]” note in the record information for search results and used it to determine whether the article was peer-reviewed. One participant went to the full text of an article, said it “seemed” like it was peer-reviewed, and considered the task complete. The resource type facets were more heavily used during this task than the “Peer-reviewed Journals” facet, despite its being promoted to the top of the list of facets. Two participants used the “Articles” facet, and two participants used the “Reviews” facet, thinking it limited to peer-reviewed articles. Participants 3 and 8 needed help from the facilitator to determine whether a source was peer-reviewed. There was an overall misunderstanding of what peer-reviewed means, which affected participants’ confidence in completing the task. The design recommendations based on this task included changing the “Peer-reviewed Journals” facet to “Peer-reviewed Articles” or simply “Peer-reviewed.” RDS changed the facet to “Peer-reviewed Articles” to help alleviate confusion.
Additionally, the groups recommended emphasizing the “[Peer-reviewed Journal]” designations within the search results and providing a method for limiting to peer-reviewed materials before conducting a search. Customization limitations of the system have so far prevented RDS from implementing these design recommendations. A way to address the breakdowns caused by misunderstanding terminology also has yet to be identified. It was disheartening that participants did not use the “Peer-reviewed Journals” facet despite its being purposefully emphasized on the search results page.

Test 2, Task 1: Recently your friend recommended the book The Lighthouse by P. D. James. Use OneSearch to find out if you can check this book out from the library.

Participant | Time on Task | Task Completion
1 | 1m 7s | Y
2 | 56s | Y
3 | No recording | Y
4 | 2m 21s | Y
5 | 1m 8s | Y
6 | 2m 14s | Y
7 | 1m 15s | Y

Table 4. Results for Test 2, Task 1

All seven participants were able to find this book using Primo but had difficulty determining what to do once they found it. For this task every participant searched by title and found the book as the first search result. Four users limited to “books+” using the drop-down menu before searching, while the other three remained in the default “everything” search. Only one participant used the locations tab within the search results to determine availability; the others clicked the title and went to the item’s catalog record. All participants were able to determine that the book was available in the library, but there was an overall lack of understanding about how to use the information in the catalog to check out a book. Participant 1 said that they would write down the call number, take it to the information desk, and ask how to find it, which was the most sophisticated response of all seven participants. Participant 4 spent nearly two minutes clicking through links in the OPAC expecting to find a “Check Out” button and only stopped when the facilitator stepped in.

A recommended design change based on this task was to have call numbers in Primo and the online catalog link to a stacks guide or map. This is a feature that may be developed in the future, but technical limitations prevented RDS from implementing it in time for the release of the redesigned search interface. As with the previous tasks, some of the breakdowns occurred because of a lack of understanding of library services. Users easily figured out that there was a copy of the book in the library but had little sense of what to do next. None of the participants successfully located the stacks guide or the request feature that would put the item on hold for them. Steps should be taken to direct users to these features more effectively.

Test 2, Task 2: You are writing a paper about the drug cartels’ influence on Mexico’s relationship with the United States. Find a newspaper article on this topic, using OneSearch.

Participant | Time on Task | Task Completion
1 | 4m 45s | Y (required assistance)
2 | 59s | Y
3 | No recording | N
4 | 7m 47s | Y
5 | 2m 52s | Y
6 | 1m 33s | Y
7 | 1m 30s | Y

Table 5. Results for Test 2, Task 2

This task was difficult for participants. Two users limited their search initially to “digital library” using the drop-down menu, thinking it would be a place to find newspaper articles; their searches returned zero results.
Only two users used the “Newspaper Articles” facet without being prompted, and users did not seem to readily distinguish newspaper articles as a resource type. Participants did not notice the resource type icons without being prompted. Several participants needed to be reminded that the task was to find a newspaper article, and not any other type of article. With guidance, most participants were able to complete the task. Participant 4 remained on the task for almost eight minutes because of their dissatisfaction with the relevancy of the results to the prompt. Interestingly, they found the “Newspaper Articles” facet and reapplied it after each modified search, suggesting that they learned to use system features as they went.

One of the recommendations based on this task was to remove “digital library” as an option in the drop-down menu on the basic search. It was evident that “digital library” did not have the same meaning to end users as it does to internal users. This recommendation was easily implemented. Another recommendation was to emphasize the resource type icons within the search results, but we have not determined a way to do so effectively. One suggestion from the Discovery Usability Group was to exclude newspaper articles from the search results as a default, but no consensus was reached on this issue.

LIMITATIONS

The Discovery Usability Group identified limitations to the usability test that should be noted. Testing was done in a high-traffic portion of the library’s lobby, which is used as study space by a broad range of students. Participants were recruited from this study space, and we chose not to screen participants. The fifteen participants in the study did not constitute a representative sample. Almost all participants were undergraduate students, and no humanities majors participated. The outcomes might have been different if our participants had included more experienced researchers or students from a broader range of disciplines. However, by adding screening questions or choosing a more neutral location, we would have limited the number of participants who could complete our testing.

Another limitation was that the participants started the usability test within the Primo interface. Because Primo is integrated into the Libraries’ website, users would typically begin searching the system from the library homepage. The goals of the study required testing of our Primo development sandbox, which was not yet available to the public and therefore could not be accessed in the same way. This gave participants some additional options from the initial search pages that are not usually available through the main search interface. While testing an active version of the interface would have been preferable, one of our goals was to understand how our modifications affected user behavior, so testing the unmodified version was not an acceptable substitute. Additionally, the usability study presented tasks out of context and did not replicate a true user-searching experience. Despite these limitations, we learned valuable lessons from the participants in this study.

DISCUSSION

Users successfully completed the tasks in this usability study. Unfortunately, they did not take advantage of many of the features that can make such tasks easier—particularly facets.
This was especially apparent when we asked users to find a peer-reviewed journal article (Test 1, Task 2). Primo has a facet that will limit a search to only peer-reviewed journal articles, and only one out of eight participants used this facet during this task. Participants appreciated the pre-search filtering options and requested more of them (such as an author search), while post-search facets were underutilized. Similarly, participants almost uniformly ignored the links, or tabs, within the search results, which would provide users with more information, a preview of the full text, and additional features such as an email function. Users bypassed these options and clicked on the title instead. The Discovery Usability Group theorized that users clicked on the title of the item because that behavior would be successful in a more familiar search interface like Google. The team customized the configuration so that a title click would open either the full text of electronic items or the catalog record for physical items to accommodate users’ instinctive search behaviors. The tabs, though a prominent feature of the discovery system, have proved to have little value for users.

Throughout the implementation of discovery systems in academic libraries, both research studies and anecdotal evidence have suggested that users do not find end-user features like facets valuable; however, discovery system vendors have made no apparent attempt to reimagine the possibilities for search refinements. Indeed, most of the findings in this study will present few surprises to anyone familiar with the discovery usability literature, which is itself concerning. As our literature review has shown, many of the same general usability issues have repeated throughout studies of Primo since 2008, and most are very similar to usability issues in other, competitor discovery systems. This raises some concerns about the pace of innovation in the discovery field, and whether discovery vendors are genuinely taking into account the research findings about the needs of our users as they refine their products. In a recent article, David Nelson and Linda Turney identified many issues with discovery facets in their current form that may be barriers to usage, particularly labeling and library jargon; we join them in urging vendors and libraries to collaborate more closely for deep analysis of actual facet usage by users, and to address those factors that have negatively affected facets’ value.25

During our usability study, a common barrier to the successful completion of a task was not the technology itself but a lack of understanding of the task. Participants had difficulty deciphering a citation, which may have led to their tendency to search for a journal article by author and not by title. Many participants struggled with using call numbers and with how to find and check out books in the library. Peer review also proved to be a difficult or unfamiliar concept for many; when looking for peer-reviewed articles, some participants clicked on the “Reviews” facet, which limited their searches to an inappropriate resource type. Additionally, participants did not differentiate between journal articles and newspaper articles, which may indicate a broader inability to differentiate between scholarly and nonscholarly resources.
This effect may have been exaggerated by the high percentage of science students who participated, as these students may not have frequent need for newspaper articles. All of these challenges, however, are indicative of a deeper problem with terminology. Regardless of how simple it is to limit a search to peer-reviewed articles, a user who does not understand what peer review means cannot complete the task with confidence or certainty. Librarians struggle with presenting understandable language and avoiding library terminology; as we discovered, academic language, like “peer-reviewed” and “citation,” presents a similar problem. These are not issues that can be resolved with a technological solution. Rather, we join previous authors in suggesting that instruction may be a reasonable way to address many usability issues in Primo.

From our findings and from those in the wider literature, we conclude that general instruction in information literacy is a prerequisite for effective use of this or any research tool, particularly for undergraduates. Nichols et al. “recommend studying how to effectively provide instruction on Primo searching and results interpretation,”26 but instruction on the use of a single tool is of limited utility to students in their academic lives. Instead, libraries could bolster information literacy instruction on key concepts around the production and storage of information, scholarly communications, and differences in information types. Teaching these concepts effectively should help to alleviate the most common user issues, including understanding terminology and different types of information, as well as helping students to understand key elements of research in general. This is a particularly important point for librarians working as advocates for information literacy instruction, especially in cases where administrators or faculty may feel that more advanced tools, like discovery systems, should make instruction obsolete.

CONCLUSION

Several changes were made to the Primo interface in response to breakdowns identified during the usability study. Resource Discovery Systems first implemented the changes in the Primo development sandbox. After the Discovery Usability and Advisory Groups agreed on the changes, they were made available on the live site (see figures 4–6). The redesigned search results page became available to the general public between the spring and summer academic sessions of 2015. In addition to the changes that were made because of the usability study, RDS made changes to the look and feel to make the search results interface more aesthetically pleasing and more in line with the University of Houston brand.

Figure 4. Primo Interface before Usability Testing (live site)

Figure 5. Primo Interface during Usability Testing (development sandbox)

Figure 6. Primo Interface after Usability Testing (live site)

Many of the larger assertions of this study, encompassing implications for instruction and our needs from discovery vendors, will require further study to address. The authors intend to continue to investigate these issues as additional usability testing is conducted and to use the data to support future vendor relations and instructional curriculum development discussions.

REFERENCES
1. Tamar Sadeh, “User Experience in the Library: A Case Study,” New Library World 109, no. 1/2 (2008): 7–24, doi:10.1108/03074800810845976.

2. Aaron Nichols et al., “Kicking the Tires: A Usability Study of the Primo Discovery Tool,” Journal of Web Librarianship 8, no. 2 (2014): 172–95, doi:10.1080/19322909.2014.903133; Scott Hanrath and Miloche Kottman, “Use and Usability of a Discovery Tool in an Academic Library,” Journal of Web Librarianship 9, no. 1 (2015): 1–21, doi:10.1080/19322909.2014.983259.

3. David J. Comeaux, “Usability Testing of a Web-Scale Discovery System at an Academic Library,” College & Undergraduate Libraries 19, no. 2–4 (2012): 189–206, doi:10.1080/10691316.2012.695671.

4. Kylie Jarrett, “FindIt@Flinders: User Experiences of the Primo Discovery Search Solution,” Australian Academic & Research Libraries 43, no. 4 (2012): 278–300; Nichols et al., “Kicking the Tires.”

5. Sadeh, “User Experience in the Library.”

6. Jarrett, “FindIt@Flinders”; Nichols et al., “Kicking the Tires.”

7. Xi Niu, Tao Zhang, and Hsin-liang Chen, “Study of User Search Activities with Two Discovery Tools at an Academic Library,” International Journal of Human-Computer Interaction 30, no. 5 (2014), doi:10.1080/10447318.2013.873281; Hanrath and Kottman, “Use and Usability of a Discovery Tool in an Academic Library.”

8. Rice Majors, “Comparative User Experiences of Next-Generation Catalogue Interfaces,” Library Trends 61, no. 1 (2012): 186–207, doi:10.1353/lib.2012.0029; Niu, Zhang, and Chen, “Study of User Search Activities with Two Discovery Tools at an Academic Library.”

9. Beth Thomsett-Scott and Patricia E. Reese, “Academic Libraries and Discovery Tools: A Survey of the Literature,” College & Undergraduate Libraries 19, no. 2–4 (2012): 123–43, doi:10.1080/10691316.2012.697009.

10. Sadeh, “User Experience in the Library.”

11. Comeaux, “Usability Testing of a Web-Scale Discovery System at an Academic Library.”

12. Jessica Mahoney and Susan Leach-Murray, “Implementation of a Discovery Layer: The Franklin College Experience,” College & Undergraduate Libraries 19, no. 2–4 (2012): 327–43, doi:10.1080/10691316.2012.693435.

13. Joy Marie Perrin et al., “Usability Testing for Greater Impact: A Primo Case Study,” Information Technology & Libraries 33, no. 4 (2014): 57–67.

14. Majors, “Comparative User Experiences of Next-Generation Catalogue Interfaces”; Thomsett-Scott and Reese, “Academic Libraries and Discovery Tools.”

15. Jarrett, “FindIt@Flinders”; Mahoney and Leach-Murray, “Implementation of a Discovery Layer.”

16. Jarrett, “FindIt@Flinders”; Mahoney and Leach-Murray, “Implementation of a Discovery Layer”; Nichols et al., “Kicking the Tires.”

17. Jarrett, “FindIt@Flinders”; Mahoney and Leach-Murray, “Implementation of a Discovery Layer”; Perrin et al., “Usability Testing for Greater Impact: A Primo Case Study.”

18. Jarrett, “FindIt@Flinders”; Nichols et al., “Kicking the Tires”; Hanrath and Kottman, “Use and Usability of a Discovery Tool in an Academic Library”; Majors, “Comparative User Experiences of Next-Generation Catalogue Interfaces.”

19. Comeaux, “Usability Testing of a Web-Scale Discovery System at an Academic Library”; Thomsett-Scott and Reese, “Academic Libraries and Discovery Tools.”
20. Jarrett, “FindIt@Flinders.”

21. Mahoney and Leach-Murray, “Implementation of a Discovery Layer”; Perrin et al., “Usability Testing for Greater Impact.”

22. Mahoney and Leach-Murray, “Implementation of a Discovery Layer”; Nichols et al., “Kicking the Tires”; Niu, Zhang, and Chen, “Study of User Search Activities with Two Discovery Tools at an Academic Library.”

23. Thomsett-Scott and Reese, “Academic Libraries and Discovery Tools.”

24. Tao Zhang and Merlen Prommann, “Applying Hierarchical Task Analysis Method to Discovery Layer Evaluation,” Information Technology & Libraries 34, no. 1 (2015): 77–105, doi:10.6017/ital.v34i1.5600.

25. David Nelson and Linda Turney, “What’s in a Word? Rethinking Facet Headings in a Discovery Service,” Information Technology & Libraries 34, no. 2 (2015): 76–91, doi:10.6017/ital.v34i2.5629.

26. Nichols et al., “Kicking the Tires,” 184.

9152 ---- Hitting the Road Towards a Greater Digital Destination: Evaluating and Testing DAMS at University of Houston Libraries

Annie Wu, Santi Thompson, Rachel Vacek, Sean Watkins, and Andrew Weidner

Annie Wu (awu@uh.edu) is Head of Metadata and Digitization Services, Santi Thompson (sathompson3@uh.edu) is Head of Repository Services, Rachel Vacek (evacek@uh.edu) is Head of Web Services, Sean Watkins (slwatkins@uh.edu) is Web Projects Manager, and Andrew Weidner (ajweidner@uh.edu) is Metadata Services Coordinator, University of Houston Libraries.

ABSTRACT

Since 2009, tens of thousands of rare and unique items have been made available online for research through the University of Houston (UH) Digital Library. Six years later, the UH Libraries’ new digital initiatives call for a more dynamic digital repository infrastructure that is extensible, scalable, and interoperable. The UH Libraries’ mission and the mandate of its strategic directions drive the pursuit of seamless access and expanded digital collections. To answer the calls for technological change, the UH Libraries administration appointed a Digital Asset Management System (DAMS) Implementation Task Force to explore, evaluate, test, recommend, and implement a more robust digital asset management system. This article focuses on the task force’s DAMS selection activities: needs assessment, systems evaluation, and systems testing. The authors also describe the task force’s DAMS recommendation based on the evaluation and testing data analysis, a comparison of the advantages and disadvantages of each system, and system cost. Finally, the authors outline their DAMS implementation strategy, comprised of a phased rollout with the following stages: system installation, data migration, and interface development.

INTRODUCTION

Since the launch of the University of Houston Digital Library (UHDL) in 2009, the UH Libraries have made tens of thousands of rare and unique items available online for research using CONTENTdm. As we began to explore and expand into new digital initiatives, we realized that the UH Libraries’ digital aspirations require a more dynamic, flexible, scalable, and interoperable digital asset management system that can manage larger amounts of materials in a variety of formats. We plan to implement a new digital repository infrastructure that accommodates creative workflows and allows for the configuration of additional functionalities such as digital exhibits, data mining, cross-linking, geospatial visualization, and multimedia presentation.
The new system will be designed with linked data in mind and will allow us to publish our digital collections as linked open data within the larger semantic web environment.

The UH Libraries Strategic Directions set forth a mandate for us to “work assiduously to expand our unique and comprehensive collections that support curricula and spotlight research. We will pursue seamless access and expand digital collections to increase national recognition.”1 To fulfill the UH Libraries’ mission and the mandate of our Strategic Directions, the UH Libraries administration appointed a Digital Asset Management System (DAMS) Implementation Task Force to explore, evaluate, test, recommend, and implement a more robust digital asset management system that would provide multiple modes of access to the UH Libraries’ unique collections and accommodate digital object production at a larger scale. The collaborative task force comprises librarians from four departments: Metadata and Digitization Services (MDS), Web Services, Digital Repository Services, and Special Collections. The core charge of the task force is to:

• Perform a needs assessment and build criteria and policies based on evaluation of the current system and requirements for the new DAMS
• Research and explore DAMS on the market and identify the top three systems for beta testing in a development environment
• Generate preliminary recommendations from stakeholders’ comments and feedback
• Coordinate installation of the new DAMS and finish data migration
• Communicate the task force work to UH Libraries colleagues

LITERATURE REVIEW

Libraries have maintained DAMS for the publication of digitized surrogates of rare and unique materials for over two decades. During that time, information professionals have developed evaluation strategies for testing, comparing, and evaluating library DAMS software. Reviewing these models and associated case studies provided insight into common practices for selecting systems and informed how the UH Libraries DAMS Implementation Task Force conducted its evaluation process.

One of the first publications of its kind, “A Checklist for Evaluating Open Source Digital Library Software” by Dion Hoe-Lian Goh et al., presents a comprehensive list of criteria for library DAMS evaluation.2 The researchers developed twelve broad categories for testing (e.g., content management, metadata, and preservation) and generated a scoring system based on the assignment of a weight and a numeric value to each criterion.3 While the checklist was created to assist with the evaluation process, the authors note that an institution’s selection decision should be guided primarily by defining the scope of their digital library, the content being curated using the software, and the uses of the material.4 Through their efforts, the authors created a rubric that can be utilized by other organizations when selecting a DAMS.
Subsequent research projects have expanded upon the checklist evaluation model. In “Choosing Software for a Digital Library,” Jody DeRidder outlines major issues that librarians should address when choosing DAMS software, including many of the hardware, technological, and metadata concerns that Goh et al. identified.5 Additionally, she emphasizes the need to account for personnel and service requirements with a variety of activities: usability testing and estimating associated costs; conducting a formal needs assessment to guide the evaluation process; and a tiered-testing approach, which calls upon evaluators to winnow the number of systems.6 By considering stakeholder needs, from users to library administrators, DeRidder’s contributions inform a more comprehensive DAMS evaluation process.

In addition to creating evaluation criteria, the literature on DAMS selection has also produced case studies that reflect real-world scenarios and identify use cases that help determine user needs and desires. In “Evaluation of Digital Repository Software at the National Library of Medicine,” Jennifer L. Marill and Edward C. Luczak discuss the process that the National Library of Medicine (NLM) used to compare ten DAMS, both proprietary and open-source.7 Echoing Goh et al. and DeRidder, Marill and Luczak created broad categories for testing and developed a scoring system for comparing DAMS.8 Additionally, Marill and Luczak enriched the evaluation process by implementing two testing phases: “initial testing of ten systems” and “in-depth testing of three systems.”9 This method allowed NLM to conduct extensive research on the most promising systems for their needs before selecting a DAMS to implement. The tiered approach appealed to the task force, and influenced how it conducted the evaluation process, because it balances efficiency and comprehensiveness.

In another case study, Dora Wagner and Kent Gerber describe the collaborative process of selecting a DAMS across a consortium. In their article “Building a Shared Digital Collection: The Experience of the Cooperating Libraries in Consortium,”10 the authors emphasize additional criteria that are important for collaborating institutions: the ability to brand consortial products for local audiences; the flexibility to incorporate differing workflows for local administrators; and the shared responsibility of system maintenance and costs.11 While the UH Libraries will not be managing a shared repository DAMS, the task force appreciated the article’s emphasis on maximizing customizations to improve the user experience.

In “Evaluation and Usage Scenarios of Open Source Digital Library and Collection Management Tools,” Georgios Gkoumas and Fotis Lazarinis describe how they tested multiple open-source systems against typical library functions—such as acquisitions, cataloging, digital libraries, and digital preservation—to identify typical use cases for libraries.12 Some of the use cases formulated by the researchers address digital platforms, including features related to supporting a diverse array of metadata schema and using a simple web interface for the management of digital assets.13 These use cases mirror local feature and functionality requests incorporated into the UH Libraries’ evaluation criteria.
In “Digital Libraries: Comparison of 10 Software,” Mathieu Andro, Emmanuelle Asselin, and Marc Maisonneuve discuss a rubric they developed to compare six open-source platforms (Invenio, Greenstone, Omeka, EPrints, ORI-OAI, and DSpace) and four proprietary platforms (Mnesys, DigiTool, YooLib, and CONTENTdm) around six core areas: document management, metadata, engine, interoperability, user management, and Web 2.0.14 The authors note that each solution is “of good quality” and that institutions should consider a variety of factors when selecting a DAMS, including the “type of documents you will want to upload” and the “political criteria (open source or proprietary software)” desired by the institution.15 This article provided the UH Libraries with additional factors to include in their evaluation criteria.

Finally, Heather Gilbert and Tyler Mobley’s article “Breaking Up with CONTENTdm: Why and How One Institution Took the Leap to Open Source” provides a case study for a new trend: selecting a DAMS for migration from an existing system to a new one.16 The researchers cite several reasons for their need to select a new DAMS, primarily their current system’s limitations with searching and displaying content in the digital library.17 They evaluated alternatives and selected a suite of open-source tools, including Fedora, Drupal, and Blacklight, which combine to make up their new DAMS.18 Gilbert and Mobley also reflect on the migration process and identify several hurdles they had to overcome, such as customizing the open-source tools to meet their localized needs and confronting inconsistent metadata quality.19 Gilbert and Mobley’s article most closely matches the scenario faced by the UH Libraries.

Our study adds to the limited literature on evaluating and selecting DAMS for migration in several ways. It demonstrates another model that other institutions can adapt to meet their specific needs. It identifies new factors for other institutions to take into account before or during their own migration process. Finally, it adds to the body of evidence for a growing movement of libraries migrating from proprietary to open-source DAMS.

DAMS EVALUATION AND ANALYSIS METHODOLOGY

Needs Assessment

The DAMS Implementation Task Force fulfilled the first part of its charge by conducting a needs assessment. The goal of the needs assessment was to collect the key requirements of stakeholders, identify future features of the new DAMS, and gather data in order to craft criteria for evaluation and testing in the next phase of its work. The task force employed several techniques for information gathering during the needs assessment phase:

• Identified stakeholders and held internal focus group interviews to identify system requirement needs and gaps
• Reviewed scholarly literature on DAMS evaluation and migration
• Researched peer/aspirational institutions
• Reviewed national standards around DAMS
• Determined both the current and projected use of the UHDL
• Identified UHDL materials and users

Task force members took detailed notes during each focus group interview session. The literature research on DAMS evaluation helped the task force find articles with comprehensive DAMS evaluation criteria.
The NISO criteria for core types of entities in digital library collections were also listed and applied to the evaluation after reviewing the NISO Framework of Guidance for Building Good Digital Collections.20 More than forty peer and aspirational institutions’ digital repositories were benchmarked to identify web site names, platform architecture, documentation, and user and system features. The task force analyzed the rich data gathered from the needs assessment activities and built the DAMS evaluation criteria that prepared the task force for the next phase of evaluation.

Evaluation, Testing, and Recommendation

The task force began its evaluation process by identifying twelve potential DAMS for consideration that were ultimately narrowed down to three systems for in-depth testing. Using data from focus group interviews, literature reviews, and DAMS best practices, the group generated a list of benchmark criteria. These broad evaluation criteria covered features in categories of system functionality, content management, metadata, user interface, and search support. Members of the task force researched DAMS documentation, product information, and related literature to score each system against the evaluation criteria. Table 1 contains the scores of the initial evaluation. From this process, five systems emerged with the highest scores:

• Fedora (and, closely associated, Fedora/Hydra and Fedora/Islandora)
• Collective Access
• DSpace
• Rosetta
• CONTENTdm

The task force eliminated Collective Access from the final systems for testing because of its limited functionality: it is designed around archival content only and is not widely deployed. The task force decided not to test CONTENTdm because of the system’s known functionalities, which we identified through firsthand experience. After the initial elimination process, Fedora (including Fedora/Hydra and Fedora/Islandora), DSpace, and Rosetta remained for in-depth testing.

DAMS | Evaluation Score*
Fedora | 27
Fedora/Hydra | 26
Fedora/Islandora | 26
Collective Access | 24
DSpace | 24
Rosetta | 20
CONTENTdm | 20
Trinity (iBase) | 19
Preservica | 16
Luna Imaging | 15
RODA† | 6
Invenio‡ | 5

Table 1. Evaluation scores of twelve DAMS using broad evaluation criteria
* Total possible score: 29.
† Removed from evaluation because the system does not support Dublin Core metadata.
‡ Removed from evaluation because the system does not support Dublin Core metadata.

The task force then created detailed evaluation and testing criteria by drawing from the same sources used previously: focus groups, literature review, and best practices. While the broad evaluation focused on high-level functions, the detailed evaluation and testing criteria for the final three systems closely analyzed the specific features of each DAMS in eight categories:

• System Environment and Function
• Administrative Access
• Content Ingest and Management
• Metadata
• Content Access
• Discoverability
• Report and Inquiry Capabilities
• System Support

Prior to the in-depth testing of the final three systems, the task force researched timelines for system setup. Rosetta’s timeline for system setup proved to be prohibitive. Consequently, the task force eliminated Rosetta from the testing pool and moved forward with Fedora and DSpace.
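Both rounds of evaluation reduce to the same bookkeeping: assign each system a numeric score per criterion and tally the results. A minimal Python sketch of that tally follows; the feature names and scores are invented for illustration, and the 0–3 scale matches the detailed evaluation described in the next section:

    # Hypothetical per-feature scores on a 0-3 scale
    # (0 = None, 1 = Low, 2 = Moderate, 3 = High).
    scores = {
        "Fedora": {"batch ingest": 2, "linked data support": 3, "custom reports": 2},
        "DSpace": {"batch ingest": 3, "linked data support": 0, "custom reports": 1},
    }

    for system, features in scores.items():
        total = sum(features.values())
        print(f"{system}: {total} out of a possible {3 * len(features)}")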
To conduct the detailed evaluation, the task force scored the specific features under each category using systems testing and documentation. A score from zero to three (0 = None, 1 = Low, 2 = Moderate, 3 = High) was assigned to each feature evaluated. After evaluating all features, the scores were tallied for each category. Our testing revealed that Fedora outperformed DSpace in over half of the testing sections: Content Ingest and Management, Metadata, Content Access, Discoverability, and Report and Inquiry Capabilities. See table 2 for the tallied scores in each testing section.

Testing Section | DSpace Score | Fedora Score | Possible Score
System Environment and Testing | 21 | 21 | 36
Administrative Access | 15 | 12 | 18
Content Ingest and Management | 59 | 96 | 123
Metadata | 32 | 43 | 51
Content Access | 14 | 18 | 18
Discoverability | 46 | 84 | 114
Report and Inquiry Capabilities | 6 | 15 | 21
System Support | 12 | 11 | 12
TOTAL SCORE | 205 | 300 | 393

Table 2. Scores of top two DAMS from testing using detailed evaluation criteria

After review of the testing results, the task force conducted a facilitated activity to summarize the advantages and disadvantages of each system. Based on this comparison, the DAMS Task Force recommended that the UH Libraries implement a Fedora/Hydra repository architecture with the following course of action:

• Adapt the UHDL user interface to Fedora and re-evaluate it for possible improvements
• Develop an administrative content management interface with the Hydra framework
• Migrate all UHDL content to a Fedora repository

Fedora/Hydra advantages: Open source; Large development community; Linked data ready; Modular design through API; Scalable, sustainable, and extensible; Batch import/export of metadata; Handles any file format.

Fedora/Hydra disadvantages: Steep learning curve; Long setup time; Requires additional tools for discovery; No standard model for multi-file objects.

Table 3. Fedora/Hydra advantages and disadvantages

The primary advantages of a DAMS based on Fedora/Hydra are: a large and active development community; a scalable and modular system that can grow quickly to accommodate large-scale digitization; and a repository architecture based on linked data technologies. This last advantage, in particular, is unique among all systems evaluated and will give the UH Libraries the ability to publish our collections as linked open data. Fedora 4 conforms to the World Wide Web Consortium (W3C) recommendation for Linked Data Platforms.21 The main disadvantage of a Fedora/Hydra system is the steep learning curve associated with designing metadata models and developing a customized software suite, which translates to a longer implementation time compared to off-the-shelf products. The UH Libraries must allocate an appropriate amount of time and resources for planning, implementation, and staff training. The long-term return on investment for this path will be a highly skilled technical staff with the ability to maintain and customize an open-source, standards-based repository architecture that can be expanded to support other UH Libraries content such as geospatial data, research data, and institutional repository materials.
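Conformance with the Linked Data Platform recommendation means that repository objects are exposed as RDF resources and containers addressable by URI. As a minimal illustration in Turtle of the kind of description an LDP server returns for a container (the URIs and title here are invented, not drawn from the UHDL):

    @prefix ldp: <http://www.w3.org/ns/ldp#> .
    @prefix dcterms: <http://purl.org/dc/terms/> .

    # A container and one contained resource, as an LDP server might describe them.
    <http://repo.example.org/collections/photographs>
        a ldp:BasicContainer ;
        dcterms:title "Photograph Collection" ;
        ldp:contains <http://repo.example.org/collections/photographs/item1> .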
DSpace advantages: Open source; Easy installation / ready out of the box; Existing familiarity through Texas Digital Library; User group / profile controls; Metadata quality module; Batch import of objects.

DSpace disadvantages: Flat file and metadata structure; Limited reporting capabilities; Limited metadata features; Does not support linked data; Limited API; Not scalable / extensible; Poor user interface.

Table 4. DSpace advantages and disadvantages

The main advantages of DSpace are ease of installation, familiarity of workflows, and additional functionality not found in CONTENTdm.22 Installation and migration to a DSpace system would be relatively fast, and staff could quickly transition to new workflows because they are similar to CONTENTdm. DSpace also supports authentication and user roles that could be used to limit content to the UH community only. Commercial add-on modules, although expensive, could be purchased to provide more sophisticated content management tools than are currently available with CONTENTdm. The disadvantages of a DSpace system are the same long-term, systemic problems as with the current CONTENTdm repository. DSpace uses a flat metadata structure, has a limited API, does not scale well, and is not customizable to the UH Libraries’ needs. Consultations with peers indicated that both CONTENTdm and DSpace institutions are exploring the more robust capabilities of Fedora-based systems. Migration of the digital collections in CONTENTdm to a DSpace repository would provide few, if any, long-term benefits to the UH Libraries.

Of all the systems considered, implementation of a Fedora/Hydra repository aligns most clearly with the UH Libraries Strategic Directions of attaining national recognition and improving access to our unique collections. The Fedora and Hydra communities are very active, with project management overseen by DuraSpace and Hydra, respectively.23,24 Over the long term, a repository based on Fedora/Hydra will give the UH Libraries a low-cost, scalable, flexible, and interoperable platform for providing online access to our unique collections.

Cost Considerations

To balance the current digital collections production schedule with the demands of a timely implementation and migration, the task force identified the following investments as cost effective for Fedora/Hydra and DSpace, respectively:

Fedora/Hydra:
• Metadata Librarian (annual salary): manages daily Metadata Unit operations during implementation and streamlines the migration process

DSpace:
• Metadata Librarian (annual salary): manages daily Metadata Unit operations during implementation and streamlines the migration process
• @Mire modules ($41,500 total): Content Delivery (3), $13,500; Metadata Quality, $10,000; Image Conversion Suite, $9,000; Content & Usage Analysis, $9,000. These modules require one-time fees to @Mire that recur when upgrading to a new version of DSpace.

Table 5. Start-up costs associated with Fedora/Hydra and DSpace

The task force determined that an investment in one librarian’s salary is the most cost-effective course of action. The new Metadata Librarian will manage daily operations of the Metadata Unit in Metadata & Digitization Services while the Metadata Services Coordinator, in close collaboration with the Web Projects Manager, leads the DAMS implementation process.
In contrast to Fedora, migration to DSpace would require a substantial investment in third-party software modules from @Mire to deliver the best possible content management environment and user experience.

IMPLEMENTATION STRATEGIES

The implementation of the new DAMS will occur in a phased rollout comprised of the following stages: System Installation, Data Migration, and Interface Development. MDS and Web Services will perform the majority of the work, in consultation with key stakeholders from Special Collections and other units. Throughout this process, the DAMS Implementation Task Force will consult with the Digital Preservation Task Force (an appointed task force charged with creating a digital preservation policy and identifying the strategies, actions, and tools needed to sustain long-term access to digital assets maintained by UH Libraries) to coordinate the preservation and access systems.

Phase One: System Installation
• Set up production and server environment
• Rewrite UHDL front-end application for Fedora/Solr
• Create metadata models
• Coordinate workflows with Digital Preservation Task Force
• Begin development of administrative Hydra head for content management

Phase Two: Data Migration
• Formulate content migration strategy and schedule
• Migrate test collections and document exceptions
• Conduct the data migration
• Create preservation metadata for migrated data
• Continue development of the Hydra administrative interface

Phase Three: Interface Development
• Reevaluate front-end user interface
• Rewrite UHDL front end as a Hydra head OR update current front end
• Establish inter-departmental production workflows
• Refine administrative Hydra head for content management

Table 6. Overview of DAMS phased implementation

Phase One: System Installation

During the first phase of DAMS implementation, Web Services and MDS will work closely together to install an open-source repository software stack based on Fedora, rewrite the current PHP front-end interface to provide public access to the data in the new system, and create metadata content models for the UHDL based on the Portland Common Data Model,25 in consultation with the Coordinator of Digital Projects from Special Collections and other key stakeholders. The DAMS Task Force will consult with the Digital Preservation Task Force (the working team that enforces the digital preservation policy and maintains the digital preservation system) to determine how closely the preservation and access systems will be integrated and at what points. The two groups will also jointly outline a DAMS migration strategy that aligns with the preservation system. Web Services and MDS will collaborate on research and development of an administrative interface, based on the Hydra framework, for day-to-day management of UHDL content.
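The Portland Common Data Model expresses collections, objects, and files as RDF resources linked by a small set of properties. A minimal sketch in Turtle of what a UHDL-style model might look like (the URIs and titles are invented for illustration, not actual UHDL models):

    @prefix pcdm: <http://pcdm.org/models#> .
    @prefix dcterms: <http://purl.org/dc/terms/> .

    # A collection containing one object, which in turn has one file.
    <http://repo.example.org/collections/postcards>
        a pcdm:Collection ;
        dcterms:title "Historic Postcards" ;
        pcdm:hasMember <http://repo.example.org/objects/postcard-001> .

    <http://repo.example.org/objects/postcard-001>
        a pcdm:Object ;
        dcterms:title "Main Street, Houston, ca. 1910" ;
        pcdm:hasFile <http://repo.example.org/objects/postcard-001/files/master.tif> .

    <http://repo.example.org/objects/postcard-001/files/master.tif>
        a pcdm:File .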
Phase Two: Data Migration

In the second phase, MDS will migrate legacy content from CONTENTdm to the new system and work with Web Services, Special Collections, and the Architecture and Art Library to resolve any technical, metadata, or content problems that arise. The second phase will begin with the development of a strategy for completing the work in a timely fashion, followed by migration of representative sample collections to the new system to test and refine its capabilities. After testing is complete, all legacy content will be migrated from CONTENTdm to Fedora, and preservation metadata for migrated collections will be created and archived. Development work on the Hydra administrative interface will also continue. After the data migration is complete, all new collections will be ingested into Fedora/Hydra, and the current CONTENTdm installation will be retired.

Phase Three: Interface Development

In the final phase, Web Services will reevaluate the current front-end user interface (UI) for the UHDL by conducting user tests to better understand how and why users are visiting the UHDL. Web Services will also analyze web and system analytics and gather feedback from Special Collections and other stakeholders. Depending on the outcome of this research, Web Services may create a new UI based on the Hydra framework or choose to update the current front-end application with modifications or new features. Web Services and MDS will also continue to develop or adopt tools for the management of UHDL content and work with Special Collections and the branch libraries to establish production workflows in the new system. Continued development work on the front-end and administrative interfaces, for the life of the new Digital Asset Management System, is both expected and desirable as we maintain and improve the UHDL infrastructure and contribute to the open source software community in line with the UH Libraries Strategic Directions.

Ongoing: Assessment, Enhancement, Training, and Documenting

Throughout the transition process, MDS and Web Services will undergo extensive training in workshops and conferences to develop the skills necessary for developing and maintaining the new system. They will also establish and document workflows to ensure the long-term viability of the system. Regular consultation with Special Collections, the branch libraries, and other stakeholders will be conducted to ensure that the new system satisfies the requirements of colleagues and patrons. Ongoing activities will include:

• Assessing the service impact of the new system
• User testing on the UI
• Regular system enhancements
• Establishing new workflows
• Creating and maintaining documentation
• Training: conferences, webinars, workshops, etc.

CONCLUSION

Transitioning from CONTENTdm to a Fedora/Hydra repository will place the UH Libraries in a position to sustainably grow the amount of content in the UH Digital Library and customize the UHDL interfaces for a better user experience. Managing our data in a linked data platform will give the UH Libraries the ability to more easily publish it to the semantic web. In addition, the Fedora/Hydra architecture can be adapted to support a wide range of UH Libraries projects, including a geospatial data portal, a research data repository, and a self-deposit institutional repository. Over the long term, the return on investment for implementing an open-source repository architecture based on industry-standard software will be: improved visibility of our unique collections on the web; expanded opportunities for aggregating our collections with high-profile repositories such as the Digital Public Library of America; and increased national recognition for our digital projects and staff expertise.

REFERENCES
1. “The University of Houston Libraries Strategic Directions, 2013–2016,” accessed July 22, 2015, http://info.lib.uh.edu/sites/default/files/docs/strategic-directions/2013-2016-libraries-strategic-directions-final.pdf.

2. Dion Hoe-Lian Goh et al., “A Checklist for Evaluating Open Source Digital Library Software,” Online Information Review 30, no. 4 (July 13, 2006): 360–79, doi:10.1108/14684520610686283.

3. Ibid., 366.

4. Ibid., 364.

5. Jody L. DeRidder, “Choosing Software for a Digital Library,” Library Hi Tech News 24, no. 9 (2007): 19–21, doi:10.1108/07419050710874223.

6. Ibid., 21.

7. Jennifer L. Marill and Edward C. Luczak, “Evaluation of Digital Repository Software at the National Library of Medicine,” D-Lib Magazine 15, no. 5/6 (May 2009), doi:10.1045/may2009-marill.

8. Ibid.

9. Ibid.

10. Dora Wagner and Kent Gerber, “Building a Shared Digital Collection: The Experience of the Cooperating Libraries in Consortium,” College & Undergraduate Libraries 18, no. 2–3 (2011): 272–90, doi:10.1080/10691316.2011.577680.

11. Ibid., 280–84.

12. Georgios Gkoumas and Fotis Lazarinis, “Evaluation and Usage Scenarios of Open Source Digital Library and Collection Management Tools,” Program: Electronic Library and Information Systems 49, no. 3 (2015): 226–41, doi:10.1108/PROG-09-2014-0070.

13. Ibid., 238–39.

14. Mathieu Andro, Emmanuelle Asselin, and Marc Maisonneuve, “Digital Libraries: Comparison of 10 Software,” Library Collections, Acquisitions, & Technical Services 36, no. 3–4 (2012): 79–83, doi:10.1016/j.lcats.2012.05.002.

15. Ibid., 82.

16. Heather Gilbert and Tyler Mobley, “Breaking Up with CONTENTdm: Why and How One Institution Took the Leap to Open Source,” Code4Lib Journal, no. 20 (2013), http://journal.code4lib.org/articles/8327.

17. Ibid.

18. Ibid.

19. Ibid.

20. NISO Framework Working Group with support from the Institute of Museum and Library Services, A Framework of Guidance for Building Good Digital Collections (Baltimore, MD: National Information Standards Organization (NISO), 2007).

21. “Linked Data Platform 1.0,” W3C, accessed July 22, 2015, http://www.w3.org/TR/ldp/.

22. “DSpace,” accessed July 22, 2015, http://www.dspace.org/.

23. “Fedora Repository Home,” accessed July 22, 2015, https://wiki.duraspace.org/display/FF/Fedora+Repository+Home.

24. “Hydra Project,” accessed July 22, 2015, http://projecthydra.org/.
9182 ---- Transitioning from XML to RDF: Considerations for an Effective Move Towards Linked Data and the Semantic Web

Juliet L. Hardesty

INFORMATION TECHNOLOGY AND LIBRARIES | MARCH 2016

Juliet L. Hardesty (jlhardes@iu.edu) is Metadata Analyst at Indiana University Libraries, Bloomington, Indiana.

INTRODUCTION

Metadata, particularly within the academic library setting, is often expressed in eXtensible Markup Language (XML) and managed with XML tools, technologies, and workflows. Over time, software tools such as the Oxygen XML Editor and query languages such as XPath and XQuery have grown increasingly capable of supporting that management. However, managing a library’s metadata currently takes on a greater level of complexity as libraries increasingly adopt the Resource Description Framework (RDF). Semantic Web initiatives are surfacing in the library context with experiments in publishing metadata as Linked Data sets, BIBFRAME development using RDF, and software developments such as the Fedora 4 digital repository, which uses RDF natively. Examples of transitions from XML into RDF make the challenges evident and show the need for communication and coordination among efforts to incorporate and implement RDF. This article outlines these challenges using different use cases from the literature and first-hand experience. The follow-up discussion considers ways to progress from metadata formatted in XML to metadata expressed in RDF. The options explored are targeted not only at metadata practitioners considering this transition but also at programmers, librarians, and managers.

LITERATURE REVIEW AND CONCEPTS

As an initial example of the challenges faced when considering RDF, clarifying terminology is a helpful activity. RDF focuses on sets of statements describing relationships and meaning. These statements consist of a subject, a predicate, and an object (e.g., an article, has an author, Jane Smith). These statement parts are also referred to as a resource, a property, and a property value. Since there are three parts to RDF statements, they are referred to as triples. The predicate or property of an RDF statement defines the relationship between the subject and the object. RDF ontologies are sets of properties for a particular domain. For example, Darwin Core has an RDF ontology to express biological properties,1 and EBUCore has an RDF ontology to express properties about audiovisual materials.2

Pulling apart the many issues involved in moving from XML to RDF is an exploration into the purpose of metadata, the tools available and their capabilities, and the various strategies that can be employed.
Poupeau rightly states that XML provides structural logic in its hierarchical identification of elements and attributes, whereas RDF provides data logic, declaring resources that relate to each other using properties.3 These properties are ideally all identified with single reference points (Uniform Resource Identifiers, or URIs) rather than a description encased in an encoding. A source of honest confusion, however, is that RDF can be expressed as XML. Lassila’s note regarding the Resource Description Framework specification from the World Wide Web Consortium (W3C) states, “RDF encourages the view of ‘metadata being data’ by using XML (eXtensible Markup Language) as its encoding syntax.”4 So even though RDF can use XML to express resources that relate to each other via properties, identified with single reference points (URIs), RDF is itself not an XML schema. RDF has an XML language (sometimes called, confusingly, RDF, and from here forward called RDF/XML). Additionally, RDF Schema (RDFS) declares a schema or vocabulary as an extension of RDF/XML to express application-specific classes and properties.5

Simply speaking, RDF defines entities and their relationships using statements. There are various ways to make these statements, but the original way formulated by the W3C is using an XML language (RDF/XML) that can be extended by an additional XML schema (RDFS) to better define those relationships. Ideally, all parts of that relationship (the subject, predicate, and object, or the resource, property, and property value) are URIs pointing to an authority for that resource, property, or property value.

An additional concept worth covering is serialization. This term describes how RDF data is expressed using various formatting languages. RDF/XML, N-Triples, Turtle, and JSON-LD are all examples of RDF serializations.6 Describing something as being in RDF really means the framework of subject, predicate, object is being used. Describing something as being expressed in RDF/XML or JSON-LD means that the RDF statements have been serialized into either of those formatting languages. Using “RDF” to refer not only to the framework for describing something (RDF) but also to the serialization of that description (RDF/XML) can easily muddle the discussion.

Other thoughts about the difference between XML and RDF, or about moving metadata from XML into RDF, point to the difference in perspective and the change in thinking that such a move requires. In an online discussion about RDF in relation to TEI (Text Encoding Initiative), Cummings talks about the need for both XML and RDF, using XML to encode text and RDF to extract that data and make it more useful.7 Yee, in her in-depth look at bibliographic data as part of the Semantic Web, points out that RDF is designed to encode knowledge, not information.8 The RDF Primer 1.0 also states, “RDF directly represents only binary relationships.”9 XML describes what something is by encoding it with descriptive elements and attributes. RDF, on the other hand, constructs statements about something using direct references—a reference to the thing itself, a reference to the descriptor, and a reference to the descriptor’s value.
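To make the triple and serialization concepts concrete, here is a minimal sketch in Python using the rdflib library; the example.org URIs are invented for illustration. It builds the single statement "an article has an author, Jane Smith" and then prints the same one-triple graph in two serializations, Turtle and RDF/XML.

```python
from rdflib import Graph, Literal, URIRef
from rdflib.namespace import DCTERMS

g = Graph()
g.bind("dcterms", DCTERMS)

# One triple: subject (the article), predicate (a property from an
# existing ontology, DC Terms "creator"), object (the property value).
article = URIRef("http://example.org/articles/1")  # invented URI
g.add((article, DCTERMS.creator, Literal("Jane Smith")))

# The same RDF graph expressed in two different serializations.
print(g.serialize(format="turtle"))
print(g.serialize(format="xml"))  # RDF/XML
```

Ideally the object would also be a URI pointing to an authority record for Jane Smith rather than a literal string; the literal is used here only to mirror the example in the text.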
As Farnel discussed in her 2015 Open Repositories presentation about the University of Alberta’s move to RDF, the team learned they were moving from a records-based framework in XML to a things-based framework in RDF.10 What is pointed out here time and again is something else Farnel discussed—moving from XML to RDF is not simply a conversion between encoding formats; it is a translation between two different ways of organizing knowledge. It involves understanding the meaning of the metadata encoded in XML and representing that meaning with appropriate RDF statements.

The tools most commonly employed for reworking XML into RDF are OpenRefine when accompanied by its RDF extension; a triplestore database such as OpenLink Virtuoso,11 Apache Fuseki,12 or Sesame;13 Oxygen XML Editor;14 and Protégé,15 an ontology editor. OpenRefine is, according to its website, “a powerful tool for working with messy data: cleaning it; transforming it from one format into another; and extending it with web services and external data.”16 The RDF extension, called RDF Refine, allows for importing existing vocabularies and reconciling against SPARQL endpoints (web services that accept SPARQL queries and return results).17,18 SPARQL is similar to SQL as a language for querying a database, but its syntax is specifically designed for querying data formatted in triple statements instead of tables with columns.19 Triplestore databases such as OpenLink Virtuoso can store and index RDF statements for searching as a SPARQL endpoint, offering a way to retrieve information and visualize connections across a collection of triples. Oxygen XML Editor has proven helpful in formulating eXtensible Stylesheet Language (XSL) transformations to move metadata from a particular XML schema or format into RDF/XML or other serializations such as JSON-LD (JavaScript Object Notation for Linking Data).20 Protégé is a tool developed by Stanford University that supports the OWL 2 Web Ontology Language and has helped to convert XML schemas to RDF ontologies and establish ways to express XML metadata in RDF. These tools provide the technical means to take metadata expressed in XML and physically reformat it as metadata expressed in an RDF serialization. What that reformatting also encompasses, however, is a review of the information expressed in XML and a set of decisions as to how to express that information as RDF statements.
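As a rough illustration of how SPARQL differs from SQL, the sketch below (Python with rdflib again, with the same invented example.org URIs) runs a small SPARQL query whose WHERE clause is a triple pattern rather than a table reference: the variables ?article and ?author are bound wherever a dcterms:creator statement matches.

```python
from rdflib import Graph, Literal, URIRef
from rdflib.namespace import DCTERMS

g = Graph()
g.add((URIRef("http://example.org/articles/1"),  # invented URI
       DCTERMS.creator, Literal("Jane Smith")))

# The WHERE clause matches triple patterns, not rows and columns.
query = """
PREFIX dcterms: <http://purl.org/dc/terms/>
SELECT ?article ?author
WHERE { ?article dcterms:creator ?author . }
"""

for row in g.query(query):
    print(row.article, row.author)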
Strategic approaches and ideas for handling data transformations into RDF have involved the XML schema or document type definition (DTD). These include Thuy, Lee, and Lee’s approach to map an XML schema (the XSD) to RDF, associating simpleTypes in the XSD with properties in RDF, defining complexTypes in the XSD as classes in RDF, and handling the hierarchy of XML schema elements by treating top levels as domains and lower-level elements and attributes as container classes or subproperties within those domains.21 Thuy et al. earlier worked on a method to transform XML to RDF by translating the DTD using RDFS (ELEMENTs in the DTD are RDF classes or subclasses, ATTLISTs are RDF properties, and ENTITIES—preset variables in the DTD—are called up for use in RDF as encountered).22 Similarly, Hacherouf, Bahloul, and Cruz translate an XML schema into an OWL ontology.23 Klein et al. point out that while ontologies serve to describe a domain, XML schemas are meant to provide constraints on documents or structure for data, so it can be advantageous to work out an RDF expression this way.24 Tim Berners-Lee puts it simply: “the same RDF tree results from many XML trees,” meaning the same single statement in RDF (an article has an author Jane Smith) can be expressed in many ways in XML and can vary on the basis of the source of the XML, any schemas involved, and the people creating the metadata.25 Transitioning from XML to RDF using the XML schema might serve to ensure all XML elements are replicated in RDF, but it does not necessarily establish the relationships meant by that XML encoding without additional evaluation. There is no single strategy that will always work to move XML metadata into RDF, even within the same set of tools (such as Fedora/Hydra) or the same area of concern (libraries, archives, or museums).

USE CASES FOR RDF

The following use cases explain approaches to transitioning to RDF taken from two differing perspectives. The first set describes efforts to express XML schemas or standards as RDF ontologies. The second set describes efforts by various library or cultural-heritage digital collections to transform metadata records into RDF statements. They also show that strategies to transform XML to RDF cannot succeed without a shift in view from structure to relationships and, likewise, from descriptive encoding to direct meaning.

Moving an XML Schema/Standard to an RDF Ontology

As a graduate student at Kent State University, Mixter took on converting the descriptive metadata standard VRA Core 4.0 from an XML schema to an RDF ontology.26 Using the VRA Data Standards Committee Guidelines to ensure all minimum fields were included,27 Mixter mapped VRA XML elements and attributes to the schema.org, FOAF, VoID, and DC Terms ontologies. This process is known as “cherry-picking,” or combining various ontologies that already exist to represent properties or relationships (the predicates in RDF statements) instead of creating new proprietary RDF properties. Using OWL and RDFS as metavocabularies in Protégé, Mixter created an ontology that could “retain the granularity required to describe library, archive, or museum items” of VRA Core 4.0’s design in XML without being a straight conversion of VRA Core 4.0 from XML to RDF.28 The outcome was an XSLT stylesheet that was tested on VRA Core 4.0 XML records to produce that same information as RDF statements. One point that seemed to help in testing was the fact that all controlled vocabulary terms had reference identifiers in the XML (ready-made URIs). Dates, however, resulted in complex RDF (RDF statements that encompass additional RDF statements, or blank nodes), and the reported outcomes did not address this complexity or its effect on using those particular RDF statements. VRA Core 4.0 now has an RDF ontology in draft form, with Mixter as one of its authors.29 The OWL ontology still points to schema.org, FOAF, and VoID for equivalent classes and properties, but everything is now named within a VRA RDF ontology and namespace, and VRA Core 4.0 XML translates to that namespace when transformed to RDF.
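A hedged sketch of what cherry-picking can look like in code is shown below. This is not Mixter's actual mapping; it is an invented Python illustration in which a few hypothetical XML element names are routed to predicates drawn from two existing ontologies (DC Terms and FOAF) instead of a single new proprietary namespace.

```python
from rdflib import Graph, Literal, URIRef
from rdflib.namespace import DCTERMS, FOAF

# Invented element-to-predicate mapping: each property is "cherry-picked"
# from an ontology that already exists.
ELEMENT_TO_PREDICATE = {
    "title": DCTERMS.title,
    "agent": FOAF.name,
    "date": DCTERMS.date,
}

def record_to_graph(subject_uri, fields):
    """Express parsed XML field values as RDF statements."""
    g = Graph()
    subject = URIRef(subject_uri)
    for element, value in fields.items():
        predicate = ELEMENT_TO_PREDICATE.get(element)
        if predicate is not None:
            g.add((subject, predicate, Literal(value)))
    return g

g = record_to_graph("http://example.org/works/42",  # invented URI
                    {"title": "Untitled Sculpture", "agent": "Jane Smith"})
print(g.serialize(format="turtle"))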
Another case in the category of going from an XML standard to an RDF ontology is the development of the BIBFRAME model for bibliographic description from the Library of Congress. The BIBFRAME model is expressed as RDF. According to the BIBFRAME site, “in addition to being a replacement for MARC, BIBFRAME serves as a general model for expressing and connecting bibliographic data.”30 MARC has its own format of expression with numbered fields and subfields but can be expressed or serialized in XML and is often shared that way. The BIBFRAME model, while revamping the way a bibliographic record is described on the basis of work, instance, authority, and annotation, also provides tools to transform records from MARC/XML to the RDF statements of BIBFRAME.31 A single namespace serves the BIBFRAME model and is explained as a long-term strategy to ensure namespace persistence over the next forty-plus years.32 The transformations produced from Library of Congress MARC records and local MARC records contain complex hierarchical RDF statements, particularly when ascribing authority sources to names, subjects, and types of identifiers. As it is still a work in progress, there are no tools yet making use of BIBFRAME records in RDF.

An additional example is the work happening with PBCore, the public broadcasting metadata standard managed by the Corporation for Public Broadcasting.33 Public broadcasting stations and other institutions across the United States provide descriptive, technical, and structural metadata for audiovisual materials using this XML standard. In Boston, WGBH’s use of PBCore coincides with its digital asset management system, HydraDAM, built on Fedora 3 and the Hydra technology stack (based on Blacklight, Solr, and the Fedora digital repository).34 Fedora 3 does not natively support RDF statements as properties on objects, as Fedora 4 does. Building on an interest in moving HydraDAM to Fedora 4 and leveraging RDF for metadata about audiovisual collections, WGBH began exploring transitioning the PBCore XML metadata standard into an RDF ontology. EBUCore, the European Broadcasting Union’s metadata standard, is already expressed as an RDF ontology.35 A comparison between the XML standard of PBCore and the classes and properties expressed in EBUCore revealed that most PBCore elements were covered by the EBUCore ontology.36 Efforts are ongoing to offer PBCore 3.0 as an RDF ontology that uses EBUCore with the addition of a smaller set of properties, along with a way to transform PBCore XML to PBCore 3.0 in RDF.37

The Hydra community, in an effort to help the transition from Fedora 3, with its XML binary files of descriptive metadata, to Fedora 4, which uses RDF statements as properties on objects, is working on a recommendation and transformation to move descriptive metadata in MODS XML into RDF that is usable in Fedora 4.38 The MODS standard has a draft of an RDF ontology and a stylesheet transformation available,39 but the complex hierarchical RDF produced by this transformation is unmanageable with the current Fedora 4 architecture. The Hydra MODS and RDF Descriptive Metadata Subgroup is attempting to reflect the MODS elements in simple RDF statements that can be incorporated as properties on a Fedora 4 digital object.40 Led by Steven Anderson at the Boston Public Library, this group is moving through MODS element by element, asking the question, “If you had to express this MODS element from your metadata in RDF today, how would you do that?” Participating institutions are reviewing their MODS records and exploring the possible RDF predicates that could be used to represent the meaning of that information.
Some are even considering how to construct those RDF statements so that MODS XML can be re-created as close to the original MODS as possible (this is called “round-tripping”). There are still questions as to whether every single MODS element will be reflected in this transformation, how exactly Fedora 4 will make use of these descriptive RDF statements, and whether the original MODS XML will need to be preserved as part of the digital object in Fedora, but this group recognizes that moving from Fedora 3 to Fedora 4 requires a major shift in thinking about descriptive metadata. This transformation tool is an effort to help make that transition possible.

The Avalon Media System is an open source system for managing and providing access to large collections of digital audio and video.41 It is built on Fedora 3 and the Hydra technology stack and uses MODS XML to store descriptive metadata. As development progresses and the available descriptive fields expand, maintaining the workflow to update XML records in Fedora and reindex objects in the Hydra interface becomes increasingly complicated. Each time an update is made to descriptive information about an audiovisual item through the Avalon interface, the entire XML record for that object, stored as a binary text file, is rewritten in Fedora 3 and reindexed in Solr. In considering advantages to using Fedora 4, it appears that descriptive metadata properties stored in RDF are easier to manage programmatically (updating content, adding new fields, more focused reindexing) because descriptive information would not be stored in a single binary file but as individual properties on the object.

Turning XML Metadata into RDF or Linked Data for Publishing, Search and Discovery, and Management

As Southwick describes the process, the library at the University of Nevada Las Vegas (UNLV) took a collection with descriptive records from CONTENTdm and published them as a single RDF Linked Open Data set.42 After cleaning up controlled vocabulary terms across collections and solidifying locally controlled vocabularies, they exported tab-delimited CSV records from CONTENTdm. These records were brought into OpenRefine with its RDF extension, where they reviewed the data and mapped it to various properties within the Europeana Data Model (EDM). Controlled vocabulary terms were in text form and had to be reconciled against a SPARQL endpoint, either locally from downloaded data or from the controlled vocabulary service, to gather the URIs to use as the object or value in the RDF statement. OpenRefine was then used to create RDF files that were uploaded to a triplestore (first Mulgara, then OpenLink Virtuoso). This provided public access to the Linked Open Data set and a SPARQL endpoint for querying the data set. After publishing the data set, they experimented with PivotViewer from OpenLink Virtuoso and RelFinder to see what kinds of connections and relationships could be visualized from the data as Linked Open Data. The outlined steps are clear and the outcomes are described, but interestingly, the data set itself no longer appears to be available online.43 Although the UNLV use case relies on CSV instead of XML as the data source, the tools and workflows enlisted to transform the data set into RDF Linked Open Data are still applicable: OpenRefine can import XML just as it imports CSV, so this case shows the tools that can be used and the decisions to be made in processing that data into RDF statements.
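The reconciliation step, exchanging a text string for an authority URI, might be sketched as follows. This is not UNLV's actual workflow (they used OpenRefine's RDF extension); it is an invented illustration using the Python SPARQLWrapper library against a hypothetical vocabulary endpoint, shown only to make the string-to-URI exchange concrete.

```python
from SPARQLWrapper import SPARQLWrapper, JSON

ENDPOINT = "http://example.org/vocabulary/sparql"  # hypothetical endpoint

def reconcile(term):
    """Return the URI whose preferred label matches the given string, if any."""
    sparql = SPARQLWrapper(ENDPOINT)
    sparql.setReturnFormat(JSON)
    sparql.setQuery("""
        PREFIX skos: <http://www.w3.org/2004/02/skos/core#>
        SELECT ?concept
        WHERE { ?concept skos:prefLabel ?label .
                FILTER (str(?label) = "%s") }
        LIMIT 1
    """ % term)
    bindings = sparql.query().convert()["results"]["bindings"]
    return bindings[0]["concept"]["value"] if bindings else None

# A matched term becomes a URI object in the triple; an unmatched term
# would remain a literal or receive a locally minted URI.
print(reconcile("Railroads"))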
In Oregon Digital,44 XML from Qualified Dublin Core, VRA Core, and MODS at two different institutions (University of Oregon and Oregon State University) was mapped as Linked Open Data and stored in a triplestore to be served up in a new web application using the Hydra technology stack.45 An inventory of metadata fields across all collections was first mapped to existing Linked Data terms, or properties (those with available URIs); then properties that were needed in the new web application but did not have available corresponding URIs were mapped to a newly devised local namespace for Oregon Digital. Any properties that were not used were kept in the original static XML file for the record as part of the digital object in Fedora. The focus here appears to be on mapping properties, with less detail provided on whether the objects were kept as text or mapped to URI values where possible. From the sample record provided, the objects appear to be text and not URIs. The real power of this project is finding common properties to describe objects from diverse collections and institutions. What also comes out in the example mappings is the use of many different namespaces or ontologies (DC Terms and MARC Relators, but also MODS and MADS, which produce complex RDF).

The University of Alberta also combined a variety of XML metadata from different sources into a new digital asset management system based on Fedora 4 and the Hydra technology stack, called the Education and Research Archive.46 Reporting on the experience at Open Repositories 2015, Farnel described the process as working in phases.47 Beginning with item types, languages, and licenses, then moving to place names and controlled subject terms, and finally person names and free-form subjects, they made multiple passes converting XML metadata into RDF statements and incorporating URIs whenever possible. They are combining all of this into a single data dictionary,48 making use of several RDF ontologies to cover the various metadata properties being described about objects and collections.

The University of California at San Diego (UCSD) has developed a local data model using a mix of external (MADS, VRA Core, Darwin Core, PREMIS) and local ontologies. It has published a data dictionary and is working on a substantially different revision as part of the metadata workflow used to bring digital objects into its digital asset management system from a variety of source metadata formats, including XML.49 This allows metadata to be created from disparate source formats and makes it possible to bring it together as RDF for delivery, management, and preservation.

DISCUSSION

If metadata is in XML form and the desire is to express it as RDF, this is not merely a transformation from one XML schema to another. It is changing the expression of that data and changing its use. Having metadata in XML means information is encoded in a specific way that allows for interchange and sharing. Having metadata in RDF means making statements that have direct meaning and can be used independently. There are different perspectives involved in metadata when approaching RDF: those who manage metadata standards (the XML standard side) and those who have metadata encoded using those XML standards (the data management side).
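The contrast drawn above between encoding and statement-making can be made concrete with a small sketch. The XML record below is entirely invented, as is the element-to-property mapping: the XML encodes its information positionally, through element names and nesting, while the rdflib conversion restates each piece as a free-standing statement that can be used on its own.

```python
import xml.etree.ElementTree as ET
from rdflib import Graph, Literal, URIRef
from rdflib.namespace import DCTERMS

# An invented XML record: meaning is implied by element names and nesting
# and must be interpreted programmatically.
xml_record = """<record id="42">
  <title>A Study of Rivers</title>
  <creator>Jane Smith</creator>
</record>"""

root = ET.fromstring(xml_record)
subject = URIRef("http://example.org/records/" + root.get("id"))

# The same information restated as independent RDF statements.
g = Graph()
g.add((subject, DCTERMS.title, Literal(root.findtext("title"))))
g.add((subject, DCTERMS.creator, Literal(root.findtext("creator"))))
print(g.serialize(format="turtle"))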
Depending on the desired outcomes, the needs of these two perspectives can conflict. When managing a metadata standard, the RDF transition tends to follow certain patterns:

• Transform an XML standard into a new RDF ontology
o Examples: Dublin Core (DC), Darwin Core (DWC), MODS, VRA Core
• Establish a move to RDF that incorporates another existing ontology
o Examples: PBCore, Hydra community

From the data management side, the RDF transition means different patterns occur. These scenarios often start by reviewing the needed outcome, deciding how much metadata needs to be expressed in RDF, and determining what works best to get the metadata to that point. Cases include the following commonalities:

• Creating new search and discovery end user applications
o Examples: Oregon Digital, University of Alberta
• Publishing Linked Data sets
o Examples: UNLV, University of Alberta
• Managing metadata using software that supports RDF
o Examples: University of Alberta, UCSD, Hydra community

Conflicts occur when the needed outcome on the data management side is not supported by the RDF ontology transitions that have occurred for the XML standards being used. An example of this is how RDF is handled in Fedora 4. When RDF is complex (the object of one statement is another entire RDF statement), Fedora produces blank nodes as new objects within the repository. While not technically problematic, descriptive metadata with complex RDF can result in a situation where a digital object ends up referencing a blank node that then points to, for example, a subject or a genre. This subject or genre has been created as its own object within the digital repository even though it is only meant to provide meaning for the digital object. MODS RDF produces this complexity and thus is not workable with Fedora 4. In contrast, other standards such as DC or DWC in RDF produce simple statements that Fedora 4 can apply to a digital object without any additional processing.

Complications in transitioning from XML to RDF also occur when the original XML does not include URIs or authority-controlled sources. Converting this metadata to RDF can mean locally minting URIs or bringing data over as literals (strings of text) without using URIs at all. Ideally, the result is somewhere in the middle, with externally controlled vocabularies incorporated as much as possible and literals or locally minted URIs used only where absolutely necessary. Translating strings to authoritative sources is intensive work. If the XML standard cannot be expressed as a single RDF ontology, work is further complicated by the need to map XML elements to different RDF ontologies using logic that is often decided locally. While it is possible to transition XML to RDF, the process is not uniform, and the pathway involves a lot of labor. This labor might be alleviated by a more user-centered approach in which XML standards bodies consider the ways their standards can be used when translated into RDF (“users” in this context meaning the users of the standards, not the end users searching and discovering digital content). Triplestores can manage queries for complex RDF, but digital repository systems are not there yet. Those that support RDF for description of objects do so on the basis of simple property statements. A complex RDF ontology is going to be a challenge to support over time.
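The difference between complex and simple RDF can be sketched as follows (invented URIs, not actual MODS RDF output): in the first graph the subject heading is a blank node carrying its own statements, the pattern that a repository such as Fedora 4 surfaces as a separate object, while the second graph makes the same assertion as one flat statement whose object is a vocabulary URI.

```python
from rdflib import BNode, Graph, Literal, URIRef
from rdflib.namespace import DCTERMS, RDFS

work = URIRef("http://example.org/works/42")  # invented URI

# Complex RDF: the object of dcterms:subject is a blank node that is
# itself the subject of further statements.
complex_g = Graph()
heading = BNode()
complex_g.add((work, DCTERMS.subject, heading))
complex_g.add((heading, RDFS.label, Literal("Rivers")))

# Simple RDF: one flat statement whose object is a (hypothetical)
# authority URI that the repository can store without extra processing.
simple_g = Graph()
simple_g.add((work, DCTERMS.subject,
              URIRef("http://example.org/authorities/rivers")))

print(complex_g.serialize(format="turtle"))
print(simple_g.serialize(format="turtle"))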
Another way to progress is for the data management side of the equation to focus efforts on showing, in an end user search and discovery format, what is currently possible when XML is transitioned into RDF. Published Linked Data sets need to have interfaces for access and use, showing the value of what is currently available and any needs or gaps that remain. Libraries and cultural-heritage organizations engaged in this work should also openly share the processes that work and those that do not, so others contemplating this transformation can consider how to forge ahead themselves. Libraries and cultural-heritage organizations moving metadata from XML to RDF should provide feedback to XML standards bodies regarding the usefulness or complications of any RDF transitional help an XML standard might provide.

Technologies for incorporating RDF into web applications and truly connecting triples across the web also require further work. Triplestores have so far been the main way to expose data sets but have not been incorporated into common library or cultural-heritage end user search and discovery web applications. Additionally, triplestore use does not seem to extend to management or long-term storage of complete data about digital objects. There seems to be a decision either to reduce the data stored in a triplestore down to simple statements or to use the triplestore more like an isolated index or SPARQL endpoint only and manage the complete metadata record separately (in a static text file or in a separate database). That aligns triples in RDF more with relational database storage than with catalog records. Triple statements focus on relationships, not the complete unique details of the thing being described. Triplestores can handle complex hierarchical RDF graphs and provide responses to queries against those complexities,50 but triplestores do not appear to be taking over as either the main search and discovery mechanism for online digital resources or the main mechanism for digital object management. Software using RDF natively is also not currently widespread. A project such as the BIBFRAME Initiative that plans to incorporate RDF needs to make sure the complexity of its data model in RDF is manageable by any tools it produces and that it is possible for vendors and suppliers to encompass the data model in their software development.

CONCLUSION

The reasons for deciding metadata should transition to RDF are just as important as determining the best process for implementing that transition. Conceptually, the reasons for transitioning to RDF center on making data more easily shareable and setting up data to have meaning and relationships, as opposed to local static description that requires programmatic interpretation. The use cases outlined in this article show that the reality does not quite yet match the concept. Transitioning an XML standard to RDF does not make that data more shareable or more easily understood unless there are end user applications for using that data in RDF. Publishing Linked Data involves going through transitional steps, but the endpoint seems to be more of a byproduct: the real goal is going through the process of producing Linked Data to learn how that works. Self-contained projects that aim to express collections in RDF for the purpose of a new search and discovery interface are more successful in implementing RDF that carries that new level of meaning and relationship.
Beyond the borders of these projects, however, the data is not being shared or used.

The use cases described above show some examples of what is happening now when transitioning from XML to RDF. Approaches include XML standards converting to RDF expression as well as digital collections with metadata in XML that have an interest in producing that metadata as RDF. Software that incorporates RDF is still developing and maturing. Helping that process along by providing a pathway from XML to functionally usable RDF improves the chances of the Semantic Web becoming a real and useful thing. It is vital to understand that transitioning from XML to RDF requires a shift in perspective from replicating structures in XML to defining meaningful relationships in RDF. Metadata work is never easy, and moving metadata from encoded strings of text to statements with semantic relationships requires coordination and communication. How best to achieve this coordination and communication is a topic worth engaging as the move to use RDF, produce Linked Data, and approach the Semantic Web continues.
NOTES

1. “Darwin Core,” Darwin Core Task Group, Biodiversity Information Standards, last modified May 5, 2015, http://rs.tdwg.org/dwc/.
2. “Metadata specifications,” European Broadcasting Union, https://tech.ebu.ch/MetadataEbuCore.
3. Gautier Poupeau, “XML vs RDF: logique structurelle contre logique des données (XML vs. RDF: structural logic versus data logic),” Les Petites Cases (blog), August 29, 2010, http://www.lespetitescases.net/xml-vs-rdf.
4. Ora Lassila, “Introduction to RDF Metadata,” W3C, November 13, 1997, http://www.w3.org/TR/NOTE-rdf-simple-intro-971113.html.
5. “XML RDF,” W3Schools, accessed September 30, 2015, http://www.w3schools.com/xml/xml_rdf.asp.
6. See “Serialization formats” in “Resource Description Framework,” Wikipedia, March 18, 2016, https://en.wikipedia.org/wiki/Resource_Description_Framework#Serialization_formats.
7. “RDF and TEI XML,” email thread on TEI-L@listserv.brown.edu, October 13–18, 2010, https://listserv.brown.edu/archives/cgi-bin/wa?A2=ind1010&L=TEI-L&D=0&P=28928.
8. Martha M. Yee, “Can Bibliographic Data Be Put Directly onto the Semantic Web?,” Information Technology and Libraries 28, no. 2 (March 1, 2013): 57, doi:10.6017/ital.v28i2.3175.
9. Frank Manola and Eric Miller, “RDF Primer 1.0, Section 2.3 Structured Property Values and Blank Nodes,” W3C Recommendation, February 10, 2004, http://www.w3.org/TR/2004/REC-rdf-primer-20040210/#structuredproperties.
10. Sharon Farnel, “Metadata at a Crossroads: Shifting ‘from Strings to Things’ for Hydra North” (slideshow presentation, Open Repositories, Indianapolis, Indiana, 2015), http://slideplayer.com/slide/5384520/.
11. http://virtuoso.openlinksw.com/dataspace/doc/dav/wiki/Main/.
12. https://jena.apache.org/documentation/fuseki2/.
13. http://rdf4j.org.
14. http://www.oxygenxml.com.
15. http://protege.stanford.edu.
16. http://openrefine.org.
17. https://en.wikipedia.org/wiki/SPARQL.
18. http://refine.deri.ie.
19. https://jena.apache.org/tutorials/sparql.html.
20. http://json-ld.org.
21. Pham Thi Thu Thuy, Young-Koo Lee, and Sungyoung Lee, “A Semantic Approach for Transforming XML Data into RDF Ontology,” Wireless Personal Communications 73, no. 4 (2013): 1392–95, doi:10.1007/s11277-013-1256-z.
22. Pham Thi Thu Thuy et al., “Transforming Valid XML Documents into RDF via RDF Schema,” in International Conference on Next Generation Web Services Practices (Los Alamitos, CA: IEEE Computer Society, 2007), 37, doi:10.1109/NWESP.2007.23.
23. See Mokhtaria Hacherouf, Safia Nait Bahloul, and Christophe Cruz, “Transforming XML Documents to OWL Ontologies: A Survey,” Journal of Information Science 41, no. 2 (April 1, 2015): 242–59, doi:10.1177/0165551514565972.
24. Michel Klein et al., “The Relation between Ontologies and XML Schemas,” section 5 in Linköping Electronic Articles in Computer and Information Science 6 (2001), doi:10.1.1.108.7190.
25. Tim Berners-Lee, “Why RDF Model Is Different from the XML Model,” Semantic Web Road map, September 1998, http://www.w3.org/DesignIssues/RDF-XML.html.
26. See Jeff Mixter, “Using a Common Model: Mapping VRA Core 4.0 Into an RDF Ontology,” Journal of Library Metadata 14, no. 1 (January 2014): 1–23, doi:10.1080/19386389.2014.891890.
27. The document currently labeled “How to Convert Version 3.0 to Version 4.0” contains a recommendation for a minimum set of elements for “meaningful retrieval” in VRA Core: http://www.loc.gov/standards/vracore/convert_v3-v4.pdf.
28. Mixter, “Using a Common Model,” 2.
29. “VRA Core RDF Ontology Available for Review,” Visual Resources Association, October 7, 2015, http://vraweb.org/vra-core-rdf-ontology-available-for-review/.
30. “Bibliographic Framework Initiative,” Library of Congress, https://www.loc.gov/bibframe/.
31. See “MARC to BIBFRAME transformation tools” at “Tools,” BIBFRAME, http://bibframe.org/tools/.
32. “Why a single namespace for the BIBFRAME vocabulary?,” Library of Congress, BIBFRAME Frequently Asked Questions, https://www.loc.gov/bibframe/faqs/#q06.
33. “PBCore 2.1,” Public Broadcasting Metadata Dictionary Project, http://pbcore.org.
34. “WGBH,” Hydra Community Partners, http://projecthydra.org/community-2-2/partners-and-more/wgbh/.
35. “Metadata specifications,” European Broadcasting Union, https://tech.ebu.ch/MetadataEbuCore.
36. See notes from PBCore Hackathon Part 2, which occurred in June 2015, showing an element-by-element analysis of PBCore against EBUCore: “PBCore Hackathon Part 2,” June 15, 2015, https://docs.google.com/document/d/1pWDfYIzHpfjCn5RWJ1fioweXg5RIrXuDxCWkBQ5BMlA/.
37. “Join us for the PBCore Sub-Committee Meeting at AMIA!,” Public Broadcasting Metadata Dictionary Project Blog, November 11, 2015, http://pbcore.org/join-us-for-the-pbcore-sub-committee-meeting-at-amia/.
38. “MODS and RDF Descriptive Metadata Subgroup,” last modified March 19, 2016, https://wiki.duraspace.org/display/hydra/MODS+and+RDF+Descriptive+Metadata+Subgroup.
39. “MODS RDF Ontology,” Library of Congress, https://www.loc.gov/standards/mods/modsrdf/.
40. “MODS and RDF Descriptive Metadata Subgroup,” last modified March 19, 2016, https://wiki.duraspace.org/display/hydra/MODS+and+RDF+Descriptive+Metadata+Subgroup.
41. “Avalon Media System,” http://www.avalonmediasystem.org.
42. See Silvia B. Southwick, “A Guide for Transforming Digital Collections Metadata into Linked Data Using Open Source Technologies,” Journal of Library Metadata 15, no. 1 (March 2015): 1–35, doi:10.1080/19386389.2015.1007009.
43. The URL for information is a blog with no links to a data set (https://www.library.unlv.edu/linked-data), and the collection site seems to still be based on CONTENTdm (http://digital.library.unlv.edu/collections).
44. “Oregon Digital,” http://oregondigital.org.
45. See Karen Estlund and Tom Johnson, “Link It or Don’t Use It: Transitioning Metadata to Linked Data in Hydra,” July 2013, http://ir.library.oregonstate.edu/xmlui/handle/1957/44856, accessed from ScholarsArchive@OSU.
46. “ERA: Education & Research Archive,” https://era.library.ualberta.ca.
47. Farnel, “Metadata at a Crossroads.”
48. https://docs.google.com/spreadsheets/d/1hSd6kf4ABm-m8VtYNyqfJGtiZG7bLJQ3fWRbF_nVoIw/edit#gid=1362636241.
49. The substantially revised data model is not available online yet, but the following shows some of the progress toward an RDF data model: “Overview of DAMs Metadata Workflow,” UC San Diego, May 21, 2014, https://tpot.ucsd.edu/metadata-services/mas/data-workflow.html; “DAMS4 Data Dictionary,” https://htmlpreview.github.io/?https://github.com/ucsdlib/dams/master/ontology/docs/data-dictionary.html, retrieved from GitHub.
50. See the Apache Jena SPARQL Tutorial for an example of complex RDF with sample queries against that complexity: “SPARQL Tutorial - Data Formats,” The Apache Software Foundation, https://jena.apache.org/tutorials/sparql_data.html.
9190 ---- Library Discovery Products: Discovering User Expectations through Failure Analysis

Irina Trapido

INFORMATION TECHNOLOGY AND LIBRARIES | SEPTEMBER 2016

Irina Trapido (itrapido@stanford.edu) is Electronic Resources Librarian at Stanford University Libraries, Stanford, California.

ABSTRACT

As the new generation of discovery systems evolves and gains maturity, it is important to continually focus on how users interact with these tools and what areas they find problematic. This study looks at user interactions within SearchWorks, a discovery system developed by Stanford University Libraries, with an emphasis on identifying and analyzing problematic and failed searches. Our findings indicate that users still experience difficulties conducting author and subject searches, could benefit from enhanced support for browsing, and expect their overall search experience to be more closely aligned with that on popular web destinations. The article also offers practical recommendations pertaining to the metadata, functionality, and scope of the search system that could help address some of the most common problems encountered by users.

INTRODUCTION

In recent years, rapid modernization of online catalogs has brought library discovery to the forefront of research efforts in the library community, giving libraries an opportunity to take a fresh look at such important issues as the scope of the library catalog, metadata creation practices, and the future of library discovery in general. While there is an abundance of studies looking at various aspects of planning, implementation, use, and acceptance of these new discovery environments, surprisingly little research focuses specifically on user failure. The present study aims to address this gap by identifying and analyzing potentially problematic or failed searches. It is hoped that focusing on common error patterns will help us gain a better understanding of users’ mental models, needs, and expectations that should be considered when designing discovery systems, creating metadata, and interacting with library patrons.
TERMINOLOGY

In this paper, we adopt a broad definition of discovery products as “tools and interfaces that a library implements to provide patrons the ability to search its collections and gain access to materials.”1 These products can be further subdivided into the following categories:

• Online catalogs (OPACs)—patron-facing modules of an integrated library system.
The authors also concluded that facets increase search accuracy, especially for complex and open-ended tasks, and improve user satisfaction.7 INFORMATION TECHNOLOGY AND LIBRARIES | SEPTEMBER 2016 11 Another traditional use of transaction logs has been to gauge the performance of library catalogs, mostly through measuring success and failure rates. While the exact percentage of failed searches varied dramatically depending on the system’s search capabilities, interface design, the size of the underlying database, and, most importantly, on the researchers’ definition of an unsuccessful search, the conclusion was the same: the incidence of failure in library OPACs was extremely high.8 In addition to reporting error rates, these studies also looked at the distribution of errors by search type (title, author, or subject search) and categorized sources of searching failure. Most researchers agreed that typing errors and misspellings accounted for a significant portion of failed searches and were common across all search types.9 Subject searching, which remained the most problematic area, often failed because of a mismatch between the search terms chosen by the user and the controlled vocabulary contained in the library records, suggesting that users experienced considerable difficulties in formulating subject queries with Library of Congress Subject Headings.10 Other errors reported by researchers, such as the selection of the wrong search index or the inclusion of the initial article for title searches, were also caused by users’ lack of conceptual understanding of the search process and the system’s functions.11 These research findings were reinforced by multiple observational studies and user interviews, which showed that patrons found library catalogs “illogical,” “counter-intuitive,” and “intimidating,”12 and that patrons were unwilling to learn the intricacies of catalog searching.13 Instead, users expected simple, fast, and easy searching across the entire range of library collections, relevance-ranked results that exactly matched what users expected to find, and convenient and seamless transition from discovery to access.14 Today’s library discovery systems have come a long way: they offer one-stop search for a wide array of library resources, intuitive interfaces that require minimal training to be searched effectively, facets to help users narrow down the result set, and much more.15 But are today’s patrons always successful in their searches? Usability studies of next-generation catalogs and, more recently, of web-scale discovery systems have pointed to patron difficulties associated with the use of certain facets, mostly because of terminological issues and inconsistencies in the underlying metadata.16 Researchers also reported that users had trouble interpreting and evaluating the results of their search;17 users also were confused as to what resources were covered by the search tool.18 Our study builds on this line of research by systematically analyzing real-life problematic searches as reported by library users and recorded in transaction logs. BACKGROUND Stanford University is a private, four-year or above research university offering undergraduate and graduate degrees in a wide range of disciplines to about sixteen thousand students. The study analyzed the use of SearchWorks, a discovery platform developed by Stanford University Libraries. 
SearchWorks features a single search box with a link to advanced search on every page, relevance- ranked results, faceted navigation, enhanced textual and visual content (summaries, tables of LIBRARY DISCOVERY PRODUCTS: DISCOVERING USER EXPECTATIONS THROUGH FAILURE ANALYSIS |IRINA TRAPIDO |doi:10.6017/ital.v35i2.9190 12 content, book cover images, etc.), as well as “browse shelf” functionality. SearchWorks offers searching and browsing of catalog records and digital repository objects in a single interface; however, it does not allow article-level searching. SearchWorks was developed on the basis of Blacklight (projectblacklight.org), an open-source application for searching and interacting with collections of digital objects.19 Thanks to Blacklight’s flexibility and extensibility, SearchWorks enables discovery across an increasingly diverse range of collections (MARC catalog records, archival materials, sound recordings, images, geospatial data, etc.) and allows to continuously add new features and improvements (e.g., https://library.stanford.edu/blogs/stanford-libraries-blog/2014/09/searchworks-30-released). STUDY OBJECTIVES The goal of the present study was two-fold. First, we sought to determine how patrons interact with the discovery systems, which features they use and with what frequency. Second, this study aimed to identify and analyze problems that users encounter in their search process. METHOD This study used data comprising four years of SearchWorks use, which was recorded in Apache Solr logs. The analysis was performed at the aggregate level; no attempts were made to identify individual searchers from the logs. At the preprocessing stage, we created and used a series of Perl scripts to clean and parse the data and extract only those transactions where the user entered a search query and/or selected at least one facet value. Page views of individual records were excluded from the analysis. The resulting output file contained the following parameters for each transaction: a time stamp, search mode used (basic or advanced), query terms, search index (“all fields,” “author,” “title,” “subject,” etc.), facets selected, and the number of results returned. The query stream was subsequently partitioned into task-based search sessions using a combination of syntactic features (word co- occurrence across multiple transactions) and temporal features (session time-outs: we used fifteen minutes of inactivity as a boundary between search sessions). The analysis was conducted over the following datasets: Dataset 1. Aggregate data of approximately 6 million search transactions conducted between February 13, 2011, and December 31, 2014. We performed quantitative analysis of this set to identify general patterns of system use. Dataset 2. A sample of 5,101 search sessions containing 11,478 failed or potentially problematic interactions performed in the basic search mode and 2,719 sessions containing 3,600 advanced searches, annotated with query intent and potential cause of the problem. The searches were performed during eleven twenty-four-hour periods, representing different years, academic http://projectblacklight.org/ https://library.stanford.edu/blogs/stanford-libraries-blog/2014/09/searchworks-30-released INFORMATION TECHNOLOGY AND LIBRARIES | SEPTEMBER 2016 13 quarters, times of the school year (beginning of the quarter, midterms, finals, breaks), and days of the week. This dataset was analyzed to identify common sources of user failure. Dataset 3. 
Dataset 3. User feedback messages submitted to SearchWorks between January 2011 and December 2014 through the "Feedback" link, which appears on every SearchWorks page. While the majority of feedback messages were error and bug reports, this dataset also contained valuable information about how users employed various features of the discovery layer, what problems they encountered, and what features they felt would improve their search experience.

For the manual analysis of dataset 2, all searches within a search session were reconstructed in SearchWorks and, in some cases, also in external sources such as WorldCat, Google Scholar, and Google. They were subsequently assigned to one of the following categories: known-item searches (searches for a specific resource by title, a combination of title and author, a standard number such as ISSN or ISBN, or a call number), author searches (queries for a specific person or organization responsible for or contributing to a resource), topical searches, browse searches (searches for a subset of the library collection, e.g., "rock operas," "graphic novels," "DVDs," etc.), invalid queries, and queries where the search intent could not be established.

To identify potentially problematic transactions, the following heuristic was employed: we selected all search sessions where at least one transaction failed to retrieve any records, as well as sessions consisting predominantly of known-item or author searches where the user repeated or reformulated the query three or more times within a five-minute time frame. We reasoned that this search pattern could be part of the normal query formulation process for topical searches, but for known-item and author searches it could serve as an indicator of the user's dissatisfaction with the results of the initial query. We identified seventeen distinct types of problems, which we further aggregated into the following five groups: input errors, absence of the resource from the collection, queries at the wrong level of granularity, erroneous or too restrictive use of limiters, and mismatch between the search terms entered and the library metadata. Each search transaction in dataset 2 was manually reviewed and assigned to one or more of these error categories.
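The session-partitioning and problem-flagging heuristics described above are straightforward to automate. The following is a minimal Python sketch (the study itself used Perl scripts; the data layout and function names here are illustrative, and the "predominantly known-item or author" judgment, which the study made manually, is omitted):

```python
from datetime import timedelta

SESSION_TIMEOUT = timedelta(minutes=15)  # inactivity boundary used in the study
REFORM_WINDOW = timedelta(minutes=5)     # window for repeated reformulations
REFORM_THRESHOLD = 3                     # three or more queries in the window

def partition_sessions(transactions):
    """Split chronologically ordered (timestamp, query, n_results) triples
    into sessions, starting a new session after 15 minutes of inactivity."""
    sessions, current = [], []
    for record in transactions:
        if current and record[0] - current[-1][0] > SESSION_TIMEOUT:
            sessions.append(current)
            current = []
        current.append(record)
    if current:
        sessions.append(current)
    return sessions

def potentially_problematic(session):
    """Flag a session if any query returned zero results, or if the user
    issued three or more queries within a five-minute window."""
    if any(n_results == 0 for _, _, n_results in session):
        return True
    times = [t for t, _, _ in session]
    return any(
        sum(1 for t in times[i:] if t - start <= REFORM_WINDOW) >= REFORM_THRESHOLD
        for i, start in enumerate(times)
    )
```

Sessions flagged this way would still go to manual review, as in the study; the heuristic only narrows the candidate pool.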
FINDINGS

Usage Patterns

Our analysis of the aggregate data suggests that keyword searching remains the primary interaction paradigm with the library discovery system, accounting for 76 percent of all searches. However, users also increasingly take advantage of facets, both for browsing and for refining their searches: the use of facets grew from 25 percent in 2011 to 41 percent in 2014.

Although both the basic and the advanced search modes allow for "fielded" searches, where the user can specify which element of the record to search (author, title, subject, etc.), searchers rarely made use of this feature, relying mostly on the system's defaults (the "all fields" search option in the basic search mode): users selected a specific search index in less than 25 percent of all basic searches. Advanced searching was infrequent and declining (from 11 percent in 2011 to 4 percent in 2014).

Typically, users engaged in short sessions with a mean session length of 1.5 queries. Search queries were brief: 2.9 terms per query on average. Single terms made up 23 percent of queries; 26 percent had two terms, and 19 percent had three terms.

Error Patterns

The breakdown of errors by category and search mode is shown in figure 1. In the following sections, we describe and analyze the different types of errors.

Figure 1. Breakdown of errors by category and search mode

Input Errors

Input errors accounted for the largest proportion of problematic searches in the basic search mode (29 percent) and for 5 percent of problems in the advanced search. While the majority of such errors occurred at the level of individual words (misspellings or typographical errors), entire search statements were also imprecise or erroneous (e.g., "Diary of an Economic Hit Man" instead of "Confessions of an Economic Hit Man" and "Dostoevsky War and Peace" instead of "Tolstoy War and Peace"). It is noteworthy that in 46 percent of all search sessions containing problems of this type, users subsequently entered a corrected query. However, if such errors occurred in a personal name, they were almost half as likely to be corrected.

Absence of the Item Sought from the Collection

Queries for materials that were not in the library's collection accounted for about a quarter of all potentially problematic searches. In the advanced search mode, where the query is matched against a specific search field, such queries typically resulted in zero hits and can hardly be considered failures per se. However, in the default cross-field search, users were often faced with the problem of false hits and had to issue multiple, progressively more specific queries to ascertain that the desired resource was absent from the collection.

Queries at the Wrong Level of Granularity

A substantial number of user queries failed because they were posed at a level of specificity not supported by the catalog. Such queries accounted for the largest percentage of problematic advanced searches (63 percent), where they consisted almost exclusively of article-level searching: users either tried to locate a specific article (often by copying all or part of a citation from external sources) or conducted highly specific topical searches more suitable for a full-text database. In the basic search mode, the proportion of searches at the wrong granularity level was much lower, but still substantial (20 percent). In addition to searches for articles and narrowly defined subject searches, users also attempted to search for other types of more granular content, such as book chapters, individual papers in conference proceedings, poems, songs, etc.

Erroneous or Too Restrictive Use of Limiters

Another common source of failure was the selection of the wrong search index or of a facet that was too restrictive to yield any results. The majority of these errors were purely mechanical: users failed to clear search refinements from their previous search or entered query terms into the wrong search field. However, our analysis also revealed several conceptual errors, typically stemming from a misunderstanding of the meaning and purpose of certain limiters. For example, the "Online," "Database," and "Journal/Periodical" facets were often perceived by the user as a possible route to article-level content. Even seemingly straightforward limiters such as "Date" caused confusion, especially when applied to serial publications: users attempted to employ this facet to drill down to the desired journal issue or article, most likely acting on the assumption that the system included article-level metadata.
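As noted under Input Errors above, users often corrected their own misspellings in a follow-up query. The study identified such corrections by manual review; one automatable proxy, sketched below, is to compare successive queries in a session with a string-similarity measure (the threshold is illustrative, not a value from the study):

```python
import difflib

def looks_like_correction(prev_query, next_query, threshold=0.8):
    """Treat the next query as a likely spelling or typo correction of the
    previous one if the two strings are nearly identical but not equal."""
    a = prev_query.lower().strip()
    b = next_query.lower().strip()
    if a == b:
        return False
    return difflib.SequenceMatcher(None, a, b).ratio() >= threshold

# A near-duplicate pair of successive queries is flagged as a correction:
print(looks_like_correction("artifical intelligence", "artificial intelligence"))  # True
```

A pair like "Dostoevsky War and Peace" followed by "Tolstoy War and Peace" may fall below such a threshold, which is one reason the study's manual review remains necessary.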
Lack of Correspondence between the Users' Search Terms and the Library Metadata

A significant number of problems in this group involved searches for non-English materials. When performed in English transliteration, such queries often failed because of users' lack of familiarity with the transliteration rules established by the library community, whereas searches in vernacular scripts tended to produce incomplete or no results because not all bibliographic records in the database contained parallel non-Roman script fields.

Author and title searches often failed because of users' tendency to enter abbreviated queries. For example, personal-name searches in which the user truncated the author's first or middle name to an initial while the bibliographic records contained this name only in its full form were extremely likely to fail. Abbreviations were also used in searches for journals, conference proceedings, and occasionally even book titles (e.g., "AI: a modern approach" instead of "Artificial intelligence: a modern approach"). Such queries were successful only if the abbreviation used by the searcher was included in the bibliographic records as a variant title. A related problem occurred when the title of a resource contained a numeral in its spelled-out form but the user entered it as a digit. Because these title variations are not always recorded as additional access points in the bibliographic records, the desired item either did not appear in the result set or was buried too deep to be discovered.

Topical searches within the subject index were also prone to failure, mostly because patrons were unaware that such searches require the use of precise terms from controlled vocabularies and resorted to natural-language searching instead.

User Feedback

Our analysis of user feedback revealed substantial differences in how various user groups approach the search system and which areas of it they find problematic. Students were often frustrated by the absence of spelling suggestions, which, as one user put it, "left the users [to] wander in the dark" as to the cause of searching failure. This user group also found certain social features desirable: for example, one user suggested that having ratings for books would be helpful in his choice of a good programming book. By contrast, faculty and researchers were more concerned about the lack of more advanced features, such as cross-reference searching and left-anchored browsing of the title, subject, and author indexes. However, there were several areas that both groups found problematic: students and faculty alike saw the system's inability to assist in the selection of the correct form of the author's name as a major barrier to effective author searching, and both converged on the need for more granular access to formats of audiovisual materials.

DISCUSSION

Scope of the Discovery System

The results of our analysis point to users' lack of understanding of what is covered by the discovery layer. Users are often unaware of the existence of separate specialized search interfaces for different categories of materials and assume that the library discovery layer offers Google-like searching across the entire range of library resource types.
Moreover, they are confused by the multiple search modalities offered by the discovery layer: one common misconception in SearchWorks is that the advanced search will allow the user to access additional content rather than offer a different way of searching the same catalog data. In addition to the expanded scope of the discovery tools, there is also a growing expectation of greater depth of coverage. According to our data, searching in a discovery layer occurs at several levels: the entire resource (book, journal title, music recording), its smaller integral units (book chapters, journal articles, individual musical compositions, etc.), and the full text.

User Search Strategies

The search strategies employed by SearchWorks users are heavily influenced by their experiences with web search engines. Users tend to engage in brief search sessions and use short queries, which is consistent with the general patterns of web searching. They rely on relevance ranking and are often reluctant to examine search results in any depth: if the desired item does not appear within the first few hits, users tend to rework their initial search statement (often with only a minimal change to the search terms) rather than scroll down to the bottom of the results screen or look beyond the first page of results.

Given these search patterns, it is crucial to fine-tune relevance-ranking algorithms to the extent that the most relevant results are displayed not just on the first page but within the first few hits. While this is typically the case for unique and specific queries, more general searches could benefit from a relevance-ranking algorithm that leverages the popularity of a resource as measured by its circulation statistics. Adding this dimension to relevance determination would help users make sense of the large result sets generated by broad topical queries (e.g., "quantum mechanics," "linear algebra," "microeconomics") by ranking more popular or introductory materials higher than more specialized ones. It could also provide some guidance to the user trying to choose between different editions of the same resource, and it could improve the quality of author-search results by ranking works created by the author before critical and biographical materials.

Users' query formulation strategies are also modeled on Google, where making search terms as specific as possible is often the only way to increase the precision of a search. Faceted search systems, however, require a different approach: the user is expected to conduct a broad search and subsequently focus it by superimposing facets on the results. Qualifying the search up front through keywords rather than facets is not only ineffective but may actually lead to failure. For example, a common search pattern is to add the format of a resource as a search term (e.g., "Fortune magazine," "Science journal," "GRE e-book," "Nicole Lopez dissertation," "Woody Allen movies"), and because the format information is coded rather than spelled out in the bibliographic records, such queries either result in zero hits or produce irrelevant results. In a similar vein, making the query overly restrictive by including the year of publication, publisher, or edition information often causes empty retrievals, because the library might not have the edition specified by the user or because the query does not match the data in the bibliographic record. Thus our study lends further weight to claims that even in today's reality of sophisticated discovery environments and unmediated searching, library users can still benefit from learning search techniques that are specifically tailored to faceted interfaces.20
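The relevance-ranking discussion earlier in this section suggests blending a circulation-based popularity signal into textual relevance for broad topical queries. Below is a minimal sketch of one such blend, assuming a text score from the search engine and a circulation count per record; the formula and the weight alpha are illustrative, not SearchWorks' actual algorithm:

```python
import math

def blended_score(text_score, circulation_count, alpha=0.2):
    """Boost textual relevance by damped popularity: the log keeps heavily
    circulated items from swamping the text match entirely."""
    return text_score * (1 + alpha * math.log1p(circulation_count))

# Two close text matches for "quantum mechanics": the widely circulated
# introductory textbook now outranks the rarely used specialized monograph.
results = [("specialized monograph", 7.3, 12), ("introductory textbook", 7.1, 340)]
ranked = sorted(results, key=lambda r: blended_score(r[1], r[2]), reverse=True)
print([title for title, _, _ in ranked])  # ['introductory textbook', 'specialized monograph']
```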
Error Tolerance

Input errors remain one of the major sources of failure in library discovery layers. Users have become increasingly reliant on the error-recovery features that they find elsewhere on the web, such as "Did you mean . . . " suggestions, automatic spelling corrections, and helpful suggestions on how to proceed when the initial search has returned no hits. But perhaps even more crucial are error-prevention mechanisms, such as query autocomplete, which helps users avoid spelling and typographical errors and provides interactive search assistance and instant feedback during the query formulation process. Our visual analysis of the logs from the most recent years revealed an interesting search pattern, in which the user enters only the beginning of the search query and then increments it by one or two letters:

pr
pro
proq
proque
proques
proquest

Such search patterns indicate that users expect the system to offer query expansion options and show the extent to which the query autocomplete feature (currently missing from SearchWorks) has become an organic part of the users' search process.
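Runs like this are easy to detect in a query stream. The following is a minimal sketch, assuming a session's queries are available in order (the length thresholds are illustrative):

```python
def incremental_typing_runs(queries, min_run=3):
    """Find runs of successive queries where each query extends the previous
    one by one or two characters -- the signature of a user typing as if an
    autocomplete dropdown were present."""
    runs, current = [], []
    for q in queries:
        if current and q.startswith(current[-1]) and 0 < len(q) - len(current[-1]) <= 2:
            current.append(q)
        else:
            if len(current) >= min_run:
                runs.append(current)
            current = [q]
    if len(current) >= min_run:
        runs.append(current)
    return runs

print(incremental_typing_runs(["pr", "pro", "proq", "proque", "proques", "proquest"]))
# [['pr', 'pro', 'proq', 'proque', 'proques', 'proquest']]
```

The share of sessions containing such runs could serve as one usage-based argument for prioritizing an autocomplete feature.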
Topical Searching

While next-generation discovery systems represent a significant step toward enabling more sophisticated topical discovery, a number of challenges remain. Apart from mechanical errors, such as misspellings and wrong search-index selections, the majority of zero-hit topical searches were caused by a mismatch between the user's query and the vocabulary in the system's index. In many cases such queries were formulated too narrowly, reflecting the users' underlying belief that the discovery layer offers full-text searching across all of the library's resources.

In addition to keyword searching, libraries have traditionally offered a more sophisticated and precise way of accessing subject information in the form of Library of Congress Subject Headings (LCSH). However, our results indicate that these tools remain largely underused: users took advantage of this feature in only 21 percent of all subject searches in our sample. We also found that 95 percent of LCSH usage came from clicks on subject-heading links within individual bibliographic records rather than from "Subject" facets, corroborating the results of earlier studies.21

A whole range of measures could help patrons leverage the power of controlled-vocabulary searching. They include raising the level of patron familiarity with LCSH, integrating cross-references for authorized subject terms, enabling more sophisticated facet-based access to subject information by allowing users to manipulate facets independently, and exposing hierarchical and associative relationships among LCSH terms. Ideally, once the user has identified a helpful controlled-vocabulary term, it should be possible to expand, refine, or change the focus of a search through broader, narrower, and related terms in the LCSH hierarchy, as well as to discover various aspects of a topic through browse lists of topical subdivisions or via facets.

Known-Item Searching

Important as it is for the discovery layer to facilitate topical exploration, our data suggest that SearchWorks remains, first and foremost, a known-item lookup tool. While a typical SearchWorks user rarely has problems with known-item searches, our analysis of clusters of closely related searches revealed several situations where users' known-item search experience could be improved. For example, when the desired resource is not in the library's collection, the user is rarely left with an empty result set, because of automatic word-stemming and cross-field searching. While this is a boon for exploratory searching, it becomes a problem when the user needs to ensure that the item sought is not included in the library's collection. Another common scenario arises when the query is too generic, imprecise, or simply erroneous, or when the search string entered by the user does not match the metadata in the bibliographic record, causing the most relevant resources to be pushed too far down the results list to be discoverable. Providing helpful "Did you mean . . . " suggestions could help the user distinguish between these two scenarios. Another feature that would substantially benefit the user struggling with noisy retrievals is highlighting the user's search terms in retrieved records. Displaying search matches could alleviate some of the concerns over lack of transparency as to why seemingly irrelevant results are retrieved, repeatedly expressed in user feedback, as well as expedite the process of relevance assessment.

Author Searching

Author searching remains problematic because of a convergence of factors:

a. Misspellings. According to our data, typographical errors and misspellings are by far the most common problem in author searching. When such errors occur in personal names, they are much more difficult to identify than errors in a title and, in the absence of index-based spell-checking mechanisms, often require the use of external sources to be corrected.

b. Mismatch between the form and fullness of the name entered by the user and the form of the name in the bibliographic record. For example, a user's search for "D. Reynolds" will retrieve records where "D" and "Reynolds" appear anywhere in the record (or anywhere in the author fields, if the user opts for a more focused "author" search), but will not bring up records where the author's name is recorded as "Reynolds, David."

c. Lack of cross-reference searching of the LC Name Authority File. If the user searches for a variant name represented by a cross-reference on an authority record, she might not be directed to the authorized form of the name.

d. Lack of name disambiguation, which is especially problematic when the search is for a common name. While the process of name authority control ensures the uniqueness of name headings, it does not necessarily provide information that would help users distinguish between authors.
For instance, the user often has to know the author's middle name or date of birth to choose the correct entry, as exemplified by the following choices in the "Author" facet resulting from the query "David Kelly":

Kelly, David
Kelly, David (David D.)
Kelly, David (David Francis)
Kelly, David F.
Kelly, David H.
Kelly, David Patrick
Kelly, David St. Leger
Kelly, David T.
Kelly, David, 1929 July 11–
Kelly, David, 1929–
Kelly, David, 1929–2012
Kelly, David, 1938–
Kelly, David, 1948–
Kelly, David, 1950–
Kelly, David, 1959–

e. Errors and inaccuracies in the bibliographic records. Given the past practice of creating undifferentiated personal-name authority records, it is not uncommon to have one name heading for different authors or contributors. Conversely, situations where a single person is identified by multiple headings (largely because some records still contain obsolete or variant forms of a personal name) are also prevalent and may become a significant barrier to effective retrieval, as they create multiple facet values for the same author or contributor.

f. Inability to perform an exhaustive search on the author's name. A fielded "Author" search will miss records where the name does not appear in the "Author" fields but appears elsewhere in the bibliographic record.

g. Relevance ranking. Because search terms occurring in the title have more weight than search terms in the "Author" fields, works about an author are ranked higher than works by the author.
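Item b above is mechanical enough to illustrate: under exact token matching, "D. Reynolds" never matches the catalog heading "Reynolds, David." Below is a minimal sketch of a looser comparison that normalizes both forms to surname plus first-forename initial; it is illustrative only, and real headings would also need dates and fuller forename strings handled:

```python
import re

def name_key(name):
    """Reduce a personal name to (surname, first-forename initial), accepting
    both query order ("D. Reynolds") and catalog order ("Reynolds, David")."""
    if "," in name:                            # catalog form: "Reynolds, David"
        surname, _, rest = name.partition(",")
    else:                                      # query form: "D. Reynolds"
        parts = name.strip().split()
        surname, rest = parts[-1], " ".join(parts[:-1])
    forenames = re.findall(r"[A-Za-z]+", rest)
    initial = forenames[0][0].lower() if forenames else ""
    return surname.strip().lower(), initial

def compatible(query_name, heading):
    """True when surnames match and first-forename initials agree --
    a looser test than the exact fielded match that fails here."""
    return name_key(query_name) == name_key(heading)

print(compatible("D. Reynolds", "Reynolds, David"))  # True
print(compatible("D. Reynolds", "Reynolds, Sarah"))  # False
```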
Browsing

Like many other next-generation discovery systems, SearchWorks features faceted navigation, which facilitates both general-purpose browsing and more targeted search. In SearchWorks, facets are displayed from the outset, providing a high-level overview of the collection and jumping-off points for further exploration. Rather than having to guess the entry vocabulary, the searcher may simply choose from the available facets and explore the entire collection along a specific dimension. However, findings from our manual analysis of the query stream suggest that facets as a browsing tool might not be used to their fullest potential: users often resort to keyword searching when faceted browsing would have been the more effective strategy. At least two factors contribute to this trend. The first is users' lack of awareness of this interface feature: it is common for SearchWorks users to issue queries such as "dissertations," "theses," and "newspapers" instead of selecting the appropriate value of the "Format" facet. Second, many of the facets that could be useful in the discovery process are not available as top-level browsing categories. For example, users expect more granular faceting of audiovisual resources, which would include the ability to browse by content type ("computer games," "video games") and genre ("feature films," "documentaries," "TV series," "romantic comedies"). Another category of resources commonly accessed by browsing is theses and dissertations. Users frequently try to browse dissertations by field or discipline (issuing searches such as "linguistics thesis," "dissertations aeronautics," "PhD thesis economics," "biophysics thesis"), by program or department, and by level of study (undergraduate, master's, doctoral), and could benefit from a set of facets dedicated to these categories.

Browsing for books could be enhanced by additional faceting related to intellectual content, such as genre and literary form (e.g., "fantasy," "graphic novels," "autobiography," "poetry") and audience (e.g., "children's books"). Users also want to be able to browse specific subsets of materials on the basis of their location (e.g., permanent reserves at the engineering library). Browsing for new acquisitions, with the option of limiting to a specific topic, is also a highly desirable feature.

While some browsing categories are common across all types of resources, others apply only to specific types of materials (e.g., music, cartographic/geospatial materials, audiovisual resources, etc.). For example, there is strong demand among music searchers for systematic browsing by specific musical instruments and their combinations. Ideally, the system should offer both an optimal set of initial browse options and intuitive, context-specific ways to progressively limit or expand the search. Offering such browsing tools may require improvements in system design as well as significant data remediation and enhancement, because much of the metadata that could be used to create these browsing categories is often scattered across multiple fixed and variable fields in the bibliographic records, inconsistently recorded, or not present at all.

One of the hallmarks of modern discovery systems has been an increased focus on features that facilitate serendipitous browsing. SearchWorks was one of the first systems to offer a virtual "browse shelf" feature, which aims to emulate browsing the shelves in a physical library. However, because this functionality relies on the classification number, it does not allow browsing of many other important groups of materials, such as multimedia resources, rare books, or archival resources. Call-number proximity is only one of many dimensions that could be leveraged to create more opportunities for serendipitous discoveries. Other methods of associating related content might include recommendations based on subject similarity, authorship, keyword associations, forward and backward citations, and use.
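Subject similarity, the first of these association methods, can be approximated directly from existing catalog metadata. Below is a minimal sketch using the overlap of subject-heading sets; the data layout is illustrative, and a production recommender would likely weight headings and combine several of the signals listed above:

```python
def subject_similarity(headings_a, headings_b):
    """Jaccard overlap of two records' subject-heading sets."""
    a, b = set(headings_a), set(headings_b)
    return len(a & b) / len(a | b) if a | b else 0.0

def recommend(seed_headings, catalog, k=5):
    """Rank catalog records by subject overlap with a seed record.
    `catalog` maps record ids to lists of subject headings."""
    scored = sorted(
        ((subject_similarity(seed_headings, subjects), rid)
         for rid, subjects in catalog.items()),
        reverse=True,
    )
    return [rid for score, rid in scored[:k] if score > 0]
```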
Implications for Practice

Addressing the issues that we identified would involve improvements in several areas:

• Scope. Our findings indicate that library users increasingly perceive the discovery interface as a portal to all of the library's resources. Meeting this need goes far beyond offering the ability to search multiple content sources from a single search box: it is just as important to help users make sense of the results of their search and to provide easy and convenient ways to access the resources that they have discovered. And whatever the scope of the library discovery layer is, it needs to be communicated to the user with maximum clarity.

• Functionality. Users expect a robust and fault-tolerant search system with a rich suite of search-assistance features, such as index-based alternative spelling suggestions, result screens displaying keywords in context, and query-autocompletion mechanisms. These features, many of which have become deeply embedded in user search processes elsewhere on the web, could prevent or alleviate a substantial number of issues related to problematic user queries (misspellings, typographical errors, imprecise queries, etc.), enable more efficient recovery from errors by guiding users to improved results, and facilitate discovery of foreign-language materials. Equally important is a continued focus on relevance-ranking algorithms, which ideally should move beyond simple keyword-matching techniques toward incorporating social data, leveraging the semantics of the query itself, and offering more intelligent and possibly more personalized results depending on the context of the search.

• Metadata. The quality of the user experience in discovery environments depends as much on the metadata as it does on the functionality of the discovery layer. Thus it remains extremely important to ensure consistency, granularity, and uniformity of metadata, especially as libraries are increasingly faced with the problem of integrating heterogeneous pools of metadata into a single discovery tool.

CONCLUSIONS AND FUTURE DIRECTIONS

The analysis of the transaction-log data and user feedback has helped us identify several common patterns of search failure, which in turn reveal important assumptions and expectations that users bring to library discovery. These expectations pertain primarily to the system's functionality: in addition to simple, intuitive, and visually appealing interfaces and relevance-ranked results, users expect a sophisticated search system that consistently produces relevant results even for incomplete, inaccurate, or erroneous queries. Users also expect a more centralized, comprehensive, and inclusive search environment that enables more in-depth discovery by offering article-level, chapter-level, and full-text searching. Finally, the results of this study have underscored the continued need for a more flexible and adaptive system that would be easy to use for novices while offering advanced functionality and more control over the search process for "power" users, a system that would provide targeted support for different types of information behavior (known-item lookup, author searching, topical exploration, browsing) and would facilitate both general inquiry and very specialized searches (e.g., searches for music, cartographic and geospatial materials, digital collections of images, etc.).

Just like discovery itself, building discovery tools is a dynamic, complex, iterative process that requires intimate knowledge of ever-changing and evolving user needs and expectations. It is hoped that an ongoing focus on user problems and frustrations in the new discovery environments can complement other assessment methods by identifying unmet user needs, thus helping create a more holistic and nuanced picture of users' search and discovery behaviors.

REFERENCES

1. Marshall Breeding, "Library Resource Discovery Products: Context, Library Perspectives, and Vendor Positions," Library Technology Reports 50, no. 1 (2014): 5–58.

2. Craig Silverstein et al., "Analysis of a Very Large Web Search Engine Query Log," SIGIR Forum 33, no. 1 (1999): 6–12;
Bernard J. Jansen, Amanda Spink, and Tefko Saracevic, "Real Life, Real Users, and Real Needs: A Study and Analysis of User Queries on the Web," Information Processing & Management 36, no. 2 (2000): 207–27, http://dx.doi.org/10.1016/S0306-4573(99)00056-4; Amanda Spink, Bernard J. Jansen, and H. Cenk Ozmutlu, "Use of Query Reformulation and Relevance Feedback by Excite Users," Internet Research 10, no. 4 (2000): 317–28; Amanda Spink et al., "Searching the Web: The Public and Their Queries," Journal of the American Society for Information Science & Technology 52, no. 3 (2001): 226–34; Bernard J. Jansen and Amanda Spink, "An Analysis of Web Searching by European AlltheWeb.com Users," Information Processing & Management 41, no. 2 (2005): 361–81, http://dx.doi.org/10.1016/S0306-4573(03)00067-0.

3. Cory Lown and Bradley Hemminger, "Extracting User Interaction Information from the Transaction Logs of a Faceted Navigation OPAC," Code4Lib Journal, no. 7 (June 26, 2009), http://journal.code4lib.org/articles/1633; Eng Pwey Lau and Dion Ho-Lian Goh, "In Search of Query Patterns: A Case Study of a University OPAC," Information Processing & Management 42, no. 5 (2006): 1316–29, http://dx.doi.org/10.1016/j.ipm.2006.02.003; Heather Moulaison, "OPAC Queries at a Medium-Sized Academic Library: A Transaction Log Analysis," Library Resources & Technical Services 52, no. 4 (2008): 230–37.

4. William H. Mischo et al., "User Search Activities within an Academic Library Gateway: Implications for Web-Scale Discovery Systems," in Planning and Implementing Resource Discovery Tools in Academic Libraries, edited by Mary Pagliero Popp and Diane Dallis, 153–73 (Hershey, PA: Information Science Reference, 2012); Xi Niu, Tao Zhang, and Hsin-liang Chen, "Study of User Search Activities with Two Discovery Tools at an Academic Library," International Journal of Human-Computer Interaction 30, no. 5 (2014): 422–33, http://dx.doi.org/10.1080/10447318.2013.873281.

5. Eng Pwey Lau and Dion Ho-Lian Goh, "In Search of Query Patterns"; Niu, Zhang, and Chen, "Study of User Search Activities with Two Discovery Tools at an Academic Library."

6. Lown and Hemminger, "Extracting User Interaction Information"; Kristin Antelman, Emily Lynema, and Andrew K. Pace, "Toward a Twenty-First Century Library Catalog," Information Technology & Libraries 25, no. 3 (2006): 128–39; Niu, Zhang, and Chen, "Study of User Search Activities with Two Discovery Tools at an Academic Library."

7. Xi Niu and Bradley Hemminger, "Analyzing the Interaction Patterns in a Faceted Search Interface," Journal of the Association for Information Science & Technology 66, no. 5 (2015): 1030–47, http://dx.doi.org/10.1002/asi.23227.

8. Steven D. Zink, "Monitoring User Search Success through Transaction Log Analysis: The WolfPAC Example," Reference Services Review 19, no. 1 (1991): 49–56; Deborah D. Blecic et al., "Using Transaction Log Analysis to Improve OPAC Retrieval Results," College & Research Libraries 59, no. 1 (1998): 39–50;
Holly Yu and Margo Young, "The Impact of Web Search Engines on Subject Searching in OPAC," Information Technology & Libraries 23, no. 4 (2004): 168–80; Moulaison, "OPAC Queries at a Medium-Sized Academic Library."

9. Thomas Peters, "When Smart People Fail," Journal of Academic Librarianship 15, no. 5 (1989): 267–73; Zink, "Monitoring User Search Success through Transaction Log Analysis"; Rhonda H. Hunter, "Successes and Failures of Patrons Searching the Online Catalog at a Large Academic Library: A Transaction Log Analysis," Reference Quarterly (Spring 1991): 395–402.

10. Karen Antell and Jie Huang, "Subject Searching Success: Transaction Logs, Patron Perceptions, and Implications for Library Instruction," Reference & User Services Quarterly 48, no. 1 (2008): 68–76; Hunter, "Successes and Failures of Patrons Searching the Online Catalog at a Large Academic Library"; Peters, "When Smart People Fail."

11. Peters, "When Smart People Fail"; Moulaison, "OPAC Queries at a Medium-Sized Academic Library"; Blecic et al., "Using Transaction Log Analysis to Improve OPAC Retrieval Results."

12. Lynn Silipigni Connaway, Debra Wilcox Johnson, and Susan E. Searing, "Online Catalogs from the Users' Perspective: The Use of Focus Group Interviews," College & Research Libraries 58, no. 5 (1997): 403–20, http://dx.doi.org/10.5860/crl.58.5.403.

13. Karl V. Fast and D. Grant Campbell, "'I Still Like Google': University Student Perceptions of Searching OPACs and the Web," ASIST Proceedings 41 (2004): 138–46; Eric Novotny, "I Don't Think I Click: A Protocol Analysis Study of Use of a Library Online Catalog in the Internet Age," College & Research Libraries 65, no. 6 (2004): 525–37, http://dx.doi.org/10.5860/crl.65.6.525.

14. Xi Niu et al., "National Study of Information Seeking Behavior of Academic Researchers in the United States," Journal of the American Society for Information Science & Technology 61, no. 5 (2010): 869–90, http://dx.doi.org/10.1002/asi.21307; Lynn Silipigni Connaway, Timothy J. Dickey, and Marie L. Radford, "If It Is Too Inconvenient I'm Not Going after It: Convenience as a Critical Factor in Information-Seeking Behaviors," Library & Information Science Research 33, no. 3 (2011): 179–90; Karen Calhoun, Joanne Cantrell, Peggy Gallagher, and Janet Hawk, Online Catalogs: What Users and Librarians Want: An OCLC Report (Dublin, OH: OCLC Online Computer Library Center, 2009).

15. F. William Chickering and Sharon Q. Young, "Evaluation and Comparison of Discovery Tools: An Update," Information Technology & Libraries 33, no. 2 (2014): 5–30, http://dx.doi.org/10.6017/ital.v33i2.3471.

16. William Denton and Sarah J. Coysh, "Usability Testing of VuFind at an Academic Library," Library Hi Tech 29, no. 2 (2011): 301–19, http://dx.doi.org/10.1108/07378831111138189; Jennifer Emanuel, "Usability of the VuFind Next-Generation Online Catalog," Information Technology & Libraries 30, no. 1 (2011): 44–52;
Erin Dorris Cassidy et al., "Student Searching with EBSCO Discovery: A Usability Study," Journal of Electronic Resources Librarianship 26, no. 1 (2014): 17–35, http://dx.doi.org/10.1080/1941126X.2014.877331.

17. Sarah C. Williams and Anita K. Foster, "Promise Fulfilled? An EBSCO Discovery Service Usability Study," Journal of Web Librarianship 5, no. 3 (2011): 179–98, http://dx.doi.org/10.1080/19322909.2011.597590; Rice Majors, "Comparative User Experiences of Next-Generation Catalogue Interfaces," Library Trends 61, no. 1 (2012): 186–207; Andrew D. Asher, Lynda M. Duke, and Suzanne Wilson, "Paths of Discovery: Comparing the Search Effectiveness of EBSCO Discovery Service, Summon, Google Scholar, and Conventional Library Resources," College & Research Libraries 74, no. 5 (2013): 464–88.

18. Jody Condit Fagan et al., "Usability Test Results for a Discovery Tool in an Academic Library," Information Technology & Libraries 31, no. 1 (2012): 83–112; Megan Johnson, "Usability Test Results for Encore in an Academic Library," Information Technology & Libraries 32, no. 3 (2013): 59–85.

19. Elizabeth (Bess) Sadler, "Project Blacklight: A Next Generation Library Catalog at a First Generation University," Library Hi Tech 27, no. 1 (2009): 57–67, http://dx.doi.org/10.1108/07378830910942919; Bess Sadler, "Stanford's SearchWorks: Unified Discovery for Collections?," in More Library Mashups: Exploring New Ways to Deliver Library Data, edited by Nicole C. Engard, 247–60 (London: Facet, 2015).

20. Asher, Duke, and Wilson, "Paths of Discovery"; Kelly Meadow and James Meadow, "Search Query Quality and Web-Scale Discovery: A Qualitative and Quantitative Analysis," College & Undergraduate Libraries 19, no. 2–4 (2012): 163–75, http://dx.doi.org/10.1080/10691316.2012.693434.

21. Williams and Foster, "Promise Fulfilled?"; Kathleen Bauer and Alice Peterson-Hart, "Does Faceted Display in a Library Catalog Increase Use of Subject Headings?," Library Hi Tech 30, no. 2 (2012): 347–58, http://dx.doi.org/10.1108/07378831211240003.

9255 ---- Critical Success Factors for Integrated Library System Implementation in Academic Libraries: A Qualitative Study

Shea-Tinn Yeh and Zhiping Walter

ABSTRACT

Integrated library systems (ILSs) support the entire business operations of an academic library, from acquiring and processing library resources to making them available to user communities and preserving them for future use.
As libraries' needs evolve, there is a pressing demand for libraries to migrate from one generation of ILS to the next. This complex migration process often requires significant financial and personnel investment, but its success is by no means guaranteed. We draw on the enterprise resource planning and critical success factors (CSFs) literature to identify the most salient CSFs for ILS migration success through a qualitative study with four cases. We found that a careful selection process, top management involvement, vendor support, project team competence, staff user involvement, interdepartmental communication, data analysis and conversion, project management and project tracking, staff user education and training, and managing staff user emotions are the most salient CSFs that determine the success of a migration project.

INTRODUCTION

The first generation of integrated library systems (ILSs) was developed specifically for library operations focused on the selection, acquisition, cataloging, and circulation of print collections. As libraries' nonprint materials steadily grew, these print-centric ILSs became less and less efficient in supporting libraries' daily operations. Recent years have seen the emergence of a new generation of ILSs, commonly called Library Services Platforms (LSPs), that takes into account the management of both print and electronic collections. LSPs take advantage of cloud computing and network advancements to provide economies of scale and to allow a library to better share data with other libraries. Furthermore, LSPs unify the entire suite of library operations to provide efficient workflows at the back end and advanced online discovery tools at the front end.1 Given the claimed benefits of the emerging LSP and the fact that vendors are phasing out support for their legacy ILSs, we project that more libraries will be migrating to LSPs as the systems mature and libraries' needs evolve.

Shea-Tinn Yeh (sheila.yeh@du.edu) is Assistant Professor and Library Digital Infrastructure and Technology Coordinator, University of Denver Libraries. Zhiping Walter (zhiping.walter@ucdenver.edu) is Associate Professor, Business School, University of Colorado Denver.

Migrating from one generation of ILS to another is a significant initiative that affects the entire library operation.2 Because of its scale and complexity, the migration project is not always smooth and is often fraught with problems, with some projects falling behind their completion schedules.3, 4, 5 In addition, committing to a new system often results in significant financial and personnel costs for an academic library.6 Understandably, there is considerable trepidation before, during, and after the migration process.6, 7 What contributes to a smooth migration process and a successful migration project? This is an urgent question at present and an enduring one for the future: as libraries continue to evolve, their operations and management needs are destined to outgrow the functionality of the current generation of ILS, so migration to a new generation of ILS will occur periodically for any library.
In this research, we study the critical success factors (CSFs) that contribute to a successful migration project, defined as on-time and on-budget project completion with a smooth implementation process. To achieve our research goal, we anchor our theoretical foundation in the enterprise resource planning (ERP) system-implementation literature. ERP is "business process management software that allows an organization to use a system of integrated applications to manage the business and automate many back office functions related to technology, services and human resources."9 Since a complete ILS is formed from a suite of integrated functions that manage a broad range of library processes, it is in fact an ERP for libraries.10 A literature review of CSFs for ERP system implementation success revealed more than ninety CSFs.11, 12 The contribution of our research is in identifying, through a qualitative research method, the most salient CSFs that contribute to the success of a library system migration project from one generation of ILS to another. Results of this study can help library administrators improve the chance of success and decrease the level of anxiety during a migration project now and in the future.

The remainder of the article is organized as follows: Section 2 reviews ERP, ILS, LSP, CSFs, and information-system success measurement as described in the literature. Section 3 describes the guided interviews that were conducted to identify the CSFs, the results, and the analysis of the results. Finally, we offer conclusions and limitations as well as recommend future work.

LITERATURE REVIEW

ERP is business-management software comprising a suite of integrated applications that an organization can use to collect, store, manage, and interpret data from many business activities, including product planning, manufacturing, service delivery, marketing and sales, and human resources. The core idea of an ERP system is to integrate both the data and the process dimensions of a business so that transactions can be monitored and analyzed for planning and strategic purposes.13 Modules of the system cover different functions within a company and are linked so users can see what is happening in all areas of the company. An ERP system can improve a business's back-office as well as front-end functions, with both operational and strategic benefits.14 Some of these benefits include reliability of information access, reduced data and operations redundancy, efficient data retrieval and reporting, easy module extension, and Internet commerce capability.

Just like an ERP system for a business, a complete library management solution comprises a suite of integrated applications that manage a broad range of library processes, including circulation, acquisition, cataloging, electronic resources management, and system administration. LSPs, the current generation of library management systems, are designed to manage both physical and digital collections.
LSPs follow a service-oriented architecture (SOA) and can be deployed through a multitenant Software as a Service (SaaS) distribution model.15 In addition to supporting all library functions, LSPs integrate with other university systems, such as student registry and finance, and provide a front end for library patrons in a cloud environment that leverages a global network of systems for the discovery of a wide array of resources.16 Since an LSP is essentially an enterprise system for library functions, the CSFs of ERP implementation success can guide LSP implementation.

CSFs are conditions that must be met for an implementation to be successful.17 More than ninety CSFs have been identified for ERP implementation success.18, 19 Those CSFs have been classified according to various schemes, but we found the strategic-versus-tactical classification most relevant to the library context.20 Strategic factors address the big picture, involving the breakdown of goals into doable items. Tactical factors, on the other hand, are the methods used to accomplish the doable items that lead to achieving the goals.21 By examining the entire list of CSFs from both the strategic and the tactical perspectives, we identify, through a qualitative study, the top CSFs for library-management-solution implementation and migration success, defined as on-time and on-budget delivery as well as a smooth implementation process.22, 23

METHOD

We conducted semistructured interviews with open-ended questions to identify the most salient CSFs for implementation success. Since we needed to reduce the more than ninety CSFs in the literature to a list of the most salient CSFs in the library context, and potentially to identify new CSFs, a qualitative interview approach was more suitable than a quantitative survey approach. A two-step process was used to arrive at the final list. First, we evaluated all CSFs in the literature and identified a subset that might be most relevant for library-systems implementation.24 Second, this subset was used to develop an interview guide for the semistructured interviews conducted later to further reduce it. Open-ended questions were also used during the interviews to elicit additional CSFs. An institutional review board (IRB) application was submitted and approved. The result of this two-step process is a list of ten CSFs discussed in the results section, with nine CSFs coming from our initial list and one emerging from the interviews.

The criterion for recruiting study libraries was that the library had implemented a new LSP within the last three years. This is because the LSP is the current generation of ILS, and it is only within the last few years that various LSP vendors began to promote and implement LSPs.
A recruitment email was sent to libraries listed as adopters on various vendors' press release sites. Participating recipients referred the interview request to appropriate migration team members, whom we later contacted to schedule interviews. This resulted in up to five people from each participating library being interviewed in person or via Skype. Their positions are listed in table 1. Interviews were recorded, transcribed, and cleaned. Emails to the same interviewees were used for follow-up questions as needed.

After the interviews with each library, qualitative data analysis was performed to identify CSFs that emerged from the interviews. Interviews continued until no new CSFs emerged in the last interview. In total, staff from four libraries were interviewed between October 2014 and March 2015 about their implementation process and experience from the staff user perspective. The design and implementation of the public discovery interface was not part of this inquiry. Table 1 summarizes the characteristics of the four libraries. Case numbers instead of university names are used to protect the identities of participating libraries and interviewees.

                        Case 1         Case 2        Case 3         Case 4
Type of university      Private        Public        Public         Private
Student population      11,000+        32,000+       2,400+         2,700+
Operating budget        $11 million    $13 million   $1.5 million   $1.3 million
Library employees       150            400           17             13.5
Project length          6 months       9 months      6 months       9 months
ILS used before         Millennium     Aleph         Evergreen      Voyager
LSP implemented         Sierra         Alma          Sierra         Sierra

Reasons for migration: Case 1 — discontinued vendor system support, servers out of warranty, and vendor incentives; Case 2 — outdated servers that were out of warranty; Case 3 — need for a robust system that provides a discovery layer; Case 4 — need for a modern system demonstrating that the library is moving with the times.

Positions of interviewees: Case 1 — head of systems and module experts; Case 2 — heads of systems; Case 3 — director of the library and head of systems; Case 4 — director of the library.

Table 1. Summary of case study site characteristics.

RESULTS

The following CSFs emerged from the interviews: careful selection process, top management involvement, vendor support, project team competence, staff user involvement, interdepartmental communication, data analysis and conversion, project management and project tracking, staff user education and training, and managing staff user emotions. We discuss each CSF next.

Careful Selection Process

Most ILSs are commercial, off-the-shelf software systems that can vary dramatically in functionality from system to system.25 For example, some packages are more suitable for large institutions while others are more suitable for smaller ones. To mitigate the risk of productivity or transaction loss and to minimize system and implementation costs, a library needs to determine the best "fitness-for-use" system. Such a determination is the outcome of a careful selection process. Although there is no commonly accepted technique, method, or tool for this process, all selection processes share common key steps suggested in the literature.26 As applied to library-systems selection, they are the following: define stakeholder requirements, search for products, create a short list of the most promising candidates based on a set of "must-have" requirements, evaluate the candidates on the short list, and analyze the evaluation data to make a selection. In addition, if the server option is chosen instead of the cloud option, the selected hardware needs to satisfy the system requirements of the final configuration.

A careful selection process emerged as a CSF that affected the implementation outcome for all four libraries. All cases were migrating to an LSP. Some systems can be offered as locally installed systems, which require appropriate in-house capabilities and hardware. Case 1 did not consider its IT capability when deciding on a turnkey system. As a result, the library experienced difficulties in setting up the infrastructure in-house during the implementation.
Each of the other three cases considered the candidate system's compatibility with the legacy system, the match between library needs and system functionality, system maturity, migration costs, data storage needs, and vendor support before and during the implementation, as well as continued vendor support throughout the life of the new system. Even though each of the three libraries arrived at its system choice differently, on reflection, interviewees expressed relief and satisfaction with their decisions to choose their respective systems.

"We were in the position where our servers were out of date and warranty, needed to be replaced. The servers were too small. We had sizing issues and we couldn't update to the most recent version of Aleph . . . Alma being a cloud based solution will eliminate our need to be 'in the server business.'" (Case 2)

"We went through a very extensive formal process to select this system." (Case 3)

Top Management Involvement

Successful implementation requires strong leadership by executives who understand, support, and champion the project.27 When this involvement trickles down through the organizational hierarchy, it leads to an organizational commitment, which is required for the implementation success of complex projects.28, 29 Since library-system implementation is a complex project that (if done correctly) will transform the entire library and reposition it for better efficiency, strong leadership is critical as well.

In all four cases, top management was involved in the final decision on the system choice. In cases 1 and 2, top management also took charge of securing funding for the migration projects. Interviewees stressed that top management support was very important in their respective project implementations.

"The top level management took the recommendations from the systems librarians at the time, with the blessing of the council determined whether they want to proceed with the product Alma, and had funding conversations with the financial people." (Case 2)

"We have faculty library committee, faculty governance oversight. We showed them webinars of the products we considered before we signed them, so we have faculty representation on board. We held open forum and were inclusive in our invitations." (Case 4)

Vendor Support

With a new technology, it is critical to acquire external technical expertise, often from the vendor, to facilitate successful implementation.30 Effective vendor support includes adequate, high-quality technical support during and after implementation, sufficient training for both the project team and staff users, and positive relationships between all parties in the project.31 Additionally, there should be adequate knowledge transfer between the vendor consultants and the clients, which can be achieved by defining roles, achieving shared understanding, and enhancing relationships through competent communication.32, 33 In the case of library-system implementations, vendor support is particularly important because of the complexity of each new generation of the system and library personnel's knowledge gap in understanding the nuts and bolts of the new system.

Effective vendor support was identified in each case as a critical success factor determining the implementation outcome, even though the form of vendor support varied from case to case.
In case 1, the vendor sent different consultants with varied expertise to serve as project managers depending on the project phase. In case 2, the vendor sent one consultant who served as the main project manager. In case 3, the vendor provided a project manager and a team of technicians. In case 4, consultants were shared across multiple consortium libraries that were implementing the system at the same time. No matter how vendor support was provided, it was essential for implementation success, as interviewees indicated.

"The vendor has been very supportive and provides a group of experts throughout the process, some are knowledgeable in server business while others are skilled project managers." (Case 1)

Project Team Competence

Since a library-system migration affects all functional areas of a library, members of the implementation team need to be cross-functional. Furthermore, members with both business knowledge and technology know-how are especially crucial for implementation success.34 The competence of the vendor consultants assigned to the project also influences implementation success, as discussed earlier. Additionally, it is important to have an in-house project leader who champions the project and who has the essential skills and authority to set goals that legitimize change.35

Having a competent project team was essential for implementation success in each of our cases. In each case, the vendor provided the project manager and the library provided a co-manager who was a champion figure. Other team members came from various functional areas such as acquisitions, circulation, cataloging, electronic resources management, and system administration. For example, in case 1, the technology librarian participated as a co-project manager. The project-management team comprised module experts from functional areas within the library. In addition, the university's technology services department lent technical support during the early stages of implementation, when servers needed to be set up. The interviewees all stressed the importance of project-team competence.

"Without the infrastructure knowledge from the university's technology team and their time and full support to negotiate with the vendor, the migration project would not have been possible." (Case 1)

"The university's IT made sure that we are in compliance with campus policies and expectations for securities." (Case 2)

Staff User Involvement

It is important that the project team involve staff users early on; otherwise, the implementation process may be bumpy. When end users are involved in decisions relating to system selection and implementation, they are more invested in and concerned with the success of the system, which in turn leads to greater system use and user satisfaction.36, 37 As such, user involvement is one of the most cited critical success factors in ERP implementation.38 Because personal relevance to the system is just as important for library-system implementation, effective staff user involvement is positively related to implementation success.

Staff user involvement emerged as a main success factor in all our cases and contributed to the implementation project outcome. In case 1, staff users were not consulted as to whether an LSP was necessary for the library, although they were informed of the reasons for implementation. Additionally, staff users were not involved when the project timetable was negotiated.
This lack of early staff user involvement led to considerable stress down the road, which made the implementation process bumpy. The other three cases involved staff users early on; as a result, their staff users experienced much less stress and frustration. Specifically, in case 2, the staff users were educated about the need for migration through staff meetings, town hall meetings, supervisory meetings, council meetings, and forums. Many product-demo sessions were conducted for the staff so they would have the knowledge to participate before the final decision was made. There were daily internal newsletters conveying implementation news throughout the months of the implementation. In case 3, the entire library was involved with the selection of a new system. While the key staff (such as the circulation manager, acquisitions manager, and reference manager) had more input than others, everyone offered input about the project. As such, buy-in for the new system was strong from all stakeholders. In case 4, staff users were involved early on through open forums and webinars. The following quotes are examples of interviewee sentiment concerning staff user involvement:

"Everybody is involved in choosing the system; partially because Evergreen had been so problematic. We wanted to make sure that everyone is on board." (Case 3)

"Migration is the most time consuming aspect of the library staff work during the time of the project, without their buy-ins, it is difficult to have a successful project." (Case 4)

Interdepartmental Communication

The importance of effective communication across functional and departmental boundaries is well known in the information-systems-implementation literature.39 With consultants coming from the vendor, project team members coming from different functional areas, and staff users holding different perceptions and understandings of the implementation project, the importance of effective communication between all involved cannot be overstated. Communication should start early; be consistent and continuous throughout the various stages of the implementation process; and include a system overview, the rationale for implementation, briefings on process changes, and the establishment of contact points.40 Expectations and goals should be communicated to all stakeholders and to all levels of the organization.41

The effectiveness of interdepartmental communication affected the implementation outcome in all our cases. In case 1, the library's project manager was designated to communicate with the vendor when issues arose, such as hardware and software configurations, system backup and use, and task assignments. The formal project plan was established using the web-based Basecamp so that team members in different roles with different responsibilities could communicate and work together online. Regular meetings were held and emails were exchanged between project team members. However, there was a lack of effective interdepartmental communication with staff who were not on the project team. This resulted in the absence of necessary system testing that would have detected some data-integrity issues. Such issues later caused the system to be offline for days, which brought much frustration and stress to everyone. In the other three cases, all actors were well informed through news releases, meetings, presentations, and webinars.
Concerns were communicated to the project team and addressed in a timely manner. As a result, the level of frustration was very low in those three cases.

Data Analysis and Conversion

A fundamental requirement for the effectiveness of an ERP system is the accuracy of its data,42 and the same is true for a library system. Data types in a legacy ILS are often of an outdated format and can differ from the formats supported by a new library system. Conversion from one format to another can be an overwhelming process, especially when there is no existing expertise in the library. Since migrating legacy data to the new system is essential, effective data analysis for conversion is a critical success factor for implementation success.

The smoothness of each of the four implementation cases was related to the project team's data analysis and conversion efforts. In case 1, the library did not spend any effort to analyze, convert, or clean the data. As a result, the system experienced data-integrity issues after it went live. The other three libraries either devoted time to cleaning and converting the data or had a third party do the data cleaning. As a result, no system issues arose from data-integrity problems. Interviewees from case 2 told us, "We elected to freeze the data 30 days sooner in terms of bibliographic data, so that we can do an authority control project with a third party vendor."
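Pre-migration data analysis of this kind often begins with an automated audit of exported bibliographic records. The following minimal sketch uses the open-source pymarc library, which none of the case libraries is reported to have used; the file name and the specific checks are illustrative assumptions, since real migrations apply profile-specific rules agreed on with the vendor.

```python
# Hypothetical pre-migration audit of MARC records exported from a legacy ILS.
from pymarc import MARCReader

problems = []

with open("legacy_bibs.mrc", "rb") as fh:  # illustrative export file name
    for position, record in enumerate(MARCReader(fh)):
        if record is None:  # pymarc yields None for records it cannot parse
            problems.append((position, "unreadable record"))
            continue
        if record["245"] is None:  # a bibliographic record needs a title field
            problems.append((position, "missing 245 title field"))
        if record["008"] is not None and len(record["008"].data) != 40:
            problems.append((position, "008 fixed field has the wrong length"))

for position, issue in problems:
    print(f"record {position}: {issue}")
print(f"{len(problems)} potential data-integrity issues found")
```

Reports like this are what allow a library to decide whether to clean the data in-house or, as case 2 did with authority control, contract the work to a third party.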
Project Management and Project Tracking

According to the ERP implementation literature, effective project-management practices are critical for implementation success. Such practices include defining clear objectives, establishing a formal implementation plan, designing a realistic work plan, and establishing resource requirements.43 The formal implementation plan needs to identify the modules to be implemented, the tasks to be undertaken, and all technical and nontechnical issues to be considered.44 Project progress must be carefully monitored through meetings and reports.45, 46

Effective project management and tracking affected the implementation outcome in all our cases. A popular project management and tracking software is Basecamp, a web-based project management and collaboration tool initially released in 2004.47 It offers discussion boards, to-do lists, file sharing, milestone management, event tracking, and a messaging system that help project teams stay organized and connected despite their different locations. All cases used Basecamp for project management and tracking, which contributed to on-time and on-budget project completion in every case.

Staff User Education and Training

A new system often frustrates users who do not receive adequate training in its functionalities and use.48 When feeling frustrated and stressed, users may avoid using the system. Proper and adequate training will soothe users and eliminate their reluctance to use the new system, which in turn helps realize productivity gains.49, 50 Training processes should consider factors such as the training curriculum, user commitment, and trainers' personal skills and competence, as well as the training schedule, budget, evaluation, and methods.51

Effective staff user training emerged as a critical success factor in all our cases. In case 1, staff users had access to a vendor-supplied preview portal, which simulated system functionalities. Staff users were so familiar with the new system by the time it went live that they were eager to engage with it. In cases 2, 3, and 4, staff users were trained through product demos, online training videos, Q&A sessions, and on-site training sessions conducted by the vendor. These training materials and sessions served to ease staff users' feelings of uncertainty and anxiety, as the following quotes show:

"The online training videos were provided to all staff in the library and followed up with Q&A sessions which members of the committee will host in their respective areas. . . . Then Ex Libris did a week long onsite training workshop serve for the final deep configuration issues. . . . We know that there are staff users who want to be ahead of the game, yet there are always people who don't want to learn until the day before they go live." (Case 2)

"We have a training package with several onsite visits, each one is for a few days. The trainer focused on one aspect of the system. It was more than watching the videos online. Because of the small staff here, almost everyone attended at least one training." (Case 3)

"The trainers varied with their expertise, we developed fondness for some more than others. The training is functional in nature. The vendor's priority was about trainer availability and to keep the project on time. We became familiar with trainers' expertise; we were able to request the right trainer with the job." (Case 4)

Managing Staff User Emotions

Although education and training ease user anxiety, they do not completely eliminate it. Emotions felt by users early in the implementation of a new system have important effects on the use of the system later on.52 How to manage staff user anxiety and negative emotions when they appear emerged as a critical success factor in all our cases, as shown in the following quotes:

"There were so many things going on in the library during the migration go-live week. The unknown of the migration success made staff users uncomfortable. Should the migration date be decided in consideration of other initiatives, the frustration experienced would have been a lot less and might not have been ignored during the going-live week." (Case 1)

"The frustration was just change; it was the fact that we have to learn something new. . . . Primarily the frustration was handled by the lead." (Case 2)

"There was a challenge, especially early on, in getting people to engage with the manuals and the literature in documentation. It is as if everyone is being asked to learn a new language. . . . The key relationship between the onsite coordinator and the project manager on the vendor side is important. When those two exchange information and handle frustration diplomatically, this bridge between the two organizations can smooth over a lot of rough feathers on either or both sides." (Case 4)

This final CSF did not come directly from the ninety-plus CSFs that we started with, although it aligned closely with the "Change Management" category.53 It emerged mostly from the interview process.

Summary of Results

The results of the case studies for each critical success factor are summarized in table 2. Implementation project outcomes are summarized in table 3. An implementation is considered successful if it was completed on time and on budget and if the implementation process was smooth, as reflected in the number and degree of unexpected problems along the way.
Critical Success Factor             Case 1  Case 2  Case 3  Case 4
Careful selection process           No      Yes     Yes     Yes
Top management involvement          Yes     Yes     Yes     Yes
Vendor support                      Yes     Yes     Yes     Yes
Project team competence             Yes     Yes     Yes     Yes
Staff user involvement              No      Yes     Yes     Yes
Interdepartmental communication     No      Yes     Yes     Yes
Data analysis & conversion          No      Yes     Yes     Yes
Project management and tracking     Yes     Yes     Yes     Yes
Staff user education and training   Yes     Yes     Yes     Yes
Managing staff user emotions        No      Yes     Yes     Yes

Table 2. Summary of case study critical success factor findings.

                              Case 1  Case 2  Case 3  Case 4
On-time implementation        Yes     Yes     Yes     Yes
On-budget implementation      Yes     Yes     Yes     Yes
Smoothness of implementation  No*     Yes     Yes     Yes

* Staff users experienced data-integrity issues and system downtime, as well as anxiety and stress with the system implementation process.

Table 3. Summary of case study implementation success measures.

DISCUSSION AND CONCLUSIONS

The implementation of a new ILS is a large-scale undertaking that affects every aspect of a library's operations as well as every staff user's workflow. As such, it is imperative for library administrators to understand what factors contribute to a successful implementation. Our qualitative study shows that there are two categories of CSFs: strategic and tactical. From the strategic perspective, top management involvement, vendor support, staff user involvement, interdepartmental communication, and staff user emotion management are critical. From the tactical perspective, project team competence, project management and project tracking, data analysis and conversion, and staff user education and training (which breaks down the technical barrier) greatly affect the implementation outcome. In addition, selecting the final system from a variety of choices and options requires careful consideration of both strategic and tactical issues. Each factor identified is important in its own right during the implementation process. Combined, they complement each other to guide an implementation to success.

Among the CSFs identified, the role of staff user emotion management was not identified during the theoretical phase of the study; it only emerged as an important CSF during the interviews. Top management involvement, vendor support, project team competence, project management and tracking, and staff user education and training are CSFs that were somewhat intuitive, and they were addressed in all cases. However, a library may select an end system without careful consideration. It may also be unaware of the importance of involving users early on, of opening clear lines of interdepartmental communication, or of performing data analysis and conversion before the implementation. Staff user emotion management, especially, is at risk of being an afterthought of an implementation. By identifying the most salient CSFs, this study offers practical contributions to academic library leaders and administrators in understanding how critical success factors play a role in ensuring a smooth and successful ILS implementation. Although CSFs have been extensively studied in the discipline of information-systems management, this is the first study to apply CSFs in the library context.
Since library management has its own unique challenges compared to businesses, identifying CSFs for library-system-implementation success is important not only for the current migration to LSPs but also for migrations to future generations of ILSs as the needs of libraries continue to evolve.

As with any empirical research, there are limitations to this study. The number of academic libraries interviewed is small, despite no new information being discovered after the fourth interview. The vendors represented in this study are only two of the many in the market providing LSPs to libraries. Given these limitations, the results of this study may not be generalizable to libraries implementing an LSP with vendors other than Innovative Interfaces and Ex Libris. Additionally, the results may not be generalizable to nonacademic libraries.

This research can be extended by performing survey research in academic libraries to validate the proposed CSFs quantitatively. Studying the interactions between the identified factors would offer an even greater contribution. The research can also be replicated in other types of libraries to broaden the generalizability of its inferences. In addition, case libraries 3 and 4 both noted that an LSP changes the public interface used by external users, and they wished they had had more opportunities for outreach prior to the implementation. Although the design and implementation of the public interface was not considered within the scope of this research, this comment is insightful because it may imply that future studies should consider a project champion to be a critical success factor. The project champion must have the people skills and the position to introduce changes and achieve buy-in from staff users.54, 55

REFERENCES

1. Richard M. Jost, Selecting and Implementing an Integrated Library System: The Most Important Decision You Will Ever Make (Boston: Chandos, 2015).

2. Ibid., 3.

3. Suzanne Julich, Donna Hirst, and Brian Thompson, "A Case Study of ILS Migration: Aleph500 at the University of Iowa," Library Hi Tech 21, no. 1 (2003): 44–55, http://dx.doi.org/10.1108/07378830310467391.

4. Zahiruddin Khurshid, "Migration from DOBIS LIBIS to Horizon at KFUPM," Library Hi Tech 24, no. 3 (2006): 440–51, http://dx.doi.org/10.1108/07378830610692190.

5. Vandana Singh, "Experiences of Migrating to an Open-Source Integrated Library System," Information Technology & Libraries 32, no. 1 (2013): 36–53.

6. Jost, Selecting and Implementing an Integrated Library System.

7. Yongming Wang and Trevor A. Dawes, "The Next Generation Integrated Library System: A Promise Fulfilled," Information Technology & Libraries 31, no. 3 (2012): 76–84.

8. Keith Kelley, Carrie C. Leatherman, and Geraldine Rinna, "Is It Really Time to Replace Your ILS with a Next-Generation Option?" Computers in Libraries 33, no. 8 (2013): 11–15.

9. Vangie Beal, "ERP—Enterprise Resource Planning," Webopedia, http://www.webopedia.com/TERM/E/ERP.html.

10. "Library Management System," Tangient LLC, https://libtechrfp.wikispaces.com/Library+Management+System.

11. Christopher P. Holland and Ben Light, "A Critical Success Factors Model for ERP Implementation," IEEE Software 16, no. 3 (1999): 30–36, http://dx.doi.org/10.1109/52.765784.

12. Levi Shaul and Doron Tauber, "Critical Success Factors in Enterprise Resource Planning Systems: Review of the Last Decade," ACM Computing Surveys 45, no. 4 (2013): 1–39, http://dx.doi.org/10.1145/2501654.2501669.
13. Yahia Zare Mehrjerdi, "Enterprise Resource Planning: Risk and Benefit Analysis," Business Strategy Series 11, no. 5 (2010): 308–24, http://dx.doi.org/10.1108/17515631011080722.

14. Mohammad A. Rashid, Liaquat Hossain, and Jon David Patrick, "The Evolution of ERP Systems: A Historical Perspective," in Enterprise Resource Planning: Global Opportunities and Challenges (Hershey, PA: Idea Group, 2002).

15. Marshall Breeding, "Library Systems Report 2014: Competition and Strategic Cooperation," American Libraries 45, no. 5 (2014): 21–33.

16. Sharon Yang, "From Integrated Library Systems to Library Management Services: Time for Change?" Library Hi Tech News 30, no. 2 (2013): 1–8, http://dx.doi.org/10.1108/LHTN-02-2013-0006.

17. Shahin Dezdar, "Strategic and Tactical Factors for Successful ERP Projects: Insights from an Asian Country," Management Research Review 35, no. 11 (2012): 1070–87, http://dx.doi.org/10.1108/14637151111182693.

18. Ibid.

19. Shahin Dezdar and Ainin Sulaiman, "Successful Enterprise Resource Planning Implementation: Taxonomy of Critical Factors," Industrial Management & Data Systems 109, no. 8 (2009): 1037–52, http://dx.doi.org/10.1108/02635570910991283.

20. Sherry Finney and Martin Corbett, "ERP Implementation: A Compilation and Analysis of Critical Success Factors," Business Process Management Journal 13, no. 3 (2007): 329–47, http://dx.doi.org/10.1108/14637150710752272.

21. F. Pearce, Business Building and Promotion: Strategic and Tactical Planning (Houston: Pearman Cooperation Alliance, 2004).

22. Jennifer Bresnahan, "Mixed Messages," CIO (May 16, 1996), 72.

23. Majed Al-Mashari, Abdullah Al-Mudimigh, and Mohamed Zairi, "Enterprise Resource Planning: A Taxonomy of Critical Factors," European Journal of Operational Research 146, no. 2 (2003): 352–64, http://dx.doi.org/10.1016/S0377-2217(02)00554-4.

24. Shaul and Tauber, "Critical Success Factors in Enterprise Resource Planning Systems."

25. H. Akkermans and K. van Helden, "Vicious and Virtuous Cycles in ERP Implementation: A Case Study of Interrelations between Critical Success Factors," European Journal of Information Systems 11, no. 1 (2002): 35–46, http://dx.doi.org/10.1057/palgrave.ejis.3000418.

26. Abdallah Mohamed, Guenther Ruhe, and Armin Eberlein, "COTS Selection: Past, Present, and Future" (paper presented at the 14th Annual IEEE International Conference and Workshops on the Engineering of Computer-Based Systems, 2007), http://dx.doi.org/10.1109/ECBS.2007.28.

27. Elisabeth J. Umble, Ronald R. Haft, and M. Michael Umble, "Enterprise Resource Planning: Implementation Procedures and Critical Success Factors," European Journal of Operational Research 146, no. 2 (2003): 241–57, http://dx.doi.org/10.1016/S0377-2217(02)00547-7.

28. Jim Johnson, "Chaos: The Dollar Drain of IT Project Failures," Application Development Trends 2, no. 1 (1995): 41–47.
29. Prasad Bingi, Maneesh K. Sharma, and Jayanth K. Godla, "Critical Issues Affecting an ERP Implementation," Information Systems Management 16, no. 3 (1999): 7–14, http://dx.doi.org/10.1201/1078/43197.16.3.19990601/313.

30. Mary Sumner, "Critical Success Factors in Enterprise Wide Information Management Systems Projects," in Proceedings of the 1999 ACM SIGCPR Conference on Computer Personnel Research (New York: ACM, 1999), http://dx.doi.org/10.1145/299513.299722.

31. Eric T. G. Wang et al., "The Consistency among Facilitating Factors and ERP Implementation Success: A Holistic View of Fit," Journal of Systems & Software 81, no. 9 (2008): 1609–21, http://dx.doi.org/10.1016/j.jss.2007.11.722.

32. Dong-Gil Ko, Laurie J. Kirsch, and William R. King, "Antecedents of Knowledge Transfer from Consultants to Clients in Enterprise System Implementations," MIS Quarterly 29, no. 1 (2005): 59–85.

33. Al-Mashari, "Enterprise Resource Planning."

34. Fiona Fui-Hoon Nah and Santiago Delgado, "Critical Success Factors for Enterprise Resource Planning Implementation and Upgrade," Journal of Computer Information Systems 46, no. 5 (2006): 99–113.

35. Liang Zhang et al., "A Framework of ERP Systems Implementation Success in China: An Empirical Study," International Journal of Production Economics 98, no. 1 (2005): 56–80, http://dx.doi.org/10.1016/j.ijpe.2004.09.004.

36. Ann-Marie K. Baronas and Meryl Reis Louis, "Restoring a Sense of Control During Implementation: How User Involvement Leads to System Acceptance," MIS Quarterly 12, no. 1 (1988): 111–24.

37. Joseph Esteves, Joan Pastor, and Joseph Casanovas, "A Goals/Questions/Metrics Plan for Monitoring User Involvement and Participation in ERP Implementation Projects," IE working paper, March 11, 2004, http://dx.doi.org/10.2139/ssrn.1019991.

38. Khaled Al-Fawaz, Zahran Al-Salti, and Tillal Eldabi, "Critical Success Factors in ERP Implementation: A Review" (paper presented at the European and Mediterranean Conference on Information Systems, Dubai, May 25–26, 2008).

39. Akkermans and van Helden, "Vicious and Virtuous Cycles in ERP Implementation."

40. Nancy Bancroft, Henning Seip, and Andrea Sprengel, Implementing SAP R/3: How to Introduce a Large System into a Large Organisation (Greenwich, UK: Manning, 1998).

41. Nah, "Critical Success Factors."
42. Toni M. Somers and Klara Nelson, "The Impact of Critical Success Factors Across the Stages of Enterprise Resource Planning Implementations," in Proceedings of the 34th Hawaii International Conference on System Sciences (2001), http://dx.doi.org/10.1109/HICSS.2001.927129.

43. Shi-Ming Huang et al., "Assessing Risk in ERP Projects: Identify and Prioritize the Factors," Industrial Management & Data Systems 104, no. 8 (2004): 681–88, http://dx.doi.org/10.1108/02635570410561672.

44. Nah, "ERP Implementation."

45. Umble, "Enterprise Resource Planning."

46. Nah, "ERP Implementation."

47. "Basecamp, in a Nutshell," Basecamp, https://basecamp.com/about/press.

48. Nah, "ERP Implementation."

49. Umble, "Enterprise Resource Planning."

50. Mo Adam Mahmood et al., "Variables Affecting Information Technology End-User Satisfaction: A Meta-analysis of the Empirical Literature," International Journal of Human-Computer Studies 52, no. 4 (2000): 751–71, http://dx.doi.org/10.1006/ijhc.1999.0353.

51. Iuliana Dorobat and Floarea Nastase, "Training Issues in ERP Implementations," Accounting & Management Information Systems 11, no. 4 (2012): 621–36.

52. Anne Beaudry and Alain Pinsonneault, "The Other Side of Acceptance: Studying the Direct and Indirect Effects of Emotions on Information Technology Use," MIS Quarterly 34, no. 4 (2010): 689–710.

53. Shaul and Tauber, "Critical Success Factors in Enterprise Resource Planning Systems."

54. Andrew Lawrence Norton et al., "Ensuring Benefits Realisation from ERP II: The CSF Phasing Model," Journal of Enterprise Information Management 26, no. 3 (2013): 218–34, http://dx.doi.org/10.1108/17410391311325207.

55. Chong Hwa Chee, "Human Factor for Successful ERP2 Implementation," New Straits Times, July 28, 2003, https://www.highbeam.com/doc/1P1-76161040.html.

9268 ----
Editorial Board Thoughts: The Importance of Staff Change Management in the Face of the Growing "Cloud"
Mark Dehmlow

Mark Dehmlow (mdehmlow@nd.edu), a member of LITA and the ITAL editorial board, is the Director, Information Technology Program, Hesburgh Libraries, University of Notre Dame, South Bend, Indiana.

The library vendor market likes to throw around the word "cloud" to make its offerings seem innovative and significant. In many ways, much of what the library IT market refers to as "cloud," especially SaaS (software as a service) offerings, is really just a fancier term for hosted services. The real gravitas behind the label emanated from grid computing: large, interconnected, quickly deployable infrastructure like Amazon's AWS or Microsoft's Azure platforms. Infrastructure at that scale and that level of geographic distribution was revolutionary when it emerged. Still, these offerings at their core are basically IaaS (infrastructure as a service) bundled as a menu of services. So I think the most broadly applicable synonym for the "cloud" could be "IT as a service" in its various forms. Outsourcing in this way isn't entirely new to libraries.
The function and structure of OCLC has arguably been one of the earlier instantiations of "IT as a service" for libraries, vis-à-vis the MARC record aggregation and distribution that OCLC has provided for decades. The more recent trend toward hosted IT services has been a relatively easy transition for non-IT units in our library; to most library staff, a service looks no different based on where it is hosted. And with many services implementing APIs for libraries, that distinction is becoming less significant for our application developers too. For many of our technology staff, who have built careers around systems administration, application development, systems integration, and application management, hosted services represent a threat not only to their livelihoods but, in some ways, also to their philosophical perspectives, which are grounded in open-source and do-it-yourself beliefs. In many ways, for the IT segment of our profession, the "cloud" is perhaps more synonymous with change, and change requires effective management, especially for the human element of our organizations.

Recently, our Office of Information Technologies started an initiative to move 80% of their technology infrastructure into the cloud. They have proposed an inverted pyramid structure for determining where IT solutions should reside: focusing first on hosted software-as-a-service solutions for the largest segment of applications, followed by hosting those applications we would have typically installed locally on a platform- or infrastructure-as-a-service provider, and then limiting only those applications that have specialized technical or legal needs to reside on premise. This is a big shift for our IT staff, especially, though not only, our systems administrators. The IaaS platform our university is migrating to is Amazon Web Services, whose infrastructure is largely accessible via a web dashboard, so the myriad tasks that took our systems administrators days and weeks to do can now, in some adjusted way, be accomplished with a few clicks. This example is on one extreme end of the spectrum as far as IT change goes, but simultaneously, we have looked at the vendor market to lease pre-packaged tools that support standard functions in academic libraries and can be locally branded and configured with our data: things like course guides, A-Z journal lists, and event scheduling. The overarching goals of these efforts are cost savings and increased velocity and resiliency of infrastructure, but also, and perhaps more important, flexibility in how we invest our staff time. If we are able to move high-level tasks from staff to a platform, then we will be able to reallocate our staff's time and considerable talent to take on the constant stream of new, high-level technology needs. Partnering with the University, we are aiming toward their defined goal of moving 80% of our technical infrastructure into the "cloud." We have adopted their overall approach to systems infrastructure, at least in principle, and are integrating into our own strategy significant consideration of the impact of these changes on our staff.
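Stated schematically, the inverted pyramid is a simple triage. The Python sketch below is an illustrative rendering of that logic, with assumed decision inputs; it is not an actual policy document or tool from the university.

```python
# Schematic sketch of the "inverted pyramid" placement strategy described above.
# Inputs and category names are illustrative assumptions.

def recommend_hosting(has_special_technical_needs: bool,
                      has_legal_or_policy_constraints: bool,
                      saas_product_exists: bool) -> str:
    """Suggest where a new application should reside."""
    if has_special_technical_needs or has_legal_or_policy_constraints:
        return "on premise"  # the smallest tier, reserved for exceptional cases
    if saas_product_exists:
        return "software as a service"  # first choice for the largest segment
    return "platform/infrastructure as a service"  # run it ourselves, but in the cloud

print(recommend_hosting(False, False, True))   # -> software as a service
print(recommend_hosting(True, False, False))   # -> on premise
```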
Our organization has recognized that people form not only habits around process but also personal and emotional attachments to why we do things the way we do them, from both a philosophical and a pragmatic perspective. Our approach to staff change is layered as well as long term. We know that getting from shock to acceptance is not an overnight process and that staff who adopt our overarching goals and strategy as their own will be more successful in the long term. To make this transition, we have developed several strategic approaches:

1. Explaining the Case: My experience is that staff can live through most changes as long as they understand why. Helping them gain that understanding can take some time, but that comprehension ultimately helps them fully understand our strategic goals and make decisions that are in alignment with the overall approach. I find it important to remember that, as managers, we have been a part of all of the change conversations and have had time to assimilate ideas, discuss points of view, and process the implications of change. Each of our staff needs to go through the same process, and it is up to leadership to guide them through it and ensure they get to participate in similar conversations. It is tempting to hit an initiative at a run, but there is significant value in seeding those discussions over a longer period to integrate staff more holistically into the broader vision. It is important to explain the case for change multiple times, to actively listen to staff thoughts and concerns, and to remember to lay out the context for change, why it is important, and how we intend to accomplish things. Then reassure, reassure, and reassure. The threats to staff may seem innocuous or unfounded to managers, but staff need to feel secure during a process to ultimately buy in.

2. Consistency and Persistence: Staff acceptance doesn't always come easily, nor should it necessarily. Listening to staff and integrating their perspectives into the planning and implementation process can help demonstrate that they matter, but equally important is that they feel our approach is built on something solid. Stability is reinforced through consistency in messaging: not only individual consistency but also team consistency and upper-management consistency. Everyone should be able to support and explain the messaging around a particular change. Any time staff approach me and say, "it was much easier to do it this other way," I talk about the efficiency we will garner through this change and how we will be able to train and repurpose staff in the future. The more they hear the message, the more ingrained it becomes, and the more normative it begins to feel.

3. Training and Investment: IT futures require investment, not just in infrastructure but also in skill development. We continue to invest significantly in providing some level of training on the new technologies that we implement. That training will not only prove to staff that you are invested in their development as well as their job security, but it will also give them the tools they need to be successful in implementing new technologies. Change is anxiety-inducing because it exposes so many unknowns. Providing training helps build confidence and competence for staff, reducing anxieties and providing some added engagement in the process.
It also gives them exposure to real-world implementations of the technologies, where they can begin to see for themselves the benefits that you have been communicating.

4. Envisioning the Future (Improvements and Roles): One of the initial benefits we will get from recouping staff time is shoring up our processes. We have generally had a more ad hoc approach to managing the day-to-day. It has been difficult to institute a strong technical change-management process, in part because of time. We will be able to remove that consideration from our excuses as we take advantage of the "cloud." The net effect will be that we will do our work more thoughtfully and less ad hoc, using better-defined processes that meet group-developed expectations. In addition to doing things better, we expect to do things differently. With fewer tasks at the operational level, we believe we will be able to transition staff into newly defined roles. Some of these roles include DevOps Engineers, a hybrid of application engineering (the "dev") and systems administration (the "ops"), who will help design automation and continuous-integration processes that allow developers to focus on their programming and less on the environment in which they deploy their applications; Financial Engineers, who will take system requirements and calculate costs in somewhat complex technical cloud environments; Systems Architects, who will focus on understanding the smorgasbord of options that can be tied together to provide a service meeting expected response performance, disaster recovery, uptime, and other requirements; and Business Analysts, who will focus on taking technical requirements and looking at all of the potential approaches to solving a need, whether a hosted service, a locally developed solution, an implementation of an open-source system, or some integration of all or some of the above. This list is by no means exhaustive, but I think it forms a good foundation on which to help staff develop their skill sets along with our changing environment.

I believe it is important to remind those of us who are managing IT departments in libraries that in many ways the easiest parts of change are the logistics. The technology we work with is bounded by sets of guidelines that define how it is used and ensure that, if implemented properly, it will work effectively. People, on the other hand, are not bounded as neatly by stringent rules. They are guided by diverse backgrounds, personalities, experiences, and feelings. They can be unpredictable, difficult to fully figure out, and behaviorally inconsistent. And yet they are the great constant in our organizations and therefore require significant attention. Our field needs "servant leaders" dedicated to supporting and developing staff, not just leaders competent at implementing technologies. Managers who invest in staff and their well-being, development, and sense of engagement in their jobs will find their organizations able to tackle most anything. But those who ignore their staff's needs in favor of pragmatic goals will likely find their organizations struggling to move quickly, spending too much energy overcoming resistance instead of energizing change.
9343 ----
Let's Get Virtual: Examination of Best Practices to Provide Public Access to Digital Versions of Three-Dimensional Objects
Tanya M. Johnson

Tanya M. Johnson (tmjohnso@gmail.com), a recent MLIS degree graduate from the School of Communication & Information, Rutgers, The State University of New Jersey, is winner of the 2016 LITA/Ex Libris Student Writing Award.

ABSTRACT

Three-dimensional objects are important sources of information that should not be ignored in the increasing trend towards digitization. As digitization becomes more prevalent for other kinds of materials, it is worth exploring how cultural heritage institutions are digitizing their three-dimensional objects. This paper first reviews research concerning such digitization, in both two and three dimensions, as well as public access in this context. Next, evaluation criteria for websites incorporating digital versions of three-dimensional objects are extrapolated from previous research. Finally, five websites are evaluated, and suggestions for best practices to provide public access to digital versions of three-dimensional objects are proposed.

INTRODUCTION

Much of the literature surrounding the increased efforts of libraries and museums to digitize content has focused on two-dimensional forms, such as books, photographs, or paintings. However, information does not only come in two dimensions; there are sculptures, artifacts, and other three-dimensional objects that have been unfortunately neglected by this digital revolution. As one author stated, "While researchers do not refer to three-dimensional objects as commonly as books, manuscripts, and journal articles, they are still important sources of information and should not be taken for granted" (Jarrell 1998, 32).

The importance of three-dimensional objects as information that can and should be shared is not a new phenomenon; indeed, as early as 1887, museologists and educators forwarded the view that "museums were in effect libraries of objects" that provided information not supplied by books alone (Given and McTavish 2010, 11). However, it is only recently, with the advent of newer technological mechanisms, that such objects could be shared with the public on a larger scale. No longer do people need to physically visit museums to experience and learn from three-dimensional objects. Rather, various techniques have been utilized to place digital versions of such objects on the websites of museums and archives, and projects have been created by various universities in order to enhance that digital experience. Nevertheless, as Newell (2012) states:

Collections-holding institutions increasingly regard digital resources as additional objects of significance, not as complete replacements for the original. Digital technologies work best when they enable people who feel connected to museum objects to have the freedom to deepen these relationships and, where appropriate, to extend outsiders' understandings of the objects' cultural contexts. The raison d'être of museums and other cultural institutions remains centred on the primacy of the object and in this sense continues to privilege material authenticity. (303)
In this regard, three-dimensional visualization of physical objects can be seen as the next step for museums and cultural heritage institutions that seek to further patrons' connection to such objects via the internet. Indeed, in this digital age, the goals of museums and archives are changing, converging with those of libraries to focus more effort on providing information to the public. Along with the growing trend to digitize information contained within libraries, there has been a concomitant trend to digitize the contents of museums in order to provide greater public access to collections (Given and McTavish 2010). In light of this progress, this paper will review various methods of presenting three-dimensional objects to the public on the internet and, based on an evaluation of five digital collections, attempt to provide some advice as to best practices for museums or institutions seeking to digitize such objects and present them to the public via a digital collection.

LITERATURE REVIEW

Two-Dimensional Digitization

There are many ways to present digital versions of three-dimensional objects on a webpage, ranging from simple two-dimensional photography to complicated three-dimensional scanning and rendering. Beginning at the simpler end of the scale, Bincsik, Maezaki, and Hattori (2012) describe the process of photographing Japanese decorative art objects in order to create an image database of objects from multiple museums. Specifically, the researchers explain that they need high-quality photographs showing each object in all directions, as well as close-up images of fine details, in order to recreate the physical research experience as closely as possible. They also note that, for the same reason, the context of each object must be recorded, including photographs of any wrapping or storage materials and accompanying documentation. For this project, the researchers utilized Nikon professional or semi-professional cameras with zoom and macro lenses, and often used small apertures to increase depth of field. At times, they also took measurements of the objects in order to assist museums in maintaining accurate records. The raw image files were then processed with programs such as Adobe Photoshop, saved as original TIF files, and converted into JPEG format for upload. Despite the success of the project, the researchers also noted the limitations of digitizing three-dimensional objects:

With decorative art objects some information is inevitably lost, such as the weight of the object, the feeling of its surface texture or the sense of its functionality in terms of proportions and balance. Digital images clearly can fulfill many research objectives, but in some cases they can only be used as references. One objective of the decorative arts database is to advise the researcher in selecting which objects should be examined in person. (Bincsik, Maezaki, and Hattori 2012, 46)
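The master-and-derivative workflow described above (archival TIFF masters kept intact, JPEG copies generated for upload) is commonly scripted. The following minimal sketch uses the open-source Pillow library; the directory names, size bound, and quality setting are illustrative assumptions, not details reported by Bincsik, Maezaki, and Hattori.

```python
# Hypothetical batch conversion of archival TIFF masters to web JPEG derivatives.
from pathlib import Path
from PIL import Image

MASTERS = Path("masters")      # directory of archival TIFF files (assumed)
DERIVATIVES = Path("web")      # output directory for JPEG derivatives (assumed)
DERIVATIVES.mkdir(exist_ok=True)

for tiff_path in MASTERS.glob("*.tif"):
    with Image.open(tiff_path) as im:
        im = im.convert("RGB")           # JPEG cannot store alpha or 16-bit channels
        im.thumbnail((2048, 2048))       # bound the longest side for web delivery
        im.save(DERIVATIVES / (tiff_path.stem + ".jpg"), "JPEG", quality=85)
```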
One difficulty with photography, particularly when digitizing artwork, is that color is a function of light. Thus, a single object will often appear to be different colors when photographed in different lighting conditions using conventional digital cameras, which process images using RGB filters. More accurate representations of objects can be acquired using multispectral imaging, which uses a higher number of parameters (the international standard is 31, compared to RGB's 3) in order to obtain more information about the reflectance of an object at any particular point in space (Novati, Pellegri, and Schettini 2005). Multispectral imaging, however, is very expensive, and despite some researchers' attempts to create affordable systems (e.g., Novati, Pellegri, and Schettini 2005), the acquisition of multispectral images is generally limited to large institutions with considerable funding (Chane et al. 2013).

The use of two-dimensional photography to digitize objects is not limited to the arts; in the natural sciences, different types of photographic equipment have been developed to document existing collections and enhance scientific observation. Gigapixel imaging, for example, has been utilized to allow museum visitors to virtually explore large petroglyphs located in remote locations, as well as for documentation and viewing of dinosaur bone specimens that are not on public display (Louw and Crowley 2013). This technology consists of taking many very-high-resolution photographs that are then, via computer software, "aligned, blended, and stitched" together to create one extremely detailed composite image (Louw and Crowley 2013, 89–90). Robotic systems, such as GigaPan, have been developed to speed up the process and permit rapid recording and processing of the necessary area. Once the gigapixel image is created, it can then be uploaded and displayed on the web in dynamic form, including spatial navigation of the image with embedded text, audio, or video at specific locations and zoom levels to provide further information (Louw and Crowley 2013).

Various types of gigapixel imaging, including the GigaPan system, have also been used to digitize important collections of biological specimens, particularly insects, which are often stored in large drawers. One study examined the documentation of entomological specimens by "whole-drawer imaging" using various gigapixel imaging technologies (Holovachov, Zatushevsky, and Shydlovsky 2014). The researchers explained that different gigapixel imaging systems (many of which are commercial and proprietary) utilize different types of cameras and lenses, as well as different types of software for processing. However, despite the high cost of some commercially available systems, it is possible for museums and other institutions to create their own, economically viable versions. The system created by Holovachov, Zatushevsky, and Shydlovsky utilized a standard SLR camera, fitted with a macro lens and attached to an immovable stand. The researchers manually set up lighting, focus, aperture, and other settings, and moved the insect drawer along a predetermined grid pattern in order to obtain the multiple overlapping photographs necessary to create a large gigapixel image. They used a freely available stitching software program and manually corrected stitching artifacts and color-balance issues that resulted from the use of a non-telecentric lens.1 Despite the lower cost of their individualized system, however, the researchers noted that the process was much more time-consuming and necessitated more labor from workers digitizing the collection.
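The capture-then-stitch workflow just described can be approximated with freely available tools. The sketch below uses OpenCV's high-level stitching module as one such tool; the authors do not identify the stitching program they used, and the file names here are illustrative assumptions.

```python
# Hypothetical stitching of overlapping grid-pattern photographs of a drawer.
import glob
import cv2

# Overlapping tiles captured along a predetermined grid pattern (assumed paths).
tiles = [cv2.imread(path) for path in sorted(glob.glob("drawer_tiles/*.jpg"))]

# SCANS mode suits flat subjects such as specimen drawers (no perspective warp).
stitcher = cv2.Stitcher_create(cv2.Stitcher_SCANS)
status, composite = stitcher.stitch(tiles)

if status == cv2.Stitcher_OK:
    cv2.imwrite("drawer_composite.jpg", composite)
else:
    # Typically caused by insufficient overlap or too few matched features;
    # the fix is to recapture with a tighter grid.
    print(f"stitching failed with status {status}")
```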
Moreover, technologically speaking, the researchers emphasized the limits of two-dimensional imaging, given that the "diagnostic characteristics of three-dimensional insects," as well as the accompanying labels, are often invisible when a drawer is only photographed from the top. Thus, the researchers concluded that, ultimately, "the whole-drawer digitizing of insect collections needs to be transformed from two-dimensions to three-dimensions by employing complex imaging techniques (simultaneous use of multiple cameras positioned at different angles) and a digital workflow" (Holovachov, Zatushevsky, and Shydlovsky 2014, 7).

1 The difference between telecentric and non-telecentric lenses is explained by the researchers: "Contrary to ordinary photographic lenses, object-space telecentric lenses provide the same object magnification at all possible focusing distances. An object that is too close or too far from the focus plane and not in focus, will be the same size as if it were in focus. There is no perspective error and the image projection is parallel. Therefore, when such a lens is used to take images of pinned insects in a box, all vertical pins will appear strictly vertical, independent of their position within the camera's field of view" (Holovachov, Zatushevsky, and Shydlovsky 2014, 7).

Three-Dimensional Digitization

Given the goal of obtaining as accurate a representation as possible when digitizing objects, many researchers have turned to various techniques for obtaining three-dimensional data. Acquiring a three-dimensional image of an object takes place in three steps:

1. Preparation, during which certain preliminary activities take place that involve the decision about the technique and methodology to be adopted as well as the place of digitization, security planning issues, etc.
2. Digital recording, which is the main digitization process according to the plan from phase 1.
3. Data processing, which involves the modeling of the digitized object through the unification of partial scans, geometric data processing, texture data processing, texture mapping, etc. (Pavlidis et al. 2007, 94)

Steps 2 and 3 have been more technically described as (2) obtaining data from an object to create point clouds (from thousands to billions of X,Y,Z coordinates representing loci on the object) and (3) processing point clouds into polygon models (creating a surface on top of the points), which can then be mapped with textures and colors (Metallo and Rossi 2011).
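To make the point-cloud-to-polygon-model step concrete, the sketch below uses the open-source Open3D library; this is one possible toolchain under assumed file names and parameters, not the pipeline used by any institution discussed in this paper.

```python
# Hypothetical reconstruction of a polygon mesh from a scanned point cloud.
import open3d as o3d

# Step 2 output: thousands to billions of X,Y,Z coordinates from the scanner.
pcd = o3d.io.read_point_cloud("artifact_scan.ply")

# Surface reconstruction needs a consistent normal estimate at each point.
pcd.estimate_normals()

# Step 3: fit a surface over the points (Poisson reconstruction); a higher
# depth preserves more detail at the cost of memory and noise sensitivity.
mesh, densities = o3d.geometry.TriangleMesh.create_from_point_cloud_poisson(pcd, depth=9)

o3d.io.write_triangle_mesh("artifact_mesh.ply", mesh)
```

Texture and color mapping would follow as a separate step, typically by projecting registered photographs onto the reconstructed surface.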
There are several techniques that can be utilized to acquire three-dimensional data from a physical object. Table 1 explains the four general methods most commonly used by museums.

Laser Scanning
  Description: A laser source emits light onto the object's surface, which is detected by a digital camera; the geometry of the object is extracted by triangulation or time-of-flight calculations.
  Positives: High accuracy in capturing geometry; can capture small objects and entire buildings (using different hardware).
  Negatives: Limited texture and color captured; shiny surfaces refract the laser.
  Approximate price range: $3,000–$200,000.

White Light (Structured Light) Scanning
  Description: A pattern of light is projected onto the object's surface, and deformations in that pattern are detected by a digital camera; geometry is extracted by triangulation from the deformations.
  Positives: Captures texture details, making it very accurate; can capture color.
  Negatives: Dark, shiny, or translucent objects are problematic.
  Approximate price range: $15,000–$250,000.

Photogrammetry
  Description: Three-dimensional data is extracted from multiple two-dimensional pictures.
  Positives: Can capture small objects and mountain ranges; good color information.
  Negatives: Needs either precise placement of cameras or more precise software to obtain accurate data.
  Approximate price range: cameras, $500–$50,000; software, free–$40,000.

Volumetric Scanning
  Description: Magnetic resonance imaging (MRI) uses a strong magnetic field and radio waves to detect geometric, density, volume, and location information; computed tomography (CT) uses rotating x-rays to create two-dimensional slices, which can then be reconstructed into three-dimensional images.
  Positives: Both types can view the interior and exterior of an object; CT can be used for reflective or translucent objects; MRI can image soft tissues.
  Negatives: No color information; MRI requires the object to have high water content.
  Approximate price range: $200,000–$2,000,000.

Table 1. Description of four general methods of acquiring three-dimensional data about physical objects (table information compiled by reference to Pavlidis et al. 2007; Metallo and Rossi 2011; Abel et al. 2011; and Berquist et al. 2012).

The type of three-dimensional digitization used can ultimately depend upon the types of objects to be imaged or the type of data needed. For example, in digitizing human skeletal collections, one study explained that three-dimensional laser scanning was an advantageous technique to create models of bones for preservation and analysis, but cautioned that CT scans would be needed to examine the internal structures of such specimens (Kuzminsky and Gardiner 2012).
The researchers explained that combining the two processes is difficult because, in order to obtain multispectral textural data that is mapped to geometric positions, the object must be imaged from identical locations by multiple scanners/cameras, or else the data processing that combines the two types of data becomes extremely complex. As a compromise, the researchers created a system of optical tracking based on photogrammetry techniques that permits the collection and integration of geometric positioning data and multispectral textures utilizing precise targeting procedures. However, the researchers noted that most systems integrating multispectral photography with three-dimensional digitization tended to be quite bulky, did not adapt easily to different types of objects, and needed better processing algorithms for more complex three-dimensional objects (Chane et al. 2013).

Public Access to Three-Dimensionally Digitized Objects

Despite museums’ growing focus on increasing public access to collections via digitization (Given and McTavish 2010), there is very little literature addressing public access to three-dimensionally digitized objects. Indeed, studies in this realm tend to focus on the technological aspects of either the modeling of specific objects or collections or website viewing of three-dimensional models. For example, Abate et al. (2011) described the three-dimensional digitization of a particular statue from the scanning process to its ultimate depiction on a website. The researchers explained in detail the particular software architecture utilized in order to permit the remote rendering of the three-dimensional model on users’ computers via a Java applet without compromising quality or necessitating download of potentially copyrighted works. By contrast, literature concerning the Digital Michelangelo project, during which researchers three-dimensionally digitized various Michelangelo works, focused on the method used to create an accurate three-dimensional model, complete with color and texture mapping, and a visualization tool (Dellepiane et al. 2008).

One study did describe a project that was designed to place three-dimensional data about various cultural artifacts in an online repository for curators and other professionals (Hess et al. 2011). This repository was contained within database management software, a web-based interface was designed for searching, and user access to three-dimensional images and models was provided via an ActiveX plugin. Despite the potential of the prototype, however, it appears that the project has ceased,2 and the institution’s current three-dimensional imaging project is focused on the design of a traveling exhibition incorporating, among other things, three-dimensional models of artifacts and physical replicas created from such models.3

2 See http://www.ucl.ac.uk/museums/petrie/research/research-projects/3dpetrie/3d_projects/3d-projects-past/e-curator.

3 See http://www.3dencounters.com.

Studies that do address public access directly tend to focus on the improvement of museum websites generally.
For example, in terms of user expectations of museum websites, one study found that approximately 63 percent of visitors to a museum’s website did so in order to search the digital collection (Kravchyna and Hastings 2002). Another study found four types of museum website users, each of whom had different needs and expectations of sites. Relevantly, educators sought collections that were “the more realistic the better,” including suggestions like incorporating three-dimensional simulations of physical objects so that students could “explore the form, construction, texture and use of objects” (Cameron 2003, 335). Further, non-specialist users “value free choice learning” and “access online collections to explore and discover new things and build on their knowledge base as a form of entertainment” (Cameron 2003, 335). Similarly, some studies have addressed the incorporation of Web 2.0 technologies into museum websites. Srinivasan et al. (2009), for example, argue that Web 2.0 technologies must be integrated into museum catalogs rather than simply layered over existing records because users’ interest in objects is increased by participation in the descriptive practice. An implementation of this concept is found in Hunter and Gerber’s (2010) system of social tagging attached to three-dimensional models.

This paper is an effort to address the gap between the technical process of digitizing and presenting three-dimensional objects on the web and the user experience of those objects. Through the evaluation of five websites, this paper will provide some guidance for the digitization of three-dimensional objects and their presentation in digital collections for public access.

METHODOLOGY AND EVALUATIVE CRITERIA

Evaluations of digital museums are not as prevalent as evaluations of digital libraries. However, given the similar purposes of digital museums and digital libraries, it is appropriate to utilize similar criteria. For digital libraries, Saracevic (2000) synthesized evaluation criteria into performance questions in two broad areas: (a) user-centered questions, including how well the digital library supports the society or community served, how well it supports institutional or organizational goals, how well it supports individual users’ information needs, and how well the digital library’s interface provides access and interaction; and (b) system-centered questions, including hardware and network performance, processing and algorithm performance, and how well the content of the collection is selected, represented, organized, and managed. Xie (2008) focused on user-centered evaluation and found five general criteria that exemplified users’ own evaluations of digital libraries: interface usability, collection quality, service quality, system performance, and user satisfaction. Parandjuk (2010) used information architecture to construct criteria for the evaluation of a digital library, including the following:

• uniformity of standards, including consistency among webpages and individual records;
• findability, including ease of use and multiple ways to access the same information;
• sub-navigation, including indexes, sitemaps, and guides;
• contextual navigation, including simplified searching and co-location of different types of resources;
• language, including consistency in labeling across pages and records and appropriateness for the audience; and
• integration of searching and browsing.

This system is particularly appropriate in the context of digital museums, as it emphasizes the curatorial or organizational aspect of the collection in order to support learning objectives. In one comprehensive evaluation of the websites of art museums, Pallas and Economides (2008) created a framework for such evaluation, incorporating six dimensions: content, presentation, usability, interactivity and feedback, e-services, and technical. Each dimension then contained several specific criteria. Many of the criteria overlapped, however; three-dimensional imaging, for example, was placed within the e-services dimension, under virtual tours, although it could have been placed within presentation, with other multimedia criteria, or even within interactivity, with interactive multimedia applications.

The problem in trying to evaluate a particular part of a museum’s website, namely the way it presents three-dimensional objects in digital form, is that the level of specificity almost renders many of the evaluation criteria from previous studies irrelevant. As Hariri and Norouzi (2011) suggest, evaluation criteria should be based on the objective of the evaluation. Hence, based on portions of the above-referenced studies, this author has created a more focused evaluation framework, concentrating on criteria that are particularly relevant to museums’ digital presentations of three-dimensional objects. This framework is detailed in table 2, below.

Functionality: What technology is used to display the object? How well does it work? Must programs or files be downloaded? Are the loading times of displays acceptable?

Usability: How easy is the site to use? What is the navigation system? Are there searching and browsing functions, and how well does each work? How findable are individual objects?

Presentation: How does the display of the object look? What is the context in which the object is presented? Are there multiple viewing options? Is there any interactivity permitted?

Content: Does the site provide an adequate collection of objects? For individual objects, is there sufficient information provided? Is there additional educational content?

Table 2. Summary of evaluative criteria

Five digital collections, specified below, will be evaluated based on these criteria. This will be done in a case study manner, describing each website based on the above criteria and then using those evaluations to make suggestions for best practices.

RESULTS

It is difficult to compare different types of digital collections, particularly when the focus is on different types of technology utilized to display similar objects. However, because the goal here is to determine the best practices for the digital presentation of three-dimensional objects, it is important to evaluate a variety of techniques in a variety of fields. Thus, the following digital collections have been chosen to illustrate different ways in which such objects can be displayed on a website.
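Before turning to the individual sites, it may help to see how ratings on this rubric can be recorded for comparison. The following is a minimal Python sketch using the ratings ultimately reported in table 3, below; the numeric scale mapping the ordinal ratings to integers is this sketch’s own assumption, not part of the framework.

```python
# Sketch only: encoding the four-dimension rubric with the ratings from table 3.
# The ordinal scale (Low=1 ... Very High=4) is an assumption for comparison.
SCALE = {"Low": 1, "Medium": 2, "High": 3, "Very High": 4}

ratings = {
    "MFA": {"Functionality": "Medium", "Usability": "Very High",
            "Presentation": "Low", "Content": "Very High"},
    "DFL": {"Functionality": "Medium", "Usability": "High",
            "Presentation": "Medium", "Content": "High"},
    "Eton Myers": {"Functionality": "Low", "Usability": "Low",
                   "Presentation": "Low", "Content": "Low"},
    "Epigraphia 3D": {"Functionality": "Very High", "Usability": "Medium",
                      "Presentation": "Very High", "Content": "Medium"},
    "Smithsonian X 3D": {"Functionality": "High", "Usability": "Medium",
                         "Presentation": "High", "Content": "Medium"},
}

# Report the top-rated site on each dimension (first site wins ties).
for dimension in ("Functionality", "Usability", "Presentation", "Content"):
    best = max(ratings, key=lambda site: SCALE[ratings[site][dimension]])
    print(dimension, "->", best)
```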
Museum of Fine Arts, Boston (MFA) (http://www.mfa.org/collections)

The MFA, both in person and online, boasts a comprehensive and extensive collection of art and historical artifacts of varying forms. The website is very easy to navigate, with well-defined browsing options and easy search capabilities, allowing for refinement of results by collection or type of item. There are many collections, which are well organized and curated into separate exhibits and galleries. In addition, when viewing each gallery, suggestions are linked for related online exhibitions as well as tours and exhibits at the physical museum. Each item record contains a detailed description of the item as well as its provenance. Thus, the MFA website attains a very high rating for usability and content. However, individual items are represented by only single pictures of varying quality. Some pictures are color, some are black and white, and no two pictures appear to have the same lighting. Additionally, despite being slow to load, even the pictures that appear to be of the best quality cannot be of high resolution, as zooming in makes them slightly blurry. Accordingly, the MFA website receives a medium rating for functionality and a low rating for presentation.

Digital Fish Library (DFL) (http://www.digitalfishlibrary.org/index.php)

The DFL project is a comprehensive program that utilizes MRI scanning to digitize preserved biological fish samples from a particular collection housed at the Scripps Institution of Oceanography. After MRI scans of a specimen are taken, the data is processed and translated into various views that are placed on the website, accompanied by information about each species (Berquist et al. 2012). Navigating the DFL website is very intuitive, as the individual specimen records are organized by taxonomy. It is easy to search for particular species or browse through the clickable, pictorial interface. Records for each species include detailed information about the individual specimen, the specifics of the scans used to image each, and broader information about the species. Individual records also provide links to other species within the taxonomic family. Thus, the DFL website attains high ratings in both usability and content. For functionality and presentation, however, the ratings are medium. Although for each item there are videos and still images obtained from three-dimensional volume renderings and MRI scans, they are small in size and have low resolution. There is no interactive component, with the possible exception of the “digital fish viewer” that supposedly requires Java, but this author could not get it to work despite best efforts. One nice feature, shown in figure 1 below, is that some of the specimen records have three-dimensional renderings showing and explaining the internal structures of the species.
Figure 1. Annotated three-dimensional rendering of internal structures of a hammerhead shark, from the Digital Fish Library (http://www.digitalfishlibrary.org/library/ViewImage.php?id=2851)

The Eton Myers Collection (http://etonmyers.bham.ac.uk/3D-models.html)

The Eton Myers Collection of ancient Egyptian art is housed at Eton College, and a project to three-dimensionally digitize the items for public access was undertaken via collaboration between that institution and the University of Birmingham. Digitization was accomplished with three-dimensional laser scanners, data was then processed with Geomagic software to produce point cloud and mesh forms, and individual datasets were reduced in size and converted into an appropriate file type to allow for public access (Chapman, Gaffney, and Moulden 2010).

Usability of the Eton Myers Collection website is extremely low. The initial interface is simply a list of three-dimensional models by item number with a description of how to download the appropriate program and files. Another website from the University of Birmingham (http://mimsy.bham.ac.uk/info.php?f=option8&type=browse&t=objects&s=The+Eton+Myers+Collection) contains a more museum-like interface, but contains many more records for objects than are contained on the initial list of three-dimensional models. Moreover, most of the records do not even include pictures of the items, let alone links to the three-dimensional models, and the records that do include pictures do not necessarily include such links. Even when a record has a link to the three-dimensional model, it actually redirects to the full list of models rather than to the individual item. There is no search functionality from the initial list of three-dimensional models, and no way to browse other than to, colloquially speaking, poke and hope. Individual items are only identified by item number, and, aside from the few records that have accompanying pictures on the University of Birmingham site, there is no way to know to what item any given number refers. The website attains only a low rating for content; although it seems that there may be a decent number of items in the collection, it is impossible to know for certain given the problems with the interface and the fact that individual items are virtually unidentified.

The Eton Myers Collection website also receives a low rating for functionality. In order to access three-dimensional models of items, users must download and install a program called MeshLab, then download individual folders of compressed files, then unzip those files, and finally open the appropriate file in MeshLab. Despite compression, some of the file folders are still quite large and take some time to download. Presentation of the items is also rated low. Even for the high-resolution versions of the three-dimensional renderings, viewed in MeshLab, the geometry of the objects seems underdeveloped (e.g., hieroglyphics are illegible) and surface textures are not well mapped (e.g., colors are completely off). This is evident from a comparison of the three-dimensional rendering with a two-dimensional photograph of the same item, as in figure 2, below.
Figure 2. Comparison of original photograph (left) and three-dimensional rendering (right) of Item Number ECM 361, from the Eton Myers Collection (http://mimsy.bham.ac.uk/detail.php?t=objects&type=ext&f=&s=&record=0&id_number=ecm+361&op-earliest_year=%3D&op-latest_year=%3D)

Notably, Chapman, Gaffney, and Moulden (2010) indicate that the detailed three-dimensional imaging enabled them to identify tooling marks and read previously unclear hieroglyphics on certain items. Thus, it is possible that the problems with the renderings may be a result of a loss in quality between the original models and the downloaded versions, particularly given that the files were reduced in size and converted prior to being made available for download.

Epigraphia 3D Project (http://www.epigraphia3d.es)

The Epigraphia 3D project was created to present an online collection of various historical Roman epigraphs (also known as inscriptions) that were discovered and excavated in Spain and Italy; the physical collection is housed at the Museo Arqueológico Nacional (Madrid). Digital imaging was accomplished using photogrammetry, free software was utilized to create three-dimensional object models and renderings, and Photoshop was used to obtain appropriate textures. Finally, the three-dimensional models were published on the web using Sketchfab, a web service similar to Flickr that allows in-browser viewing of three-dimensional renderings in many different formats (Ramírez-Sánchez et al. 2014).

The Epigraphia 3D website is intuitive and informative. Browsing is simple because there are not many records, but, although it is possible to search the website, there is no search function specifically directed to the collection. Thus, usability is rated as medium. Despite the fact that the website provides descriptions of the project and the collection, as well as information about epigraphs generally, the website attains a medium rating for content in light of the small size of the collection and the limited information given for each individual item. However, the Epigraphia 3D website receives very high ratings for functionality and presentation. The individual three-dimensional models are detailed, legible, and interactive. Individual inscriptions are transcribed for each item. The use of Sketchfab to display the models is effective; no downloading is necessary, and models load in an acceptable amount of time. When viewing an item, users can rotate the object in either “orbit” or “first person” mode, as well as view it full-screen or within the browser window. Users can also display the wireframe model and the textured or surfaced rendering, as shown in figure 3 below.

Figure 3. Three-dimensional textured (left) and wireframe (middle) renderings from the Epigraphia 3D project (http://www.epigraphia3d.es/3d-01.html), as compared to an original two-dimensional photograph of the same object (right) (http://eda-bea.es/pub/record_card_1.php?refpage=%2Fpub%2Fsearch_select.php&quicksearch=dapynus&rec=19984)
Smithsonian X 3D (http://3d.si.edu)

The Smithsonian X 3D project, although affiliated with all of the Smithsonian’s varying divisions, was created to test the application of three-dimensional digitization techniques to “iconic collection objects” (http://3d.si.edu/about). The website provides significant detail concerning the project itself, mostly in the form of videos, and individual items, many of which are linked to “tours” that incorporate a story about the object. Content is rated as medium because, despite the depth of information provided about individual items, there are still very few items within the collection. The website also receives a medium rating for usability, given the simple browsing structure, easy navigation, and lack of a search feature (all likely due at least in part to the limited content). Functionality and presentation, however, are rated high. The X3D Explorer in-browser software (powered by Autodesk) does more than simply display a three-dimensional rendering of an object; it also permits users to edit the model by changing color, lighting, texture, and other variables, and it incorporates detailed information about each item, both as an overall description and as a slide show in which snippets of information are connected to specific views of the item. The individual three-dimensional models are high resolution, detailed, and well-rendered, with very good surface texture mapping. However, it must be noted that the X3D Explorer tool is in beta and, as such, still has some bugs; for example, this author has observed a model disappear while zooming in on the rendering.

Table 3, below, summarizes the results of the evaluation.

Website | Functionality | Usability | Presentation | Content
MFA | Medium | Very High | Low | Very High
DFL | Medium | High | Medium | High
Eton Myers | Low | Low | Low | Low
Epigraphia 3D | Very High | Medium | Very High | Medium
Smithsonian X 3D | High | Medium | High | Medium

Table 3. Summary of evaluation results for each website by individual criteria

DISCUSSION

Based on the evaluation of the five websites described above, some suggested best practices for the digitization and presentation of three-dimensional objects become apparent. When digitizing, a museum should utilize the method that best suits the object or collection. For example, while MRI scanning is likely the best method for three-dimensionally digitizing biological fish specimens, it is not going to be effective or feasible for digitizing artwork or artifacts (Abel et al. 2011; Berquist et al. 2012). Regardless of the method of digitization used, however, the people conducting the imaging and processing should fully comprehend the hardware and software necessary to complete the task. Additionally, although financial restraints must be considered, museums should note that some three-dimensional scanning equipment is just as economically feasible as standard digital cameras (Metallo and Rossi 2011).
However, if a museum chooses to utilize only two-dimensional imaging, each item should be photographed from multiple angles in high resolution, to avoid creating a website, like the MFA’s, on which everything other than the object itself is presented outstandingly. Further, museums deciding on two-dimensional imaging should explore the possibility of utilizing photogrammetry to create three-dimensional models from their two-dimensional photographs, like the Epigraphia 3D project. There is free or inexpensive software that permits the creation of three-dimensional object maps from very few photographs (Ramírez-Sánchez et al. 2014). Finally, compatibility is a key issue when conducting three-dimensional scans; the museum should ensure that the software used for rendering models is compatible with the way in which users will be viewing the models.

In the context of public access to the museum’s digital collections, the website should be easy and intuitive to navigate. The MFA website is an excellent example; browsing and search functions should both be present, and reorganization of large numbers of objects into separate collections may be necessary. Where searching is going to be the primary point of entry into the collection, it is important to have sufficient metadata and functional search algorithms to ensure that item records are findable. Furthermore, the website is simply a way to access the museum itself; hence the collections on the website, like the collections in the physical museum, should be curated, with a logical flow to accessing object records. The museum may also want to have sections that are similar to virtual exhibitions, like the “tours” provided by the Smithsonian X 3D project. Finally, museums should ensure that no additional technological know-how (beyond being able to access the internet) is required to access the three-dimensional content in object records. Users should not be required to download software or files to view records; Epigraphia 3D’s use of Sketchfab and the Smithsonian’s X 3D Explorer tool are both excellent examples of ways in which three-dimensional content can be viewed on the web without the need for extraneous software.

Museums and cultural heritage institutions are increasing their focus on providing public access to collections via digitization and display on websites (Given and McTavish 2010). To help them do this effectively, this paper has attempted to provide some guidance on best practices for presenting digital versions of three-dimensional objects. In closing, however, it must be noted that this author is not a technician. Although this paper has tried to contend with the issues from the perspective of a librarian, there are complicated technical concerns behind any digitization project that have not been adequately addressed. In addition, this paper has not examined the role of budgetary constraints on digitization or the concomitant issues of creating and maintaining websites. Moreover, because this paper has been treated as a broad overview of the digitization and presentation for public access of three-dimensional objects, the five websites evaluated were from varying fields of study.
Museums should look to more specific comparisons in order to appropriately digitize and present their collections on the web.

CONCLUSION

There may not be a direct substitute for encountering an object in person, but for people who cannot obtain physical access to three-dimensional objects, the digital realm can serve as an adequate proxy. This paper has demonstrated, through an evaluation of five distinct digital collections, that utilizing three-dimensional imaging and presenting three-dimensional models of physical objects on the web can serve the important purpose of increasing public access to otherwise unavailable collections.

REFERENCES

Abate, D., R. Ciavarella, G. Furini, G. Guarnieri, S. Migliori, and S. Pierattini. “3D Modeling and Remote Rendering Technique of a High Definition Cultural Heritage Artefact.” Procedia Computer Science 3 (2011): 848–52. http://dx.doi.org/10.1016/j.procs.2010.12.139.

Abel, R. L., S. Parfitt, N. Ashton, Simon G. Lewis, Beccy Scott, and C. Stringer. “Digital Preservation and Dissemination of Ancient Lithic Technology with Modern Micro-CT.” Computers and Graphics 35, no. 4 (August 2011): 878–84. http://dx.doi.org/10.1016/j.cag.2011.03.001.

Berquist, Rachel M., Kristen M. Gledhill, Matthew W. Peterson, Allyson H. Doan, Gregory T. Baxter, Kara E. Yopak, Ning Kang, H. J. Walker, Philip A. Hastings, and Lawrence R. Frank. “The Digital Fish Library: Using MRI to Digitize, Database, and Document the Morphological Diversity of Fish.” PLoS ONE 7, no. 4 (April 2012). http://dx.doi.org/10.1371/journal.pone.0034499.

Bincsik, Monika, Shinya Maezaki, and Kenji Hattori. “Digital Archive Project to Catalogue Exported Japanese Decorative Arts.” International Journal of Humanities and Arts Computing 6, no. 1–2 (March 2012): 42–56. http://dx.doi.org/10.3366/ijhac.2012.0037.

Cameron, Fiona. “Digital Futures I: Museum Collections, Digital Technologies, and the Cultural Construction of Knowledge.” Curator: The Museum Journal 46, no. 3 (July 2003): 325–40. http://dx.doi.org/10.1111/j.2151-6952.2003.tb00098.x.

Chane, Camille Simon, Alamin Mansouri, Franck S. Marzani, and Frank Boochs. “Integration of 3D and Multispectral Data for Cultural Heritage Applications: Survey and Perspectives.” Image and Vision Computing 31, no. 1 (January 2013): 91–102. http://dx.doi.org/10.1016/j.imavis.2012.10.006.

Chapman, Henry P., Vincent L. Gaffney, and Helen L. Moulden. “The Eton Myers Collection Virtual Museum.” International Journal of Humanities and Arts Computing 4, no. 1–2 (October 2010): 81–93. http://dx.doi.org/10.3366/ijhac.2011.0009.

Dellepiane, M., M. Callieri, F. Ponchio, and R. Scopigno. “Mapping Highly Detailed Colour Information on Extremely Dense 3D Models: The Case of David’s Restoration.” Computer Graphics Forum 27, no. 8 (December 2008): 2178–87. http://dx.doi.org/10.1111/j.1467-8659.2008.01194.x.

Given, Lisa M., and Lianne McTavish. “What’s Old Is New Again: The Reconvergence of Libraries, Archives, and Museums in the Digital Age.” Library Quarterly 80, no. 1 (January 2010): 7–32. http://dx.doi.org/10.1086/648461.

Hariri, Nadjla, and Yaghoub Norouzi. “Determining Evaluation Criteria for Digital Libraries’ User Interface: A Review.” The Electronic Library 29, no. 5 (2011): 698–722. http://dx.doi.org/10.1108/02640471111177116.

Hess, Mona, Francesca Simon Millar, Stuart Robson, Sally MacDonald, Graeme Were, and Ian Brown. “Well Connected to Your Digital Object?
E-curator: A Web-Based E-Science Platform for Museum Artefacts.” Literary and Linguistic Computing 26, no. 2 (2011): 193–215. http://dx.doi.org/10.1093/llc/fqr006.

Holovachov, Oleksandr, Andriy Zatushevsky, and Ihor Shydlovsky. “Whole-Drawer Imaging of Entomological Collections: Benefits, Limitations and Alternative Applications.” Journal of Conservation and Museum Studies 12, no. 1 (2014): 1–13. http://dx.doi.org/10.5334/jcms.1021218.

Hunter, Jane, and Anna Gerber. “Harvesting Community Annotations on 3D Models of Museum Artefacts to Enhance Knowledge, Discovery and Re-Use.” Journal of Cultural Heritage 11, no. 1 (2010): 81–90. http://dx.doi.org/10.1016/j.culher.2009.04.004.

Jarrell, Michael C. “Providing Access to Three-Dimensional Collections.” Reference & User Services Quarterly 38, no. 1 (1998): 29–32.

Kravchyna, Victoria, and Sam K. Hastings. “Informational Value of Museum Web Sites.” First Monday 7, no. 4 (February 2002). http://dx.doi.org/10.5210/fm.v7i2.929.

Kuzminsky, Susan C., and Megan S. Gardiner. “Three-Dimensional Laser Scanning: Potential Uses for Museum Conservation and Scientific Research.” Journal of Archaeological Science 39, no. 8 (August 2012): 2744–51. http://dx.doi.org/10.1016/j.jas.2012.04.020.

Lerma, José Luis, and Colin Muir. “Evaluating the 3D Documentation of an Early Christian Upright Stone with Carvings from Scotland with Multiples Images.” Journal of Archaeological Science 46 (June 2014): 311–18. http://dx.doi.org/10.1016/j.jas.2014.02.026.

Louw, Marti, and Kevin Crowley. “New Ways of Looking and Learning in Natural History Museums: The Use of Gigapixel Imaging to Bring Science and Publics Together.” Curator: The Museum Journal 56, no. 1 (January 2013): 87–104. http://dx.doi.org/10.1111/cura.12009.

Metallo, Adam, and Vince Rossi. “The Future of Three-Dimensional Imaging and Museum Applications.” Curator: The Museum Journal 54, no. 1 (January 2011): 63–69. http://dx.doi.org/10.1111/j.2151-6952.2010.00067.x.

Montani, Isabelle, Eric Sapin, Richard Sylvestre, and Raymond Marquis. “Analysis of Roman Pottery Graffiti by High Resolution Capture and 3D Laser Profilometry.” Journal of Archaeological Science 39, no. 11 (2012): 3349–53. http://dx.doi.org/10.1016/j.jas.2012.06.011.

Newell, Jenny. “Old Objects, New Media: Historical Collections, Digitization and Affect.” Journal of Material Culture 17, no. 3 (September 2012): 287–306. http://dx.doi.org/10.1177/1359183512453534.

Novati, Gianluca, Paolo Pellegri, and Raimondo Schettini. “An Affordable Multispectral Imaging System for the Digital Museum.” International Journal on Digital Libraries 5, no. 3 (May 2005): 167–78. http://dx.doi.org/10.1007/s00799-004-0103-y.

Pallas, John, and Anastasios A. Economides. “Evaluation of Art Museums’ Web Sites Worldwide.” Information Services and Use 28, no. 1 (2008): 45–57. http://dx.doi.org/10.3233/ISU-2008-0554.
Parandjuk, Joanne C. “Using Information Architecture to Evaluate Digital Libraries.” The Reference Librarian 51, no. 2 (2010): 124–34. http://dx.doi.org/10.1080/02763870903579737.

Pavlidis, George, Anestis Koutsoudis, Fotis Arnaoutoglou, Vassilios Tsioukas, and Christodoulos Chamzas. “Methods for 3D Digitization of Cultural Heritage.” Journal of Cultural Heritage 8, no. 1 (2007): 93–98. http://dx.doi.org/10.1016/j.culher.2006.10.007.

Ramírez-Sánchez, Manuel, José-Pablo Suárez-Rivero, and María-Ángeles Castellano-Hernández. “Epigrafía digital: tecnología 3D de bajo coste para la digitalización de inscripciones y su acceso desde ordenadores y dispositivos móviles” [Digital epigraphy: low-cost 3D technology for digitizing inscriptions and accessing them from computers and mobile devices]. El Profesional de la Información 23, no. 5 (2014): 467–74. http://dx.doi.org/10.3145/epi.2014.sep.03.

Saracevic, Tefko. “Digital Library Evaluation: Toward an Evolution of Concepts.” Library Trends 49, no. 3 (2000): 350–69.

Srinivasan, Ramesh, Robin Boast, Jonathan Furner, and Katherine M. Becvar. “Digital Museums and Diverse Cultural Knowledges: Moving past the Traditional Catalog.” The Information Society 25, no. 4 (2009): 265–78. http://dx.doi.org/10.1080/01972240903028714.

Xie, Hong Iris. “Users’ Evaluation of Digital Libraries (DLs): Their Uses, Their Criteria, and Their Assessment.” Information Processing and Management 44, no. 3 (May 2008): 1346–73. http://dx.doi.org/10.1016/j.ipm.2007.10.003.

Analyzing Digital Collections Entrances: What Gets Used and Why It Matters

Paromita Biswas and Joel Marchesoni

Paromita Biswas (pbiswas@email.wcu.edu) is Metadata Librarian and Joel Marchesoni (jmarch@email.wcu.edu) is Technology Support Analyst, Hunter Library, Western Carolina University, Cullowhee, North Carolina.

INFORMATION TECHNOLOGY AND LIBRARIES | DECEMBER 2016

ABSTRACT

This paper analyzes usage data from Hunter Library’s digital collections, gathered with Google Analytics over a period of twenty-seven months from October 2013 through December 2015. The authors consider this analysis important for identifying the collections that receive the largest number of visits, and they argue that such evaluation can better inform decisions about building digital collections that will serve user needs. The authors also study the benefits of harvesting to sites such as the Digital Public Library of America, and they believe this paper will contribute to the literature on Google Analytics and its use by libraries.

INTRODUCTION

Hunter Library at Western Carolina University (WCU) has fourteen digital collections hosted in CONTENTdm, a digital collection management system from OCLC.
Users can enter the collections in various ways: through the Library’s CONTENTdm landing pages,1 search engines, or sites such as the Digital Public Library of America (DPLA), where all the collections are harvested.2 Since October 2013, the Library has collected usage data from its collections’ websites and from DPLA referrals via Google Analytics. This paper analyzes this usage data covering a period of approximately twenty-seven months from October 2013 through December 2015. The authors consider this data analysis important for identifying collections receiving the largest number of visits, including visits through harvesting sites such as the DPLA. The authors argue that such data evaluation is important because it can better inform decisions taken to build collections that will attract users and serve their needs. Additionally, this analysis of usage data generated from harvesting sites such as the DPLA demonstrates the usefulness of harvesting in increasing digital collections’ usage. Lastly, this paper contributes to the broader literature on Google Analytics and its use by libraries in data analysis.

LITERATURE REVIEW

Using Google Analytics to study usage of electronic resources is common; a considerable amount of material exists describing the use of Google Analytics in marketing and business fields.3 However, the published literature offers little about the use of this software for studying usage of collections consisting of unique materials digitized and placed online by libraries and cultural heritage organizations. For example, Betty has written about using Google Analytics to track statistics for user interaction with librarian-created digital media such as quizzes and video tutorials.4 Fang discusses using Google Analytics to track the behavior of users who visited the Rutgers-Newark Law Library website.5 Fang looked at the number of visitors, what and how many pages they visited, how long they stayed on each page, where they were coming from, and which search engine or website had referred them to the library’s website. Findings were evaluated and used to make improvements to the library’s website; for example, Fang mentions using Google Analytics data to track the percentage of new and returning visitors before and after the website redesign.

Among articles that discuss using web analytics to learn how users access digital collections, most have focused on a comparison between third-party platforms, online search engines, and the traditional library catalog to find preferred modes of access and whether results call for a shift in how libraries share their digital collections. For example, in their article on the impact of social media platforms such as HistoryPin and Pinterest on the discovery and access of digital collections, Baggett and Gibbs use Google Analytics for tracking usage of digital objects on the library’s website as well as statistics collected from HistoryPin’s and Pinterest’s first-party analytics tools.6 The authors conclude that while neither HistoryPin nor Pinterest drive users back to the library’s website, they help in the discovery of digital collections and can enhance user access to library collections.
Schlosser and Stamper compare the effects on usage of a collection housed in an institutional repository and reposted on Flickr.7 For them, whether housing a collection on a third-party site had an adverse effect on attracting traffic to the library’s website was not as important as ensuring users accessed the collection somewhere. Likewise, O’English demonstrates how data from web analytics were used to compare access to archival materials via online search engines as opposed to library catalogs using MARC records for descriptions.8 O’English argues library practices should change accordingly to promote patron access and use. Ladd’s article on the access and use of a digital postcard collection from Miami University uses statistics from Google Analytics, CONTENTdm, and Flickr over a period of one year.9 Ladd’s findings reveal that few users came to the main digital collections website to search and browse; instead, most arrived via external sources such as search engines and social media sites. The resulting increase in views, Ladd asserts, makes regular updates in both CONTENTdm and Flickr important for promoting access and use of the postcards.

Articles on using Google Analytics for tracking digital collection usage have also explored tracking the geographic base of users. For example, Herold uses Google Analytics to demonstrate usage of a digital archival collection by users at institutional, national, and international levels.10 Herold looks at server transaction logs maintained in Google Analytics, on- and off-campus searching counts, user locations, and repeat visitors to the archival images representing cultural heritage materials related to the Orang Asli peoples and cultures of Malaysia. She uses these data to ascertain the number of users by geographic region, determining that, while most visitors came from the United States, Malaysia ranked second. According to Herold, the data showed that this particular digital collection was able to reach another target audience: users from Malaysia. Herold’s findings indicate that digitization of unique materials makes them available to a worldwide audience.
Additionally, institutions that harvest metadata to the DPLA get value-added services like geocoding of location- based metadata and expression of contributed metadata as linked data. Data Collection Parameters Hunter Library digital collections usage data included information on item views16 and referrals17 for each of the collections including DPLA referrals. The authors also considered keyword search terms18 across all referrals, and within CONTENTdm specifically, that brought users to the Library’s collections. The authors considered the most frequently occurring keywords to be representing the subjects of collections that were most used. Repeat visitors to the Library’s digital collections’ website were also tracked. Finally, sessions19 were traced by the geographic area20 of the users. Hunter Library’s collections vary in size. The Library’s largest and one of the oldest collections, Craft Revival [Note: collections are set in roman and capitalized] showcases documents, photographs, and craft objects housed in Hunter Library and smaller regional institutions. The collection’s items represent the late nineteenth and early twentieth century (1890s–1940s) Craft Revival movement in Western North Carolina, which was characterized by a renewed interest in handmade objects, including Cherokee arts and crafts. The Craft Revival collection began in 2005 and includes 1,982 items. The second largest collection, Great Smoky Mountains, which highlights efforts that went into the establishment of the park and includes photographs on the landscape and flora and fauna in the park, began in 2012 and consists of 1,829 items. Not all digital ANALYZING DIGITAL COLLECTIONS ENTRANCES: WHAT GETS USED AND WHY IT MATTERS | BISWAS AND MARCHESONI | https://doi.org/10.6017/ital.v35i4.9446 22 collections were harvested to the DPLA at the same time. While some older collections were harvested to the DPLA in 2013, smaller, institution-specific collections started later were also harvested later. For example WCU—Oral Histories, a collection of interviews collected by students of one of WCU’s history classes documenting the history and culture of Western North Carolina and the lives of WCU athletes or artists’ like Josephina Niggli who taught drama at WCU; Highlights from WCU, a collection of unique items from WCU’s Mountain Heritage Center and other departments on campus, including letters from the Library’s Special Collections transcribed by WCU’s English department students; and WCU—Fine Art Museum, showcasing art work from the university’s Fine Art Museum, were harvested to the DPLA in 2015. As these smaller collections were started later, their total item views and referral counts would likely be less than some of the Library’s older collections; however, these newer collections were included as they might provide valuable data regarding harvesting referrals and returning visitors. Table 1 shows the years the collections were started, the number of items included in each collection, and the year they were harvested to the DPLA. 
Collection Name | Start Year | Collection Size (Number of Items) | Harvested Since
Cherokee Traditions | 2011 | 332 | 2013
Civil War | 2011 | 68 | 2013
Craft Revival | 2005 | 1,982 | 2013
Great Smoky Mountains | 2013 | 1,829 | 2013
Highlights from WCU | 2015 | 39 | 2015
Horace Kephart | 2005 | 552 | 2013
Picturing Appalachia | 2012 | 972 | 2013
Stories of Mountain Folk | 2012 | 374 | 2013
Travel Western North Carolina | 2011 | 160 | 2013
WCU—Fine Art Museum | 2015 | 87 | 2015
WCU—Herbarium | 2013 | 91 | 2013
WCU—Making Memories | 2012 | 408 | 2013
WCU—Oral Histories | 2015 | 67 | 2015
Western North Carolina Regional Maps | 2015 | 37 | 2015

Table 1. Collections by year

Collecting Data Using Google Analytics

The Library has had Google Analytics set up on online exhibits (websites outside of CONTENTdm that provide additional insight into a collection) since 2008 and began using Google Analytics to track its CONTENTdm materials with the 6.1.2 release in October 2013. CONTENTdm version 6.4 introduced a configuration field that allowed the authors to enter a Google Analytics ID and automatically generate the tracking code in pages, simplifying the setup. Following that software update, OCLC made Google Analytics the default data logging mechanism.

The Library set up Google Analytics such that online exhibits are tracked together with their CONTENTdm collections. This is accomplished by using custom tracking on all webpages and a custom script in CONTENTdm, which allows the Library to link its CONTENTdm and wcu.edu domains within Google Analytics so that sessions can be viewed across all online digital collections.

Data were collected from Google Analytics using several tools. Google provides an online tool called Query Explorer (https://ga-dev-tools.appspot.com/query-explorer/) that can create and execute custom searches against Google Analytics. This application was used to craft the queries. Microsoft Excel was primarily used to download data, using the custom plugin Rest to Excel Library (http://ramblings.mcpher.com/Home/excelquirks/json/rest) to parse information from Google Analytics into worksheets. The Excel add-on works well but requires knowledge of Microsoft Visual Basic for Applications (VBA) programming to use effectively. This limitation prompted the authors to look for a simpler way of retrieving data. The authors settled on OpenRefine (https://github.com/OpenRefine/OpenRefine) to collect, sort, and filter data, with Excel used for results analysis. Once in Excel, formulas were used to mine data for specific targets.
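For readers who prefer scripted retrieval over Query Explorer and spreadsheet plugins, comparable data can also be pulled programmatically. The following is a minimal Python sketch, not the authors’ actual workflow, using the Google Analytics Reporting API v4 (the successor to the API generation behind Query Explorer, since retired along with Universal Analytics); the credential file and view ID are placeholders, and the dimension and metric names mirror the referral and keyword measures discussed in this paper.

```python
# Sketch only: querying Google Analytics (Universal Analytics Reporting API v4)
# for sessions and pageviews broken down by referral source and keyword.
# KEY_FILE and VIEW_ID are placeholders, not the Library's real values.
from google.oauth2 import service_account
from googleapiclient.discovery import build

KEY_FILE = "service-account.json"
VIEW_ID = "XXXXXXXX"
SCOPES = ["https://www.googleapis.com/auth/analytics.readonly"]

credentials = service_account.Credentials.from_service_account_file(KEY_FILE, scopes=SCOPES)
analytics = build("analyticsreporting", "v4", credentials=credentials)

response = analytics.reports().batchGet(body={
    "reportRequests": [{
        "viewId": VIEW_ID,
        "dateRanges": [{"startDate": "2013-10-01", "endDate": "2015-12-31"}],
        "metrics": [{"expression": "ga:sessions"}, {"expression": "ga:pageviews"}],
        "dimensions": [{"name": "ga:source"}, {"name": "ga:keyword"}],
    }]
}).execute()

# Each row pairs a (source, keyword) combination with its session/pageview counts.
for row in response["reports"][0]["data"].get("rows", []):
    source, keyword = row["dimensions"]
    sessions, pageviews = row["metrics"][0]["values"]
    print(source, keyword, sessions, pageviews)
```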
RESULTS ANALYSIS

The data collected using Google Analytics spanned a period of approximately twenty-seven months, from October 2013 through December 2015. Table 2 and graph 1 show each collection’s item views, item referrals, and size (number of items in the collection). These numbers were calculated for each collection as a percentage of total item views, total item referrals, and total number of items for all collections together (a short sketch of this calculation follows graph 1). In table 2, the top five collections in terms of item views and referrals are highlighted. Graph 1, a graphical representation of table 2, displays more starkly the differences between collections in terms of views and referrals.

Collection Name | Item Views as Percentage of Total Views | Item Referrals as Percentage of Total Referrals | Number of Items as Percentage of Total Items for All Collections
Cherokee Traditions | 6.38 | 6.12 | 4.74
Civil War | 1.89 | 0.88 | 0.97
Craft Revival | 41.35 | 52.39 | 28.32
Great Smoky Mountains | 7.50 | 6.34 | 26.14
Highlights from WCU | 0.23 | 0.08 | 0.56
Horace Kephart | 11.67 | 7.62 | 7.89
Picturing Appalachia | 10.03 | 9.99 | 13.89
Stories of Mountain Folk | 3.51 | 2.45 | 5.34
Travel Western North Carolina | 7.87 | 9.57 | 2.29
WCU—Fine Art Museum | 0.19 | 0.08 | 1.24
WCU—Herbarium | 0.71 | 0.45 | 1.30
WCU—Making Memories | 7.13 | 2.64 | 5.83
WCU—Oral Histories | 0.80 | 1.08 | 0.96
Western North Carolina Regional Maps | 0.26 | 0.11 | 0.53
Total | 100.00 | 100.00 | 100.00

Table 2. Collections by percentage

Graph 1. Collections by percentage
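As a worked example of the percentage calculation described above, each collection’s raw count is divided by the total across all collections. The raw view counts in this sketch are invented for illustration; only the method reflects the text.

```python
# Sketch only: deriving Table 2-style percentages from raw counts.
# The raw view counts below are invented for illustration.
raw_views = {
    "Craft Revival": 4135,
    "Horace Kephart": 1167,
    "Great Smoky Mountains": 750,
}
total_views = sum(raw_views.values())
for collection, views in raw_views.items():
    share = 100 * views / total_views
    print(f"{collection}: {share:.2f}% of item views")
```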
As demonstrated in the preceding table and graph, Craft Revival, one of the Library’s oldest and largest collections, contributes more than 28 percent of all digital collections’ items and garners close to 42 percent of all item views and 53 percent of all item referrals. Great Smoky Mountains, the second largest collection, contributes a little more than 26 percent of items but receives only about 8 percent of all item views and 7 percent of all referrals. The Horace Kephart collection, focusing on the life and works of Horace Kephart (author, librarian, and outdoorsman who made the mountains of Western North Carolina his home later in life), is the Library’s fourth largest collection. It receives almost 12 percent of all item views and about 8 percent of all item referrals. Picturing Appalachia, the third largest collection, consisting of photographs showcasing the history, culture, and natural landscape of Southern Appalachia in the Western North Carolina region, makes up 14 percent of items and receives approximately 10 percent of all referrals and views. Travel Western North Carolina, comprising visual journeys through three generations of Western North Carolina communities, contributes fewer than 3 percent of items but scores high on both item views and referrals. WCU—Making Memories, which highlights the people, buildings, and events from WCU’s history, and Stories of Mountain Folk (SOMF), a collection of radio programs from the Western North Carolina nonprofit Catch the Spirit of Appalachia archived at Hunter Library, are similar in size, and each receives fewer than 3 percent of all item referrals. However, WCU—Making Memories receives more than 7 percent of all item views compared to SOMF’s almost 4 percent. These findings are not surprising, as the Making Memories collection documents Western Carolina University’s history and may receive many views from within the institution.

Overall, however, the Craft Revival collection can be considered the Library’s most popular collection. The Horace Kephart collection appears to be the second most popular. And, not surprisingly, Cherokee Traditions, a collection of art objects, photographs, and recordings similar in content to Craft Revival in its focus on Cherokee culture and history, is quite popular and receives more item referrals than both WCU—Making Memories and SOMF and more item views than SOMF (table 2).

An analysis of keyword searches within CONTENTdm and keyword searches across all referral sources reiterates these findings. As part of the analysis, data collected for this twenty-seven-month period for the top keyword searches within CONTENTdm and the top keyword searches counting all referrals were recorded in an Excel spreadsheet and then uploaded to OpenRefine. OpenRefine allows text and numeric data to be sorted by name (alphabetical) and count (highest to lowest occurring). Once the Excel spreadsheet was uploaded to OpenRefine, keywords were sorted numerically and clustered. OpenRefine has a “cluster” function to bring together text that has the same meaning but differs by spelling or capitalization (for example, “CHEROKEE,” “cherokee,” “cheroke”) or by word order (for example, “Jane Smith,” “Smith, Jane”). The clustering function provides a count of the number of times a keyword was used regardless of exact spelling. After identifying keywords belonging to a cluster (for example, a cluster of the word “Cherokee” spelled differently), the differently spelled or organized keywords in each cluster were merged in OpenRefine with their most accurate counterparts.
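OpenRefine’s cluster function offers several methods; the most basic, “fingerprint” key collision, can be approximated in a few lines of Python. This sketch is an approximation for illustration only: it catches case and word-order variants such as “CHEROKEE” or “Smith, Jane,” though not misspellings like “cheroke,” which OpenRefine’s nearest-neighbor methods handle. The keyword counts are invented.

```python
# Sketch only: approximating OpenRefine's "fingerprint" key-collision clustering.
import re
from collections import defaultdict

def fingerprint(text):
    # Lowercase, strip punctuation, then sort unique tokens so that word
    # order and capitalization no longer distinguish values.
    tokens = re.sub(r"[^\w\s]", "", text.lower()).split()
    return " ".join(sorted(set(tokens)))

def cluster(keyword_counts):
    # Group keywords by fingerprint, then merge each cluster into its most
    # frequent variant, summing the occurrence counts.
    clusters = defaultdict(dict)
    for keyword, count in keyword_counts.items():
        clusters[fingerprint(keyword)][keyword] = count
    merged = {}
    for variants in clusters.values():
        preferred = max(variants, key=variants.get)
        merged[preferred] = sum(variants.values())
    return merged

# Invented counts for illustration only.
print(cluster({"Cherokee": 100, "CHEROKEE": 50, "cherokee": 37,
               "Jane Smith": 4, "Smith, Jane": 3}))
# -> {'Cherokee': 187, 'Jane Smith': 7}
```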
Tables 3 and 4 list the ten most frequently searched keywords within CONTENTdm and across all referrals, along with the names of the collections most relevant to these searches.

Keywords                                            Occurrence Count   Relevant Collection(s)
Cherokee                                            187                Craft Revival; Cherokee Traditions
Cherokee Language                                   107                Craft Revival; Cherokee Traditions
Southern Highland Craft Guild                        98                Craft Revival
basket!object                                        96                Craft Revival; Cherokee Traditions
Indian masks—Appalachian Region, Southern            83                Craft Revival; Cherokee Traditions
Basket!photograph postcard                           82                Craft Revival; Cherokee Traditions
W.M. Cline Company                                   78                Picturing Appalachia; Craft Revival
Cherokee +Indian! photograph                         72                Craft Revival; Cherokee Traditions
Wood-carving—Appalachian Region, Southern            70                Craft Revival
Indian wood-carving—Appalachian Region, Southern     69                Craft Revival

Table 3. Top keyword searches within CONTENTdm

Keywords                  Number of Sessions   Relevant Collection(s)
cherokee traditions       442                  Craft Revival; Cherokee Traditions
horace kephart            185                  Horace Kephart; Great Smoky Mountains; Picturing Appalachia
cherokee pottery           55                  Craft Revival; Cherokee Traditions
kephart knife              50                  Horace Kephart
amanda swimmer             37                  Craft Revival; Cherokee Traditions
appalachian people         36                  Craft Revival; Cherokee Traditions; Great Smoky Mountains; WCU—Oral Histories
cherokee indian pottery    36                  Craft Revival; Cherokee Traditions
cherokee baskets           34                  Craft Revival; Cherokee Traditions
weaving patterns           33                  Craft Revival; Cherokee Traditions
basket weaving             26                  Craft Revival; Cherokee Traditions

Table 4. Top keyword searches across all referrals

Tables 3 and 4 show that top searches relate to arts and crafts from the Western North Carolina region (“baskets,” “Indian masks,” “Indian wood carving,” “Cherokee pottery”), artists (“amanda swimmer”), or topics relating to Cherokee culture (“cherokee,” “cherokee language”). Searches relating to the Horace Kephart collection (“horace kephart,” “kephart knife”) are also popular, which explains why the Kephart collection, which accounts for fewer than 8 percent of the Library’s digital collections’ items, scores highly in terms of item views (second) and referrals (fourth). The popularity of topics related to Western North Carolina is reiterated in the geographic base of the users. Graph 2 shows North Carolina accounts for most of the searches, with cities in Western North Carolina (Asheville, Franklin, Cherokee, Waynesville) accounting for more than 40 percent of sessions.

Graph 2. Cities by session count

The majority of item referrals come from search engines such as Google, Bing, and Yahoo! Graph 3 shows the percentage of item referrals from these external searches.21 However, the DPLA also generates a fair amount of incoming traffic to the collections. For example, while all collections get referrals from the DPLA, harvesting to the DPLA is particularly useful for smaller collections such as Highlights from WCU, WCU—Fine Art Museum, and Civil War Collection. Each of these collections gets 17 percent of referrals from the DPLA, making DPLA the largest referral source following the search engines for the Highlights and Fine Art Museum collections. Graph 4 shows referrals each collection receives via the DPLA as a percentage of total referrals. This indicates the usefulness of harvesting to the DPLA. The data also appear to show an increase in total referrals from the DPLA per month the longer items are in the DPLA (graph 5).

Graph 3. Percentage of search engine item referrals (Google, Bing, and Yahoo!)

Graph 4. Percentage of DPLA item referrals

Graph 5. Increase in DPLA referrals over time

Lastly, new and returning visitors to the collections were tracked as a marker of user interest in particular collections. Graph 6 shows data collected for new and returning visitors calculated as a proportion of the total number of visits for each collection. Some smaller collections like Highlights from WCU, WNC Regional Maps, WCU—Fine Art Museum, and WCU—Oral Histories score highly in terms of attracting return visitors (graph 6).

Graph 6. New and returning visitors

DISCUSSION

The aim behind gathering data was to study usage of Hunter Library’s digital collections and examine the usefulness of harvesting in promoting use. Although usage data logs were unable to shed much light on the actual usefulness of the collections to users, the logs provided information on volume of use, what materials were accessed, and where users were located. Analysis of the transaction logs indicates that while all collections likely benefitted from harvesting, Craft Revival, Cherokee Traditions, and Horace Kephart (collections focusing on the culture and history of Western North Carolina) were the most heavily used, and most visitors came from the state of North Carolina and from the region in particular.
Search terms in the transaction logs also indicated a strong interest in items related to Cherokee culture and Horace Kephart. As Herold, who traced the second-largest group of users of the Orang Asli digital image archive to Malaysia, notes, the geographic base of a collection’s users can be indicative of the popularity of a subject area.22 Likewise, Matusiak asserts that users’ comments can be indicative of the relevance of collections to users’ needs and provide direction for the future development of digital collections.23 As neither the Craft Revival, Cherokee Traditions, nor Horace Kephart collection includes items that relate specifically to the university’s history—unlike other institution-specific collections mentioned earlier—it is possible collection users may be more representative of the larger public than the university. These findings call into question the identification of an academic library’s user base as mainly the students and faculty of the institution, and raise the question of whether librarians should give greater consideration to the needs of a wider audience.24 Data supporting the existence of this user base, whose true import or preferences might not be captured in surveys and questionnaires, can serve as a valuable source of information for individuals responsible for building digital collections. In an informal survey of Hunter Library faculty carried out by Hunter Library’s Digital Initiatives Unit in September 2014, respondents considered collections such as Craft Revival to be more useful to users external to the university. While the survey could hint at the nature of the user base of a collection like Craft Revival, it understandably could not capture the scale of the item views and referrals garnered by this collection as well as a usage data analysis could. On the other hand, analysis of usage data, as demonstrated in this paper, indicated that certain collections—Highlights from WCU, WCU—Fine Art Museum, and WCU—Oral Histories—possibly served a niche audience. These smaller and more recently established collections consisting of university-created materials attracted more returning visitors (see graph 6). These returning visitors were likely internal users whose visits indicated, as Fang points out, a loyalty to these collections.25 In A Framework of Guidance for Building Good Digital Collections, the National Information Standards Organization Framework Advisory Group points out that while there are no absolute rules for creating quality digital collections, a good collection should include data pertaining to usage.26 The group points to multiple assessment methods, including a combination of observations, surveys, experiments, and transaction log analyses. As the WCU digital collections findings demonstrate, a careful analysis of the popularity of collections can indicate the need for balancing quantitative data with more qualitative survey and interview data. These findings also indicate that usage data analysis can be very valuable in identifying the extent of collection usage by visitors who may not have significant survey representation. Results from the small (fewer than ten respondents) WCU survey indicate that some respondents question the institutional usefulness of collections such as Craft Revival. These results show the importance of taking multiple factors into account when assessing user needs and interests in digital collections.
CONCLUSION

The authors feel that future projects might stem from this data analysis. For example, local subject fields based on the highest-recurring keywords mined from the transaction logs could be added for all of Hunter Library’s digital collections. Usage statistics at a later period could then be evaluated to study whether the addition of user-generated keywords increased use of any collection. As Matusiak points out in her article on the usefulness of user-centered indexing in digital image collections, social tagging—despite its lack of synonym control or misuse of the singular and plural—is a powerful form of indexing because of its “close connection with users and their language,” as opposed to traditional indexing.27 The terms users assign to describe images are also the ones they are most likely to type while searching for digital images. Likewise, according to Walsh, a study conducted by the University of Alberta found more than forty percent of collections reviewed used a locally developed classification for indexing and searching their collections, and many of these schemes could work well for searches within the collection by users who are familiar with the culture of the collection.28 Usage-data analysis can constitute useful information that guides decisions for building digital collections that better serve user needs. It can identify a library’s digital collections’ users and what they want. These are important considerations to keep in mind if library services are to be all about engaging and building relationships with users.29 Harvesting to a national portal such as the DPLA is beneficial for Hunter Library’s collections. At the same time, the Library’s institution-specific collections receive more return visits, likely because of sustained interest from the large user base of the university’s students and employees, an assessment supported by survey findings. Conversely, collections not so directly tied to the institution receive the most one-time item views and referrals. Items that get used are a good indication of what users want and, as this paper demonstrates, the focus of academic digital library collections should consider the needs of both the university audience and the general public.
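As a rough illustration of the follow-up project described above, the Python sketch below selects the highest-recurring keywords as candidate local subject terms and emits them in MarcEdit’s mnemonic format. The counts, the threshold, and the choice of the MARC 653 (uncontrolled index term) field are assumptions made for illustration, not details specified by the authors.

from collections import Counter

# Hypothetical clustered keyword counts mined from the transaction logs.
keyword_counts = Counter({
    "cherokee traditions": 442,
    "horace kephart": 185,
    "cherokee pottery": 55,
    "kephart knife": 50,
    "weaving patterns": 33,
})

# Keep only the highest-recurring keywords as candidate local subject terms.
THRESHOLD = 50
candidates = [kw for kw, n in keyword_counts.most_common() if n >= THRESHOLD]

# Emit each candidate as a MARC 653 field in MarcEdit mnemonic syntax
# for catalogers to review before adding it to the collections' records.
for kw in candidates:
    print(f"=653  \\\\$a{kw.title()}")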
REFERENCES

1. A landing page refers to the homepage of a collection.

2. The DPLA provides a single portal for accessing digital collections held by cultural heritage institutions across the United States. “History,” Digital Public Library of America, accessed May 19, 2016, http://dp.la/info/about/history/.

3. Paul Betty, “Assessing Homegrown Library Collections: Using Google Analytics to Track Use of Screencasts and Flash-Based Learning Objects,” Journal of Electronic Resources Librarianship 21, no. 1 (2009): 75–92, https://doi.org/10.1080/19411260902858631.

4. Ibid.

5. Wei Fang, “Using Google Analytics for Improving Library Website Content and Design: A Case Study,” Library Philosophy and Practice (e-journal), June 2007, 1–17, http://digitalcommons.unl.edu/libphilprac/121.

6. Mark Baggett and Rabia Gibbs, “Historypin and Pinterest for Digital Collections: Measuring the Impact of Image-Based Social Tools on Discovery and Access,” Journal of Library Administration 54, no. 1 (2014): 11–22, https://doi.org/10.1080/01930826.2014.893111.

7. Melanie Schlosser and Brian Stamper, “Learning to Share: Measuring Use of a Digitized Collection on Flickr and in the IR,” Information Technology and Libraries 31, no. 3 (September 2012): 85–93, https://doi.org/10.6017/ital.v31i3.1926.

8. Mark R. O’English, “Applying Web Analytics to Online Finding Aids: Page Views, Pathways, and Learning about Users,” Journal of Western Archives 2, no. 1 (2011): 1–12, http://digitalcommons.usu.edu/westernarchives/vol2/iss1/1.

9. Marcus Ladd, “Access and Use in the Digital Age: A Case Study of a Digital Postcard Collection,” New Review of Academic Librarianship 21, no. 2 (2015): 225–31, https://doi.org/10.1080/13614533.2015.1031258.

10. Irene M. H. Herold, “Digital Archival Image Collections: Who Are the Users?” Behavioral & Social Sciences Librarian 29, no. 4 (2010): 267–82, https://doi.org/10.1080/01639269.2010.521024.

11. Mark A. Matienzo and Amy Rudersdorf, “The Digital Public Library of America Ingestion Ecosystem: Lessons Learned After One Year of Large-Scale Collaborative Metadata Aggregation,” in 2014 Proceedings of the International Conference on Dublin Core and Metadata Applications (DCMI, 2014), 1–11, http://arxiv.org/abs/1408.1713.

12. Oksana L. Zavalina et al., “Extended Date/Time Format (EDTF) in the Digital Public Library of America’s Metadata: Exploratory Analysis,” Proceedings of the Association for Information Science and Technology 52, no. 1 (2015): 1–5, http://onlinelibrary.wiley.com/doi/10.1002/pra2.2015.145052010066/abstract.

13. Lisa Gregory and Stephanie Williams, “On Being a Hub: Some Details behind Providing Metadata for the Digital Public Library of America,” D-Lib Magazine 20, no. 7/8 (July/August 2014): 1–10, https://doi.org/10.1045/july2014-gregory.

14. Kate Boyd, Heather Gilbert, and Chris Vinson, “The South Carolina Digital Library (SCDL): What Is It and Where Is It Going?” South Carolina Libraries 2, no. 1 (2016), http://scholarcommons.sc.edu/scl_journal/vol2/iss1/3.

15. Chris Freeland and Heather Moulaison, “Development of the Missouri Hub: Preparing for Linked Open Data by Contributing to the Digital Public Library of America,” Proceedings of the Association for Information Science and Technology 52, no. 1 (2015): 1–4, http://onlinelibrary.wiley.com/doi/10.1002/pra2.2015.1450520100105/abstract.

16. A single view of an item in a digital collection.

17. Visits to the site that began from another site with an item page being the first page viewed.

18. Keywords are words visitors used to find the Library’s website when using a search engine. Google Analytics provides a list of these keywords.

19. A session is defined as a “group of interactions that take place on a website within a given time frame” and can include multiple kinds of interactions like page views, social interactions, and economic transactions. In Google Analytics, a session by default lasts thirty minutes, though one can adjust this length to last a few seconds or several hours. “How a Session Is Defined in Analytics,” Google, Analytics Help, accessed May 20, 2016, https://support.google.com/analytics/answer/2731565?hl=en.

20. Locations were studied mostly in terms of cities and states.
21. The percentage is based on the total referral count a collection gets—for example, a 44 percent referral count for Cherokee Traditions would mean that the search engines account for 44 percent of the total referrals this collection gets.

22. Herold, “Digital Archival Image Collections,” 278.

23. Krystyna K. Matusiak, “Towards User-centered Indexing in Digital Image Collections,” OCLC Systems & Services: International Digital Library Perspectives 22, no. 4 (2006): 283–98, https://doi.org/10.1108/10650750610706998.

24. Ladd, “Access and Use in the Digital Age,” 230.

25. Fang points out that the improvements made to the Rutgers-Newark Law Library website could attract more return visitors and thus achieve loyalty. Fang, “Using Google Analytics for Improving Library Website,” 11.

26. NISO Framework Advisory Group, A Framework of Guidance for Building Good Digital Collections, 2nd ed. (Bethesda, MD: National Information Standards Organization, 2004), https://chnm.gmu.edu/digitalhistory/links/cached/chapter3/link3.2a.NISO.html.

27. Matusiak, “Towards User-centered Indexing,” 289.

28. John Walsh, “The Use of Library of Congress Subject Headings in Digital Collections,” Library Review 60, no. 4 (2011), https://doi.org/10.1108/00242531111127875.

29. Lynn Silipigni Connaway, The Library in the Life of the User: Engaging with People Where They Live and Learn (Dublin: OCLC Research, 2015), http://www.oclc.org/research/publications/2015/oclcresearch-library-in-life-of-user.html.

9462 ---- Editor’s Comments: Odds and Ends
Bob Gerrity
INFORMATION TECHNOLOGY AND LIBRARIES | JUNE 2016

Bob Gerrity (r.gerrity@uq.edu.au), a member of LITA and the Editor of Information Technology and Libraries, is University Librarian at the University of Queensland, Brisbane, Australia.

This issue marks the midpoint of Information Technology and Libraries’ fifth year as an open-access e-only journal. The move to online-only in 2012 was inevitable, as ITAL’s print subscription base was no longer covering the costs of producing and distributing the print journal. Moving to an e-only model using an open-source publishing platform (the Public Knowledge Project’s Open Journal Systems) provided a low-cost production and distribution system that has allowed ITAL to continue publishing without requiring a large ongoing investment from LITA. The move to open access, however, was not inevitable, and I commend LITA for supporting that move and for continuing to provide a base subsidy that supports the journal’s ongoing publication. I also thank the Boston College Libraries for their ongoing support in hosting ITAL along with a number of other OA journals. Since ITAL is now open, access to it can no longer be offered as an exclusive benefit that comes with LITA membership. Regardless of the publishing model, though, ITAL has always relied on voluntary contributions of the time and expertise of reviewers and editors. I’d like to acknowledge the contributions of our past and current Editorial Board members, who play a key role in ensuring the ongoing quality and vitality of the journal. We will be adding a few additional Board members shortly, to help ensure that reviews of submissions to the journal are completed as quickly and effectively as possible. Speaking of peer review, one of the recent innovative startups in the scholarly communication space is a company called publons, which tracks and verifies peer-review activity, providing a mechanism for academics to report (and possibly receive institutional credit for) their peer-review work, an undervalued part of the scholarly communication framework.
(Full disclosure: at the University of Queensland we are conducting a pilot project with publons, to integrate the peer-review activities of our academics into our institutional repository.) In addition to new approaches to peer review, such as publons and Academic Karma, there are quite a few recent examples of innovations in various aspects of scholarly communication that are worth keeping an eye on. These include new collaborative authoring tools such as Overleaf, impact-measurement tools such as Impactstory, and personal digital library platforms such as Readcube. On a broader scale, initiatives such as PeerJ are building open access publishing platforms intended to dramatically improve the efficiency of and drive down the overall costs of scholarly publishing. February marked the 14th anniversary of a key trigger event in the Open Access movement—the launch of the Budapest Open Access Initiative in 2002.

Much has happened in the 14 years since the Budapest Initiative, on various fronts:

o policy—introduction and widespread adoption of funder and institutional OA mandates;
o technology—development and widespread adoption of institutional repositories, recent development of mechanisms to facilitate the discovery of OA publications (e.g., SHARE on the library side and CHORUS on the publisher side);
o publishing—establishment of new OA megajournals (e.g., PLOS, BioMed Central), embrace of hybrid OA models by mainstream commercial publishers.

Yet despite all the hype, acrimony, and activity triggered by the OA movement, a recent analysis in the Chronicle of Higher Education suggests the growth of OA has been slow and incremental: the percentage of research articles published annually in fully open-access format has increased at an average rate of around one percent a year, from 4.8% in 2008 to 12% in 2015. At this rate, the tipping point for OA still seems very far away. Lots of energy has been and continues to be invested by different stakeholders in different approaches, and the green vs. gold argument still predominates. Recent developments suggest momentum is gaining for a more radical shift. In December 2015, the Max Planck Institute, a key player in the launch of OA with the Berlin Declaration on Open Access in 2003, hosted the twelfth of its annual OA conferences to further the discussion around open access. Ironically, unlike previous meetings and seemingly in philosophical conflict with the underpinnings of the OA movement, the meeting was by invitation only. Given the topic, though, a “Proposal to Flip Subscription Journals to Open Access,” the closed nature of the meeting is understandable.
Underpinning the proposal was a 2015 paper from the Max Planck Digital Library that suggested that the amount of money currently being spent (largely by libraries) on journal subscriptions should be sufficient to fund research publication costs if applied to a “flipped” journal publishing business model, from subscription-based to gold open access.1 In the Netherlands, the university sector has adopted a national approach in negotiating deals with several major publishers (Springer, SAGE, Elsevier, and Wiley) that allow Dutch authors to publish their papers as gold OA, without additional charges (but, depending on the publisher, with limits on total numbers and/or which journals are available within the deals).2 The so-called “Dutch Deal” by the VSNU (Association of Universities in the Netherlands) and UKB (Dutch Consortium of University Libraries and Royal Library) takes a national approach to flipping the model, attempting to bundle access rights for Dutch readers with APC credits for Dutch authors. The Dutch government, which currently holds the EU presidency, is pushing hard for a Europe-wide adoption of this approach. Last month, the EU’s Competitiveness Council agreed that all scientific papers should be freely available by 2020.3 Meanwhile, in the US, the “Pay it Forward” research project at the University of California is examining what the institutional financial impact would be with a flipped model. The study is looking at existing institutional journal expenditures on subscriptions and modeling what a future, APC-based model would look like based on institutional research publication output and estimated average APC charges. Who knows when or if a global flip might occur, but it does strike me that the scholarly publishing world is overdue for a major shakeup. From the point of view of a university librarian, focused on keeping journal subscription costs in line (unsuccessfully I might add), I think there is real danger in not considering what a flip to a gold model might look like. The commercial publishers we all complain about are successfully exploiting the gold model as an additional revenue stream which, for the most part, academic libraries have been ignoring, since the individual APCs typically are paid from someone else’s budget. This has allowed the overall envelope of spending on research publication (subscriptions and APCs) to grow significantly. Perhaps a more interesting question is what the impact of a flip on libraries would be. If gold OA became the predominant model, we would no longer need all of the complex systems we’ve built to manage subscriptions and user access. To quote Homer Simpson, “Woohoo!”
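At its simplest, the modeling in studies like Pay It Forward reduces to comparing current subscription spend against projected APC outlays. The Python sketch below illustrates that arithmetic with entirely hypothetical figures; the actual project models many more variables.

# Hypothetical figures for one institution; not data from Pay It Forward.
subscription_spend = 9_000_000   # annual journal subscription spend (USD)
articles_per_year = 3_000        # institution's annual research article output
estimated_avg_apc = 2_500        # estimated average APC (USD)

# The APC at which a flipped model costs exactly as much as subscriptions.
breakeven_apc = subscription_spend / articles_per_year
projected_apc_spend = articles_per_year * estimated_avg_apc

print(f"Break-even APC: ${breakeven_apc:,.0f}")
print(f"Projected APC spend: ${projected_apc_spend:,.0f}")
print("Flip is affordable" if projected_apc_spend <= subscription_spend
      else "Flip costs more than subscriptions")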
In the “watch this space” arena, EBSCO’s recently launched open-source library services platform (LSP) initiative is beginning to take shape. It now has a name—FOLIO (for Future of the Libraries Is Open)—and as Marshall Breeding put it, the project “injects a new dynamic into the competitive landscape of academic library technology, pitting an open source framework backed by EBSCO against a proprietary market dominated by Ex Libris, now owned by EBSCO archrival ProQuest.”4 Publicly listed participants in the project include (in addition to EBSCO) OLE, Index Data, ByWater, BiblioLabs, and SirsiDynix.5 The platform release timetable calls for an initial “technical preview” release of the code for the base platform in August 2016, and an anticipated release of the apps needed to operate a library in early 2018.6

1. Ralf Schimmer, Kai Karin Geschuhn, and Andreas Vogler, Disrupting the Subscription Journals’ Business Model for the Necessary Large-Scale Transformation to Open Access (2015), https://doi.org/10.17617/1.3.

2. Frank Huysmans, “VSNU-Wiley: Not Such a Big Deal for Open Access,” Warekennis (blog), March 1, 2016, https://warekennis.nl/vsnu-wiley-not-such-a-big-deal-for-open-access/.

3. Martin Enserink, “In Dramatic Statement, European Leaders Call for ‘Immediate’ Open Access to All Scientific Papers by 2020,” Science, May 27, 2016, https://doi.org/10.1126/science.aag0577.

4. Marshall Breeding, “EBSCO Supports New Open Source Project,” American Libraries, April 22, 2016, https://americanlibrariesmagazine.org/2016/04/22/ebsco-kuali-open-source-project/.

5. https://www.folio.org/collaboration.php.

6. https://www.folio.org/apps-timelines.php.

9469 ---- December_ITAL_Oud_final Accessibility of Vendor-Created Database Tutorials for People with Disabilities
Joanne Oud
INFORMATION TECHNOLOGY AND LIBRARIES | DECEMBER 2016

Joanne Oud (joud@wlu.ca) is Instructional Technology Librarian and Instruction Coordinator, Wilfrid Laurier University, Ontario, Canada.

ABSTRACT

Many video, screencast, webinar, or interactive tutorials are created and provided by vendors for use by libraries to instruct users in database searching. This study investigates whether these vendor-created database tutorials are accessible for people with disabilities to see whether librarians can use these tutorials instead of creating them in-house. Findings on accessibility were mixed. Positive accessibility features and common accessibility problems are described, with recommendations on how to maximize accessibility.

INTRODUCTION

Online videos, screencasts, and other multimedia tutorials are commonly used for instruction in academic libraries. These online learning objects are time-consuming to create in-house and require a commitment to maintain and revise when database interfaces change. Many database vendors provide screencasts or online videos on how to use their databases. Should libraries use these vendor-provided instructional tools rather than spend the time and effort to create their own? Many already do: a study shows that 17.7 percent of academic libraries link to tutorials created by third parties, mainly by vendors or other libraries.1 When deciding whether to use vendor-created tutorials, one consideration is whether the tutorials meet accessibility requirements for people with disabilities.
The importance of accessibility for online tutorials has been increasingly recognized and outlined in recent library literature.2 People with disabilities make up one of the largest minority groups in the United States and Canada, and studies show that about 9 percent of university or college students have a disability.3 Problems with web accessibility have been well documented. People with disabilities are often unable to access the same online sites and resources as others, creating a digital divide.4 Even if people with disabilities can access a site, it is more difficult for many to use it.5 Assistive technologies, like screen-reading software, enable access but add an extra layer of complexity in interacting with the site, and blind or low-vision users can’t always rely on visual cues to navigate and interpret sites. A recent study of library website accessibility concluded that typical library websites are not designed with people with disabilities in mind.6 Libraries, which are founded on a philosophy of equal access to information, should be concerned about online accessibility. Legal requirements for providing accessible online web content vary, but exist in every jurisdiction in the United States and Canada. Apart from the legal requirements, recent literature points out that equitable access to information for people with disabilities is a matter of human rights and an issue of diversity and social justice, and calls on libraries and librarians to improve their commitment to online accessibility.7 It is important for libraries to participate in creating a level playing field and to avoid creating conditions that make people feel unequal or prevent them from equitable access. It is unclear whether librarians can assume vendor-created instructional tutorials are accessible. Studies on vendor database accessibility have been mixed, showing some commitment to and improvements in accessibility on one hand, but sometimes substantial gaps in accessibility on the other.8 The focus until now has been exclusively on the accessibility of database interfaces. This study investigates the accessibility of online tutorials, including videos, screencasts, interactive multimedia, and archived webinars created by database and journal vendors and offered as instructional materials to librarians and patrons, to determine whether they are a viable alternative to making in-house training materials.

LITERATURE REVIEW

Although a few articles exist on how to make video tutorials accessible,9 no studies have evaluated the accessibility of already-created video or screencast tutorials. There are, however, some studies evaluating the accessibility of vendor databases. Byerley, Chambers, and Thohira surveyed vendors in 2007 and found that most felt they had integrated accessibility standards into their search interfaces, and nearly all tested for accessibility to some degree, though not always with actual users.10 These findings conflict somewhat with the results of other studies.
Tatomir and Durrance evaluated the accessibility of thirty-two databases with a checklist and found that although many did contain accessibility features, 72 percent were marginally accessible or inaccessible.11 Similarly, Dermody and Majekodunmi found that students with print-related disabilities who use screen-reading software could only complete 55 percent of tasks successfully because of accessibility barriers and usability challenges.12 DeLancey surveyed vendors and examined VPATs, or product accessibility claims, and found that vendors felt they were compliant with 64 percent of US Section 508 items.13 Especially relevant to this study, only 23 percent of vendors said that the multimedia content within their products was compliant, and 46 percent admitted multimedia content was not compliant at all. Since vendor VPAT forms are completed for databases and other products only, and not the instructional tutorials created by vendors on how to use those products, vendor accessibility claims for instructional tutorials are unknown. Although no studies have been done on the accessibility of video or screencast tutorials, some have been done on the accessibility of multimedia or other related kinds of online learning. Roberts, Crittenden, and Crittenden surveyed 2,366 students taking online courses at several US universities. A total of 9.3 percent of those students reported that they had a disability, and of those, 46 percent said their disability affected their ability to succeed in their online course, although most reasons cited were not related to technical accessibility barriers.14 Kumar and Owston studied students with disabilities using online learning units that contained videos. All students in the study reported at least one barrier to completing the learning units.15 Although this study involves student use of video tutorials, it doesn’t report on accessibility issues specific to those tutorials. Previous studies of vendor products focus exclusively on database interfaces, and previous studies of online learning have not focused on screencast accessibility. Therefore this study’s goal is to investigate how accessible vendor-created video tutorials are. Accessibility is defined as both technical accessibility (can people with disabilities locate, access, and use them) and usability (how easy it is for people with disabilities to use them). This study will look at what major accessibility issues there are (if any) and make recommendations on whether librarians can direct students to these tutorials rather than making in-house instructional videos.

METHOD

An evaluation checklist (see appendix 2) was developed for this study using criteria drawn from the Web Content Accessibility Guidelines (WCAG) 2.0. WCAG 2.0 is the most widely recognized web-accessibility standard internationally.
Much recent accessibility legislation adopts it, including the in-process revisions to Section 508 guidelines in the United States.16 WCAG 2.0 is also consistent with tutorial accessibility best-practice advice found in recent articles, which emphasize the need for accurate captions, keyboard accessibility, descriptive narration, and alternate versions for embedded objects, among other criteria.17 The checklist has twenty items and is split into two sections, “Functionality” and “Usability.” Functionality items test whether the tutorial can be used by people using screen-reading software or a keyboard only, and include whether the tutorial is findable on the page and playable, whether player controls and interactive content can be operated by keyboard, whether captions are available, and whether audio narration is descriptive enough so someone who can’t see the video can understand what is happening. Usability items test how easy the tutorial is to use. Examples include clear visuals and audio, use of visual cues to focus the viewer’s attention, and short and logically focused content. To help prioritize the importance of checklist items, the local Accessible Learning Centre (ALC), which supports students on campus who use assistive technologies, was consulted about the difficulties most encountered by students. The ALC’s highest priority was the provision of an alternate accessible version of a tutorial, since it is difficult to make complex embedded web content accessible for everyone under every circumstance and an alternate version allows people to work with content in a way that suits their needs. For the evaluation, major database vendors were chosen through a scan of common vendors and platforms at universities, with input from collections colleagues. Some vendors were eliminated because they don’t provide instructional tutorials on their websites. Twenty-five vendors were included in the study (see appendix 1). A large majority of the tutorials found were screencast or video tutorials; a few vendors provided recorded webinars, and a few provided interactive multimedia tutorials, mainly text captions or visuals with clickable areas or quizzes. In total, 460 tutorials were evaluated for accessibility: 417 video, screencast, or interactive tutorials from twenty-four vendors, and 41 recorded webinars from four vendors. If tutorials were available in more than one place, most commonly on both the vendor’s website and YouTube, both locations were tested. If more than thirty tutorials were provided by a vendor, every other one was tested. If multiple formats of tutorial were available, such as screencasts and recorded webinars, each format was tested. Testing from the perspective of people with visual impairments was a key focus. Other assistive technologies such as Kurzweil (for people who can see but have print-related disabilities) and Zoomtext (for enlargement) are widely used, but if webpages work well using screen-reading software intended for people with visual impairments, they also generally work using other kinds of assistive software. Tutorials were tested with two screen-reading programs used by people with visual impairments: NVDA (with Firefox), a free open source program, and JAWS (with Internet Explorer), a widely used commercial product.
Both were used to determine whether any difficulties were due to the quirks of a particular software product or a result of inherent accessibility problems. In addition, captions were evaluated to determine accessibility for people who are deaf or have hearing difficulties. People with visual or some physical impairments use the keyboard only, so all tutorials were tested without a mouse using solely the keyboard. During testing, each task was tried three different ways within NVDA or JAWS before deciding that it couldn’t be completed. If one of the three methods worked, the task was marked as successfully completed. If a task could be completed successfully in one screen-reading program but not the other, it was marked as unsuccessful. Screen-reader support needs to be consistent across platforms, since people may be using a variety of types of assistive software.

FINDINGS AND DISCUSSION

Tutorials created by the same vendor nearly all used the same approach and had the same checklist results. This is positive, since consistency is important for accessibility and helps in navigation and ease of use. None of the forty-one recorded webinars tested in this study were accessible. Webinars did not have player controls that were findable on the page by screen-reading software or usable by
People with disabilities may also have previous negative experiences with online multimedia and prefer to use an alternate format that they have had more success with. In the case of these above-average vendors, the alternate accessible version was a transcript consisting of the video’s closed captions, auto-generated by YouTube. Since the tutorials’ narration was descriptive and the captions were accurate, the auto-generated transcripts are useful. However, the YouTube transcript is hard to find on the YouTube page. Also, most of these vendors had tutorials available both from their own websites and from YouTube, and none had alternate versions available on their own websites. Viewers requiring an alternate format would need to know to go to the YouTube site instead of the vendor site to find it. Two other vendors also had quite accessible tutorials. IEEE’s tutorials had the same positive accessibility features already mentioned. Tutorials were done in-house and presented through the vendor’s site. While most tutorials presented on vendor sites were lacking in accessibility, IEEE’s were well thought out from an accessibility perspective and usable by screen-reading software. These were the only tutorials tested where all interactivity, including pop-up screens, was easily ACCESSIBILITY OF VENDOR-CREATED DATABASE TUTORIALS FOR PEOPLE WITH DISABILITIES | OUD https://doi.org/10.6017/ital.v35i4.9469 12 usable and navigable by keyboard. The one accessibility issue was the lack of an alternate accessible version. Elsevier’s ScienceDirect tutorials took a different approach to accessibility than other vendors, or even than Elsevier’s tutorials for other Elsevier products. The Science Direct tutorials were not accessible, but an alternate text version was available and people using screen-reader software were informed of this when they get to the tutorial page and were redirected to the text version. The ideal is to have one version that is accessible to everyone, but this approach is a good way to implement an alternate version if one accessible version isn’t possible. Screencasts or video tutorials from other vendors also have some good accessibility features, but these were balanced with serious accessibility problems. The main accessibility issues discovered include the following: Alternate accessible versions: vendors who had captions and hosted their videos on YouTube did have auto-generated YouTube transcripts, but these were hard to find and were only useful if the captions were descriptive and accurate, which many were not. Apart from Elsevier’s ScienceDirect tutorials, no vendors provided another format deliberately as an accessible alternative. Captions: captions were missing or problematic in the tutorials of fourteen vendors, or 59 percent of the total. Five (21 percent) of vendors provided no captions at all for their tutorials. Nine (38 percent) had unedited, auto-generated YouTube captions, which are highly inaccurate and therefore don’t provide usable access to the content for people who are deaf. Tutorial not findable or playable on page: Twelve vendors (50 percent) had tutorials that were not findable on the webpage or playable for people using a keyboard or screen- reading software. Most of these issues are with tutorials on vendor sites, which were often Flash-based or offered through non-YouTube third party sites like Vimeo. 
Four vendors (17 percent) offered access to their tutorials both through their own (inaccessible) website and YouTube, which is findable and playable by screen reading software. Eight (33 percent), however, only provided access through their (inaccessible) webpages, which means that people using a keyboard or screen reading software would not be able to use their tutorials. No visual cues to focus attention: Eight vendors (33 percent) had no visual cues to focus attention in the video. Visual cues help people with certain disabilities focus on the essential part of the screen that is being discussed, help everyone more easily interpret and follow what is happening, and are known to help facilitate successful multimedia learning.18 INFORMATION TECHNOLOGY AND LIBRARIES | DECEMBER 2016 13 Nondescriptive narration: Six vendors (25 percent) had tutorials with audio narration that didn’t sufficiently describe what was happening on the screen. Narration needs to describe what is happening in enough detail so people who can’t see the screen are not missing information available for sighted viewers. Fuzzy visuals: Five vendors (21 percent) had tutorials with visuals that were fuzzy and hard to see. This makes viewing difficult for people with low vision, and challenging even for people with normal vision. Fuzzy audio or background music: Three vendors (13 percent) had poor-quality audio narration or background music playing during narration. Background music is distracting for those with hearing difficulties and makes it more difficult to focus on what is being said. Eliminating extraneous sound also makes it easier for people to learn from multimedia.19 Tutorials consisting only of text captions: Three vendors (13 percent) had tutorials consisting of text captions with no narration. The text captions were not readable by screen-reading software, and no alternate accessible versions were provided. Providing narration in tutorials is recommended for accessibility, since it allows people who can’t see the screen to access the content more easily, and has been shown to improve learning and recall over on-screen text and graphics alone.20 RECOMMENDATIONS AND CONCLUSIONS This study attempted to determine how accessible vendor-created database tutorials are, and whether academic librarians can use them instead of re-creating them locally. For recorded webinars, the answer is a clear no, since none were technically accessible for people using screen- reading software. For video or screencast tutorials, however, the answer less is clear. Results showed that many vendors created tutorials with positive features like clear visuals and audio, being short and focused on one main point, and using descriptive narration. However, technical accessibility was much less successful, with 59 percent of vendors omitting usable captions and 50 percent presenting tutorials that couldn’t be found on the page or played by people using screen-reading software. These technical accessibility issues prevent people with hearing, vision, or some mobility impairments from using the tutorials at all. Although none of the tutorials studied met all the checklist criteria, some came close and could be used by librarians depending on local requirements, policies, and priorities for accessibility. In part, this study found that the accessibility of many tutorials depends on how they are presented. Disappointingly, 50 percent of vendors had tutorials on their websites that were not findable or playable by people with disabilities. 
Many vendors, however, hosted tutorials on YouTube as well as their own site. In these cases, YouTube was always a more accessible option than the vendor site. YouTube itself is relatively accessible, with both pages and players that are navigable by keyboard and by screen-reading software. There are options for accessibility settings in YouTube, such as having captions display automatically, and more accessible third-party overlays are available for the YouTube player. On vendor sites, there were more likely to be issues with Flash and an inability for people using screen-reading software or keyboards to find and play videos. Some vendors embed YouTube videos on their site. Even if the embedded videos are findable and playable, this method omits important accessibility features found on the YouTube page, such as the text transcript. The results of this study show that using YouTube where available is recommended. Further, linking to YouTube rather than embedding the video is preferred, unless a separate link to the transcript is made to provide an alternate accessible version. Captions are another key accessibility problem identified in this study: nearly two-thirds of vendors had unusable captions. Often, auto-generated YouTube captions were present but were not usable. The presence of captions is not enough for accessibility; those captions need to be accurate and present the same content as the narration. YouTube auto-captioning does not generate captions that are accurate enough to be useful without manual editing. YouTube auto-generates transcripts from the captions, so if the captions are inaccurate the transcript will not be useful either. Editing YouTube auto-generated captions is necessary to ensure accessibility. A few accessibility issues found in this study would be easy to improve with some thought during tutorial creation. Adding visual cues like arrows or highlighting to the screen to help people focus attention, or remembering that not everyone can see the screen while recording narration, can be easily achieved and would improve accessibility significantly. Other issues would require more planning and effort to improve. Given the widespread technical accessibility problems identified in this study, it is particularly important for people creating tutorials to provide alternate formats that are accessible if tutorials themselves are not accessible. Almost no vendors do this currently, but it would have the most significant impact on accessibility for the broadest range of people. Adding usable captions is the second most important area for improvement. To provide access for people who are deaf, captions need to be added or auto-generated YouTube captions need to be edited for accuracy. Both alternate formats and captions require some thought and effort to implement but ensure that tutorials will meet accessibility requirements and be usable by everyone.
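One low-effort way to provide the recommended alternate format is to generate a plain-text transcript from the same caption file used for the video. The Python sketch below is a minimal illustration, assuming an edited WebVTT caption file named captions.vtt; the filename and the approach are assumptions for illustration, not part of the study.

def vtt_to_transcript(path):
    """Convert an edited WebVTT caption file into a plain-text transcript
    suitable for posting alongside a tutorial as an alternate format."""
    lines = []
    with open(path, encoding="utf-8") as f:
        for raw in f:
            line = raw.strip()
            # Skip the header, NOTE blocks, blank lines, numeric cue
            # identifiers, and timing lines like "00:00:04.000 --> 00:00:09.000".
            if (not line or line.startswith(("WEBVTT", "NOTE"))
                    or line.isdigit() or "-->" in line):
                continue
            lines.append(line)
    return " ".join(lines)

if __name__ == "__main__":
    print(vtt_to_transcript("captions.vtt"))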
NOTES AND BIBLIOGRAPHY

1. Eamon Tewell, “Video Tutorials in Academic Art Libraries: A Content Analysis and Review,” Art Documentation 29, no. 2 (2010): 53–61.

2. Amanda S. Clossen, “Beyond the Letter of the Law: Accessibility, Universal Design, and Human-Centered Design in Video Tutorials,” Pennsylvania Libraries: Research & Practice 2, no. 1 (2014): 27–37, https://doi.org/10.5195/palrap.2014.43; Joanne Oud, “Improving Screencast Accessibility for People with Disabilities: Guidelines and Techniques,” Internet Reference Services Quarterly 16, no. 3 (2011): 129–44, https://doi.org/10.1080/10875301.2011.602304; Kathleen Pickens and Jessica Long, “Click Here! (And Other Ways to Sabotage Accessibility),” Imagine, Innovate, Inspire: The Proceedings of the ACRL 2013 Conference (Chicago: ACRL, 2013), 107–12.

3. DeAnn Barnard-Brak, Lucy Lechtenberger, and William Y. Lan, “Accommodation Strategies of College Students with Disabilities,” Qualitative Report 15, no. 2 (2010): 411–29.

4. Cyndi Rowland et al., “Universal Design for the Digital Environment: Transforming the Institution,” Educause Review 45, no. 6 (2010): 14–28.

5. Peter Brophy and Jenny Craven, “Web Accessibility,” Library Trends 55, no. 4 (2008): 950–72.

6. Kyunghye Yoon, Laura Hulscher, and Rachel Dols, “Accessibility and Diversity in Library and Information Science: Inclusive Information Architecture for Library Websites,” Library Quarterly 86, no. 2 (2016): 213–29.

7. Ruth V. Small, William N. Myhill, and Lydia Herring-Harrington, “Developing Accessible Libraries and Inclusive Librarians in the 21st Century: Examples from Practice,” Advances in Librarianship 40 (2015): 73–88, https://doi.org/10.1108/S0065-2830201540; Paul T. Jaeger, Brian Wentz, and John Carlo Bertot, “Libraries and the Future of Equal Access for People with Disabilities: Legal Frameworks, Human Rights, and Social Justice,” Advances in Librarianship 40 (2015): 237–53; Yoon, Hulscher, and Dols, “Accessibility and Diversity in Library and Information Science: Inclusive Information Architecture for Library Websites.”

8. Suzanne L. Byerley, Mary Beth Chambers, and Mariyam Thohira, “Accessibility of Web-Based Library Databases: The Vendors’ Perspectives in 2007,” Library Hi Tech 25, no. 4 (2007): 509–27, https://doi.org/10.1108/07378830710840473; Kelly Dermody and Norda Majekodunmi, “Online Databases and the Research Experience for University Students with Print Disabilities,” Library Hi Tech 29, no. 1 (2011): 149–60, https://doi.org/10.1108/07378831111116976; Jennifer Tatomir and Joan C. Durrance, “Overcoming the Information Gap: Measuring the Accessibility of Library Databases to Adaptive Technology Users,” Library Hi Tech 28, no. 4 (2010): 577–94, https://doi.org/10.1108/07378831011096240.

9. Pickens and Long, “Click Here!”; Clossen, “Beyond the Letter of the Law”; Oud, “Improving Screencast Accessibility for People with Disabilities”; Nichole A. Martin and Ross Martin, “Would You Watch It? Creating Effective and Engaging Video Tutorials,” Journal of Library & Information Services in Distance Learning 9, no. 1–2 (2015): 40–56, https://doi.org/10.1080/1533290X.2014.946345.

10. Byerley, Chambers, and Thohira, “Accessibility of Web-Based Library Databases.”

11. Tatomir and Durrance, “Overcoming the Information Gap.”

12. Dermody and Majekodunmi, “Online Databases and the Research Experience for University Students with Print Disabilities.”

13. Laura DeLancey, “Assessing the Accuracy of Vendor-Supplied Accessibility Documentation,” Library Hi Tech 33, no. 1 (2015): 103–13, https://doi.org/10.1108/LHT-08-2014-0077.

14. Jodi B. Roberts, Laura A. Crittenden, and Jason C. Crittenden, “Students with Disabilities and Online Learning: A Cross-Institutional Study of Perceived Satisfaction with Accessibility Compliance and Services,” Internet and Higher Education 14, no. 4 (2011): 242–50, https://doi.org/10.1016/j.iheduc.2011.05.004.
15. Kari L. Kumar and Ron Owston, “Evaluating E-Learning Accessibility by Automated and Student-Centered Methods,” Educational Technology Research and Development 64, no. 2 (2015): 263–83, https://doi.org/10.1007/s11423-015-9413-6.

16. US Access Board, “Draft Information and Communication Technology (ICT) Standards and Guidelines,” 36 CFR Parts 1193 and 1194, RIN 3014-AA37 (2015), https://www.access-board.gov/attachments/article/1702/ict-proposed-rule.pdf.

17. Pickens and Long, “Click Here!”; Clossen, “Beyond the Letter of the Law”; Martin and Martin, “Would You Watch It?”; Oud, “Improving Screencast Accessibility for People with Disabilities.”

18. See the Signaling Principle in Richard E. Mayer, Multimedia Learning, 2nd ed. (Cambridge: Cambridge University Press, 2009), 108–17.

19. See the Coherence Principle, Ibid., 89–107.

20. See the Modality Principle, Ibid., 200–220.

Appendix 1. List of Vendors

1. ACM
2. Adam Matthew
3. Alexander St Press
4. APA
5. ATLA
6. ChemSpider
7. Cochrane Library (webinars only)
8. Ebsco
9. Elsevier
10. Factiva
11. Gale
12. IEEE
13. Lexis Nexis Academic (tutorials and webinars)
14. Marketline
15. MathSciNet
16. OVID/Wolters Kluwer (tutorials and webinars)
17. Oxford
18. Proquest (tutorials and webinars)
19. Pubmed
20. Sage
21. SciFinder
22. Standard & Poor/NetAdvantage
23. Taylor and Francis
24. Web of Knowledge/Thompson Reuters
25. Zotero

Appendix 2. Tutorial Accessibility Evaluation Checklist

Functionality

☐ Equivalent alternate format(s) are provided
  ☐ Transcript/text version
  ☐ Audio
  ☐ Other ___________________________
☐ Alternate formats provided are accessible
☐ Alternate formats provided are findable on the page by screen reader
☐ Screen-reading software can find the video on the webpage
☐ Screen-reading software can access and play the video
☐ Video-player functions can be operated by keyboard/screen-reading software
☐ Interactive content can be accessed and used by keyboard/screen-reading software
☐ User has some control over timing (pause/rewind capability)
☐ Alternate modes of presentation are available for all, meaning presented through text, visuals, narration, color, or shape
☐ Synchronized closed captions are available for all audio
☐ Audio/narration is descriptive

Usability

☐ User controls if/when the video starts (no autoplay)
☐ Video is easy to use by screen-reading software
☐ Clear, high-contrast visuals and text
☐ Clear, high-contrast audio (no background noise/music)
☐ Uses visual cues to focus attention (e.g., highlighting, arrows)
☐ Is short and concise
☐ Is clearly and logically organized
☐ Has consistent navigation, look, and feel
☐ Uses simple language, avoids jargon, and defines unfamiliar terms
☐ Explicit structure with sections, headings to give viewers context
☐ Learning outcome/goal clearly outlined and content focused on outcome

9474 ---- June_ITAL_Rubel_final Picture Perfect: Using Photographic Previews to Enhance Realia Collections for Library Patrons and Staff
Dejah T. Rubel (rubeld@ferris.edu) is the Metadata and Electronic Resources Management Librarian, Ferris State University, Big Rapids, MI.

ABSTRACT

Like many academic libraries, the Ferris Library for Information, Technology, and Education (FLITE) acquires a range of materials, including learning objects, to best suit our students’ needs. Some of these objects, such as the educational manipulatives and anatomical models, are common to academic libraries but others, such as the tabletop games, are not. After our liaison to the School of Education discovered some accessibility issues with Innovative Interfaces’ Media Management module, we decided to examine all three of our realia collections to determine what our goals in providing catalog records and visual representations would be. Once we concluded that we needed photographic previews to both enhance discovery and speed circulation service, choosing processing methods for each collection became much easier. This article will discuss how we created enhanced records for all three realia collections, including custom metadata, links to additional materials, and photographic previews.

INTRODUCTION

Ferris State University’s full-time enrollment for Fall 2015 was 14,715 students. Of these students, 10,216 are Big Rapids residents and the other 4,499 are either Kendall College of Art and Design students or at other off-campus sites across Michigan.1 During the 2014-2015 school year, FLITE had 14,647 check-outs including 2,558 check-outs of items in reserves, which is where our realia collections are located.2 However, reserves includes other items in addition to these collections, thus making analysis of circulation statistics problematic. Another problem with conducting such an analysis is that the educational manipulative collection already had photographic previews and the tabletop game collection is a pilot project, so there is no clear before-and-after comparison. We can, however, demonstrate that enhancing the catalog records for our anatomical model collection had an incredibly significant impact, jumping from a handful of check-outs in 2014-2015 to almost 450 in 2016.

LITERATURE REVIEW

Although there are very few libraries using photographic previews for their realia collections, the ones that do described similar limitations with bibliographic records and goals that only photographic previews could meet. Most realia collections that warranted this extra effort are either curriculum materials or anatomical models, which is not surprising considering how difficult they are to describe. As Butler and Kvenild noted in their article on cataloging curriculum materials, “Patrons struggled to identify which game or kit they sought based on the…information in the online catalog,” because “Discovering curriculum materials in the catalog and getting a sense of the item are not easy when using traditional catalog descriptions….”3 As they continue, “The inventory and retrieval problems…were compounded by the fact that existing catalog records were not as descriptive as they should be.”4 This was also a problem for our collections because our names and descriptions were often not intuitive or precise.
In addition, as Loesch and Deyrup discovered while cataloging their curriculum materials collection, “…there was great inconsistency among the OCLC records regarding the labeling of the format…,”5 which was another issue we needed to address. Although the General Material Designation (GMD) has since been rendered obsolete, FLITE continues to use it to highlight certain material. This choice is due to some limitations with our library management system as well as our discovery layer, namely the lack of good mapping or use of the 33X fields. Until this is rectified with a more modern system, we have found it easier to retain certain GMDs like “sound recording,” “electronic resource,” and “realia.” Thus, we needed to standardize our terms for each collection.

Another problem that our predecessors indicated photographic previews might resolve was missing objects or pieces of objects.6 This becomes especially important for our tabletop games collection because most of those pieces are very small and too numerous for a piece count upon return. Fortunately, “Previews…can aid users in making better decisions about potential relevance, and extract gist more accurately and rapidly than traditional hit lists provided by search engines.”7 Ideally, a preview will display an appropriate level of information about the object it represents in order “…to support users in making a correct judgement about the relevance of that object to the user’s information need.”8 Greene goes further by listing the main roles for previews, of which the first two are the most applicable for photographic previews: aiding retrieval and aiding users in quickly making relevance decisions.9 For these uses, photographic previews of realia are ideal because users can examine the object without needing to see its details, and they expect them to be abstract, not exhaustive, unlike digital surrogates that an archive would use.10

As Greene also notes, the high-level goal of any preview is to “...communicate the level and scope of objects to users so that comprehension is maximized and disorientation is minimized.”11 A common finding among all the previous projects was that even a single photograph provides more readily comprehensible information than several lines of description. As Moeller states regarding their journal project, “They [previews of each issue’s cover] give the researcher or student an immediate idea of the nature of the journal.”12 He goes further to give the example of an innocuous journal title for a propagandist serial whose political nature is transparent once you view its imagery. From a staff perspective, photographic previews can also easily illustrate the number of pieces and an object’s condition or orientation. This can be very useful in determining whether something is missing or damaged without having to do a time-consuming individual piece count upon check-in. But as Butler and Kvenild discuss, layout within each photograph is key for illustrating missing pieces.13

Unfortunately, aside from a few small projects mentioned in Butler and Kvenild’s article, there are not many examples of photographic previews for realia collections currently being used by academic libraries. One reason might be software limitations.
Innovative’s Media Management module is still unique among ILS/LMS software in that most vendors either provide a separate digital repository for special collections digital surrogates or they incorporate images into the catalog using third-party software like Syndetic Solutions™. Another reason for the lack of photographic previews within catalogs may simply be the rarity of realia in academic libraries. Every library certainly has a few unique pieces, like a skeleton for the pre-medical students, but often not enough to consider them an entire collection, much less a complex enough collection to warrant the extra effort to create photographic previews of each item. At FLITE, we had already crossed that threshold of complexity. Therefore, this article will start by discussing our educational manipulative collection, which provided the basis for how we would catalog and process the tabletop games and anatomical models.

Educational Manipulative Collection

Our first foray into creating photographic previews was completed by the previous Cataloger, with over 300 items cataloged in 2004 and another 30-40 added to the collection over the next decade. Unlike the other realia collections, the educational manipulatives were cataloged using Innovative’s Course Reserves module, so no attempt was made to find or create OCLC records. Nevertheless, the minimal metadata is very consistent across the collection, which supports Greene’s recommendation “…that it was important to define a set of consistent attributes at the high level of the collection if any effective browsing across the collections was to be provided.”14 In our case, we rely on a combination of the GMD ([realia]), a custom call number prefix (TOYS Box #), and a limited number of local subject headings, as shown below, with “Manipulatives” as the common subject for the entire collection.

690 = (d) Current local subject headings in use as of 12/3/15: Art. Block props. Boards. Cognitive. Discovery Box. Discovery. Dramatics. Finger Puppets. Flannel Board. Gross Motor. Infant/Toddler. Magnets. Manipulatives. Music. Oversize books. Posters. Puppets. Story apron. Story props. Woodworking.

Due to the nature of descriptive metadata, photographic previews of the educational manipulatives made logical sense because “The images…are not the content. They are the metadata, the description of the materials.”15 As Moeller describes, Innovative’s Media Management module links images and many other file types directly to bibliographic records without requiring users to click an additional link unless they want to view a larger image of a thumbnail.16 Similar to Butler and Kvenild’s project, all of our photos were 900 pixels wide by 600 pixels tall, which is slightly smaller than their default width of 1000 pixels.17 One advantage of using the Media Management module is its ability to automatically create thumbnails 185 pixels wide by 85 pixels tall. A bigger advantage is that the images are hosted on the same server that runs our catalog, which allows us to freely distribute the images in an intuitive manner (thumbnails instead of links) without having to worry about authentication to a shared folder from off-campus, unlike our PDF files.
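Where a system does not generate derivatives automatically, producing uniform previews and thumbnails of this kind is easy to script. The following is a minimal sketch using the R magick package; the folder layout and file names are hypothetical, and the 900×600 and 185×85 targets simply mirror the dimensions described above.

    # Minimal sketch: batch-generate 900x600 previews and 185x85 thumbnails
    # with the R magick package. Folder and file names are hypothetical.
    library(magick)

    originals <- list.files("originals", pattern = "\\.jpe?g$", full.names = TRUE)

    for (f in originals) {
      img <- image_read(f)
      # image_scale() fits each image within the geometry string,
      # preserving the original aspect ratio
      preview <- image_scale(img, "900x600")
      thumb   <- image_scale(img, "185x85")
      image_write(preview, file.path("previews",   basename(f)))
      image_write(thumb,   file.path("thumbnails", basename(f)))
    }

Because image_scale() fits each image within the bounding box rather than stretching it, portrait-format originals are reduced without distortion.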
Unfortunately, our liaison to the School of Education recently discovered some accessibility issues with Media Management that forced us to consider whether we should change the embedded photographic previews to external links. The most significant of these problems is simply the language of the proprietary viewer software. Because it is written in Java, if you click on a thumbnail for a larger image, many browsers, like Chrome, will not run it, and those that will often require a security exception to do so. We have attempted to ameliorate some of these issues by providing an FAQ entry on which browsers are best for viewing these images and how to add a security exception for our website, but unless or until Innovative rewrites this software in a different language, these accessibility issues will persist because Java is being phased out of many browsers. Butler and Kvenild also noted its slow response time compared to their own server.18

Another issue they mentioned was that the thumbnails would not be visible in their consortial catalog, so they needed to add links in the 856 field for these users.19 This is less of an issue for us because we do not contribute any of our realia records to our consortial catalog, but Moeller’s concern that in general “…enhancements involving scanned images…will not be easily shared with other libraries”20 is entirely valid. Unlike OCLC records, there is no way to share attached or embedded images as part of the metadata and not the content. Contrariwise, Butler and Kvenild’s concerns regarding catalog migration are very pertinent because we are considering moving to a new LMS within the next few years.21 Although we acknowledge that “Utilizing 856 tags is an indirect method of accessing the images, as users must take the initiative to follow the links,” we will eventually have to move and link our photographic previews to ensure accessibility after migration.22

Tabletop Game Collection

Unlike the educational manipulatives, the majority of the tabletop game collection was previously cataloged in OCLC, so finding good bibliographic records was easy. Once downloaded, we decided to add a unique GMD ([game]), custom call number prefix (BOARD GAME Box #), and local subject heading “Tabletop games.” However, our Emerging Technologies Librarian who coordinated this pilot project felt that the single subject heading was not descriptive enough, so he gave us a spreadsheet with more specific subject headings such as “Deck Building,” “Historical,” and “Resource Management” that we added as genre/form subject headings in the 655_4 field. He also suggested that we add links to the rule books, which we did using the 856 field and the link text “connect to rule book (PDF).” A record following this pattern is sketched below.

Because tabletop games are commercial products, finding images online was also easy. At first, we had some concerns about copyright, but we are not reselling these products or using the image as a replacement for the item, so we concurred with Butler and Kvenild that “…the images in our project fall under copyright fair use.”23 Another plus to using commercial images is that we could use more than one to show various aspects of setup and play. The downside to this benefit is that image sizes and content photographed varied widely, so we used our best judgment in creating labels and tried to keep them as consistent as possible.
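To make the pattern concrete, the following is a minimal sketch of how the local fields of such a record might look in MARC; the box number, title, genre headings, URL, and the 856 indicator and subfield choices are invented for illustration rather than drawn from an actual FLITE record.

    099    $a BOARD GAME Box 7
    245 00 $a Example game $h [game]
    655  4 $a Resource Management.
    655  4 $a Historical.
    690    $a Tabletop games.
    856 42 $u https://example.org/rulebooks/example-game.pdf
           $y connect to rule book (PDF)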
To ensure consistency across the collection, we decided that the first image should always be the top of the game’s box, labeled “Box Cover” or “Box Cover – Front” if there was a “Box Cover – Back” image. (We only displayed the back of the box cover if there was significant information about the game printed on it.) Then we added up to five additional images showing parts of the game like “Card Examples,” “Game Pieces,” and “Game Set-up.” Overall, this number of images worked very well in both Encore’s Attached Media Viewer and the Classic Catalog/Web OPAC, but there is a slight duplication in images by Syndetic Solutions™ for a few games. This results in a larger version of the box top image displaying to the right of the title and above the smaller thumbnails of images we added using Media Management.

In regard to piece counts, we presumed that we would need photographic previews to aid in piece counting upon return of a tabletop game. However, our Emerging Technologies Librarian assured us that because we are an educational institution, we could contact the vendor for free replacement pieces at any time. He also emphasized that unlike the educational manipulatives or the anatomical models, this was a pilot collection, so extensive processing would not be a good investment of our labor. Fortunately, the anatomical model collection would require images for piece counts as well as several other cataloging customizations to increase discoverability and speed circulation.

Anatomical Model Collection

Similar to our educational manipulative collection, but not nearly as extensive, our anatomical model collection has been a part of FLITE since its inception. Unlike the manipulatives, which are used primarily by the early childhood education students, the anatomical models support a range of allied health programs including but not limited to dental hygiene, radiology, and nursing. The majority of our two dozen models were purchased in the 20th century and, like the manipulatives, the majority were cataloged using Innovative’s Course Reserves module. Unfortunately, none of these records were very descriptive, some being so poor as to be merely a title like “Jawbones” and a barcode. So the first task was to match objects with OCLC records. Fortunately, this task became easier once we discovered that it was easier to match the object to the vendor’s catalog image and then search OCLC by vendor model name or number than it is to decipher written descriptions if you do not know human anatomy.

Once good bibliographic records were downloaded, we decided to add one of three GMDs depending on the type of model ([model], [chart], or [flash card]), a custom call number prefix (MODEL #), and one or more of the local subject headings shown below.

690 = (d) Anatomy model. Anatomy models. Dental hygiene model. Dental hygiene models. Anatomy chart. Anatomy charts. Dental model. Dental models.

Technically, all dental models could be used as anatomical models, but not vice versa. Therefore, the common subject headings for the collection are “Anatomy model” and “Anatomy models.” To make things easier to shelve, retrieve, and inventory, we also designed numeric ranges for the call numbers, as shown below, so we would know what type of model we should expect when referring to a specific model number.
099 = (c) MODEL #00X following this hierarchy:
001-099 Anatomical Charts and Flash Cards
100-199 Articulated Skeletons
200-299 Disarticulated Skeletons and Bone Kits
300-399 Organs
400-499 Skulls (anatomical and dental hygiene)
500-599 Other Dental Models (dental studies, dental decks)

We also scanned and linked PDFs of the heavily worn model keys with the link text “connect to key PDF” before washing and rehousing all the models. Once they were clean, they were ready for their shoot with Ferris State University’s Media Production team. Due to winter break, Media Production was able to shoot the majority of the collection fairly quickly. They returned to us high-resolution TIFFs the same size as those for the manipulatives, 900 pixels by 600 pixels. In case of Java viewer failure, we requested that there be one top-level image that showcases exactly what the model contains, with images of individual pieces or drawers as the succeeding images. For example, our disarticulated skeletons are housed in small plastic carts with three drawers in each cart. Therefore, the first image would be a shot of all the pieces of the disarticulated skeleton, the second image would be the contents of the top drawer, the third image the contents of the middle drawer, and the last image the contents of the bottom drawer. In this specific example, we re-used the images that we posted in the catalog record by pasting them on top of the cart to show circulation staff what to expect in each drawer upon check-in.

Overall, photographic previews for this collection appear to be working very well for both catalog users and circulation staff “…to inform users about size, extent, and availability of collections or objects.”24 In fact, they have been working so well for this collection that usage has increased exponentially compared to previous years.

Figure 1. Circulation Statistics 2014-2016 [Bar chart of annual check-outs: manipulatives 367 (2014), 317 (2015), 114 (2016); models 10 (2014), 1 (2015), 444 (2016); games 24 (2016 only)]

CONCLUSIONS AND FUTURE DIRECTIONS

Although we implemented photographic previews for three realia collections, we could not define any standard workflow for the process beyond correcting or downloading the metadata first and adding the images second. Part of this is due to our working primarily with legacy collections, because we often discovered issues, like the model keys, while working through another issue. The other part is due to the nuances involved in processing realia in general. Even with good, readily available catalog records like those for the tabletop games, time still had to be spent separating, organizing, and rehousing game pieces as well as hunting down useful images. Unfortunately, any type of realia processing, even if it is just textual description, is much more time-consuming than the majority of academic library cataloging. Adding in the extra steps to create, upload, and link a photographic preview can nearly double that labor investment. Notwithstanding, as Butler and Kvenild advocate, “…not supplying images as metadata for items that most need them (i.e. kits, games, and models) is to make them nearly irretrievable.
Providing bare-bones traditional metadata for these items is analogous to delegating them to the backlog shelves of yesteryear.”25

Unfortunately, neither the library management system nor the third-party catalog enhancement market currently provides a good solution to this problem. Considering how great an impact photographic previews have had in the online retail market, this lack of technical support is surprising. Yes, Syndetic Solutions™ is a great product for cover images and tables of contents for books. However, once you go beyond traditional resources, there is a great need to allow institutions to submit their own images as part of catalog record enhancement, and not to serve as separate digital surrogates in a digital repository. This could be done either within the library management system, like the Media Management module, or as an option for catalog enhancement where libraries could add images to either a shared database or their own database using standard identifiers on a third-party platform like Syndetics™.

Further research on photographic previews is also sorely needed. As of this writing, we only have a handful of case studies and some guiding philosophy on the use of previews. Consultation with internet retailers and literature on online marketing might be more applicable than library science research to evaluate their impact, but research into their direct impact vs. textual descriptions on catalog use would be ideal.

REFERENCES

1. Fact Book 2015–2016 (Big Rapids, MI: Ferris State University Institutional Research & Testing, 2016), http://www.ferris.edu/HTMLS/admision/testing/factbook/FactBook15-16-2.pdf, 47.
2. Ibid., 12.
3. Marcia Butler and Cassandra Kvenild, “Enhancing Catalog Records with Photographs for a Curriculum Materials Center,” Technical Services Quarterly 31 (2014): 122-138, https://doi.org/10.1080/07317131.2014.875377, 122-124.
4. Ibid., 126.
5. Martha Fallahay Loesch and Marta Mestrovic Deyrup, “Cataloging the Curriculum Library: New Procedures for Non-Traditional Formats,” Cataloging & Classification Quarterly 34, no. 4 (2002): 79-89, https://doi.org/10.1300/J104v34n04_08, 82.
6. Butler and Kvenild, “Enhancing Catalog Records with Photographs,” 128.
7. Stephan Greene, Gary Marchionini, Catherine Plaisant, and Ben Shneiderman, “Previews and Overviews in Digital Libraries: Designing Surrogates to Support Visual Information Seeking,” Journal of the American Society for Information Science 51, no. 4 (2000): 380-393, https://doi.org/10.1002/(SICI)1097-4571(2000)51:4<380::AID-ASI7>3.0.CO;2-5, 381.
8. Ibid.
9. Ibid., 384.
10. Ibid., 385.
11. Ibid.
12. Paul Moeller, “Enhancing Access to Rare Journals: Cover Images and Contents in the Online Catalog,” Serials Review 33, no. 4 (2007): 231-237, https://doi.org/10.1016/j.serrev.2007.09.003, 235.
13. Butler and Kvenild, “Enhancing Catalog Records with Photographs,” 128.
14. Greene et al., “Previews and Overviews in Digital Libraries,” 388.
15. Butler and Kvenild, “Enhancing Catalog Records with Photographs,” 124.
16. Moeller, “Enhancing Access to Rare Journals,” 234.
17. Butler and Kvenild, “Enhancing Catalog Records with Photographs,” 129.
18. Ibid., 132.
19. Ibid., 126.
20.
Moeller, “Enhancing Access to Rare Journals,” 237.
21. Butler and Kvenild, “Enhancing Catalog Records with Photographs,” 131.
22. Ibid., 135.
23. Ibid., 134.
24. Greene et al., “Previews and Overviews in Digital Libraries,” 386.
25. Butler and Kvenild, “Enhancing Catalog Records with Photographs,” 136.

President’s Message: Reflections on LITA’s Past and Future

Aimee Fifarek

Aimee Fifarek (aimee.fifarek@phoenix.gov) is LITA President 2016-17 and Deputy Director for Customer Support, IT and Digital Initiatives at Phoenix Public Library, Phoenix, AZ.

When I reached out to ITAL Editor Bob Gerrity about my first President’s Column, he graciously provided copies of past LITA Presidents’ columns to get me started. It reminded me once again of the illustrious company I am in, starting with Stephen R. Salmon, the first president of the Information Services and Automation Division, as we were known until 1977. I am proud to be at the head of LITA as it begins to celebrate its 50th Anniversary year.

A half century ago, when LITA was founded, the world was experiencing an era of profound technological change. The US and Soviet Union were battling to be first in the Space Race, and an increasing number of world powers were engaging in nuclear testing. While Civil Rights demonstrations and the fighting in Vietnam dominated the news, we were imagining peace via the technologically-driven future depicted in a new TV series called Star Trek. With TV focused on the stars, we were able to go to the movies and explore the strange new world of inner space in Fantastic Voyage. Technology was poised to enter our daily lives as well, with Diebold demonstrating the first ATM1 and Ralph H. Baer writing the 4-page paper that would lay the foundation for the video game industry.2

Heady times for technology indeed, and the fact that Libraries were sufficiently advanced to require an Association dedicated to supporting technologists is hardly surprising. By the time of LITA’s founding at the 1966 Midwinter Meeting in Chicago, library automation had been in development for over a decade.3 MARC was just being invented, with the first tapes from the Library of Congress scheduled to go to the sixteen pilot libraries later that year. Membership in the only organization that existed, the Committee on Library Automation (COLA), was restricted to the handful of professionals who either developed or managed existing library systems. But technology was beginning to impact many more librarians than just those rarified few. According to President Salmon, “It was clear that large numbers of librarians who didn't meet COLA's standards for membership were in need of information on library automation and wanted leadership.”4 The first meeting of our Division on July 14, 1966 at the ALA Annual Conference in New York was attended by several hundred librarians interested in information sharing, technology standards, and technology training for library staff. This group created the first mission, vision, and bylaws that set us on a 50-year path of success.

LITA is well positioned to take the first steps into our next 50 years. Thanks to the efforts of last year’s LITA Board, we are on the verge of adopting a new two-year strategic plan that is designed to guide us through the current transitional period.
It will be accompanied by a tactical plan that will allow us to document our accomplishments and set the stage for an ongoing culture of continuous planning. Also, Jenny Levine has proven to be extremely capable as she completes her first year as LITA Executive Director. She has just the right combination of ALA experience, technology know-how, and calm competence to guide us through the retooling and reimagining that is required to take a middle-aged Association into the next phase of its life.

The four areas of focus in the new strategic plan will help us to balance our efforts between preserving the strengths of our past and adapting our organization for a successful future. The first area of focus, Member Engagement, shows that our primary commitment needs to be to LITA members. Without you, LITA would not exist. One of the key efforts is to increase the value of LITA for members who are unable to travel to conferences. With travel budgets down and staying low, online member engagement is an area all of ALA needs to improve, and who better to lead in this area than LITA. The next area, Organizational Sustainability, is all about keeping the infrastructure of the organization strong, much of which happens in the domain of LITA staff. Budgeting, quality communication, and strategic planning all live here. The section on Education and Professional Development recognizes the important role that webinars, online courses, our online journal, and print publications play in allowing LITA members to share their knowledge on both cutting-edge and practical topics with the rest of the Association and ALA in general. We are already doing great work here and we need to better support and expand these efforts. The last focus area, Advocacy and Information Policy, represents a future growth area for LITA. Now that everyone in the library world “does” technology to a certain extent, LITA needs to think about how we will differentiate ourselves as outside competencies increase. Our advantage is that we have been doing and thinking about technology for much longer than anyone else. With our vast wealth of experience, it’s appropriate that we work to become thought leaders and implementers in the information policy realm.

In this, as always, we return to where we started: our members. LITA has thrived over the last 50 years because of this, our most important resource. LITA was founded on the concept of sharing information about technology through conversation, publications, and knowledge creation. We endure because you, the committed, passionate information professionals, are willing to share what you know with those who come after. And like our founders, there are always individuals who are willing to take on the mantle of leadership, whether through getting elected to LITA Board, becoming a Committee or Interest Group Chair, serving in key editorial roles for our monographs, journal, and blog, or joining the all-important LITA Staff. Thanks to all of you who make LITA’s future happen every day. I am proud to be in your company.

REFERENCES

1. Alan Taylor, “50 years ago: a look back at 1966,” The Atlantic Photo, March 23, 2016, http://www.theatlantic.com/photo/2016/03/50-years-ago-a-look-back-at-1966/475074/, Photo 46.
2. “Take me back to August 30, 1966,” http://takemeback.to/30-August-1966#.V8SzItLrtaQ.
3. “Library Technology Timeline,” http://web.york.cuny.edu/~valero/timeline_reference_citations.htm.
4. Stephen R.
Salmon, “LITA’s First 25 Years, a Brief History,” http://www.ala.org/lita/about/history/1st25years.

Editorial Board Thoughts: Requiring and Demonstrating Technical Skills for Library Employment

Emily Morton-Owens

Emily Morton-Owens (egmowens@upenn.edu), a member of the ITAL Editorial Board, is Director of Digital Library Development and Systems, University of Pennsylvania Libraries, Philadelphia, Pennsylvania.

Recently I’ve been involved in a number of conversations about technical skills for library jobs, sparked by an ITAL article by Monica Maceli1 and a code4lib presentation by Jennie Rose Halperin.2 Maceli performed a text analysis of job postings on code4lib to reveal what skills are co-occurring and most frequent. Halperin problematized the expense of the MLS credential in comparison to the qualifications actually required by library technology jobs and the salaries offered for technical versus nontechnical work. This work has inspired many conversations about the shift in skills required for library work, the value placed on different kinds of labor, and how MLS programs can teach library technology.

During a period of hiring at my institution and through teaching a library school course in which many of the students are on the brink of graduation, my attention has been called particularly to one point in the library employment process: job postings. These advertisements are the first step in matching aspiring library staff with the real-life needs of libraries—where the rubber meets the road between employer expectations and new-grad experience. Most libraries already use the practice of distinguishing between required and preferred qualifications, which is a good start, especially for technology jobs where candidates may offer strong learning proficiency yet lack a few particular tools. Although there have been conflicting interpretations of the Hewlett-Packard research suggesting that men are more likely than women to apply to jobs when they don’t meet all the requirements,3 I observe a general tendency among graduating students to err on the side of caution because they’re not sure which qualifications they can claim.

Among my students, for example, constant confusion attends the years of experience required. Is this library experience? General job experience? Experience at the same type of library? Paid or unpaid? Postings are often ambiguous, and students may choose to apply or not. Similarly, there are questions about what extent of experience qualifies someone to know a technology: mastering it through creating new projects at a paid job, experience maintaining it, or merely basic familiarity? Not knowing who has been hired, and on the basis of what kind of experience, is a gap for researchers trying to close the loop on job advertisements.

Even when a job posting has avoided an overlong list of required technical skills, it might still be expressing a narrow sense of what’s required to qualify. Someone who understands Subversion will be capable of understanding Git, so we see plenty of job advertisements that ask for experience with “a version control system (e.g., Git, Subversion, or Mercurial).” I recently polled staff in our department and found very few of us with bachelor’s degrees in technical subjects. More of us had come to working in library technology through work experience or graduate programs.
And yet, our job postings contained long statements that conflated education and experience, such as “Bachelor’s degree in Computer Science, Information Science, or other relevant field and at least 3 years of experience application development in Object Oriented and scripting languages or equivalent combination of education and experience. Master’s desirable.” I edited our statement to more clearly allow a combination of factors that would show sufficient preparation: “Bachelor’s degree and a minimum of 3-5 years of experience, or an equivalent combination of education and experience, are required; a Master’s degree is preferred,” followed by a separate description of technical skills needed. This increased the number and quality of our applications, so I’ll remain on the lookout for opportunities to represent what we want to require more faithfully and with an open mind.

Meanwhile, on the other side of the table, students and recent grads are uncertain how to demonstrate their skills. First, they’re wondering how to show clearly enough that they meet requirements like “three years of work experience” or “experience with user testing” so that their application is seriously considered. Second, they ask about possibilities to formalize skills. Recently, I’ve gotten questions about a certificate program in UX and whether there is any formal certification to be a systems librarian. Surveying the past experience of my own network—with very diverse paths into technology jobs ranging from undergraduate or second master’s degrees to learning scripting as a technical services librarian to pre-MLS work experience—doesn’t suggest any standard method for substantiating technical knowledge.

Once again, the truth of the situation may be that libraries will welcome a broad range of possible experience, but the postings don’t necessarily signal that. Some advice from the tech industry about how to be more inviting to candidates applies to libraries too; for example, avoiding “rockstar”/“ninja” descriptions, emphasizing the problem space over years of experience,4 and designing interview processes that encourage discussion rather than “gotcha” technical tasks. At Penn Libraries, for example, we’ve been asking developer candidates to spend a few hours at most on a take-home coding assignment, rather than doing whiteboard coding on the spot. This gives us concrete code to discuss in a far more realistic and relaxed context.

While it may be helpful to express requirements better to encourage applicants to see more clearly whether they should respond to a posting, this is a small part of the question of preparing new MLS grads for library technology jobs. The new grads who are seeking guidance on substantiating their skills are the ones who are confident they possess them. Others have a sense that they should increase their comfort with technology but are not sure how to do it, especially when they’ve just completed a whole new degree and may not have the time or resources to pursue additional training. Even if we make efforts to narrow the gap between employers and job-seekers, much remains to be discussed regarding the challenge of readying students with different interests and preparation for library employment.
Library school provides a relatively brief window to instill in students the fundamentals and values of the profession, and it can’t be repurposed as a coding academy. There persists a need to discuss how to help students interested in technology learn and demonstrate competencies rather than teaching them rapidly shifting specific technologies.

REFERENCES

1. Monica Maceli, “What Technology Skills Do Developers Need? A Text Analysis of Job Listings in Library and Information Science (LIS) from Jobs.code4lib.org,” Information Technology and Libraries 34, no. 3 (2015): 8-21, https://doi.org/10.6017/ital.v34i3.5893.
2. Jennie Rose Halperin, “Our $50,000 Problem: Why Library School?” code{4}lib, http://code4lib.org/conference/2015/halperin.
3. Tara Sophia Mohr, “Why Women Don’t Apply for Jobs Unless They’re 100% Qualified,” Harvard Business Review, August 25, 2014, https://hbr.org/2014/08/why-women-dont-apply-for-jobs-unless-theyre-100-qualified.
4. Erin Kissane, “Job Listings That Don’t Alienate,” https://storify.com/kissane/job-listings-that-don-t-alienate.

Technology Skills in the Workplace: Information Professionals’ Current Use and Future Aspirations

Monica Maceli and John J. Burke

Monica Maceli (mmaceli@pratt.edu) is Assistant Professor, School of Information, Pratt Institute, New York. John J. Burke (burkejj@miamioh.edu) is Library Director and Principal Librarian, Gardner-Harvey Library, Miami University Middletown, Middletown, Ohio.

ABSTRACT

Information technology serves as an essential tool for today’s information professional, and ongoing research is needed to assess the technological directions of the field over time. This paper presents the results of a survey of the technologies used by library and information science practitioners, with attention to the combinations of technologies employed and the technology skills that practitioners wish to learn. The most common technologies employed were email, office productivity tools, web browsers, library catalog- and database-searching tools, and printers, with programming topping the list of most-desired technology skill to learn. Similar technology usage patterns were observed for early and later-career practitioners. Findings also suggested the relative rarity of emerging technologies, such as the makerspace, in current practice.

INTRODUCTION

Over the past several decades, technology has rapidly moved from a specialized set of tools to an indispensable element of the library and information science (LIS) workplace, and today it is woven throughout all aspects of librarianship and the information professions. Information professionals engage with technology in traditional ways, such as working with integrated library systems, and in new innovative activities, such as mobile-app development or the creation of makerspaces.1 The vital role of technology has motivated a growing body of research literature, exploring the application of technology tools in the workplace, as well as within LIS education, to effectively prepare tech-savvy practitioners. Such work is instrumental to the progression of the field, and with the rapidly-changing technological landscape, requires ongoing attention from the research community.
One of the most valuable perspectives in such research is that of the current practitioner. Understanding current information professionals’ technology use can help in understanding the role and shape of the LIS field, provide a baseline for related research efforts, and suggest future directions. The practitioner perspective is also valuable in separating the hype that often surrounds emerging technologies from the reality of their use and application within the LIS field.

This paper presents the results of a survey of LIS practitioners, oriented toward understanding the participants’ current technology use and future technology aspirations. The guiding research questions for this work are as follows:

1. What combinations of technology skillsets do LIS practitioners commonly use?
2. What combinations of technology skillsets do LIS practitioners desire to learn?
3. What technology skillsets do newer LIS practitioners use and desire to learn as compared to those with ten-plus years of experience in the field?

LITERATURE REVIEW

The growth and increasing diversity of technologies used in library settings has been matched by a desire to explore how these technologies impact expectations for LIS practitioner skill sets. Triumph and Beile examined the academic library job market in 2011 by describing the required qualifications for 957 positions posted on the ALA JobLIST and ARL Job Announcements websites.2 The authors also compared their results with similar studies conducted in 1996 and 1988 to see if they could track changes in requirements over a twenty-three-year period. They found that the number of distinct job titles increased in each survey because of the addition of new technologies to the library work environment that require positions focused on handling them. The comparison also found that computer skills as a position requirement increased by 100 percent between 1988 and 2011, with 55 percent of 2011 announcements requiring them.

Looking more deeply at the technology requirements specifically, Mathews and Pardue conducted a content analysis of 620 job ads from the ALA JobList to identify skills required in those positions.3 The top technology competencies required were web development, project management, systems development, systems applications, networking, and programming languages. They found a significant overlap of librarian skill sets with those of IT professionals, particularly in the areas of web development, project management, and information systems.
Riley-Huff and Rholes found that the most commonly sought technology-related job titles were systems/automation librarian, digital librarian, emerging and instructional technology librarian, web services/development librarian, and electronic resources librarian.4 A few years later, Maceli added to this list with newly popular technology-related titles, including emerging technologies librarian, metadata librarian, and user experience/architect librarian.5

Beyond examining which specific technologies librarians should be able to use, researchers have also pondered whether a list of skills is even possible to create. Crawford synthesized a series of blog posts from various authors to discuss which technology skills are essential and which are too specialized to serve as minimum technology requirements for librarians.6 He questioned whether universal skill sets should be established given the variety of tasks within libraries and the unique backgrounds of each library worker. Crawford also questioned the expectation that every librarian will have a broad array of technology skills from programming to video editing to game design and device troubleshooting.

Partridge et al. reported on a series of focus groups held with 76 librarians that examined the skills required for members of the profession, especially those addressing technology.7 In the questions they asked the focus groups, the authors focused on the term “library 2.0” and attempted to gather suggestions on skills that current and future librarians need to assist users. They concluded that the groups identified that a change in attitudes by librarians was more important to future library service than the acquisition of skills with specific technology tools. Importance was given to librarians’ abilities to stay aware of technological changes, be resilient and reflective in the face of them, and to communicate regularly and clearly with the members of their communities.

Another area examined in the studies is where the acquisition of technology skills should and does happen for librarians. Riley-Huff and Rholes reported on a dual approach to measure librarians’ preparation for performing technology-related tasks.8 The authors assessed course offerings for LIS programs to see if they included sufficient technology preparation for new graduates to succeed in the workplace. They then surveyed LIS practitioners and administrators to learn how they acquired their skills and how difficult it is to find candidates with enough technology preparation for library positions. Their findings suggest that while LIS programs offer many technology courses, they lack standardization, and graduates of any program cannot be expected to have a broad education in library technologies. Further research confirmed this troubling lack of consistency in technology-related curricula. Singh and Mehra assessed a variety of stakeholders, including students, employers, educators, and professional organizations, finding widespread concern about the coverage of technology topics in LIS curricula.9 Despite inconsistencies between individual programs, several studies provided a holistic view of the popular technology offerings within LIS curricula.
Programs commonly offered one or more introductory technology courses, as well as courses in database design and development, web design and development, digital libraries, systems analysis, and metadata.10,11,12 As researchers have emphasized from a variety of perspectives, new graduates could not realistically be expected to know every technology with application to the field of information.13 There was widespread acknowledgement that learning in this area can, and must, continue in a lifelong fashion throughout one’s career. Riley-Huff and Rholes reported that LIS practitioners saw their own experiences involving continuing skill development on the job, both before and after taking on a technology role.14 However, literature going back many decades suggests that the increasing need for continuing education in information technology has generally not been matched by increasing organizational support for these ventures. Numerous deterrents to continuing technology education were noted, including lack of time,15 organizational climate, and the perception of one’s age.16 While studies in this area have primarily focused on MLS-level positions, Jones reported on academic library support staff members and their perceptions of technology use over a ten-year period and found that increased technology responsibilities added to workloads and increased workplace stress.17 Respondents noted that increasing use of technology in their libraries has increased their individual workloads along with the range of responsibilities that they hold.

METHOD

To build an understanding of the research questions stated above, which focus on the technologies currently used by information professionals and those they desired to learn, we designed and administered a thirteen-question anonymous survey (see appendix) to the subscribers of thirty library-focused electronic discussion groups between February 25 and March 13, 2015. The groups were chosen to target respondents employed in multiple types of libraries (academic, public, school, and special) with a wide array of roles in their libraries (public services librarians, systems staff members, catalogers, and so on). We solicited respondents with an email sent to the groups asking for their participation in the survey and with the promise to post initial results to the same groups. The survey included closed and open-ended questions oriented toward understanding current technology use and future aspirations as well as capturing demographics useful in interpreting and generalizing the results. The survey questions have been previously used and iteratively expanded over time by the second author, first in the fall of 2008, then spring of 2012, with summative results presented in the last three editions of the Neal-Schuman Library Technology Companion.

We obtained a total of 2,216 responses to the question, “Which of the following technologies or technology skills are you expected to use in your job on a regular basis?” Of these responses, 1,488 (67 percent) of the respondents answered the question regarding technologies they would like to learn: “What technology skill would you like to learn to help you do your job better?” We conducted basic reporting of response frequency for closed questions to assess and report the demographics of the respondents.
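As an illustration of that kind of frequency reporting, the following base R sketch tallies a multi-select question; the file name, column name, and semicolon delimiter are hypothetical, since the raw data layout is not described here.

    # Hypothetical sketch: tally a multi-select survey question in base R.
    # Assumes each respondent's choices are stored as one semicolon-delimited
    # string in a column named "tasks" of survey.csv (not the authors' data).
    responses <- read.csv("survey.csv", stringsAsFactors = FALSE)

    choices <- trimws(unlist(strsplit(responses$tasks, ";")))
    freq <- sort(table(choices), decreasing = TRUE)

    # Percentages are computed against respondents, not total selections,
    # because each respondent could pick several options
    pct <- round(100 * as.integer(freq) / nrow(responses), 1)
    print(data.frame(task = names(freq), n = as.integer(freq), pct = pct))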
To analyze the open-ended survey question results in greater depth, we conducted a textual analysis using the R statistical package (https://www.r-project.org/). We used the tm (text mining) package in R (http://CRAN.R-project.org/package=tm) to calculate term frequency and correlation, generate plots, and cluster terms.

RESULTS

The following section will first present an overview of survey responses and respondents, and then explore results as related to the stated research questions. The LIS practitioners who responded to the survey reported that their libraries are located in forty US states, eight Canadian provinces, and forty-three other countries. Academic libraries were the most common type of library represented, followed by public, school, special, and other (see table 1).

Library Type | Number of Respondents | Percentage of All Respondents
Academic | 1,206 | 54.4
Public | 545 | 24.6
School | 266 | 12.0
Special | 138 | 6.2
Other | 61 | 2.8

Table 1. The types of libraries in which survey respondents work

Respondents also provided their highest level of education. A total of 77 percent of responding LIS practitioners have earned a library-related or other master’s degrees, dual master’s degrees, or doctoral degrees. From these reported levels of education, it is likely that more respondents are in librarian positions than in library support staff positions. However, individuals with master’s degrees serve in various roles in library organizations, so the percentage of graduate degree holders may not map exactly to the percentage of individuals in positions that require those degrees. Significantly fewer respondents (16 percent) reported holding a high school diploma, some college credit, an associate degree, or a bachelor’s degree as their highest level of education.

Another aspect we measured in the survey was tasks that respondents performed on a regular basis. The range of tasks provided in the survey allowed for a clearer analysis of job responsibilities than broad categories of library work such as “public services” or “technical services.” Some respondents appeared to be employed in solo librarian environments where they are performing several roles. Even respondents who might have more focused job titles such as “reference librarian” or “cataloger” may be performing tasks that overlap traditional roles and categories of library work. The tasks offered in the survey and the responses to each are shown in table 2.

Task | Number of Respondents | Percentage of Respondents
Reference | 1,404 | 63.4
Instruction | 1,296 | 58.5
Collection development | 1,260 | 56.9
Circulation | 917 | 41.4
Cataloging | 905 | 40.8
Electronic resource management | 835 | 37.7
Acquisitions | 789 | 35.6
User experience | 775 | 35.0
Library administration | 769 | 34.7
Outreach | 758 | 34.2
Marketing/public relations | 722 | 32.6
Library/IT systems | 672 | 30.3
Periodicals/serials | 659 | 29.7
Media/audiovisuals | 566 | 25.5
Interlibrary loan | 518 | 23.4
Distance library services | 474 | 21.4
Archives/special collections | 437 | 19.0
Other | 209 | 9.4

Table 2. Tasks performed on a regular basis by survey respondents
While public services-related activities lead the list, with reference, instruction, collection development, and circulation as the top four task areas, technical services-related activities are well represented; the next three in rank are cataloging, electronic resource management, and acquisitions. The overall list of tasks shows the diversity of work LIS practitioners engage in, as each respondent chose an average of six tasks. The results also suggest that the survey respondents are well acquainted with a wide variety of library work rather than only having experience in a few areas, making their uses of technology more representative of the broader library world.

The survey also questioned the barriers LIS practitioners face as they try to add more technology to their libraries, and 2,161 respondents replied to the question, “Which of the following are barriers to new technology adoption in your library?” Financial considerations proved to be the most common barrier, with “budget” chosen by 80.7 percent of respondents, followed by “lack of staff time” (62.4 percent), “lack of staff with appropriate skill sets” (48.5 percent), and “administrative restrictions” (36.7 percent).

What Combinations of Technology Skillsets do LIS Practitioners Commonly Use?

Responses from survey question 8, “Which of the following technologies or technology skills are you expected to use in your job on a regular basis?,” were analyzed to build an understanding of this research question. A total of 2,216 responses to this question were received. Survey respondents were asked to select from a detailed list of technologies/skills (visible in question 8 of the appendix) that they regularly used. The top answers respondents chose for this question were: email, word processing, web browser, library catalog (public side), and library database searching. The full list of the top twenty-five technology skills and tools used is detailed in figure 1, with the list of the bottom fifteen technology skills used presented in figure 2.

Figure 1. Top twenty-five technology skills/tools used by respondents (N = 2,216). [Bar chart; items in decreasing order: email, word processing, web browser, library catalog (public side), library database searching, spreadsheets, printers, web searching, teaching others to use technology, presentation software, Windows OS, laptops, scanners, library management system (staff side), downloadable ebooks, web-based ebook collections, cloud-based storage, technology troubleshooting, teaching using technology, online instructional materials/products, tablets, web video conferencing, educational copyright knowledge, library website creation or management, and cloud-based productivity apps]

Figure 2. Bottom fifteen technology skills/tools used by respondents (N = 2,216). [Bar chart; items: Mac OS, audio recording and editing, technology equipment installation, computer programming or coding, assistive/adaptive technology, RFID, Chromebooks, network management, server management, statistical analysis software, makerspace technologies, Linux, 3D printers, augmented reality, and virtual reality]

Text analysis techniques were then used to determine the frequent combinations of technology skills used in practice. First, a clustering approach was taken to visualize the most popular technologies that were commonly used in combination (figure 3). Clustering helps in organizing and categorizing a large dataset when the categories are not known in advance, and, when plotted in a dendrogram chart, assists in visualizing these commonly co-occurring terms.
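For readers unfamiliar with the tm package, the following minimal sketch shows one way such a dendrogram can be produced: each respondent's selections are treated as a "document," a term-document matrix is built, and the terms are clustered. The sample selections are invented, and the distance and linkage settings are illustrative rather than the authors' documented choices.

    # Sketch: cluster commonly co-selected skills with hclust().
    # The data are invented; distance/linkage choices are illustrative.
    library(tm)

    selections <- c("email word_processing web_browser",
                    "email spreadsheets printers",
                    "web_browser library_catalog database_searching",
                    "email word_processing spreadsheets")

    tdm <- TermDocumentMatrix(Corpus(VectorSource(selections)))
    m <- as.matrix(tdm)

    # Distance between skills (matrix rows), then hierarchical clustering
    fit <- hclust(dist(scale(m)), method = "ward.D2")
    plot(fit, main = "Skills that co-occur across respondents")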
The authors numbered the clusters identified in figure 3 for ease of reference. From left to right, the first cluster focuses on communication and educational tools, the second emphasizes devices and software, the third contains web and multimedia creation tools, the fourth contains office productivity and public-facing information retrieval tools, and the fifth cluster has a diverse collection of responsibilities including systems-oriented responsibilities (from operating systems to specific hardware devices), working with ebooks, teaching with technology, and teaching technology to others.

Figure 3. Cluster analysis of most frequent technology skills used in practice, with red outlines on each numbered cluster

Notably, the list of top skills used (figure 1) falls more on the end-user side of technology; skills more oriented toward systems work (e.g., Linux, server management, and computer programming or coding) were less frequently mentioned, and several were among the lowest reported (figure 2). Of the 2,216 respondents, 15 percent used programming or coding skills regularly in their job (which is of interest, as programming or coding was the skill respondents most desired to learn; this will be discussed further in the context of the next research question).

Plotting the correlations between the more advanced technology skillsets can provide a picture of the work such systems-oriented positions are commonly responsible for, particularly as they are less well represented in the responses as a whole. Figure 4 plots the correlated terms for those tasked with "server management." It is fair to assume someone with such responsibilities falls on the highly technical end of the spectrum.

Figure 4. Terms correlated with "server management," indicating commonly co-occurring workplace technologies for highly technical positions

The more common task of "library website creation or management," which fell to those with a broad level of technological expertise, had numerous correlated terms. Figure 5 demonstrates a wide array of technology tools and responsibilities.

Figure 5. Terms correlated with "library website creation or management," indicating commonly co-occurring technologies used on the job

Lastly, teaching using technology and teaching technology to others are long-standing responsibilities of librarians and library staff. The following plot (figure 6) presents the skills correlated with "teaching others to use technology."

Figure 6. Terms correlated with "teaching others to use technology," indicating commonly co-occurring technologies used on the job
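Correlation profiles like those in figures 4 through 6 are what tm's findAssocs() computes; a sketch against the tdm object from the previous example (the 0.3 cutoff is an arbitrary illustration, not the authors' value):

    # Terms whose occurrence correlates with "server" at or above the cutoff,
    # a rough analogue of the figure 4 profile for server management
    findAssocs(tdm, terms = "server", corlimit = 0.3)

    # The same call profiles broader skills, such as website creation
    # (figure 5) or teaching others to use technology (figure 6)
    findAssocs(tdm, terms = "website", corlimit = 0.3)
    findAssocs(tdm, terms = "teaching", corlimit = 0.3)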
What Combinations of Technology Skillsets do LIS Practitioners Desire to Learn?

We analyzed responses to survey question 10, "What technology skill would you like to learn to help you do your job better?," to explore this research question. As summarized in Burke,18 and consistent with the prior year's findings, coding or programming remained the most desired technology skillset, mentioned by 19 percent of respondents. The raw text analysis yielded a fuller list of the top terms mentioned by participants (table 3; visualized in figure 7).

Technology Term                                  Number of Respondents   Percentage of Respondents
Coding or programming (combined for reporting)   292                     19.59
Web                                              178                     11.96
Software                                         158                     10.62
Video                                            112                     7.53
Apps                                             106                     7.12
Editing                                          105                     7.06
Design                                           85                      5.71
Database                                         76                      5.11

Table 3. Terms mentioned by 5 percent or more of survey respondents

Figure 7. Wordcloud of responses to "What technology skill would you like to learn to help you do your job better?"

We then explored the deeper context of responses and individually analyzed responses specific to the more popular technology desires. First, we assessed the responses mentioning the desire to learn coding or programming. Of these responses, the most common specific technologies mentioned were HTML, Python, CSS, JavaScript, Ruby, and SQL, listed in decreasing order of interest. Although most participants did not describe what they would like to do with their desired coding or programming skills, of those that did, the responses indicated interest in

● becoming more empowered to solve their own technology problems (e.g., "I would like to learn the [programming languages] so I don't have to rely on others to help with our website," "I'm one of the most tech-skilled people at my library, but I'd like to be able to build more of my own tools and manage systems without needing someone from IT or outside support.");

● improving communication with IT (e.g., "how to speak code, to aid in communication with IT," "to better identify problems and work with IT to fix them");

● creating novel tools and improving system interoperability (e.g., "coding for app and API creation"); and

● bringing new technologies to their library and patrons (e.g., "coding so that I can incorporate a hackerspace in my library").

Next, we took a clustering approach to visualize the terms commonly desired in combination. Figure 8 describes the clustered terms that we found within the programming or coding responses. The terms "programming" and "coding" form a distinct cluster to the right of the diagram, indicating that many responses contained only those two terms.

Figure 8. Clustering of terms present in responses indicating the desire to learn coding or programming

The remaining portion of the diagram begins to illustrate the specific technologies mentioned by those respondents who answered in greater detail or expanded on their general answer of programming or coding. Other related desired technology-skill areas become apparent: database management, HTML and CSS (as well as the more general "web design," which appeared in the top terms in table 3), PHP and JavaScript, Python and SQL, and XML creation, among others. The bulleted list presented in the previous paragraph illustrates some of the potential applications participants envisioned these skills being useful in, but the majority did not provide this level of detail in their responses.
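A term cloud like figure 7 can be drawn from the same term-document matrix. The article does not name the plotting tool, so this sketch assumes the commonly used wordcloud package:

    library(wordcloud)

    # Total mentions of each term across all responses
    freqs <- sort(rowSums(as.matrix(tdm)), decreasing = TRUE)

    # Plot terms sized by frequency, with the most frequent terms centered
    wordcloud(words = names(freqs), freq = freqs,
              min.freq = 10, random.order = FALSE)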
Editing was another prominent term that appeared across participant responses and was largely meant in the context of video editing. Because of the vagueness of the term "editing," a closer look was necessary to determine other technology desires. Looking at terms highly correlated with "editing" revealed both video and photo editing to be important to respondents. Several of the top-appearing terms were used more generally: "database" and mobile "apps" were mentioned without specifying the technology tool or scenario of use, such that a more contextual analysis could not be conducted. These responses can be particularly difficult to interpret, as the term "databases" can have a technical meaning (e.g., working with SQL) or it can refer to the use of library databases from an end-user perspective.

What Technology Skillsets do Newer LIS Practitioners Use and Desire to Learn as Compared to Those with Ten-Plus Years' Experience in the Field?

Of the 2,216 survey responses, 877 stated they had worked in libraries for ten or fewer years. We analyzed these responses separately from the remaining 1,334 respondents who had worked in libraries for more than ten years. Of this group, 644 had worked in libraries for twenty-one or more years (figure 9). A handful of participants did not answer the question and were omitted from the analysis.

Figure 9. Number of survey responses falling into the various categories for number of years working in libraries

The top technology skills used in the workplace did not differ significantly between the different groups. The top skills, as discussed earlier and presented in figure 1, were well represented and similarly ordered. A few small percentage points of difference were noted in a handful of the top skills (figure 10). Those newer to the field were slightly more likely to teach others to use technology, use cloud-based storage, and use cloud-based productivity apps. More experienced practitioners regularly used the library management system (on the staff side) more than those who were newer to the field.

Figure 10. Top twenty-five technology skills used by respondents in the zero to ten years' experience (dark blue) and eleven-plus years' experience (light blue) groups

For the question regarding technologies they would like to learn, 69 percent of the participants with zero to ten years' experience answered the question, compared to a slightly smaller 65 percent of the participants with more than ten years' experience. Top terms for both groups were very similar, including coding or programming, software, web, video, design, and editing. These terms were not dissimilar to the responses taken as a whole (table 3), indicating that respondents were generally interested in learning the same sorts of technology skills regardless of how long they had been in the field.
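A comparison like figure 10 only requires splitting respondents at the ten-year mark and computing per-group selection rates. A minimal sketch, assuming a hypothetical data frame survey with a numeric years column and one logical column per skill:

    # Split respondents into newer and more experienced groups
    newer <- survey[survey$years <= 10, ]
    veteran <- survey[survey$years > 10, ]

    # Percentage of a group reporting a given skill
    skill_pct <- function(group, skill) round(100 * mean(group[[skill]]), 1)

    skill_pct(newer, "cloud_storage")    # hypothetical column name
    skill_pct(veteran, "cloud_storage")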
A few noticeable differences between the two groups emerged. The most popular desired skill, coding or programming, was mentioned by 28 percent of the respondents with zero to ten years' experience and by 15 percent of the respondents with eleven-plus years' experience. There was slightly more interest (by a few percentage points) in databases, design, Python, and Ruby in the zero to ten years' experience group. Taking a closer look at the different year ranges within the zero to ten years' experience group revealed that those with three to five years of experience were most likely to be interested in learning coding or programming skills.

Figure 11. Percentage of respondents interested in learning coding or programming in the groups with ten or fewer years' experience

Of the participants who answered the question at all, several stated that there were no technology skills they would need or like to learn for their position, either because they were comfortable with their existing skills or were simply open to learning more as needed (but nothing specific came to mind). Combined with those who did not answer the question (and so presumably did not have a particular technology they were interested in learning), 28 percent of the zero to ten years' experience group and 31 percent of the eleven-plus years' experience group did not have any technologies that they desired to learn at the moment.

DISCUSSION

As detailed earlier, the most common technologies employed by LIS practitioners were email, office productivity tools, web browsers, library catalog and database searching tools, and printers. Generally similar technology usage patterns were observed for early- and later-career practitioners, and programming topped the list of most-desired technology skills to learn.

The cluster analysis presented in figure 3 suggests that a relatively small percentage of practitioners have technology-intensive roles that would require skills such as programming, working with databases, and systems administration. Rather, the cluster analysis showed common technology skillsets focused on the end-user side of technology tools. In fact, most of the top ten skills used—email, office productivity tools (word processing, spreadsheets, and presentation software), web browsers, library catalog and database searching, printers, and teaching others to use technology—are fairly nontechnical in nature. A potential exception is that of teaching technology.
Figure 6 suggests that teaching others to use technology entails several hardware devices (for example, laptops, tablets, smartphones, and scanners) as well as online and digital resources, such as ebooks. However, most of the popular skills used would be considered baseline skills for information workers in any domain. As suggested by Tennant, programming and other advanced technical skills do not necessarily need to be core skills for all information professionals, but knowledge of the potential applications and possibilities of such tools is required.19 This idea was echoed by Partridge et al., whose findings emphasized the need for awareness and resilience in tackling new technological developments.20 These skills alone would obviously be too little for LIS practitioners explicitly seeking a high-tech role, as discussed in Maceli.21 However, further research directed toward exploring the mental models and general technological understanding of information professionals would be helpful in understanding the true level of practitioner engagement with technology, to complement the list of relatively low-tech tools employed.

Programming has been a skill of great interest within the information professions for many years, and the respondents' enthusiasm and desire to learn in this area was readily apparent from the survey results, with nearly 20 percent of participants citing either "programming" or "coding" as a skill they desired to learn. In the context of their current responsibilities, 15 percent of respondents overall mentioned "computer programming or coding" as a regular technological skill they employed (figure 2). There was a slight difference between the groups: 19 percent of librarians with fewer than eleven years of experience coded regularly, compared to 13 percent of those with eleven or more years of experience. Within the years-of-experience divisions, the newer practitioners were more interested in learning programming, with the peak of interest at three to five years in the workplace (figure 11). The relatively low interest or need to learn programming in the newest practitioners potentially indicates a hopeful finding: that their degree program was sufficient preparation for the early years of their career. Prior research would contradict this finding. For example, Choi and Rasmussen's 2006 survey found that, in the workplace, librarians frequently felt unprepared in their knowledge of programming and scripting languages.22 In the intervening years, curriculum has shifted to more heavily emphasize technology skills, including web development and other topics covering programming,23 perhaps better preparing early-career practitioners. Overall, programming remains a popular skill in continuing education opportunities as well as in job listings,24 which aligns well with the respondents' strong interest in this area.

The skills commonly co-occurring with programming in practice included working with Linux, database software, managing servers, and webpage creation (figure 4). Taken as a whole, these skills indicate job responsibilities falling toward the systems side, with webpage creation a skill that bridged intensely technical and more user-focused work (as also evident in figure 4). This indicates that, though programming may be perceived as highly desirable for communicating and extending systems, as a formal job responsibility it may still fall to a relatively small number of information professionals in any significant manner.
Makerspace technologies and their implementation possibilities within libraries have garnered a great deal of excitement and interest in recent years, with much literature highlighting innovative projects in this area (such as American Library Association25 and Bagley26). Fourie and Meyer provided an overview of the existing makerspace literature, finding that most research efforts focus on the needs and construction of the physical space.27 Given the general popularity of the topic (as detailed in Moorefield-Lang),28 it is interesting to note that such technologies were infrequently mentioned by survey participants, both by those desiring to learn these tools and by those currently using them. The most infrequent skills used (figure 2) included makerspace technologies, 3D printers, augmented reality, and virtual reality. Only a small number of respondents currently used this mix of makerspace-oriented and emerging technologies, and only 3 percent of respondents mentioned interest in learning makerspace-related skills. Despite many research efforts exploring the particulars of unique makerspaces in a case-study approach (for example, Moorefield-Lang),29 little data exists on the total number of makerspaces within libraries, and the skillset is largely absent from prior research describing LIS curriculum and job listings. This makes it difficult to determine whether the low number of participants that reported working with makerspace technologies reflects the small number of such spaces in existence or simply that few practitioners are assigned to work in this area, however popular the topic. In either case, these findings provide a useful baseline with which to track the growth of makerspace offerings over time and librarian involvement in such intensely technological work.

Despite the interest and clear willingness to learn and use technology, several workplace challenges became apparent from participant responses. As prior research explored (notably Riley-Huff and Rholes),30 practitioners assumed they would be continually learning and building skills on the job throughout their career to stay current technologically. As described in the earlier results section, many participants mentioned that, although they were highly willing and able to learn, the necessary organizational resources were lacking. As one participant noted, "I'd like to learn anything but the biggest problem seems to be budget (time and monetary)." Several participants expressed feeling overwhelmed with their current workload. New learning opportunities, technological or otherwise, were simply not feasible. Although the survey results indicated that practitioners of all ages were roughly equally interested in learning new technologies, a handful of responses mentioned that ageist issues were creating barriers. Though few, these respondents described being dismissed as technologists because of their age. These themes have long been noted in the large body of continuing-education-related literature going back several decades.
Stone's study ranked lack of time as the top deterrent to professional development for librarians, and it appears little has changed.31 Chan and Auster noted that organizational climate and the perception of one's age may impair the pursuit of professional development, among other impediments.32 However, research has noted a generally strong drive in older librarians to continue their education; Long and Applegate found a preference in later-career librarians for learning outlets provided by formal library schools and related professional organizations, but a lower interest in generally popular topics such as programming.33 These findings were consistent with the participant responses gathered in this survey.

Finally, as detailed in the results section, a significant percentage of respondents (33 percent) did not answer the question regarding what technologies they would like to learn. As is a limitation with survey research, it is difficult to know what the respondents' intentions were in not answering the question: Are they comfortable with their current technology skills? Do they lack the time or interest to pursue further technology education? And of those who did answer, many did not specify their intended use of the technologies they desired to learn, so a deeper exploration of what technologies LIS practitioners desire to learn, and why, would be of value as well. These questions are worth pursuing in more depth through further research efforts.

CONCLUSION

This study provides a broad view into the technologies that LIS practitioners currently use and desire to learn, across a variety of types of libraries, through an analysis of survey responses. Despite a marked enthusiasm toward using and learning technology, respondents described serious organizational limitations impairing their ability to grow in these areas. The LIS practitioners surveyed have interested patrons, see technology as part of their mission, and are not satisfied with the current state of affairs, but they seem to lack money, time, skills, and a willing library administration. Though respondents expressed a great deal of interest in more advanced technology topics, such as programming, the majority typically engaged with technology on an end-user level, with a minority engaged in deeply technical work. This study suggests future work in exploring information professionals' conceptual understanding of and attitudes toward technology, and a deeper look at the reasoning of those who did not express a desire to learn new technologies.

REFERENCES

1. Marshall Breeding, "Library Technology: The Next Generation," Computers in Libraries 33, no. 8 (2013): 16–18, http://librarytechnology.org/repository/item.pl?id=18554.

2. Therese F. Triumph and Penny M. Beile, "The Trending Academic Library Job Market: An Analysis of Library Position Announcements from 2011 with Comparisons to 1996 and 1988," College & Research Libraries 76, no. 6 (2015): 716–39, https://doi.org/10.5860/crl.76.6.716.

3. Janie M. Mathews and Harold Pardue, "The Presence of IT Skill Sets on Librarian Position Announcements," College & Research Libraries 70, no. 3 (2009): 250–57, https://doi.org/10.5860/crl.70.3.250.

4. Debra A. Riley-Huff and Julia M. Rholes, "Librarians and Technology Skill Acquisition: Issues and Perspectives," Information Technology and Libraries 30, no. 3 (2011): 129–40, https://doi.org/10.6017/ital.v30i3.1770.
5. Monica Maceli, "Creating Tomorrow's Technologists: Contrasting Information Technology Curriculum in North American Library and Information Science Graduate Programs against Code4lib Job Listings," Journal of Education for Library and Information Science 56, no. 3 (2015): 198–212, https://doi.org/10.12783/issn.2328-2967/56/3/3.

6. Walt Crawford, "Making it Work Perspective: Techno and Techmusts," Cites and Insights 8, no. 4 (2008): 23–28.

7. Helen Partridge et al., "The Contemporary Librarian: Skills, Knowledge and Attributes Required in a World of Emerging Technologies," Library & Information Science Research 32, no. 4 (2010): 265–71, https://doi.org/10.1016/j.lisr.2010.07.001.

8. Riley-Huff and Rholes, "Librarians and Technology Skill Acquisition."

9. Vandana Singh and Bharat Mehra, "Strengths and Weaknesses of the Information Technology Curriculum in Library and Information Science Graduate Programs," Journal of Librarianship and Information Science 45, no. 3 (2013): 219–31, https://doi.org/10.1177/0961000612448206.

10. Riley-Huff and Rholes, "Librarians and Technology Skill Acquisition."

11. Sharon Hu, "Technology Impacts on Curriculum of Library and Information Science (LIS)—A United States (US) Perspective," LIBRES: Library & Information Science Research Electronic Journal 23, no. 2 (2013): 1–9, http://www.libres-ejournal.info/1033/.

12. Singh and Mehra, "Strengths and Weaknesses of the Information Technology Curriculum."

13. See, for example, Crawford, "Making it Work Perspective"; Partridge et al., "The Contemporary Librarian."

14. Riley-Huff and Rholes, "Librarians and Technology Skill Acquisition."

15. Elizabeth W. Stone, Factors Related to the Professional Development of Librarians (Metuchen, NJ: Scarecrow, 1969).

16. Donna C. Chan and Ethel Auster, "Factors Contributing to the Professional Development of Reference Librarians," Library & Information Science Research 25, no. 3 (2004): 265–86, https://doi.org/10.1016/S0740-8188(03)00030-6.

17. Dorothy E. Jones, "Ten Years Later: Support Staff Perceptions and Opinions on Technology in the Workplace," Library Trends 47, no. 4 (1999): 711–45.

18. John J. Burke, The Neal-Schuman Library Technology Companion: A Basic Guide for Library Staff, 5th edition (New York: Neal-Schuman, 2016).

19. Roy Tennant, "The Digital Librarian Shortage," Library Journal 127, no. 5 (2002): 32.

20. Partridge et al., "The Contemporary Librarian."

21. Monica Maceli, "What Technology Skills Do Developers Need? A Text Analysis of Job Listings in Library and Information Science (LIS) from Jobs.code4lib.org," Information Technology and Libraries 34, no. 3 (2015): 8–21, https://doi.org/10.6017/ital.v34i3.5893.

22. Youngok Choi and Edie Rasmussen, "What Is Needed to Educate Future Digital Librarians: A Study of Current Practice and Staffing Patterns in Academic and Research Libraries," D-Lib Magazine 12, no. 9 (2006), http://www.dlib.org/dlib/september06/choi/09choi.html.

23. See, for example, Maceli, "Creating Tomorrow's Technologists."

24. Elías Tzoc and John Millard, "Technical Skills for New Digital Librarians," Library Hi Tech News 28, no. 8 (2011): 11–15, https://doi.org/10.1108/07419051111187851.

25. American Library Association, "Manufacturing Makerspaces," American Libraries 44, no. 1/2 (2013), https://americanlibrariesmagazine.org/2013/02/06/manufacturing-makerspaces/.

26. Caitlin A. Bagley, Makerspaces: Top Trailblazing Projects, A LITA Guide (Chicago: American Library Association, 2014).
27. Ina Fourie and Anika Meyer, "What to Make of Makerspaces: Tools and DIY Only or Is There an Interconnected Information Resources Space?," Library Hi Tech 33, no. 4 (2015): 519–25, https://doi.org/10.1108/LHT-09-2015-0092.

28. Heather Moorefield-Lang, "Change in the Making: Makerspaces and the Ever-Changing Landscape of Libraries," TechTrends 59, no. 3 (2015): 107–12, https://doi.org/10.1007/s11528-015-0860-z.

29. Heather Moorefield-Lang, "Makers in the Library: Case Studies of 3D Printers and Maker Spaces in Library Settings," Library Hi Tech 32, no. 4 (2014): 583–93, https://doi.org/10.1108/LHT-06-2014-0056.

30. Riley-Huff and Rholes, "Librarians and Technology Skill Acquisition."

31. Stone, Factors Related to the Professional Development of Librarians.

32. Chan and Auster, "Factors Contributing to the Professional Development of Reference Librarians."

33. Chris E. Long and Rachel Applegate, "Bridging the Gap in Digital Library Continuing Education: How Librarians Who Were Not 'Born Digital' Are Keeping Up," Library Leadership & Management 22, no. 4 (2008), https://journals.tdl.org/llm/index.php/llm/article/view/1744.

Appendix. Survey Questions

1. What type of library do you work in?

2. Where is your library located (state/province/country)?

3. What is your job title?

4. What is your highest level of education?

5. Which of the following methods have you used to learn about technologies and how to use them? Please mark all that apply.
• Articles
• As part of a degree I earned
• Books
• Coworkers
• Face-to-face credit courses
• Face-to-face training sessions
• Library patrons
• Online credit courses
• Online training sessions (webinars, etc.)
• Practice and experiment on my own
• Web resources I regularly check (sites, blogs, Twitter, etc.)
• Web searching
• Other:

6. Which of the following skill areas are part of your responsibilities? Please mark all that apply.
• Acquisitions
• Archives/special collections
• Cataloging
• Circulation
• Collection development
• Distance library services
• Electronic resource management
• Instruction
• Interlibrary loan
• Library administration
• Library IT/systems
• Marketing/public relations
• Media/audiovisuals
• Outreach
• Periodicals/serials
• Reference
• User experience
• Other:

7. How long have you worked in libraries?
• 0–2 years
• 3–5 years
• 6–10 years
• 11–15 years
• 16–20 years
• 21 or more years

8. Which of the following technologies or technology skills are you expected to use in your job on a regular basis? Please mark all that apply.
• Assistive/adaptive technology
• Audio recording and editing
• Augmented reality (Google Glass, etc.)
• Blogging
• Cameras (still, video, etc.)
• Chromebooks
• Cloud-based productivity apps (Google Apps, Office 365, etc.)
• Cloud-based storage (Google Drive, Dropbox, iCloud, OneDrive, etc.)
• Computer programming or coding
• Computer security and privacy knowledge
• Database creation/editing software (MS Access, etc.)
• Dedicated e-readers (Kindle, Nook, etc.)
• Digital projectors
• Discovery layer/service/system
• Downloadable e-books
• Educational copyright knowledge
• E-mail
• Facebook
• Fax machine
• Image editing software (Photoshop, etc.)
• Laptops
• Learning management system (LMS) or virtual learning environment (VLE)
• Library catalog (public side)
• Library database searching
• Library management system (staff side)
• Library website creation or management
• Linux
• Mac operating system
• Makerspace technologies (laser cutters, CNC machines, Arduinos, etc.)
• Mobile apps
• Network management
• Online instructional materials/products (LibGuides, tutorials, screencasts, etc.)
• Presentation software (MS PowerPoint, Prezi, Google Slides, etc.)
• Printers (public or staff)
• RFID (radio frequency identification)
• Scanners and similar devices
• Server management
• Smart boards/interactive whiteboards
• Smartphones (iPhone, Android, etc.)
• Software installation
• Spreadsheets (MS Excel, Google Sheets, etc.)
• Statistical analysis software (SAS, SPSS, etc.)
• Tablets (iPad, Surface, Kindle Fire, etc.)
• Teaching others to use technology
• Teaching using technology (instruction sessions, workshops, etc.)
• Technology equipment installation
• Technology purchase decision-making
• Technology troubleshooting
• Texting, chatting, or instant messaging
• 3D printers
• Twitter
• Using a web browser
• Video recording and editing
• Virtual reality (Oculus Rift, etc.)
• Virtual reference (text, chat, IM, etc.)
• Word processing (MS Word, Google Docs, etc.)
• Web-based e-book collections
• Web conferencing/video conferencing (Webex, Google Hangouts, Goto Meeting, etc.)
• Webpage creation
• Web searching
• Windows operating system
• Other:

9. Which of the following are barriers to new technology adoption in your library? Please mark all that apply.
• Administrative restrictions
• Budget
• Lack of fit with library mission
• Lack of patron interest
• Lack of staff time
• Lack of staff with appropriate skill sets
• Satisfaction with amount of available technology
• Other:

10. What technology skill would you like to learn to help you do your job better?

11. What technologies do you help patrons with the most?

12. What technology item do you circulate the most?

13. What technology or technology skill would you most like to see added to your library?

Up Against the Clock: Migrating to LibGuides v2 on a Tight Timeline

Brianna Buljung and Catherine Johnson

ABSTRACT

During Fall semester 2015, Librarians at the United States Naval Academy were faced with the challenge of migrating to LibGuides version 2 and integrating LibAnswers with LibChat into their service offerings. Initially, the entire migration process was anticipated to take almost a full academic year, giving guide owners considerable time to update and prepare their guides. However, with the acquisition of the LibAnswers module, library staff shortened the migration timeline considerably to ensure both products went live on the version 2 platform at the same time.
The expedited implementation timeline forced the ad hoc implementation teams to prioritize completion of the tasks that were necessary for the system to remain functional after the upgrade. This paper provides an overview of the process the staff at the Nimitz Library followed for a successful implementation on a short timeline and highlights transferable lessons learned during the process. Consistent communication of expectations with stakeholders and prioritization of tasks were essential to the successful completion of the project.

Brianna Buljung (bbuljung@mines.edu) is Instruction & Research Librarian, Colorado School of Mines, Golden, CO. Catherine Johnson (cjohnson@usna.edu) is Head of Reference & Instruction at the United States Naval Academy, Annapolis, MD.

INTRODUCTION

Academic libraries all over the United States have migrated from LibGuides version 1 to the new, sleeker, responsive design of version 2. Approaches to the migration can differ vastly depending on library size, staff capabilities, and the time frame available for completing the project. In 2015, the Nimitz Library at the United States Naval Academy began planning both to upgrade LibGuides to version 2 and to acquire LibAnswers with LibChat. The Web Team and Reference Department partnered to migrate the LibGuides platform and integrate LibAnswers into the Library's web presence.

The Library first adopted Springshare's LibGuides in 2009. By 2015, the subscription had grown to 61 published guides with 10,601 views. The LibGuides collection was modified and expanded during two web site upgrades and several staffing changes. Throughout 2014 and 2015, Library staff periodically discussed the possibility of upgrading to the version 2 interface, but timing, staffing vacancies, and the priority of other projects kept the migration from taking place. In late summer 2015, with the acquisition of Springshare's LibAnswers with LibChat pending, staff determined that it was finally time to migrate to the new LibGuides interface.

Initially, the migration team planned to spend nearly a full academic year completing the migration process. This timeline would provide guide owners with ample time for staff training, revising guides, conducting usability testing, and preparing the migrated guides to go live without distracting from their other duties. However, right before starting the project, the Library finalized the acquisition of Springshare's LibAnswers with LibChat, which they decided to launch with the version 2 interface. The team pushed up the LibGuides migration by several months to keep from confusing patrons with multiple interfaces and launch dates. The migration of LibGuides and the implementation of LibAnswers would take place during the Fall semester, and both products would go live in the version 2 interface before the start of the Spring semester.

This paper provides an overview of the process that the staff at Nimitz Library followed for a successful implementation on a short timeline and highlights transferable lessons learned during the process. The authors also include a post-implementation reflection on the process.

LITERATURE REVIEW

Much of the currently available literature on migration of platforms, especially the LibGuides platform, is published informally.
Librarians from universities across the country have created help guides, checklists, and best practices for surviving the migration. Most migration help guides are tailored to a specific institution, but they can still provide helpful suggestions that can be adapted by other libraries.1 Springshare also provides extensive help content and checklists, including a list of the most important steps for administrators to complete.2 However, little of the available literature discusses the minimum amount of work guide authors need to complete. This type of information was crucial to the Nimitz Library team after drastically shortening the migration timeline. A clearly delineated list of required and optional tasks was needed for guide owners, given time constraints and other job duties.

In addition to the informally published help materials, several articles have been published on various aspects of research guide design and evaluation. A few articles examine the migration process. Hernandez and McKeen offer advice for libraries contemplating migration, including setting goals and performing usability testing against the new guides.3 Duncan et al. provide a case study of the implementation process at the University of Saskatchewan.4 Some articles discuss the basics of guide design and usage in the library. These best practices can be adapted to different platforms, web sites, and user populations. They discuss the importance of various web design elements such as word choice and page layout.5 Another aspect the literature explores is student use of the guides.6 Finally, usability of research guides is one of the most important and widely discussed topics in the literature. Creating and maintaining guide content depends on the user's ability to locate and use the guides in their research.7 Most often, research guides are designed with the student in mind: to assist them in beginning a project, to support research when a librarian is unavailable, or to serve as a reference for follow-up after an instruction session.8 As Pittsley and Memmott discuss, navigation elements can impact a student's use of research guides.9

The Process

As preparations for the migration began, it became immediately apparent that the Web Team and Reference Department would have to divide the project into manageable segments to complete the work without overwhelming guide owners. Three ad hoc teams, made up of librarians from several different departments, were created to take the lead on different elements of the project. The migration team was responsible for researching, organizing, and supervising the migration of LibGuides to version 2. The LibAnswers team learned about LibAnswers and how to effectively integrate the product into the Library's web site. The LibChat team tested the functionality of LibChat and determined how it would fit into the Library's reference desk staffing model. Dividing the project into manageable segments allowed each team to focus on the execution of their area of responsibility. The team approach allowed the Library to draw on individual strengths and staff willingness to participate without depending on a single staff member to manage the entire migration and implementation process on such a short timeline.
Migration Team

The migration team was responsible for determining the tasks that were mandatory for guide owners to complete, the amount of training they would need to use the new interface, and how each product should be incorporated into the Library's web site. The LibGuides migration team relied heavily on advice from other libraries and the documentation from Springshare to guide them in determining mandatory tasks. The Engineering and Computer Science librarian reached out to the ASEE Engineering Libraries Division listserv for advice from peer libraries that had already completed migration. The team also made use of the Springshare help guides and best practices guides posted by other universities.

Ultimately, the migration team created checklists and spreadsheets to help guide owners prepare their guides for migration. A pre-migration checklist (Appendix A) was shared with guide owners, containing all of the required and optional tasks that needed to be completed before the migration took place in early November. Tasks such as deleting outdated or unused images and evaluating low-use guides for possible deletion were required for guide owners to complete. Other tasks, such as checking each guide for a friendly URL or checking database descriptions for brevity and jargon-free language, were encouraged but considered optional. The team determined that items directly related to the ability of post-migration guides to function properly made the required list, while more cosmetic or stylistic tasks could be completed on a time-allowed basis. A post-migration checklist (Appendix B) was created for guide owners following the migration. This list included portions of the guides that had to be checked to ensure widgets, links, and other assets had migrated properly. Both checklists were accompanied by tips, screenshots, and deadlines, and indicated which team member to contact with questions. Clear explanation of the expectations for the project and accommodation of the guide owners' busy schedules made the migration more successful.

The migration team gave the new, more robust A-Z list significant attention. LibGuides version 2 allows the A-Z list to be sorted by subject, type, and vendor. It also allows a library to tag "Best Bets" databases in each subject area. The databases categorized as Best Bets display more prominently in the list of databases by subject. Using Google Sheets, the Electronic Resources Librarian quickly and easily solicited feedback from liaison librarians about which databases to tag as Best Bets for each subject area. Google Sheets also made it easy for librarians to edit the list of databases related to their subject expertise. Some databases had been incorrectly categorized, and, in some subjects, newer subscriptions didn't appear on the list. LibGuides version 2 allows users to sort databases by type but doesn't provide a predetermined list of types. In order to create the list of material types into which all databases would be sorted, the migration team examined lists found on other library web sites. Several lists were combined, and duplicates or irrelevant types were removed. An additional military-specific type was added to address the most common research conducted by midshipmen. Then, the liaison librarians were solicited for input on the language used to describe each type and which databases should be tagged with each type.
Name choices are a matter of local preference, such as having a single type category for both dictionaries and encyclopedias or two separate categories. To keep the list of material types to a manageable length, the team decided that each type must contain more than one or two databases. It takes time to get well-defined lists of subjects and types. Staff working with patrons are able to gather informal feedback about the categorizations in their current form and make suggestions, corrections, or additions based on patron feedback.

The migration of LibGuides and acquisition of LibAnswers provided the Reference Department and Web Team with an opportunity to update policies and establish new best practices for guide owners. One important cosmetic update included more encouragement for guide owners to use a photo in their profiles. Profile pictures had been used inconsistently in the first LibGuides interface, and several guide owners used the default grey avatar. Guide owners who were reluctant to have a headshot on their profile were encouraged to take advantage of stock photos made available through the Naval Academy's Public Affairs Office. A photo shoot was also organized for guide owners. On a voluntary basis, guide owners spent about an hour helping each other take pictures in and around the Library. The event helped to build a collection of more professional photos for guide owners to choose from.

Another important update was the re-evaluation of LibGuides policies in light of the new functionality available in version 2. The guide owners gathered for a meeting midway through the pre-migration guide cleanup process to troubleshoot problems and consider best practices for the new interface. Guide owners discussed the standardization of tab names in the guides, the information important to include in author profile boxes, and potential categories for the "types" dropdown in the A-Z database list. The meeting provided a great opportunity to discuss the options available to guide owners and to solicit feedback on interface mock-ups and guide templates created by the Systems Librarian. Many items from the discussion were incorporated into the updated LibGuides policies for guide owners.
The teams also held training sessions focused on providing opportunities for staff to explore and practice using the new products. Although the implementation of LibAnswers with LibChat was not necessary to upgrade to LibGuides version 2, undertaking all of these upgrades at once allowed the ad hoc groups to collaborate with ease, define policies and procedures that would help these products integrate seamlessly with existing services, and prevent change fatigue within the Library. Updating the Library Website The final element of migration and implementation the teams had to consider was integration into the Library’s existing web site. Many elements of the Library’s site are dictated by the broader university web policy and content management system. However, working within guidelines the teams were able to take advantage of the new LibGuides interface, especially the more robust A-Z list of databases, to provide users with multiple ways of accessing the new tools. The Library makes use of a tabbed box to provide entry to Summon, the catalog, the list of databases and LibGuides. The new functionality of LibGuides version 2 enabled the team to provide easier access directly to the alphabetical listing of databases. The LibGuides tab was also updated to provide a drop down list of all the guides and a link to browse by guide owner, subject or type of guide. These enhancements saved time for the user and cut down on the number of clicks needed to access database content licensed by the library. UP AGAINST THE CLOCK: MIGRATING TO LIBGUIDES V2 ON A TIGHT TIMELINE | BULJUNG AND JOHNSON https://doi.org/10.6017/ital.v36i2.9585 73 Integrating the LibAnswers product into the site was achieved by providing several different ways for patrons to access it. An FAQ tab was added to the main tabbed box to provide quick access to LibAnswers, complete with a link to submit questions. The “Contact Us” section on the site home page was updated to include a link to LibAnswers as well as newer, more modern icons for the different contact methods. All guide owners were instructed to update the contact information on their guides to include a LibAnswers widget. A great source of inspiration on integrating the tools into the Library site came from looking at other library web sites. The teams worked from the list of LibGuides community members provided on the Springshare help site and by viewing the sites of known peer libraries. Working through an unfamiliar web site can be a quick way to find design ideas and work flows that are successful and attractive. Team members found wording, icons and placement ideas that could be adapted for use on the Nimitz Library site. Advice for Managing a Short Migration Timeline While on a short implementation timeline or with a small staff that has to accomplish this project in addition to their regular duties, it's important to consider a few strategies that can make the process simpler and less stressful. First, communicate expectations with everyone involved in the project at all steps of the process. Determine which stakeholders need to know about the various checklists and upcoming deadlines. Communicating needs and expectations throughout the entirety of the project reduces confusion and enables teams and individual guide owners to complete the project on time. Although LibGuides had predominantly been the domain of the Nimitz Reference Department, projects of this scale also impacted other parts of the library, from systems to the Electronic Resources librarian. 
Email communication and short notices in the Library's weekly staff update were the primary means of communication with stakeholders. Documents were shared via Google Drive to provide guide owners with a centralized file of help materials. Also, the point of contact for questions about each element of the migration was clearly identified on each checklist and tip sheet. This single addition to the checklists helped guide owners quickly and easily get questions and technical issues addressed.

On a short timeline, it is also important to consider the elements that are crucial for completion and those that can be delayed. Some critical needs in a LibGuides migration include deleting guides that are no longer being used, checking for boxes that will not migrate, and deleting bad links. These tasks must be completed by guide owners or administrators to ensure that the migrated data formats properly. Careful attention to these tasks also saves the staff unnecessary work updating and fixing the new guides before going live. Other elements of guide design and migration are merely nice to have. They complement the user's experience with the final product, but neglecting them will not affect basic functionality. These secondary tasks can be completed as time allows. For guide owners, optional tasks include shortening link descriptions, checking for a guide description and friendly URL, and other general updates to the guides. The migration was broken into manageable tasks by giving guide owners a clear list of required and optional items.

Team leaders will also need to manage expectations. It can be difficult to remember that web pages, especially LibGuides, are living documents. They can be updated fairly easily after the system has gone live. On a short timeline, in the midst of other duties and responsibilities, it is acceptable for a guide to be just good enough. There is rarely enough time for each guide to reach a state of perfection prior to going live. A guide that is spell-checked and contains accurate information can be edited and made more aesthetically pleasing as time allows after the entire site has gone live. While additional edits are taking place, students still have access to the information they need for their academic work. Lists, such as the subjects and material types in the A-Z list, are always a work in progress based on feedback from service points and usability testing. Updates and edits should be made as patrons interact with the products. Regular use can help library staff identify problems with or confusion about the products that might not be anticipated prior to going live. Stress on guide owners can be greatly reduced by communicating expectations throughout the process.

Post-Implementation

Nimitz Library successfully went live with both LibGuides version 2 and LibAnswers with LibChat in early January 2016, right before midshipmen returned to campus for the Spring semester. LibAnswers with LibChat was introduced to the campus community with a soft launch at the beginning of the Spring semester due to staffing levels and shifts at the reference desk. The librarian on duty at the reference desk was also responsible for answering any chats or LibAnswers questions initiated during their shift. The volume of questions remained fairly low during the semester.
On average, the Library received two synchronous and 1.5 asynchronous incoming questions per week via LibAnswers with LibChat. The low volume was beneficial in that it allowed librarians to become familiar with answering questions and editing FAQs. They were able to handle both face-to-face interactions with patrons in the library and the web traffic. However, the volume was so low that it became apparent more marketing of the service was needed. At the start of the Fall 2016 semester, the Library made an effort to increase awareness of the new LibAnswers products by emailing all students, mentioning the service in every instruction session, and creating fliers advertising the service and distributing them around the Library. Though the data is preliminary, statistics have shown that use of these services more than tripled in the first month of the new semester.

As discussed above, the expedited implementation timeline forced the ad hoc teams to prioritize completion of the tasks that were necessary for the system to remain functional after the upgrade. This meant other necessary, but not urgent, updates to guides were left untouched during the migration. Given the amount of effort needed to prepare the guides for migration, it is understandable that guide owners had grown tired of making LibGuides updates and found it necessary to move on to other projects. With this fatigue in mind, the team leaders will continue to remind guide authors that LibGuides are living pages in need of constant attention. The team leaders will also take advantage of user feedback to promote continued updates to LibGuides.

Throughout the migration process, team leaders solicited feedback from staff and users in a variety of ways. First, reference staff were informed of design and implementation changes made throughout the migration. They were given time to view and evaluate the master guide template prior to the migration. The team solicited feedback on the names and organization of categories in the A-Z list. After the products went live, the team gathered informal feedback through reference desk interviews, in information literacy instruction sessions, and in conversations with faculty and students. Student volunteers participated in usability testing during the Spring semester. They were asked to complete a series of tasks related to the different aspects of the new interface. Their feedback, especially from thinking aloud while completing the tasks, revealed to librarians how students actually use the guides. Both formal and informal feedback helped librarians adapt and improve the guides. Based on the feedback, the Systems Librarian made global changes to improve system functionality. In one instance, users were having difficulty submitting a new LibAnswers question when they could not find an appropriate FAQ response. The Systems Librarian made the "Submit Your Question" link more prominent for users in that situation.

The LibGuides continue to be evaluated by staff for currency and ease of use. In discussing the first round of usability test results, it was determined that more testing during the Fall semester of 2016 would be helpful. During the upgrade to version 2 and implementation of LibAnswers with LibChat, librarians focused on the functions in the system that were most essential or most desired.
All of these products contain additional functionality that was not implemented during the upgrade. After a brief rest, the reference department and library web team explored the products' additional functionality and determined what avenues to explore next.

CONCLUSIONS

Migration of any platform can be an extensive and time-consuming task for library staff. Preparations and post-migration cleanup can interrupt staff workflows and strain limited resources. Using migration teams was a successful strategy on a short timeline because it helped spread the workload by delegating specific learning and tasks to specific people. Those people, in turn, became experts in their area of focus and served as a resource for others in the library. This model cultivated a sense of ownership in the migration across many stakeholders that might not have otherwise existed. That sense of ownership in the project, coupled with checklists and spreadsheets full of discrete tasks in need of completion, made it possible for a small staff to complete the migration quickly and successfully. Migrating on a short timeline can be especially stressful, but careful planning and good communication of expectations helps stakeholders focus on the end goal.

Upon completion of the project there was a very real sense of fatigue with this project. As a result, tasks that were listed as optional because they weren't critical for migration went unattended for quite some time after the migration. Slowly, months later, guide owners are ready to revisit guides and continue making improvements. If given more time, this migration might have been completed more methodically and with the intent of having everything perfect before moving on to the next step. Instead, working on a tight timeline forced us to continue moving forward, making necessary changes, and making note of changes to be made in the future. Ultimately, it was a constant reminder that our online presence is and should be a constant work in progress, not the subject of a big, occasional update.

REFERENCES

1. Luke F. Gadreau, "Migration Checklist for Guide Owners," last modified April 3, 2015, https://wiki.harvard.edu/confluence/display/lg2/Migration+Checklist+for+Guide+Owners; Leeanne Morrow et al., "Best Practice Guide for LibGuides," accessed November 17, 2016, http://libguides.ucalgary.ca/c.php?g=255392&p=1703394; Rebecca Payne, "Updating LibGuides & Preparing for LibGuides v2," last modified November 18, 2014, https://wiki.doit.wisc.edu/confluence/pages/viewpage.action?pageId=85630373; Julia Furay, "LibGuides Presentation: Migrating from v1 to v2 (Julia)," last modified September 29, 2015, http://guides.cuny.edu/presentation/migration.

2. Anna Burke, "LibGuides 2: Content Migration is Here!," last modified April 30, 2014, http://blog.springshare.com/2014/04/30/libguides-2-content-migration-is-here/; Springshare, "On Your Checklist: Five Tips & Tricks for Migrating to LibGuides v2," last modified February 18, 2016, http://buzz.springshare.com/springynews/news-27/springytips; Springshare, "Migrating to LibGuides v2 (and Going Live!)," last modified November 7, 2016, http://help.springshare.com/libguides/update/whyupdate.

3.
Lauren McKeen and John Hernandez, "Moving Mountains: Surviving the Migration to LibGuides 2.0," Online Searcher 39 (2015): 16-21, http://www.infotoday.com/OnlineSearcher/Articles/Features/Moving-Mountains-Surviving-the-Migration-to-LibGuides--102367.shtml.

4. Vicky Duncan et al., "Implementing LibGuides 2: An Academic Case Study," Journal of Electronic Resources Librarianship 27 (2015): 248-258, https://dx.doi.org/10.1080/1941126X.2015.1092351.

5. Jimmy Ghaphery and Erin White, "Library Use of Web-Based Research Guides," Information Technology and Libraries 31 (2012): 21-31, http://dx.doi.org/10.6017/ital.v31i1.1830; Danielle A. Becker, "LibGuides Remakes: How to Get the Look You Want without Rebuilding Your Website," Computers in Libraries 34 (2014): 19-22, http://www.infotoday.com/cilmag/jun14/index.shtml; Michal Strutin, "Making Research Guides More Useful and More Well Used," Issues in Science and Technology Librarianship 55 (2008), https://dx.doi.org/10.5062/F4M61H5K.

6. Ning Han and Susan L. Hall, "Think Globally! Enhancing the International Student Experience with LibGuides," Journal of Electronic Resources Librarianship 24 (2012): 288-297, https://dx.doi.org/10.1080/1941126X.2012.732512; Gabriela Castro Gessner et al., "Are You Reaching Your Audience? The Intersection between LibGuide Authors and LibGuide Users," Reference Services Review 43 (2015): 491-508, http://dx.doi.org/10.1108/RSR-02-2015-0010.

7. Luigina Vileno, "Testing the Usability of Two Online Research Guides," Partnership: The Canadian Journal of Library and Information Practice and Research 5 (2012), https://dx.doi.org/10.21083/partnership.v5i2.1235; Rachel Hungerford et al., "LibGuides Usability Testing: Customizing a Product to Work for Your Users," http://hdl.handle.net/1773/17101; Alec Sonsteby and Jennifer DeJonghe, "Usability Testing, User-Centered Design, and LibGuides Subject Guides: A Case Study," Journal of Web Librarianship 7 (2013): 83-94, https://dx.doi.org/10.1080/19322909.2013.747366.

8. Mardi Mahaffy, "Student Use of Library Research Guides Following Library Instruction," Communications in Information Literacy 6 (2012): 202-213, http://www.comminfolit.org/index.php?journal=cil&page=article&op=view&path%5B%5D=v6i2p202.

9. Kate A. Pittsley and Sara Memmot, "Improving Independent Student Navigation of Complex Educational Web Sites: An Analysis of Two Navigation Design Changes in LibGuides," Information Technology and Libraries 31 (2012): 52-64, https://dx.doi.org/10.6017/ital.v31i3.1880.

Appendix A: LibGuides Pre-Migration Checklist

If there are issues, contact the Head of Reference & Instruction.

Required before migration (check off each task when complete):

26 October 2015 | Review attached report of guides that have not been updated in the last year. Delete or consolidate unneeded, practice, or backup guides.*
26 October 2015 | Review attached report of guides with fewer than 500 hits. Delete or consolidate unneeded, practice, or backup guides.*
26 October 2015 | Review all links to all databases included on your guides and make sure the links are mapped to the A-Z list.
26 October 2015 | Review all guides for links not included in the current A-Z List.
List any links that you think should be included in the A-Z List moving forward on the shared spreadsheet (A-Z Additions and Best Bets). Be sure to include all necessary information, including subject and type.
Mid-October 2015 & 28 October 2015 | Review forthcoming reports about broken links. Anticipate one report on October 13, and one October 26.
26 October 2015 | Review the Databases by Subject page of the A-Z list and make sure everything that should be included in your subject is there. Add anything you'd like removed from your subject to the shared spreadsheet (tab 2). Identify 3 "best bets" databases for each of your subject areas on the shared spreadsheet (tab 3).
26 October 2015 | Ensure all images have an alt tag.
26 October 2015 | Delete outdated or unused images in your image collection.
26 October 2015 | Convert all tables to percentages, not pixels.
26 October 2015 | Review attached report of boxes that will not migrate into version 2. (This won't apply to everyone.)
26 October 2015 | Email the chair of the Web Team if you have guides with boxes containing custom formatting or code (this is only necessary if you manually adjusted the HTML or CSS, or use tabs within a box on your guide). We are keeping a master list to double-check after migration.
26 October 2015 | Check all links to the catalog in your guides to make sure they are accurate.
26 October 2015 | Check all widgets (like catalog search boxes) to ensure they function properly, delete any widgets you don't need, and keep a list of widgets to check post-migration to make sure they still function.

Optional before migration:

- Consider turning links in 'rich text' boxes into a 'links and list' box.
- Review all guides to ensure they have a friendly URL, are assigned to a subject, have assigned tags, and a brief guide description.
- Shorten database descriptions to one to two sentences. Consider including dates of coverage and why it's useful for this particular subject.

Helpful hints: *If you'd like to hold on to content from guides you plan to delete, create an unpublished "master guide" where you can store content you plan to use in the future.

Appendix B: LibGuides Post-Migration Checklist and Guide Clean Up

NOTE: Now that migration is complete, if you make an update to your version 1 guides, your change will not transfer to version 2. This means broken links will need to be fixed in both versions. If there are issues or questions contact the Head of Reference & Instruction (general questions), the Systems Librarian (technical issues), or the Electronic Resources Librarian (database assets and A-Z list).

CLEAN UP AND CHECK CONTENT

1) Check boxes to make sure content is correctly displayed on all your guides. Check all boxes closely, as some had the header end up below the first bullet point. To fix an issue like this, click the icon at the bottom of the box you are working on, then click on "Reorder Content". You can move the links down and the text up.

2) Ensure all guides have a friendly URL, are assigned to a subject, and have assigned tags if you didn't do this pre-migration.
See the pre-migration handout for help. In version 2 this information will display at the TOP of our guides in edit mode and at the BOTTOM of our guides on the public interface.

3) Ensure images are resized to fit general web guidelines. See this guide for help: http://guidefaq.com/a.php?qid=12922

4) Check all your widgets to ensure they still function properly.

5) Add a guide type to each of your guides. This is a new feature in LibGuides version 2. It is under the gear on the right side of your guide while in edit mode. This will help us sort and organize them in the list of guides.

ADD NEW LIBGUIDES 2 CONTENT

1) Make a box pointing to related guides. Research has shown that a box on the guide home page pointing to related guides can be very helpful to students. Link to other subject guides that would be of interest and any course guides for that subject. For example: the box on the Mechanical Engineering guide contains links to EM215 and Nuclear Engineering (which is part of the Mechanical Engineering department). To do this, go to the bottom of your welcome box, click the Add/Reorder button, and then on Guide List; your first option is to manually choose guides to add to the list.

2) Add a tab to every guide that is named Citing Your Sources and redirects to the Citing Your Sources LibGuide. To do this:
a. Create a blank page named Citing Your Sources at the bottom of your left-side navigation.
b. On your blank page, click on the icon to open the options for editing the page.
c. Click on Redirect URL and paste the link to the Citing Sources guide in the box.
d. It is also a good idea to mark the "open in a new window" box as well.
e. If you've completed it successfully, your Citing Your Sources tab will display as a redirect in edit mode. Since the Citing Your Sources guide is still a work in progress it is unpublished, and you will get an error when you preview it.
f. Finally, REMOVE the plagiarism and citing sources box from your guides.

3) Now is a good time to take advantage of new functionality and to update the content of your guides. You can now combine multiple types of information into the same box, and you can also take advantage of tabbed boxes. See this LibGuide for further assistance: http://support.springshare.com/libguides/migration/v2cleanup-regular

4) Create your new Profile Box. At the meeting on Oct 20th, the Reference & Instruction department agreed that the following elements should be consistent in the profile box:
Box Name: Librarian
Image: A stock photo or a personal photo (picture day coming soon)
In the Contact box:
Title
Nimitz Library XXXX Dept.
Office # XXX
410-293-XXXX
EMAIL ADDRESS
And your subjects will be displayed below.

Appendix C: Tips & Guidelines for LibAnswers with LibChat

WHAT MODES OF INQUIRY WILL BE AVAILABLE TO USERS?

Using the LibAnswers platform, users will be able to submit questions via chat or by using the question form within LibAnswers. Users will also be able to ask questions as they did before: at the reference desk, via askref@usna.edu, and by calling 410-293-2420.

WHAT ARE "BEST PRACTICES" OR GUIDELINES FOR LIBANSWERS W/ LIBCHAT?
See the tips for responding to tickets at the bottom of this document. See the tips for creating/maintaining FAQs at the bottom of this document. See the tips for responding to chat questions at the bottom of this document.

WHAT PRIORITY SHOULD I GIVE RESPONSES COMING THROUGH VARIOUS MODES OF INQUIRY?

Reference staff will have to use their professional judgement when deciding what priority to give questions coming in through various modes of inquiry. While the addition of chat and tickets may seem overwhelming at first, the same rules you've applied in the past will work. If a chat comes in while you're helping someone face-to-face, use that as an opportunity to advertise the chat service. Explain to the patron that you also help users via chat and you're going to let the chatter know that you'll be with them shortly. The same can apply if you're finishing up a chat when a face-to-face user walks up. Simply explain that the library also offers a chat service and you're just finishing up a question. Remember to get comfortable with and take advantage of the canned messages in chat, let the phone go to voicemail if necessary, and explain to face-to-face users what's happening. During the pilot phase you should also keep track of strategies that worked well for you, or times when the various modes of inquiry became too overwhelming. We'll take all of that into consideration when we reexamine this service.

Chat, phone, and face-to-face interactions are synchronous modes of communication, so users expect responses immediately. Tickets are an asynchronous mode of communication and should be dealt with on a first-come, first-served basis. Respond to tickets when you have time. When responding to tickets, respond to the oldest tickets first, as that user has been waiting the longest for an answer. However, feel free to use your judgement and, if you choose, respond to questions with quick answers right away.

HOW SHOULD I PRIORITIZE QUESTIONS FROM USNA V. NON-USNA USERS?

Priority should be given to midshipmen, faculty, and staff. If an outside user makes use of the chat or ticket service, feel free to explain to them that this service is primarily for faculty/staff/students and that they should direct their question to askref@usna.edu. If you are free and have time, feel free to assist outside patrons via the chat or ticket system.

HOW SHOULD I HANDLE REMAINING QUESTIONS DURING A CHANGE IN SHIFTS?

Handle them in the same manner that you would a face-to-face question with a student, faculty, or staff member. Finish up quickly if you can, advise the patron that you need to leave and offer to handle the question when you return, or transfer the chat to another librarian. If there are remaining tickets in the queue, simply notify the next librarian on duty.

WHAT ARE THE EXPECTED TURNAROUND TIMES FOR RESPONDING TO PATRON INQUIRIES?

Chat, face-to-face, and phone inquiries should be responded to immediately whenever possible. Tickets should be responded to within a business day.

WHO CAN I CONTACT FOR HELP AND TROUBLESHOOTING?

If you have questions, your first stop should be the LibAnswers FAQ, provided by Springshare (available in the "Help" section when logged into LibApps). If you can't find the answer to your question there, feel free to contact the Head of Reference and Instruction, who will work to resolve the problem with you.
GUIDELINES FOR RESPONDING TO LIBANSWERS TICKETS*:

● Keep in mind that when you are responding to tickets, you are a jack of all trades. That means even if the question is outside of your subject area, you should do your best to provide the user with information that will get them started. In that email you may also suggest that the user contact the subject specialist.
● Respond to LibAnswers tickets in the same way you would respond to an email inquiry from a user.
● If you provide a factual response, be sure to include the source from which that information came.

GUIDELINES FOR CREATING/MAINTAINING FAQS*:

● The FAQ database is a public-facing, searchable collection of questions and answers. The intent is to empower our users to find their answers. Any question that might be considered a frequently asked question should be included in the FAQ. This might include questions about the library, the collections, how to find specific types of information, how to start research on specific and recurring assignments, etc.
● When creating an FAQ from a ticket, remember that you can edit the question. Do your best to format the question in a way that would be applicable and relevant to the most users.
● When creating an FAQ from a response you've already written, be sure to edit out any personally identifiable information (PII) about the person who initially asked the question. Be sure to check the question and response for any PII.
● If you want to modify an FAQ: If a member of the staff notices incomplete or incorrect information in an FAQ response, he/she should use professional judgement in deciding how to handle the situation. If it's an error that may have been caused by a typo, he/she may choose to edit the response immediately. However, if the edit impacts the substantive content of the response, he/she may choose to consult with the librarian who initially wrote the response.

GUIDELINES FOR LIBCHAT*:

● If you refer a question, alert the librarian to whom the user is being referred.
● Remember that the person you're chatting with can't see you, so if you leave (to conduct a search, to check a book, to help someone else, etc.) let them know you'll be right back.
● Sometimes chat questions can seem rushed, so it may be tempting to answer only the initial question. Remember, as in face-to-face interactions, clarifying queries save time for the user and the librarian, allowing for the provision of more accurate and efficient answers.
● When providing responses, remember that as an academic library, our mission is to provide the information needed and to instruct our users so they may become self-reliant. Chat challenges us to balance providing answers and instruction. Do your best to find an appropriate balance.
● As the transaction is ending, remain courteous, check that all the user's questions have been addressed, and encourage them to use the service again.

* Note: These guidelines are drafts and will evolve as the staff learns more about this system throughout the pilot phase.
9595 ----

Identifying Emerging Relationships in Healthcare Domain Journals via Citation Network Analysis

Kuo-Chung Chu, Hsin-Ke Lu, and Wen-I Liu

INFORMATION TECHNOLOGY AND LIBRARIES | MARCH 2018

Kuo-Chung Chu (kcchu@ntunhs.edu.tw) is Professor, Department of Information Management, and Dean, College of Health Technology, National Taipei University of Nursing and Health Sciences; Hsin-Ke Lu (sklu@sce.pccu.edu.tw) is Associate Professor, Department of Information Management, and Dean, School of Continuing Education, Chinese Culture University; Wen-I Liu (wenyi@ntunhs.edu.tw, corresponding author) is Professor, Department of Nursing, and Dean, College of Nursing, National Taipei University of Nursing and Health Sciences.

ABSTRACT

Online e-journal databases enable scholars to search the literature in a research domain or to cross-search an interdisciplinary field. The key literature can thereby be efficiently mapped. This study builds a web-based citation analysis system consisting of four modules: (1) literature search; (2) statistics; (3) article analysis; and (4) co-citation analysis. The system focuses on the PubMed Central dataset and facilitates specific keyword searches in each research domain for authors, journals, and core issues. In addition, we use data mining techniques for co-citation analysis. The results could help researchers develop an in-depth understanding of the research domain. An automated system for co-citation analysis promises to facilitate understanding of the changing trends that affect the journal structure of research domains. The proposed system has the potential to become a value-added database of the healthcare domain, which will benefit researchers.

INTRODUCTION

Healthcare is a multidisciplinary research domain of medical services provided both inside and outside a hospital or clinical setting. Article retrieval for systematic reviews in the domain is much more elusive than retrieval for reviews in clinical medicine because of the interdisciplinary nature of the field and the lack of a significant body of evaluative literature. Related research fields include those of the application domain (i.e., the health sciences, including medicine and nursing).1 In addition, valuable knowledge and methods can be taken from the fields of psychology, the social sciences, economics, ethics, and law. Further, the integration of those disciplines is attracting increasing interest.2

Researchers may use bibliometrics to evaluate the influence of a paper or describe the relationship between citing and cited papers. Citation analysis, one of several possible bibliometric approaches, is more popular than the others because of the advent of information technologies.3 Citation analysis counts the frequency of cited papers from a set of citing papers to determine the most influential scholars, publications, or universities in a discipline. It can be classified into two basic types: the first type counts only the citations in a paper that are authored by an individual, while the second
type analyzes co-citations to identify intellectual links among authors in different articles. This paper focuses on the second type of citation analysis.

Small defined co-citation analysis as "the frequency with which two items of earlier literature are cited together by the later literature."4 It is not only the most important type of bibliometric analysis, but also the most sophisticated and popular method. Many other methods originate from citation analysis, including document co-citation analysis, bibliographic coupling,5 author co-citation analysis,6 and co-word analysis.7 There are three levels of co-citation analysis: document, author, and journal. Co-citation can be used to establish a cluster or "core" of earlier literature.8 The pattern of links between documents can establish a structure that highlights the relationships of research areas. Citation patterns change when previously less-cited papers are cited more frequently, or old papers are no longer cited. Changing citation patterns imply the possibility of new developments in research areas; furthermore, we can investigate changing patterns to understand the scientific trend within a research domain.9 Co-citation analysis can help obtain a global overview of research domains.10

The aim of this paper is to detect emerging issues in the healthcare research domain via citation network analysis. Our results can provide a basis of knowledge that researchers can use to construct a search strategy. Structural knowledge is intrinsic to problem solving. Because of the interdisciplinary nature of the healthcare domain and the broadness of the term, research is performed in several research fields, such as nursing, nursing informatics, long-term care, medical informatics, geriatrics, information technology, telecommunications, and so forth. Although electronic journals enable searching by author, article, and journal title using keywords or full text, the results are limited to article content and references and therefore do not provide an in-depth understanding of the knowledge structure in a specific domain. The knowledge structure includes the core journals, core issues, the analysis of research trends, and the changes in focus of researchers. For a novice researcher, however, the literature survey remains a troublesome process in terms of precisely identifying the key articles that provide an overview of a specific domain. The process is complicated and time-consuming, and it limits the number of articles collected for retrospective research. The objective of this paper is to provide information about the challenges and methodology of relevant literature retrieval by systematically reviewing the effectiveness of healthcare strategies. To this end, we build a platform for automatically gathering the full text of e-journals offered by the PubMed Central (PMC) database.11 We then analyze the co-citation results to understand the research themes of the domain.

METHODS

This paper aims to build a value-added literature database system for co-citation analysis of healthcare research. The results of the analysis are presented visually to convey the structure of the domain knowledge and increase the productivity of researchers.

Dataset

For co-citation analysis, a data source of related articles on healthcare is required. For this paper, the articles were retrieved from the PMC database using search terms related to the healthcare domain. To build the article analysis system, we used bibliometrics to locate the relevant references, while the analysis techniques were implemented with the association rule algorithm of data mining.
The PMC database, which is produced by the US National Institutes of Health and is implemented and maintained by the US National Center for Biotechnology Information of the US National Library of Medicine, provides electronic articles from more than one thousand full-text journals for free. Publication status can be determined from the Open Access Subset (OAS), which is accessible via the OAI (Open Archives Initiative) Protocol for Metadata Harvesting and includes the full text in XML and PDF. Regarding access permission, PMC offers a dataset of many open access journal articles. This paper used a dedicated XML-formatted dataset (https://www.ncbi.nlm.nih.gov/pmc/tools/oai/). The XML-formatted dataset followed the specification of DTD (document type definition) files, which are sorted by journal title. Each article has a PMCID (PMC identification), which is useful for data analysis. In addition to the dataset, the PMC also provides several web services to help widely disseminate articles to researchers.

Figure 1. The system architecture of citation analysis with four subsystems.

System Architecture

Our development environment consisted of the following four subsystems: front-end, middle-end, back-end, and pre-processing. The front-end creates a "web view," a visualization of the results for our web-based co-citation analysis system. The system architecture is shown in figure 1.

Front-End Development Subsystem

We used Adobe Dreamweaver CS5 as a visual development tool for the design of web templates. The PHP programming language was chosen to build the co-citation system that would be used to access and analyze the full-text articles. In terms of the data mining technique, we implemented the Apriori algorithm in the PHP language.12 The results were exported as XML to a charting process, where we used amCharts (https://www.amcharts.com/) to create stock charts, column charts, pie charts, scatter charts, line charts, and so forth.

Middle-End Server Subsystem

The system architecture was a Microsoft Windows-based environment with a XAMPP 2.5 web server platform (https://www.apachefriends.org/download.html). XAMPP is a cross-platform web development kit that consists of Apache, MySQL, PHP, and Perl. It works across several operating systems, such as Linux, Windows, macOS, and Oracle Solaris, and provides SSL encryption, the phpMyAdmin database management system, the Webalizer traffic management and control suite, a mail server (Mercury Mail Transport System), and a FileZilla FTP server.

Back-End Database Subsystem

To speed up co-citation analysis, the back-end database system used MySQL 5.0.51b with the phpMyAdmin 2.11.7 interface for easy management of the database. MySQL includes the following features:

• It is coded in C and C++, and users can develop applications against its application programming interface (API) from Visual Basic, C, C++, Eiffel, Java, Perl, PHP, Python, Ruby, and Tcl, with multithreading capability that can be used on multi-CPU systems; it is also easily linked to other databases.
• Query performance is fast because SQL commands are implemented optimally, and many additional commands and functions make the database user-friendly and flexible to operate. An encryption mechanism is also offered to improve data confidentiality.
• MySQL can handle a large-scale dataset. The storage capacity is up to 2TB for Win32 NTFS systems and up to 4TB for Linux ext3 systems.
• It provides the MyODBC software as an ODBC driver for connecting many programming languages, and it supports several languages and character sets to achieve localization and internationalization.

Pre-processing Subsystem

The PMC provides access to the articles via OAS, OAI services, e-utilities, and FTP. We used FTP to download compressed archives packaged with filenames following the pattern "articles?-?.xml.tar.gz" on October 28, 2012 (ftp://ftp.ncbi.nlm.nih.gov/pub/pmc), where "?-?" is "0-9" or "A-Z". The size of the compressed files was approximately 6.17GB. After extraction, the size of the articles was approximately 10GB. The 571,890 articles from 3,046 journals were grouped and sorted by journal title in folders labeled with abbreviated titles. An XML file would, for example, be named "AAPSJ-10-1-2751445.nxml," where "AAPSJ" was the abbreviated title of the journal American Association of Pharmaceutical Scientists Journal, "10" was the volume of the journal, "1" was the issue number, and "2751445" was the PMCID. We used related development technologies, including the PHP language, arrays, and the Apriori algorithm, to analyze the articles and build the co-citation system.13 Finally, several analysis modules were created to build an integrated co-citation system.

RESEARCH PROCEDURE

The following is our seven-step research procedure to fulfill the integrated co-citation system:

1. Parse XML file: select tags for construction of the database; choose fields for co-citation analysis (for example, the tags identifying the cited authors, journal titles, and publication years); a parsing sketch follows this list.
2. Present web-based article: design the webpage and CSS style; present the web-based XML file via an index variable.
3. Build an abstract database: the database consists of several bibliographic fields.
4. Develop searching module: pass the keyword via the POST method to an SQL query and present the search results on the webpage.
5. Develop statistical module: the statistical results include the number of articles and cited articles, the journals and authors cited in all articles, and the number of cited articles.
6. Develop citation module: visually present the statistical results in several formats; rank searched journals; rank searched and cited journals in all the articles.
7. Develop co-citation module: analyze the association between articles with the Apriori algorithm.
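As an illustration of step 1, the following minimal Python sketch extracts the journals and years cited by one PMC article from its .nxml file. The element names (ref, source, year) follow the JATS markup used in the PMC open access dataset; the exact tag set the authors selected did not survive conversion of this article, so this is an assumed, representative choice rather than their actual field list, and the sketch is in Python rather than the PHP the authors used.

    import xml.etree.ElementTree as ET
    from collections import Counter

    def cited_journals(nxml_path):
        """Return a Counter of (journal, year) pairs cited by one article."""
        counts = Counter()
        tree = ET.parse(nxml_path)
        # In JATS, each <ref> holds one bibliographic reference; <source>
        # carries the cited journal (or book) title and <year> the year.
        for ref in tree.iter("ref"):
            source = ref.find(".//source")
            year = ref.find(".//year")
            if source is not None and source.text:
                y = year.text.strip() if year is not None and year.text else ""
                counts[(source.text.strip(), y)] += 1
        return counts

    # Filename taken from the example given above.
    print(cited_journals("AAPSJ-10-1-2751445.nxml"))

Running this over every article in a journal folder yields, per citing article, the set of cited journals that the co-citation module of step 7 then treats as one transaction.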
Association Rule Algorithms

The association rule (AR), usually represented as A → B, means that a transaction containing item A also contains item B. There are many such rules in most datasets, but some are useless. To validate the rules, two indicators, support and confidence, can be applied. Support, which indicates usefulness, is the number of times the rule features in the transactions, whereas confidence indicates certainty, which is the probability that B occurs whenever A occurs. We chose the rules for which the values of both support and confidence were greater than a predefined threshold. For example, a rule stipulating "toast → jam" with support of 1.2 percent and confidence of 65 percent implies that 1.2 percent of the transactions contain both "toast" and "jam" and that 65 percent of the transactions containing "toast" also contain "jam."

The principle for generating the AR is based on two features of the documents: (1) find the high-frequency item sets whose supports are greater than the threshold; (2) for each item set X and each subset Y, check the rule X → Y if its support is greater than the threshold, where the rule X → Y means that a transaction containing X also contains Y. Most studies focus on searching high-frequency item sets.14 The most popular approach for identifying the item sets is the Apriori algorithm, as shown in figure 2.15 The algorithm's rationale is that if the support of item set I is less than or equal to the threshold, I is not a high-frequency item set, and a new item set I' formed by inserting any item A into I would not be a high-frequency item set either. Following this rationale, the Apriori algorithm is an iteration-based approach. First, it generates the candidate item set C1 by calculating the number of occurrences of each attribute and finds the high-frequency item set L1 whose support is greater than the threshold. Second, it generates item set C2 by joining L1 with itself, iteratively finding L2 and generating C3, and so on.

1: L1 = {large 1-item sets};
2: for (k = 2; Lk-1 ≠ ∅; k++) do begin
3:   Ck = Candidate_gen(Lk-1);
4:   for all transactions t ∈ D do begin /* generate candidate k-item sets */
5:     Ct = subset(Ck, t);
6:     for all candidates c ∈ Ct do
7:       c_count = c_count + 1;
8:   end
9:   Lk = {c ∈ Ck | c_count ≥ minsupport}
10: end
11: return L = ∪k Lk;

Figure 2. The Apriori algorithm.

The Apriori algorithm is one of the most commonly used methods for AR induction. The Candidate_gen algorithm, as shown in figure 3, includes join and prune operations for generating candidate sets.16 Steps 1 to 4 generate all possible candidate item sets c from Lk-1. Steps 5 to 8 delete any candidate that cannot be a frequent item set according to the Apriori property. Step 9 returns candidate set Ck to the main algorithm.

1: for each item set X1 ∈ Lk-1
2:   for each item set X2 ∈ Lk-1
3:     c = join(X1[1], X1[2], ..., X1[k-2], X1[k-1], X2[k-1])
4:       where X1[1] = X2[1], ..., X1[k-2] = X2[k-2], X1[k-1] < X2[k-1];
5: for item sets c ∈ Ck do
6:   for all (k-1)-subsets s of c do
7:     if (s ∈ Lk-1) then add c to Ck;
8:     else delete c from Ck;
9: return Ck;

Figure 3. The Candidate_gen algorithm.
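As a compact illustration of how the Apriori passes above apply to co-citation analysis, the sketch below (again in Python for brevity, whereas the authors implemented the algorithm in PHP) treats each citing article as a transaction whose items are the journals it cites, then reports journal pairs whose support clears the threshold. The sample transactions and the threshold are invented for demonstration.

    from itertools import combinations
    from collections import Counter

    # Each transaction is the set of journals cited by one article (toy data).
    transactions = [
        {"BMJ", "Lancet", "JAMA"},
        {"BMJ", "Lancet"},
        {"BMJ", "JAMA", "Med Care"},
        {"N Engl J Med", "Lancet"},
    ]
    minsupport = 2  # minimum number of articles co-citing a pair

    # Pass 1: frequent single journals (L1).
    singles = Counter(j for t in transactions for j in t)
    L1 = {j for j, c in singles.items() if c >= minsupport}

    # Pass 2: candidate pairs built from L1, kept if support >= minsupport (L2).
    pairs = Counter()
    for t in transactions:
        for pair in combinations(sorted(t & L1), 2):
            pairs[pair] += 1
    L2 = {p: c for p, c in pairs.items() if c >= minsupport}

    for (a, b), c in sorted(L2.items(), key=lambda kv: -kv[1]):
        print(f"{a} & {b}: co-cited by {c} articles")

The support values reported in tables 1 and 2 below correspond to exactly this pair (and triple) counting, applied to the full set of citing articles.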
RESULTS

We searched the PMC database with the keywords "healthcare," "telecare," "ecare," "ehealthcare," and "telemedicine" and located 681 articles with a combined 14,368 references. Values were missing from the year field for 4 of the references; this was also the case for 635 of a total of 52,902 authors. In the pie chart of the journal citation analysis for the healthcare keyword search, shown in figure 4, the top-ranked journal in terms of citations was the British Medical Journal (BMJ). It was cited approximately 439 times, 18.89 percent of the total, followed by the Journal of the American Medical Association (JAMA), which was cited approximately 344 times, 14.80 percent of the total. The trend of healthcare citations from 1852 to 2009 peaked in 2006 at approximately 1,419 citations, with more than half of the total occurring in this year.

Figure 4. Top-cited journals in the healthcare domain by percentage of total citations (N = 2,324).

For the same keyword search, figure 5 shows a pie chart of the author citations. The most-cited author was J. W. Varni, professor of pediatric cardiology at the University of Michigan Mott Children's Hospital in Ann Arbor. This author was cited approximately 149 times, equivalent to 23.24 percent of the total, followed by D. N. Herndon, professor at the Department of Plastic and Hand Surgery, Friedrich-Alexander University of Erlangen in Germany, who was cited approximately 73 times, 11.39 percent of the total. By identifying the affiliations of the top-ranked authors, researchers can access related information in their field of interest.

The co-citation analysis was conducted using the Apriori algorithm. The relationship of co-citation journals with a supporting degree greater than 38 from 1852 to 2009 is shown in figure 6. Each journal is denoted by a node, where a node with a double circle means the journal is co-cited with the other in a citing article. BMJ, which covers the fields of evidence-based nursing care, obstetrics, healthcare, nursing knowledge and practices, and others, is the core journal of the healthcare domain.

Figure 5. Top-cited authors in journals of the healthcare domain by percentage of total citations (N = 641).

Figure 6. The relationship of co-citation journals with BMJ.

To identify the focus of the journals, we analyzed the co-citations in three periods. In 1852–1907, journals are not in co-citation relationships; in 1908–61, five candidates had a supporting degree greater than 1 (see table 1); and in 1962–2009, twenty-eight candidates had a supporting degree greater than 14 (see table 2); for example, BMJ and Lancet had sixty-eight co-citations.

Table 1. Candidates in co-citation analysis with a supporting degree greater than 1 (1908–61).

No | Journals | No. of journals co-cited | Support
1 | Publ Math Inst Hung Acad Sci, Publ Math | 2 | 3
2 | JAOA, J Osteopath | 2 | 1
3 | Antioch Rev, J Abnorm Soc Psychol | 2 | 1
4 | N Engl J Med, Am Surg | 2 | 1
5 | Arch Neurol Psychiatry, J Neurol Psychopathol, Z Ges Neurol Psychiat | 3 | 1
Table 2. Candidates in co-citation analysis with a supporting degree greater than 14 (1962–2009).

No | Journals | No. of journals co-cited | Support
1 | BMJ, Lancet | 2 | 68
2 | BMJ, JAMA | 2 | 65
3 | JAMA, Med Care | 2 | 64
4 | BMJ, Arch Intern Med | 2 | 61
5 | Lancet, JAMA | 2 | 52
6 | Soc Sci Med, BMJ | 2 | 52
7 | JAMA, Arch Intern Med | 2 | 51
8 | Lancet, Med Care | 2 | 50
9 | Crit Care Med, Prehospital Disaster Med | 2 | 49
10 | N Engl J Med, BMJ | 2 | 49
11 | N Engl J Med, Lancet | 2 | 49
12 | N Engl J Med, JAMA | 2 | 47
13 | N Engl J Med, Med Care | 2 | 47
14 | Qual Saf Health Care, BMJ | 2 | 47
15 | BMJ, Crit Care Med | 2 | 42
16 | Med Care, BMJ | 2 | 38
17 | N Engl J Med, J Bone Miner Res | 2 | 33
18 | N Engl J Med, J Pediatr Surg | 2 | 26
19 | Lancet, J Pediatr Surg | 2 | 25
20 | JAMA, Nature | 2 | 25
21 | Lancet, JAMA, BMJ | 3 | 24
22 | N Engl J Med, Lancet, BMJ | 3 | 21
23 | Intensive Care Med, BMJ | 2 | 21
24 | BMJ, N Engl J Med, JAMA | 3 | 20
25 | N Engl J Med, JAMA, Lancet | 3 | 20
26 | JAMA, Med Care, Lancet | 3 | 14
27 | JAMA, Med Care, N Engl J Med | 3 | 14
28 | BMJ, JAMA, Lancet, N Engl J Med | 4 | 14

The links of co-citation journals in the three periods from 1852 to 2009 can be summarized as follows: (1) three journals were highly cited but were not in a co-citation relationship in 1852–1907 (see figure 7); (2) five clusters of healthcare journals in co-citation relationships were found for the years 1908–61 (see figure 8); and (3) 1962–2009 had a distinct cluster of four journals within the healthcare domain (see figure 9).

Figure 7. The relationship of co-citation journals for the healthcare domain in 1852–1907.

Figure 8. The relationship of co-citation journals for the healthcare domain in 1908–61. Journals with double circles are co-cited with one other journal in a citing article. Journals with triple circles are co-cited with two other journals in a citing article.

Figure 9. The relationship of co-citation journals for the healthcare domain in 1962–2009. The thick lines and circle indicate journals that are co-cited in a citing article.

CONCLUSIONS

This paper presented an automated literature system for co-citation analysis to facilitate understanding of the sequence structure of journal articles cited in the healthcare domain. The system visually presents the results of its analysis to help researchers quickly identify the key articles that provide an overview of the healthcare domain. This paper used keywords related to healthcare for its analysis and found that BMJ is a core journal in the domain. The co-citation analysis found a single cluster within the healthcare domain comprising four journals: BMJ, JAMA, Lancet, and the New England Journal of Medicine.

This paper focused on a co-citation analysis of journals. Authors, articles, and issues featured in the co-citation analysis can be further studied in an automated way. A period analysis of publication years is also important. Further analyses can facilitate understanding of the changes in a research domain and the trend of research issues. In addition, the automatic generation of a map would be a worthwhile topic for future study.

ACKNOWLEDGEMENTS

This article was funded by the Ministry of Science and Technology of Taiwan (MOST), formerly known as the National Science Council (NSC), under grant no. NSC 100-2410-H-227-003. For the remaining authors, none were declared. All the authors have made significant contributions to the article and agree with its content. There is no known conflict of interest in this study.
REFERENCES

1. A. Kitson et al., "What Are the Core Elements of Patient-Centered Care? A Narrative Review and Synthesis of the Literature from Health Policy, Medicine and Nursing," Journal of Advanced Nursing 69 (2013): 4–8, https://doi.org/10.1111/j.1365-2648.2012.06064.x.

2. S. J. Brownsell et al., "Future Systems for Remote Health Care," Journal of Telemedicine and Telecare 5 (1999): 145–48, https://doi.org/10.1258/1357633991933503; B. G. Celler, N. H. Lovell, and D. K. Chan, "The Potential Impact of Home Telecare on Clinical Practice," Medical Journal of Australia 171 (1999): 518–20; R. Walker et al., "What It Will Take to Create New Internet Initiatives in Health Care," Journal of Medical Systems 27 (2003): 95–98, https://doi.org/10.1023/A:1021065330652.

3. I. Marshakova-Shaikevich, "The Standard Impact Factor as an Evaluation Tool of Science Fields and Scientific Journals," Scientometrics 35 (1996): 283–85, https://doi.org/10.1007/BF02018487; I. Marshakova-Shaikevich, "Bibliometric Maps of Field of Science," Information Processing & Management 41 (2005): 1536–45, https://doi.org/10.1016/j.ipm.2005.03.027; A. R. Ramos-Rodríguez and J. Ruíz-Navarro, "Changes in the Intellectual Structure of Strategic Management Research: A Bibliometric Study of the Strategic Management Journal, 1980–2000," Strategic Management Journal 25, no. 10 (2004): 982–1000, https://doi.org/10.1002/smj.397.

4. H. Small, "Co-citation in the Scientific Literature: A New Measure of the Relationship between Two Documents," Journal of the American Society for Information Science 24 (1973): 266–68.

5. M. M. Kessler, "Bibliographic Coupling between Scientific Papers," American Documentation 14 (1963): 10–25, https://doi.org/10.1002/asi.5090140103; B. H. Weinberg, "Bibliographic Coupling: A Review," Information Storage and Retrieval 10 (1974): 190–95.

6. H. D. White and B. C. Griffith, "Author Cocitation: A Literature Measure of Intellectual Structure," Journal of the American Society for Information Science 32 (1981): 164–70, https://doi.org/10.1002/asi.4630320302.

7. Y. Ding, G. G. Chowdhury, and S. Foo, "Bibliometric Cartography of Information Retrieval Research by Using Co-word Analysis," Information Processing & Management 37, no. 6 (November 2001): 818–20, https://doi.org/10.1016/S0306-4573(00)00051-0.

8. Small, "Co-citation," 266.

9. D. Sullivan et al., "Understanding Rapid Theoretical Change in Particle Physics: A Month-by-Month Co-citation Analysis," Scientometrics 2 (1980): 312–16, https://doi.org/10.1007/BF02016351.

10. N. Shibata et al., "Detecting Emerging Research Fronts Based on Topological Measures in Citation Networks of Scientific Publications," Technovation 28 (2008): 762–70, https://doi.org/10.1016/j.technovation.2008.03.009.

11. Weinberg, "Bibliographic Coupling."

12. White and Griffith, "Author Cocitation."

13. R. Agrawal and R. Srikant, "Fast Algorithm for Mining Association Rules in Large Databases" (paper, International Conference on Very Large Databases [VLDB], Santiago de Chile, September 12–15, 1994).

14. R. Agrawal, T. Imielinski, and A. Swami, "Mining Association Rules between Sets of Items in Large Databases" (paper, ACM SIGMOD International Conference on Management of Data, Washington, DC, May 25–28, 1993).
15. Agrawal and Srikant, "Fast Algorithm," 3.

16. Ibid., 4.

9598 ----

Reference Rot in the Repository: A Case Study of Electronic Theses and Dissertations (ETDs) in an Academic Library

Mia Massicotte and Kathleen Botter

INFORMATION TECHNOLOGY AND LIBRARIES | MARCH 2017

Mia Massicotte (Mia.Massicotte@concordia.ca) is Systems Librarian, Concordia University Library, Montreal, Quebec, Canada. Kathleen Botter (Kathleen.Botter@concordia.ca) is Systems Librarian, Concordia University Library, Montreal, Quebec, Canada.

ABSTRACT

This study examines ETDs deposited during the period 2011-2015 in an institutional repository, to determine the degree to which the documents suffer from reference rot, that is, linkrot plus content drift. The authors converted and examined 664 doctoral dissertations in total, extracting 11,437 links, finding overall that 77% of links were active and 23% exhibited linkrot. A stratified random sample of 49 ETDs produced 990 active links, which were then checked for content drift based on mementos found in the Wayback Machine. Mementos were found for 77% of links, and approximately half of these, 492 of 990, exhibited content drift. The results serve to emphasize not only the necessity of broader awareness of this problem, but also to stimulate action on the preservation front.

INTRODUCTION

A significant proportion of material in institutional repositories is comprised of electronic theses and dissertations (ETDs), providing academic librarians with a rich testbed for deepening our understanding of new paradigms in scholarly publishing and their implications for long-term digital preservation. While academic libraries have long collected and preserved hard copy theses and dissertations of the parent institution, the shift to mandatory electronic deposit of this material has conferred new obligations and curatorial functions not previously incorporated into library workflows. By highlighting ETDs as a susceptible collection deserving of specific preservation actions, we draw attention to some unique responsibilities for libraries housing university-produced content, particularly as scholarly information continues its shift away from commercial production and distribution channels. As Teper and Kraemer point out in their discussion of ETD program goals, "without preservation, long-term access is impossible; without long-term access, preservation is meaningless."1

What Is Reference Rot, and Why Study It?

In addition to linkrot (where a link sends the user to a webpage which is no longer available), there are webpages that remain available, but whose contents have undergone change over time, known as content drift.
This dual phenomenon of linkrot plus content drift has been characterized as reference rot by the Hiberlink project team,2 and has important implications for digital preservation. Since theses and dissertations are original works born digital by virtue of mandatory deposit programs, a university's ETD program is effectively a digital publishing initiative, accompanied by a new universe of responsibility for its digital preservation. Due to the specialized nature of graduate-level research, ETDs frequently include links to resources on the open web, for example, personal blogs, project websites, and commercial entities. Digital Object Identifiers (DOIs), useful in the context of published literature, do not apply to URLs on the open web, which are DOI-indifferent. Open web links also fall outside the scope of preservation initiatives such as LOCKSS (Lots of Copies Keep Stuff Safe),3 which aim to safeguard the published literature. With increasing frequency, researchers are citing newer forms of scholarship, which do not readily fall under the rubric of published literature. Moreover, since thesis preparation is conducted over a period of time typically measured in years, links cited therein are likely to be more vulnerable to linkrot and content drift by the time of manuscript submission. Yet despite the surfeit of anecdotal daily evidence that URLs vanish and result in dead links, Phillips, Alemneh, and Ayala point out that "by and large academic libraries are not capturing and maintaining collections of web resources that provide context and historical reference points to the modern theses and dissertations held in their collections."4 Since an ETD comprises a unique form of scholarly output produced by universities, and simultaneously satisfies the parent institution's degree-granting apparatus as well as reflecting its academic stature on the international stage, the presence of reference rot in this body of literature is of particular concern and worthy of immediate attention.

Smoking Guns

There has been no shortage of evidence reporting on the linkrot phenomenon over the last two decades. Koehler, whose initial study on linkrot appeared in JASIS in 1999, periodically revisited, analyzed, and reported on the same set of 360 URLs collected in his original study.5,6,7 In 2015, upon the twenty-year benchmark of the original data collection, Oguz and Koehler reported in JASIS that only 2 of the original links remained active.8 A number of foundational studies, including Casserly and Bird,9 Spinellis,10 Sellitto,11 Falagas, Karveli, and Tritsaroli,12 and Wagner et al.,13 have reported on linkrot occurring in professional literature.
Sanderson, Phillips, and Van de Sompel provide a table of 17 well-known linkrot studies, comparing overall benchmarks and supplying a succinct summary of the scope of each study.14 Linkrot also gained further important exposure with the Harvard Law School study by Zittrain, Albert, and Lessig, which found that 70% of references in 3 Harvard law journals, and 49.9% of URLs in Supreme Court opinions examined, no longer pointed to their originally cited sources.15

Members of the Hiberlink project, which set out to examine "a vast corpus of online scholarly publication in order to assess what links still work as intended and what web content has been successfully archived using text mining and information extracting tools," have been pivotal in making the case for reference rot.16 Hiberlink demonstrated that failure to link to cited sources was due not only to linkrot, but also to web page content which changed over time.17 A new dimension of the digital preservation universe was thrown into sharp relief with the follow-up study by Klein et al. (2014), which examined one million web references extracted from 3.5 million Science, Technology, and Medicine (STM) articles published in Elsevier, PubMed Central, and ArXiv between the years 1997 and 2012. The study concluded that one in five articles suffers from reference rot.18 Though the study focused on STM articles, its authors drew attention to theses and dissertations as a susceptible class of material. Analyzing the same set of links extracted from this large STM corpus, Jones et al. (2016) recently reported that 75% of referenced open web pages demonstrated changes in content.19

ETDs — A Susceptible Collection

The digital preservation part of institutionally mandated ETD deposit has yet to have its dots fully connected to the rest of the diagram. After four years of research into academic institutions' ETD programs, Halbert, Skinner, and Schultz reported that close to 75% of respondents surveyed had no preservation plan for their ETD collections.20 Despite the prevalence of linkrot studies, linkrot in ETDs has not been subjected to similar scrutiny, and the implications of the disappearance of content are underappreciated. While mandatory deposit programs have become relatively commonplace, focus has largely remained on policy and implementation aspects, metadata quality, interoperability, and conformance to standards.21,22 There are few studies which focus on institutional repository link content.

The study conducted by Sanderson, Phillips, and Van de Sompel (2011) was a large-scale examination of two repositories.23 400,144 papers deposited in ArXiv and 3,595 papers in the University of North Texas (UNT) digital library repository were studied, and more than 160,000 URLs examined. Links were analyzed for persistence and the availability of mementos, that is, whether prior versions of the page existed in a public web archive such as the Internet Archive's Wayback Machine. For 72% of UNT URLs, either mementos were available, or the resource still existed at its original location, or both. Although 54% (9,880) were available in one or more international web archives, 28% (5,073) of UNT's ETD links were found to no longer exist, nor had they been archived by the international archival community.
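Memento lookups of the kind used in these studies can be reproduced against the Wayback Machine's public availability endpoint (http://archive.org/wayback/available). The short Python sketch below, using the requests library, asks whether an archived snapshot exists for a URL near a given date; it is a generic illustration, not the tooling used by Sanderson, Phillips, and Van de Sompel or in the present study.

    import requests

    def closest_memento(url, timestamp="2011"):
        """Query the Wayback Machine availability API for the snapshot
        closest to `timestamp` (YYYY[MMDDhhmmss]); return its URL or None."""
        resp = requests.get(
            "http://archive.org/wayback/available",
            params={"url": url, "timestamp": timestamp},
            timeout=30,
        )
        snap = resp.json().get("archived_snapshots", {}).get("closest")
        return snap["url"] if snap and snap.get("available") else None

    print(closest_memento("http://www.example.com/", "20110601"))

A link that is dead on the live web but returns a snapshot here is recoverable; a dead link with no snapshot is the unarchived loss these studies quantify.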
Phillips, Alemneh, and Ayala looked at overall general patterns and trends of URL references in repository ETDs, examining 4,335 ETDs between the years 1999-2012 in the UNT repository.24 The team analyzed 26,683 unique URLs in 2,713 ETDs containing one or more links, finding an overall average of 10.58 unique URLs per ETD with one or more links. The UNT team provided a breakdown of domain and subdomain occurrence frequency, and indicated areas of future investigation into content-based URL linking patterns of ETDs.

ETD link decay was studied by Sife and Bernard, who performed a citation analysis on URLs in 83 theses published between 2007 and 2011 at Tanzania's Sokoine National Agricultural Library.25 15,468 citations were examined, 9.6% (1,487) of which were open web citations. URLs were considered active if found at the original location, or available after a URL redirect. The authors manually tested URLs over a period of seven days to record their accessibility, noting down the error messages and domains of inaccessible URLs, and analyzing the types of errors encountered. The authors calculated that it took only 2.5 years for half of the web citations to disappear.

At the ETD2014 conference,26 an important study of 7,500 ETDs in 5 U.S. universities was presented. Of 6,400 ETDs defended between 2003 and 2010, approximately 18% of open web link content was confirmed as lost, and a further 34% was at risk of loss, that is, live links which lacked an archived copy.27 Though the results of that particular study have not been formally published, it was briefly summarized in a session held at the 38th UKSG Annual Conference in Glasgow, Scotland in March 2015, an account of which was subsequently published by Burnhill, Mewissen, and Wincewicz in Insights.28

Given the scarcity of published literature on link content as found in ETDs, this present study, which examines reference rot in ETDs in an academic institutional repository, is unique, draws attention to an important digital collection which is vulnerable to loss, and highlights the need for action.

BACKGROUND AND CONTEXT

Concordia University is a comprehensive university located in Montreal, with a student population of 43,903 full-time equivalents in 2015, of which 7,835 were graduate students. Twenty-seven PhD programs were offered in 2015,29 along with 43 programs at the Masters level. The Faculties of Arts and Science, Engineering and Computer Science, Fine Arts, and Business have a thesis requirement, and produce upwards of 350 Masters and 150 PhD dissertations annually. The broad disciplines and the departmental clusters used in this study are shown in Table 1.

Prior to the thesis deposit mandate, Concordia University Library housed hard copy versions of theses and dissertations in the collection. In 2009, the Library launched Spectrum, Concordia's Eprints institutional repository, playing a leadership role in Spectrum's implementation and policy development, and providing training and support to the School of Graduate Studies regarding submission and management of theses for deposit. Following a successful pilot project, the Graduate Studies Office ceased accepting paper manuscripts, and mandated electronic deposit of all theses and dissertations into Spectrum as of spring 2011.
Table 1. Summary of departmental clusters used in this study

Arts: Applied Linguistics; Communication; Economics; Educational Technology; History; Hist and Phil of Religion; Humanities; Philosophy; Sociology; Political Science; Psychology; Religion
Business*: Decision Sciences and MIS; Finance; Management; Marketing
Engineering**: Building Engineering; Civil Engineering; Computer Science; Comp Sci & Software Eng; Electrical and Comp Eng; Industrial Engineering; Info Systems Security; Mechanical Engineering
Fine Arts: Art Education; Art History; Film and Moving Image Studies; Industrial Design; Fine Arts; Performing Arts
Science: Biology; Chemistry; Mathematics; Physics; Exercise Science

* John Molson School of Business
** Engineering & Computer Science

METHODOLOGY

We concentrated on PhD dissertations (henceforth ETDs) in Spectrum in order to limit the scope of the project; Master's theses were excluded. A 5-year period was chosen, beginning with the first semester of mandatory deposit, spring 2011, through fall 2015, a total of 720 ETDs. Since Concordia ETDs are released for publication immediately following convocation, the University's official convocation dates were used to identify the set of documents to be downloaded and examined.

We proceeded in phases: first downloading ETDs from Spectrum and converting them to a text format that could be examined for patterns; then extracting links from each and testing them programmatically for linkrot; then drawing a stratified random sample of active URLs and visiting them to determine whether content drift had taken place. Our methodology for link extraction was similar to those described by Klein et al.30 and Zhou, Tobin, and Grover.31

During the dissertation download stage, 36 ETDs with embargoed content were encountered and eliminated. ETDs were then converted from the existing PDF/A format to XML. A further 20 documents failed to convert due to nonstandard or complex formatting which resulted in unreadable, garbled characters. These documents resisted multiple conversion attempts, and since they could not be mined, had to be eliminated. A final total of 664 ETDs were successfully converted using three different tools: 97% (644) were converted using PDFtoHTML,32 the remaining 3% by either givemetext (14)33 or Adobe Acrobat (3).

A spot check of documents provided sufficient evidence that many links occur throughout the body of the text. Since we intended to extract URLs to the open web, we wanted to err on the side of detecting more links, rather than capturing only easily identifiable, well-formed URLs. Links were mined from the body of the text in a manner similar to the study carried out at UNT.34 We wanted a regular expression which would catch as many URLs as possible, expecting to manually clean the link output before further processing. We tested multiple regular expressions35 against a small sample of our converted ETDs and compared the results. We selected one which seemed well suited for our purpose: it was liberal in detecting links throughout the text, was able to extract links which contained obvious omissions and problems — for example, those that lacked http:// prefixes — but also caught non-obvious errors, such as ellipses in long URLs.
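As an illustration of this extraction step, the following minimal Python sketch mines URL candidates with a deliberately liberal pattern. The pattern here is a simplified stand-in for the "@gruber v2" expression the study actually selected (see note 35), and the function name and sample text are ours, not the authors'.

import re

# A deliberately liberal URL pattern, in the spirit of the study's choice:
# it catches bare www. domains that lack an http:// prefix and accepts some
# false positives, on the assumption that output will be cleaned manually.
# This simplified pattern is illustrative, not the exact expression used.
URL_PATTERN = re.compile(r"""(?:https?://|www\.)[^\s<>"']+""", re.IGNORECASE)

def extract_candidate_links(text: str) -> list:
    """Return raw URL candidates found in converted ETD text."""
    candidates = URL_PATTERN.findall(text)
    # Trim trailing punctuation that commonly clings to URLs in prose.
    return [c.rstrip(".,;:)]}\"'") for c in candidates]

sample = "See www.example.org/report (accessed 2015) and https://archive.org/."
print(extract_candidate_links(sample))
# ['www.example.org/report', 'https://archive.org/']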
We considered how de-duplication of extracted links might affect the outcome, and opted to count each link as an individual instance. Manual cleanup included catching URLs that broke across new lines, identifying false hits such as titles containing colons and DOIs, and adding escape encoding for "&" and "%" characters in order to generate a clean URL for use in the next step of the process.

METHODOLOGY — Linkrot Collection

A script programmatically used the cURL command-line tool to visit each link and fetch the HTTP response code in return.36 An output listing was produced for each doctoral dissertation, comprising the original URLs, the final URLs, and the HTTP response codes. Link output for each of the 664 converted ETDs was collected from December 2015 to January 2016, with the fall 2015 semester checked in March 2016.
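A minimal sketch of such a link-checking pass follows, using Python's requests library in place of the authors' unpublished cURL-based script; the file names and output format are illustrative assumptions, not the study's actual tooling.

import csv
import requests

# Minimal sketch of the link-checking step described above. Like the study's
# script, it records the original URL, the final URL after redirects, and the
# HTTP response code. A failed or timed-out request is recorded as code 0,
# mirroring the "empty response" category reported later in Table 5. Note that
# requests caps redirects at 30 by default, whereas the study allowed 50.
def check_links(in_path: str, out_path: str) -> None:
    with open(in_path) as f, open(out_path, "w", newline="") as out:
        writer = csv.writer(out, delimiter="\t")
        writer.writerow(["original_url", "final_url", "http_code"])
        for url in (line.strip() for line in f if line.strip()):
            try:
                resp = requests.get(url, timeout=30, allow_redirects=True)
                writer.writerow([url, resp.url, resp.status_code])
            except requests.RequestException:
                writer.writerow([url, "", 0])

check_links("links_cleaned.txt", "link_report.tsv")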
76% (504 of 664) of ETDs contained one or more links, with the highest number of links (5,946) falling into the Arts group; 24% (160 of 664) of ETDs contained no links. For the 5-year period, the broad discipline breakdown of documents examined, the number of ETDs with links, and the number of links extracted are shown in Table 2. Converted ETDs by publication year, broken out by broad discipline, are shown in Figure 1.

Table 2. 5-year period, 2011-2015, summary of documents examined and links extracted

Discipline | PhD ETDs in Spectrum | ETDs converted* | Contain no links | Contain links | Links extracted
Arts | 210 | 195 | 31 | 164 | 5,946
Business | 45 | 43 | 12 | 31 | 210
Engineering | 351 | 326 | 82 | 244 | 3,259
Fine Arts | 28 | 25 | 2 | 23 | 1,728
Science | 86 | 75 | 33 | 42 | 294
Total | 720 | 664 | 160 | 504 | 11,437

* 56 documents in total were eliminated (36 embargoed, plus 20 which failed to convert).

Figure 1. Converted ETDs by publication year and broad discipline

The 11,437 links extracted were checked for linkrot, each link accessed and its HTTP response code recorded. 77% (8,834 of 11,437) of links returned an active 2xx HTTP response code; 23% (2,603) of links could not be reached, returning a response code outside the 2xx range. This includes 102 links in the 3xx range which failed to reach a destination after 50 redirects and were considered linkrot. Numbers of links, total link response, and link response by year, broken down by broad discipline, are shown in Figure 2, with accompanying data provided in Table 3 and discussed in the findings section.

Figure 2. Link HTTP response codes, by broad discipline and year

Table 3. Breakdown by year and discipline showing active (2xx) and rotten (all other) response codes

Discipline | HTTP response code | 2011 | 2012 | 2013 | 2014 | 2015 | Total | % Active & Rotten**
Arts | 2xx | 691 | 864 | 800 | 1,108 | 1,093 | 4,556 | 77%
Arts | all other* | 320 | 428 | 131 | 293 | 218 | 1,390 | 23%
Business | 2xx | 14 | 52 | 17 | 22 | 50 | 155 | 74%
Business | all other | 9 | 19 | 5 | 9 | 13 | 55 | 26%
Engineering | 2xx | 302 | 702 | 638 | 482 | 404 | 2,528 | 78%
Engineering | all other | 134 | 172 | 180 | 196 | 49 | 731 | 22%
Fine Arts | 2xx | 165 | 143 | 504 | 467 | 94 | 1,373 | 79%
Fine Arts | all other | 74 | 56 | 118 | 98 | 9 | 355 | 21%
Science | 2xx | 77 | 34 | 58 | 39 | 14 | 222 | 76%
Science | all other | 25 | 23 | 10 | 11 | 3 | 72 | 24%
Subtotal | 2xx | 1,249 | 1,795 | 2,017 | 2,118 | 1,655 | 8,834 | 77% active
Subtotal | all other | 562 | 698 | 444 | 607 | 292 | 2,603 | 23% rotten
% Rotten | | 31% | 28% | 18% | 22% | 15% | 23% |
Total | | 1,811 | 2,493 | 2,461 | 2,725 | 1,947 | 11,437 | 100%

* All other = 0, 1xx, 3xx (unresolved after 50 redirects), 4xx, and 5xx response codes combined
** Active and rotten rates based on total links per discipline

METHODOLOGY — Content Drift

For the content drift phase, we wanted to sample documents from each of the five disciplines. ETDs which did not contain any links were excluded from the sample. Using only documents with one or more active links, a stratified random sample of 10% was drawn, for a final sample of 49 ETDs containing a total of 990 links. A snippet of text surrounding each link was then also extracted from each ETD, along with any "date accessed" or "date viewed" information if present. Each link was manually visited, assessed for content drift, and observations recorded. The breakdown of the content drift sample is shown in Table 4.

Table 4. Breakdown of sample pool of ETDs for content drift analysis

Discipline | ETDs with links | ETDs with active links (2xx) | ETDs sampled for content drift* | Links extracted for sample
Arts | 164 | 156 | 16 | 668
Business | 31 | 28 | 3 | 12
Engineering | 244 | 235 | 24 | 154
Fine Arts | 23 | 23 | 2 | 136
Science | 42 | 40 | 4 | 20
Total | 504 | 482 | 49 | 990

* 10% sample drawn from each discipline's pool of ETDs; only ETDs with URLs relevant for content drift assessment.

Visited links were benchmarked against the existence of a memento -- an archived snapshot of that page located in the Wayback Machine.37 Since the University sets a strict thesis submission deadline of 3 months prior to convocation, mementos dated prior to the submission deadline were sought. Based on the occurrences of "date accessed" and discursive information found in the snippets, we supposed that links were most likely to have been checked as the student approached the final stages of manuscript preparation, although this is not verifiable. We set a soft window for locating an archived snapshot, using a date 6 months prior to the convocation date as the benchmark; that is, to each semester's deadline date an additional 3 months was added, arriving at a 6-months-prior-to-publication marker.

Since programmatic analysis of 990 links required time, expertise, and resources not available to us, we approached the problem heuristically. Assuming that online consultations are not linear, active links occurring multiple times in a document were given equal weight. Each link was manually checked in the Wayback Machine using "date viewed" if provided; if no date was provided (the majority of cases), Wayback was checked to see if an archived version existed as close to our 6-month soft marker as possible. If a memento was not found within a month earlier or later than the soft marker, then the nearest neighboring older memento was selected, if one existed. The original URL, the date the URL was visited, and whether a snapshot was located in Wayback were recorded. All links were checked during July-August 2016. If the initial web browser failed to access a link, a second and sometimes third browser was tried, using Safari, Chrome, and Internet Explorer (IE) in that order. Unsuccessful attempts to reach Wayback were rechecked in September. Whether, and to what degree, content drift had occurred was then assessed, and is discussed in the next section.
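The manual Wayback lookups described above could be approximated programmatically with the Internet Archive's public availability API, as in the sketch below. The example URL and target date are invented; note also that this endpoint returns only the closest snapshot, so the study's preference for the nearest older memento would require the Archive's fuller CDX interface instead.

from typing import Optional
import requests

# Minimal sketch of locating a memento near the study's 6-months-before-
# convocation marker, via the Internet Archive's Wayback availability API.
def nearest_memento(url: str, target: str) -> Optional[str]:
    """target is a YYYYMMDD date; returns the closest snapshot URL, if any."""
    resp = requests.get(
        "https://archive.org/wayback/available",
        params={"url": url, "timestamp": target},
        timeout=30,
    )
    snap = resp.json().get("archived_snapshots", {}).get("closest")
    return snap["url"] if snap and snap.get("available") else None

# E.g., for a spring 2013 convocation, the soft marker is six months prior.
print(nearest_memento("http://www.example.org/", "20121115"))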
FINDINGS AND DISCUSSION

Linkrot Findings

Of the 664 ETDs examined for linkrot, 77% of links tested returned an active HTTP response code in the 2xx range -- roughly three-quarters overall. Numbers of links by broad discipline varied greatly, as shown in Figure 2 (healthy links in green, linkrot in red). Linkrot rates ranged from 21% in Fine Arts to 26% in Business, as seen in the last column of Table 3.

It should be noted that 2xx response codes are also returned by pages that disguise themselves as active links. For example, a URL returns an active status code when a domain has been parked (e.g., purchased to reserve the space), or when a customized 404-page-not-found is encountered. Since we had no mechanism in place to treat such false positives, these were flagged during the linkrot phase as candidates for subsequent content drift analysis.

23% (2,603 of 11,437) of all links returned a response code outside the 2xx range and were considered linkrot -- roughly one-quarter. Response codes in the 4xx range alone, including 404-page-not-found errors, comprised 17% (1,916 of 11,437) of all links. Table 5 shows the breakdown of the total number of links that were visited in the spring of 2016 for linkrot determination.

Table 5. Breakdown of HTTP response codes received

HTTP response code category | Meaning of HTTP response code* | Number of links | Percent of total links
0 | Empty response** | 507 | 4%
1xx | Informational | 2 | 0%
2xx | Successful | 8,834 | 77%
3xx | Redirection† | 102 | 1%
4xx | Client error | 1,916 | 17%
5xx | Server error | 76 | 1%
Total | | 11,437 | 100%

* We used the HTTP protocol definitions at http://www.w3.org/Protocols/rfc2616/rfc2616-sec10.html
** Unofficial HTTP response code due to the request timing out
† Failure to resolve after 50 redirects

HTTP responses ranged from a high of 85% active in 2015 to a low of 69% active in 2011, the oldest publication year. To put it differently, the most recent year already exhibited a linkrot rate of 15%. Consistent with other studies, linkrot manifests itself quickly after publication and increases over time, as indicated by the percentages shown in Figure 2.

Content Drift Findings

Of the 990 links visited to check for the presence of content drift, 764 (400 + 364), or 77%, had a Wayback memento, compared with 226 (92 + 134), or 23%, which did not. Slightly more than half of the links with mementos, 52% (400 of 764), demonstrated some level of content drift when the memento was compared to the current active link, while 48% (364 of 764) did not exhibit content drift. The presence of content drift by discipline, with and without mementos, showing numbers of links tested, appears in Table 6.

Table 6. Presence of content drift by discipline, with/without mementos

Discipline | Links tested | Drift, memento found | Drift, memento not found | Drift total | No drift, memento found | No drift, memento not found | No drift total
Arts | 668 | 261 | 60 | 321 | 254 | 93 | 347
Business | 12 | 5 | 0 | 5 | 4 | 3 | 7
Engineering | 154 | 74 | 10 | 84 | 55 | 15 | 70
Fine Arts | 136 | 55 | 22 | 77 | 38 | 21 | 59
Science | 20 | 5 | 0 | 5 | 13 | 2 | 15
Total | 990 | 400 | 92 | 492 | 364 | 134 | 498

For links that had no memento in Wayback, content drift assessment was based on the presence of an observable date in the current active link, including copyright dates, and/or other details which correlated positively with our extracted snippet information.
For example, some links retrieved a .pdf or other static file which correlated with the snippet, there being no reason to conclude its content had changed since publication, despite the lack of a memento. Snippets were also used in cases where a robots.txt file at the target URL had prevented Wayback from creating a memento. Occasional examination of the dissertation text was conducted to validate information extracted in the snippet. The 23% (226) of links which lacked mementos remain at significant risk and will fall prey to further drift as time passes.

As seen in Table 7, of the 492 URLs manifesting content drift, 11% (54 of 492) were completely lost, linking to web domains that had been sold or were currently up for sale, or to webpages replaced or removed. 9% (42 of 492) exhibited major change, such that there was little correlation with snippets, or website overhauls made assessment difficult but not impossible. 36% (179 of 492) exhibited minor drift, primarily pages that differed somewhat from a memento in visual appearance, such as header and footer differences, changes in background theme or style, or changes in navigation or search functionality which did not represent a high degree of impairment. 7% (34 of 492) linked to continually updating websites, such as Wikipedia and news organizations, and 7% (35 of 492) were customized 404-page-not-found pages; each of these was distinctive enough to warrant a separate category. A full 30% (148 of 492) exhibited a multiplicity of changes of uncertain nature which we grouped together, such as pages where graphic or audio components had been removed or could not be retrieved, broken JavaScript that impeded access, browser failure, and mementos not accessible after repeated attempts -- indicative of a range of issues affecting the quality of web archives and hence preservation.38 The types of content drift encountered, broken down by broad discipline, number of links, and percentage, are shown in Table 7.

Table 7. Types of content drift encountered, number of links by broad discipline

Type of content drift | Arts | Business | Engineering | Fine Arts | Science | Total | % of type
Lost | 45 | 0 | 3 | 6 | 0 | 54 | 11%
Major but findable | 22 | 0 | 9 | 9 | 2 | 42 | 9%
Minor – redesigned but recognizable | 128 | 2 | 30 | 17 | 2 | 179 | 36%
Ongoing updating website | 25 | 3 | 5 | 0 | 1 | 34 | 7%
Custom 404 | 23 | 0 | 4 | 8 | 0 | 35 | 7%
Other | 78 | 0 | 33 | 37 | 0 | 148 | 30%
Total | 321 | 5 | 84 | 77 | 5 | 492 | 100%

Though difficulties encountered during content drift assessment made further extrapolation problematic, the presence of reference rot was confirmed. Our 10% stratified random sample examined 990 active links, finding that roughly half (492 of 990) manifested some degree of content drift. For 364 links, or 36% overall, a benchmark memento was found and no content drift was detected. Although many content drift changes can arguably be characterized as minor, it is not possible to ascertain where the content drift scale tips irremediably for any particular reader. What can be said with certainty is that 11% of active links which did not exhibit linkrot, and were quite live and accessible, fell into a small but unsettling group where the context of the cited web source is irrevocably lost. Of the 498 links which did not exhibit any evidence of content drift, 134, approximately one-quarter, have no memento archived and continue to remain at high risk.
A focused and deeper analysis of active links, which might lead to a typology of content drift, would be a possible area of future study, though even the well-resourced study by Jones et al., which utilized a strict "ground truth" for comparing textual mementos over time, points out that classifying links would certainly be challenging.39 A larger sample size might also allow closer analysis of disciplinary differences, which could lead to a better understanding of these variations in content drift.

CONCLUSION

Reference rot, in the form of linkrot and content drift, was observed in ETDs in Spectrum, our institutional repository, and this confirmation should give pause to those charged with stewardship of ETD collections. Theses and dissertations have long been viewed as material which contributes to overall academic scholarly output, and they carry unique status within the academy. In August 2016, OpenDOAR registered 1,600 institutional repositories with ETDs,40 up from 1,100 institutions as reported in 2012 by grey literature specialist Schöpfel.41 Academic libraries have, in large part, facilitated the transition from paper to ETD with widespread adoption of institutional repository deposit programs, and along with that adoption comes a range of long-term preservation issues. Yet as Ohio State's Strategic Digital Initiatives Working Group pointed out, "Even in digital library communities, preservation all too often stands in for or is used interchangeably with byte level backup of content."42 For long-term access, focus can productively be shifted to offset the immediate threat of incompleteness and inadequate capture.43 Not much has changed since Hedstrom wrote back in 1997: "With few exceptions, digital library research has focussed on architectures and systems for information organization and retrieval, presentation and visualization, and administration of intellectual property rights … The critical role of digital libraries and archives in ensuring the future accessibility of information with enduring value has taken a back seat to enhancing access to current and actively used materials."44 Our understanding and discussion of digital preservation must be broadened, and attention turned to this key area of responsibility in the preservation life-cycle.

The authors maintain that ETD content and link preservation is an editorial, not an individual, imperative. Encouraging individual authors to perform their own archiving is doomed to fall short of even reasonable expectations. Measures such as Perma, a distributed, redundant method of capturing and archiving web site content as part of the citation process, must be pro-actively sought and built into library, and hence repository, workflows.45 Browser plugins and automated solutions which use the Memento protocol for capturing and archiving web site content as part of the citation process do exist,46 but naturally have to be implemented before they can take effect. Either way, efforts to operationalize existing mechanisms which are designed to reduce future loss would be extremely productive. Responsibility for ensuring not only current but continuing future access to ETD content rests with those who maintain the curatorial function of the repository.
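As one hedged illustration of building such capture into a repository workflow (this is not the mechanism used by Spectrum, nor by Perma itself), URLs mined from a newly deposited ETD could be submitted to the Internet Archive's public Save Page Now endpoint so that a memento exists from the outset:

import requests

# Minimal sketch, assuming deposit-time link mining as described in the
# methodology; the loop input is illustrative. Each cited open-web URL is
# requested through the Wayback Machine's Save Page Now endpoint.
def archive_cited_url(url: str) -> bool:
    """Request a snapshot of url; returns True if the request was accepted."""
    resp = requests.get(f"https://web.archive.org/save/{url}", timeout=60)
    return resp.ok

for cited in ["http://www.example.org/dataset"]:  # URLs mined at deposit time
    print(cited, "archived" if archive_cited_url(cited) else "failed")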
Academic librarians have assumed a prominent and de facto role as curators, facilitating the university's role as publisher and emphasizing its break from previous ties with commercial entities. We collectively bear greater responsibility for this body of scholarly work, and need to move forward from a position of benign neglect to one of informed curation and pro-active preservation of an important collection of scholarly output which is at risk.

REFERENCES

1. Thomas H. Teper and Beth Kraemer, "Long-Term Retention of Electronic Theses and Dissertations," College & Research Libraries 63, no. 1 (January 1, 2002), 64, https://doi.org/10.5860/crl.63.1.61.

2. The term "reference rot" was introduced by the Hiberlink team. "Hiberlink – About," accessed March 31, 2016, http://hiberlink.org/about.html.

3. LOCKSS: Lots of Copies Keep Stuff Safe, accessed December 6, 2016, http://www.lockss.org/about/what-is-lockss/.

4. Mark Edward Phillips, Daniel Gelaw Alemneh, and Brenda Reyes Ayala, "Analysis of URL References in ETDs: A Case Study at the University of North Texas," Library Management 35, no. 4/5 (June 3, 2014), 294, https://doi.org/10.1108/LM-08-2013-0073.

5. Wallace Koehler, "An Analysis of Web Page and Web Site Constancy and Permanence," Journal of the American Society for Information Science 50, no. 2 (January 1, 1999): 162–80, https://doi.org/10.1002/(SICI)1097-4571(1999)50:2<162::AID-ASI7>3.0.CO;2-B.

6. Wallace Koehler, "Web Page Change and Persistence—a Four-Year Longitudinal Study," Journal of the American Society for Information Science & Technology 53, no. 2 (January 15, 2002): 162–71, http://doi.org/10.1002/asi.10018.

7. Wallace Koehler, "A Longitudinal Study of Web Pages Continued: A Consideration of Document Persistence," Information Research 9, no. 2 (2004), http://www.informationr.net/ir/9-2/paper174.html.

8. Fatih Oguz and Wallace Koehler, "URL Decay at Year 20: A Research Note," Journal of the Association for Information Science and Technology 67, no. 2 (February 1, 2016): 477–79, https://doi.org/10.1002/asi.23561.

9. Mary F. Casserly and James Bird, "Web Citation Availability: Analysis and Implications for Scholarship," College and Research Libraries 64, no. 4 (July 2003): 300–317, http://crl.acrl.org/content/64/4/300.full.pdf.

10. Diomidis Spinellis, "The Decay and Failures of Web References," Communications of the ACM 46, no. 1 (January 2003): 71–77, https://doi.org/10.1145/602421.602422.

11. Carmine Sellitto, "A Study of Missing Web-Cites in Scholarly Articles: Towards an Evaluation Framework," Journal of Information Science 30, no. 6 (December 1, 2004): 484–95, https://doi.org/10.1177/0165551504047822.

12. Matthew E. Falagas, Efthymia A. Karveli, and Vassiliki I. Tritsaroli, "The Risk of Using the Internet as Reference Resource: A Comparative Study," International Journal of Medical Informatics 77, no. 4 (April 2008): 280–86, https://doi.org/10.1016/j.ijmedinf.2007.07.001.

13. Cassie Wagner et al., "Disappearing Act: Decay of Uniform Resource Locators in Health Care Management Journals," Journal of the Medical Library Association 97, no. 2 (April 2009): 122–30, https://doi.org/10.3163/1536-5050.97.2.009.
14. Robert Sanderson, Mark Phillips, and Herbert Van de Sompel, "Analyzing the Persistence of Referenced Web Resources with Memento," arXiv:1105.3459 [cs], May 17, 2011, http://arxiv.org/abs/1105.3459.

15. Jonathan Zittrain, Kendra Albert, and Lawrence Lessig, "Perma: Scoping and Addressing the Problem of Link and Reference Rot in Legal Citations," Legal Information Management 14, no. 2 (June 2014): 88–99, https://doi.org/10.1017/S1472669614000255.

16. "Hiberlink – About," accessed March 31, 2016, http://hiberlink.org/about.html.

17. "Hiberlink – Our Research," accessed March 31, 2016, http://hiberlink.org/research.html.

18. Martin Klein, Herbert Van de Sompel, Robert Sanderson, Harihar Shankar, Lyudmila Balakireva, Ke Zhou, and Richard Tobin, "Scholarly Context Not Found: One in Five Articles Suffers from Reference Rot," PLOS ONE 9, no. 12 (December 26, 2014), https://doi.org/10.1371/journal.pone.0115253.

19. Shawn M. Jones, Herbert Van de Sompel, Harihar Shankar, Martin Klein, Richard Tobin, and Claire Grover, "Scholarly Context Adrift: Three out of Four URI References Lead to Changed Content," PLOS ONE 11, no. 12 (December 2, 2016): e0167475, https://doi.org/10.1371/journal.pone.0167475.

20. Martin Halbert, Katherine Skinner, and Matt Schultz, "Preserving Electronic Theses and Dissertations: Findings of the Lifecycle Management for ETDs Project" (August 6, 2015), 2, http://educopia.org/presentations/preserving-electronic-theses-and-dissertations-findings-lifecycle-management-etds.

21. For a recent overview, see Sarah Potvin and Santi Thompson, "An Analysis of Evolving Metadata Influences, Standards, and Practices in Electronic Theses and Dissertations," Library Resources & Technical Services 60, no. 2 (March 31, 2016): 99–114, https://doi.org/10.5860/lrts.60n2.99.

22. Joy M. Perrin, Heidi M. Winkler, and Le Yang, "Digital Preservation Challenges with an ETD Collection — A Case Study at Texas Tech University," The Journal of Academic Librarianship 41, no. 1 (January 2015): 98–104, https://doi.org/10.1016/j.acalib.2014.11.002.

23. Sanderson, Phillips, and Van de Sompel, "Analyzing the Persistence of Referenced Web Resources with Memento," http://arxiv.org/abs/1105.3459.

24. Phillips, Alemneh, and Ayala, "Analysis of URL References," https://doi.org/10.1108/LM-08-2013-0073.

25. Alfred S. Sife and Ronald Bernard, "Persistence and Decay of Web Citations Used in Theses and Dissertations Available at the Sokoine National Agricultural Library, Tanzania," International Journal of Education and Development Using Information and Communication Technology 9, no. 2 (2013): 85–94, http://eric.ed.gov/?id=EJ1071354.

26. "ETD2014 — University of Leicester," University of Leicester, accessed January 27, 2016, http://www2.le.ac.uk/library/downloads/etd2014.

27. EDINA, University of Edinburgh, "Reference Rot: Threat and Remedy," http://www.slideshare.net/edinadocumentationofficer/reference-rot-and-linked-data-threat-and-remedy.

28. Peter Burnhill, Muriel Mewissen, and Richard Wincewicz, "Reference Rot in Scholarly Statement: Threat and Remedy," Insights: the UKSG Journal 28, no. 2 (July 7, 2015): 55–61, https://doi.org/10.1629/uksg.237.

29. Concordia University Graduate Programs, accessed April 7, 2016, http://www.concordia.ca/academics/graduate.html.

30. Klein et al., "Scholarly Context Not Found," https://doi.org/10.1371/journal.pone.0115253.
31. Ke Zhou, Richard Tobin, and Claire Grover, "Extraction and Analysis of Referenced Web Links in Large-Scale Scholarly Articles," in Proceedings of the 14th ACM/IEEE-CS Joint Conference on Digital Libraries, JCDL '14 (Piscataway, NJ: IEEE Press, 2014), 451–452, http://dl.acm.org/citation.cfm?id=2740769.2740863.

32. Pdftohtml v0.38 win32, meshko (Mikhail Kruk), accessed September 20, 2015, http://pdftohtml.sourceforge.net/ (actual download at http://sourceforge.net/projects/pdftohtml/).

33. Give Me Text!, Open Knowledge International, accessed October 26, 2015–March 7, 2016, http://givemetext.okfnlabs.org/.

34. Phillips, Alemneh, and Ayala, "Analysis of URL References," https://doi.org/10.1108/LM-08-2013-0073.

35. "In Search of the Perfect URL Validation Regex," accessed December 7, 2015, https://mathiasbynens.be/demo/url-regex. We selected "@gruber v2" for our extraction.

36. cURL v7.45.0, "command line tool and library for transferring data with URLs," accessed October 18, 2015, http://curl.haxx.se/.

37. We have used the term "memento" in lowercase to denote a snapshot souvenir page, to distinguish it from an automated service utilizing the Memento protocol.

38. For a good overview of the types of problems, see Michael L. Nelson, Scott G. Ainsworth, Justin F. Brunelle, Mat Kelly, Hany SalahEldeen, and Michele Weigle, "Assessing the Quality of Web Archives," Computer Science Presentations, Book 8 (Old Dominion University, ODU Digital Commons, 2014), http://digitalcommons.odu.edu/computerscience_presentations/8.

39. Jones et al., "Scholarly Context Adrift," https://doi.org/10.1371/journal.pone.0167475.

40. OpenDOAR search of institutional repositories with theses at http://www.opendoar.org/find.php, accessed August 26, 2016.

41. Joachim Schöpfel, "Adding Value to Electronic Theses and Dissertations in Institutional Repositories," D-Lib Magazine 19, no. 3 (2013), https://doi.org/10.1045/march2013-schopfel.

42. Strategic Digital Initiatives Working Group, Implementation of a Modern Digital Library at The Ohio State University (April 2014), https://library.osu.edu/documents/SDIWG/sdiwg_white_paper.pdf.

43. Tim Gollins, "Parsimonious Preservation: Preventing Pointless Processes! (The Small Simple Steps That Take Digital Preservation a Long Way Forward)," in Online Information Proceedings, UK National Archives, 2009, http://www.nationalarchives.gov.uk/documents/information-management/parsimonious-preservation.pdf.

44. Margaret Hedstrom, "Digital Preservation: A Time Bomb for Digital Libraries," Computers and the Humanities 31, no. 3 (1997): 189–202, https://doi.org/10.1023/A:1000676723815.

45. Zittrain, Albert, and Lessig, "Perma," https://doi.org/10.1017/S1472669614000255.

46. Herbert Van de Sompel, Michael L. Nelson, Robert Sanderson, Lyudmila L. Balakireva, Scott Ainsworth, and Harihar Shankar, "Memento: Time Travel for the Web," arXiv:0911.1112 [cs], November 5, 2009, http://arxiv.org/abs/0911.1112.

9601 ---- Microsoft Word - December_ITAL_farnell_final.docx Editorial Board Thoughts: Metadata Training in Canadian Library Technician Programs Sharon Farnel INFORMATION TECHNOLOGIES AND LIBRARIES | DECEMBER 2016 3

The core metadata team at my institution is small but effective. In addition to myself as Coordinator, we include two librarians and two full-time metadata assistants.
Our metadata assistant positions are considered to be similar, in some ways, to other senior assistant positions within the organization, which require, or at least prefer, that individuals have a library technician diploma. However, neither of our metadata assistants has such a diploma. Their credentials, in fact, are quite different. In part, this difference is driven by the nature of the work that our metadata assistants do. They work regularly with different metadata standards such as MODS, DC, and DDI in addition to MARC. They perform operations on large batches of metadata using languages such as XSLT or R. This is quite different in many ways from the work of their colleagues who work with the ILS, many of whom do have a library technician diploma.

As we prepare for an upcoming short-term leave of one of our team members, I have been thinking a great deal about the work our metadata assistants do and whether or not we would find an individual who came through a library technician program who had the skills and knowledge we need a replacement to have. And I have also been reminded of conversations I have had with recently graduated library technicians who felt their exposure to metadata standards, practices, and tools beyond RDA and MARC had been lacking in their programs. This got me thinking about the presence or absence of metadata courses in library technician programs in Canada.

I reached out to two colleagues from MacEwan University—Norene Erickson and Lisa Shamchuk—who are doing in-depth research into library technician education in Canada. They kindly provided me with a list of Canadian institutions that offer a library technician program so I could investigate further.

Now, I must begin with two caveats. One, this is very much a surface-level scan rather than an in-depth examination, although this is simply the first step in what I hope will be a longer-term investigation. Second, although several Francophone institutions in Canada offer library technician programs, I did not review their programs; I was concerned that my lack of fluency in the French language could lead to inadvertent misrepresentations.

Sharon Farnel (sharon.farnel@ualberta.ca), a member of the ITAL Editorial Board, is Metadata Coordinator, University of Alberta Libraries, Edmonton, Alberta.
Canadian institutions offering a library technician program (by province) are:

Alberta
● MacEwan University (http://www.macewan.ca/wcm/SchoolsFaculties/Business/Programs/LibraryandInformationTechnology/)
● Southern Alberta Institute of Technology (http://www.sait.ca/programs-and-courses/full-time-studies/diplomas/library-information-technology)

British Columbia
● Langara College (http://langara.ca/programs-and-courses/programs/library-information-technology/)
● University of the Fraser Valley (http://www.ufv.ca/programs/libit/)

Manitoba
● Red River College (http://me.rrc.mb.ca/catalogue/ProgramInfo.aspx?ProgCode=LIBIF-DP&RegionCode=WPG)

Nova Scotia
● Nova Scotia Community College (http://www.nscc.ca/learning_programs/programs/plandescr.aspx?prg=LBTN&pln=LIBINFTECH)

Ontario
● Algonquin College (http://www.algonquincollege.com/healthandcommunity/program/library-and-information-technician/)
● Conestoga College (https://www.conestogac.on.ca/parttime/library-and-information-technician)
● Confederation College (http://www.confederationcollege.ca/program/library-and-information-technician)
● Durham College (http://www.durhamcollege.ca/programs/library-and-information-technician)
● Seneca College (http://www.senecacollege.ca/fulltime/LIT.html)
● Mohawk College (http://www.mohawkcollege.ca/ce/programs/community-services-and-support/library-and-information-technician-diploma-800)

Quebec
● John Abbott College (http://www.johnabbott.qc.ca/academics/career-programs/information-library-technologies/)

Saskatchewan
● Saskatchewan Polytechnic (http://saskpolytech.ca/programs-and-courses/programs/Library-and-Information-Technology.aspx)

My method was quite simple. Using the program websites listed above, I reviewed the course listings, looking for 'metadata' either in the title or in the description when it was available. Of the fourteen (14) programs examined, nine (9) had no course with metadata in the title or description. Two (2) programs had courses where metadata was listed as part of the content but not the focus: Langara College as part of "Special Topics: Creating and Managing Digital Collections," and Seneca College as part of "Cataloguing III," which has a partial focus on metadata for digital collections. Three (3) of the programs had a course with metadata in the title or description; all are a variation on "Introduction to Metadata and Metadata Applications." (Importantly, the three institutions in question - Conestoga College, Confederation College, and Mohawk College - are all connected and share courses online.)

So, what do these very preliminary and impressionistic findings tell us? It seems that there is little opportunity for students enrolled in library technician programs in Canada to be exposed to the metadata standards, practices, and tools that are increasingly necessary for positions involving work with digital collections, research data management, digital preservation, and the like. Admittedly, no program can include courses on all potentially relevant topics. In addition, formal course work is only one aspect of the training and education that can prepare graduates for their careers; practica, work placements, and other more informal activities during a program are crucial, as are the skills and knowledge that can only be developed once hired and on the job.
Nevertheless, based on the investigation above, one would be justified in asking if we are disadvantaging students by not working to incorporate additional coursework focused on metadata standards, applications, and tools, as well as on basic skills in manipulating metadata in large batches.

"… scripting languages or equivalent combination of education and experience. Master's desirable." I edited our statement to more clearly allow a combination of factors that would show sufficient preparation: "Bachelor's degree and a minimum of 3-5 years of experience, or an equivalent combination of education and experience, are required; a Master's degree is preferred," followed by a separate description of technical skills needed. This increased the number and quality of our applications, so I'll remain on the lookout for opportunities to represent what we want to require more faithfully and with an open mind.

Meanwhile, on the other side of the table, students and recent grads are uncertain how to demonstrate their skills. First, they're wondering how to show clearly enough that they meet requirements like "three years of work experience" or "experience with user testing" so that their application is seriously considered. Second, they ask about possibilities to formalize skills. Recently, I've gotten questions about a certificate program in UX and whether there is any formal certification to be a systems librarian. Surveying the past experience of my own network—with very diverse paths into technology jobs ranging from undergraduate or second master's degrees to learning scripting as a technical services librarian to pre-MLS work experience—doesn't suggest any standard method for substantiating technical knowledge.

Once again, the truth of the situation may be that libraries will welcome a broad range of possible experience, but the postings don't necessarily signal that. Some advice from the tech industry about how to be more inviting to candidates applies to libraries too; for example, avoiding "rockstar"/"ninja" descriptions, emphasizing the problem space over years of experience,1 and designing interview processes that encourage discussion rather than "gotcha" technical tasks. At Penn Libraries, for example, we've been asking developer candidates to spend a few hours at most on a take-home coding assignment, rather than doing whiteboard coding on the spot. This gives us concrete code to discuss in a far more realistic and relaxed context.

While it may be helpful to express requirements better so that applicants can see more clearly whether they should respond to a posting, this is a small part of the question of preparing new MLS grads for library technology jobs. The new grads who are seeking guidance on substantiating their skills are the ones who are confident they possess them. Others have a sense that they should increase their comfort with technology but are not sure how to do it, especially when they've just completed a whole new degree and may not have the time or resources to pursue additional training. Even if we make efforts to narrow the gap between employers and job-seekers, much remains to be discussed regarding the challenge of readying students with different interests and preparation for library employment. Library school provides a relatively brief window to instill in students the fundamentals and values of the profession, and it can't be repurposed as a coding academy.
There persists a need to discuss how to help students interested in technology learn and demonstrate competencies, rather than teaching them rapidly shifting specific technologies.

REFERENCES

1. Erin Kissane, "Job Listings That Don't Alienate," https://storify.com/kissane/job-listings-that-don-t-alienate.

9602 ---- December_ITAL_fifarek_final President's Message: Focus on Information Ethics Aimee Fifarek INFORMATION TECHNOLOGIES AND LIBRARIES | DECEMBER 2016 1

Just a few weeks ago we held yet another successful LITA Forum,1 this time in Fort Worth, TX. Tight travel budgets and time constraints mean that only a few hundred people get to attend Forum each year, but that is one of the things that make it a great conference. Because of its size you have a realistic chance of meeting everyone there, whether it's at Game Night, one of the many networking dinners, or just during hallway chitchat after a session. And the sessions really do give you something to talk about.

This year I couldn't help but notice a theme. Among all the talk about makerspace technologies, analytics, and specific software platforms, the one bubble that kept rising to the surface was information ethics. Why are you doing what you are doing with the information you have, and should you really be doing it? Have you stopped to think what impact collecting, posting, or sharing that information is going to have on the world around you? In a post-election environment replete with talk of fake news and other forms of deliberate misinformation, LITA Forum presenters seem to have tapped into the zeitgeist.

Tara Robertson, in her closing keynote,2 talked about the harm digitizing analog materials can do when what is depicted is sensitive to individuals and communities. Waldo Jaquith of US Open Data talked about how a government decision to limit options on a birth certificate to either "white" or "colored" effectively wiped the native population out of political existence in Virginia. And Sam Kome from Claremont Colleges talked about how well-meaning librarians can facilitate privacy invasion merely by collecting operational statistics.3

There were many other examples brought out by Forum speakers, but these in particular emphasized the serious consequences the use of data – intentional or not – can have on people. I think it is time for librarians4 to get more vocal about information ethics and the role we play in educating the population about humane information use. Our profession has always been forward thinking about information literacy and is traditionally known for helping our communities make judgements about the information they consume. But we have not done enough to declare our expertise in the information economy, to stand up and say "we're librarians – this is what we do." Now, more than ever, people need the skills to think critically about the information they are consuming via all kinds of media, understand the consequences of allowing algorithms to shape their information universe, and make quality judgments about trading their personal information for goods and services.

Aimee Fifarek (aimee.fifarek@phoenix.gov) is LITA President 2016-17 and Deputy Director for Customer Support, IT and Digital Initiatives at Phoenix Public Library, Phoenix, AZ.

To quote from UNESCO:
Changes brought about by the rapid development of information and communication technologies (ICT) not only open tremendous opportunities to humankind but also pose unprecedented ethical challenges. Ensuring that information society is based upon principles of mutual respect and the observance of human rights is one of the major ethical challenges of the 21st century.5

I challenge all librarians to make a commitment to propagating information ethics, both personally and professionally. Make an effort to get out of your social media echo chamber6 and engage with uncomfortable ideas. When you see biased information being shared, consider it a "teachable moment" and highlight the spin or present more neutral information. And if your library is not actively making information literacy and information ethics part of its programming and instruction, then do what you can to change it. Offer to be on a panel, create a curriculum, or host a program that includes key concepts relating to information "ownership, access, privacy, security, and community."7

The focus of the Libraries Transform campaign this year is all about our expertise: "Because the best search engine in the Library is the Librarian."8 It's our time to shine.

REFERENCES

1. http://forum.lita.org/home/
2. http://forum.lita.org/speakers/tara-robertson/
3. http://forum.lita.org/sessions/patron-activity-monitoring-and-privacy-protection/
4. As always, when I use the term "librarian" my intention is to include any person who works in a library and is skilled in information and library science, not to limit the reference to those who hold a library degree.
5. http://en.unesco.org/themes/ethics-information
6. https://www.wnyc.org/story/buzzfeed-echo-chamber-online-news-politics/
7. https://en.wikipedia.org/wiki/Information_ethics
8. http://www.ilovelibraries.org/librariestransform/

9655 ---- The Impact of Information Technology on Library Anxiety: The Role of Computer Attitudes Qun G. Jiao and Anthony J. Onwuegbuzie Information Technology and Libraries 23, no. 4 (December 2004): 138

Over the past two decades, computer-based technologies have become dominant forces to shape and reshape the products and services the academic library has to offer. The application of library technologies has had a profound impact on the way library resources are being used. Although many students continue to experience high levels of library anxiety, it is likely that the new technologies in the library have led to them experiencing other forms of negative affective states that may be, in part, a function of their attitude towards computers. This study investigates whether students' computer attitudes predict levels of library anxiety.

Computers and information technologies have experienced considerable growth over the past two decades. As such, familiarity with computers is rapidly becoming a basic skill and a prerequisite for many tasks. Although not every college student is equally prepared for the rising demand of computer skills in the information age, computer literacy is increasingly becoming a gatekeeper for students' academic success.1
Gaps in computer literacy and skills can leave many students behind not only in their academic achievement but also in their future job-market success. The unprecedented pace of technological change in the development of digital information networks and electronic services in recent years has helped to expand the role of the academic library. Once only a storehouse of printed materials, it is now a technology-laden information network where students can conduct research in a mixed print and digital-resource environment, experience the use of advanced information technologies, and hone their computer skills. Yet many students are struggling to cope with the changes brought on by the rapid advances of information technologies.

Academic libraries of various sizes have spent a large percentage of their material budget on electronic commercial content, and the trend will continue.2 These days, college students are faced with the choices of ever-changing modes of electronic accessing tools, interfaces, and protocols along with the traditional print resources in the library. The fact that the same journal article may be available in multiple vendors' aggregator sites (such as EBSCOhost and Gale Group) makes the navigation through these bibliographic databases more complex and challenging. Relevant sources must be identified and navigation protocols must be learned before appropriate information and contents can be found. Furthermore, having located a citation, students still have to search the library online catalog to find out if the journal or book is available in the library and, if not, know how to make an interlibrary loan request either on paper or electronically.3 Anxiety levels can be high and patience levels can be low at varying times of conducting library research.4

Qun G. Jiao (gerryjiao@baruch.cuny.edu) is Reference Librarian and Associate Professor at Newman Library, Baruch College, City University of New York, and Anthony J. Onwuegbuzie (TonyOnwuegbuzie@aol.com) is Associate Professor at the College of Education, University of South Florida, Tampa.

That students experience various levels of apprehension when using academic libraries is not a new phenomenon. Indeed, the phenomenon is prevalent among college students in the United States and many other countries, and is widely known as library anxiety. Mellon first coined the term in her study in which she noted that 75 percent to 85 percent of undergraduate students described their initial library experiences in terms of anxiety.5 According to Mellon, feelings of anxiety stem from either the relative size of the library; a lack of knowledge about the location of materials, equipment, and resources of the library; how to initiate library research; or how to proceed with a library search.6 Library anxiety is an unpleasant feeling or emotional state, with physiological and behavioral concomitants, that comes to the fore in library settings. Typically, library-anxious students experience negative emotions, including ruminations, tension, fear, and mental disorganization, which prevent them from using the library effectively.7 A student who experiences library anxiety usually undergoes either emotional or physical discomfort when faced with any library or library-related task.8
Library anxiety may arise from a lack of self-confidence in conducting research, lack of prior exposure to academic libraries, the inability to see the relevance of libraries to one's field of interest, and lack of familiarity with library equipment and technologies. Library anxiety is often accorded special attention because of its debilitating effects on students' academic achievement.9

Although many students continue to experience high levels of library anxiety, it is likely that the new technologies and electronic databases in libraries have led to students experiencing other forms of negative affective states. In particular, it is likely that library anxiety experienced by students is, in part, a function of their attitudes toward computers. Consistent with this assertion, Mizrachi and Shoham and Mizrachi reported a statistically significant relationship between library anxiety and computer attitudes.10 They noted in their research that home and work usage of computers, computer games, word processors, computer spreadsheets, and the Internet are all related to the dimensions of library anxiety found among Israeli students to varying degrees. Similarly, Jerabek, Meyer, and Kordinak found levels of computer anxiety to be related to levels of library anxiety for both men and women.11 These studies focused exclusively on undergraduate students. However, no study has examined this relationship among graduate students, a population that uses the academic library more than any other student population.

Over the past fifteen years, a large body of research literature on computer attitudes has been generated. In particular, many researchers have studied the relationship between computer attitudes and computer use.12 The importance of beliefs and attitudes towards computers and technologies is widely acknowledged.13 Students' computer attitudes arguably impact their willingness to engage in computer-related activities in colleges and universities, where effectively using library electronic resources represents an increasingly important part of college education. Negative computer attitudes may inhibit students' interest in learning to use library resources and thereby weaken their academic performance levels, while at the same time elevating levels of library anxiety. McInerney, McInerney, and Sinclair observed that negative perceptions about computers among student teachers may accompany feelings of anxiety, including worries about being embarrassed, looking foolish, and even damaging the computer equipment.14 Further, there is often a negative relationship between prior experience with computers and the computer anxiety experienced by individuals.15

Until recently, library anxiety has only been interpreted in the context of the library setting, that is, as a phenomenon that occurs while students are undertaking library tasks. Jiao, Onwuegbuzie, and Lichtenstein defined library anxiety as "an uncomfortable feeling or emotional disposition, experienced in a library setting, which has cognitive, affective, physiological, and behavioral ramifications."16 At the same time, unprecedented technological advancement has had a profound impact on the products and services offered by academic libraries. Students now are able to conduct sophisticated library searches from the comfort of their homes.
It is clear that the construct of library anxiety needs to be expanded in the new library and information environment, incorporating into its definition other variables that are relevant for the changing library and information context. Because many library users spend a significant portion of their time using computer-based technologies to conduct information searches, it is natural to ask: to what extent does library anxiety stem from students' prior attitudes and experiences with computers and library technologies? However, with the exception of the studies conducted by Mizrachi and by Shoham and Mizrachi on Israeli undergraduate students, this link has not been examined.17 Thus, the present study investigated the relationship between computer attitudes and library anxiety in the rapidly changing library and information environment. As such, the current inquiry replicated the works of Mizrachi, Shoham and Mizrachi, and Jerabek, Meyer, and Kordinak by examining the degree to which computer attitudes predict levels of library anxiety among graduate students in the United States.18 It was expected that findings from this study would help to increase the understanding of the construct of library anxiety. Indeed, research in this area has become critical in higher education, where educators are responsible for graduating students with the skills necessary to thrive and to lead in a rapidly changing technological environment in the twenty-first century.

Method

Participants

Participants were ninety-four African American graduate students enrolled in the College of Education at a historically Black college and university in the eastern U.S. All participants were solicited in either a statistics or a measurement course at the time that the investigation took place. In order to participate in the study, students were required to sign an informed-consent document that was given during the first class session of the semester. The majority of the participants were female. Ages of the participants ranged from twenty-two to sixty-two years (Mean = 30.40, SD = 8.75).

Instruments and Procedure

All participants were administered two scales, namely, the Computer Attitude Scale (CAS) and the Library Anxiety Scale (LAS). The CAS, developed by Loyd and Gressard, contains forty Likert-type items that assess individuals' attitudes toward computers and the use of computers.19 This instrument consists of the following four scales, which can be used separately: (1) anxiety or fear of computers; (2) confidence in the ability to use computers; (3) liking or enjoying working with computers; and (4) computer usefulness. Loyd and Gressard reported coefficient alpha reliability coefficients of .86, .91, .91, and .95 for scores pertaining to computer anxiety, computer confidence, computer liking, and the total scale, respectively. For the present study, the score reliabilities were as follows:

• computer anxiety, .84 (95 percent confidence interval [CI] = .79, .88);
• computer confidence, .81 (95 percent CI = .75, .86);
• computer liking, .89 (95 percent CI = .85, .92); and
• computer usefulness, .76 (95 percent CI = .68, .83).
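For readers unfamiliar with the statistic, coefficient (Cronbach's) alpha values such as those reported above follow a standard formula: alpha = k/(k-1) * (1 - sum of item variances / variance of the total score). The sketch below computes it for a made-up score matrix; the study's raw data are not published, so the numbers here are purely illustrative.

import numpy as np

# Minimal sketch of coefficient alpha for a k-item Likert scale.
def cronbach_alpha(items: np.ndarray) -> float:
    """items: (n_respondents, k_items) matrix of Likert scores."""
    k = items.shape[1]
    item_vars = items.var(axis=0, ddof=1).sum()   # sum of item variances
    total_var = items.sum(axis=1).var(ddof=1)     # variance of total score
    return (k / (k - 1)) * (1 - item_vars / total_var)

rng = np.random.default_rng(0)
fake_scale = rng.integers(1, 6, size=(94, 10))    # 94 respondents, 10 items
print(round(cronbach_alpha(fake_scale.astype(float)), 2))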
The LAS, developed by Bostick, contains forty-three 5-point Likert-format items that assess levels of library anxiety experienced by college students.20 It also contains the following five subscales:

1. barriers with staff;
2. affective barriers;
3. comfort with the library;
4. knowledge of the library; and
5. mechanical barriers.

A high score on any subscale represents high levels of anxiety in that area. Jiao and Onwuegbuzie, in their examination of the score reliability reported on the LAS in the extant literature, found that it has typically been in the adequate to high range for the subscale and total-scale scores.21 Based on their analysis, Onwuegbuzie, Jiao, and Bostick concluded that "not only does the [LAS] produce scores that yield extremely reliable estimates, but also these estimates are remarkably consistent across samples with different cultures, nationalities, ages, years of study, gender composition, educational majors, and so forth."22 For the current investigation, the subscales generated scores for the combined sample that had a classical theory alpha reliability coefficient of .89 (95 percent CI = .85, .92) for barriers with staff, .84 (95 percent CI = .79, .88) for affective barriers, .53 (95 percent CI = .37, .66) for comfort with the library, .62 (95 percent CI = .48, .73) for knowledge of the library, and .70 (95 percent CI = .58, .79) for mechanical barriers.

Analysis

A canonical correlation analysis was conducted to identify a combination of library-anxiety dimensions (barriers with staff, affective barriers, comfort with the library, knowledge of the library, and mechanical barriers) that might be simultaneously related to a combination of computer-attitude dimensions (computer anxiety, computer liking, computer confidence, and computer usefulness). Canonical correlation analysis is used to examine the relationship between two sets of variables whereby each set contains more than one variable.23 In the present investigation, the five dimensions of library anxiety were treated as the dependent multivariate set of variables, and the four dimensions of computer attitudes formed the independent multivariate profile. The number of canonical functions (factors) that can be produced for a given dataset is equal to the number of variables in the smaller of the two variable sets. Because the library-anxiety set contained five dimensions and the computer-attitude set contained four variables, four canonical functions were generated.

For any significant canonical coefficient, the standardized canonical-function coefficients and structure coefficients were then interpreted. Standardized canonical-function coefficients are computed weights that are applied to each variable in a given set in order to obtain the composite variate used in the canonical correlation analysis. As such, standardized canonical-function coefficients are equivalent to factor-pattern coefficients in factor analysis or to beta coefficients in a regression analysis.24 Conversely, structure coefficients represent the correlations between a given variable and the scores on the canonical composite (latent variable) in the set to which the variable belongs.25 Thus, structure coefficients indicate the degree to which each variable is related to the canonical composite for the variable set. Indeed, structure coefficients are essentially bivariate correlation coefficients that range in value between -1.0 and +1.0, inclusive.26 The square of the structure coefficient yields the proportion of variance that the original variable shares linearly with the canonical variate.
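As a concrete illustration of the procedure just described, the sketch below runs a canonical correlation analysis on two simulated variable sets of the same shape as the LAS (five subscales) and CAS (four subscales) data, yielding min(5, 4) = 4 canonical functions. scikit-learn's CCA estimator is used here as a stand-in for whatever package the authors used, and the data are random placeholders, so the numbers will not match the published results.

import numpy as np
from sklearn.cross_decomposition import CCA

rng = np.random.default_rng(1)
n = 94
lib_anxiety = rng.normal(size=(n, 5))     # placeholders for the five LAS subscales
comp_attitudes = rng.normal(size=(n, 4))  # placeholders for the four CAS subscales

# The number of canonical functions equals the size of the smaller set: min(5, 4) = 4.
cca = CCA(n_components=4)
lib_variates, comp_variates = cca.fit_transform(lib_anxiety, comp_attitudes)

for i in range(4):
    rc = np.corrcoef(lib_variates[:, i], comp_variates[:, i])[0, 1]
    print(f"function {i + 1}: Rc = {rc:.2f}, shared variance = {rc ** 2:.1%}")

# Structure coefficients for the first function: the correlation of each original
# variable with its own set's canonical variate; squaring one gives the proportion
# of that variable's variance shared with the composite.
structure = [np.corrcoef(lib_anxiety[:, j], lib_variates[:, 0])[0, 1] for j in range(5)]
print(np.round(structure, 2))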
Results

Table 1 presents the intercorrelations among the five dimensions of library anxiety and the four dimensions of computer attitudes. Of particular interest were the twenty correlations between the library-anxiety subscale scores and the computer-attitude subscale scores. It can be seen that, after applying the Bonferroni adjustment, four of these relationships were statistically significant. Specifically, computer liking was statistically significantly related to affective barriers, knowledge of the library, and comfort with the library. Using Cohen's criteria of .1, .3, and .5 for small, medium, and large relationships, respectively, the first two relationships (involving affective barriers and knowledge of the library) were medium, and the third relationship (between computer liking and comfort with the library) was large.27 In addition to these three relationships, the association between computer usefulness and knowledge of the library also was statistically significant, with a medium effect size.

The correlation matrix in table 1 was used to examine the multivariate relationship between library anxiety and computer attitudes. This relationship was assessed via a canonical correlation analysis. The canonical analysis revealed that the four canonical correlations combined were statistically significant (p < .0001). Also, when the first canonical root was removed, the remaining three canonical roots were not statistically significant. In fact, removal of subsequent canonical roots did not lead to statistical significance. Together, these results suggested that only the first canonical function was statistically significant; the remaining three roots were not. This first canonical root also was practically significant (Rc1 = .63), contributing 40.8 percent (Rc1²) to the shared variance, which represents a large effect size.28

Data pertaining to the first canonical root are presented in table 2, which provides both standardized function coefficients and structure coefficients. Using a cutoff correlation of 0.3, the standardized canonical-function coefficients revealed that affective barriers, comfort with the library, and knowledge of the library made important contributions to the library-anxiety set, with affective barriers and comfort with the library making similarly large contributions.29 With regard to the computer-attitude set, computer anxiety, computer liking, and computer confidence made noteworthy contributions, with the latter two dimensions making the most noteworthy contributions. The structure coefficients revealed that all five dimensions of library anxiety made important contributions to the first canonical variate. The square of the structure coefficient indicated that barriers with staff, affective barriers, comfort with the library, and knowledge of the library made similarly large contributions, explaining 67.2 percent, 72.3 percent, 72.3 percent, and 60.8 percent of the variance, respectively. With regard to the computer-attitude set, computer liking and computer usefulness made important contributions. These variables explained 64.0 percent and 16.8 percent of the variance, respectively.
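A minimal sketch of the screening step described above: the significance test for a single Pearson r at the Bonferroni-adjusted level (.05 divided across the twenty library/computer correlations), plus Cohen's effect-size labels. The function name and the sample call are illustrative, not taken from the study's own code.

import numpy as np
from scipy import stats

def screen_correlation(r: float, n: int, n_tests: int = 20, alpha: float = .05):
    """Test one Pearson r at the Bonferroni-adjusted level alpha / n_tests and
    label its size by Cohen's .1/.3/.5 criteria."""
    t = r * np.sqrt((n - 2) / (1 - r ** 2))     # t statistic for a correlation
    p = 2 * stats.t.sf(abs(t), df=n - 2)        # two-tailed p value
    size = "large" if abs(r) >= .5 else "medium" if abs(r) >= .3 else "small"
    return p < alpha / n_tests, size

# E.g., the comfort-with-the-library x computer-liking correlation of -.55,
# with n = 94, screened against the twenty library/computer correlations:
print(screen_correlation(-.55, n=94))           # (True, 'large')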
Comparing the standardized and structure coefficients indicated that computer anxiety and computer confidence served as suppressor variables, because the standardized coefficients associated with these variables were large whereas the corresponding structure coefficients were relatively small.30 Suppressor variables are variables that assist in the prediction of dependent variables due to their correlation with other independent variables.31 Thus, the inclusion of computer anxiety and computer confidence in the canonical correlation model strengthened the multivariate relationship between library anxiety and computer attitudes.

Discussion

The purpose of this study was to investigate the relationship between computer attitudes and library anxiety among African American graduate students. Specifically, the multivariate link between these two constructs was examined. A canonical correlation analysis revealed a strong multivariate relationship between library anxiety and computer attitudes. The library-anxiety subscale scores and computer-attitude subscale scores shared 40.82 percent of the common variance. Specifically, computer liking and computer usefulness were related simultaneously to the following five dimensions of library anxiety: barriers with staff, affective barriers, comfort with the library, knowledge of the library, and mechanical barriers. Computer anxiety and computer confidence served as suppressor variables. Thus, computer attitudes predict levels of library anxiety.

As such, the present findings are consistent with those of Mizrachi and Shoham and Mizrachi, who found a statistically significant relationship between computer attitudes and the following seven dimensions of the Hebrew Library Anxiety Scale, a modified version of the LAS developed by the authors for their Israeli sample:32

1. Staff
2. Knowledge
3. Language
4. Physical Comfort
5. Library Computer Comfort
6. Library Policies and Hours
7. Resources

According to its authors, the Staff factor refers to students' attitudes toward librarians and library staff and their perceived accessibility. The Knowledge factor pertains to how students rate their own library expertise. The Language factor relates to the extent to which using English-language searches and materials yields discomfort. Physical Comfort evaluates how much the physical facility negatively affects students' satisfaction and comfort with the library. Library Computer Comfort assesses the perceived trustworthiness of library computer facilities and the quality of directions for using them. Library Policies and Hours concerns students' attitudes toward library rules, regulations, and hours of operation. Finally, Resources refers to the perceived availability of the desired material in the library collection. The correlations between the dimensions of library anxiety and computer attitudes ranged from .11 (Physical Comfort) to .47 (Knowledge). The current results also replicate those of Jerabek, Meyer, and Kordinak, who found levels of computer anxiety to be related to levels of library anxiety for both men and women.33

Nevertheless, caution should be exercised in generalizing the current findings to all graduate students. Though the present study examined the association between library anxiety and computer attitudes among African American graduate students, it should not be assumed that this relationship would hold for other racial groups.
Jiao, Onwuegbuzie, and Bostick found that African American students attending a research-intensive institution reported statistically significantly lower levels of library anxiety associated with barriers with staff, affective barriers, and comfort with the library than did Caucasian American graduate students enrolled at a doctoral-granting institution, with effect sizes ranging from moderate to large.34 In a follow-up study, Jiao and Onwuegbuzie compared African American and Caucasian American students with respect to library anxiety, controlling for educational background by selecting both racial groups from the same institution.35 No statistically significant racial differences were found in library anxiety for any of the five dimensions of the LAS. However, across all five library-anxiety measures, the African American sample reported lower scores than did the Caucasian American sample. In fact, using the test of trend by Onwuegbuzie and Levin, they found that the consistency with which the African American graduate students had lower levels of library anxiety than did the Caucasian American students was both statistically and practically significant.36 Thus, Jiao and Onwuegbuzie's results, alongside those of Jiao, Onwuegbuzie, and Bostick, suggest that racial differences in library anxiety prevail.37 Thus, future research should investigate whether the relationship between library anxiety and computer attitudes found in the present study among African American graduate students also exists among Caucasian American graduate students, as well as among other racial groups.

Further, the causal direction of the relationship found in the current study should be investigated. That is, future studies should investigate whether library anxiety places a person more at risk for experiencing poor computer attitudes, or whether the converse is true. More research also is needed to determine how computer attitudes might play a role in the library context.

Table 1. Intercorrelations among the Library-Anxiety Subscales and Computer-Attitude Subscales

Subscale                       2     3     4     5     6     7     8     9
1. Barriers with Staff       .64*  .63*  .49*  .46*  -.02   .05  -.27  -.09
2. Affective Barriers              .56*  .52*  .40*  -.05   .02  -.37* -.23
3. Comfort with the Library              .56*  .44*  -.19  -.20  -.55* -.16
4. Knowledge of the Library                    .39*  -.21  -.11  -.37* -.32*
5. Mechanical Barriers                               -.13  -.01  -.18   .04
6. Computer Anxiety                                         .77*  .48*  .46*
7. Computer Confidence                                            .67*  .36*
8. Computer Liking                                                      .43*
9. Computer Usefulness

*Indicates a statistically significant relationship after the Bonferroni adjustment.

Table 2. Canonical Solution for First Function: Relationship between Library-Anxiety Subscales and Computer-Attitude Subscales

Theme                        Standardized Coefficient   Structure Coefficient   Structure² (%)
Library-Anxiety Subscale
  Barriers with Staff                 .17                       .82*                67.2
  Affective Barriers                  .40*                      .85*                72.3
  Comfort with the Library            .39*                      .85*                72.3
  Knowledge of the Library            .31*                      .78*                60.8
  Mechanical Barriers                -.12                       .39*                15.2
Computer-Attitude Subscale
  Computer Anxiety                   -.31*                     -.22                  4.8
  Computer Confidence                 .98*                     -.13                  1.7
  Computer Liking                   -1.25*                     -.80*                64.0
  Computer Usefulness                -.13                      -.41*                16.8

*Loadings with effect sizes larger than .3.
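The suppressor pattern discussed earlier (a large standardized weight paired with a small structure coefficient) can be flagged mechanically. The toy check below re-enters the Table 2 values for the computer-attitude set; the 0.3 cutoff mirrors the one used above, and the code is illustrative rather than part of the original analysis.

# Table 2 values for the computer-attitude set (re-entered for illustration).
standardized = {"anxiety": -.31, "confidence": .98, "liking": -1.25, "usefulness": -.13}
structure = {"anxiety": -.22, "confidence": -.13, "liking": -.80, "usefulness": -.41}

# A suppressor shows a large standardized weight but a small structure coefficient.
suppressors = [v for v in standardized
               if abs(standardized[v]) >= .3 and abs(structure[v]) < .3]
print(suppressors)  # ['anxiety', 'confidence']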
Notwithstanding, it appears that the construct of library anxiety can be expanded to include the construct of computer attitudes. Indeed, one implication of the findings is that Bostick's LAS should be modified to include dimensions of computer attitudes.38 Such a modification likely would facilitate the identification of library-anxious students. By identifying students with high levels of library anxiety and poor computer attitudes, library educators and others could help them improve their dispositions and provide them with the skills necessary to negotiate the rapidly changing technological environment, thereby putting them in a better position to be lifelong learners.

References

1. Susan M. Piotrowski, Computer Training: Pathway from Extinction (ERIC Document Reproduction Service, ED 348955, 1992).
2. Thomas H. Hogan, "Drexel University Moves Aggressively from Print to Electronic Access for Journals (Interview with Carol Hansen Montgomery, Dean of Libraries)," Computers in Libraries 21, no. 5 (May 2001): 22-27.
3. M. Claire Stewart and H. Frank Cervone, "Building a New Infrastructure for Digital Media: Northwestern University Library," Information Technology and Libraries 22, no. 2 (June 2003): 69-74.
4. Carol C. Kuhlthau, "Longitudinal Case Studies of the Information Search Process of Users in Libraries," Library and Information Science Research 10 (July 1988): 257-304; Carol C. Kuhlthau, "Inside the Search Process: Information Seeking from the User's Perspective," Journal of the American Society for Information Science 42, no. 5 (June 1991): 361-71; Carol C. Kuhlthau, Seeking Meaning: A Process Approach to Library and Information Services (Norwood, N.J.: Ablex, 1993); Carol C. Kuhlthau, "Students and the Information Search Process: Zones of Intervention for Librarians," Advances in Librarianship 18 (1994): 57-72; Carol C. Kuhlthau et al., "Validating a Model of the Search Process: A Comparison of Academic, Public, and School Library Users," Library and Information Science Research 12, no. 1 (Jan.-Mar. 1990): 5-31.
5. Constance A. Mellon, "Library Anxiety: A Grounded Theory and Its Development," College & Research Libraries 47, no. 2 (Mar. 1986): 160-65.
6. Ibid.
7. Qun G. Jiao, Anthony J. Onwuegbuzie, and Art Lichtenstein, "Library Anxiety: Characteristics of 'At-Risk' College Students," Library and Information Science Research 18 (spring 1996): 151-63.
8. Constance A. Mellon, "Attitudes: The Forgotten Dimension in Library Instruction," Library Journal 113 (Sept. 1, 1988): 137-39; Constance A. Mellon, "Library Anxiety and the Non-Traditional Student," in Reaching and Teaching Diverse Library User Groups, ed. Teresa B. Mensching (Ann Arbor, Mich.: Pierian, 1989), 77-81; Anthony J. Onwuegbuzie, "Writing a Research Proposal: The Role of Library Anxiety, Statistics Anxiety, and Composition Anxiety," Library and Information Science Research 19, no. 1 (1997): 5-33.
9. Anthony J. Onwuegbuzie and Qun G. Jiao, "Information Search Performance and Research Achievement: An Empirical Test of the Anxiety-Expectation Model of Library Anxiety," Journal of the American Society for Information Science and Technology (JASIST) 55, no. 1 (2004): 41-54; Anthony J. Onwuegbuzie, Qun G. Jiao, and Sharon L. Bostick, Library Anxiety: Theory, Research, and Applications (Lanham, Md.: Scarecrow, 2004).
10. Diane Mizrachi, "Library Anxiety and Computer Attitudes among Israeli B.Ed. Students" (master's thesis, Bar-Ilan University, Israel, 2000); Snunith Shoham and Diane Mizrachi, "Library Anxiety among Undergraduates: A Study of Israeli B.Ed. Students," Journal of Academic Librarianship 27, no. 4 (July 2001): 305-11.
11. Ann J. Jerabek, Linda S. Meyer, and Thomas S. Kordinak, "'Library Anxiety' and 'Computer Anxiety': Measures, Validity, and Research Implications," Library and Information Science Research 23, no. 3 (2001): 277-89.
12. Muhamad A. Al-Khaldi and Ibrahim M. Al-Jabri, "The Relationship of Attitudes to Computer Utilization: New Evidence from a Developing Nation," Computers in Human Behavior 9, no. 1 (Jan. 1998): 23-42; Margaret Cox, Valeria Rhodes, and Jennifer Hall, "The Use of Computer-Assisted Learning in Primary Schools: Some Factors Affecting Uptake," Computers in Education 12, no. 1 (1988): 173-78; Gayle V. Davidson and Scott D. Ritchie, "Attitudes toward Integrating Computers into the Classroom: What Parents, Teachers, and Students Report," Journal of Computing in Childhood Education 5, no. 1 (1994): 3-27; Donald G. Gardner, Richard L. Dukes, and Richard Discenza, "Computer Use, Self-Confidence, and Attitudes: A Causal Analysis," Computers in Human Behavior 9, no. 4 (winter 1993): 427-40; Robin H. Kay, "Predicting Student Teacher Commitment to the Use of Computers," Journal of Educational Computing Research 6, no. 3 (1990): 299-309.
13. Deborah Bandalos and Jeri Benson, "Testing the Factor Structure Invariance of a Computer Attitude Scale over Two Grouping Conditions," Educational and Psychological Measurement 50, no. 1 (spring 1990): 49-60; Frank M. Bernt and Alan C. Bugbee Jr., "Factors Influencing Student Resistance to Computer Administered Testing," Journal of Research on Computing in Education 22, no. 3 (spring 1990): 265-75; Michel Dupagne and Kathy A. Krendl, "Teachers' Attitudes toward Computers: A Review of the Literature," Journal of Research on Computing in Education 24, no. 3 (spring 1992): 420-29; Elizabeth Mowrer-Popiel, Constance Pollard, and Richard Pollard, "An Analysis of the Perceptions of Preservice Teachers toward Technology and Its Use in the Classroom," Journal of Instructional Psychology 21, no. 2 (June 1994): 131-38; Jennifer D. Shapka and Michel Ferrari, "Computer-Related Attitudes and Actions of Teacher Candidates," Computers in Human Behavior 19, no. 3 (May 2003): 319-34.
14. Valentina McInerney, Dennis M. McInerney, and Kenneth E. Sinclair, "Student Teachers, Computer Anxiety, and Computer Experience," Journal of Educational Computing Research 11, no. 1 (1994): 27-50.
15. Susan E. Jennings and Anthony J. Onwuegbuzie, "Computer Attitudes as a Function of Age, Gender, Math Attitude, and Developmental Status," Journal of Educational Computing Research 25, no. 4 (2001): 367-84.
16. Jiao, Onwuegbuzie, and Lichtenstein, "Library Anxiety," 152.
17. Mizrachi, "Library Anxiety and Computer Attitudes"; Shoham and Mizrachi, "Library Anxiety among Undergraduates."
18. Mizrachi, "Library Anxiety and Computer Attitudes"; Shoham and Mizrachi, "Library Anxiety among Undergraduates"; Jerabek, Meyer, and Kordinak, "'Library Anxiety' and 'Computer Anxiety.'"
19. Brenda H. Loyd and Clarice Gressard, "The Effects of Sex, Age, and Computer Experience on Computer Attitudes," AEDS Journal 18, no. 2 (1984): 67-77.
20. Sharon L. Bostick, "The Development and Validation of the Library Anxiety Scale" (Ph.D. diss., Wayne State University, 1992).
21. Qun G. Jiao and Anthony J. Onwuegbuzie, "Reliability Generalization of the Library Anxiety Scale Scores: Initial Findings" (unpublished manuscript, 2002).
22. Onwuegbuzie, Jiao, and Bostick, Library Anxiety, 22.
23. Norman Cliff and David J. Krus, "Interpretation of Canonical Analyses: Rotated versus Unrotated Solutions," Psychometrika 41, no. 1 (Mar. 1976): 35-42; Richard B. Darlington, Sharon L. Weinberg, and Herbert J. Walberg, "Canonical Variate Analysis and Related Techniques," Review of Educational Research 42, no. 4 (fall 1973): 131-43; Bruce Thompson, "Canonical Correlation: Recent Extensions for Modeling Educational Processes" (paper presented at the annual meeting of the American Educational Research Association, Boston, Mass., Apr. 7-11, 1980) (ERIC, ED 199269); Bruce Thompson, Canonical Correlation Analysis: Uses and Interpretations (Newbury Park, Calif.: Sage, 1984); Bruce Thompson, "Canonical Correlation Analysis: An Explanation with Comments on Correct Practice" (paper presented at the annual meeting of the American Educational Research Association, New Orleans, La., Apr. 5-9, 1988) (ERIC, ED 295957); Bruce Thompson, "Variable Importance in Multiple Regression and Canonical Correlation" (paper presented at the annual meeting of the American Educational Research Association, Boston, Mass., Apr. 16-20, 1990) (ERIC, ED 317615).
24. Margery E. Arnold, "The Relationship of Canonical Correlation Analysis to Other Parametric Methods" (paper presented at the annual meeting of the Southwest Educational Research Association, New Orleans, La., Jan. 1996) (ERIC, ED 395994).
25. Thompson, "Canonical Correlation: Recent Extensions."
26. Ibid.
27. Jacob Cohen, Statistical Power Analysis for the Behavioral Sciences (New York: Wiley, 1988).
28. Ibid.
29. Zarrel V. Lambert and Richard M. Durand, "Some Precautions in Using Canonical Analysis," Journal of Marketing Research 12, no. 4 (Nov. 1975): 468-75.
30. Anthony J. Onwuegbuzie and Larry G. Daniel, "Typology of Analytical and Interpretational Errors in Quantitative and Qualitative Educational Research," Current Issues in Education 6, no. 2 (Feb. 2003), accessed Nov. 13, 2003, http://cie.ed.asu.edu/volume6/number2/.
31. Barbara G. Tabachnick and Linda S. Fidell, Using Multivariate Statistics, 3rd ed. (New York: Harper, 1996).
32. Mizrachi, "Library Anxiety and Computer Attitudes"; Shoham and Mizrachi, "Library Anxiety among Undergraduates."
33. Jerabek, Meyer, and Kordinak, "'Library Anxiety' and 'Computer Anxiety.'"
34. Qun G. Jiao, Anthony J. Onwuegbuzie, and Sharon L. Bostick, "Racial Differences in Library Anxiety among Graduate Students," Library Review 53, no. 4 (2004): 228-35.
35. Qun G. Jiao and Anthony J. Onwuegbuzie, "Library Anxiety: A Function of Race?" (unpublished manuscript, 2003).
36. Anthony J. Onwuegbuzie and Joel R. Levin, "A Proposed Three-Step Method for Assessing the Statistical and Practical Significance of Multiple Hypothesis Tests" (paper presented at the annual meeting of the American Educational Research Association, San Diego, Calif., Apr. 12-16, 2004).
37. Jiao, Onwuegbuzie, and Bostick, "Racial Differences in Library Anxiety."
38. Bostick, "The Development and Validation of the Library Anxiety Scale."
9656 ----

Beyond Information Architecture: A Systems Integration Approach to Web-site Design

Krisellen Maloney and Paul J. Bracke

Krisellen Maloney (maloneyk@u.library.arizona.edu) is Director of Technology at the University of Arizona Libraries, Tucson. Paul J. Bracke (paul@ahsl.arizona.edu) is Head of Systems and Networking at the Arizona Health Sciences Library, Tucson.

Users' needs and expectations regarding access to information have fundamentally changed, creating a disconnect between how users expect to use a library Web site and how the site was designed. At the same time, library technical infrastructures include legacy systems that were not designed for the Web environment. The authors propose a framework that combines elements of information architecture with approaches to incremental system design and implementation. The framework allows for the development of a Web site that is responsive to changing user needs, while recognizing the need for libraries to adopt a cost-effective approach to implementation and maintenance.

The Web has become the primary mode of information seeking and access for users of academic libraries. The rapid acceptance of Web technologies is due, in part, to the ubiquity of the Web browser, which presents a user interface that is recognized and understood by a broad range of users. As libraries increase the amount of content and broaden the range of services available through their Web sites, it is becoming evident that it will take more than a well-designed user interface to completely support users' information-seeking and access needs. The underlying technical infrastructure of the Web site must also be organized to logically support the users' tasks. Library technical infrastructures, largely designed to support traditional library processes, are being adapted to provide Web access. As part of this adaptation process, they are not necessarily being reorganized to meet the changing expectations of Web-savvy users, particularly younger users who are not familiar with traditional library organization methods such as the card catalog, print indexes, or other legacy tools.

Libraries must harness the power of the highly structured information systems that have long been a part of libraries and integrate these systems in new ways to support users' goals and objectives. Part of this challenge will be answered by the development of new systems and technical standards, but these are only a partial solution to the problem. An important part of making library systems and Web sites function as powerful discovery tools is to modernize the systems that provide existing services and content to support the changing needs and expectations of the user. Emerging concepts of information architecture (IA) describe the system requirements from the user perspective but do not provide a mechanism to conceptually integrate existing functions and content, or to inform the requirements necessary to modernize and integrate the current system architecture.

The authors propose a framework for approaching a comprehensive Web-site implementation that combines components of IA and system modernization that have been successful in other industries. Within this framework, those components are tailored for the unique aspects of information provision that characterize a library.
The proposed framework expands the concept of IA to include functional and content requirements for the Web site. This expansion identifies points within the conceptual and physical design where user requirements are constrained by the existing infrastructure. Identification of these constraints begins an iterative design process in which some user requirements inform changes to the underlying system architecture. Conversely, when the required changes to the underlying system architecture cannot be achieved, the constraints inform the conceptual design of the Web site. The iterative nature of this approach acknowledges the usefulness of much of the existing infrastructure but provides an incremental approach to modernizing installed systems. This framework describes aspects of the conceptual and physical-design elements that must be considered together and balanced to produce a Web site that supports the goals and objectives of the user but is cost-effective and practical to implement.

Information Architecture and the Problem of Libraries

IA is both a characteristic of a Web site and an emerging discipline. A number of authors have attempted to develop a formal definition of IA. Wodtke presents a simple task-based definition, stating that an information architect "creates a blueprint for how to organize the Web site so that it will meet all (business, end user) these needs."1 Rosenfeld and Morville present a four-part definition in which two parts focus on the practice, and two parts define IA as a characteristic. The first characteristic defines IA as a combination of "organization, labeling, and navigation schemes," while the second describes it as "the structural design of an information space to facilitate task description and intuitive access to content."2 There is general agreement that IA provides a specification of the Web site from the perspective of the user. The specification usually describes the organization, navigational elements, and labeling required to completely structure a user's Web-site experience. IA is not synonymous with Web-site design, but rather provides the conceptual foundation upon which a presentation design is based. Web-site design adds presentation and graphical elements to IA to create the user experience.

Library Web sites provide a display platform by which library content and services can be accessed through a common user interface. Most of the tools and services have been available for decades and, in response to user demand, are increasingly being made Web-accessible in digital formats (virtual reference, full-text databases). Despite this new access medium and format, the conceptual design of the underlying systems has not changed much. The library technical infrastructure is made up of many loosely coupled systems optimized to perform a single function or to support the work of a library department. Library Web sites do not present a sufficiently unified interface design or level of technical integration to match current users' mental models of information seeking and access.3
The systems have not been integrated to support users' overarching goals or to meet the expectations of seamless access that users have developed when using other Web sites (such as Google or Amazon). In many cases, users are still expected to understand aspects of the library that are now obsolete (card catalogs) in order to navigate the library's Web site.

For example, the process of finding a journal article using a typical library Web site is based on a print paradigm and has changed little despite the advent of online discovery tools. In a print environment, users first looked at an index to identify an article of interest, then wrote down the citation, went to the card catalog, and there looked up the journal containing the article. If the library owned the journal, the user would then write down the call number and go to the shelves to find the article. This process has not necessarily changed much for many libraries, even though indexes, card catalogs, and journals are often available online. Even more confusing is that the end result of some search processes within a library Web site is not necessarily content, but a metadata representation of content that must be entered into another search box. Although the first search is representative of the search of a traditional index and the second search is representative of the search of the card catalog, many of our users have no mental model for this multistep search process. Users accustomed to the simple keyword search available through Internet search engines may have great difficulty in understanding the need for the many steps involved in library use. There is an expectation that search systems and online content will be linked, regardless of the economic, legal, and technical factors that make these links difficult. While linking options in vendor databases and OpenURL resolvers have begun to simplify the electronic version of the process by automating some of the steps, the multistep process is still valid in many instances in most libraries.

It is clear that library Web sites must undergo a fundamental change in order to be responsive to the needs of the user. Because library Web sites appear to be similar to conventional Web sites, it is tempting to adopt a general approach to IA to address users' needs. There are, however, several areas in which the general approach to IA does not adequately support the design needs of library Web sites.

Generalized IA approaches, such as those provided by Rosenfeld and Morville, do not provide adequate guidance regarding the organization and display of content from external sources. There is an unstated assumption that external sources will provide information in the format specified by the Web-site architect. IA approaches suggest methods to completely describe the user experience, from the time a user first accesses a site to the point at which a user task is complete, regardless of the origin of the content or service accessed. For example, the content from each of Amazon.com's commercial partners is packaged to operate like a part of the Amazon.com site. In contrast, libraries often control a user's experience only up to the point at which the user leaves the library's servers. Libraries guide users not only to local services and digitized collections, but also to databases, journals, and more that are licensed from external sources and whose appearances are controlled by those external sources.
Even when using a technical standard such as Z39.50 to provide a local look and feel to remote resources, libraries do not necessarily have full control over the data format or elements of the content that is returned. This lack of local control over content is a limitation to libraries adopting common definitions of IA.

Another design area that is not well supported by generalized approaches to IA is the integration of previously installed systems, such as library catalogs. These legacy systems provide important services that represent decades of development and collaboration, and are essential to the future of libraries. For example, libraries provide access to unique resources and systems ranging from online catalogs to abstracting and indexing databases to interlibrary loan (ILL) networks. Libraries are using Web technologies to provide new access methods to library content and services. These technologies provide a thin veneer on systems that function in a manner unfamiliar to many users. The challenge then becomes to change what lies beneath the surface, the underlying functionality of the site, to support the needs of the user. Using a generalized approach to IA, as applied in other settings, libraries would assess the needs of the user and develop a new, complete system that supports those needs. Such an approach ignores the extensive, existing infrastructure of legacy systems in libraries that is still useful and that serves purposes beyond the user's Web interface. What is needed is a standard reference model for library services that provides a framework for access to services and content. This is a long-term goal that requires cooperation and agreement among libraries, and that would allow legacy systems to be repackaged in ways that are more flexible, meet changing user needs, and can be integrated into changing technology environments. Because there are currently no such reference models, librarians need to develop other approaches to integrate existing legacy systems into a modernized Web site.

Extending the IA Framework

In this paper, the general definition of IA that has been proposed by several authors has been extended to incorporate the additional constraints that characterize library Web sites.4 Extended Information Architecture (EIA) is the first half of the framework, and provides a complete conceptual design of the Web site from the users' perspective. Figure 1 depicts the elements and relationships within EIA. The coordinating structure provides an overarching framework for the integration of the multiple service elements that provide much of the underlying functionality of the Web site. The relationship between the coordinating structure and the service elements is iterative, with service elements constraining the coordinating structure and the coordinating structure informing the design of the service elements.

The Coordinating Structure

The coordinating structure contains many of the design elements that are found in generalized approaches to IA, including the organization, navigational structure, and labeling. These are the elements of a Web site that, in concert, define the structure of the user interface without specifying the functionality and content underlying that interface.
The framework emphasizes aspects of the generalized approaches that are most relevant to libraries and places them in relation to the service elements that specify the content and functionality of the site.

The first element of the coordinating structure is the organization of the Web site. Organization refers to the logical groupings of the content and services that are available to the user. These groupings are not necessarily representative of physical-system implementations, but may be task- or subject-based instead. For example, many academic library Web sites have primary groupings that include information resources, services, and user guides. Although the information resources may include information from a range of systems (for instance, the catalog, abstracting and indexing databases, full-text databases, locally developed exhibits), the logical grouping of information resources unifies the concept for the user. A site's organization scheme will often serve as the foundation for the primary navigational choices on a site's main menu or primary navigation bar.

Another component of the coordinating structure is the navigational structure of the site. Navigational structures define the relationships between content and service elements of a site, and between groupings in the site's organization. These structures also include search tools and other link-management tools that help users locate needed content and services. There are usually two types of relationships that form a navigational structure. First is the definition of a global relationship scheme that outlines the primary navigational structure of the site. These often define relationships between sections of a site's organization, but may also provide access to key pieces of functionality from any point within a site. In addition to the overarching global relationship scheme, there are often several locally or functionally defined relationship schemes that are used throughout the site. These local relationship schemes are usually located within a service or content grouping and provide logical connections within their defined grouping. Both sets of relationships are designed to support a task and provide pathways for the user to move among the various elements of the site. Other relationship schemes may be topic oriented, allowing the user to move easily among similar content sources. These logical relationships are later implemented within a user interface as tools such as menus, navigation bars, and navigation tabs when combined with labels and a visual design.

Customization and personalization are navigational structures that have gained a fair amount of attention in the library literature. Both strategies allow a Web site to be displayed differently, based on user characteristics. Customization allows the user to create the relationships most suitable for his or her needs. This strategy has been explored by a number of libraries, although there is little convincing evidence that users implement such strategies in an intense or repeated manner.5 Personalization allows a system designer to bring together a set of pages in a relationship that is meaningful for a user or a user group.

Labels, the third element of the coordinating structure, provide signposts that communicate an integrated view of a Web site's design to those who use it. It is important to define a labeling system that consistently and clearly communicates the meaning of the site to the user.
Accordingly, the labels should be constructed in the user's language, not the librarian's. For example, a user may not understand that an abstracting and indexing database will provide them with information regarding journal articles that are relevant to a topic of interest. In that case, the label "Find an article" is more useful than "Indexes."

Labels are used to describe individual service or content units, but may also be used as headings to provide structural elements to augment the navigational scheme. The consistent use of labels as headings within the site not only increases user understanding of the site, but may also be explicitly constructed to support user tasks. An example of labeling to support tasks can be seen on the University Libraries Web site of the University of Louisville where, under the main heading for Articles, the first subheading is Step 1: Search article databases; and the second subheading is Step 2: Search (the catalog) by journal title.6

Figure 1. An Extended Information Architecture for Developing a Conceptual Design of Library Web Sites

Extended Information Architecture

Coordinating Structure
• Organization: The grouping and specification of the function and content that is necessary to support the site.
• Navigational Structure: The associations among the service and content elements of the site. These relationships provide the conceptual foundation for navigation and include global and local navigational concepts, site index and search, and customizable and personalized structures.
• Labeling: A consistent naming scheme that presents options and choices to users in terms that they will understand.

Service Elements
• Functional Requirements: The description of the functional elements that are necessary to support the user.
• Content Requirements: The description of the content elements that are necessary to support the user.
• Content Specifications: The description of the content elements that are already available to support the user.
• Functional Specifications: The description of the functional elements that are present in a previously installed system.
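To make the three coordinating-structure elements concrete, here is one hypothetical way the organization, global navigation, and user-language labels might be captured as a simple data structure during design. The groupings, labels, and internal names below are invented examples, not a schema prescribed by the framework.

# Hypothetical encoding of a coordinating structure: organization groups the
# service elements, a global scheme orders the main navigation, and every
# label is phrased in the user's language. All names are invented examples.
coordinating_structure = {
    "organization": {
        "Find an article": ["article_index_search", "journal_title_lookup"],
        "Find a book": ["catalog_search"],
        "Get help": ["virtual_reference", "how_do_i_guides"],
    },
    "global_navigation": ["Find an article", "Find a book", "Get help"],
    "labels": {
        "Find an article": "abstracting_and_indexing_databases",  # not "Indexes"
        "Find a book": "online_catalog",                          # not "OPAC"
    },
}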
Service Elements

Service elements are the second major component of Extended Information Architecture, and represent the content and functionality of the Web site. In this framework, the service elements serve a dual purpose. The definition of service elements involves defining both the ideal requirements for functionality and content as well as the specifications of what is currently available. The definition process can then be used to identify points in the Web site where new functions and content need to be added, or where existing functionality must be modernized. These additions and modifications may be achievable immediately, but in many cases an incremental plan for change may need to be developed.

The service-element requirements, labeled as Functional Requirements and Content Requirements in figure 1, express the users' needs and expectations for the functional or content elements of the Web site. The purpose of the requirements definitions is to describe the service elements that are necessary to allow a user to meet his or her goals or objectives in using the site. These requirements are a representation of the ideal composition of a Web site, and inform not only the immediate implementation of the site but also the development of future systems and the modernization of existing systems. It is also important to note that the requirements should be developed to express user needs, not a particular implementation option. For example, it might be tempting to specify the implementation of a particular vendor's OpenURL resolver. This does not, however, describe how the system would function ideally from a user perspective. Instead, an appropriate requirement would be that users should be able to link to full text from all citations in an abstracting and indexing database.

More specifically, content requirements describe the content that is necessary to meet the users' goals and objectives. Access to content is often the primary emphasis of a library Web site, and the content requirements describe the intellectual content that should be accessible through a Web site. Examples of content that might be required are article citations, full-text articles, and multimedia objects. Normally, these requirements will be closely connected with library-wide collection-development policies and priorities, and should be driven by subject specialists rather than systems personnel. These requirements inform the development of systems to meet the needs of the users. The content specifications describe the content that is available within the current systems. There are many reasons why content requirements and content specifications do not match, including the inability or choice of a library to acquire a particular piece, the unavailability of specified content, or technical incompatibilities between content and the library's infrastructure.

Although content is sometimes viewed as the core component of a library Web site, there is also a great deal of additional functionality that is provided to users. The functional requirements describe the users' needs and expectations of the functionality in the context of completing tasks on the Web site. For example, ILL forms found on many sites are easy for the user to fill out, although the most effective interface to ILL for the user might not involve a form-based user interface at all. It might be a direct system-to-system interface from an OpenURL resolver to the ILL software in which all citation data are transmitted for the user. This requirement is not necessarily obvious when considering ILL in isolation, but is evident when considering it in the larger context of the users' goals and objectives for the entire Web site. The functional specifications describe the functions as they exist in the installed base of systems and expose the functionality that is available to the user. When the specifications do not match the requirements, the users' expectations regarding the system will not be fully achieved.

The economic and technical limitations of system implementation and modernization often reduce the speed at which the large base of previously installed systems can be modified to meet users' changing needs and expectations. It is thus critical to identify gaps between existing systems and desired systems and discover areas where a Web site will have characteristics that are not completely aligned with what the user needs or expects.
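As an illustration of the full-text-linking requirement mentioned above, the sketch below assembles an OpenURL 1.0 query string from a citation, using the standard key-encoded-value (KEV) keys such as rft.atitle and rft.issn. The resolver address and citation values are placeholders, and the function name is invented for this example.

from urllib.parse import urlencode

def openurl_for_citation(resolver_base: str, citation: dict) -> str:
    """Build an OpenURL 1.0 (KEV) link for a journal-article citation."""
    params = {
        "url_ver": "Z39.88-2004",                       # OpenURL 1.0 version key
        "rft_val_fmt": "info:ofi/fmt:kev:mtx:journal",  # journal metadata format
        "rft.atitle": citation["article_title"],
        "rft.jtitle": citation["journal_title"],
        "rft.issn": citation["issn"],
        "rft.volume": citation["volume"],
        "rft.spage": citation["start_page"],
        "rft.date": citation["year"],
    }
    return f"{resolver_base}?{urlencode(params)}"

# Placeholder resolver address and citation values, for illustration only.
link = openurl_for_citation(
    "https://resolver.example.edu/openurl",
    {"article_title": "Sample Article", "journal_title": "Sample Journal",
     "issn": "0000-0000", "volume": "23", "start_page": "138", "year": "2004"},
)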
When the service-element requirements do not match the service-element specifications of existing systems, an iterative design process begins. This process will be intertwined with the evaluation of the system architecture. Gaps that can be addressed immediately should be incorporated into an implementation plan for the new Web site. Longer-term migration or development plans can be developed to fill gaps that cannot be addressed immediately. It is also important to acknowledge that developing and meeting service-element requirements is an iterative process. They will need to be revisited over time as user needs change, and requirements that are met now become the specifications that are evaluated in the future.

Interrelationships within EIA

When the service-element requirements cannot be used to modify the service-element specifications, the service elements constrain the design of the Web site and influence the design of the coordinating structure. The upward arrow in figure 1 labeled Constrains indicates that the user experience is constrained by the specifications of content or functional elements that are not currently changeable. In such situations, the coordinating structure must be designed to provide additional context for the user to understand the purpose of the existing service elements. This explanatory role can be seen in the implementation of many Web sites as formal parts of the organizational structure designed to explain the idiosyncrasies of the Web site to the user. For example, many academic library Web sites have tutorials, FAQs, or sections labeled "How do I . . . ?" that provide tips on using aspects of the site that are not always evident to users.

It is necessary to acknowledge the usefulness of the explanatory role of the coordinating structure in the iterative and incremental processes of Web-site design. Just as bibliographic instruction and adequate signage have allowed the user to navigate aspects of the traditional library that were not intuitive, the coordinating structure provides the conceptual signposts and other guidance required for users to effectively navigate the Web site. At the same time, it is important to realize that the explanatory role would not be necessary if the Web site's architecture and design were intuitive to the user. As the design of the service elements changes to accommodate the larger goals of the user, the explanatory function of the coordinating structure will be diminished. The main goal of library Web-site design should be to reduce the explanatory role of the coordinating structure and to develop service elements that seamlessly support the goals and objectives of the user. Until all service elements have been modernized to meet the needs of the user, the conceptual design of Web sites will represent a compromise between what users require and what it is possible for users to do within the current legacy information infrastructure.

System Architecture

While the conceptual design of the Web site describes the needs of the user apart from the technical details of the implementation, the system architecture is the description of the system as it exists. In the case of library Web sites, the system architecture is not limited to the functionality and data on the library's Web server. Instead, it is also inclusive of all core infrastructure, individual systems, and data access and storage mechanisms that provide the blueprint of the Web site's backend as it has been built.
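One hedged way to record such an as-built description in machine-readable form is a small inventory in which each component notes its class, whether it can be modified locally, and the interfaces or standards it exposes. The component names and interface labels below are illustrative, not drawn from the article.

from dataclasses import dataclass, field

@dataclass
class SystemComponent:
    """One documented entry in an as-built system architecture."""
    name: str
    category: str             # "core infrastructure", "application", or "storage/access"
    locally_modifiable: bool  # locally developed vs. off-the-shelf or remote
    interfaces: list = field(default_factory=list)  # standards or vendor APIs exposed

architecture = [
    SystemComponent("online catalog", "application", False, ["Z39.50", "vendor API"]),
    SystemComponent("interlibrary loan", "application", False, ["Web form"]),
    SystemComponent("subject guides", "application", True, ["local templates"]),
    SystemComponent("authentication", "core infrastructure", False, ["campus directory"]),
]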
The individual systems in the architecture may include locally controlled ones (for instance, an online catalog), but will also include remote systems such as abstracting and indexing databases mounted by a vendor. A definition of the design of the existing system plays a key role in the evolutionary specification of the system because it provides developers with a greater understanding of the possibilities and constraints of the existing infrastructure. In describing a system architecture, several formal representations can be used that capture various aspects of the system's capabilities at different levels of granularity. These include module views that provide static specifications of individual components; component and connector views that provide dynamic views of processes; and deployment views that incorporate hardware elements.7 The selection of representations is beyond the scope of this paper.

Typical elements of a system architecture can be seen in figure 2. For this paper, three classes of components are being considered, although more may be introduced if applicable locally. The core-infrastructure components are fundamental services and information that support one or more systems or subsystems. In a typical library environment this includes authentication services, Web platforms, and the network. In some library environments, external units may maintain some or all of these components. For example, many college campuses maintain an authentication infrastructure in the campus computing office. Overall, core infrastructure provides the glue for tying together the many applications that libraries attempt to integrate in their Web sites. The system architecture should include details regarding the standards and interfaces that are used within the library technical infrastructure.

Many of the applications in the library environment are off-the-shelf components that have been developed by external vendors. These off-the-shelf components may include the catalog, ILL modules, electronic-course reserves, and virtual-reference systems. Although individual libraries may have some control over configuration options in these applications, they are likely to have little influence over the basic functionality or data formats provided by these systems. Core functionality tends to change based on the demands of many libraries looking for similar functionality. Despite the lack of functional control over these systems, components developed by external vendors may provide standards-based system interfaces to their functionality. These usually take the form of industry-supported standards or vendor-supplied application programming interfaces and give libraries some flexibility in working with these components. Explicit descriptions of the available standard and proprietary interfaces should be included within the system architecture.

Other applications may have been developed within the library and so can be changed more easily. Examples of locally developed applications typically include subject pages, information about the library, and digital Web exhibits and collections. Although local development does provide more control over the appearance and functionality of a piece of software, it is not without problems.
Local development is often conducted using a bricolage approach, solving specific problems singularly, without giving consideration to the larger networks of systems in which the solutions operate. When such approaches do not take into account larger issues of systems architecture, opportunities to solve a broader range of problems may be missed, and subsequent repackaging of these solutions may be limited or impossible. Libraries frequently also have a limited number of programmers, a shortage often remedied by pulling librarians or staff from other duties. While this certainly can allow libraries to meet some user needs, the lack of software-engineering skills in libraries may result in local solutions that are inflexible and that do not support standards for data storage or interchange. Because the internal design of these applications is accessible and modifiable, the system architecture should include more extensive descriptions of the internal features and relationships that they contain. Although this will not completely alleviate the problems of software maintenance, it will provide a better foundation for decisions regarding future migration.

Figure 2. Elements of a System Architecture

System Architecture

Applications (off-the-shelf and locally developed): Specification of the access mechanisms and standards for previously installed systems, including:
• Catalog
• Interlibrary Loan
• Electronic Reserves
• Abstracting and Indexing Databases
• Content Management Systems
• Legacy Web Content

Core Infrastructure
• Authentication: The validation of a user's identity based on credentials. Increasingly a part of a campus-wide infrastructure.
• Web Platforms: Operating systems, server software, and application software that provide the general foundation for the Web site.
• Network: The communication infrastructure within the library system and connecting to the Internet.

Information Storage and Access
• Storage: The definition of storage structures, including relational or hierarchical schema, and character-format specifications.
• Standards: Standards available for access to the data. These include formats like MARC and Dublin Core and mechanisms like Z39.50 and ODBC.

Finally, typical library architectures consist of links to resources that are licensed or organized on behalf of the user. These include abstracting and indexing databases, full-text content provided by publishers outside of the library, and general vetted Internet sites. Linking the user to the system usually provides access to these systems, and libraries have no control over the technical implementations of such resources. Newer federated-search technologies are integrating into the library infrastructure the users' access to the site and to results from the sites, and linking tools make the interrelationships between these systems more easily understood. Nevertheless, integrating these resources into a Web site in a manner that makes sense to library users is a challenge. The access mechanisms and information formats required to communicate with the site should be clearly documented within this system architecture.

Interrelationship of the Information and System Architectures

Reacting to the rapid pace of change can result in an ad hoc or haphazard approach to Web-site design. The sections above describe a systematic approach to include and evaluate changes to the Web site.
Figure 3 graphically depicts the interrelationship between EIA and system architecture. User needs, as described by IA, should inform the development of technical infrastructure. The Informs arrow, indicating that EIA informs the design and development of the system architecture, depicts this interrelationship. The Constrains arrow designates the reality that some aspects of the existing infrastructure cannot be changed within this planning cycle and will limit the library's ability to immediately change the underlying content and function of the Web site.

Figure 3. The Interrelationship between the Conceptual and Physical Design of the Library Web Site (panels: Extended Information Architecture; System Architecture; Core Infrastructure: authentication, Web platforms, network)

When mapping the conceptual design to the physical design, there will be gaps that represent functionality that cannot be supported, either fully or in part, by the current system architecture and thus constrain the full implementation of the conceptual design. If IA is then to be implemented as fully as possible, these gaps identify the modifications and additions that must be carefully evaluated, designed, and implemented within the underlying system architecture.

Gaps can be addressed in a variety of ways. If there is a total gap in functionality, a system can be developed or implemented to provide the desired functionality as part of the larger system architecture. This may result in a complete development project or in the specification of an off-the-shelf application to meet the newly identified demand. In the case where an existing system has some of the required functionality but is not completely suitable for the users' goals and objectives, an incremental approach of modernization can be adopted. Modernization surrounds "the legacy system with a software layer that hides the unwanted complexity of the old system and exports a modern interface."8 This is done to provide integration with a modern operating environment while retaining the data and exposing the functions of the existing system, if desired. Techniques range from screen scraping to the implementation of Web services to export access to functions that are still relevant within the new context. All of these changes become part of the system architecture for future iterations of change. Gaps that cannot be immediately added or changed to meet the specified requirements become constraints in the next iteration of conceptual design.
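A minimal sketch of this wrapping idea, under invented assumptions: the fragment below hides hypothetical screen-scraped output from a terminal-era catalog behind a small parsing layer and exports it through a modern, JSON-based interface. The screen format, field names, and service function are illustrative only, standing in for whatever a given legacy system actually emits:

import json
import re

# Hypothetical raw output from a legacy, terminal-oriented catalog, of the
# kind a screen-scraping layer might capture; real systems will differ.
LEGACY_SCREEN = """\
TITLE:  Modernizing legacy systems
AUTHOR: Seacord, Robert C.
STATUS: CHECKED IN
"""

def parse_legacy_screen(screen):
    # Hide the legacy line-oriented format behind a structured dictionary.
    record = {}
    for line in screen.splitlines():
        match = re.match(r"([A-Z]+):\s*(.*)", line)
        if match:
            record[match.group(1).lower()] = match.group(2).strip()
    return record

def item_status_service(screen):
    # Export a modern (JSON) interface over the wrapped legacy data, the
    # kind of layer a Web service could expose to the rest of the site.
    return json.dumps(parse_legacy_screen(screen), indent=2)

print(item_status_service(LEGACY_SCREEN))

The legacy system keeps running unchanged; only the wrapper needs revision when the surrounding Web environment evolves, which is exactly what makes the layer part of the documented system architecture.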
In the absence of a plan, the underlying systems will continue to undergo constant evolutionary changes, ostensibly to meet the changing needs and workflows of both users and staff. Change comes from many sources, including local implementations and modifications, external vendors, and industry-wide changes in standards. This rapid but incremental change can produce a system that is very difficult to maintain and that provides few reusable modules. Having a well-documented implementation and integration plan will not guarantee that the library will not experience the negative effects of technological change, but it does allow a library to better manage change in meeting the needs of its users. The more explicitly and clearly the modifiable features are documented within the system architecture, the easier it will be to plan to fill the gaps.

Conclusion

Library users' mental models of library processes have fundamentally changed, creating a serious disconnect between how users expect to use a library Web site and how the site was designed. In particular, user expectations regarding the number of steps that must be completed have changed. At the same time, library technical infrastructures are composed, in part, of legacy systems that provide great value and facilitate interlibrary resource sharing, but were not designed for the Web environment. It is essential that libraries develop new approaches to the conceptual design of Web sites that support current and future changes to both user behaviors and to library systems architectures. In the long run, these approaches should contribute to the development of a reference model for the description of library services.

The authors have proposed a complete framework for conceptual design and physical implementation that is responsive to changing user needs while recognizing the need for libraries to adopt an efficient and cost-effective approach to Web-site design, implementation, and maintenance. Functional and content needs of the user are identified and molded into a conceptual design based on a broadened perspective of the users' objectives. Mapping conceptual requirements to physical architectures is an important part of this framework, using an architectural representation in combination with descriptions of integration elements that have been developed to support the incremental and iterative change. The ability to respond is essential, necessitated by the rapid change in the technical and user environments in which libraries operate. The framework is designed to allow logical and informed decisions to be made throughout the process regarding when to create new systems, when to replace or modernize existing systems, and when to improve the conceptual signage of the Web site.

References

1. Christina Wodtke, Information Architecture: Blueprints for the Web (Indianapolis: New Riders, 2003).
2. Louis Rosenfeld and Peter Morville, Information Architecture for the World Wide Web, 2nd ed. (Cambridge, Mass.: O'Reilly, 2002), 4.
3. Bob Gerrity, Theresa Lyman, and Ed Tallent, "Blurring Services and Resources: Boston College's Implementation of MetaLib and SFX," Reference Services Review 30, no. 3 (2002): 229-41; Barbara J. Cockrell and Elaine Anderson Jayne, "How Do I Find an Article? Insights from a Web Usability Study," Journal of Academic Librarianship 28, no. 3 (May 2002): 122-32.
4. Jesse James Garrett, Elements of User Experience (Indianapolis: New Riders, 2002); Rosenfeld and Morville, Information Architecture.
5. James S. Ghaphery and Dan Ream, "VCU's My Library: Librarians Love It ... Users? Well, Maybe," Information Technology and Libraries 19, no. 4 (Dec. 2000): 186-90; James S. Ghaphery, "My Library at Virginia Commonwealth University: Third Year Evaluation," D-Lib Magazine 8, no. 7/8 (July/Aug. 2002). Accessed July 16, 2003, www.dlib.org/dlib/july02/ghaphery/07ghaphery.html.
6. University of Louisville Libraries Web site (2003). Accessed July 16, 2003, http://library.louisville.edu.
7. Craig Larman, Applying UML and Patterns: An Introduction to Object-Oriented Analysis and Design (New Jersey: Prentice Hall PTR, 1998); Martin Fowler, Analysis Patterns: Reusable Object Models (Boston: Addison-Wesley, 1997); James Rumbaugh, Ivar Jacobson, and Grady Booch, The Unified Modeling Language Reference Manual (Boston: Addison-Wesley, 1999); Robert C. Seacord, Daniel Plakosh, and Grace A. Lewis, Modernizing Legacy Systems: Software Technologies, Engineering Processes, and Business Practices (Boston: Addison-Wesley, 2003).
8. Seacord, Plakosh, and Lewis, Modernizing Legacy Systems, 9.

9657 ---- Policies Governing Use of Computing Technology in Academic Libraries
Jason Vaughan
Information Technology and Libraries, Dec. 2004, 23, no. 4: 153.
Jason Vaughan (jvaughan@ccmail.nevada.edu) is Head of the Library Systems Department at the University of Nevada, Las Vegas.

The networked computing environment is a vital resource for academic libraries. Ever-increasing use dictates the prudence of having a comprehensive computer-use policy in force. Universities often have an overarching policy or policies governing the general use of computing technology that helps to safeguard the university equipment, software, and network against inappropriate use. Libraries often benefit from having an adjunct policy that works to emphasize the existence and important points of higher-level policies, while also providing a local context for systems and policies pertinent to the library in particular. Having computer-use policies at the university and library level helps provide a comprehensive, encompassing guide for the effective and appropriate use of this vital resource.

For clients of academic libraries, the computing environment and access to online information is an essential part of everyday service, every bit as vital as having a printed collection on the shelf. The computing environment has grown in positive ways: higher-caliber hardware and software, evolving methods of communication, and large quantities of accurate online information content. It has also grown in many negative ways: the propagation of worms and viruses, other methods of hacking and disruption, and inaccurate informational content. As the computing environment has grown, it has become essential to have adequate and regularly reviewed policies governing its use. Often, if not always, overarching policies exist at a broad institutional or even larger systemwide level. Such policies can govern the use of all university equipment, software, and network access within the library and elsewhere on campus, such as campus computer labs. A single policy may encompass every easily conceivable computing-related topic, or there may be several individual policies. Apart from any document drafted and enforced at the university level, various public laws exist that also govern appropriate computer-use behavior, whether in academia or on the beach. Many institutions have separate policies governing employee use of computer resources; this paper focuses on student use of computing technologies. In some cases, the library and the additional campus student-computer infrastructure (for example, campus labs and dormitory computer access) are governed by the same organizational entity, so the higher-level policy and the library policy are de facto the same.
In many instances, libraries have enacted additional computer-use policies. Such policies may emphasize or augment certain points found in the institution-level policy(s), address concerns specific to the library environment, or both. This paper surveys the scope of what are most commonly referred to as "computer-use policies," specifically, those geared toward the student-client population. Common elements found in university-level policies (and often later emphasized in the library policy) are identified. A discussion of additional topics generally more specific to the library environment, and often found in library computer-use policies, follows. The final section takes a look at the computer-use environment at the University of Nevada, Las Vegas (UNLV), the various policies in force, and identifies where certain elements are spelled out: at the university level, the library level, or both.

Policy Basics

Purpose and Scope

Policies can serve several purposes. A policy is defined as "a plan or course of action ... intended to influence and determine decisions, actions, and other matters. A course of action, guiding principle, or procedure considered expedient, prudent, or advantageous."1

Any sound university has a comprehensive computer-use policy readily available and visible to all members of the university community: faculty, staff, students, and visitors. Some institutions have drafted a universal policy that seeks to cover all the pertinent bases pertaining to the use of computing technology. In some cases, these broad overarching policies have descriptive content as well as references to other related or subsidiary policies. In this way, they provide content and serve as an index to other policies. In other cases, no illusions are made about having a single, general, overarching policy; the university has multiple policies instead. Policies can define what is permitted (use of computers for academic research) or not permitted (use of computers for nonacademic purposes, such as commercial or political interests). A policy is meant to guide behavior and the use of resources as they are meant to be used. In addition, policies can delve into procedure. For example, most policies contain a section on how to report suspected abuse, how suspected abuse is investigated, and what the potential penalties are. Policies buried in legalese may serve some purpose, but they may not do a good job of educating users on what is acceptable and not acceptable. Perhaps the best approach is an appropriate balance between legalese and language most users will understand. In addition, policies can also serve to help educate individuals on important topics, rather than merely stating what is allowed and what will get one in trouble. For example, a general policy statement might read, "You must keep your password confidential." Taken a step further, the policy could include recommendations pertaining to passwords, such as the minimum password length, inclusion of nonalphabetic characters, the recommendation to change the password regularly, and the mandate to never write down the password.
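Recommendations of this kind map directly onto enforceable checks. The sketch below, with purely illustrative thresholds rather than any institution's actual rules, shows how such password guidance might be tested when an account is created:

import re

def check_password(password, min_length=8):
    # Return the list of policy recommendations a proposed password violates.
    # The thresholds are illustrative only, not any university's actual rules.
    problems = []
    if len(password) < min_length:
        problems.append("shorter than %d characters" % min_length)
    if not re.search(r"[^A-Za-z]", password):
        problems.append("contains no nonalphabetic characters")
    if password == password.lower() or password == password.upper():
        problems.append("does not mix upper- and lowercase letters")
    return problems

print(check_password("library"))        # violates all three recommendations
print(check_password("L1brary!2004"))   # passes: returns an empty list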
Characteristics of a Policy: Visibility, Prominence, Easily Identifiable

A policy is most useful when it is highly visible and clearly identified as a policy that has been approved by some authoritative individual or body. Students often sign a form or agree online to terms and conditions when their university accounts are established. Web pages may have a disclaimer stating something to the effect of "use of (institution's) resources is governed by ..." and provide a hyperlink to the various policies in place. Or, a simple policies link may appear in the footer of every Web page at the institutional site. Some universities have gone a bit further. At the University of Virginia, for example, students must complete an online quiz after reviewing the computer-use guidelines.2 In addition, they can choose to view the optional video. Such components serve to enhance awareness of the various policies in place.

A review of the library literature failed to uncover any articles focusing on computer-use policies in academic libraries. The author then selected several institutions of similar size to UNLV (but not necessarily peers), doctoral-granting universities with a student population between twenty thousand and thirty thousand, and thoroughly examined their library Web sites to see what, if any, policy components were explicitly highlighted. It quickly became evident that many libraries do not have a centrally visible, specifically titled, inclusive computer-use policy document. Most, but not all, of the library Web sites provided a link to the institutional-level computer-use policy. In some cases, library policies were not consolidated under a central page titled "Policies and Procedures" or "Guidelines," and, where they did appear, the context did not imply or state authoritatively that this was an official policy. There was no statement of who drafted the policy (which can lend some level of authority or credence), as well as no indicated creation or revision date. Granted, many libraries have paper forms one must sign to obtain a library card, or they may state the rules in hardcopy posted within prominent computer-dense locations. Still, with so much emphasis given to licensed database and Internet resources, and with such heavy use of the computing environment, such policies should appear online in a prominent location. Where better to provide a computer-use policy than online? Perhaps all the libraries reviewed did have policies posted somewhere online. If the author could not easily find them, chances are a student would have difficulties as well. In sum, the location of the policy information and how it is labeled can make a tremendous difference.

Revisions

Policies should be reviewed on a regular basis. Often, the initial policy likely goes through university counsel, the president's administrative circles, and, perhaps, a board of regents or the equivalent. Revisions may go through such avenues, or may be more streamlined. A frequent review of policies is mandated by evolving information technology. For example, cell phones with built-in cameras or Internet-browsing capabilities, nonexistent a few years ago, are now becoming mainstream. With such an inconspicuous device, activities such as taking pictures of an exam or finding simple answers online are now possible. Similarly, regularly installed critical updates are a central concept within Windows' latest version of operating-system software.
Such functionality failed to attract much attention until the increase in security exploits and associated media coverage. Some policies, recently updated, now make mention of the need to keep operating systems patched.

Why Have a Library Policy?

While some libraries link to higher-level institutional policies and perhaps have a few rules stated on various scattered library Web pages, other libraries have quite comprehensive policies that serve as an adjunct to (and certainly comply with) higher institutional policies. There are several reasons to have a library policy. First, it adds visibility to whatever higher-level policy may be in place. A central feature of a library policy is that it often provides links (and thus, additional visibility) to other higher-level policies. A computer-use policy can never appear in too many places. (Some libraries have the link in the footer of every Web page.) A computer-use policy can be thought of as a speed limit sign. Presumably, everyone knows that unless otherwise posted, the speed limit inside the city is thirty-five miles per hour, and outside it is fifty-five miles per hour. Nevertheless, numerous speed-limit signs are in place to remind drivers of this.

Higher-level institutional policies often take a broad stroke, in that they pertain to and address computing technology in general, without addressing specific systems in detail. A second reason to have a local library policy is to reflect rules governing local library resources that are housed and managed by the library. Such systems often include virtual reference, electronic reserves, laptop-checkout privileges, and the mass of electronic databases and full-text resources purchased and managed by libraries.
Appendix A summarizes ele- ments found in the various end-user computer policies in force at UNLV and the UNLV university Libraries. Network and Workstation Security Network security is a universal topic addressed in com- puter-use policies. Under this general aegis one often finds prohibitions against various forms of hacking, as well as recommendations for steps individual users should take to help better secure the overall network. There are also such policies as the prohibition of food and drink near computer workstations or on the furniture housing computer workstations. Typical components related to network and workstation security include: 1. Disruption of other computer systems or networks; deliberately altering or reconfiguring system files; use of FTP servers, peer-to-peer file sharing, or operation of other bandwidth-intensive services 2. Creation of a virus; propagation of a virus 3. Attempts at unauthorized access; theft of account IDs or passwords 4. Password information-individual users need to maintain a strong, confidential password 5. Intentionally viewing, copying, modifying, or deleting other users' files 6. A requirement to secure restrictions to files stored on university servers 7. Recommendation or requirement to back up files 8. Statement of ownership regarding equipment and software-the university, not the student, owns the equipment, network, and software 9. Intentional physical damage: tampering, marking, or reconfiguring equipment or infrastructure- such as unplugging network cables 10. Food and drink policies Personal Hardware and Software Many universities allow students to attach their own lap- tops to the campus wired or wireless network(s). In addi- tion to network connections, a growing number of consumer devices such as floppy disks, zip disks, and rewritable CD /DVD-media have the potential to connect to university computers for the purpose of data transfer. Today, the list has grown to include portable flash drives, digital cameras and camcorders, and MP3 players, among others. The attaching of personal equipment to university hardware may or may not be allowed. Similarly, users may often try to install software on university-owned equipment. Typical examples may include a game brought from home or any of the myriad pieces of soft- ware easily downloaded from the Internet. Some of the policy elements dealing with the use of personal hard- ware and software include: 1. Connecting personal laptops to the university wired or wireless network(s) 2. Use of current and up-to-date patched operating systems and antivirus programs running on per- sonal equipment attached to the network 3. Connecting, inserting, or interfacing such personal hardware as floppy disks, CDs, flash drives, and digital cameras with university-owned hardware; liability regarding physical damage or data loss 4. Limit access to and mandate immediate reporting of stolen personal equipment (to deactivate regis- tered MAC addresses, for example) 5. Downloading or installing personal or otherwise additional software onto university equipment 6. Use of personal technology (cell phones, PDAs) in classroom or test-taking environments POLICIES GOVERNING USE OF COMPUTER TECHNOLOGY IN ACADEMIC LIBRARIES I VAUGHAN 155 Reproduced with permission of the copyright owner. Further reproduction prohibited without permission. E-mail E-mail privileges figure prominently in computer-use policies. 
Some topics deal with security and network per- formance (sending a virus), while many deal with inap- propriate use (making threats or sending obscene e-mails). Other topics deal with both (such as sending spam, which is unsolicited, annoying, and consumes a lot of bandwidth). Among the activities covered are prohibi- tions or statements regarding: l. Hiding identity, forging an e-mail address 2. Initiating spam 3. Subscribing others to mailing lists 4. Disseminating obscene material or Weblinks to such material 5. General guidelines on e-mail privileges, such as the size of an e-mail account, how long an account can be used after graduation, and e-mail retention 6. Basic education regarding e-mail etiquette Printing With the explosion of full-text resources, libraries and other student-computing facilities have experienced a tremendous growth in the volume of pages printed on library printers. At UNLV Libraries, for example, the print- ing volume for July 2002 to June 2003 was just shy of two million pages; the following year that had jumped to almost 2.4 million pages. Various policies helping to gov- ern printing may exist, such as honor-system guidelines ("don't print more than ten pages per day"). Some institu- tions or libraries have implemented cost-recovery systems, where students pay fixed amounts per black-and-white and color pages printed through networked printers. Standard policies regarding printer-use cover: 1. Mass printing of flyers or newsletters 2. Tampering with or trying to load anything into paper trays (such as trying to load transparencies in a laser printer) 3. Per-sheet print costs (color and black-and-white; by paper size) 4. Refund policies 5. Additional commonsense guidelines, such as "use print preview in browser" Personal Web Sites Many universities allow students to create personal Web sites, hosted and served from university-owned equipment. Customary policy items focusing on this privilege include: 1. General account guidelines-space limitations, backups, secure FTP requirements 2. Use of school logo on personal Web pages 3. Statement of content responsibility or institutional disclaimer information 4. Requirement to provide personal contact information 5. Posting or hosting of obscene, questionable, or inappropriate content Intellectual Property, Copyright, or Trademark Abuse of copyright, clearly a violation of federal law, is something that libraries and universities were concerned about long before computers hit the mainstream. Widespread computing has introduced new avenues to potentially break copyright laws, such as peer-to-peer file sharing and DVD-movie duplication, to mention only two. A computer-use policy covering copyright will gen- erally include: l. General discussion of copyright and trademark law; links to comprehensive information on these topics 2. Concept of educational "fair use" 3. Copying or modification of licensed software, use of software as intended, use of unlicensed software 4. Specific rules pertaining to electronic theses and dissertations 5. Specific mention of the illegality of downloading copyrighted music and video files Appropriate- and Priority-Use Guidelines Appropriate use is often covered in association with top- ics such as network security or intellectual property. However, appropriate- and priority-use rules can be an entire policy and would include: l. Mention of federal, state, and local laws 2. Use of resources for theft or plagiarism 3. 
3. Abuse, harassment, or making threats to others (via e-mail, instant messaging, or Web page)
4. Viewing material that may offend or harass others
5. Legitimate versus prohibited use; use for nonacademic purposes such as commercial, advertising, or political purposes, or games
6. Academic freedom, Internet filtering

Privacy, Data Security, and Monitoring

Privacy and data security are tremendous issues within the computing environment. Networking protocols and components of many software programs and operating systems by default keep track of many activities (browser history files and cache, Dynamic Host Configuration Protocol logs, and network account login logs, to mention a few). Additional specialized tools can track specific sessions and provide additional information. Just as credit-card companies, banks, and hospitals provide a privacy policy to their clients, so do many academic computer-use policies.
Depending on the broad nature of these services, policy information particular to such systems can be specified at the broad policy level, espe- cially if they have unique avenues of potential exploita- tion or abuse not covered in the general topics included elsewhere in the policy. I Additional Library-Specific Computer-Use Policy Elements Many libraries elect to have their own, additional computer-use policies that serve as an adjunct to the larger university-level policy that generally governs the use of all computing resources on campus. Libraries that have a formalized library computer-use policy often start with a statement of other policies governing the use of the library equipment and network-references to the uni- versity policies in place. The library policy may choose to include or paraphrase parts of the university policy deemed especially important or otherwise applicable to the specific library environment. Important concepts gov- erning university policies apply equally to library poli- cies-purpose and comprehensiveness, visibility, and frequent review. Libraries that have formalized com- puter-use policies often link them under library common Web-site sections such as "information about the libraries," or "about the libraries." Library policies can help address items unique, special, or otherwise worthy of elaboration, such as specific systems in place or situa- tions that may arise. They can also help provide guide- lines and strategies to aid staff in policy enforcement. As an example of a library computer-use policy, appendix B provides the main UNLV Libraries computer-use policy. Public versus Student Use-Allowances and Priority Use Many of the other entities on a university campus do not daily deal with the community at large (the non-univer- sity affiliates) as do academic libraries. This applies to most if not all public institutions, as well as many private institutions. The degree to which academic libraries embrace community users varies widely; often, a state- ment on which user groups are the primary clients is stated in a policy. Such policy statements may discuss who may use what computers, what software compo- nents they have access to, and when access is allowed. In some cases, levels of access for students and the commu- nity are basically the same. Community users may be allowed to use all software installed on the PC. More often, separate PCs with smaller software sets have been configured for community users or for specific access to POLICIES GOVERNING USE OF COMPUTER TECHNOLOGY IN ACADEMIC LIBRARIES I VAUGHAN 157 Reproduced with permission of the copyright owner. Further reproduction prohibited without permission. government documents. In some cases, libraries allow some or all PCs to be used by anyone, student or nonstu- dent, but have technically configured the PC or network to prevent the community at large from using the full software set (such as common productivity suites). However, community users may be limited from using the productivity software (such as Microsoft Word) found on these PCs. The may be restricted from using PCs on upper floors, or those reserved for special pur- poses, such as high-end graphics-development worksta- tions. In addition, during crunch time-midterms and final exams-community users are often restricted to the few PCs set up and configured to allow access only to the library Web page (not the Web at large) and the online catalog. 
In addition, only students and staff can plug in their personal laptops to the library and campus network. Regardless of whether it is crunch time, nonstudent users can be asked to leave if all PCs are in use and students are waiting. An in-house-authored program identifies accounts and whether particular users are students or nonstudents.

More and more government information is available online. For libraries serving as government document repositories, all users have the right to freely access information distributed by the government. In 2005, the UNLV Libraries will begin limiting full Web access to community users; they will only be permitted access to a limited set of Web-based resources, such as government document Web sites and library-licensed databases. On another note, many libraries have special adaptive workstations with additional software and hardware to facilitate access to library resources by disabled citizens. Disabled individuals, enrolled at the university or not, are allowed to use these adaptive workstations.

Laptop Checkout Privileges

Many libraries today check out laptops for student use. At UNLV Libraries, faculty, staff, and students may check out LCD projectors and library-owned laptops and plug them into the network at any of the hundreds of available locations within the main library. More details on these privileges can be found in the article "Bringing Them In and Checking Them Out: Laptop Use in the Modern Academic Library."3 As the university does not otherwise check out laptops to users or allow students to plug in their own laptops to the wired university network, the Libraries had to come up with these additional specific policies.

Licensed Electronic Resources: Terms and Conditions

Academic libraries are generally the gatekeepers to many citation and full-text databases and electronic journals. Each of the myriad subscription vendors has terms of use, violations of which can carry harsh penalties. For example, the UNLV Libraries had an incident where a vendor temporarily cut off access to its resource due to potential abuse detected from a single student. In this case, the user was downloading multiple PDF full-text files in an automated manner. This illustrates the need to have some statement in a library policy outlining the existence of such additional terms of use. Vendors generally place a link related to these terms at the top page of each of their resources. For greater visibility, libraries should at least point out the existence of such terms of use for better exposure and potential compliance. In addition, some electronic resources have licensing agreements that simply do not permit community-user access. In these cases, library policy can simply state that some licensed resources may be accessed only by university affiliates.

Electronic Reserves

Many libraries have set up electronic reserves systems to help distribute electronic full-text documents and streaming media content, among other things. Additional policies may govern the use of such systems, such as making the system available only to currently enrolled students, and providing some boundaries in terms of what is acceptable for mounting on such a system. In addition, there is the whole area of copyright.
E-reserve systems often have built-in methods to help better enforce copyright compliance in the electronic arena. Additional policy statements can help educate faculty members on particulars related to copyright and e-reserves.

Offsite Access to Licensed Electronic Resources

Many libraries provide offsite access to their licensed resources to legitimate users via proxy servers or other methods. The policy regarding such access may address things such as who is permitted to access resources from offsite (such as students, staff, and faculty), and the requirement that the user be in good standing (such as no outstanding library-book fines). In some instances, universities have implemented broad authentication systems that, once logged on from an offsite location, allow the user into a range of university resources, including, potentially, library-licensed electronic resources. If such is the case, information pertaining to offsite access may be found in a higher-level policy.
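The sketch below illustrates the authorization step such a proxy arrangement implies. The patron records, barcodes, and good-standing rule are invented for illustration and do not describe UNLV's actual system:

# Invented patron records keyed by library barcode; a real proxy server
# would consult the library's patron database or campus directory.
PATRONS = {
    "s1234567": {"affiliation": "student",   "fines": 0.00},
    "c7654321": {"affiliation": "community", "fines": 0.00},
    "s9999999": {"affiliation": "student",   "fines": 42.50},
}

def may_access_offsite(barcode):
    # Permit offsite access only to university affiliates in good standing,
    # mirroring the two policy conditions described above.
    patron = PATRONS.get(barcode)
    if patron is None:
        return False                 # unknown barcode
    if patron["affiliation"] == "community":
        return False                 # licenses typically exclude offsite community use
    return patron["fines"] == 0.00   # the "good standing" requirement

for barcode in sorted(PATRONS):
    print(barcode, may_access_offsite(barcode))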
Electronic Reference Transactions

Many libraries have installed (or plan to install) virtual-reference systems, or, at a minimum, have a simple e-mail reference service ("Ask a Librarian"). In addition, many collect library feedback or survey information through simple forms. In all cases, a record exists of the transaction. With virtual-reference systems, the record can include chat logs, e-mail reference inquiries, and URLs of Web pages accessed during the transaction. A policy governing the use of electronic-reference systems may address such things as which clientele may use the system; a statement on the confidentiality of the transaction; or a statement on whether the library maintains the electronic-transaction details. Items such as hours of operation and response time to an e-mail question could be considered more procedural or informational than a policy issue.

Statements on Information Literacy

While perhaps not a policy per se, many libraries have a computer-use policy statement to the effect that while the library may provide links to certain information, this does not serve as an endorsement or guarantee that the information is accurate, up-to-date, or has been verified. (Such a statement posted on the library Web site may provide additional exposure to the maxim that all that glitters is not gold.) Statements that libraries do not regulate, organize, or otherwise verify the general mass of information on the Internet may be included. Obviously, many libraries have separate instruction sessions, awareness programs, and overall mission goals geared toward information literacy.

Principles on Intellectual Freedom and Internet Filtering

Statements by the American Library Association (ALA) on intellectual freedom and Internet filtering may well appear in an institutional policy and often are included in library policies. Filtering is something more likely to affect public and school libraries as opposed to academic libraries. Still, underage children can and do use academic libraries. In such an environment, they may be intentionally or unintentionally exposed to questionable or obscene material. Thus, a library computer-use policy can express the general concept behind the following:

1. intellectual freedom (freedom of speech; free, equal, unrestricted access);
2. the fact that academic libraries provide a variety of information expressing a variety of viewpoints;
3. the fact that this information is not filtered; and
4. the responsibility of parents to be aware of what their children may be viewing on library PCs.

Some libraries have provided policy links to various sets of information from the Office for Intellectual Freedom at ALA's Web site, such as:

1. ALA Code of Ethics
2. ALA Library Bill of Rights
3. Intellectual Freedom Principles for Academic Libraries: An Interpretation of the Library Bill of Rights
4. Access to Electronic Information, Services, and Networks: An Interpretation of the Library Bill of Rights

Some libraries also provide references to ALA information pertaining to the USA PATRIOT Act and how law-enforcement inquiries are handled.

Summary

Computing is a vitally important tool in the academic environment. University and library computing resources receive constant and growing use for research, communication, and synthesizing information. Just as computer use has grown, so have the dangers in the networked computing environment. Universities often have an overarching policy or policies governing the general use of computing technology that help to safeguard the university equipment, software, and network against inappropriate use. Libraries often benefit from having an adjunct policy that works to emphasize the existence and important points of higher-level policies, while also providing a local context for systems and policies pertinent to the library in particular. Having computer-use policies at the university and library level helps provide a comprehensive, encompassing guide for the effective and appropriate use of this vital resource.

References

1. The American Heritage College Dictionary, 3rd ed. (Boston: Houghton Mifflin, 1997), 1058.
2. Board of Visitors of the University of Virginia, "Responsible Computing at U.Va.: A Handbook for Students." Accessed June 2, 2004, www.itc.virginia.edu/pubs/docs/RespComp/rchandbook03.html.
3. Jason Vaughan and Brett Burnes, "Bringing Them In and Checking Them Out: Laptop Use in the Modern Academic Library," Information Technology and Libraries 21 (2002): 52-62.
Appendix A. Systemwide, Institutional, and Library Computing Policies at UNLV

[Appendix A is a large matrix in the original. Its columns are six policies: the SCS NevadaNet Policy,* the UCCSN Computing Resources Policy,** the UNLV Student Computer-Use Policy,*** the UNLV Policy for Posting Information on the Web,† the UNLV Libraries Guidelines for Library Computer Use,†† and additional UNLV Libraries policies.††† Its rows are the policy elements discussed in the body of this article: general characteristics (a direct link or reference to higher-level policies, author/authority information, an approval or revision date); network and workstation security; personal hardware and software; printing; e-mail; personal Web sites; intellectual property, copyright, and trademark; appropriate- and priority-use guidelines; privacy; abuse violations, investigations, and penalties; other computer- or network-based services; and library-specific elements (public versus student use, the right to access government information, assistance for persons with disabilities, laptop and LCD-projector checkout privileges, terms and conditions of licensed electronic resources, offsite access to licensed electronic resources, electronic reference transactions, statements on information literacy, ALA principles on academic freedom and Internet filtering, and electronic reserves). Each cell marks whether a given policy addresses a given element; the individual cell marks are not legibly reproduced in this copy.]
Notes

* The Systems Computing Services NevadaNet Policy. Among other responsibilities, SCS provides and maintains the general Internet connectivity for Nevada's higher-education institutions, including UNLV. The complete document can be accessed at www.scs.nevada.edu/nevadanet/nvpolicies.html.
** The University and Community College System of Nevada Computing Resources Policy. UCCSN is the system of higher-education institutions in the state of Nevada, governed by an elected board of regents. The complete document can be accessed at www.scs.nevada.edu/about/policy061899.html.
*** The complete document can be accessed at www.unlv.edu/infotech/itcc/SCUP.html.
† The complete document can be accessed at www.unlv.edu/infotech/itcc/WWW_Policy.html.
†† The primary UNLV Libraries policy governing student computer use. Provided in appendix B, the complete document can also be accessed at www.library.unlv.edu/services/policies/computeruse.html.
††† Various other policies are in effect at the UNLV Libraries. Some of these can be accessed at www.library.unlv.edu/services/policies/computeruse.html.

Appendix B. UNLV University Libraries Guidelines for Library Computer Use

In pursuit of its goal to provide effective access to information resources in support of the university's programs of teaching, research, and scholarly and creative production, the university libraries have adopted guidelines governing electronic access and use of licensed software. All those who use the libraries' public computers must do so in a legal and ethical manner that demonstrates respect for the rights of other users and recognizes the importance of civility and responsibility when using resources in a shared academic environment.

Authorized Users

To gain authenticated access to the libraries' computer network, all users of the university libraries public computers must be officially registered as a library borrower, a library computer user, or a guest user. A photo ID is required. (Exceptions may be made as needed when access to Federal Depository electronic resources is required.) Priority use is granted to UNLV students, faculty, and staff. As need arises, access restrictions may be imposed on nonuniversity users. In accordance with licensing and legal restrictions, nonuniversity users are restricted from using word-processing, spreadsheet, and other productivity and high-end multimedia software. During high-demand times, all users may have time restrictions placed on their computer use. If requested by library staff, all users must be prepared to show photo ID to confirm their user status.

Authorized and Unauthorized Use

Public computers are to be used for academic research purposes only.
Electronic information, services, software, and networks provided directly or indirectly by the university libraries shall be accessible, in accordance with licensing or contractual obligations and in accordance with existing UNLV and University and Community College System of Nevada (UCCSN) computing services policies (UCCSN Computing Resources Policy, www.scs.nevada.edu/about/policy061899.html; UNLV Faculty Computer Use Policy, www.unlv.edu/infotech/itcc/FCUP.html; Student Computer Use Policy, http://ccs.unlv.edu/scr/computeruse.asp). Users are not permitted to:

1. Copy any copyrighted software provided by UNLV. It is a criminal offense to copy any software that is protected by copyright, and UNLV will treat it as such
2. Use licensed software in a manner inconsistent with the licensing arrangement. Information on licenses is available through your instructor
3. Copy, rename, alter, examine, or delete the files or programs of another person or UNLV without permission
4. Use a computer with the intent to intimidate, harass, or display hostility toward others (sending offensive messages or prominently displaying material that others might find offensive such as vulgar language, explicit sexual material, or material from hate groups)
5. Create, disseminate, or run a self-replicating program ("virus"), whether destructive in nature or not
6. Use a computer for business purposes
7. Tamper with switch settings, move, reconfigure, or do anything that could damage terminals, computers, printers, or other equipment
8. Collect, read, or destroy output other than your own work without the permission of the owner
9. Use the computer account of another person with or without their permission unless it is designated for group work
10. Use software not provided by UNLV
11. Access or attempt to access a host computer, either at UNLV or through a network, without the owner's permission, or through use of log-in information belonging to another person

Internet and Web Use

The university libraries cannot control the information available over the Internet and are not responsible for its content. The Internet contains a wide variety of material, expressing many points of view. Not all sources provide information that is accurate, complete, or current, and some may be offensive or disturbing to some viewers. Users should properly evaluate Internet resources according to their academic and research needs. Links to other Internet sites should not be construed as an endorsement by the libraries of the content or views contained therein. The university libraries respect the First Amendment and support the concept of intellectual freedom. The libraries also endorse ALA's Library Bill of Rights, which supports access to information and opposes censorship, labeling, and restricting access to information. In accordance with this policy, the university libraries do not use filters to restrict access to information on the Internet or Web. As with other library resources, restriction of a minor's access to the Internet or Web is the responsibility of the parent or legal guardian.

Printing

Users are charged for printing no matter who supplies the paper. Mass production of club flyers, newsletters, or posters is strictly prohibited.
If multiple copies are desired, users need to go to an appropriate copying facility such as Campus Reprographics. Contact a staff member when using the color laser printer to avoid costly mistakes. The university libraries reserve the right to restrict user printing based on quantity and content (such as materials related to running an outside business).

Copyright Alert

Many of the resources found on the Internet or Web are copyright protected. Although the Internet is a different medium from printed text, ownership and intellectual property rights still exist. Check the documents for appropriate statements indicating ownership. Most of the electronic software and journal articles available on library servers and computers are also copyrighted. Users shall not violate the legal protection provided by copyrights and licenses held by the university libraries or others. Users shall not make copies of any licensed or copyrighted computer program found on a library computer.

Use of Personal Laptops and Other Equipment

Students, faculty, and staff of the university are welcome to bring laptops with network cards and use them with our data drops to gain access to our network. The laptop must be registered in our laptop authentication system, and a valid library barcode is also required. Users are responsible for notifying the library promptly if their registered laptop is lost or stolen, since they may be held responsible if their laptop is used to access and damage the network. Users taking advantage of this service are required to abide by all UCCSN and UNLV computer policies.

The libraries allow the use of the universal serial bus (USB) connections located in the front of the workstations. This includes use with portable USB-based devices such as flash-based memory readers (memory sticks, secure digital) and digital camera connections. The patron assumes all responsibility in attaching personal hardware to library workstations. The libraries are not responsible for any damage done to patron-owned items (hardware, software, or personal data) as a result of connecting such devices to library workstations. As with any use of library workstations, patrons must adhere to all UCCSN, UNLV, and university libraries' computing and network-use policies. Patrons are responsible for the security of their personal hardware, software, and data.

Inappropriate Behavior

Behavior that adversely affects the work of others and interferes with the ability of library staff to provide good service is considered inappropriate. It is expected that users of the libraries' public computers will be sensitive to the perspective of others and responsive to library staff's reasonable requests for changes in behavior and compliance with library and university policies. The university libraries and their staff reserve the right to remove any user(s) from a computer if they are in violation of any part of this policy and may deny further access to library computers and other library resources for repeat offenders. The libraries will pursue infractions or misconduct through the campus disciplinary channels and law enforcement as appropriate.
Revised: March 3, 2004
Updated: Thursday, May 13, 2004
Content Provider: Wendy Starkweather, Director of Public Services

9658 ----

The Impact of Web Search Engines on Subject Searching in OPAC

Holly Yu and Margo Young

Holly Yu (hyu3@calstatela.edu) is Library Web Administrator and Reference Librarian at the University Library, California State University, Los Angeles. Margo Young (Margo.e.young@jpl.nasa.gov) is Manager of the Library, Archives and Records Section at the Jet Propulsion Laboratory, California Institute of Technology, Pasadena.

This paper analyzes the results of transaction logs at California State University, Los Angeles (CSULA) and studies the effects of implementing a Web-based OPAC along with interface changes. The authors find that user success in subject searching remains problematic. A major increase in the frequency of searches that would have been more successful in resources other than the library catalog is noted over the time period 2000-2002. The authors attribute this increase to the prevalence of Web search engines and suggest that metasearching, relevance-ranked results, and relevance feedback ("more like this") are now expected in user searching and should be integrated into online catalogs as search options.

In spite of many studies and articles on Online Public Access Catalogs (OPACs) over the last twenty-five years, many of the original ideas about improving user success in searching the library catalog have yet to be implemented. Ironically, many of these techniques are now found in Web search engines. The popularity of the Web appears to have influenced users' mental models and thus their expectations and behavior when using a Web-based OPAC interface. This study examines current search behavior using transaction-log analysis (TLA) of subject searches when zero hits are retrieved. It considers some of the features of Web search engines and online bookstores and suggests future enhancements for OPACs.

Literature Review

Many studies have been published since the 1980s centering on the OPAC. Seymour and Large and Beheshti provide in-depth overviews of OPAC research from the mid-1980s through the mid-1990s.1 Much of this research has addressed system design and user behavior, including:

• user demographics,
• search behavior,
• knowledge of system,
• knowledge of subject matter,
• library settings,
• search strategies, and
• OPAC systems.2

OPAC research has employed a number of data-collection methodologies: experiment, interviews, questionnaires, observation, think aloud, and transaction logs.3 Transaction logs have been used extensively to study the use of OPACs, and library literature reflects this. While the exact details of TLA vary greatly, Peters et al. define it simply as "the study of electronically recorded interactions between online information retrieval systems and the persons who search for the information found in those systems."4 This section reviews the TLA literature relevant to the study.

Number of Hits

TLA cannot portray user intention or actual satisfaction since relevance, success, or failure are subjectively determined and require the user to decide.
Peters recommends combining TLA with another technique such as observation, questionnaire or survey, interview, or focus group.5 In spite of the limitations of TLA, many studies (including this one) rely on it alone. Typically, these studies define failure as zero hits in response to a search. Generalizing from several studies, approximately 30 percent of all searches result in zero hits.6 The failure rate is even higher for subject searches: Peters reported that about 40 percent of subject searches failed by retrieving zero hits.7 Some researchers also define an upper number of results for a successful search. Buckland found that the average retrieval set was 98 items.8 Blecic reported that Cochrane and Markey found that OPAC users retrieve too much (15 percent of the time).9 Wiberly, Daugherty, and Danowski (as reported in Peters) found that the median number of postings considered to be too many was fifteen, although when fifteen to thirty postings were retrieved, more users displayed them all than abandoned the search.10

Subject Searching

Some studies have specifically looked at subject searching. Hildreth differentiated among various types of searches and defined one hundred items as the upper limit for keyword searches and ninety as the upper limit for subject searches.11 Larson defined reasonable subject retrieval as between one and twenty items and found that only 12 percent of subject searches retrieved the appropriate number.12

Larson is not the only researcher to have reported poor results in subject searching. For more than twenty years, research has demonstrated that subject or topical searches are both popular and problematic. Tolle and Hah found that subject searching is most frequently used and the least successful.13 Moore reported that 30 percent of searches were for subject, and Matthews et al. found that 59 percent of all searches were for subject information.14 Hunter found that 52 percent of all searches were subject searches and that 63 percent of these had zero hits.15 Van Pulis and Ludy referred to Alzofon and Van Pulis's earlier work in 1984, where they reported that 42 percent of all searches were subject searches.16 Hildreth found that 62.1 percent of subject searches and 35.4 percent of keyword searches failed.17

Larson categorized the major problems with online catalogs as follows:

• users' lack of knowledge of Library of Congress Subject Headings (LCSH),
• users' problems with mechanical and conceptual aspects of query formulation,
• searches that retrieve nothing,
• searches that retrieve too much, and
• searches that retrieve records that do not match what the user had in mind.18

During an eleven-year longitudinal study, Larson found that subject searching was being replaced by keyword searching.19

No consistent pattern in the number of search terms has emerged in the literature. Van Pulis and Ludy reported that user searches were typically single words.20 Markey contended that users' search terms frequently matched standardized vocabulary in large catalogs.21 None of Markey's researchers consulted LCSH, and only 11 percent of Van Pulis and Ludy's did so, notably in spite of their library's user-education programs. Peters reported that Lester found that the average search was less than two words and fewer than thirteen characters.22 Hildreth found that more than two-thirds of keyword searches included two or more words and 42 percent of these multiple-word searches resulted in zero hits.23 The proportion of zero-hit keyword searches rose with the increasing number of words in the search.
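The zero-hit and query-length tallies reported in these studies are straightforward to compute once a transaction log is in hand. The following is a minimal sketch of that kind of tally, assuming a hypothetical tab-separated log in which each line records a search type, the query string, and the hit count; the field layout and file name are illustrative, not those of any particular OPAC.

```python
# Tally zero-hit rates and average query length by search type from a
# transaction log. Assumed (hypothetical) format, one search per line:
#   search_type <TAB> query <TAB> hit_count
from collections import defaultdict

def summarize_log(path):
    searches = defaultdict(int)   # total searches per type
    zero_hits = defaultdict(int)  # zero-hit searches per type
    words = defaultdict(int)      # total word count per type

    with open(path, encoding="utf-8") as log:
        for line in log:
            fields = line.rstrip("\n").split("\t")
            if len(fields) != 3:
                continue  # skip malformed lines
            search_type, query, hits = fields
            searches[search_type] += 1
            words[search_type] += len(query.split())
            if int(hits) == 0:
                zero_hits[search_type] += 1

    for search_type in sorted(searches):
        n = searches[search_type]
        print(f"{search_type}: {n} searches, "
              f"{100.0 * zero_hits[search_type] / n:.1f}% zero hits, "
              f"{words[search_type] / n:.1f} words per search")

summarize_log("opac_transactions.log")
```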
Subject headings have been a matter of considerable study. Gerhan examined catalog records and surmised their accessibility in an online catalog. He contended that when a keyword from the title only is accessed, only 50 percent of all relevant books would be found, and that title keywords would lead a user to subject-relevant records in 55 percent of cases, while LCSH would lead a user successfully in 85 percent of the cases.24 In contrast, Cherry found that 42 percent of zero-hit subject searches would have been more fruitful as keyword or title searches than by following cross references retrieved from the subject field.25 She recommended converting zero-hit subject queries to other types of subject searches (keyword). Thorne and Whitlatch recommended that subject searchers should select keyword rather than subject headings as their first access strategy.26

Types of Problems in Subject Searches

Numerous studies have categorized reasons for search failure (typically in zero-hit situations), but Peters reports that a standard categorization has not yet been established.27 In cases where more than one error is made in a search (and Hunter reported this to be frequent), there is no consistency in how that is assigned. Nonetheless, some major categories of problems stand out:

• Misspelling and typographical errors. Peters found that these errors accounted for 20.8 percent of all unsuccessful keyword searches, while Henty (reported by Peters) concluded that 33 percent of such searches could be attributed to this.28 Hunter found that 9.3 percent of subject searches had typographical and spelling errors.29
• Keyword search. Hunter found 52.6 percent of zero-hit searches used uncontrolled vocabulary terms.30
• Wrong source or field. Hunter concluded that 4.5 percent of searches should have been done in a source other than the catalog, while 1.3 percent of searches were of the wrong type (an author search in the subject-search option).31
• Items not in the database. Peters found that searches for items not held in the database accounted for 39.1 percent of unsuccessful searches, while Hunter found that problem in only 2.5 percent of the problem cases.32

In addition to these problems, Hunter also found that index display and rules relating to the systems accounted for 27 percent of errors.33

Resulting Recommendations for Change

While Hildreth stated, "There has been little research on most components of the OPAC interface" in 1997, he proposed two options to improve user success: increased user training or improved design based on information-seeking behavior.34 Wallace pointed out that there is a very short window of opportunity when searchers are amenable to instruction and that successful screen designs should therefore focus on presenting first the quick-searching options employed by the majority of users.35 Large and Beheshti observed "that too many options simply caused confusion, at least for less experienced OPAC users," and they summarized that
OPAC-interface research focuses on menu sequence, browsing, and querying.36

Menu Sequence

In terms of menu sequence, Hancock-Beaulieu indicated that "the menu sequence in which search options are offered will influence user selection."37 Ballard found that the amount of keyword searching was affected by its position on the menu.38 Scott reported that both keyword- and subject-search success improved when the keyword was placed at the top of the menus.39 Thorne and Whitlatch used a combination of methods in their study and concluded that several interface changes should be implemented:

• strongly encourage novice users to start with keyword (list keyword above subject heading),
• relabel "keyword" to "subject or title words," and
• relabel "subject heading" to "Library of Congress Subject Heading."40

Blecic et al. studied transaction logs over six months to track the impact of "simplifying and clarifying" OPAC introductory screens. After moving the keyword option to the top, keyword searching increased from 13.30 percent to 15.83 percent of all search statements. Blecic et al. found her original tally of 35.05 percent of correct searches retrieving zero hits decreased to 31.35 percent after screen changes.41

Querying

OPAC-interface design has been based on an assumption that users come to the catalog knowing what they need to know. In either text-based OPACs or Web-based OPACs, query-based searches are still mainstream. Searchers are required to have knowledge of title, author, or subject. Ortiz-Repiso and Moscoso observed that Web-based catalogs, like all library catalogs, basically fulfill two functions: locating works based on known details and identifying which documents in the database cover a given subject.42 Natural-language input has long been considered a desirable way to overcome this shortcoming.

Browsing

Relevance-ranked output and hypertext were considered by Hildreth to be promising in 1997.43 OPACs have not been conceived within a true hypertext environment; rather, they maintain the structure of their original formats, principally machine-readable cataloging (MARC), and therefore impede the generation of a structure of nodes and links.44 In addition to continuing to employ the MARC format as its underlying structure, the concepts of main entry and added entry, field label, and display logic all reflect cataloging rules. Amazon.com and Barnes and Noble have completely moved away from this century-old structure to provide easy access to book information. In the Web environment, the concept of main entry loses its meaning to the multiple access points and linking capabilities of author, subject, and call number.

Another prominent drawback of Web-based OPACs is that they have not taken advantage of thesaurus structure and utilized the thesaurus for searching feedback. The hierarchical relationship in LCSH is underutilized in terms of the relationship between terms and associations through related terms. Web-based OPACs have failed to make use of this important access.

The persistence of these drawbacks in OPAC-interface design is rooted deeply in cataloging rules that were derived from the manual environment more than a century ago. It reflects the gap between "concepts typically held by nonprofessional users and those used in library practices."45 In her article "Why Are Online Catalogs Still Hard to Use?"
Borgman concludes:

Despite numerous improvements to the user interface of online catalogs in recent years, searchers still find them hard to use. Most of the improvements are in surface features rather than in the core functionality. We see little evidence that our research on searching behavior studies has influenced online catalog design.46

Catalog Content

Users misunderstand the scope of the catalog. In questionnaire responses, 80 percent of Van Pulis and Ludy's participants indicated they had considered looking elsewhere than the library catalog, as in periodical indexes.47 Blazek and Bilal reported a request for inclusion of journal-article titles in one response to their questionnaire.48 Libraries responded to these requests by acquiring databases on CD-ROM, loading them locally (sometimes using the catalog system to mount a separate database), and, most recently, providing access to databases over the Internet. However, seldom have libraries responded to these requests by integrating search access through a single front end as the default search.

Impact of Web Search Engines

Blecic et al. found that keyword searching increased from 13.3 percent to 28.3 percent over her four-year series of logs. At the same time, zero hits in keyword increased from 8.71 percent to 20.78 percent while subject zero hits dropped from 23 percent to 13.69 percent. She surmised that the influence of Web interfaces might have affected the regression-fluctuation in search syntax, initial articles, and author order.49

Teoma automatically scouts the Web for pages that are related to its results, so it can find a large number of resources very quickly without requiring the user to select the right keywords. Teoma structures the appropriate communities of interest on the fly and ranks the results on a range of factors, including authorities and hubs (good resources pointing to related resources). Google offers an option of "similar pages." While the subject-redirect function in a Web-based OPAC emulates this, it succeeds only if the user's initial search term yielded the right result. OPAC users have the option of clicking on hyperlinked headings (author, title, subject headings) but cannot ask the system to perform a more sophisticated search on their behalf.

User-Popularity Tracking

The Amazon and Barnes and Noble Web sites present enhanced information about items by user-popularity tracking. Circulation statistics or user comments could serve as a form of "recommender system" to help novices narrow their selections. Messages such as "other students who checked this book out also read these books" could be dynamically inserted in bibliographic records. Users could also be allowed to provide comments on materials in the catalog, thus providing an interactive experience for OPAC users.
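As a concrete illustration of the kind of circulation-based "recommender system" suggested here, the sketch below counts co-checkouts in a hypothetical circulation extract (pairs of patron ID and item ID) and lists, for a given item, the items most often borrowed by the same patrons. The data layout and names are illustrative assumptions, not features of any actual catalog.

```python
# Build "borrowers of this item also borrowed..." suggestions from a
# hypothetical circulation extract: (patron_id, item_id) pairs.
from collections import defaultdict
from itertools import combinations

checkouts = [
    ("p1", "QA76.9"), ("p1", "Z699.35"), ("p1", "ZA4080"),
    ("p2", "QA76.9"), ("p2", "Z699.35"),
    ("p3", "QA76.9"), ("p3", "ZA4080"),
]

# Group the items each patron has checked out.
items_by_patron = defaultdict(set)
for patron, item in checkouts:
    items_by_patron[patron].add(item)

# Count how often each pair of items shares a borrower.
pair_counts = defaultdict(int)
for items in items_by_patron.values():
    for a, b in combinations(sorted(items), 2):
        pair_counts[(a, b)] += 1

def also_borrowed(item, limit=3):
    """Items most frequently checked out by borrowers of `item`."""
    related = []
    for (a, b), count in pair_counts.items():
        if item == a:
            related.append((count, b))
        elif item == b:
            related.append((count, a))
    return [other for _, other in sorted(related, reverse=True)[:limit]]

print(also_borrowed("QA76.9"))  # items sharing the most borrowers with QA76.9
```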
Summary of Web Features

There are positive and negative impacts of Web search engines and online bookstores on Web-based OPAC users. Users who find Web pages to be comfortable, easy, and familiar may make greater use of Web-based OPACs. While they bring with them their knowledge of search engines, they also bring their misperceptions. The possibility of using tools similar to those found on Web search engines can greatly "reinforce the usefulness of the catalog as well as the positive perception that the end user has of it."61 Given the diversity of the errors that users experience, a combination of approaches is necessary to improve their search success. Automatic mapping of free-text to thesaurus terms, translation of common spelling mistakes, and links to related pages are tools already in use in Web search engines. "See similar pages," extensive use of relevance feedback, and popularity tracking along with natural language are less common.

Recommendations for Web-based OPACs

The authors' TLA revealed a continuing problem with subject-heading searches and showed a trend toward searching topics that are not typically answered in a book catalog. The former problem has a well-documented history, while the authors believe the latter problem stems from the influence of the Web and Web search engines. Several changes to typical OPACs are recommended to address the trends observed in the course of this study.

Metasearching

The recent trend of incorporating databases and OPACs into a single search reflects the necessity of expanding information resources and simplifying access to resources. This study's empirical results clearly indicate a need to expand this integration into one search. While some argue that this metasearching will further augment the syntax digression and prevent users from becoming information literate, others believe that metasearching, along with the option of searching each individual database, is an ultimate goal for online search. Like it or not, the metasearch technology, also known as federated or broadcast search, "creates a portal that could allow the library to become the one-stop shop their users and potential users find so attractive."65 One-search-for-all cannot solve all problems; however, guiding users to where they are most likely to find results quickly (the quick search) should satisfy the needs of the majority of users.

Menu Sequence

Effective screen design has a positive effect on user success. The menu sequence for search options plays a significant role in user selection. This research and others have demonstrated that users choose an option higher rather than lower in a list. Too many options "simply cause confusion, at least for less experienced OPAC users."66

Browsing Feature

Browsing is a natural and effective approach to many information-seeking problems and requires less effort and knowledge on the part of the user. The literature suggests that a great deal of the use of the Web relies on known Web sites, recommended sites, or return visits to sites recently visited, thus relying on browsing rather than on searching.
Jenkins, Corritore, and Wiedenbeck found that domain novices seldom clicked very deep (out and back), while Web experts explored more deeply.67 Holscher and Strube note that Hurtienne and Wandtke claim that only minimal training is necessary for browsing an individual Web site, while Pollock and Hockley claim that considerably more experience is required for querying and navigating among sites.68 Hancock-Beaulieu found that between 30 percent and 45 percent of all online searches, regardless of the type of search, are concluded with browsing the library shelves.69

Another possibility is to implement user help through tips or tactics selected and accumulated from a collection of common user-search mistakes. In such a case, the system would play a more active role by generating relevant search tips on the fly and using zero-hit search results as a basis for generating a spell check or suggesting alternate wording.
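To make the spell-check idea concrete, here is a minimal sketch of one way a catalog could react to a zero-hit query: compare the failed term against the vocabulary of indexed headings using a string-similarity measure and offer the closest matches as "did you mean" suggestions. The heading list and the similarity cutoff are illustrative assumptions.

```python
# Suggest alternate wording for a zero-hit search term by finding the
# closest indexed headings above a string-similarity cutoff.
import difflib

# Illustrative stand-in for the catalog's subject-heading vocabulary.
headings = [
    "philately", "philosophy", "photography",
    "physiology", "psychology", "paleontology",
]

def suggest(term, vocabulary, limit=3, cutoff=0.75):
    """Return up to `limit` headings similar to a failed search term."""
    return difflib.get_close_matches(
        term.lower(), vocabulary, n=limit, cutoff=cutoff
    )

# A zero-hit query triggers the suggestion step.
query = "pyschology"  # misspelled
print(f'No results for "{query}". Did you mean:', suggest(query, headings))
```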
An ideal scenario is that the OPAC allows the user to pursue multiple avenues of an inquiry by entering fragments of the question, exploring vocabulary choices, and reformulating the search with the assistance of various specialized intelligent assistants. Borgman suggests that an OPAC should be judged by whether the catalog answers questions rather than merely matches queries. She suggests the need to design systems that are based on behavioral models of how people ask questions, arguing that users still need to translate their question into what a system will accept.73

User Instruction

On-site training and online documentation can help make it easier to use the OPAC. With the advent of information literacy, the shift in library instruction from procedure-based query formulation to the question being answered has taken place. At CSULA, instruction for entry-level classes focuses on formulating a research statement and then identifying keywords and alternate terms. The instruction sessions that follow the initial concept formulation are short and focus on how to enter keyword or subject, author, and title, and the use of Boolean operators. This approach may improve success until the systems provide the tools to improve search strategies or accept an untrained user's input.

As an increasing number of users access online library catalogs remotely, assistance needs to be embedded into intuitive systems. "Time invested in elaborate help systems often is better spent in redesigning the user interface so that help is no longer needed."74 Users are not willing to devote much of their time to learning to use these systems. They just want to get their search results quickly and expect the catalog to be easy to use with little or no time invested in learning the system.

Conclusion

The empirical study reported in this paper indicates that progress has been made in terms of increasing search success by improving the OPAC search interface. The goal is to design Web-based OPAC systems for today's users, who are likely to bring a mental model of Web search engines to the library catalog. Web-based OPACs and Web search engines differ in terms of their systems and interface design. However, in most cases, these differences do not result in different search characteristics by users. Research findings on the impact of Web search engines and user searching expectations and behavior should be adequately utilized to guide the interface design.

Web users typically do not know how a search engine works. Therefore, fundamental features in the design of the next generation of the OPAC interface should include changing the search to allow natural-language searching with keyword search first, and focus on meeting the quick-search need. Such a concept-based search will allow users to enter natural language for their chosen topic in the search box while the system maps the query to the structure and content of the database. Relevance feedback to allow the system to bring back related pages, spelling correction, and relevance-ranked output remain key goals for future OPACs.

References and Notes

1. Sharon Seymour, "Online Public-Access Catalog User Studies: A Review of Research Methodologies, March 1986-November 1989," Library and Information Science Research 13 (1991): 89-102; Andrew Large and Jamshid Beheshti, "OPACs: A Research Review," Library and Information Science Research 19 (1997): 2, 111-33.
2. Ibid., 113-16.
3. Ibid., 116-20.
4. Thomas A. Peters et al., "An Introduction to the Special Section on Transaction-Log Analysis," Library Hi Tech 11 (1993): 2, 37.
5. Thomas A. Peters, "The History and Development of Transaction-Log Analysis," Library Hi Tech 11 (1993): 2, 56.
6. Pauline A. Cochrane and Karen Markey, "Catalog Use Studies since the Introduction of Online Interactive Catalogs: Impact on Design for Subject Access," in Redesign of Catalogs and Indexes for Improved Subject Access: Selected Papers of Pauline A. Cochrane (Phoenix: Oryx, 1985), 159-84; Steven A. Zink, "Monitoring User Success through Transaction-Log Analysis: The WolfPAC Example," Reference Services Review 19 (Spring 1991): 449-56; Michael K. Buckland et al., "OASIS: A Front End for Prototyping Catalog Enhancements," Library Hi Tech 10 (1992): 7-22.
7. Thomas A. Peters, "When Smart People Fail: An Analysis of the Transaction Log of an Online Public-Access Catalog," Journal of Academic Librarianship 15 (1989): 5, 267.
8. Michael K. Buckland et al., "OASIS," 7-22.
9. Deborah D. Blecic et al., "Using Transaction-Log Analysis to Improve OPAC Retrieval Results," College and Research Libraries (Jan. 1998): 48.
10. Peters, "The History and Development of Transaction-Log Analysis," 2, 52.
11. Charles R. Hildreth, "The Use and Understanding of Keyword Searching in a University Online Catalog," Information Technology and Libraries 16 (1997): 6.
12. Ray R. Larson, "The Decline of Subject Searching: Long-Term Trends and Patterns of Index Use in an Online Catalog," Journal of the American Society for Information Science and Technology 42 (1991): 3, 210.
13. John E. Tolle and Sehchang Hah, "Online Search Patterns: NLM CATLINE Database," Journal of the American Society for Information Science 36 (Mar. 1985): 82-93.
14. Carol Weiss Moore, "User Reaction to Online Catalogs: An Exploratory Study," College and Research Libraries 42 (1981): 295-302; Joseph R.
Matthews et al., Using Online Catalogs: A Nationwide Survey-A Report of a Study Sponsored by the Council on Library Resources (New York: Neal-Schuman, 1983), 144.
15. Rhonda N. Hunter, "Success and Failures of Patrons Searching the Online Catalog at a Large Academic Library: A Transaction-Log Analysis," RQ 30 (Spring 1991): 399.
16. Noelle Van Pulis and Lorne E. Ludy, "Subject Searching in an Online Catalog with Authority Control," College and Research Libraries 49 (1988): 526.
17. Hildreth, "The Use and Understanding of Keyword Searching," 6.
18. Ray R. Larson, "The Decline of Subject Searching," 3, 60.
19. Ibid.
20. Van Pulis and Ludy, "Subject Searching in an Online Catalog," 527.
21. Karen Markey, Research Report on the Process of Subject Searching in the Library Catalog: Final Report of the Subject Access Research Project (report no. OCLC/OPR/RR-83-1) (Dublin, Ohio: OCLC Online Computer Library Center, 1983), 529.
22. Peters, "The History and Development of Transaction-Log Analysis," 2, 43.
23. Hildreth, "The Use and Understanding of Keyword Searching," 8-9.
24. David R. Gerhan, "LCSH in vivo: Subject Searching Performance and Strategy in the OPAC Era," Journal of Academic Librarianship 15 (1989): 86-87.
25. Joan M. Cherry, "Improving Subject Access in OPACs: An Exploratory Study of Conversion of Users' Queries," Journal of Academic Librarianship 18 (1992): 2, 98.
26. Rosemary Thorne and Jo Bell Whitlatch, "Patron Online Catalog Success," College and Research Libraries 55 (1994): 496.
27. Peters, "The History and Development of Transaction-Log Analysis," 2, 48.
28. Ibid.
29. Hunter, "Success and Failures," 400.
30. Ibid., 399.
31. Ibid., 400.
32. Peters, "The History and Development of Transaction-Log Analysis," 2, 56.
33. Hunter, "Success and Failures," 400.
34. Hildreth, "The Use and Understanding of Keyword Searching," 6.
35. Patricia M. Wallace, "How Do Patrons Search the Online Catalog When No One's Looking? Transaction-Log Analysis and Implications for Bibliographic Instruction and System Design," RQ 33 (Winter 1993): 3, 249.
36. Large and Beheshti, "OPACs: A Research Review," 125.
37. M. M. Hancock-Beaulieu, "Online Catalogue: A Case for the User," in The Online Catalogue: Developments and Directions, C. Hildreth, ed. (London: Library Association, 1989), 25-46.
38. Terry Ballard, "Comparative Searching Styles of Patrons and Staff," Library Resources and Technical Services 38 (1994): 293-305.
39. Jane Scott et al., "@*&#@ This Computer and the Horse It Rode in On: Patron Frustration and Failure at the OPAC" (in "Continuity and Transformation: The Promise of Confluence": Proceedings of the ACRL 7th National Conference, Chicago: ACRL, 1995), 247-56.
40. Thorne and Whitlatch, "Patron Online Catalog Success," 496.
41. Blecic et al., "Using Transaction-Log Analysis," 46.
42. Virginia Ortiz-Repiso and Purificacion Moscoso, "Web-Based OPACs: Between Tradition and Innovation," Information Technology and Libraries 18, no. 2 (June 1999): 68-69.
43. Hildreth, "The Use and Understanding of Keyword Searching," 6.
44. Ortiz-Repiso and Moscoso, "Web-Based OPACs," 71.
45. Ibid., 75.
46.
Christine Borgman, "Why Are Online Catalogs Still Hard to Use?" Journal of the American Society for Information Science 47 (1996): 7, 501.
47. Van Pulis and Ludy, "Subject Searching in an Online Catalog," 53.
48. Blazek and Bilal, "Problems with OPAC: A Case Study of an Academic Research Library," RQ 28 (Winter 1988): 175.
49. Deborah D. Blecic et al., "A Longitudinal Study of the Effects of OPAC Screen Changes on Searching Behavior and User Success," College and Research Libraries 60, no. 6 (Nov. 1999): 524, 527.
50. Bernard J. Jansen and Udo Pooch, "A Review of Web Searching Studies and a Framework for Future Research," Journal of the American Society for Information Science and Technology 52 (2001): 3, 249-50.
51. Ibid., 250.
52. Blazek and Bilal, "Problems with OPAC: A Case Study," 175; Moore, "User Reaction to Online Catalogs," 295-302.
53. M. J. Bates, "The Design of Browsing and Berry-Picking Techniques for the Online Search Interface," Online Review 13 (1989): 5, 407-24.
54. Jansen and Pooch, "A Review of Web Searching Studies," 238.
55. Judy Luther, "Trumping Google? Metasearching's Promise," Library Journal 128 (2003): 16, 36.
56. Jack Muramatsu and Wanda Pratt, "Transparent Queries: Investigating Users' Mental Models of Search Engines," Research and Development in Information Retrieval (Sept. 2001). Accessed Mar. 10, 2003, http://citeseer.nj.nec.com/muramatsu01transparent.html.
57. Jansen and Pooch, "A Review of Web Searching Studies," 235.
58. Luther, "Trumping Google," 36.
59. Blecic et al., "A Longitudinal Study of the Effects of OPAC Screen Changes," 527.
60. Susan M. Colaric, "Instruction for Web Searching: An Empirical Study," College and Research Libraries News 64 (2003): 2.
61. A. G. Sutcliffe, M. Ennis, and S. J. Watkinson, "Empirical Studies of End-User Information Searching," Journal of the American Society for Information Science and Technology 51 (2000): 13, 1213.
62. "All About Google," Google. Accessed Dec. 10, 2003, www.google.com.
63. G. Salton, Introduction to Modern Information Retrieval (New York: McGraw-Hill, 1983), 18.
64. Ortiz-Repiso and Moscoso, "Web-Based OPACs," 71.
65. Luther, "Trumping Google," 37.
66. Maaike D. Kiestra et al., "End-Users Searching the Online Catalogue: The Influence of Domain and System Knowledge on Search Patterns. Experiment at Tilburg University," The Electronic Library 12 (Dec. 1994): 335-43.
67. C. Jenkins et al., "Patterns of Information Seeking on the Web: A Qualitative Study of Domain Expertise and Web Expertise," IT and Society 1 (Winter 2003): 3, 74, 77. Accessed May 10, 2003, www.ItandSociety.org/.
68. C. Holscher and G. Strube, "Web Search Behavior of Internet Experts and Newbies," 9th International World Wide Web Conference (Amsterdam, 2000). Accessed Mar. 28, 2003, www9.org/w9cdrom/81/81.html; A. Pollock and A. Hockley, "What's Wrong with Internet Searching," D-Lib Magazine (Mar. 1997). Accessed May 10, 2003, www.dlib.org/dlib/march97/bt/03pollock.html.
69. M. M. Hancock-Beaulieu, "Online Catalogue: A Case for the User," 25-46.
70. Wilbert O. Galitz, The Essential Guide to User Interface Design: An Introduction to GUI Design Principles and Techniques (Chichester, England: Wiley, 1996).
71.
Juliana Chan, "An Evaluation of Displays of Bibliographic Records in OPACs in Canadian Academic and Public Libraries," MIS Report, Univ. of Toronto, 1995. [025.3132 C454E]
72. Giorgio Brajnik et al., "Strategic Help in User Interfaces for Information Retrieval," Journal of the American Society for Information Science and Technology (JASIST) 53 (2002): 5, 344.
73. Borgman, "Why Are Online Catalogs Still Hard to Use?" 500.
74. Ibid.

9662 ----

Communications

Using a Native XML Database for Encoded Archival Description Search and Retrieval

Alan Cornish

Alan Cornish (cornish@wsu.edu) is Systems Librarian, Washington State University Libraries, Pullman.

The Northwest Digital Archives (NWDA) is a National Endowment for the Humanities-funded effort by fifteen institutions in the Pacific Northwest to create a finding-aids repository. Approximately 2,300 finding aids that follow the Encoded Archival Description (EAD) standard are being contributed to a union catalog by academic and archival institutions in Idaho, Montana, Oregon, and Washington. This paper provides some information on the EAD standard and on search and retrieval issues for EAD XML documents. It describes native XML technology and the issues that were considered in the selection of a native XML database, Ixiasoft's TextML, to support the NWDA project.

Pitti, one of the founders of the EAD standard, noted the primary motivation behind the creation of EAD: "To provide a tool to help mitigate the fact that the geographic distribution of collections severely limits the ability of researchers, educators, and others to locate and use primary sources."1 Pitti expanded on this need for EAD in a 1999 D-Lib article:

The logical components of archival description and their relations to one another need to be accurately identified in a machine-readable form to support sophisticated indexing, navigation, and display that provide thorough and accurate access to, and description and control of, archival materials.2

In a more recent publication, Pitti and Duff noted a key advantage offered by EAD that relates to the focus of this article, the development of an EAD union catalog:

EAD makes it possible to provide union access to detailed archival descriptions and resources in repositories distributed throughout the world. . . . Libraries and archives will be able to easily share information about complementary records and collections, and to "virtually" integrate collections related by provenance, but dispersed geographically or administratively.3

In a 2001 American Archivist article, Roth examined EAD history and the deployment methods used up to 2001. Importantly, two of the most prominent delivery systems described by Roth, DynaText (a server-side solution) and Panorama (a client-side solution), were by 2003 obsolete products for EAD delivery. This is indicative of the rapid pace of change in EAD deployment, in part due to the migration from SGML to XML technologies. Roth described survey results obtained on EAD deployment that underscore the recognized need at that time for a "cost-effective server-side XML delivery system."
The lack of such a solution motivated institutions to choose HTML as a delivery method for EAD finding aids.4

Articles like Roth's that describe specific EAD search-and-retrieval implementation options are in short supply. One such option, the University of Michigan DLXS XPAT software, is employed for the search and retrieval of EAD and other metadata in the University of Illinois at Urbana-Champaign (UIUC) Cultural Heritage Repository.5 Another option, harvesting EAD records into machine-readable cataloging (MARC) to establish search and retrieval access in an integrated library system, was described by Fleck and Seadle in a 2002 Coalition for Networked Information Task Force briefing. Using an XML Harvester product created by Innovative Interfaces, MARC records are generated based upon MARC encoding analogs included in the EAD markup and loaded into an Innovative Interfaces INNOPAC system.6 This product has been used to create access to EAD finding aids in the catalog for Michigan State University's Vincent Voice Library.

In a 2001 article, Gilliland-Swetland recommended several desirable features for an EAD search-and-retrieval system. She emphasized the challenge of EAD search and retrieval by noting the nature of finding aids themselves:

Archivists have historically been materials-centric rather than user-centric in their descriptive practices, resulting in the finding aid assuming a form quite unlike the concise bibliographic description with name and subject access most users are accustomed to using in other information systems such as library catalogs, abstracts, and indexes.7

Without describing specific software tools, Gilliland-Swetland argued for a user-centric approach to the search and retrieval of finding aids by examining the needs of specific user communities such as genealogists, K-12 teachers, and historians.8

Several initiatives similar to the NWDA effort are described in the professional literature. The Online Archive of California (OAC), which was founded in the mid-1990s, is a consortium of California special-collections repositories. A number of key consortium functions are centralized, including "monitoring to ensure consistency of EAD encoding across all OAC finding aids" according to agreed-upon best practices, a critical need in the creation of a union catalog.9 Brown and Schottlaender also describe the integration of the OAC into the California Digital Library, which enables linkages between EAD finding aids and digitized copies of original materials.10

Finally, one important development area is the possibility of integrating EAD documents into Open Archives Initiative (OAI) services in order to enhance resource discovery. A 2002 paper written by Prom and Habing, both of whom work with the UIUC Cultural Heritage Repository, explored the possibility of mapping EAD to OAI, the latter of which is based upon the fifteen-element Dublin Core Metadata Set (unqualified). While noting, "we do not propose that the full capabilities of EAD finding aids could be subsumed by OAI," Prom and Habing suggested that it is possible to map the top-level and component portions of EAD into OAI, resulting in multiple OAI records from a single EAD finding aid. In this scenario, a single OAI record is created from the collection-level information and multiple records from component-level information in an EAD document.11
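As a rough illustration of the kind of mapping Prom and Habing describe, the sketch below pulls collection-level fields out of a minimal EAD document and emits one unqualified Dublin Core record; component-level entries could be walked the same way to produce additional records. The element paths cover only a toy subset of EAD, and the sketch is not Prom and Habing's actual implementation.

```python
# Map the collection level of a (toy) EAD finding aid to an unqualified
# Dublin Core record, in the spirit of the EAD-to-OAI mapping discussed above.
import xml.etree.ElementTree as ET

EAD_SAMPLE = """
<ead>
  <archdesc>
    <did>
      <unittitle>Sample Family Papers</unittitle>
      <unitdate>1890-1935</unitdate>
      <repository>Example University Libraries</repository>
    </did>
    <scopecontent><p>Correspondence and photographs.</p></scopecontent>
  </archdesc>
</ead>
"""

def collection_to_dc(ead_xml):
    """Return a dict of Dublin Core fields from collection-level EAD."""
    root = ET.fromstring(ead_xml)
    did = root.find("archdesc/did")
    record = {
        "dc:title": did.findtext("unittitle"),
        "dc:date": did.findtext("unitdate"),
        "dc:publisher": did.findtext("repository"),
        "dc:description": root.findtext("archdesc/scopecontent/p"),
    }
    return {k: v for k, v in record.items() if v}  # drop empty fields

print(collection_to_dc(EAD_SAMPLE))
```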
While no ting, "w e do n ot propose that th e full capabiliti es of EAD find- ing aids could be subsum ed by OAI," Prom and Habing sug gested that it is possible to map the top-l eve l and co mpon e nt portions of EA D int o OAI, res ul tin g in multipl e OAI records from a singl e EAD finding aid. In thi s scenario, a sin gle OAI record is created from th e collection- level information and multipl e records from component-level infor- mation in an EAD docum en t.11 Evaluation of EAD Search and Retrieval Products In order to iden tify a software solution for supporting a union catalog of EAD findin g aids, the con so rtium con- ducted a product evaluation. The strengths and weakn esses of the native XML technology em ployed by the consortiu m can be best understood by lookin g at alternative XML prod- uct s and product categor ies . Table 1 shows the products con sidere d during an evaluati on period th at consisted of both product research and actual tri- als. In approaching the eva luation, the consortium and its union -catalog host institut ion , the Washin gton Stat e University Libraries, had seve ral spe- cific need s in mind. First, the licensing an d support costs for the product needed to fit w ithin the consortium's budget. Second , th e sea rch-and- retrieval softw are had to sup port sev- eral basic fun ctions: Keywo rd search- ing across all union-cat alog finding aids; specific field searching based upon elements or attribut es in the EAD docum ent ; an abilit y to cus- tomize the look and feel of the inter- face and search-results screens; and the ability to display search term(s) in the conte xt of the finding aid . As not ed in the tabl e, three of the ev aluated products are n ativ e XML databases. Cyrenne provid es a defi- nition of native XML as a database with the se features: • The XML document is stored intact: "t he XML d ocum ent is preserv ed as a separat e, unique entity in its entirety ." • "Schema independenc e," that is, "a ny we ll-formed XML docu- ment can be stored and queried." • The qu ery language is XML- based: "na ti ve XML d ata base vendors typically u se a quer y langua ge d es igned sp eci fically for XML" as opposed to SQL.12 Of the thr ee native XML products, only the licensi ng costs of Ixiasoft's TextML and the open-sourc e XIndice so ftware fell within the available proj- ec t fundin g. Both pack ages were extensively tested, with Text ML prov- ing superio r at handlin g th e large (sometimes in the MB-size range) and structurall y complex EAD documents crea ted by consortium memb ers. One key strength of TextML that m et an NWDA consortium-need involved field sea rching. In TextML, it is possibl e to m ap a search field to one or m ore XPath s ta tements , enabling th e crea tion of sea rch fields b ase d upon the precise us e of an ele- ment or attribute in EAD d ocuments. The importanc e of this capability is show n with th e EAD ele- ment, which can appear at the collec- tion lev el and at the sub or dinate component level in a docu men t. With TextML, usin g its limited XPath sup- port, it is p ossib le to refer ence a spe- cific, contextual use of . In addition to the native XML sol utions , seve ral oth er product 182 INFORMATION TECHNOLOGY AND LIBRARIES I DECEMBER 2004 types were considered. An XML qu ery engine, Verity Ultra seek, was te sted and produced good results whe n u sed for the search and retrieval of consortium docum en ts. 
13 Ultraseek can be used to search dis- crete XML files , supports th e creation of custom int erfaces for th e searc h- and-r etrie va l sys tem, and ha s strong documentation . Pro bably th e most obvious limit a tion in thi s XML qu ery- engin e product conc erned the crea tion of search fields. To contras t U ltr asee k with a native XML solu- tion : Ultras eek 5 .0 (used du r ing the product trial) lacked XPath support. Inste ad, it requir ed a uniqu e eleme nt- attr ibute combin ation for the crea tion of a databa se sea rch field . Returning to th e exa mple , cont extual u ses of could n ot be indexed with o ut recoding consor- tium docum ent s to create a unique eleme nt-attribut e combination on which to ind ex. An XML-enabled databa se, DLXS XPAT, has b een successfully used in se veral EAD projects, including OAC. One d isadv antage of this product is th at it re quir es a UNIX operating sys tem for th e se rver. A dditionall y, XPAT, as a supporting toolse t for di gital-library collection building, provid es functionalit y that duplicates other media tool s at the ho st institution (specifically, OCLC/ DiM eMa CONTENTdm). The use of a Relational Dat abase Management System (RDBMS) to es tablish sea rch and retri eva l for EAD XML d ocume nts was con sid- ered as well. Th e advantage to thi s approac h is th at it would ena ble the u se of codin g techniques built up through other Web-based media d elivery proj ects at the ho st institu- tion. The mo st obvio us negati ve issue is th e need to map XML elements or attributes to tables and field s in an RDBMS, which , as Cyrenne notes, "is often expensiv e and will m ost likely res ult in the loss of some dat a suc h as processing in stru ctions , and com- ments as well as the noti on of ele- me nt and attribut e orderin g." 14 The Reproduced with permission of the copyright owner. Further reproduction prohibited without permission. Table 1. NWDA project---€valuated search and retrieval products Product Vendor Product category License MySQUPHP N/ A Relationa l database management system Open so urce Tamino XML Server Software AG Native XML database Nat ive XML database XML query engine Native XML database Integrated library system XML enabled database Commercial Commercial Commercial Open source Commercial Commercial Textml lxiasoft Ultraseek Verity Xindice N/A XML Harvester Innovative Interfaces XPAT DLXS use of native XML avoids the task of explodin g XML data in to the tabl e and field struc ture s of an RDBMS. Finally, another approac h consid- ered was the use of an integrated library sys tem product. This was a realistic option for NWDA becaus e consortium member institutions had decid ed to include MARC encoding catalogs for selected elements in union-catalog findin g ai ds. Inn o- vative Int er faces produces an XML Harve ste r that can be u sed to gen er- ate MARC records from EAD findin g aids th a t include MARC encoding analo gs. For this proj ect, a local ( or self-cont aine d) cat alog could hav e been created and p opulated with MARC records containing metadat a for th e EAD docum en ts, includin g a URL for online access. This approach offers important strengths and weak- nesses . On the positiv e side, it is a relati ve ly eas y meth od for enablin g search-and-retrieval access to EAD findin g aids. In contrast to the int er - face coding requirement s for TextML, the XML Harvest er provided an almost tu rn key approach to XML search and retrieval. 
On the negative side, two factors stood out during the evaluation. First, it would be difficult to fully customize search-and-retrieval interfaces as needed for the project. Second, using the XML Harvester, there is no ability to display search terms in the context of the finding aid. Search and retrieval is based upon the metadata extracted from the finding aid using the MARC analogs. In Michigan State's Voice Library implementation of this solution, the finding aid is an external resource with no highlighting of search terms.

Strengths and Weaknesses of the TextML Approach

Each project has its own specific needs; thus, there is no correct approach to establishing search and retrieval for EAD XML documents. In taking the needs and resources of the NWDA consortium into account, Ixiasoft's TextML, a native XML product, provided the best fit and was licensed for use. The use of TextML enables the creation of customized interfaces for an XML database (or Document Base, using the TextML terminology) and provides support for keyword and field searching of consortium documents. The qualified XPath support in TextML enables search fields to be built upon precise element or attribute combinations within EAD documents.

The existence of a major finding-aids Internet site employing TextML was a factor in the project's selection of the software. The Access to Archives (A2A) site, accessible from URL www.a2a.pro.gov.uk/, provides an excellent model for a publicly searchable finding-aid site. The A2A site supports keyword searching and searching by archival facility; provides multiple views of search results (a summary records screen, search terms in context, and the full record); highlights search term(s) in the displayed finding aid; and supports the presentation of large finding-aid documents. While A2A uses General International Standard Archival Description, or ISAD(G), as opposed to EAD for its description standard, the similarities between the two standards make the A2A site a valuable example for development.15

One weakness of TextML is the implementation model supported by Ixiasoft, which assumes significant local development of the application or Web interface. The relationship between software capabilities and local development was considered with each of the products listed in table 1. As noted, the Innovative Interfaces solution was the most straightforward approach, assuming the existence of the MARC analogs in EAD markup, but provided the least flexibility in terms of customization and establishing a true linkage between the search system and the actual document. In contrast, while Ixiasoft makes available a base set of active server pages using Visual Basic Script (ASP/VBScript) code for TextML application development and provides very good training and support services, the responsibility for that development rests with the local site. For the NWDA consortium, this development, using the code base, has been manageable. The current state of interface development for the NWDA project can be reviewed at http://nwda.wsulibs.wsu.edu/project_info/.
Conclusion

In selecting an EAD search-and-retrieval system, one important question for the consortium was, Which software solution had the best prospects for migration in the future? Because of the inherent strengths of native XML technology in comparison to the other product categories listed in table 1, a native XML database appeared to be the best approach, and TextML provided the best combination of licensing costs, software capabilities, and support. It is important to note that the distinctions between native XML databases and databases that support XML through extensions (XML-enabled databases) may become more difficult to discern over time, in part due to the existing expertise and investments in RDBMS technologies.16 Nevertheless, capabilities central to native XML, such as the use of an XML-based query language, are integral to the success of such hybrid systems.

References and Notes

1. Daniel Pitti, "Encoded Archival Description: The Development of an Encoding Standard for Archival Finding Aids," The American Archivist 60, no. 3 (Summer 1997): 269.
2. Daniel Pitti, "Encoded Archival Description: An Introduction and Overview," D-Lib Magazine 5, no. 11 (Nov. 1999). Accessed Nov. 2, 2004, www.dlib.org/dlib/november99/11pitti.html.
3. Daniel V. Pitti and Wendy M. Duff (eds.), "Introduction," in Encoded Archival Description on the Internet (Binghamton, N.Y.: Haworth, 2001), 3.
4. James M. Roth, "Serving Up EAD: An Exploratory Study on the Deployment and Utilization of Encoded Archival Description Finding Aids," The American Archivist 64, no. 2 (Fall/Winter 2001): 226.
5. Sarah L. Shreeves et al., "Harvesting Cultural Heritage Metadata Using the OAI Protocol," Library Hi Tech 21, no. 2 (2003): 161.
6. Nancy Fleck and Michael Seadle, "EAD Harvesting for the National Gallery of the Spoken Word" (paper presented at the Coalition for Networked Information fall 2002 Task Force meeting, San Antonio, Tex., Dec. 2002). Accessed Nov. 2, 2004, www.cni.org/tfms/2002b.fall/handouts/H-EAD-FleckSeadle.doc.
7. Anne J. Gilliland-Swetland, "Popularizing the Finding Aid: Exploiting EAD to Enhance Online Discovery and Retrieval," in Encoded Archival Description on the Internet (Binghamton, N.Y.: Haworth, 2001), 207.
8. Ibid., 210-14.
9. Charlotte B. Brown and Brian E. C. Schottlaender, "The Online Archive of California: A Consortial Approach to Encoded Archival Description," in Encoded Archival Description on the Internet (Binghamton, N.Y.: Haworth, 2001), 99.
10. Ibid., 103-5. OAC available at: www.oac.cdlib.org/. Accessed Nov. 2, 2004.
11. Christopher J. Prom and Thomas Habing, "Using the Open Archives Initiative Protocols with EAD," in Proceedings of the Second ACM/IEEE-CS Joint Conference on Digital Libraries (Portland, Ore., July 2002). Accessed Nov. 2, 2004, http://dli.grainger.uiuc.edu/publications/jcdl2002/p14prom.pdf.
12. Marc Cyrenne, "Going Native: When Should You Use a Native XML Database?" AIIM E-DOC Magazine 16, no. 6 (Nov./Dec. 2002), 16. Accessed Nov. 2, 2004, www.edocmagazine.com/article_new.asp?ID=25421.
13. Product category decisions based upon definitions and classifications available from: Ronald Bourret, "XML Database Products." Accessed Nov. 2, 2004, www.rpbourret.com/xml/XMLDatabaseProds.htm.
14. Cyrenne, "Going Native," 18.
15.
15. Bill Stockting, "EAD in A2A," Microsoft PowerPoint presentation. Accessed Nov. 2, 2004, www.agad.archiwa.gov.pl/ead/stocking.ppt.
16. Uwe Hohenstein, "Supporting XML in Oracle9i," in Akmal B. Chaudhri, Awais Rashid, and Roberto Zicari (eds.), XML Data Management: Native XML and XML-Enabled Database Systems (Boston: Addison-Wesley, 2003), 123-4.

9663 ---- Using GIS to Measure In-Library Book-Use Behavior Xia, Jingfeng Information Technology and Libraries; Dec 2004; 23, 4; pg. 184
Using GIS to Measure In-Library Book-Use Behavior

Jingfeng Xia

Jingfeng Xia (jxia@email.arizona.edu) is a student at the School of Information Resources and Library Science at the University of Arizona, Tucson.

This article is an attempt to develop Geographic Information Systems (GIS) technology into an analytical tool for examining the relationships between the height of bookshelves and the behavior of library readers in utilizing books within a library. The tool would contain a database to store book-use information and some GIS maps to represent bookshelves. Upon analyzing the data stored in the database, different frequencies of book use across bookshelf layers are displayed on the maps.
The tool would provide a wonderful means of visualization through which analysts can quickly grasp the spatial distribution of books used in a library. This article reveals that readers tend to pull books from the bookshelf layers that are easily reachable by human eyes and hands, and it thus opens some issues for librarians to reconsider in the management of library collections.

Several years ago, when working as a library assistant reshelving books in a university library, the author noted that the majority of books used inside the library came from the mid-range layers of bookshelves. That is, proportionally few of the books pulled out by library readers came from the top or bottom layers; books on the layers that were easily reachable by readers were frequently utilized. Such a book-use distribution pattern made the job of reshelving books easy, but it raised some questions: how could book locations influence the choices readers make in selecting books? If this was not an isolated observation, it must have exposed an interesting phenomenon that librarians needed to pay attention to. Then, by finding out the reasons, librarians might become capable of guiding, to some extent, users' selection of library books by deliberately arranging collections at designated heights on bookshelves.

A research study was designed to develop Geographic Information Systems (GIS) into an analytical tool to examine the author's earlier casual observations. The study was conducted in the MacKimmie Library at the University of Calgary. This paper highlights the results of the study, which aimed at assessing the behavior of library readers in pulling books from bookshelves. These books, when not checked out, are categorized as "pickup books" because they are usually discarded inside the library after use and then picked up by library assistants for reshelving. Like many other libraries, the MacKimmie Library does not encourage readers to reshelve books themselves.

ArcView, a GIS software package, was selected to develop the tool for this study because GIS has functions for dynamically analyzing and displaying spatial data. Research on library readers pulling out books involves measurements of bookshelf heights, and thus deals with spatial coordinates. With the capability of presenting bookshelves in different views on maps, GIS can give readers an easy visual understanding of analytical results that would be wordy to convey in text. At the same time, some GIS products are now available in most academic libraries, giving developers convenient access to them.

Hypothesis

When library users decide to check books out of a library, these books are what they think of as useful. People are usually hesitant to carry home books of little or uncertain use, not only because of the limit on the number of checked-out books, but also because of the physical work required to carry them. Moreover, some items, such as periodicals and multimedia materials, are either designated as "reference only" or have a very short loan period. It is reasonable to believe that users carefully select what they want from library collections and keep these books for handy use outside the library.

By contrast, in-library book use represents a different category of library readers' behavior.
There are two general categories of in-library book use: readers bringing their own books into a library for use, and readers pulling books from bookshelves inside a library. The former is commonly seen when students study textbooks for examinations (not the topic of this study), while the latter is a little more complex.1

As library users approach bookshelves to extract books, they may or may not have a definite target. When they come with call numbers, people will deliberately draw the books they want for reading, photocopying, or referencing. However, there are times when users only wander the bookshelf aisles of desired collections, uncertain about singling out specific books. They may simply shelf-shop, randomly selecting whatever interests them, or they may locate a subject of need and go to the storage position(s) to look for whatever books are there. No matter what these readers' intentions are, they roam among collections, pick books for quick use, and leave them inside the library after use, although some materials may also be checked out.

Because of such arbitrary selections from library collections, physical convenience sometimes influences library users in taking books from bookshelves: they may look around for books on bookshelf layers that are at a reachable height. The standard library bookshelf is higher than the average person's height and is structured to have five to eight layers. In academic libraries, "wood shelving is available in three heights: 82 in. (2050 mm), with a bottom shelf and six adjustable shelves; 60 in. (1500 mm), with a bottom shelf and four adjustable shelves; and 42 in. (1050 mm), with a bottom shelf and two adjustable shelves."2 For regular collections in most academic libraries, bookshelves are usually about eighty-two inches high and have seven layers. Books on the top layer are out of reach for many readers, requiring a ladder to draw a book down, and many users are hesitant to use ladders. Even worse, a reader has to bend over or squat down to view the contents of books on the bottom layer of a bookshelf.

Hence, the hypothesis is that books used inside a library are primarily distributed among the mid-range layers of bookshelves. Specifically, if a bookshelf has seven layers, books placed on layers two through six are most frequently consulted. This is the subject of this research paper.

Background

A considerable number of studies have investigated the utilization of books that are checked out of a library; an estimate made in 1967 pointed out that over seven hundred research results pertained to this topic.3 However, the situation of books used inside a library has not been given enough attention. One of the reasons for this seeming neglect comes from the belief that the records of library books in circulation provide similar information to those of books used within libraries.4 This misunderstanding was later criticized by other researchers, who discovered the differences in use behavior between library readers taking books home and those using books inside libraries.5
Researchers have now recognized that correlations between the two sets of data are not as strong as they once seemed. Such recognition, unfortunately, has not resulted in more subsequent work to explore the issue of in-library book use. This is probably due to the difficulties of collecting data or the lack of appropriate research methods.6 Also, the majority of relevant surveys were conducted several decades ago and focused primarily on exploring a good method of sampling in-library book use.7 Among these studies, Fussler and Simon preferred to carry out research by distributing questionnaires among library readers; Drott used random-sampling methods to statistically examine the importance of library-book use; and Jain, as well as Salverson, emphasized dividing the survey times into different investigation units when conducting research. Similarly, Morse pointed out the complexity of measuring library-book use at work, advocating the involvement of computerized operations in library-book management.

The sampling strategies and analytical methods implemented in past studies are still applicable to current research. Nonetheless, because many new technologies have come into view since then, it is quite likely that some new ways of obtaining and analyzing data on in-library book use can now be developed. The new approaches must have the capability of providing not only accurate measurement of the data but also the means for easy manipulation, and their results must be able to enhance the understanding of user behavior in exploring the resources of existing collection inventories. One of the solutions is an analytical tool.

An analytical tool can control data collection and analysis by computerization. If the system is able to accumulate constantly updated records over time, it will remedy the problem of poor sampling that many researchers have encountered, because analysis will then be done on all the data rather than on certain isolated samples. The development of modern technologies makes such data collection and storage possible and easier than ever before. One example of these technologies is the radio frequency identification (RFID) tag system that has been adopted by some public and academic libraries recently.8 This system stores a tag in each library item with the item's bibliographic information and uses an antenna to keep track of the tag. By automatically communicating with data stored in the tags, the system can collect data on all library collections in a timely manner and export them into predesigned databases for easy management.

Data analysis and presentation comprise another part of the analytical mechanism. Researchers have to carefully evaluate existing technologies in order to select proper products or develop particular programs to integrate with RFID (if used) and the databases. It is fortunate that GIS technology is available, with numerous functions for analyzing and demonstrating data, especially spatial data. Data visualization through GIS products has been very good, which gives them advantages over other analytical, statistical, or reporting products.
Combining RFID and GIS into one system would seem to be the perfect solution: the former can effectively carry out data collection, and the latter can efficiently perform data analysis and presentation. However, while GIS products have been used in libraries in the United States for more than a decade, most academic libraries are hesitant to invest in RFID because of its high costs. GIS technology alone, however, can still provide sufficient functions to be developed into such an analytical tool.

Up to now, libraries that provide GIS services have used the software only to assist users in working with geospatial data and mapping technologies.9 GIS has not been exploited enough to aid the management of libraries themselves and the study of library collections. Some commercial GIS software, such as Library Decision by CivicTechnologies, has recently been marketed to support the analysis of library-user data for public libraries.10 However, it works well only on data of a conventional geographical nature, that is, the distribution and location of libraries and their users mapped against city blocks and streets. It does not apply to a library and its books, and especially not to the distribution of books used inside the library. Such products are also not applicable to academic libraries, which do not always concentrate on the analysis of the geographical areas of their users.

Even so, GIS has all the functions that the proposed analytical tool demands. It is suitable for assisting research on in-library book use, where library floor layouts or other facilities can be drawn into maps in multiple-dimensional views. At the same time, bookshelves with individual layers can be treated as an innovative form of map by GIS technology (see figure 1), making visible the relationship of book use to the height of the bookshelf. As soon as the presentation mechanism is linked to databases, any updates on book use will be mirrored visually.

Method

This project is one of a series of projects for developing GIS into a tool to manage and analyze the usage characteristics of library books. The other projects include using GIS to measure book usability for the development of collection inventories, to assist in the management of library physical space and facilities, and to locate library items.11

In order to make GIS workable for the subject of this paper, the focus was placed only on the exploration of correlations between bookshelf heights and book-use frequencies in an academic library environment.

[Figure 1. The front view of one bookshelf rack on the fifth floor of the University of Calgary MacKimmie Library. Eight bookshelves make up the rack. Different shades of color represent the number of books used on each individual layer. The display is for demonstration only and not to actual scale.]

There are two major steps in conducting this research: collecting data and developing a GIS analytical tool. Since MacKimmie Library had not invested in RFID at the time this research was undertaken, personal observations were made to record book-use data.12
The development of the GIS tool involves creating a small database to store data and facilitate data analysis. It also requires creating several bookshelf and shelf-range maps to present analytical results in visualized form. ArcView, the most popular GIS product in the world, was utilized for the development.

This paper presents only a portion of the collection areas at MacKimmie Library. Part of the fifth floor, where some collections in the humanities and social sciences are stored, was selected because this floor is among the busiest floors used by readers. It is filled with sixty-eight ranges of bookshelves containing books from call numbers B to DU. The terms used in this paper include bookshelf, referring to one unit of furniture fitted with horizontal shelves to hold books; rack, which includes more than one bookshelf standing together in a line; and range, composed of two racks standing back-to-back. Bookshelves on the fifth floor are arranged to surround a group of facility rooms in the central area, and study corridors are set between the bookshelves and the wall. Each bookshelf range consists of two bookshelf racks, each of which in turn has eight individual bookshelves. All of the bookshelves are about eighty-two inches high and have seven layers. The layers, except for the open top ones, are equal in height, width, and length.

Data Collection

Personal surveys were taken by the author to note down the call number of each book that was not in its original position on the shelves but instead was found discarded on the floor, tables, chairs, or sofas, or on top of or in front of other stocked books. Books on the shelving carts were also counted. The surveys were conducted separately three times a day (morning, afternoon, and evening) in order to catch as many of the books used in a day as possible. To avoid recording the same book more than once, no duplicate call numbers were accepted for any single day, even if the same book was found in different locations on that day. On the other hand, the same call number could be entered into the records on the second day if it had been recorded the day before and remained in the same place without being picked up by library assistants. (Such duplicate recording was very rare because of the routine pickup work done by library assistants.) A period of two weeks in the first half of December 2002 was designated for the survey. The period was planned to include the final examination week because that week represents heavy book use, although previous research found that readers in this week tended to use library collections less than their own study materials.13 A supplementary survey that also lasted two weeks, including a final examination week, was conducted in the library in late spring 2004.

To simplify the research, some exceptions were established for data collection. Periodicals were excluded because they have a very short loan period (generally one day); library users may prefer to read journal articles within the library and thus will have a clear idea as to what materials to read.14 Books belonging to other floors of the library, or books belonging to the fifth floor but found outside the area, were not included in the analysis.
Furthermore, due to the nature and time limit of these observations, books pulled from targeted bookshelves were not distinguished from books taken from bookshelves at random. This information could only become available through interviews with library users, which could be another research project.

Each bookshelf layer was recorded with, and signified by, two call numbers: the start and end numbers of its books. For example, the call numbers "BF1999 .K54" to "BH21 .B35 1965," representing books stored on a particular layer, were recorded to identify that layer. Because book shifting can happen from time to time, such recording of start and end call numbers for individual bookshelf layers only reflects the conditions when this research was undertaken and may need updates whenever changes occur.

Data Manipulation and Visualization

Using a bookshelf layer as the recording unit is essential for the analysis of the relationship between book use and bookshelf height. Each book used can be classified into one unit according to its call number. Therefore, building a database with a table for layers is an important part of developing such an analytical tool. The LAYERS table includes a data field as an identifier to stand for the sequence of each layer (1 for the top layer, 2 for the next layer down, and so on), in addition to storing the start and end call numbers of the books on each layer. If more than one bookshelf in the library has seven layers, layer identifiers will repeat from bookshelf to bookshelf; therefore, this table also needs an identifier for each individual bookshelf with which the layers are associated.

The database also contains such information as bookshelf ranges, bookshelf racks, and books, each held in its own database table and joined to the others by relational keys. Among them, the RANGES table is simply characterized by its identifier and is designed to represent two racks of bookshelves that stand back to back. The BOOKSHELVES table is identified by the call numbers of the start and end books stored across individual bookshelves rather than on individual layers. Furthermore, the BOOKS table is primarily filled with individual book call numbers as well as book pickup times and book discard locations.

GIS has limited ability to organize database structure; if necessary, other database management systems, such as Microsoft Access, can be incorporated. Query code is built to produce summarized information for specific purposes, and the aggregated data are exported into GIS databases for further spatial analysis or convenient visual presentation. Data visualization can be shown at different levels: by layer, bookshelf, rack, and range. The first attempt at making a visual demonstration for this research covers the area of individual bookshelves at the layer level (see figure 1). The following query returns the necessary summarized information (the published version prints the aggregate as sum(b.call_no); since call numbers are character strings, a count of books is evidently what is intended and is used here):

    SELECT count(b.call_no) AS total_num, l.layer_id, l.shelf_id
    FROM BOOKS b INNER JOIN LAYERS l ON b.some_id = l.some_id
    WHERE b.call_no > l.start_no AND b.call_no < l.end_no
    GROUP BY l.layer_id, l.shelf_id
    ORDER BY l.shelf_id, l.layer_id;
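For readers who want to see the design in one place, here is a minimal sketch of the two tables the query assumes, written as generic SQL DDL. The column names mirror the query above; the data types, lengths, and keys are assumptions for illustration, not taken from the published article, which does not print its schema:

    -- Hypothetical DDL for the LAYERS and BOOKS tables described in the text.
    -- Column names follow the query above; types and keys are assumed.
    CREATE TABLE LAYERS (
      shelf_id  INTEGER      NOT NULL,  -- identifier of the individual bookshelf
      layer_id  INTEGER      NOT NULL,  -- 1 = top layer, 2 = next layer down, ...
      some_id   INTEGER,                -- join key carried over from the published query
      start_no  VARCHAR(30)  NOT NULL,  -- call number of the first book on the layer
      end_no    VARCHAR(30)  NOT NULL,  -- call number of the last book on the layer
      PRIMARY KEY (shelf_id, layer_id)
    );

    CREATE TABLE BOOKS (
      call_no     VARCHAR(30) NOT NULL, -- e.g., 'BF1999 .K54'
      some_id     INTEGER,              -- join key carried over from the published query
      pickup_time TIMESTAMP,            -- when the discarded book was collected
      discard_loc VARCHAR(50)           -- floor, table, chair, sofa, shelving cart, ...
    );

One caveat the article does not raise: comparing Library of Congress call numbers as plain strings (b.call_no > l.start_no) only approximates shelf order, because, for example, "BF21" precedes "BF199" on the shelf but follows it lexicographically; a production system would normalize call numbers into a sortable key first.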
At the same time, another attempt is made to demonstrate book numbers per layer, at the bookshelf level, across multiple bookshelf ranges. This demonstration provides a better visualization in the GIS display, so that an overall view of the height distributions of book usage over certain collection areas can be presented (see figures 2 and 3). To achieve such visualization, data must be compared in order to determine which layer of a bookshelf contains the most frequently used books and which holds those that are rarely visited. This demonstration indicates that any alternative selection of analytical-display units can easily be accommodated by modifying the query that aggregates the data.

Technically, data visualization can be presented using any GIS software, although ArcView is used here because it has been available in the systems of many academic libraries. Bookshelf ranges on MacKimmie Library's fifth floor were drawn into map features. In order to show them in a three-dimensional view, each of the seven layers was given a sequential number as its height value, and all bookshelves were treated as having the same height. These height values are treated as the z values in any three-dimensional analysis. Then, by associating the numbers of books from the database with the heights of layers on the map, ArcView is able to sketch the height distributions of in-library book use from new perspectives, dramatically improving the understanding of book use.

In order to implement the visualization of all layers across a bookshelf range, layers were drawn as map features (see figure 1). Layer heights and widths are in appropriate proportion. (Individual books on each layer are for demonstration only, and thus are not in the exact shape and number.) Figure 1 shows how a bookshelf rack has been presented as a GIS map, which is a totally new idea in the applications of GIS visualization.

The database and visualization mechanism constitute what is referred to in this paper as the analytical tool. One will find that the development is relatively easy and the tool is remarkably simple. However, it is a dynamic device. If expanded into other parts of the library collections, this tool will become an integrated system that is able to assist in the management of library book use and

[Figure 2. A three-dimensional view of bookshelf ranges on the fifth floor at the MacKimmie Library. The height of each bookshelf represents the corresponding height of the layer from which most books were removed. This display is not to actual scale.]

[The remainder of the Xia article is not reproduced here; the text resumes partway through another article from the same issue, "Using Server-Side Include Commands" by Northrup, Cherry, and Darby.]

Once the appropriate pieces of HTML code had been replaced with corresponding include statements, the changeover was complete. From this point forward, changes in such things as database names, coverage periods, and descriptive material will be made to one .txt file. The change will immediately be reflected across all subject pages with no additional work involved for the librarians responsible for those pages. Note that the SUL server has been configured so that it parses all Web pages.
This is necessary because most of the library's Web pages have some SSI. This configuration means that the Web page extensions remain .html. If the server is not configured in this manner, then all pages containing SSI must end in a .shtml extension. This is a subject that requires discussion with automation librarians or the department responsible for the library's server.

Advantages

Obviously, the biggest advantage to this method is the time saved for individual librarians. There is now no need for librarians to do any maintenance work for links to information housed in the alphabetical list. Static HTML pages referencing Gale's InfoTrac OneFile database, for instance, would have required updates to approximately forty subject pages; now, one librarian can correct one .txt file and simultaneously update all forty subject pages. Time saved can be used in collecting and editing the list of Web sites that are a part of each subject page, a task that has been pushed back in the past in favor of making more urgent database information changes.

[Fig. 1. HTML code for Academic Search Elite using a PURL called eb-ase, shown in the CoffeeCup HTML Editor. The recoverable markup reads: Academic Search Elite (EBSCO) <img src="/images/fulltext.gif" alt="some full text" border="0"> multi-disciplinary database; includes some scholarly articles.]

[Fig. 2. Database names, .txt file names, and resultant include commands, maintained in an Excel workbook (AlphabeticalResources.xls); for example, the Accessible Archives database maps to the file accessible.txt.]

In addition, librarians who are using this simple technique do not need extensive training. The creation of the Excel database of include commands allows for quick additions to an existing page, or the creation of new subject pages. Librarians using the include commands can simply copy and paste them; there is no need for them to understand the syntax or to be able to repeat it. This makes using SSI particularly attractive to staff who do not want the added burden of further training in HTML. The librarian responsible for creating the .txt files and the Excel database of statements demonstrated the copying and pasting of the include statements to all the other librarians who edit HTML pages in a one-time, ten-minute training session. The only additional training issue has involved page structure. Since the library uses a table structure for the subject pages, all table tags are included in the database .txt files. Making sure that librarians understand that they do not need to recreate the table tags has been the only additional training issue for the department.

As librarians begin to use these commands, links to resources across subject pages will look the same and will provide the user with the same information. This increased uniformity results in a more professional appearance for the Web site as a whole.

Disadvantages

This revolution in the maintenance of subject pages has not been without its disadvantages. The primary complaint by librarians using SSI include commands is that they cannot preview their changes in their HTML editors.
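The preview problem follows directly from how SSI works: until the server parses a page, an include directive is nothing more than an HTML comment, so an editor's preview shows a hole where the fragment belongs. As a hedged illustration of the pattern described above (the directory and file names here are invented, not SUL's actual paths), a subject page might contain:

    <!-- Subject page fragment. Each database entry lives in one shared .txt file; -->
    <!-- the .txt files carry their own table tags, as the article notes. -->
    <table>
      <!--#include virtual="/includes/accessible.txt" -->
      <!--#include virtual="/includes/infotrac-onefile.txt" -->
    </table>

Editing /includes/accessible.txt once then updates every subject page that includes it, while a browser or editor that does not pass the page through the server renders nothing at all for each directive.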
SUL's department uses the CoffeeCup HTML Editor, which allows previews, but the previews are not visible for items that are retrieved using SSIs. This is because the page is not fully assembled until the server assembles it. When the librarian views the page in the editor, prior to uploading it to the server, the include commands are without targets. The target .txt files are on the server. When a user requests a page, the include commands pull in the missing pieces (the .txt files, or other files); then, the completed page is seamlessly presented to the user via his or her browser. As Mach notes, "Previewing a Web page without crucial elements . . . can be disconcerting, especially to visually oriented designers."20 In SUL's experience with this particular issue, librarians who are uncomfortable loading pages with locally invisible elements can load them into temporary folders on the server, check them for errors there, and then move them to their appropriate directories.

Conclusion

Situational factors have allowed SUL to implement this change with surprising ease and speed. Because the library has its own server, and because there is an automation librarian on staff, communication and change have been easy and efficient. Library staff deduce that it is because the include command of SSI is being used more than other possible commands that the library is not experiencing an increase in loading time on its pages. Of course, the size of SUL's resource list makes this kind of solution feasible; certainly, if the library were working with hundreds of resources, it would be more likely that a database-driven strategy would be adopted. The simplicity and elegance of the SSI include command process has encouraged adoption, and SUL has seen no ill effects from the user side of operations. Librarian Web authors quickly overcame any slight discomfort with the new process and are now able to devote a portion of editing time to other, less monotonous tasks.

References and Notes

1. Carla Dunsmore, "A Qualitative Study of Web-Mounted Pathfinders Created by Academic Business Libraries," Libri 52, no. 3 (Sept. 2002): 140-41.
2. Charles W. Dean, "The Public Electronic Library: Web-based Subject Guides," Library Hi Tech 16, no. 3-4 (1998): 80-88; Gary Roberts, "Designing a Database-Driven Web Site, or, The Evolution of the Infoiguana," Computers in Libraries 20, no. 9 (Oct. 2000): 26-32; Bryan H. Davidson, "Database-Driven, Dynamic Content Delivery: Providing and Managing Access to Online Resources Using Microsoft Access and Active Server Pages," OCLC Systems and Services 17, no. 1 (2001): 34-42; Marybeth Grimes and Sara E. Morris, "A Comparison of Academic Libraries' Webliographies," Internet Reference Services Quarterly 5, no. 4 (2001): 69-77; Laura Galvan-Estrada, "Moving towards a User-Centered, Database-Driven Web Site at the UCSD Libraries," Internet Reference Services Quarterly 7, no. 1-2 (2002): 49-61.
3. Roberts, "Infoiguana"; Davidson, "Database Driven"; Galvan-Estrada, "User-Centered, Database-Driven Web Site."
4. Davidson, "Database Driven," under "Introduction."
5. Ibid., under "Development Considerations."
6. Roberts, "Infoiguana," 32.
7. Galvan-Estrada, "User-Centered, Database-Driven Web Site," 55-56.
8. Jody Condit Fagan, "Server-Side Includes Made Simple," The Electronic Library 20, no. 5 (2002): 382-83.
9. Michelle Mach, "The Service of Server-Side Includes," Information Technology and Libraries 20, no. 4 (2001): 213.
10. Greg R. Notess, "Server Side Includes for Site Management," Online 24, no. 4 (July 2000): 78, 80.
11. Ibid.
12. Mach, "Service of Server-Side Includes," 216.
13. Ibid., 214.
14. Fagan, "Server-Side Includes Made Simple," 387.
15. Ibid., 383.
16. Ibid.
17. Ibid.
18. Apache HTTPD Server Project, "Apache HTTP Server Version 1.3: Security Tips for Server Configuration," The Apache Software Foundation. Accessed Oct. 29, 2003, http://httpd.apache.org/docs/misc/security_tips.html.
19. Anthony Baratta, e-mail to thelist mailing list, May 16, 2003. Accessed Nov. 4, 2003, http://lists.evolt.org/archive/Week-of-Mon-20030512/140824.html.
20. Mach, "Service of Server-Side Includes," 217.

9665 ---- Free Culture: How Big Media Uses Technology and the Law to Lock Down Culture and Control Creativity Coyle, Karen Information Technology and Libraries; Dec 2004; 23, 4; pg. 198

Book Review

Free Culture: How Big Media Uses Technology and the Law to Lock Down Culture and Control Creativity. By Lawrence Lessig. New York: Penguin, 2004. 240p. $24.95 (ISBN 1-594-20006-8).

This is the third book by Stanford law professor Larry Lessig, and the third in which he furthers his basic theme: that the ancient regime of intellectual property owners is locked in a battle with the capabilities of new technology. Lessig used his first book, Code and Other Laws of Cyberspace (Basic Books, 1999), to explain that the notion of cyberspace as free, open, and anarchic is simply a myth, and a dangerous one at that: the very architecture of our computers and how they communicate determine what one can and cannot do within that environment. If you can get control of that architecture, say by mandating filters on content, you can get substantial control over the culture of that communication space. In his second book, The Future of Ideas: The Fate of the Commons in a Connected World (Random, 2001), Lessig describes how the change from real property to virtual property actually means more opportunity for control, not less. The theme that he takes up in Free Culture is his concern that certain powerful interests in our society (read: Hollywood) are using copyright law to lock down the very stuff of creativity: mainly, past creativity.

Lessig himself admits in his preface that his is not a new or unique argument. He cites Richard Stallman's writings in the mid-1980s, which became the basis for the Free Software movement, as containing many of the same concepts that Lessig argues in his book. In this case, it serves as a kind of proof of concept (that new ideas build on past ideas) rather than a criticism of lack of originality. Stallman's work is not, however, a substitute for Lessig's; not only does Lessig address popular culture where Stallman addresses only computer code, but Lessig has one key thing in his favor: he is a master story-teller and a darned good writer, not something one usually expects in an academic and an expert in constitutional law.
His book opens with the first flight of the Wright brothers and the death of a farmer's chickens, followed by Buster Keaton's film Steamboat Bill and Disney's famous mouse. The next chapter traces the history of photography and how the law once considered that snapping a picture could require prior permission from the owners of any property caught in the viewfinder. Later he tells how an improvement to a search engine led one college student to owe the Recording Industry Association of America $15 million. Throughout the book Lessig illustrates copyright through the lives of real people and uses history, science, and the arts to make this law come to life for the reader.

Lessig explains that intellectual property differs from real property in the eye of the law. Unlike real property, where the property owner has near total control over its uses, the only control offered to authors originally was the control over who could make copies of the work and distribute them. In addition, that right (the "copy right") lasted only a short time. The original length of copyright in the United States was fourteen years, with the right to renew for another fourteen years. So a total of twenty-eight years stood between an author's rights and the public domain, and those rights were limited to publishing copies. Others could quote from a work, even derive other works from it (such as turning a novel into a play), all within a law that was designed to promote science and the arts.

Fast forward to the present day and we have a very different situation. Not only has there been a change in the length of time that copyright applies to a work; a major change in copyright law in 1976 extended copyright to works that had not previously been covered. In the earliest U.S. copyright regimes of the late 18th century, only works that were registered with the copyright office were afforded the protection of copyright law, and only about five percent of works produced were so registered. The rest were in the public domain. Later, actual registration with the copyright office was unnecessary, but the author was required to place a copyright notice on a work (e.g., "© 2004, Karen Coyle") in order to claim copyright in it. Copyright holders had to renew works in order to make use of the full term of protection, and renewal rates were actually quite low. In 1976, all such requirements were removed, and the law was amended to state that any work in a fixed medium automatically receives copyright protection, and for the full term. That is true even if the author does not want that protection. So although many saw the great exchange of ideas and information on the Internet as being a huge commons of knowledge, to be shared and shared alike, all of it has, in fact, always been covered by copyright law: every word out there belongs to someone.

That change, combined with a much earlier change that gave a copyright holder control over derivative works, puts creators into a deadlock. They cannot safely build on the work of others without permission (thus Lessig's argument that we are becoming a "permission culture"). Yet, we have no mechanism (such as registration of works that would result in a database of creators) that would facilitate getting that permission. If you find a work on the Internet and it has no named author or no contact information for the author, the law forbids you to reuse the work without permission, but there is nothing that would make getting that permission a manageable task. Of course, even if you do know who the rights holder is, permission is not a given. For example,
If you find a work on the Internet and it has no named author or no contact information for the author, the law forbids you to reuse the work without permission, but there is nothing that would make getting that permission a man- ageable task. Of course, even if you do know who th e rights hold er is , permission is not a given. For exam- Reproduced with permission of the copyright owner. Further reproduction prohibited without permission. ple, you hear a great song on the radio and want to use parts of that tune in your next rap performance. You would need to approach the major record label that holds the rights and ask permission, which might not be granted. You could go ah ead and use the sample and, if challenged, claim "fair use." But being challenged means going to court in a world where a court case could cost you in the six digits, an amount of money that most creators do not have. Lessig, of course, spends quite a bit of time in his book on the length of copyright, now life of the author plus seventy years. It was exactly this issue that he and Eric Eldred took to the Supreme Court in 2003. Lessig argued before the court that if Congress can seemingly arbitrarily increase the length of copyright, as it has eleven times since 1962, then there is effectively no limit to the copyright term. Yet "for a limited time" was clearly mandated in the U.S. Constitution. Lessig lost his case. You might expect him to spend his efforts explaining how the Supreme Court was wrong and he was right, but that is not what he does . Right or wrong, they are the Supreme Court, and his job was to convince them to decide in favor of his client. Instead, Lessig revises his estimation of what can be accom- plished with constitutional argu- ments and spends a chapter outlining compromises that might- just might-be possible in the future. To the extent that Eldred v. Ashcroft had an effect on Lessig's thinking , and there is evidence that the effect was profound, it will have an effect on all of us because Lessig is one of the key actors in this arena. Throughout the book, Lessig points out the difference between copyright law and the actual market for works. There is a great irony in the fact that copyright law now protects works for a century or more while most books are in print for one year or less. It is this vast storehouse of out-of- print and unexploited works that makes a strong argument for some modification of our copyright law. He also recognizes that there are different creative cultures in our society, with different views of the purpose of cre- ation. Here he cites academic move- ments like the Public Library of Science as solutions for the sector of society that has a low or nonexistent commercial interest but a need to get its works as widely distributed as pos- sible. For these creators, and for "shar- ers" everywhere, Lessig promotes the CreativeCommons solution (at www. creativecommons.org), a simple licen- sing scheme that allows creators to attach a license to their work that lets others know how they can make use of it. In a sense, CreativeCommons is a way to opt out of the default copyright that is applied to all works. When I first received my copy of Free Culture, I did two things: I looked up libraries in the index, and I looked up the book online to see what other reviewers had said. 
Online, I found a Web site for the book (http://free-culture.org) that pointed to two very interesting sites: one that lists free, downloadable full-text copies of the book in over a dozen different formats; and one that allows you to listen to the chapters being read aloud by volunteers and admirers. (I did listen to a few chapters, and generally they are as listenable as most nonfiction audio books. In the end, though, I read the hard copy of the book.) Lessig is making a point by offering his work outside the usual confines of copyright law, but in fact the meaning of his gesture is more economic than legal. Although he, and Cory Doctorow before him (Down and Out in the Magic Kingdom, Tor Books, 2003), brokered agreements with their publishers to publish simultaneously in print with free digital copies, few authors and publishers today will choose that option, for fear of loss of revenue, not because of their belief in the sanctity of intellectual property. If there were sufficient proof that free online copies of works increased sales of hard copies, this would quickly become the norm, regardless of the state of copyright law.

As for libraries: unfortunately, they do not fare well. He dedicates a short chapter to Brewster Kahle and his Way-Back Machine as his example of the need to archive our culture for future access. I admit that I winced when Lessig stated:

But Kahle is not the only librarian. The Internet Archive is not the only archive. But Kahle and the Internet Archive suggest what the future of libraries or archives could be. (114)

Lessig also mentions libraries in his arguments about out-of-print and inaccessible works, but in this case he actually gets it wrong:

After it [a book] is out of print, it can be sold in used book stores without the copyright owner getting anything and stored in libraries, where many get to read the book, also for free. (113)

Since we know that Lessig is very aware that books are sold and lent even while they are still in print, we have to assume that the elegance of the argument was preferred over precision. But he makes this error more than once in the book, leaving libraries to appear to be a home for leftovers and remaindered works. That is too bad. We know that Lessig is aware of libraries; anyone active in the legal profession depends on them. He has spoken at library-related conferences and events. Yet he does not see libraries as key players in the battle against overly powerful copyright interests. More to the point, libraries have not captured his imagination, or given him a good story to tell. So here is a challenge for myself and my fellow librarians: whether it means chatting up Lessig after one of his many public performances, becoming active in Creative Commons, or stopping by Palo Alto to take a busy law professor to lunch, we need to make sure that we get on, and stay on, Lessig's radar. We need him; he needs us. - Karen Coyle, Digital Libraries Consultant, http://kcoyle.net

9718 ---- June_ITAL_Fagan_final An Evidence-Based Review of Academic Web Search Engines, 2014-2016: Implications for Librarians' Practice and Research Agenda

Jody Condit Fagan

https://doi.org/10.6017/ital.v36i2.9718

ABSTRACT

Academic web search engines have become central to scholarly research.
While the fitness of Google Scholar for research purposes has been examined repeatedly, Microsoft Academic and Google Books have not received much attention. Recent studies have much to tell us about Google Scholar's coverage of the sciences and its utility for evaluating researcher impact. But other aspects have been understudied, such as coverage of the arts and humanities, books, and non-Western, non-English publications. User research has also tapered off. A small number of articles hint at the opportunity for librarians to become expert advisors concerning scholarly communication made possible or enhanced by these platforms. This article seeks to summarize research concerning Google Scholar, Google Books, and Microsoft Academic from the past three years with a mind to informing practice and setting a research agenda. Selected literature from earlier time periods is included to illuminate key findings and to help shape the proposed research agenda, especially in understudied areas.

INTRODUCTION

Recent Pew Internet surveys indicate an overwhelming majority of American adults see themselves as lifelong learners who like to "gather as much information as [they] can" when they encounter something unfamiliar (Horrigan 2016). Although significant barriers to access remain, the open access movement and search engine giants have made full text more available than ever.1 The general public may not begin with an academic search engine, but Google may direct them to Google Scholar or Google Books. Within academia, students and faculty rely heavily on academic web search engines (especially Google Scholar) for research; among academic researchers in high-income areas, academic search engines recently surpassed abstracts & indexes as a starting place for research (Inger and Gardner 2016, 85, Fig. 4). Given these trends, academic librarians have a professional obligation to understand the role of academic web search engines as part of the research process.

Jody Condit Fagan (faganjc@jmu.edu) is Professor and Director of Technology, James Madison University, Harrisonburg, VA.

1 Khabsa and Giles estimate "almost 1 in 4 of web accessible scholarly documents are freely and publicly available" (2014, 5).

Two recent events also point to the need for a review of research. Legal decisions in 2016 confirmed Google's right to make copies of books for its index without paying or even obtaining permission from copyright holders, solidifying the company's opportunity to shape the online experience with respect to books. Meanwhile, Microsoft rebooted their academic web search engine, now called Microsoft Academic. At the same time, information scientists, librarians, and other academics conducted research into the performance and utility of academic web search engines. This article seeks to review the last three years of research concerning academic web search engines, make recommendations related to the practice of librarianship, and propose a research agenda.

METHODOLOGY

A literature review was conducted to find articles, conference presentations, and books about the use or utility of Google Books, Google Scholar, and Microsoft Academic for scholarly use, including comparisons with other search tools. Because of the pace of technological change, the focus was on recent studies (2014 through 2016, inclusive).
A search was conducted on "Google Books" in EBSCO's Library and Information Science and Technology Abstracts (LISTA) on December 19, 2016, limited to 2014-2016. Of the 46 results found, most were related to legal activity. Only four items related to the tool's use for research. These four titles were entered into Google Scholar to look for citing references, but no additional relevant citations were found. In the relevant articles found, the literature reviews testified to the general lack of studies of Google Books as a research tool (Abrizah and Thelwall 2014; Weiss 2016) with a few exceptions concerning early reviews of metadata, scanning, and coverage problems (Weiss 2016). A search on "Google Books" in combination with "evaluation OR review OR comparison" was also submitted to JMU's discovery service,2 limited to 2014-2016 in combination with the terms. Forty-nine items were found and from these, three relevant citations were added; these were also entered into Google Scholar to look for citing references. However, no additional relevant citations were found. Thus, a total of seven citations from 2014-2016 were found with relevant information concerning Google Books. Earlier citations from the articles' bibliographies were also reviewed when research was based on previous work, and to inform the development of a fuller research agenda.

2 JMU's version of EBSCO Discovery Service contained 453,754,281 items at the time of writing and is carefully vetted to contain items of curricular relevance to the JMU community (Fagan and Gaines 2016).

A search on "Microsoft Academic" in LISTA on February 3, 2017 netted fourteen citations from 2014-2016. Only seven seemed to focus on evaluation of the tool for research purposes. A search on "Microsoft Academic" in combination with terms "evaluation OR review OR comparison" was also submitted to JMU's discovery service, limited to 2014-2016. Eighteen items were found but no additional citations were added, either because they had already been found or were not relevant. The seven titles found in LISTA were searched in Google Scholar for citing references; four additional relevant citations were found, plus a paper relevant to Google Scholar not previously discovered (Weideman 2015). Thus, a total of eleven citations were found with relevant information for this review concerning Microsoft Academic. Because of this small number, several articles prior to 2014 were included in this review for historical context.

An initial search was performed on "Google Scholar" in LISTA on November 19, 2016, limited to 2014-2016. This netted 159 results, of which 24 items were relevant. A search on "Google Scholar" in combination with terms "evaluation OR review OR comparison" was also submitted to JMU's discovery tool limited to 2014-2016, and eleven relevant citations were added. Items older than 2014 that were repeatedly cited or that formed the basis of recent research were retrieved for historical context. Finally, relevant articles were submitted to Google Scholar, which netted an additional 41 relevant citations. Altogether, 70 citations were found to articles with relevant information for this review concerning Google Scholar in 2014-2016. Readers interested in literature reviews covering Google Scholar studies prior to 2014 are directed to (Gray et al. 2012; Erb and Sica 2015; Harzing and Alakangas 2016b).
FINDINGS

Google Books

Google Books (https://books.google.com) contains about 30 million books, approaching the Library of Congress's 37 million, but far shy of Google's estimate of 130 million books in existence (Wu 2015), which Google intends to continue indexing (Jackson 2010). Content in Google Books includes publisher-supplied, self-published, and author-supplied content (Harper 2016) as well as the results of the famous Google Books Library Project. Started in December 2004 as the "Google Print" project,3 the project involved over 40 libraries digitizing works from their collections, with Google indexing and performing OCR to make them available in Google Books (Weiss 2016; Mays 2015). Scholars have noted many errors with Google Books metadata, including misspellings, inaccurate dates, and inaccurate subject classifications (Harper 2016; Weiss 2016). Google does not release information about the database's coverage, including which books are indexed or which libraries' collections are included (Abrizah and Thelwall 2014). Researchers have suggested the database covers mostly U.S. and English-language books (Abrizah and Thelwall 2014; Weiss 2016).

3. https://www.google.com/googlebooks/about/history.html

The conveniences of Google Books include limits by the type of book availability (e.g., free e-books vs. Google e-books), document type, and date. The detail view of a book allows magnification, hyperlinked tables of contents, buying and "Find in a Library" options, "My Library," and user history (Whitmer 2015). Google Books also offers textbook rental (Harper 2016) and limited print-on-demand services for out-of-print books (Mays 2015; Boumenot 2015).

In April 2016, the Supreme Court declined to hear the Authors Guild's appeal, affirming Google's right to make copies for its index without paying or even obtaining permission from copyright holders (Authors Guild 2016; Los Angeles Times 2016). Scanning of library books and "snippet view" was deemed fair use: "The purpose of the copying is highly transformative, the public display of text is limited, and the revelations do not provide a significant market substitute for the protected aspects of the originals" (U.S. Court of Appeals for the Second Circuit 2015).

Literature concerning high-level implications of Google Books suggests the tool is having a profound effect on research and scholarship. The tool has been credited for serving as "a huge laboratory" for indexing, interpretation, working with document image repositories, and other activities (Jones 2010). At the same time, the academic community has expressed concerns about Google Books' effects on social justice and how its full-text search capability may change the very nature of discovery (Hoffmann 2014; Hoffmann 2016; Szpiech 2014). One study found that books are far more prevalently cited in Wikipedia than are research articles (Kousha and Thelwall 2017). Yet investigations of Google Books' coverage and utility as a research tool seem to be sorely lacking. As Weiss noted, "no critical studies seem to exist on the effect that Google Books might have on the contemporary reference experience" (Weiss 2016, 293). Furthermore, no information was found concerning how many users are taking advantage of Google Books; the tool was noticeably absent from surveys such as Inger and Gardner's (2016) and from research centers such as the Pew Internet Research Project.
In a largely descriptive review, Harper (2016) bemoaned Google Books' lack of integration with link resolvers and discovery tools, and judged it lacking in relevant material for the health sciences, because so much of the content is older. She also noted the majority of books scanned are in English, which could skew scholarship. This English-language skew was also lamented by Weiss, who noted an "underrepresentation of Spanish and overestimation of French and German (or even Japanese for that matter)," especially as compared to the number of Spanish speakers in the United States (Weiss 2016, 286-306).

Whitmer (2015) and Mays (2015) provided practical information about how Google Books can be used as a reference tool. Whitmer presented major Google Books features and challenged librarians to teach Google Books during library instruction. Mays conducted a cursory search on the 1871 Chicago Fire and described the primary documents she retrieved as "pure gold," including records of city council meetings, notes from insurance companies, reports from relief societies, church sermons on the fire, and personal memoirs (Mays 2015, 22). Mays also described Google Books as a godsend to genealogists for finding local records (e.g., police departments, labor unions, public schools). In her experience, the geographic regions surrounding the forty participating Google Books Library Project libraries are "better represented than other areas" (Mays 2015, 25). Mays concluded, "Its poor indexing and search capabilities are overshadowed by the ease of its fulltext search capabilities and the wonderful ephemera that enriches its holdings far beyond mere 'books'" (Mays 2015, 26).

Abrizah and Thelwall (2014) investigated whether Google Books and Google Scholar provided "good impact data for books published in non-Western countries." They used a comprehensive list of arts, humanities, and social sciences books (n=1,357) from the five main university presses in Malaysia, 1961-2013. They found only 23% of the books were cited in Google Books4 and 37% in Google Scholar (p. 2502). The overlap was small: only 15% were cited in both Google Scholar and Google Books. English-language books were more likely to be cited in Google Books; 40% of English-language books were cited versus 16% of Malay-language books. Examining the top 20 books cited in Google Books, researchers found them to be mostly written in English (95% in Google Books vs. 29% in the sample) and published by University of Malaysia Press (60% in Google Books vs. 26% in the sample) (2505). The authors concluded that due to the low overlap between Google Scholar and Google Books, searching both engines was required to find the most citations to academic books.

4. Google Books does not support citation searching; the researchers searched for the book title to manually find citations to a book.

Kousha and Thelwall (2015; 2011) compared Google Books with Thomson Reuters Book Citation Index (BKCI) to examine its suitability for scholarly impact assessment and found Google Books to have a clear advantage over BKCI in the total number of citations found within the arts and humanities, but not for the social sciences or sciences. They advised combining results from BKCI with Google Books when performing research impact assessment for the arts and humanities and social sciences, but not using Google Books for the sciences, "because of the lower regard for books among scientists and the lower proportion of Google Books citations compared to BKCI citations for science and medicine" (Kousha and Thelwall 2015, 317).
Microsoft Academic

Microsoft Academic (https://academic.microsoft.com) is an entirely new software product as of 2016. Therefore, the studies cited prior to 2016 refer to different search engines than the one currently available. However, a historical account of the tool and reviewers' opinions was deemed helpful for informing a fuller picture of academic web search engines and pointing to a research agenda. Microsoft Academic was born as Windows Live Academic in 2006 (Carlson 2006), was renamed Live Search Academic after a first year of struggle (Jacsó 2008), and was scrapped two years later after the company recognized it did not have sufficient development support in the United States (Jacsó 2011). Microsoft Research Asia launched a beta tool called Libra in 2009, which redirected to the "Microsoft Academic Search" service by 2011. Early reviews of the 2011 edition of Microsoft Academic Search were promising, although the tool clearly lacked the quantity of data searched by Google Scholar (Jacsó 2011; Hands 2012).

There were a few studies involving Microsoft Academic Search in 2014. Ortega and Aguillo (2014) compared Microsoft Academic Search and Google Scholar Citations for research evaluation and concluded "Microsoft Academic Search is better for disciplinary studies than for analyses at institutional and individual levels. On the other hand, Google Scholar Citations is a good tool for individual assessment because it draws on a wider variety of documents and citations" (1155). As part of a comparative investigation of an automatic method for citation snowballing using Microsoft Academic Search, Choong et al. (2014) manually searched for a sample of 949 citations to journal or conference articles cited from 20 systematic reviews. They found Microsoft Academic Search contained 78% of the cited articles and noted its utility for testing automated methods due to its free API and absence of blocks on automated access. The researchers also tested their method against Google Scholar, but noted "computer-access restrictions prevented a robust comparison" (n.p.).

Also in 2014, Orduna-Malea et al. (2014) attempted a longitudinal study of disciplines, journals, and organizations in Microsoft Academic Search only to find the database had not been updated since 2013. Furthermore, they found the indexing to be incomplete and still in process, meaning Microsoft Academic Search's presentation of information about any particular publication, organization, or author was distorted. Despite this finding, Microsoft Academic Search was included in two studies of scholar profiles. Ortega (2015) compared scholar profiles across Google Scholar, Microsoft Academic Search, ResearchGate, Academia.edu, and Mendeley, and found little overlap across the sites. The study also found social and usage indicators did not consistently correlate with bibliometric indicators, except on the ResearchGate platform. Social and usage indicators were "influenced by their own social sites," while bibliometric indicators seemed more stable across all services (13). Ward et al. (2015) still included Microsoft Academic Search in their discussion of scholarly profiles as part of the social media network, noting Microsoft Academic Search was painfully time-consuming to work with in terms of consolidating data, correcting items, and adding missing items.

In September 2016, Hug et al. demonstrated the utility of the new Microsoft Academic API by conducting a comparative evaluation of normalized data from Microsoft Academic and Scopus (Hug, Ochsner, and Braendle 2016). They noted Microsoft Academic has "grown massively from 83 million publication records in 2015 to 140 million in 2016" (10). The Microsoft Academic API offers rich, structured metadata with the exception of document type. They found all attributes containing text were normalized and that identifiers were available for all entities, including references, supporting bibliometricians' needs for data retrieval, handling, and processing. In addition to the lack of document type, the researchers also found the "fields of study" to be too granular and dynamic, and their hierarchies incoherent. They also desired the ability to use the DOI to build API requests. Nevertheless, the advantages of Microsoft Academic's metadata and API retrieval suggested to Hug et al. that Microsoft Academic was superior to Google Scholar for calculating research impact indicators and bibliometrics in general.
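As a concrete illustration of the kind of structured retrieval Hug et al. evaluated, the sketch below queries the Microsoft Academic evaluate endpoint for an author's papers. The endpoint URL, subscription-key header, query expression syntax, and attribute codes (Ti for title, Y for year, CC for citation count, ECC for estimated citation count) are assumptions based on Microsoft's Academic Knowledge API documentation of the period, not details reported in the studies reviewed here.

```python
# Hypothetical sketch of a Microsoft Academic "evaluate" request (Academic
# Knowledge API). Endpoint, header, expression syntax, and attribute codes
# are assumptions based on Microsoft's documentation at the time.
import requests

API_URL = "https://api.labs.cognitive.microsoft.com/academic/v1.0/evaluate"
API_KEY = "YOUR-SUBSCRIPTION-KEY"  # placeholder; issued by Microsoft

params = {
    # Query expression: papers by an author, normalized to lowercase
    "expr": "Composite(AA.AuN=='anne-wil harzing')",
    "count": 10,
    # Ti=title, Y=year, CC=citation count, ECC=estimated citation count
    "attributes": "Ti,Y,CC,ECC",
}

response = requests.get(
    API_URL, params=params, headers={"Ocp-Apim-Subscription-Key": API_KEY}
)
response.raise_for_status()

for entity in response.json().get("entities", []):
    print(entity.get("Ti"), entity.get("Y"), entity.get("CC"), entity.get("ECC"))
```

The CC/ECC distinction in the attribute list corresponds to the "estimated citation count" metric discussed below.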
In October 2016, Harzing and Alakangas compared publication and citation coverage of the new Microsoft Academic with Google Scholar, Scopus, and Web of Science using a sample of 145 academics at the University of Melbourne (Harzing and Alakangas 2016a), including observations from 20-40 faculty each in the humanities, social sciences, engineering, sciences, and life sciences. They discovered Microsoft Academic had improved substantially since their previous study (Harzing 2016b), increasing 9.6% for a comparison sample, versus 1.4%, 2%, and 1.7% growth in Google Scholar, Scopus, and Web of Science respectively (n.p.). The researchers noted a few problems with data quality, "although the Microsoft Academic team have indicated they are working on a resolution" (n.p.). On average, the researchers found that Microsoft Academic found 59% as many citations as Google Scholar, 97% as many citations as Scopus, and 108% as many citations as Web of Science. Google Scholar had the top counts for each disciplinary area, followed by Scopus except in the social sciences and humanities, where Microsoft Academic ranked second. The researchers explained that Microsoft Academic "only includes citation records if it can validate both citing and cited papers as credible," as established through a machine-learning-based system, and discussed an emerging metric of "estimated citation count" also provided by Microsoft Academic. The researchers concluded that Microsoft Academic promises to be "an excellent alternative for citation analysis" and suggested Microsoft should work to improve coverage of books and grey literature.

Google Scholar

Google Scholar was released in beta form in November 2004 and was expanded to include judicial case law in 2009. While Google Scholar has received much attention in academia, it seems to be regarded by Google as a niche product: in 2011, Google removed Scholar from the list of top services and the list of "more" services, relegating it to the "even more" list. In 2014, the Scholar team consisted of just nine people (Levy 2014).
Describing Google Scholar in an introductory manner is not helped by Google's vague documentation, which simply says it "includes scholarly articles from a wide variety of sources in all fields of research, all languages, all countries, and over all time periods."5 The "wide variety of sources" includes "journal papers, conference papers, technical reports, or their drafts, dissertations, pre-prints, post-prints, or abstracts," as well as court opinions and patents, but not "news or magazine articles, book reviews, and editorials." Books and dissertations uploaded to Google Book Search are "automatically" included in Scholar. Google says abstracts are key, noting "Sites that show login pages, error pages, or bare bibliographic data without abstracts will not be considered for inclusion and may be removed from Google Scholar."

5. https://scholar.google.com/intl/en/scholar/inclusion.html

Studies of Google Scholar can be divided into three major categories of focus: investigating the coverage of Google Scholar; the use and utility of Google Scholar as part of the research process; and Google Scholar's utility for bibliographic measurement, including evaluating the productivity of individual researchers and the impact of journals. There is some overlap across these categories, because studies of Google Scholar seem to involve three questions: 1) What is being searched? 2) How does the search function? and 3) To what extent can the user usefully accomplish her task?

The Coverage of Google Scholar

Scholars want to know what "scholarship" is covered by Google Scholar, but the documentation merely states that it indexes "papers, not journals"6 and challenges researchers to investigate Google Scholar's coverage empirically despite Google Scholar's notoriously challenging technical limitations. While some limitations of Google Scholar have been corrected over the years, longstanding logistical hurdles involved with studying Google Scholar's coverage have been well-documented for over a decade (Shultz 2007; Bonato 2016; Haddaway et al. 2015; Levay et al. 2016), and include:

• Search queries are limited to 256 characters
• No more than 1,000 results can be retrieved
• No more than 20 results can be displayed per page
• Batches of results cannot be downloaded (e.g., to load into citation management software)
• Duplicate citations (beyond the multiple article "versions") require manual screening
• Advanced and Basic searches retrieve different results
• No designation of the format of items (e.g., conference papers)
• Minimal sort options for results
• Basic Boolean operators only7
• Illogical interpretation of Boolean operators: esophagus OR oesophagus and oesophagus OR esophagus return different numbers of results (Boeker, Vach, and Motschall 2013)
• Non-disclosure of the algorithm by which search results are sorted

6. https://www.google.com/intl/en/scholar/help.html#coverage
7. E.g., no nesting of logical subexpressions deeper than one level (Boeker, Vach, and Motschall 2013) and no truncation operators.

Additionally, one study reported experiencing an automated block to the researcher's IP address after the export of approximately 180 citations or 180 individual searches (Haddaway et al. 2015, 14).
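These constraints shape how complex search strategies must be translated for Google Scholar; in particular, the 256-character cap forces long synonym lists to be split into several shorter queries whose results are pooled and deduplicated afterward. The following minimal sketch shows one way to do that splitting; the search terms are hypothetical, and the pooling step is left out.

```python
# Minimal sketch: split a long OR-list of synonyms into several Google Scholar
# queries that each stay under the 256-character limit. The terms are
# hypothetical; pooling and deduplication of results happen downstream.
MAX_QUERY_CHARS = 256

def chunk_or_query(terms: list[str], limit: int = MAX_QUERY_CHARS) -> list[str]:
    """Greedily pack quoted terms into OR-joined queries under the limit."""
    queries, current = [], []
    for term in terms:
        candidate = " OR ".join(current + [f'"{term}"'])
        if current and len(candidate) > limit:
            queries.append(" OR ".join(current))
            current = [f'"{term}"']
        else:
            current.append(f'"{term}"')
    if current:
        queries.append(" OR ".join(current))
    return queries

# Hypothetical synonym list from a systematic-review search strategy
synonyms = ["information literacy", "library instruction",
            "bibliographic instruction", "information skills",
            "library skills training"]
for q in chunk_or_query(synonyms):
    print(len(q), q)
```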
Furthermore, the Research Excellence Framework was unable to use Google Scholar to assess the quality of research in UK higher education institutions because of researchers' inability to agree with Google on a "suitable process for bulk access to their citation information, due to arrangements that Google Scholar have in place with publishers" (Research Excellence Framework 2013, 1562). Such barriers can limit what can be studied and also cost researchers significant time in terms of downloading (Prins et al. 2016) and cleaning citations (Levay et al. 2016).

Despite these hurdles, research activity analyzing the coverage of Google Scholar has continued in the past two years, often building off previous studies. This section will first discuss Google Scholar's size and ranking, followed by its coverage of articles and citations, then its coverage of books, grey literature, and open access and institutional repositories.

Google Scholar Size and Ranking

In a 2014 study, Khabsa and Giles estimated there were at least 114 million English-language scholarly documents on the Web, of which Google Scholar had "nearly 100 million." Another study by Orduna-Malea, Ayllón, Martín-Martín, and López-Cózar (2015) estimated that the total number of documents indexed by Google Scholar, without any language restriction, was between 160 and 165 million. By comparison, in 2016 the author's discovery tool contained about 168 million items in academic journals, conference materials, dissertations, and reviews.8

8. The discovery tool does not contain all available metadata but has been carefully vetted (Fagan and Gaines 2016).

Google Scholar's presence in the information marketplace has influenced vendors to increase the discoverability of their content, including pushing for the display of abstracts and/or the first page of articles (Levy 2014). ProQuest and Gale indexes were added to Google Scholar in 2015 (Quint 2016). Martín-Martín et al. (2016b) noted that Google Scholar's agreements with big publishers come at a price: "the impossibility of offering an API," which would support bibliometricians' research (54).

Google Scholar's results ranking "aims to rank documents the way researchers do, weighing the full text of each document, where it was published, who it was written by, as well as how often and how recently it has been cited in other scholarly literature."9 Martín-Martín and his colleagues (2017, 159) conducted a large, longitudinal study of null query results in Google Scholar and found a strong correlation between result list ranking and times cited. The influence of citations is so strong that when the researchers performed the same search process four months later, 14.7% of documents were missing in the second sample, causing them to conclude that even a change of one or two citations could lead to a document being excluded from or included in the top 1,000 results (157). Using citation counts as a major part of the ranking algorithm has been hypothesized to produce the "Matthew Effect," where "work that is already influential becomes even more widely known by virtue of being the first hit from a Google Scholar search, whereas possibly meritorious but obscure academic work is buried at the bottom" (Antell et al. 2013, 281).

9. https://www.google.com/intl/en/scholar/about.html
Google Scholar has been shown to heavily bias its ranking toward English-language publications even when there are highly cited non-English publications in the result set, although selection of interface language may influence the ranking. Martín-Martín and his colleagues noted that Google Scholar seems to use the domain of the document's hosting web site as a proxy for language, meaning that "some documents written in English but with their primary version hosted in non-Anglophone countries' web domains do appear in lower positions in spite of receiving a large number of citations" (Martín-Martín et al. 2017, 161). This effect is shown dramatically in Figure 3 of their paper.

Google Scholar Coverage: Articles and Citations

The coverage of articles, journals, and citations by Google Scholar has been commonly examined by using brute-force methods to retrieve a sample of items from Google Scholar and possibly one or more of its competitors. (Studies discussed in this section are listed in Table 1.) The goal is usually to determine how well Google Scholar's database compares to traditional research databases, usually in a specific field. Core methodology involves importing citations into software such as Publish or Perish (Harzing 2016a), cleaning the data, then performing statistical tests, expert review, or both. Haddaway (2015) and Moed et al. (2016) have written articles specifically discussing methodological aspects.

Recent studies repeatedly find that Google Scholar's coverage meets or exceeds that of other search tools, no matter what is identified by target samples, including journals, articles, and citations (Karlsson 2014; Harzing 2014; Harzing 2016b; Harzing and Alakangas 2016b; Moed, Bar-Ilan, and Halevi 2016; Prins et al. 2016; Wildgaard 2015; Ciccone and Vickery 2015). In only three studies did Google Scholar find fewer items, and the meaningful difference was minimal.10

10. For example, Bramer, Giustini, and Kramer (2016a) found slightly more of their 4,795 references from systematic reviews in Embase (97.5%) than in Google Scholar (97.2%). In Testa (2016), the music database RILM indexed two more of the 84 OA journals than Google Scholar (which indexed at least one article from 93% of the journals). Finally, in a study using citations to the most-cited article of all time as a sample, Web of Science found more citations than did Google Scholar (Winter, Zadpoor, and Dodou 2014).

Science disciplines were the most studied in Google Scholar, including agriculture, astronomy, chemistry, computer science, ecology, environmental science, fisheries, geosciences, mathematics, medicine, molecular biology, oceanography, physics, and public health. Social sciences studied include education (Prins et al. 2016), economics (Harzing 2014), geography (Ştirbu et al. 2015, 322-329), information science (Winter, Zadpoor, and Dodou 2014; Harzing 2016b), and psychology (Pitol and De Groote 2014). Studies related to the arts or humanities 2014-2016 included an analysis of open access journals in music (Testa 2016) and a comparison between Google Scholar and Web of Science for research evaluation within education, pedagogical sciences, and anthropology11 (Prins et al. 2016). Wildgaard (2015) and Bornmann et al. (2016) included samples of humanities scholars as part of bibliometric studies but did not discuss disciplinary aspects related to coverage. Prior to 2014, the only study found related to the arts and humanities compared Google Scholar with Historical Abstracts (Kirkwood Jr. and Kirkwood 2011).

11. Prins et al. classified anthropology as part of the humanities.
Google Scholar's coverage has been growing over time (Meier and Conkling 2008; Harzing 2014; Winter, Zadpoor, and Dodou 2014; Bartol and Mackiewicz-Talarczyk 2015, 531; Orduña-Malea and Delgado López-Cózar 2014), with recent increases in older articles (Winter, Zadpoor, and Dodou 2014; Harzing and Alakangas 2016b), leading some to question whether this supports the documented trend of increased citation of older literature (Martín-Martín et al. 2016c; Varshney 2012). Winter et al. noted that in 2005 Web of Science yielded more citations than Google Scholar for about two-thirds of their sample, but for the same sample in 2013, Google Scholar found more citations than Web of Science, with only 6.8% of citations not retrieved by Google Scholar (Winter, Zadpoor, and Dodou 2014, 1560). The unique citations of Web of Science were "typically documents before the digital age and conference proceedings not available online" (Winter, Zadpoor, and Dodou 2014, 1560). Harzing and Alakangas's (2016b) large-scale longitudinal comparison of Google Scholar, Scopus, and Web of Science suggested that Google Scholar's retroactive expansion has stabilized and now all three databases are growing at similar rates.

Google Scholar also seems to cover both the oldest and the most recent publications. Unlike traditional abstracts and indexes, Google Scholar is not limited by starting year, so as publishers post tables of contents of their earliest journals online, Google Scholar discovers those sources (Antell et al. 2013, 281). Trapp (2016) reported the number of citations to a highly-cited physics paper after the first 11 days of publication to be 67 in Web of Science, 72 in Scopus, and 462 in Google Scholar (Trapp 2016, 4). In a study of 800 citations to Nobelists in multiple fields, Harzing found that "Google Scholar could effectively be 9–12 months ahead of Web of Science in terms of publication and citation coverage" (2013, 1073).

An increasing proportion of journal articles in Google Scholar are freely available in full text. A large-scale, longitudinal study of highly-cited articles 1950-2013 found 40% of article citations in the sample were freely available in full text (Martín-Martín et al. 2014). Another large-sample study found 61% of articles in their sample from 2004–2014 could be freely accessed (Jamali and Nabavi 2015). In both studies, nih.gov and ResearchGate were the top two full-text providers.

Google Scholar's coverage of major publisher content varies; having some coverage of a publisher does not imply all articles or journals from that publisher are covered. In a sample of 222 citations compared across Google Scholar, Scopus, and Web of Science, Google Scholar contained all of the Springer titles, as many Elsevier titles as Scopus, and the most articles by Wolters Kluwer and John Wiley.
However, among the three databases, Google Scholar contained the fewest articles by BMJ and Nature (Rothfus et al. 2016).

Study: (Bartol and Mackiewicz-Talarczyk 2015). Sample: Documents retrieved in response to searches on crops and fibers in article titles, 1994-2013 (samples varied by crop). Results: Google Scholar returned more documents for each crop; for example, "hemp" retrieved 644 results in Google Scholar, 493 in Scopus, and 318 in Web of Science. Google Scholar demonstrated higher yearly growth of records over time.

Study: (Bramer, Giustini, and Kramer 2016b). Sample: References from a pool of systematic reviewer searches in medicine (n=4,795). Results: Google Scholar found 97.2%, Embase 97.5%, and MEDLINE 92.3% of all references; when using search strategies, Embase retrieved 81.6%, MEDLINE 72.6%, and Google Scholar 72.8%.

Study: (Ciccone and Vickery 2015). Sample: 183 user searches randomly selected from NCSU Libraries' 2013 Summon search logs (n=137). Results: No significant difference between the performance of Google Scholar, Summon, and EDS for known-item searches; "Google Scholar outperformed both discovery services for topical searches."

Study: (Harzing 2014). Sample: Publications and citation metrics for 20 Nobelists in chemistry, economics, medicine, and physics, 2012-2013 (samples varied). Results: Google Scholar coverage is now "increasing at a stable rate" and provides "comprehensive coverage across a wide set of disciplines for articles published in the last four decades" (575).

Study: (Harzing 2016b). Sample: Citations from one researcher (n=126). Results: Microsoft Academic found all books and journal articles covered by Google Scholar; Google Scholar found 35 additional publications, including book chapters, white papers, and conference papers.

Study: (Harzing and Alakangas 2016a). Sample: Samples from Harzing and Alakangas (2016b, 802) (samples varied by faculty). Results: Google Scholar provided higher "true" citation counts than Microsoft Academic, but Microsoft Academic "estimated" citation counts were 12% higher than Google Scholar for life sciences and equivalent for the sciences.

Study: (Harzing and Alakangas 2016b). Sample: Citations of the works of 145 faculty among 37 scholarly disciplines at the University of Melbourne (samples varied by faculty). Results: For the top faculty member, Google Scholar had 519 total papers (compared with 309 in both Web of Science and Scopus); Google Scholar had 16,507 citations (compared with 11,287 in Web of Science and 11,740 in Scopus).

Study: (Hilbert et al. 2015). Sample: Documents published by 76 information scientists in German-speaking countries (n=1,017). Results: Google Scholar covered 63%; Scopus, 31%; BibSonomy, 24%; Mendeley, 19%; Web of Science, 15%; CiteULike, 8%.

Study: (Jamali and Nabavi 2015). Sample: Items published between 2004 and 2014 (n=8,310). Results: 61% of articles were freely available; of these, 81% were publisher versions and 14% were pre-prints. ResearchGate was the top full-text source, netting 10.5% of full-text sources, followed by ncbi.nlm.nih.gov (6.5%).

Study: (Karlsson 2014). Sample: Journals from ten different fields (n=30). Results: Google Scholar retrieved documents from all the selected journals; Summon only retrieved documents from 14 out of 30 journals.
Study: (Lee et al. 2015). Sample: Journal articles housed in Florida State University's institutional repository (n=170). Results: Metadata found in Google for 46% of items and in Google Scholar for 75% of items; Google Scholar found 78% of available full text, including full text for six items with no full text in the IR.

Study: (Martín-Martín et al. 2014). Sample: Items highly cited by Google Scholar (n=64,000). Results: 40% could be freely accessed using Google Scholar; nih.gov and ResearchGate were the top two full-text providers.

Study: (Moed, Bar-Ilan, and Halevi 2016). Sample: Citations to 36 highly cited articles in 12 scientific-scholarly English-language journals (n=about 7,000). Results: 47% of sources were in both Google Scholar and Scopus; 47% of sources were in Google Scholar only; 6% of sources were in Scopus only. Of the unique Google Scholar citations, sources were most often from Google Books, Springer, SSRN, ResearchGate, ACM Digital Library, Arxiv, and ACLweb.org.

Study: (Prins et al. 2016). Sample: Article citations in the field of education and pedagogies, and citations to 328 articles in anthropology (n=774). Results: Google Scholar found 22,887 citations in Education & Pedagogical Science compared to Web of Science's 8,870, and 8,092 in Anthropology compared with Web of Science's 1,097.

Study: (Ştirbu et al. 2015). Sample: Citations resulting from two geographical topic searches (samples varied). Results: For one search, Google Scholar found 2,732 geographical references whereas Web of Science found only 275, GeoRef 97, and FRANCIS 45. For sedimentation, Google Scholar found 1,855 geographical references compared to Web of Science's 606, GeoRef's 1,265, and FRANCIS's 33. Google Scholar overlapped Web of Science by 67% and 82% for the two searches, and GeoRef by 57% and 62%.

Study: (Testa 2016). Sample: Open access journals in music (n=84). Results: Google Scholar indexed at least one article from 93% of OA journals; RILM indexed two additional journals.

Study: (Wildgaard 2015). Sample: Publications from researchers in astronomy, environmental science, philosophy, and public health (n=512). Results: Publication count from Web of Science was 2-4 times lower for all disciplines than Google Scholar; citation count was up to 13 times lower in Web of Science than in Google Scholar.

Study: (Winter, Zadpoor, and Dodou 2014). Sample: Growth of citations to 2 classic articles (1995-2013) and 56 science and social science articles in Google Scholar, 2005-2013 (samples varied). Results: Total citation counts were 21% higher in Web of Science than Google Scholar for Lowry (1951), but Google Scholar was 17% higher than Web of Science for Garfield (1955) and 102% higher for the 56 research articles; Google Scholar showed a significant retroactive expansion for all articles compared to negligible retroactive growth in Web of Science.

Table 1. Studies investigating Google Scholar's coverage of journal articles and citations, 2014-2016.

Google Scholar Coverage: Books

Many studies mentioned that books, including Google Books, are sometimes included in Google Scholar results. Jamali and Nabavi (2015) found 13% of their sample of 8,310 citations from Google Scholar were books, while Martín-Martín et al. (2014) had found that 18% of their sample of 64,000 citations from Google Scholar were books. Within the field of anthropology, Prins et al. (2016) found books to generate the most citation impact in Google Scholar (41% of books in their sample were cited in Google Scholar) compared to articles (21% of articles were cited in Google Scholar). In education, 31% of articles and 25% of books were cited by Google Scholar (3).
Abrizah and Thelwall found only 37% of their sample of 1,357 arts, humanities, and social sciences books from the five main university presses in Malaysia had been cited in Google Scholar (23% of the books had been cited in Google Books) (Abrizah and Thelwall 2014, 2502). The overlap was small: 15% had impact in both Google Scholar and Google Books. The authors concluded that due to the low overlap between Google Scholar and Google Books, searching both engines is required to find the most citations to academic books. English books were significantly more likely to be cited in Google Scholar (48% vs. 32%), as were edited books (53% vs. 36%). They surmised edited books' citation advantage was due to the use of book chapters in the social sciences. They found arts and humanities books more likely to be cited in Google Scholar than social sciences books (40% vs. 34%) (Abrizah and Thelwall 2014, 2503).

Google Scholar Coverage: Grey Literature

Grey literature refers to documents not published commercially, including theses, reports, conference papers, government information, and poster sessions. Haddaway et al. (2015) was the only empirical study found that focused on grey literature. They discovered that between 8% and 39% of full-text search results from Google Scholar were grey literature, with the greatest concentration of citations from grey literature on page 80 of results for full-text searches and page 35 for title searches. They concluded "the high proportion of grey literature that is missed by Google Scholar means it is not a viable alternative to hand searching for grey literature as a stand-alone tool" (2015, 14). For one of the systematic reviews in their sample, none of the 84 grey literature articles cited were found within the exported Google Scholar search results. The only other investigation of grey literature found was Bonato (2016), who, after conducting a very limited number of searches on one specific topic and a search for a known item, concluded Google Scholar to be "deficient." In conclusion, despite much offhand praise for Google Scholar's grey literature coverage (Erb and Sica 2015; Antell et al. 2013), the topic has been little studied, and when it has, grey literature results have not been prominent.

Google Scholar Coverage: Open Access and Institutional Repository Content

Erb and Sica touted Google Scholar's access to "free content that might not be available through a library's subscription services," including open access journals and institutional repository coverage (2015, 48). Recent research has dug deeper into both these content areas. In general, OA articles have been shown to net more citations than non-OA articles, as Koler-Povh, Južnic, and Turk (2014) showed within the field of civil engineering. Across their sample of 2,026 scholarly articles in 14 journals, all indexed in Web of Science, Scopus, and Google Scholar, OA articles received an average of 43 citations while non-OA articles were cited 29 times (1039). Google Scholar did a better job discovering those citations; in Google Scholar the median of citations of OA articles was always higher than that for non-OA articles, whereas this was true in Web of Science for only 10 of the 14 journals and in Scopus for 11 of the 14 journals (1040).
Similarly, Chen (2014) found Google Scholar to index far more OA journals than Scopus and Web of Science, especially "gold OA."12 Google Scholar's advantage should not be assumed across all disciplines, however; Testa (2016) found both Google Scholar and RILM to provide good coverage of OA journals in music, with Google Scholar indexing at least one article from 93% of the 84 OA journals in the sample. But the bibliographic database RILM indexed two more OA journals than Google Scholar.

12. OA articles on publisher web sites, whether the journal itself is OA or not (Chen 2014).

Google Scholar indexing may be critical for institutional repositories' success, but results vary by IR platform and by whether the IR metadata has been structured according to Google's guidelines. In a random sample from Shodhganga, India's central ETD database, Weideman (2015) found not one article had been indexed in full text by Google Scholar, although in many cases the metadata was indexed, leading the author to identify needed changes to the way Shodhganga stores ETDs.13 Likewise, Chen (2014) found that neither Google Scholar nor Google appears to index Baidu Wenku, a major full-text archive and social networking site in China similar to ResearchGate, and Orduña-Malea and López-Cózar (2015) found that Latin American repositories are not very visible in Google or Google Scholar due to limitations of the description schemas chosen as well as search engine reliability. In Yang's (2016) study of Texas Tech's DSpace IR, Google was the only search engine that indexed, discovered, or linked to PDF files supplemented with metadata; Google Scholar did not discover or provide links to the IR's PDF files and was less successful at discovering metadata.

13. Most notably, the need to store thesis documents as one PDF file instead of divided into multiple, separate files, to create HTML landing pages as per Google's recommendations, and to submit the addresses of these pages to Google Scholar.

When Google Scholar is able to index IR content, it may be responsible for significant traffic. In a study of four major U.S. universities' institutional repositories (three DSpace, one CONTENTdm) involving a dataset of 57,087 unique URLs and 413,786 records, researchers found that 48%–66% of referrals came from Google Scholar (Obrien et al. 2016, 870). The importance of Google Scholar in contrast to Google was noted by Lee et al. (2015), who conducted title searches on 170 journal articles housed in Florida State University's institutional repository (using bepress's Digital Commons platform), 100 of which existed in full text in the IR. Links to the IR were found in Google results for 45.9% of the 170 items, and in Google Scholar for 74.7% of the 170 items. Furthermore, Google Scholar linked to the full text for 78% of the 100 cases where full text was available, and even provided links to freely available full text for six items that did not have full text in the IR. However, the researchers also noted "relying on either Google or Google Scholar individually cannot ensure full access to scholarly works housed in OA IRs." In their study, among the 104 fully open access items there was an overlap in results of only 57.5%; Google provided links to 20 items not found with Google Scholar, and Google Scholar provided links to 25 items not found with Google (Lee et al. 2015, 15). Google Scholar results note the number of "versions" available for each item.
In a study of 982 science article citations (including both OA and non-OA) in IRs, Pitol and De Groote found 56% of citations had between four and nine Google Scholar versions (2014, 603). Almost 90% of the citations shown were the publisher version, but of these, only 14.3% were freely available in full text on the publisher web site. Meanwhile, 70% of the items had at least one free full-text version available through a "hidden" Google Scholar version. The author's experience in retrieving full text for this review indicates this issue still exists, but research would be needed to formulate reliable recommendations for users.

Use and Utility of Google Scholar as Part of the Research Process

Studies were found concerning Google Scholar's popularity with users and their reasons for preferring it (or not) over other tools. Another group of studies examined issues related to the utility of Google Scholar for research processes, including issues related to messy metadata. Finally, a cluster of articles focused specifically on using Google Scholar for systematic reviews.

Popularity and User Preferences

Several studies have shown Google Scholar to be well-known to scholarly communities. A survey of 3,500 scholars from 95 countries found that over 60% of scientists and engineers, and over 70% of respondents in the social sciences, arts, and humanities, were aware of Google Scholar and used it regularly (Van Noorden 2014). In a large-scale journal-reader survey, Inger and Gardner (2016) found that among academic researchers in high-income areas, academic search engines surpassed abstracts and indexes as a starting place for research (2016, 85, Figure 4). In low-income areas, Google use exceeded Google Scholar use for academic research.

Major library link resolver software offers reports of full-text requests broken down by referrer. Inger and Gardner (2016) showed a large variance across subjects for whether people prefer Google or Google Scholar: "People in the social sciences, education, law, and business use Google Scholar more to find journal articles. However, people working in the humanities and religion and theology prefer to use Google" (88). Humanities scholars' use of Google over Google Scholar was also found by Kemman et al. (2013); Google, Google Images, Google Scholar, and YouTube were used more than JSTOR or other library databases, even though humanities scholars' trust in Google and Google Scholar was lower.

User research since 2014 concerning Google Scholar has focused on graduate students. Results suggest Scholar is used regularly, but the tool is only partially sufficient. In their study of 20 engineering master's students' use of abstracts and indexes, Johnson and Simonsen (2015) found that half their sample (n=20) had used Google Scholar the last time they located an article using specific search terms or criteria. Google was the second most-used source at 20%, followed by abstracting and indexing services (15%).
Graduate students in both studies were well aware of Google Scholar’s use for citation searching. Bøyum and Aabø’s (2015) subjects described library resources as more “academically robust” than Google or Google Scholar. Wu and Chen’s (2014) interviewees praised Google Scholar for its wider coverage and convenience, but lamented the uncertain quality, sometimes inaccessible full text, too many results, lack of sorting function (document type or date), finding documents from different disciplines, and duplicate citations. Google Scholar was seen by their subjects as useful during early stages of information seeking. In contrast to general assumptions, more than half the students (Wu and Chen 2014, 381) interviewed reported browsing more than 3 pages’ worth of Google Scholar results. About half of interviewees reported looking at cited documents to find more, however students had mixed opinions about whether the citing documents turned out to be relevant. Google Scholar’s “My Library” feature, introduced in 2013, now competes with other bibliographic citation management software. In a survey of 344 (mostly graduate) students, Conrad, Leonard, and Somerville found Google Scholar was the most-used (47%) followed by EndNote (37%), and Zotero (19%) (2015, 572). Follow-up interviews with 13 of the students revealed that a few students used multiple tools, for example one participant noted he/she used “EndNote for sharing data with lab partners and others “across the community”; Mendeley for her own personal thesis work, where she needs to “build a whole body of literature”; and Google Scholar Citations for “quick reference lists that I may not need for a second or third time.” Messy Metadata Many studies have suggested Google Scholar’s metadata is “messy.” Although none in the period of study examined this phenomenon in conjunction with relative user performance, the issues found could affect scholarship. A 2016 study itemized the most common mistakes in Google Scholar resulting from its extraction process: 1) incorrect title identification; 2) missing or incorrectly assigned authors; 3) book reviews indexed as books; 4) failing to group versions of the same document, which inflates citation counts; 5) grouping different editions of books, which deflates citation counts; 6) attributing citations to documents that did not cite them, or missing citations that did; and 7) duplicate author profiles (Martín-Martín et al. 2016b). The authors concluded that “in an academic big data environment, these errors (which we deem affect less than 10% of the records in the database) are of no great consequence, and do not affect the core system INFORMATION TECHNOLOGY AND LIBRARIES | JUNE 2017 25 performance significantly” (54). Two of these issues have been studied specifically: duplicate citations and missing publication dates. The rate of duplicate citations in Google Scholar has ranged upwards of 2.93% (Haddaway et al. 2015) and 5% (Winter, Zadpoor, and Dodou 2014, 1562), which can be compared to a .05% duplicate citation rate in Web of Science (Haddaway et al. 2015, 13). Haddaway found the main reasons for duplication include “typographical errors, including punctuation and formatting differences; capitalization differences (Google Scholar only), incomplete titles, and the fact that Google Scholar scans citations within reference lists and may include those as well as the citing article” (2015, 13). The issue of missing publication dates varies greatly across samples. 
The issue of missing publication dates varies greatly across samples. Dates were found to be missing 9% of the time in Winter et al.'s study, although it varied by publication type: 4% of journals, 15% of theses, and 41% of the unknown document types (Winter, Zadpoor, and Dodou 2014, 1562). However, Martín-Martín et al. studied a sample of 32,680 highly-cited documents and found that Web of Science and Google Scholar agreed on publication dates 96.7% of the time, with an idiosyncratically large proportion of those mismatches in 2012 and 2013 (2017, 159).

Utility for Research Processes

Prior to 2014, studies such as Asher, Duke, and Wilson's (2012) evaluated Google Scholar's utility as a general research tool, often in comparison with discovery tools. Since 2014, the only such study found was Namei and Young's comparison of Summon, Google Scholar, and Google using 299 known-item queries. They found Google Scholar and Summon returned relevant results 74% of the time; Google returned relevant results 91% of the time. For "scholarly formats," they found Summon returned relevant results 76% of the time; Google Scholar, 79%; and Google, 91% (2015, 526-527).

The remainder of studies in this category focused specifically on systematic reviews, perhaps because such reviews are so time-consuming. Authors develop search strategies carefully, execute them in multiple databases, and document their search methods and results carefully. Some prestigious journals are beginning to require similar rigor for any original research article, not just systematic reviews (Cals and Kotz 2016). Information provided by professional organizations about the use of Google Scholar for systematic reviews seems inconsistent: the Cochrane Handbook for Systematic Reviews of Interventions lists Google Scholar among sources for searching, but none of the five "highlighted reviews" on the Cochrane web site at the time of this article's writing used Google Scholar in their methodologies. The manual of the UK organization National Institute for Health and Care Excellence (NICE) mentions Google Scholar only in an appendix of search sources under "Conference Abstracts."

A study by Gehanno et al. (2013) found Google Scholar contained 100% of the references from 29 systematic reviews and suggested Google Scholar could be the first choice for systematic reviews or meta-analyses. This finding prompted a slew of follow-up studies in the next three years. An immediate response by Giustini and Boulos (2013) pointed out that systematic reviews are not performed by searching for article titles as with Gehanno et al.'s method, but through search strategies. When they tried to replicate a systematic review's topical search strategy in Google Scholar, the citations were not easily discovered. In addition, the authors were not able to find all the papers from a given systematic review even by title searching. Haddaway et al. also found imperfect coverage: for one of the seven reviews examined, 31.5% of citations could not be found (2015, 11). Haddaway also noted that special characters and fonts (as with chemical symbols) can cause poor matching when such characters are part of article titles. Recent literature concurs that it is still necessary to search multiple databases when conducting a systematic review, including abstracts and indexes, no matter how good Google Scholar's coverage seems to be.
No one database's coverage is complete, including Google Scholar's (Thielen et al. 2016), and the practical recall of Google Scholar is exceptionally low due to the 1,000-result limit; yet at the same time, Google Scholar's lack of precision is costly in terms of researchers' time (Bramer, Giustini, and Kramer 2016b; Haddaway et al. 2015). The challenges limiting study of Google Scholar's coverage also bedevil those wishing to use it for reviews, especially the 1,000-result retrieval limit, lack of batch export, and lack of exported abstracts (Levay et al. 2016). Additionally, Google Scholar's changing content, unknown algorithm and updating practices, search inconsistencies, limited Boolean functions, and 256-character query limit prevent the tool from accommodating the detailed, reproducible search methodologies required by systematic reviews (Bonato 2016; Haddaway et al. 2015; Giustini and Boulos 2013). Bonato noted Google Scholar retrieved different results with Advanced and Basic searches, could not determine the format of items (e.g., conference papers), and found other inconsistent results.14 Bonato also lamented the lack of any kind of document type limit.

14. Bonato (2016) found zero hits for conference papers when limiting by year 2015-2016, but found two papers presented at a 2015 meeting.

Despite the limitations and logistical challenges, practitioners and scholars are finding solid reasons for including academic web search engines as part of most systematic review methodologies (Cals and Kotz 2016). Stansfield et al. noted that "relevant literature for low- and middle-income countries, such as working and policy papers, is often not included in databases," and that Google Scholar finds additional journal articles and grey literature not indexed in databases (2016, 191). For eight systematic reviews by the EPPI-Centre, "over a quarter of relevant citations were found from websites and internet search engines" (Stansfield, Dickson, and Bangpan 2016, 2).

Specific tools and practices have been recommended when using search engines within the context of systematic reviews. Software is available to record search strategies and results (Harzing and Alakangas 2016b; Haddaway 2015). Haddaway suggests the use of snapshot tools (Haddaway 2015) to record the first 1,000 Google Scholar records rather than the typical assessment of the first 50 search results as had been done in the past: "This change in practice could significantly improve both the transparency and coverage of systematic reviews, especially with respect to their grey literature components" (Haddaway et al. 2015, 15). Both Haddaway (2015) and Cochrane recommend that review authors print or save locally electronic copies of the full text or relevant details rather than bookmarking web sites, "in case the record of the trial is removed or altered at a later stage" (Higgins and Green 2011). New methods for searching, downloading, and integrating academic search engine results into review procedures using free software to increase transparency, repeatability, and efficiency have been proposed by Haddaway and his colleagues (2015).

Google Scholar Citations and Metrics

Google Scholar Citations and Metrics are not academic search engines, but this article includes them because these products are interwoven into the fabric of the Google Scholar database. Google Scholar Citations, launched in late 2011 (Martín-Martín et al. 2016b, 12), groups citations by author, while Google Scholar Metrics (launch date uncertain) provides similar data for articles and journals.
Readers interested in an in-depth literature review of Google Scholar Citations for earlier years (2005-2012) are directed to Thelwall and Kousha (2015b). In his comprehensive review of more recent literature about using Google Scholar Citations for citation analysis, Waltman (2016) described several themes. Google Scholar's coverage of many fields is significantly broader than that of Web of Science and Scopus, and this seems to be continuing to improve over time. However, studies regularly report Google Scholar's inaccuracies, content gaps, phantom data, easily manipulatable citation counts, lack of transparency, and limitations for empirical bibliometric studies.

As discussed in the coverage section, Google Scholar's citation database is competitive with other major databases such as Web of Science and has been growing dramatically in the last few years (Winter, Zadpoor, and Dodou 2014; Harzing and Alakangas 2016b; Harzing 2014) but has recently stabilized (Harzing and Alakangas 2016b). More and more studies are concluding that Google Scholar will report more comprehensive information about citation impact than Web of Science or Scopus. Across a sample of articles from many years of one science journal, Trapp (2016) found the proportion of articles with zero citations was 37% for Web of Science, 29% for Scopus, and 19% for Google Scholar. Some of Google Scholar's superiority for citation analysis in the social sciences and humanities is due to its inclusion of book content, software, and additional journals (Prins et al. 2016; Bornmann et al. 2016). Bornmann et al. (2016) noted citations to all ten of a research institute's books published in 2009 were found in Google Scholar, whereas Web of Science found citations for only two books. Furthermore, they found data in Google Scholar for 55 of the institute's 71 book chapters. For the four conference proceedings they could identify, there were 100 citations, of which 65 could be found in Google Scholar.

The comparative success of Google Scholar for citation impact varies by discipline, however: Levay et al. (2016) found Web of Science to be more reliable than Google Scholar, quicker for downloading results, and better for retrieving 100% of the most important publications in public health. Despite Google Scholar's growth, using all three major tools (Scopus, Web of Science, and Google Scholar) still seems to be necessary for evaluating researcher productivity. Rothfus et al. (2016) compared Web of Science, Scopus, and Google Scholar citation counts for evaluating the impact of the Canadian Network for Observational Drug Effect Studies (CNODES), as represented by a sample of 222 citations from five articles. Attempting to determine citation metrics for the CNODES research team yielded different results for every article when using the three tools. They found that "using three tools (Web of Science, Scopus, Google Scholar) to determine citation metrics as indicators of research performance and impact provided varying results, with poor overall agreement among the three" (237). Major academic libraries' web sites often explain how to find one's h-index in all three (Suiter and Moulaison 2015).
Researchers have also noted the disadvantages of Google Scholar for citation impact studies. Google Scholar is costly in terms of researcher time. Levay et al. (2016) estimated the cost of “administering results” from Web of Science to be 4 hours versus 75 hours for Google Scholar. Administering results includes using the search tool to search, download, and add records to bibliographic citation software, and removing duplicate citations. Duplicate citations are often mentioned as a problem (Prins et al. 2016), although Moed (2016) suggested the double counting by Google Scholar would occur only if the level of analysis is on target sources, not if it is on target articles.15 Downloaded citation samples can still suffer from double counts, however: Harzing and Alakangas (2016b) described how cleaning “a fairly extreme case” in their study reduced the number of papers from 244 to 106. Google Scholar also does not identify self-citations, which can dramatically influence the meaning of results (Prins et al. 2016). Furthermore, researchers have shown it is possible to corrupt Google Scholar Citations by uploading obviously false documents (Delgado López-Cózar, Robinson-García, and Torres-Salinas 2014). While the researchers noted traditional citation indexes can also be defrauded, Google’s products are less transparent, and abuses may not be easily detected. Google did not respond to the research team when contacted and simply deleted the false documents to which it had been alerted without reporting the situation to the affected authors; the researchers concluded: “This lack of transparency is the main obstacle when considering Google Scholar and its by-products for research evaluation purposes” (453).

15 “If a document is, for instance, first published in ArXiv, and a next version later in a journal J, citations to the two versions are aggregated. In Google Scholar Metrics, in which ArXiv is included as a source, this document (assuming that its citation count exceeds the h5 value of ArXiv and journal J) is listed both under ArXiv and under journal J, with the same, aggregate citation count” (Moed 2016, 29).

Because these disadvantages do not outweigh Google Scholar’s seemingly broader coverage, many articles investigate workarounds for using Google Scholar more effectively when evaluating research impact. Harzing and Alakangas (2016b) recommend the hIa index,16 which corrects for career length and co-authorship patterns, as the citation metric of choice for a fair comparison of Google Scholar with other tools. Bornmann et al. (2016) investigated a method to normalize data and reduce errors when using Google Scholar data to evaluate citations in the social sciences and humanities.

16 Harzing and Alakangas (2016b) define the hIa as the hI,norm divided by academic age, where academic age is the number of years elapsed since first publication. To calculate hI,norm, one divides each paper’s citation count by its number of authors and then calculates the h-index of the normalized citation counts.
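Note 16 amounts to a short calculation, sketched below in Python. The formula follows Harzing and Alakangas’s definition as paraphrased in the note; the sample publication list is invented for illustration.

```python
def h_index(citation_counts):
    """h = the largest h such that h items have at least h citations each."""
    ranked = sorted(citation_counts, reverse=True)
    return sum(1 for rank, cites in enumerate(ranked, start=1) if cites >= rank)

def hia(papers, years_since_first_publication):
    """hIa = hI,norm / academic age (per Harzing and Alakangas 2016b).

    `papers` is a list of (citations, number_of_authors) pairs.
    hI,norm is the h-index of the author-normalized citation counts.
    """
    normalized = [cites / n_authors for cites, n_authors in papers]
    hi_norm = h_index(normalized)
    return hi_norm / years_since_first_publication

# Invented example: five papers, first publication 10 years ago.
papers = [(40, 2), (12, 3), (9, 1), (6, 6), (3, 1)]
print(hia(papers, years_since_first_publication=10))  # -> 0.3
```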
Researcher profiles can also be used to find other scholars by topic. In a 2014 survey of researchers (n=8,554), Dagienė and Krapavickaitė found that 22% used a third-party service such as Google Scholar or Microsoft Academic to produce lists of their scholarly activities and 63% reported their scholarly record was freely available on the web (2016, 158, 161). Google Scholar ranked second only to Microsoft Word as the most frequently used software to maintain academic activity records (160). Martín-Martín et al. (2016b) examined 814 authors in the field of bibliometrics using Google Scholar Citations, ResearcherID, ResearchGate, Mendeley, and Twitter. Google Scholar was the most used social research sharing platform, followed by ResearchGate, with ResearcherID gaining wider acceptance among authors deemed “core” to the field. Only about one-third of the authors had created a Twitter profile, and many Mendeley and ResearcherID profiles were found empty. The study found the distinctive advantages of Google Scholar academic profiles to be automatic updates and a high growth rate, and its disadvantages to be scarce quality control, metadata mistakes inherited from Google Scholar, and its manipulability. Overall, Martín-Martín and colleagues concluded that Google Scholar “should be the preferred source for relational and comparative analyses in which the emphasis is put on author clusters” (57).

Google Scholar Metrics provides citation information for articles and journals. In a sample of 1,000 journals, Orduña-Malea and Delgado López-Cózar found that “despite all the technical and methodological problems,” Google Scholar Metrics provides sound and reliable journal rankings (2014, 2365). Google Scholar Metrics seems to be an annual publication; the 2016 edition contains 5,734 publications and 12 language rankings. Russian, Korean, Polish, Ukrainian, and Indonesian were added this year, while Italian and Dutch were removed for unknown reasons (Martín-Martín et al. 2016a). Researchers also found that many discussion papers and working papers were removed in 2016. English-language publications are broken into subject areas and disciplines. Google Scholar Metrics often, but not always, creates separate entries for each language in which a journal is published. Bibliometricians call for Google Scholar Metrics to display the total number of documents published in the publications indexed and the total number of citations received: “These are the two essential parameters that make it possible to assess the reliability and accuracy of any bibliometric indicator” (13). Adding country and language of publication and self-citation rates are among the other improvements listed by Delgado López-Cózar and colleagues.

Informing Practice

The glaring lack of research related to the coverage of arts and humanities scholarship, the limited research on book coverage, and the relaunch of Microsoft Academic make it impossible to form a general recommendation regarding the use of academic web search engines for serious research. Until the ambiguity of arts and humanities coverage is clarified, and until academic web search engines are transparent and stable, traditional bibliographic databases still seem essential for systematic reviews, citation analysis, and other rigorous literature search purposes. Discipline-specific databases also have features such as controlled vocabulary, industry classification codes, and peer review indicators that make scholars more efficient and effective. Nevertheless, the increasing relevance of academic search engines and their solid coverage of the sciences and social sciences make it essential for librarians to become expert with Google Scholar, Google Books, and Microsoft Academic.
For some scholarly tasks, academic search engines may already be superior: for example, when looking up DOIs for this paper’s bibliography, the most efficient process seemed to be a Google search on the article title plus the term “doi,” and the site most likely to display in the results was ResearchGate.17 Librarians and scholars should champion these tools as an important part of an efficient, effective scholarly research process (Walsh 2015), while also acknowledging their gaps in coverage, biases, metadata issues, and missing features that are available in other databases.

17 Because the authority of ResearchGate is ambiguous, in such cases I then looked up the DOI using Google to find the publisher’s version. In some cases, the DOI was not displayed on the publisher’s result page (e.g., https://muse.jhu.edu/article/197091).

Academic web search engines could form the centerpiece for instruction sessions surrounding the scholarly network, as shown by “cited by” features, author profiles, and full-text sources. Traditional abstracts and indexes could then be presented on the basis of their strengths. At some point, explaining how to access full text will likely no longer focus on the link resolver but on the many possible document versions a user might encounter (e.g., preprints or editions of books) and how to make an informed choice. In the meantime, even though web search engines and repositories may retrieve copious full text outside library subscriptions, college students should still be made aware of the library’s collections and services such as interlibrary loan.

When considering Google Scholar’s weaknesses, it’s important to keep in mind Chen’s observation that we may not have a tool available that does any better (Antell et al. 2013). While Google Scholar may be biased toward English-language publications, so are many bibliographic databases. Overall, Google Scholar seems to have increased the visibility of international research (Bartol and Mackiewicz-Talarczyk 2015). While Google Scholar’s coverage of grey literature has been shown to be somewhat uneven (Bonato 2016; Haddaway et al. 2015), it seems to include more diversity among relevant document types than many abstracts and indexes (Ştirbu et al. 2015; Bartol and Mackiewicz-Talarczyk 2015). Although the rigors of systematic reviews may contraindicate the tool’s use as a single source, it adds value to search results from other databases (Bramer, Giustini, and Kramer 2016a). User preferences and priorities should also be taken into account; Google Scholar results have been said to contain “clutter,” but many researchers have found the noise in Google Scholar tolerable given its other benefits (Ştirbu et al. 2015).

Google Books purportedly contains about 30 million items, focused on U.S.-published and English-language books. But its coverage is hit-or-miss, surprising Mays (2015) with an unexpected wealth of primary sources but disappointing Harper (2016) with limited coverage of academic health sciences books. Recent court decisions have enabled Google to continue progressing toward its goal of full-text indexing, and of making snippet views available for the Google-estimated universe of 130 million books, which suggests its utility may increase. Google Books is not integrated with link resolvers or discovery tools but has been found useful for providing information about scholarly research impact, especially for the arts, humanities, and social sciences.
As relaunched in 2016, Microsoft Academic shows real potential to compete with Google Scholar in coverage and utility for finding journal articles. As of February 2017 its index contains 120 million citations. In contrast to the mystery of Google Scholar’s black-box algorithms and restrictive limitations, Microsoft Academic uses an open-system approach and offers an API. Microsoft Academic does, however, appear to have less coverage of books and grey literature than Google Scholar. Research is badly needed about the coverage and utility of both Google Books and Microsoft Academic.

Google Scholar continues to evolve, launching a new algorithm for known-item searching in 2016 that appears to work very well.18 Google Scholar does not reveal how many items it searches, but studies have suggested 160 million documents have been indexed. Studies have shown the Google Scholar relevance algorithm to be heavily influenced by citation counts and language of publication. Google Scholar has been so heavily researched, and is such a “black box,” that more attention would seem to have diminishing returns, except in the area of coverage of and utility for arts and humanities research.

18 Google Scholar’s blog notes that in January 2016, a change was made so “Scholar now automatically identifies queries that are likely to be looking for a specific paper.” Technically speaking, “it tries hard to find the intended paper and a version that that particular user is able to read” (https://scholar.googleblog.com/).

Librarians may find these takeaways useful for working with or teaching Google Scholar:

• Little is known about Google Scholar’s coverage of the arts and humanities.
• Recent studies repeatedly find that in the sciences and social sciences Google Scholar covers as much as, if not more than, library databases, has more recent coverage, and frequently provides access to full text without the need for library subscriptions.
• Although the number of studies is limited, Google Scholar seems excellent at retrieving known scholarly items compared with discovery tools.
• Using proper accent marks in the title when searching for non-English-language items appears to be important.
• Finding full text for non-English journal articles may require searching Google Scholar in the original language.
• While Google Scholar may include results from Google Books, it appears both tools should be used rather than assuming Google Books content will appear in Google Scholar.
• While Google Scholar does include grey literature, these results do not usually rank highly.
• Google Scholar and Google must both be used to search effectively across institutional repository content.
• Free full text may be buried underneath the “All X versions” links because the publisher’s web site is usually the dominant version presented to the user. The right-hand column links may help ameliorate this situation, but not reliably.
• Google Scholar is well known in most academic communities and used regularly; however, it is seldom the only tool used, with scholars continuing to use other web search tools, library abstracts and indexes, and published web sites as well.
• Experts in writing systematic reviews recommend Google Scholar be included as a search tool along with traditional abstracts and indexes, using software to record the search process and results.
• For evaluating research impact, Google Scholar may be superior to Web of Science or Scopus, but using all three tools still seems necessary.
• As with any database, citation metadata should be verified against the publisher’s data; with Google Scholar, publication dates should receive deliberate attention.
• When Google Scholar covers some of a major publisher’s content, that does not imply it covers all of that publisher’s content.
• Google Scholar Metrics appears to provide reliable journal rankings.

Research Agenda

This review of the literature also provides direction for future research concerning academic web search engines. Because this review focused on 2014-2016, researchers may need to review studies from earlier periods for methodological ideas and previous findings, noting that dramatic changes in search engine coverage and behavior can occur within only a few years.19

19 For example, Ştirbu et al. found that Google Scholar overlapped GeoRef by 57% and 62% (2015, 328), compared with a 2006 finding by Neuhaus et al. that Google Scholar overlapped with GeoRef by 26% (2006, 133).

Across the studies, some general best practices were observed. When comparing the coverage of academic web search engines, their utility for establishing research impact, or other bibliometric questions, researchers should strongly consider using software such as Publish or Perish and should design their research approach with previous methodologies in mind. Information scientists have charted a set of clear disciplinary methods; there is no need to start from scratch. Even when performing a large-scale quantitative assessment such as that of Kousha and Thelwall (2015), manually examining and discussing a subset of the sample seems helpful for checking assumptions and for enhancing the meaning of the findings to the reader. Some researchers examined the “top 20” or “top 10” results qualitatively (Kousha and Thelwall 2015), while others took a random sample from within their large-study sample (Kousha, Thelwall, and Rezaie 2011).

Academic search engines for arts and humanities research

Research into the use of academic web search engines within arts and humanities fields is sorely needed. Surveys show humanities scholars use both Google and Google Scholar (Inger and Gardner 2016; Kemman, Kleppe, and Scagliola 2013; Van Noorden 2014). During interviews of 20 historians by Martin and Quan-Haase (2016) concerning serendipity, five mentioned Google Books and Google Scholar as important for recreating the serendipity of the physical library online. Almost all arts and humanities scholars search the internet for researchers and their activities, and they commonly expressed the belief that having a complete list of research activities online improves public awareness (Dagienė and Krapavickaitė 2016). Mays’s (2015) practical advice and the few recent studies on the citation impact of Google Books for these disciplines point to the enormous potential for this tool’s use. Articles describing opportunities for new online searching habits of humanities scholars have not always included Google Scholar (Huistra and Mellink 2016). Wu and Chen’s interviews with humanities graduate students suggested their behavior and preferences differed from those of science and technology students: they did more known-item searching and struggled with “semantically ambiguous keywords” that retrieved irrelevant results (2014, 381).
Platform preferences seem to have a disciplinary aspect: Hammarfelt’s (2014) investigation of altmetrics in the humanities suggests Mendeley and Twitter should be included along with Google Scholar when examining the citation impact of humanities research, while a 2014 Nature survey suggests ResearchGate is much less popular in the social sciences and humanities than in the sciences (Van Noorden 2014). In summary, arts and humanities scholars are active users of academic web search engines and related tools, but their preferences and behavior, and the relative success of Google Scholar as a research tool, cannot be inferred from the vast literature focused on the sciences. Advice from librarians and scholars about the strengths and limitations of academic web search engines in these fields would be incredibly useful. Specific examples of needed research, with related studies to reference for methodological ideas:

• Similar to the studies that have been done in the sciences, how well do academic search engines cover the arts and humanities? An emphasis on formats important to these disciplines would be essential (Prins et al. 2016).
• How does the quality of search results compare between academic search engines and traditional library databases for arts and humanities topics? To what extent can the user usefully accomplish her task (Ruppel 2009)?
• To what extent do academic search engines support the research process for scholarship distinctive to arts and humanities disciplines (e.g., historiographies, review essays)?
• In academic search engines, how visible is the arts and humanities literature found in institutional repositories (Pitol and De Groote 2014)?

Specific aspects of academic search engine coverage

This review suggests that broad studies of academic search engine coverage may have reached a saturation point. However, specific aspects of coverage need additional investigation:

• Grey literature: Although Google Scholar’s inclusion of grey literature is frequently mentioned as valuable, empirical studies evaluating its coverage are scarce. Additional research following the methodology of Haddaway (2015) could investigate the bibliographies of literature other than systematic reviews, investigate various disciplines, or use a sample of valuable known items (similar to Kousha, Thelwall, and Rezaie’s (2011) methodology for books).
• Non-Western, non-English-language literature: For further investigation of the repeated finding of non-Western, non-English-language bias (Abrizah and Thelwall 2014; Cavacini 2015), comparisons to library abstracts and indexes would be helpful for providing context. To what extent is this bias present in traditional research tools? Hilbert et al. found the coverage of their sample increased for English-language material in both Web of Science and Scopus, and “to a lesser extent” in Google Scholar (2015, 260).
• Books: Any investigations of book coverage in Microsoft Academic and Google Scholar would be welcome. Very few 2014-2016 studies focused on books in Google Scholar, and even looking in earlier years turned up little research. Georgas (2015) compared Google with a federated search tool for finding books, so her study may be a useful reference. Kousha et al. (2011) found three times as many citations in Google Scholar as in Scopus to a sample of 1,000 academic books.
The authors concluded that “there are substantial numbers of citations to academic books from Google Books and Google Scholar, and it therefore may be possible to use these potential sources to help evaluate research in book-oriented disciplines” (Kousha, Thelwall, and Rezaie 2011, 2157).
• Institutional Repositories: Yang (2016) recommended that “librarians of digital resources conduct research on their local digital repositories, as the indexing effects and discovery rates on metadata or associated text files may be different case by case,” and the studies found from 2014-2016 show that IR platform and metadata schema dramatically affect discovery, with some IRs nearly invisible (Weideman 2015; Chen 2014; Orduña-Malea and López-Cózar 2015; Yang 2016) and others somewhat findable by Google Scholar (Lee et al. 2015; Obrien et al. 2016). Askey and Arlitsch (2015) have explained how Google Scholar’s decisions regarding metadata schema can dramatically affect results20 (see the sketch following this list). Libraries that would like their institutional repositories to serve as social sharing platforms for research should consider conducting a study similar to Martín-Martín et al.’s (2016b). Finally, a study of IR journal article visibility in academic web search engines could be extremely informative.
• Full-text retrieval: The indexing coverage of academic search engines relates to the retrieval of full text, which is another area ripe for more research, especially in light of the impressive quantity of full text that can be retrieved without user authentication. Johnson and Simonsen (2015) found that more of the engineering students they surveyed obtained scholarly articles from a free download, or from getting a PDF from a colleague at another institution, than used the library’s subscription. Meanwhile, libraries continue to pay for costly subscription resources. Monitoring this situation is essential for strategic decision-making. Quint (2016) and Karlsson (2014) have suggested strategies for libraries and vendors to support broader access to subscription full text through creative licensing and per-item fee approaches. Institutional repositories have had mixed results in changing scholars’ habits (both contributors’ and searchers’) but are demonstrably contributing to the presence of full text in the academic search engine experience. When will academic users find a good-enough selection of full-text articles that they no longer need the expanded full text paid for by their institutions?

20 For example, Google’s rejection of Dublin Core.
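To make note 20 concrete: Google Scholar’s inclusion guidelines ask repositories for bibliographic meta tags in schemes such as Highwire Press tags rather than plain Dublin Core. The sketch below generates a minimal set of such tags for a repository landing page; the tag names follow Google Scholar’s published guidelines as I understand them, while the sample record and helper function are invented for illustration.

```python
# Minimal sketch: emit Highwire Press-style meta tags of the kind
# Google Scholar's indexing guidelines describe. The sample record
# and this helper are hypothetical; consult the current guidelines
# before relying on exact tag names.
from html import escape

def scholar_meta_tags(record):
    tags = [("citation_title", record["title"])]
    tags += [("citation_author", author) for author in record["authors"]]
    tags += [
        ("citation_publication_date", record["date"]),
        ("citation_journal_title", record["journal"]),
        ("citation_pdf_url", record["pdf_url"]),
    ]
    return "\n".join(
        f'<meta name="{name}" content="{escape(value)}">' for name, value in tags
    )

record = {
    "title": "An Example Institutional Repository Article",
    "authors": ["Jane Q. Librarian", "A. N. Author"],
    "date": "2017/06/01",
    "journal": "Example Journal of Library Technology",
    "pdf_url": "https://repository.example.edu/article/1/file.pdf",
}
print(scholar_meta_tags(record))
```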
Google Books

As with Microsoft Academic, Google Books as a search tool also needs dedicated research from librarians and information scientists about its coverage, utility, and adoption. A purposeful comparison with other large digital repositories such as HathiTrust (https://www.hathitrust.org) would be a boon to practitioners and the public. While HathiTrust is transparent about its coverage (https://www.hathitrust.org/statistics_visualizations), specific areas of Google Books’ coverage have been called into question. Weiss (2016) suggested a gap exists in Google Books from about 1915 to 1965 “because many publishers either have let it fall out of print, or the book is orphaned and no one wants to go through the trouble of tracking down the copyright owners,” and found that copies in Google Books “will likely be locked down and thus unreadable, or visible only as a snippet, at best” (303). Has this situation changed since the court rulings concerning the legality of snippet view? Longitudinal studies of the growth of Google Books, similar to Harzing (2014), could illuminate this and other questions about Google Books’s ability to deliver content.

Uneven coverage of content types, geography, and language should be investigated. Mays noted a possible geographical imbalance within the United States (2015, 26). Others noted significant language and international imbalances, and large disciplinary differences (Weiss 2016; Abrizah and Thelwall 2014; Kousha and Thelwall 2015). Weiss and others suggest that Google Books’ coverage imbalance has enormous social implications: “Google and other [massive digital libraries] have essentially canonized the books they have scanned and contribute to the marginalization of those left unscanned” (301). Therefore more holistic quantitative investigations of the types of information in Google Books, and its possible skewness, would be welcome. Finally, Chen’s study (2012) comparing the coverage of Google Books and WorldCat could be repeated to provide longitudinal information.

The utility of Google Books for research purposes also needs further investigation. Books are far more prevalently cited in Wikipedia than are research articles (Thelwall and Kousha 2015a). Examining samples of Wikipedia articles’ citation lists for the prevalence of Google Books could reveal how dominant a force Google Books has become in that space. On a more philosophical level, investigating the ways Google Books might transform scholarly processes would be useful. Szpiech (2014) considered how the Google Books version of a medieval manuscript transformed his relationship with texts, causing a rupture “produced by my new power to extract words and information from a text without being subject to its order, scale, or authority” (78). He hypothesized that readers approach Google Books texts as consumers rather than learners, whereby “the critical sense of the gestalt” is at risk of being forgotten (84). Have other researchers experienced what he describes?

Microsoft Academic

Given the stated openness of Microsoft’s new academic web search engine,21 the closed nature of Google Scholar, and the promising findings of bibliometricians (Harzing 2016b; Harzing and Alakangas 2016a), librarians and information scientists should embark on a thorough review of Microsoft Academic with enthusiasm similar to that with which they approached Google Scholar. The search engine’s coverage, utility for research, and suitability for bibliometric analysis22 all need to be examined. Microsoft Academic’s abilities for supporting scholarly social networking would also be of interest, perhaps using Ward et al. (2015) as a theoretical groundwork. The tool’s coverage and utility for various disciplines and research purposes is a wide-open field for highly useful research.

21 Microsoft’s FAQ says the company is “adopting an open approach in developing the service, and we invite community participation. We like to think what we have developed is a community property. As such, we are opening up our academic knowledge as a downloadable dataset” and offers the Academic Knowledge API (https://www.microsoft.com/cognitive-services/en-us/academic-knowledge-api).

22 See Jacsó (2011) for methodology.
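The openness note 21 describes can be seen in the Academic Knowledge API itself. The fragment below sketches a call to the API’s evaluate endpoint; the host URL, query expression, attribute codes, and subscription key are assumptions drawn from Microsoft’s 2016-era documentation and should be verified against the current documentation before use.

```python
# Hedged sketch of an Academic Knowledge API "evaluate" request.
# Endpoint path, parameters, and attribute codes are assumptions
# based on Microsoft's 2016-era documentation; the subscription
# key and query are placeholders.
import requests

ENDPOINT = "https://westus.api.cognitive.microsoft.com/academic/v1.0/evaluate"
SUBSCRIPTION_KEY = "YOUR-KEY-HERE"  # issued via Microsoft Cognitive Services

params = {
    # Query expression: papers with this (normalized) title; Ti = title.
    "expr": "Ti=='web indicators for research evaluation'",
    "count": 5,
    # Attributes to return: title, year, citation count, author names.
    "attributes": "Ti,Y,CC,AA.AuN",
}
headers = {"Ocp-Apim-Subscription-Key": SUBSCRIPTION_KEY}

response = requests.get(ENDPOINT, params=params, headers=headers)
response.raise_for_status()
for entity in response.json().get("entities", []):
    authors = ", ".join(a["AuN"] for a in entity.get("AA", []))
    print(entity.get("Y"), entity.get("Ti"), f"(cited {entity.get('CC', 0)})", "-", authors)
```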
Professional and Instructional Approaches Based on User Research

To inform instructional approaches, more study of user behavior is needed, perhaps repeating Herrera’s (2011) study with Google Scholar and Microsoft Academic. In light of the recent focus on graduate students, research concerning the use of academic web search engines by undergraduates, community college students, high school students, and other groups would be welcome. Using an interview or focus group generates exploratory findings that could be tested through surveys with a larger, more representative sample of the population of interest. Studying searching behaviors has been common; can librarians design creative studies to investigate reading, engagement, and reflection when web search engines are used as part of the process? Is there a way to study whether the “Matthew Effect” (Antell et al. 2013, 281), the aging-citation phenomenon (Verstak et al. 2014; Martín-Martín et al. 2016a; Davis and Cochran 2015), or other epistemological hypotheses are influencing scholarship patterns? A bold study could examine differences in quality outcomes between samples of students using primarily academic search engines versus traditional library search tools. Exploratory studies in this area could begin by surveying students about their use of search tools for research methods courses, or by asking them to record their research process in a journal, and correlating the findings with their grades on the final research product.

Three specific areas of needed user research are the use of scholarly social network platforms, researcher profiles, and their influence on scholarly collaboration and research (Ward, Bejarano, and Dudás 2015, 178); the performance of Google Scholar’s relatively new known-item search23 compared with Microsoft Academic’s known-item search abilities; and searching in non-English languages. Regarding the latter, Albarillo’s (2016) method, which he applied to library databases, could be repeated with Google Scholar, Microsoft Academic, and Google Books.

Finally, to continue their strong track record as experts in navigating the landscape of digital scholarship, librarians need to research assumptions regarding best practices for scholarly logistics. For example, searching Google for article titles plus the term “doi,” then scanning the results list for ResearchGate, was found by this study’s author to be the most efficient way to obtain DOIs: but is this a reliable approach? Does ResearchGate have sufficient accuracy to be recommended as the optimal tool for this task? What is the most efficient way for a scholar to locate full text for a citation? Are academic search engines’ export tools for bibliographic citation management software competitive with third-party commercial tools such as RefWorks? Another area needing investigation is the visibility of links to free full text in Google Scholar. Pitol and De Groote found that 70% of the items in their study had at least one free full-text version available through a “hidden” Google Scholar version (2014, 603), and this author’s work on this review article indicates the problem still exists, but to what extent? Also, when free full text exists in multiple repositories (e.g., ResearchGate, Digital Commons, Academia.edu), which are the most trustworthy and practically useful for scholars? Librarians should discuss the answers to these questions and be ready to provide expert advice to users.
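One way to test the reliability of the ad hoc Google-plus-“doi” approach described above is to compare its output against a DOI registration agency. The sketch below queries the public Crossref REST API for a title and returns the best-matching DOI; the endpoint and response fields follow Crossref’s documented API, while the similarity threshold is an arbitrary assumption for illustration. Checking the returned title against the query, as here, guards against the accuracy concern raised about ResearchGate.

```python
# Hedged sketch: look up a DOI by article title via the public
# Crossref REST API (api.crossref.org). The 0.9 similarity cutoff
# is an arbitrary assumption, not a Crossref recommendation.
from difflib import SequenceMatcher
import requests

def doi_for_title(title, threshold=0.9):
    resp = requests.get(
        "https://api.crossref.org/works",
        params={"query.bibliographic": title, "rows": 1},
        timeout=30,
    )
    resp.raise_for_status()
    items = resp.json()["message"]["items"]
    if not items:
        return None
    candidate = items[0]
    found_title = (candidate.get("title") or [""])[0]
    # Accept only near-exact title matches to avoid returning wrong DOIs.
    score = SequenceMatcher(None, title.lower(), found_title.lower()).ratio()
    return candidate["DOI"] if score >= threshold else None

print(doi_for_title("The Expansion of Google Scholar versus Web of Science: A Longitudinal Study"))
```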
CONCLUSION

With so many users opting to use academic web search engines for research, librarians need to investigate the performance of Microsoft Academic, Google Books, and Google Scholar for the arts and humanities, and to rethink library services and collections in light of these tools’ strengths and limitations. The evolution of web indexing and increasing free access to full text should be monitored in conjunction with library collection development. To remain relevant to modern researchers, librarians should continue to strengthen their knowledge of and expertise with public academic web search engines, full-text repositories, and scholarly networks.

23 See note 18.

BIBLIOGRAPHY

Abrizah, A., and Mike Thelwall. 2014. "Can the Impact of Non-Western Academic Books be Measured? An Investigation of Google Books and Google Scholar for Malaysia." Journal of the Association for Information Science & Technology 65 (12): 2498-2508. https://doi.org/10.1002/asi.23145.

Albarillo, Frans. 2016. "Evaluating Language Functionality in Library Databases." International Information & Library Review 48 (1): 1-10. https://doi.org/10.1080/10572317.2016.1146036.

Antell, Karen, Molly Strothmann, Xiaotian Chen, and Kevin O'Kelly. 2013. "Cross-Examining Google Scholar." Reference & User Services Quarterly 52 (4): 279-282. https://doi.org/10.5860/rusq.52n4.279.

Asher, Andrew D., Lynda M. Duke, and Suzanne Wilson. 2012. "Paths of Discovery: Comparing the Search Effectiveness of EBSCO Discovery Service, Summon, Google Scholar, and Conventional Library Resources." College & Research Libraries 74 (5): 464-488. https://doi.org/10.5860/crl-374.

Askey, Dale, and Kenning Arlitsch. 2015. "Heeding the Signals: Applying Web Best Practices When Google Recommends." Journal of Library Administration 55 (1): 49-59. https://doi.org/10.1080/01930826.2014.978685.

Authors Guild. "Authors Guild v. Google." Accessed January 1, 2016. https://www.authorsguild.org/where-we-stand/authors-guild-v-google/.

Bartol, Tomaž, and Maria Mackiewicz-Talarczyk. 2015. "Bibliometric Analysis of Publishing Trends in Fiber Crops in Google Scholar, Scopus, and Web of Science." Journal of Natural Fibers 12 (6): 531. https://doi.org/10.1080/15440478.2014.972000.

Boeker, Martin, Werner Vach, and Edith Motschall. 2013. "Google Scholar as Replacement for Systematic Literature Searches: Good Relative Recall and Precision Are Not Enough." BMC Medical Research Methodology 13 (1): 1.

Bonato, Sarah. 2016. "Google Scholar and Scopus for Finding Gray Literature Publications." Journal of the Medical Library Association 104 (3): 252-254. https://doi.org/10.3163/1536-5050.104.3.021.

Bornmann, Lutz, Andreas Thor, Werner Marx, and Hermann Schier. 2016. "The Application of Bibliometrics to Research Evaluation in the Humanities and Social Sciences: An Exploratory Study using Normalized Google Scholar Data for the Publications of a Research Institute." Journal of the Association for Information Science & Technology 67 (11): 2778-2789. https://doi.org/10.1002/asi.23627.

Boumenot, Diane.
"Printing a Book from Google Books." One Rhode Island Family. Last modified December 3, 2015, accessed January 1, 2017. https://onerhodeislandfamily.com/2015/12/03/printing-a-book-from-google-books/. Bøyum, Idunn, and Svanhild Aabø. 2015. "The Information Practices of Business PhD Students." New Library World 116 (3): 187-200. https://doi.org/10.1108/NLW-06-2014-0073. Bramer, Wichor M., Dean Giustini, and Bianca M. R. Kramer. 2016. "Comparing the Coverage, Recall, and Precision of Searches for 120 Systematic Reviews in Embase, MEDLINE, and Google Scholar: A Prospective Study." Systematic Reviews 5(39):1-7. https://doi.org/10.1186/s13643-016-0215-7. Cals, J. W., and D. Kotz. 2016. "Literature Review in Biomedical Research: Useful Search Engines Beyond PubMed." Journal of Clinical Epidemiology 71: 115-117. https://doi.org/10.1016/j.jclinepi.2015.10.012. Carlson, Scott. 2006. "Challenging Google, Microsoft Unveils a Search Tool for Scholarly Articles." Chronicle of Higher Education 52 (33). Cavacini, Antonio. 2015. "What is the Best Database for Computer Science Journal Articles?" Scientometrics 102 (3): 2059-2071. https://doi.org/10.1007/s11192-014-1506-1. Chen, Xiaotian. 2012. "Google Books and WorldCat: A Comparison of their Content." Online Information Review 36 (4): 507-516. https://doi.org/10.1108/14684521211254031. ———. 2014. "Open Access in 2013: Reaching the 50% Milestone." Serials Review 40 (1): 21-27. https://doi.org/10.1080/00987913.2014.895556. Choong, Miew Keen, Filippo Galgani, Adam G. Dunn, and Guy Tsafnat. 2014. "Automatic Evidence Retrieval for Systematic Reviews." Journal of Medical Internet Research 16 (10): 1-1. https://doi.org/10.2196/jmir.3369. Ciccone, Karen, and John Vickery. 2015. "Summon, EBSCO Discovery Service, and Google Scholar: A Comparison of Search Performance using User Queries." Evidence Based Library & Information Practice 10 (1): 34-49. https://ejournals.library.ualberta.ca/index.php/EBLIP/article/view/23845. Conrad, Lettie Y., Elisabeth Leonard, and Mary M. Somerville. 2015. "New Pathways in Scholarly Discovery: Understanding the Next Generation of Researcher Tools." Paper presented at the Association of College and Research Libraries annual conference, March 25-27, Portland, OR. https://pdfs.semanticscholar.org/3cb1/315476ccf9b443c01eb9b1d175ae3b0a5b4e.pdf. AN EVIDENCE-BASED REVIEW OF ACADEMIC WEB SEARCH ENGINES, 2014-2016| FAGAN | https://doi.org/10.6017/ital.v36i2.9718 40 Dagienė, Eleonora, and Danutė Krapavickaitė. 2016. "How Researchers Manage their Academic Activities." Learned Publishing 29(3):155-163. https://doi.org/10.1002/leap.1030. Davis, Philip M., and Angela Cochran. 2015. "Cited Half-Life of the Journal Literature." arXiv Preprint arXiv:1504.07479. https://arxiv.org/abs/1504.07479. Delgado López-Cózar, Emilio, Nicolás Robinson-García, and Daniel Torres-Salinas. 2014. "The Google Scholar Experiment: How to Index False Papers and Manipulate Bibliometric Indicators." Journal of the Association for Information Science & Technology 65 (3): 446-454. https://doi.org/10.1002/asi.23056. Erb, Brian, and Rob Sica. 2015. "Flagship Database for Literature Searching Or Flelpful Auxiliary?" Charleston Advisor 17 (2): 47-50. https://doi.org/10.5260/chara.17.2.47. Fagan, Jody Condit, and David Gaines. 2016. "Take Charge of EDS: Vet Your Content." Presentation to the EBSCO Users' Group, Boston, MA, May 10-11. Gehanno, Jean-François, Laetitia Rollin, and Stefan Darmoni. 2013. "Is the Coverage of Google Scholar Enough to be Used Alone for Systematic Reviews." 
BMC Medical Informatics and Decision Making 13 (1): 1. https://doi.org/10.1186/1472-6947-13-7.

Georgas, Helen. 2015. "Google vs. the Library (Part III): Assessing the Quality of Sources found by Undergraduates." portal: Libraries and the Academy 15 (1): 133-161. https://doi.org/10.1353/pla.2015.0012.

Giustini, Dean, and Maged N. Kamel Boulos. 2013. "Google Scholar is Not Enough to be Used Alone for Systematic Reviews." Online Journal of Public Health Informatics 5 (2). https://doi.org/10.5210/ojphi.v5i2.4623.

Gray, Jerry E., Michelle C. Hamilton, Alexandra Hauser, Margaret M. Janz, Justin P. Peters, and Fiona Taggart. 2012. "Scholarish: Google Scholar and its Value to the Sciences." Issues in Science and Technology Librarianship 70 (Summer). https://doi.org/10.1002/asi.21372/full.

Haddaway, Neal R. 2015. "The Use of Web-Scraping Software in Searching for Grey Literature." Grey Journal 11 (3): 186-190.

Haddaway, Neal Robert, Alexandra Mary Collins, Deborah Coughlin, and Stuart Kirk. 2015. "The Role of Google Scholar in Evidence Reviews and its Applicability to Grey Literature Searching." PloS One 10 (9): e0138237. https://doi.org/10.1371/journal.pone.0138237.

Hammarfelt, Björn. 2014. "Using Altmetrics for Assessing Research Impact in the Humanities." Scientometrics 101 (2): 1419-1430. https://doi.org/10.1007/s11192-014-1261-3.

Hands, Africa. 2012. "Microsoft Academic Search – http://academic.research.microsoft.com." Technical Services Quarterly 29 (3): 251-252. https://doi.org/10.1080/07317131.2012.682026.

Harper, Sarah Fletcher. 2016. "Google Books Review." Journal of Electronic Resources in Medical Libraries 13 (1): 2-7. https://doi.org/10.1080/15424065.2016.1142835.

Harzing, Anne-Wil. 2013. "A Preliminary Test of Google Scholar as a Source for Citation Data: A Longitudinal Study of Nobel Prize Winners." Scientometrics 94 (3): 1057-1075. https://doi.org/10.1007/s11192-012-0777-7.

———. 2014. "A Longitudinal Study of Google Scholar Coverage between 2012 and 2013." Scientometrics 98 (1): 565-575. https://doi.org/10.1007/s11192-013-0975-y.

———. 2016a. Publish Or Perish. Vol. 5. http://www.harzing.com/resources/publish-or-perish.

———. 2016b. "Microsoft Academic (Search): A Phoenix Arisen from the Ashes?" Scientometrics 108 (3): 1637-1647. https://doi.org/10.1007/s11192-016-2026-y.

Harzing, Anne-Wil, and Satu Alakangas. 2016a. "Microsoft Academic: Is the Phoenix Getting Wings?" Scientometrics: 1-13.

Harzing, Anne-Wil, and Satu Alakangas. 2016b. "Google Scholar, Scopus and the Web of Science: A Longitudinal and Cross-Disciplinary Comparison." Scientometrics 106 (2): 787-804. https://doi.org/10.1007/s11192-015-1798-9.

Herrera, Gail. 2011. "Google Scholar Users and User Behaviors: An Exploratory Study." College & Research Libraries 72 (4): 316-331. https://doi.org/10.5860/crl-125rl.

Higgins, Julian, and S. Green, eds. 2011. Cochrane Handbook for Systematic Reviews of Interventions. Version 5.1.0. The Cochrane Collaboration. http://handbook.cochrane.org/.

Hilbert, Fee, Julia Barth, Julia Gremm, Daniel Gros, Jessica Haiter, Maria Henkel, Wilhelm Reinhardt, and Wolfgang G. Stock. 2015. "Coverage of Academic Citation Databases Compared with Coverage of Scientific Social Media." Online Information Review 39 (2): 255-264. https://doi.org/10.1108/OIR-07-2014-0159.

Hoffmann, Anna Lauren. 2014. "Google Books as Infrastructure of in/Justice: Towards a Sociotechnical Account of Rawlsian Justice, Information, and Technology." Theses and Dissertations.
Paper 530. http://dc.uwm.edu/etd/530/.

———. 2016. "Google Books, Libraries, and Self-Respect: Information Justice Beyond Distributions." The Library Quarterly 86 (1). https://doi.org/10.1086/684141.

Horrigan, John B. 2016. "Lifelong Learning and Technology." Pew Research Center. Last modified March 22, 2016, accessed February 7, 2017. http://www.pewinternet.org/2016/03/22/lifelong-learning-and-technology/.

Hug, Sven E., Michael Ochsner, and Martin P. Braendle. 2016. "Citation Analysis with Microsoft Academic." arXiv Preprint arXiv:1609.05354. https://arxiv.org/abs/1609.05354.

Huistra, Hieke, and Bram Mellink. 2016. "Phrasing History: Selecting Sources in Digital Repositories." Historical Methods: A Journal of Quantitative and Interdisciplinary History 49 (4): 220-229. https://doi.org/10.1093/llc/fqw002.

Inger, Simon, and Tracy Gardner. 2016. "How Readers Discover Content in Scholarly Publications." Information Services & Use 36 (1): 81-97. https://doi.org/10.3233/ISU-160800.

Jackson, Joab. 2010. "Google: 129 Million Different Books have been Published." PC World, August 6, 2010. http://www.pcworld.com/article/202803/google_129_million_different_books_have_been_published.html.

Jacsó, P. 2008. "Live Search Academic." Peter's Digital Reference Shelf, April.

Jacsó, Péter. 2011. "The Pros and Cons of Microsoft Academic Search from a Bibliometric Perspective." Online Information Review 35 (6): 983-997. https://doi.org/10.1108/14684521111210788.

Jamali, Hamid R., and Majid Nabavi. 2015. "Open Access and Sources of Full-Text Articles in Google Scholar in Different Subject Fields." Scientometrics 105 (3): 1635-1651. https://doi.org/10.1007/s11192-015-1642-2.

Johnson, Paula C., and Jennifer E. Simonsen. 2015. "Do Engineering Master's Students Know What They Don't Know?" Library Review 64 (1): 36-57. https://doi.org/10.1108/LR-05-2014-0052.

Jones, Edgar. 2010. "Google Books as a General Research Collection." Library Resources & Technical Services 54 (2): 77-89. https://doi.org/10.5860/lrts.54n2.77.

Karlsson, Niklas. 2014. "The Crossroads of Academic Electronic Availability: How Well does Google Scholar Measure Up Against a University-Based Metadata System in 2014?" Current Science 107 (10): 1661-1665. http://www.currentscience.ac.in/Volumes/107/10/1661.pdf.

Kemman, Max, Martijn Kleppe, and Stef Scagliola. 2013. "Just Google It: Digital Research Practices of Humanities Scholars." arXiv Preprint arXiv:1309.2434. https://arxiv.org/abs/1309.2434.

Khabsa, Madian, and C. Lee Giles. 2014. "The Number of Scholarly Documents on the Public Web." PloS One 9 (5). https://doi.org/10.1371/journal.pone.0093949.

Kirkwood Jr., Hal, and Monica C. Kirkwood. 2011. "Historical Research." Online 35 (4): 28-32.

Koler-Povh, Teja, Primož Južnic, and Goran Turk. 2014. "Impact of Open Access on Citation of Scholarly Publications in the Field of Civil Engineering." Scientometrics 98 (2): 1033-1045. https://doi.org/10.1007/s11192-013-1101-x.

Kousha, Kayvan, Mike Thelwall, and Somayeh Rezaie. 2011. "Assessing the Citation Impact of Books: The Role of Google Books, Google Scholar, and Scopus." Journal of the American Society for Information Science and Technology 62 (11): 2147-2164. https://doi.org/10.1002/asi.21608.

Kousha, Kayvan, and Mike Thelwall. 2017. "Are Wikipedia Citations Important Evidence of the Impact of Scholarly Articles and Books?"
Journal of the Association for Information Science and Technology 68 (3): 762-779. https://doi.org/10.1002/asi.23694.

Kousha, Kayvan, and Mike Thelwall. 2015. "An Automatic Method for Extracting Citations from Google Books." Journal of the Association for Information Science & Technology 66 (2): 309-320. https://doi.org/10.1002/asi.23170.

Lee, Jongwook, Gary Burnett, Micah Vandegrift, Hoon Baeg Jung, and Richard Morris. 2015. "Availability and Accessibility in an Open Access Institutional Repository: A Case Study." Information Research 20 (1): 334-349.

Levay, Paul, Nicola Ainsworth, Rachel Kettle, and Antony Morgan. 2016. "Identifying Evidence for Public Health Guidance: A Comparison of Citation Searching with Web of Science and Google Scholar." Research Synthesis Methods 7 (1): 34-45. https://doi.org/10.1002/jrsm.1158.

Levy, Steven. 2014. "Making the World's Problem Solvers 10% More Efficient." Backchannel. Last modified October 17, 2014, accessed January 14, 2016. https://medium.com/backchannel/the-gentleman-who-made-scholar-d71289d9a82d.

Los Angeles Times. 2016. "Google, Books and 'Fair Use'." Los Angeles Times, April 19, 2016. http://www.latimes.com/opinion/editorials/la-ed-google-book-search-20160419-story.html.

Martin, Kim, and Anabel Quan-Haase. 2016. "The Role of Agency in Historians' Experiences of Serendipity in Physical and Digital Information Environments." Journal of Documentation 72 (6): 1008-1026. https://doi.org/10.1108/JD-11-2015-0144.

Martín-Martín, Alberto, Juan Manuel Ayllón, Enrique Orduña-Malea, and Emilio Delgado López-Cózar. 2016a. "2016 Google Scholar Metrics Released: A Matter of Languages... and Something Else." arXiv Preprint arXiv:1607.06260. https://arxiv.org/abs/1607.06260.

Martín-Martín, Alberto, Enrique Orduña-Malea, Juan M. Ayllón, and Emilio Delgado López-Cózar. 2016b. "The Counting House: Measuring those Who Count. Presence of Bibliometrics, Scientometrics, Informetrics, Webometrics and Altmetrics in the Google Scholar Citations, ResearcherID, ResearchGate, Mendeley & Twitter." arXiv Preprint arXiv:1602.02412. https://arxiv.org/abs/1602.02412.

Martín-Martín, Alberto, Enrique Orduña-Malea, Juan Manuel Ayllón, and Emilio Delgado López-Cózar. 2014. "Does Google Scholar Contain All Highly Cited Documents (1950-2013)?" arXiv Preprint arXiv:1410.8464. https://arxiv.org/abs/1410.8464.

Martín-Martín, Alberto, Enrique Orduña-Malea, Juan Ayllón, and Emilio Delgado López-Cózar. 2016c. "Back to the Past: On the Shoulders of an Academic Search Engine Giant." Scientometrics 107 (3): 1477-1487. https://doi.org/10.1007/s11192-016-1917-2.

Martín-Martín, Alberto, Enrique Orduña-Malea, Anne-Wil Harzing, and Emilio Delgado López-Cózar. 2017. "Can we Use Google Scholar to Identify Highly-Cited Documents?" Journal of Informetrics 11 (1): 152-163. https://doi.org/10.1016/j.joi.2016.11.008.

Mays, Dorothy A. 2015. "Google Books: Far More Than Just Books." Public Libraries 54 (5): 23-26. http://publiclibrariesonline.org/2015/10/far-more-than-just-books/.

Meier, John J., and Thomas W. Conkling. 2008. "Google Scholar's Coverage of the Engineering Literature: An Empirical Study." The Journal of Academic Librarianship 34 (3): 196-201. https://doi.org/10.1016/j.acalib.2008.03.002.

Moed, Henk F., Judit Bar-Ilan, and Gali Halevi. 2016. "A New Methodology for Comparing Google Scholar and Scopus." arXiv Preprint arXiv:1512.05741. https://arxiv.org/abs/1512.05741.
Namei, Elizabeth, and Christal A. Young. 2015. "Measuring our Relevancy: Comparing Results in a Web-Scale Discovery Tool, Google & Google Scholar." Paper presented at the Association of College and Research Libraries annual conference, March 25-27, Portland, OR. http://www.ala.org/acrl/sites/ala.org.acrl/files/content/conferences/confsandpreconfs/2015/Namei_Young.pdf.

National Institute for Health and Care Excellence (NICE). 2016. "Developing NICE Guidelines: The Manual." Last modified April 2016, accessed November 27, 2016. https://www.nice.org.uk/process/pmg20.

Neuhaus, Chris, Ellen Neuhaus, Alan Asher, and Clint Wrede. 2006. "The Depth and Breadth of Google Scholar: An Empirical Study." portal: Libraries and the Academy 6 (2): 127-141. https://doi.org/10.1353/pla.2006.0026.

Obrien, Patrick, Kenning Arlitsch, Leila Sterman, Jeff Mixter, Jonathan Wheeler, and Susan Borda. 2016. "Undercounting File Downloads from Institutional Repositories." Journal of Library Administration 56 (7): 854-874. https://doi.org/10.1080/01930826.2016.1216224.

Orduña-Malea, Enrique, and Emilio Delgado López-Cózar. 2014. "Google Scholar Metrics Evolution: An Analysis According to Languages." Scientometrics 98 (3): 2353-2367. https://doi.org/10.1007/s11192-013-1164-8.

Orduña-Malea, Enrique, and Emilio Delgado López-Cózar. 2015. "The Dark Side of Open Access in Google and Google Scholar: The Case of Latin-American Repositories." Scientometrics 102 (1): 829-846. https://doi.org/10.1007/s11192-014-1369-5.

Orduña-Malea, Enrique, Alberto Martín-Martín, Juan M. Ayllon, and Emilio Delgado López-Cózar. 2014. "The Silent Fading of an Academic Search Engine: The Case of Microsoft Academic Search." Online Information Review 38 (7): 936-953. https://doi.org/10.1108/OIR-07-2014-0169.

Ortega, José Luis. 2015. "Relationship between Altmetric and Bibliometric Indicators Across Academic Social Sites: The Case of CSIC's Members." Journal of Informetrics 9 (1): 39-49. https://doi.org/10.1016/j.joi.2014.11.004.

Ortega, José Luis, and Isidro F. Aguillo. 2014. "Microsoft Academic Search and Google Scholar Citations: Comparative Analysis of Author Profiles." Journal of the Association for Information Science & Technology 65 (6): 1149-1156. https://doi.org/10.1002/asi.23036.

Pitol, Scott P., and Sandra L. De Groote. 2014. "Google Scholar Versions: Do More Versions of an Article Mean Greater Impact?" Library Hi Tech 32 (4): 594-611. https://doi.org/10.1108/LHT-05-2014-0039.

Prins, Ad A. M., Rodrigo Costas, Thed N. van Leeuwen, and Paul F. Wouters. 2016. "Using Google Scholar in Research Evaluation of Humanities and Social Science Programs: A Comparison with Web of Science Data." Research Evaluation 25 (3): 264-270. https://doi.org/10.1093/reseval/rvv049.

Quint, Barbara. 2016. "Find and Fetch: Completing the Course." Information Today 33 (3): 17.

Rothfus, Melissa, Ingrid S. Sketris, Robyn Traynor, Melissa Helwig, and Samuel A. Stewart. 2016. "Measuring Knowledge Translation Uptake using Citation Metrics: A Case Study of a Pan-Canadian Network of Pharmacoepidemiology Researchers." Science & Technology Libraries 35 (3): 228-240. https://doi.org/10.1080/0194262X.2016.1192008.

Ruppel, Margie. 2009. "Google Scholar, Social Work Abstracts (EBSCO), and PsycINFO (EBSCO)." Charleston Advisor 10 (3): 5-11.

Shultz, M. 2007. "Comparing Test Searches in PubMed and Google Scholar." Journal of the Medical Library Association 95 (4): 442-445. https://doi.org/10.3163/1536-5050.95.4.442.
Stansfield, Claire, Kelly Dickson, and Mukdarut Bangpan. 2016. "Exploring Issues in the Conduct of Website Searching and Other Online Sources for Systematic Reviews: How Can We be Systematic?" Systematic Reviews 5 (1): 191. https://doi.org/10.1186/s13643-016-0371-9.

Ştirbu, Simona, Paul Thirion, Serge Schmitz, Gentiane Haesbroeck, and Ninfa Greco. 2015. "The Utility of Google Scholar when Searching Geographical Literature: Comparison with Three Commercial Bibliographic Databases." The Journal of Academic Librarianship 41 (3): 322-329. https://doi.org/10.1016/j.acalib.2015.02.013.

Suiter, Amy M., and Heather Lea Moulaison. 2015. "Supporting Scholars: An Analysis of Academic Library Websites' Documentation on Metrics and Impact." The Journal of Academic Librarianship 41 (6): 814-820. https://doi.org/10.1016/j.acalib.2015.09.004.

Szpiech, Ryan. 2014. "Cracking the Code: Reflections on Manuscripts in the Age of Digital Books." Digital Philology: A Journal of Medieval Cultures 3 (1): 75-100. https://doi.org/10.1353/dph.2014.0010.

Testa, Matthew. 2016. "Availability and Discoverability of Open-Access Journals in Music." Music Reference Services Quarterly 19 (1): 1-17. https://doi.org/10.1080/10588167.2016.1130386.

Thelwall, Mike, and Kayvan Kousha. 2015b. "Web Indicators for Research Evaluation. Part 1: Citations and Links to Academic Articles from the Web." El Profesional De La Información 24 (5): 587-606. https://doi.org/10.3145/epi.2015.sep.08.

Thielen, Frederick W., Ghislaine van Mastrigt, L. T. Burgers, Wichor M. Bramer, Marian H. J. M. Majoie, Sylvia M. A. A. Evers, and Jos Kleijnen. 2016. "How to Prepare a Systematic Review of Economic Evaluations for Clinical Practice Guidelines: Database Selection and Search Strategy Development (Part 2/3)." Expert Review of Pharmacoeconomics & Outcomes Research: 1-17. https://doi.org/10.1080/14737167.2016.1246962.

Trapp, Jamie. 2016. "Web of Science, Scopus, and Google Scholar Citation Rates: A Case Study of Medical Physics and Biomedical Engineering: What Gets Cited and What Doesn't?" Australasian Physical & Engineering Sciences in Medicine 39 (4): 817-823. https://doi.org/10.1007/s13246-016-0478-2.

Van Noorden, R. 2014. "Online Collaboration: Scientists and the Social Network." Nature 512 (7513): 126-129. https://doi.org/10.1038/512126a.

Varshney, Lav R. 2012. "The Google Effect in Doctoral Theses." Scientometrics 92 (3): 785-793. https://doi.org/10.1007/s11192-012-0654-4.

Verstak, Alex, Anurag Acharya, Helder Suzuki, Sean Henderson, Mikhail Iakhiaev, Cliff Chiung Yu Lin, and Namit Shetty. 2014. "On the Shoulders of Giants: The Growing Impact of Older Articles." arXiv Preprint arXiv:1411.0275. https://arxiv.org/abs/1411.0275.

Walsh, Andrew. 2015. "Beyond "Good" and "Bad": Google as a Crucial Component of Information Literacy." In The Complete Guide to Using Google in Libraries, edited by Carol Smallwood, 3-12. New York: Rowman & Littlefield.

Waltman, Ludo. 2016. "A Review of the Literature on Citation Impact Indicators." Journal of Informetrics 10 (2): 365-391. https://doi.org/10.1016/j.joi.2016.02.007.

Ward, Judit, William Bejarano, and Anikó Dudás. 2015. "Scholarly Social Media Profiles and Libraries: A Review." Liber Quarterly 24 (4): 174-204. https://doi.org/10.18352/lq.9958.

Weideman, Melius. 2015. "ETD Visibility: A Study on the Exposure of Indian ETDs to the Google Scholar Crawler."
Paper presented at ETD 2015: 18th International Symposium on Electronic Theses and Dissertations, New Delhi, India, November 4-6. http://www.web-visibility.co.za/0168-conference-paper-2015-weideman-etd-theses-dissertation-india-google-scholar-crawler.pdf.

Weiss, Andrew. 2016. "Examining Massive Digital Libraries (MDLs) and their Impact on Reference Services." Reference Librarian 57 (4): 286-306. https://doi.org/10.1080/02763877.2016.1145614.

Whitmer, Susan. 2015. "Google Books: Shamed by Snobs, a Resource for the Rest of Us." In The Complete Guide to Using Google in Libraries, edited by Carol Smallwood, 241-250. New York: Rowman & Littlefield.

Wildgaard, Lorna. 2015. "A Comparison of 17 Author-Level Bibliometric Indicators for Researchers in Astronomy, Environmental Science, Philosophy and Public Health in Web of Science and Google Scholar." Scientometrics 104 (3): 873-906. https://doi.org/10.1007/s11192-015-1608-4.

Winter, Joost, Amir Zadpoor, and Dimitra Dodou. 2014. "The Expansion of Google Scholar Versus Web of Science: A Longitudinal Study." Scientometrics 98 (2): 1547-1565. https://doi.org/10.1007/s11192-013-1089-2.

Wu, Tim. 2015. "Whatever Happened to Google Books?" The New Yorker, September 11, 2015.

Wu, Ming-der, and Shih-chuan Chen. 2014. "Graduate Students Appreciate Google Scholar, but Still Find use for Libraries." Electronic Library 32 (3): 375-389. https://doi.org/10.1108/EL-08-2012-0102.

Yang, Le. 2016. "Making Search Engines Notice: An Exploratory Study on Discoverability of DSpace Metadata and PDF Files." Journal of Web Librarianship 10 (3): 147-160. https://doi.org/10.1080/19322909.2016.1172539.

9720 ---- Microsoft Word - Author_Edits_March_ITAL_Rebmannproof_Edits.docx

TV White Spaces in Public Libraries: A Primer

Kristen Radsliff Rebmann, Emmanuel Edward Te, and Donald Means

INFORMATION TECHNOLOGY AND LIBRARIES | MARCH 2017

ABSTRACT

TV White Space (TVWS) represents one new wireless communication technology that has the potential to improve internet access and inclusion. This primer describes TVWS technology as a viable, long-term access solution for the benefit of public libraries and their communities, especially for underserved populations. Discussion focuses first on providing a brief overview of the digital divide and the emerging role of public libraries as internet access providers. Next, a basic description of TVWS and its features is provided, focusing on key aspects of the technology relevant to libraries as community anchor institutions. Several TVWS implementations are described, with discussion of TVWS implementations in several public libraries. Finally, consideration is given to the first steps that library organizations must take when contemplating new TVWS implementations supportive of Wi-Fi applications and crisis response planning.

INTRODUCTION

Tens of millions of people rely wholly or in part on libraries to provide access to the Internet. Many lack access to the Federal Communications Commission (FCC) recommended standard of 25 Mbps (megabits per second) download speed and 3 Mbps upload speed.1 Though the FCC reclassified high-speed Internet as a public utility under Title II of the Telecommunications Act to ensure that broadband networks are "fast, fair, and open" in 2015,2 the "digital divide" still remains. One in four community members does not have access to the Internet at home.
Accounting for age and education level, households with the lowest median incomes have service adoption rates of around 50%, compared to rates of 80 to 90% among those with higher incomes.3 A recent Pew Research Center survey on home broadband adoption found that 43% of those surveyed reported cost as their main reason for non-adoption.4 Individuals with low-quality or no access are more likely to be digitally disadvantaged, tend to use library computers more frequently, and are less equipped to interact and compete economically as more services and application processes move online.5

Kristen Radsliff Rebmann (Kristen.rebmann@sjsu.edu) is Associate Professor, San Jose State University School of Information, San Jose, CA. Emmanuel Edward Te (emmanueledward.te@sjsu.edu) is a graduate student, San Jose State University School of Information, San Jose, CA. Donald Means (don@digitalvillage.com) is co-founder and principal of Digital Village Associates, Sausalito, CA.

This article highlights TV White Space (TVWS), a new wireless communication technology with the potential to assist libraries in addressing digital access and inclusion issues. This primer first provides a brief overview of the digital divide and the emerging role of public libraries as internet access providers, highlighting the need for cost-efficient technological solutions. We then provide a basic description of TVWS and its features, focusing on key aspects of the technology relevant to libraries as community anchor institutions. Several TVWS implementations are described, with discussion of how TVWS was set up in several public libraries. Finally, we consider the first steps library organizations must take when contemplating new implementations, including everyday applications and crisis response planning.

Digital Access and Inclusion

The term "digital divide" describes the gap between people who can easily access and use technology and the internet, and those who cannot.6 As Kinney observes, "there has not been one single digital divide, but rather a series of divides that attend each new technology."7 Digital divides are exacerbated by various factors, including socioeconomic status, education, geography, age, ability, language, and especially the availability and quality of access.8 In recent years, the language describing this issue has changed, but the inequalities persist and widen along different dimensions with each emerging technology. The most recent public policy term, "digital inclusion," promotes digital literacy efforts for unserved and underserved populations.9 The progression from the term "digital divide" to "digital inclusion" represents a shift in focus from issues of access exclusively toward contexts and quality of participation and usage.
Along these lines, the language of digital inclusion reframes the issue by making visible that simply focusing on internet access can obscure the fact that divides associated with quality and effectiveness remain.10 In response to the digital divide, public libraries have become the "unofficial" providers of internet access, stemming from libraries' access to broadband infrastructure, maintenance of publicly available computers, and services providing assistance and training.11 A Pew Research Center survey on perceptions of libraries found that most respondents reported viewing public libraries as important parts of their communities, providing resources and assisting in decisions regarding what information to trust.12 However, many public libraries are facing an "infrastructure plateau" of internet access due to too few computer workstations and broadband connection speeds too slow to support a growing number of users,13 on top of insufficient funding, physical space, and staffing.14 Previous surveys show that although public libraries are connected to the internet and provide public access workstations and wireless access, nearly 50% of public libraries only offer wireless access that shares the same bandwidth as their workstations.15 This increased usage strains existing network connections and infrastructure, resulting in slower connections for everyone connected to the public library's network. Many public libraries cannot accommodate more workstations, support the power requirements of both workstations and patrons' laptops, or afford the workstation upgrades and bandwidth increases needed to move past their insufficient connectivity speeds. Libraries often lack the IT skills, time, and funds to upgrade their infrastructure.16 Typical wireless access via Wi-Fi is limited to distances within library buildings, may extend to exterior spaces, and is available only during operating hours.

Despite these challenges, public libraries continually provide access and "at-the-point-of-need" training and support for their patrons, especially for those who do not have easy access to the internet and computers.17 Subsidized by federal funding, libraries represent key access providers and technology trainers for the public without internet access.18 The FCC classifies libraries as "community anchor institutions" (CAIs), organizations that "facilitate greater use of broadband by vulnerable populations, including low-income, the unemployed, and the aged."19 Recent surveys show that users have a positive view of libraries, which provide opportunities to spend time in a safe space, pursue learning, and promote a sense of community. Librarians offer internet skills training programs more often than other community organizations, though around 75% of the time this training occurs informally.20 In particular, 29% of respondents to a library use survey reported going to libraries to use computers, the internet, or the Wi-Fi network; 7% have also reported using libraries' Wi-Fi signals outside when libraries are closed.21 The majority of these users are more likely to be young, black, female, and lower income, utilizing library technology resources for school or work (61%), checking email or sending texts (53%), finding health information (38%), and taking online courses or completing certifications (26%).22 Public libraries are already exploring creative approaches to providing internet access for these underserved communities.
The mobile hotspot lending programs in the New York City and Kansas City public library systems are just two examples.23 Yet libraries can do more, supporting innovation and providing leadership by partnering with other community organizations and their stakeholders to enhance resilience in addressing access and inclusion. The emergence of TVWS wireless technology presents an opportunity for libraries to explore expanding the reach of their wireless signals beyond library buildings and to extend 24/7 library Wi-Fi availability to community spaces such as subsidized housing, schools, clinics, parks, senior centers, and museums.

TVWS Basics

TV White Space (TVWS) refers to the unoccupied portions of spectrum in the VHF/UHF terrestrial television frequency bands.24 Television broadcast frequency allocations traditionally assumed that TV station transmissions operating at high power needed wide spectrum separation to prevent interference between broadcasting channels, which led to the specific spectrum allocation of these frequency "guard bands."25 Research discovered that low-power devices can operate within these spaces, which led the Federal Communications Commission (FCC) to field test TVWS applications to wireless communications and (ultimately) promote TVWS neutrality.26 In 2015, the FCC made a portion of these very valuable TVWS bands of spectrum available for open, shared public use, like Wi-Fi.31 Yet, unlike Wi-Fi, with a reach measured in tens of meters, the range of TVWS is measured in hundreds or even thousands of meters. TVWS has good propagation characteristics, which makes it an extremely valuable license-exempt radio spectrum27 (a rough path-loss comparison appears in the sketch below). TVWS availability is relatively stable and does not change over time, allowing spectrum availability estimates to remain reliable and valid, which in turn promotes its various applications.28 Radio spectrum is considered a "common heritage of humanity,"29 as radio waves "do not respect national borders."30

TVWS availability and application are contextual and dependent on many key factors. Availability is influenced by frequency (the idle channels purposely planned in TV bands, varying across regions), deployment (the height and location of the TVWS transmit antenna and its installation sites in relation to nearby surrounding TV broadcasting reception), space and distance (geographical areas outside the current planned TV coverage, including no present broadcasting signals), and time (off-air availability of licensed broadcasting transmitters during specific periods of time, subject to change by the broadcaster).32 Because TVWS exists as fragmented "safety margins" between broadcast services, it is typically more abundant, and available in larger contiguous blocks, in rural areas with less broadcast coverage than in highly dense urban areas.33 Assigned spectrum is not always used efficiently and effectively by licensees, and exclusive or non-exclusive sharing can alleviate pressure on these resources.34 The "spectrum crunch" caused by inefficient use of scarce spectrum resources can be alleviated with dynamic spectrum access (DSA) and spectrum sharing.
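To give a back-of-the-envelope sense of why TV-band signals travel so much farther than Wi-Fi, the following sketch compares free-space path loss at a UHF TV-band frequency against the 2.4 GHz Wi-Fi band. This is our own illustration rather than a calculation from the article; 600 MHz is used purely as a representative TV-band value, and free-space loss ignores terrain, obstructions, antennas, and regulatory power limits.

```python
import math

def fspl_db(distance_km: float, frequency_mhz: float) -> float:
    """Free-space path loss in dB (Friis formula; distance in km, frequency in MHz)."""
    return 20 * math.log10(distance_km) + 20 * math.log10(frequency_mhz) + 32.44

# Compare loss over the same 1 km path at a TV-band frequency vs. Wi-Fi.
for label, f_mhz in [("TV band (600 MHz)", 600.0), ("Wi-Fi (2400 MHz)", 2400.0)]:
    print(f"{label}: {fspl_db(1.0, f_mhz):.1f} dB over 1 km")

# Prints roughly 88 dB vs. 100 dB. The ~12 dB gap (20*log10(2400/600))
# means the lower-frequency signal arrives about 16 times stronger, before
# even accounting for its better penetration of walls and foliage.
```

In practice the lower-frequency advantage tends to be even larger indoors, since longer wavelengths diffract around and penetrate obstacles more readily.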
TVWS availability is small where digital television has been deployed, with the potential for aggregate interference (from TVWS users in relation to the primary TV service) and self-interference (within the TVWS network), which may lead to a "mismatch situation" in which demand for bandwidth is high but the TVWS bandwidth supply is very low.35 Because most spectrum frequencies have been organized through some form of exclusive access, in which only the licensee can use the specific spectrum, technologies such as cognitive radios can enable new modes of spectrum access, supporting autonomous, self-configuring, self-planning networks that rely on up-to-date TVWS availability databases.

The limited distribution (in many areas) of basic broadband infrastructure and the relatively high cost of access often prevent individuals with lower incomes from participating in the digital revolution of information access and its opportunities.36 Despite these challenges to broadband availability, TVWS excels in areas with low broadband coverage. Rural regions possess greater frequency availability due to a lower density of spectrum licensing. In comparison to frequencies operating higher up the spectrum band, TVWS does not require direct line-of-sight between devices for operation and has lower deployment costs. Equipment costs are comparable to Wi-Fi equipment currently on the market.37 Importantly, TVWS can address access and inclusion through relatively low start-up costs and no ongoing service fees. As a public resource, it can work with existing services to create new, potentially mobile connections to the internet that ensure the continuation of vital services in the event of service interruptions.38 In urban areas with fewer channels available, new efficient spectrum sharing policies will be necessary. Assigned spectrum is not always used efficiently and effectively by licensees, and exclusive or non-exclusive sharing, or "recycling," of bands by multiple parties with changing spectrum needs can make spectrum use more effective and alleviate pressure on these resources.39

TVWS for Public Libraries

TVWS is a viable medium for applications ranging from internet access, content distribution within a given location, tracking (people, animals, and assets), and task automation to public safety and security,40 as well as remote patient monitoring and other telemedicine applications.41 TVWS complements existing networks that use other parts of the spectrum for access points, mobile communications, and home media devices.42 Analyses of a recent digital inclusion survey suggest that technology upgrades can have a significant impact on the ability of libraries to expand programs and services.43 As community anchor institutions (CAIs), public libraries can use TVWS systems to expand and improve access to their services, especially for underserved populations. Library-led collaborations to deploy TVWS networks in other CAIs and public spaces have numerous benefits. In conjunction with building-centered Wi-Fi, TVWS can redistribute network users from congested library spaces to other community sites, thereby distributing network usage across the community. From an existing broadband connection, libraries can extend their networks of internet access strategically across their communities.
Yet, unlike networks that rely solely on limited-range Wi-Fi, far-reaching TVWS can improve the coverage and inclusion of patrons in accessing library programs, services, and the broader internet.44 The portability of the access points allows libraries to extend their reach by providing wireless connections in the short term, for cultural or civic events like fairs, markets, or concerts, and in the long term, at popular public areas.

Recent TVWS pilot installations in Kansas, Colorado, Mississippi, and Delaware have proven to be very stable. The TVWS project at Manhattan Public Library (Kansas) began in fall 2013. Though there were a few delays in the installation and testing process, the TVWS equipment was successfully implemented and welcomed by the community in early 2014. IT staff report that their remote locations have shown that this library service fills a community need, especially for underserved populations.45 Delta County Libraries (Colorado) are conducting trials with two public hotspots to support "Guest" access and potentially provide library patrons with more bandwidth.46 TVWS implementations in the Pascagoula School District (Mississippi)47 and Delaware Public Libraries48 show successful initial pilot usage in providing wireless internet service directly to community-distributed access points. Though there are contextual differences across these sites, the strength of public libraries as CAIs providing internet access via TVWS systems is evident and promising.

First Steps

Any library can take the initiative in setting up a TVWS network on its own. The first step is to assess the availability of spectrum in the library's geographic location. Access to TVWS frequencies is free and requires no subscription fees beyond the initial equipment investment. Public databases of TVWS availability are easily accessible and have been tested by the FCC since 2011;49 Google has also posted its own spectrum database50 (an illustrative availability query appears in the sketch below). From this setup, the library gains access to public TVWS frequencies by which it can broadcast to and receive internet connections from paired TVWS-enabled remote hotspots. Once it is determined that spectrum and channels are available in the desired area, libraries can then explore how their current broadband and wireless connections might be expanded to include several community spaces where internet access is needed. Next, the library works with a TVWS equipment supplier to design and install a TVWS network consisting of a base station integrated with the library's wired connection to the internet. Finally, the library places TVWS-enabled remote hotspots in (previously identified) community-based spaces where Wi-Fi access is needed by underserved populations. Given a high-quality backhaul (i.e., a high-speed fiber optic connection), TVWS can spread that signal and provide access from the library; the signal can propagate through multiple barriers and geographical features and is up to ten times stronger than current Wi-Fi. Depending on the context (geographical features, TVWS availability, etc.), hotspots can be installed up to six miles (10 km) away and do not require line-of-sight between the base station and hotspots. This ability is superior to current Wi-Fi networks that only cover patrons in the immediate vicinity of the library.
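As an illustration of the availability check described above, the sketch below queries a white-space database in the style of the PAWS protocol (RFC 7545), which FCC-designated database administrators have supported. This is a hypothetical sketch rather than a documented API call: the endpoint URL, device serial number, device type, and coordinates are all placeholders, and a production device would follow its database administrator's registration and certification requirements.

```python
import json
import urllib.request

# Placeholder endpoint: substitute the URL of an FCC-designated
# white-space database administrator.
DB_URL = "https://whitespace-db.example.com/paws"

def available_spectrum(lat: float, lng: float) -> dict:
    """Ask a PAWS-style database which TV channels are free at a location."""
    request = {
        "jsonrpc": "2.0",
        "method": "spectrum.paws.getSpectrum",
        "params": {
            "type": "AVAIL_SPECTRUM_REQ",
            "version": "1.0",
            # Placeholder device description; real values come from the
            # equipment vendor and FCC certification.
            "deviceDesc": {"serialNumber": "LIB-0001",
                           "fccTvbdDeviceType": "FIXED"},
            "location": {"point": {"center": {"latitude": lat,
                                              "longitude": lng}}},
        },
        "id": 1,
    }
    data = json.dumps(request).encode("utf-8")
    req = urllib.request.Request(DB_URL, data=data,
                                 headers={"Content-Type": "application/json"})
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)

# Example: check availability at a library's coordinates before
# ordering base-station equipment.
# print(available_spectrum(39.0473, -95.6752))
```

The response from such a database lists the channels, frequency ranges, and permitted power levels at that point, which is the information a library and its equipment supplier would use to plan base-station and hotspot placement.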
These TVWS remote hotspots can also be easily (and strategically) moved to support occasional community needs (such as neighborhood-wide or city events) or in response to crisis situations.

TVWS, Libraries, and Emergency Response

Public libraries provide leadership as "ready access point, first choice, first refuge, and last resort" for community services in everyday matters and in emergencies.51 They have assisted residents in relief efforts during Hurricanes Katrina and Rita, and other natural and man-made disasters.52

…the provision of access to computers and the internet was a wholly unique and immeasurably important role for public libraries… The infrastructure can be a tremendous asset in times of emergencies, and should be incorporated into community plans.53

They have likewise provided immediate and long-term assistance to communities and aid workers, providing physical space for recovery operations for emergency agencies, communication technologies, and emotional support for the community. In previous library internet usage surveys, nearly one-third of libraries reported that their computers and internet services would be used by the public in emergencies to access relief services and benefits.54 Such activities include finding and communicating with family and friends, completing online FEMA forms and insurance claims, and checking news sites for information about their affected homes.55 Yet, despite the admirable and successful efforts of many public libraries, their infrastructures are not always built to meet the increased demand of user needs and e-government services in emergency contexts.56 Jaeger, Shneiderman, Fleischmann, Preece, Qu, and Wu propose the concept of community response grids (CRGs), which utilize the internet and mobile communications devices so that emergency responders and residents in a disaster area can communicate and coordinate accurate, appropriate responses.57 This concept relies on social networks, both in person and online, to enable residents and emergency responders to work together in a multi-directional communication scheme. CRGs provide residents tailored, localized information and a means to report pertinent disaster-related information to emergency responders, who in turn can synthesize and analyze submitted information and act accordingly.58 Due to their existing role as community anchor institutions (CAIs), public libraries are uniquely positioned for CRG involvement. Libraries can assist in facilitating internet access with portable TVWS network connection points. By virtue of their portability, TVWS hotspots can provide essential digital access in times of crisis by moving along with affected populations. Emergency operations and communications in a crisis occur throughout networks comprised of various technologies.
Information management before, during, and after a disaster affects how well a crisis is managed.59 Broadband internet can be one access route in the event that phone and radio transmissions are affected, and vice versa, as part of a "mixed media approach" to get messages to those who need them in an emergency.60 Yet one must remember that internet communications are double-edged: the internet provides relevant material on demand and near-instant sharing and collaboration, but these very features can compound a crisis with misinformation.61 Despite these concerns, integrating wireless devices and other technologies into a multi-technology, collaborative response system can solve the problem of existing communication structures that lack coordination and quality control.62 The proliferation of smartphones, laptops, and other portable wireless devices makes such technology ideal for emergency communications, especially since users' familiarity with their own devices will help them navigate CRG communications while under stress.63

CONCLUSION

Supporting internet access and inclusion in public libraries and having equal, affordable, and available access to information is a necessary component of bridging the digital divide. Technology has become "an irreducible component of modern life, and its presence and use has significant impact on an individual's ability to fully engage in society."64 As Cohron argues, this principle represents more than providing people with internet access: it is about "leveling the playing field in regards to information diffusion. The internet is such a prominent utility in peoples' lives that we, as a society, cannot afford for citizens to go without."65 Broadband access is the first step; digital literacy training is also a necessity. Access alone is not enough to ensure quality and effective use, however, as the digital divide is representative of broader social inequalities that computer and internet access cannot fully remedy.66 This is a complex problem that requires a multi-faceted solution. As Kinney states, "the digital divide is a moving target, and new divides open up with new technologies. Libraries help bridge some inequities more than others, and substantial disparities exist among library systems."67 Internet access also becomes a necessity when the internet is to play a role in emergency communications.68

It is problematic to suggest that public libraries can be simultaneously promoted as the solution to digital divide issues while facing cuts to funding. Policy makers, community advocates, and community members themselves are stakeholders in the success of their communities, and must also take responsibility for access and inclusion via public libraries.69 As public agencies automate to increase equality and save money, they exacerbate digital divides by excluding those without access. Suggesting that community members simply visit the library to ensure access to public services places additional pressure on libraries, yet these efforts may go unsupported and unacknowledged. Public libraries are already valuable community access points to resources, especially in emergencies, though many suffer from a lack of concerted disaster planning. Along similar lines, many libraries are ill-equipped to accommodate the bandwidth needs of growing and oftentimes sparsely connected populations.
As communications and government services move increasingly online, it becomes imperative to build strong, cost-effective information infrastructures. TVWS connections can arguably help break down the barriers that challenge ubiquitous access and inclusion. TVWS-enabled remote access points in daily use around communities are ideally situated to provide everyday Wi-Fi and to be rapidly redeployed to damaged areas (as pop-up hotspots) to provide essential communication and information resources in times of crisis. In short, TVWS can augment the technological infrastructure of public libraries toward further developing their roles as CAIs and leaders serving their communities well into the future.

REFERENCES

1. Wireline Competition Bureau, "2016 Broadband Progress Report," Federal Communications Commission, January 29, 2016, https://www.fcc.gov/reports-research/reports/broadband-progress-reports/2016-broadband-progress-report.
2. Office of Chairman Wheeler, "FCC Adopts Strong, Sustainable Rules to Protect the Open Internet," Federal Communications Commission, February 26, 2015, https://apps.fcc.gov/edocs_public/attachmatch/DOC-332260A1.pdf.
3. "Here's What the Digital Divide Looks Like in the United States," The White House, July 15, 2015, https://www.whitehouse.gov/share/heres-what-digital-divide-looks-united-states.
4. John B. Horrigan and Maeve Duggan, "Home Broadband 2015," Pew Research Center, December 21, 2015, http://www.pewinternet.org/files/2015/12/Broadband-adoption-full.pdf. This 43% is further divided between 33% reporting the monthly subscription cost as their main reason, while the other 10% report the expensive cost of a computer as their reason for non-adoption.
5. Bo Kinney, "The Internet, Public Libraries, and the Digital Divide," Public Library Quarterly 29, no. 2 (2010): 104-161, https://doi.org/10.1080/01616841003779718.
6. Madalyn Cohron, "The Continuing Digital Divide in the United States," The Serials Librarian 69, no. 1 (2015): 77-86, https://doi.org/10.1080/0361526X.2015.1036195.
7. Kinney, "The Internet, Public Libraries, and the Digital Divide."
8. Paul T. Jaeger, John Carlo Bertot, Kim M. Thompson, Sarah M. Katz, and Elizabeth J. DeCoster, "The Intersection of Public Policy and Public Access: Digital Divides, Digital Literacy, Digital Inclusion, and Public Libraries," Public Library Quarterly 31, no. 1 (2012): 1-20, https://doi.org/10.1080/01616846.2012.654728.
9. Brian Real, John Carlo Bertot, and Paul T. Jaeger, "Rural Public Libraries and Digital Inclusion: Issues and Challenges," Information Technology and Libraries 33, no. 1 (2014): 6-24, https://doi.org/10.6017/ital.v33i1.5141.
10. Jaeger et al., "The Intersection of Public Policy and Public Access."
11. John Carlo Bertot, Paul T. Jaeger, Lesley A. Langa, and Charles R. McClure, "Public access computing and Internet access in public libraries: The role of public libraries in e-government and emergency situations," First Monday 11, no. 9 (2006), https://doi.org/10.5210/fm.v11i9.1392.
12. John B. Horrigan, "Libraries 2016," Pew Research Center, September 9, 2016, http://www.pewinternet.org/2016/09/09/libraries-2016/.
13. Real et al., "Rural Public Libraries and Digital Inclusion."
14. John Carlo Bertot, Charles R. McClure, and Paul T. Jaeger, "The Impacts of Free Public Internet Access on Public Library Patrons and Communities," Library Quarterly 78, no. 3 (2008): 285-301, https://doi.org/10.1086/588445.
15. Charles R. McClure, Paul T.
Jaeger, and John Carlo Bertot, "The Looming Infrastructure Plateau? Space, Funding, Connection Speed, and the Ability of Public Libraries to Meet the Demand for Free Internet Access," First Monday 12, no. 12 (2007), https://doi.org/10.5210/fm.v12i12.2017.
16. Ibid.
17. Bertot et al., "Public access computing and Internet access in public libraries."
18. Ibid.; Jaeger et al., "The Intersection of Public Policy and Public Access."
19. Wireline Competition Bureau, "WCB Cost Model Virtual Workshop 2012 - Community Anchor Institutions," Federal Communications Commission, June 1, 2012, https://www.fcc.gov/news-events/blog/2012/06/01/wcb-cost-model-virtual-workshop-2012-community-anchor-institutions.
20. Jennifer Koerber, "ALA and iPAC Analyze Digital Inclusion Survey," Library Journal 141, no. 1 (2016): 24-26.
21. Horrigan, "Libraries 2016."
22. Ibid.
23. Timothy Inklebarger, "Bridging the Tech Gap," American Libraries, September 11, 2015, https://americanlibrariesmagazine.org/2015/09/11/bridging-tech-gap-wi-fi-lending.
24. Andrew Stirling, "White spaces – the new Wi-Fi?," International Journal of Digital Television 1, no. 1 (2010): 69–83, https://doi.org/10.1386/jdtv.1.1.69/1; Cristian Gomez, "TV White Spaces: Managing Spaces or Better Managing Inefficiencies?," in TV White Spaces: A Pragmatic Approach, eds. Ermanno Pietrosemoli and Marco Zennaro (Trieste: Abdus Salam International Centre for Theoretical Physics T/ICT4D Lab, 2013), 67-77.
25. Steve Song, "Spectrum and Development," in TV White Spaces: A Pragmatic Approach, eds. Ermanno Pietrosemoli and Marco Zennaro (Trieste: Abdus Salam International Centre for Theoretical Physics T/ICT4D Lab, 2013), 35-40.
26. Robert Horvitz, "Geo-Database Management of White Space vs. Open Spectrum," in TV White Spaces: A Pragmatic Approach, eds. Ermanno Pietrosemoli and Marco Zennaro (Trieste: Abdus Salam International Centre for Theoretical Physics T/ICT4D Lab, 2013), 7-17.
27. Julie Knapp, "FCC Announces Public Testing of First Television White Spaces Database," Federal Communications Commission, September 14, 2011, https://www.fcc.gov/news-events/blog/2011/09/14/fcc-announces-public-testing-first-television-white-spaces-database.
28. Horvitz, "Geo-Database Management of White Space vs. Open Spectrum."
29. Ryszard Strużak and Dariusz Więcek, "Regulatory Issues for TV White Spaces," in TV White Spaces: A Pragmatic Approach, eds. Ermanno Pietrosemoli and Marco Zennaro (Trieste: Abdus Salam International Centre for Theoretical Physics T/ICT4D Lab, 2013), 19-34.
30. Horvitz, "Geo-Database Management of White Space vs. Open Spectrum," 8.
31. Engineering & Technology Bureau, "FCC Adopts Rules For Unlicensed Services In TV And 600 MHz Bands," Federal Communications Commission, August 11, 2015, https://apps.fcc.gov/edocs_public/attachmatch/FCC-15-99A1_Rcd.pdf.
32. Gomez, "TV White Spaces: Managing Spaces or Better Managing Inefficiencies?," 68.
33. Stirling, "White spaces – the new Wi-Fi?"
34. Linda E. Doyle, "Cognitive Radio and Africa," in TV White Spaces: A Pragmatic Approach, eds. Ermanno Pietrosemoli and Marco Zennaro (Trieste: Abdus Salam International Centre for Theoretical Physics T/ICT4D Lab, 2013), 109-119.
35. Gomez, "TV White Spaces: Managing Spaces or Better Managing Inefficiencies?," 72.
36.
Mike Jensen, "The role of TV White Spaces and Dynamic Spectrum in helping to improve Internet access in Africa and other Developing Regions," in TV White Spaces: A Pragmatic Approach, eds. Ermanno Pietrosemoli and Marco Zennaro (Trieste: Abdus Salam International Centre for Theoretical Physics T/ICT4D Lab, 2013), 83-89.
37. Song, "Spectrum and Development."
38. Ibid.
39. Doyle, "Cognitive Radio and Africa," 113.
40. Stirling, "White spaces – the new Wi-Fi?"
41. Afton Chavez, Ryan Littman-Quinn, Kagiso Ndlovu, and Carrie L. Kovarik, "Using TV white space spectrum to practice telemedicine: A promising technology to enhance broadband Internet connectivity within healthcare facilities in rural regions of developing countries," Journal of Telemedicine and Telecare 22, no. 4 (2015): 260-263, https://doi.org/10.1177/1357633X15595324.
42. Stirling, "White spaces – the new Wi-Fi?"
43. Koerber, "ALA and iPAC Analyze Digital Inclusion Survey."
44. Chavez et al., "Using TV white space spectrum to practice telemedicine."
45. Kerry Ingersoll, June 22, 2015, Google+ comment to the Gigabit Libraries Network, https://plus.google.com/107631107756352079114/posts/L4Y8ci8sG5Y.
46. Delta County Libraries, "Super Wi-Fi Pilot," accessed November 1, 2016, http://www.deltalibraries.org/super-wi-fi-pilot/.
47. Pascagoula TV White Spaces Facebook group, accessed November 1, 2016, https://www.facebook.com/PSDTVWS/.
48. "Delaware Libraries White Space Pilot Update, January 2015," accessed November 1, 2016, http://lib.de.us/files/2015/01/Delaware-Libraries-White-Space-Pilot-Update-Jan-2015.pdf.
49. Knapp, "FCC Announces Public Testing of First Television White Spaces Database."
50. See https://www.google.com/get/spectrumdatabase/.
51. Bertot et al., "Public access computing and Internet access in public libraries."
52. Bertot et al., "The Impacts of Free Public Internet Access." See also Horrigan, "Libraries 2016."
53. Paul T. Jaeger, Lesley A. Langa, Charles R. McClure, and John Carlo Bertot, "The 2004 and 2005 Gulf Coast Hurricanes: Evolving Roles and Lessons Learned for Public Libraries in Disaster Preparedness and Community Services," Public Library Quarterly 25, no. 3/4 (2007): 199-214.
54. Ibid.
55. Bertot et al., "Public access computing and Internet access in public libraries."
56. Ibid.
57. Paul T. Jaeger, Ben Shneiderman, Kenneth R. Fleischmann, Jennifer Preece, Yan Qu, and Philip Fei Wu, "Community response grids: E-government, social networks, and effective emergency management," Telecommunications Policy 31 (2007): 592-604, https://doi.org/10.1016/j.telpol.2007.07.008.
58. Ibid., 595.
59. Laurie Putnam, "By choice or by chance: How the Internet is used to prepare for, manage, and share information about emergencies," First Monday 7, no. 11 (2002), https://doi.org/10.5210/fm.v7i11.1007.
60. Ibid.
61. Ibid.
62. Jaeger et al., "Community response grids," 598. Jaeger et al. describe how the internet combines the best of one-to-one, one-to-many, many-to-one, and many-to-many communication in terms of the flow and quality of information.
One-to-one communication is slow; many-to-one only benefits the central network, while outsiders reporting emergencies do not learn what others are reporting; one-to-many is inefficient, limited, and assumes the broadcaster has the appropriate information and can get it to those who need it most; many-to-many can create "information overload" of questionable content.
63. Ibid., 599.
64. Jaeger et al., "The Intersection of Public Policy and Public Access," 3.
65. Cohron, "The Continuing Digital Divide in the United States," 84.
66. Kinney, "The Internet, Public Libraries, and the Digital Divide," 120.
67. Ibid., 148.
68. Jaeger et al., "Community response grids," 599.
69. Bertot et al., "The Impacts of Free Public Internet Access," 299.

Editorial Board Thoughts: Arts into Science, Technology, Engineering, and Mathematics – STEAM, Creative Abrasion, and the Opportunity in Libraries Today

Tod Colegrove

Tod Colegrove (pcolegrove@unr.edu), a member of the ITAL Editorial Board, is Head of DeLaMare Science & Engineering Library, University of Nevada, Reno.

Over the millennia, man's attempt to understand the universe has been an evolution from the broad to the sharply focused. A wide range of distinctly separate disciplines evolved from the overarching natural philosophy, the study of nature, of Greco-Roman antiquity: anatomy and astronomy through botany, mathematics, and zoology, among many others. Similarly, the Arts, Humanities, and Engineering developed from broad overarching interest into tightly focused disciplines that today are distinctly separate. As these legitimate divisions formed, grew, and developed into ever-deepening specialty, they enabled correspondingly deeper study and discovery;1 in response, the supporting collections of the library divided and grew to reflect that increasing complexity.

Libraries have long been about the organization of, and access to, information resources. Subject classification systems in use today, such as the Dewey Decimal system, are designed to group like items with like, albeit under broad overarching topics. A perhaps inevitable result for print collections housed under such a classification system is the physical isolation of items - and, by extension, the individuals researching those topics - from one another. Under the Library of Congress system, for example, items categorized as "geography" are physically removed from those in "science," further still from "technology." End-users benefit from the possibility of serendipitous discovery while browsing shelves nearby, even as they are effectively shielded from exposure to distracting topics outside of their immediate focus.

Recent years have witnessed a rediscovery of, and renewed interest in, the fundamental role the library can have in the creation of knowledge, learning, and innovation among its members. As collections shift from print to electronic, libraries are increasingly less bound to the physical constraints imposed by their print collections. Rather than a continued focus on hyper-specialization and separation, we have the opportunity to rethink the library: exploring novel configurations and services that might better support its community, and embracing emerging roles of trans-disciplinary collaboration and innovation.

The Library as Intersection

Libraries reflect the institutional and organizational structures of their communities, even as the
physical organization of the structures built to house print collections mirrors the classification system in use. Academic libraries are perhaps most entrenched in the structural division: rather than intrinsically promoting collaboration and discovery across disciplines, the organization of print collections, and typically the spaces around them, is designed to foster increased focus and specialization. In branch libraries of a college or university, this division can reach a pinnacle, specialized almost to the exclusion of other areas of study altogether; libraries and collections devoted to exclusive topics of engineering, science, music, and others exist on campuses across the country. Amplified by separation and clustering of faculty and researchers, typically by department and discipline, it becomes entirely possible for individuals to "spend a lifetime working in a particular narrow field and never come into contact with the wider context of his or her study."2

The library is also one of the few places in any community where individuals from a variety of backgrounds and specialties can naturally cross paths with one another. At a college or university, students and faculty from one discipline might otherwise rarely encounter those from other disciplines. Whether public, school, or academic library, outside of the library individuals and groups are typically isolated from one another physically, with little opportunity to interact organically. Without active intervention and deliberate effort on the part of the library, opportunities for creative abrasion3 and trans-disciplinary collaboration become virtually non-existent; its potential to "unleash the creative potential that is latent in a collection of unlike-minded individuals"4 remains untapped. Leveraged properly, however, the intersection of interests and expertise that occurs naturally within the neutral spaces of the library can become a powerful tool that supports not only research, but creativity and innovation - a place where ideas and viewpoints can collide, building on one another:

"For most of us, the best chance to innovate lies at the Intersection. Not only do we have a greater chance of finding remarkable idea combinations there, we will also find many more of them.... The explosion of remarkable ideas is what happened in Florence during the Renaissance, and it suggests something very important. If we can just reach an intersection of disciplines or cultures, we will have a greater chance of innovating, simply because there are so many unusual ideas to go around."5

Difficult and Scary

The problem?
"Stimulating creative abrasion is difficult and scary because we are far more comfortable being with folks like us."6 And yet a quick review of the literature reveals that knowledge creation, innovation, and success are inextricably linked,7 with the fundamental understanding of their connection having undergone a dramatic shift: "knowledge is in fact essential to innovate, and while this might sound obvious today, putting knowledge and innovation and not physical assets at the centre of competitive advantage was a tremendous change."8

As our libraries move toward embracing an even more active role within our communities, our organizational priorities are undergoing similarly dramatic shifts: support for knowledge creation and innovation becomes more central, even as physical assets shift toward a supporting, even peripheral, role. Libraries, as fundamentally neutral hubs of diverse communities, are uniquely positioned to cultivate creative abrasion within and among their communities, fostering not only knowledge creation, but innovation and success. Indeed, the combination of physical, electronic, and staff assets can be the raw stuff by which trans-disciplinary engagement is encouraged. The active cultivation and support of creative abrasion, with direct linkage to desired outcomes, becomes arguably one of the most vital services the library can provide its community. Rather than deepening the cycle of hyper-specialization, the emergence of makerspace in our libraries is one example of a trend toward enabling libraries to broaden and embrace that support. Building on the intellectual diversity within the spaces of the library, staff members, volunteers, and fellow community members can serve as catalysts, triggering groups to "do something with that variety"9 by engaging across traditional boundaries. Indeed, "by deliberately creating diverse organizations and explicitly helping team members appreciate thinking-styles different than their own, creative abrasion can result in successful innovation."10 Strategic placement and staff support of makerspace activity can dramatically increase the opportunity for creative abrasion - and, by extension, the resulting knowledge creation, creativity, and innovation.

Arts Bring a Fundamental Literacy and Resource to STEM

In recent years, greater emphasis on students acquiring STEM (Science, Technology, Engineering, and Math) skills has made the topic one of the most central issues in education. Considered a key solution to improving the competitiveness of American students on the global stage, the approach of STEM education shares the common goal of breaking down the artificial barriers that exist even within the separate disciplines of science, technology, engineering, and math - in short, increasing the diversity of the learning environment. Proponents of STEAM go further by suggesting that adding Art into the mix can bring new energy and language to the table, "sparking curiosity, experimentation, and the desire to discover the unknown in students."11 Federal agencies such as the U.S.
Department of Education and the National Science Foundation have funded and underwritten a number of grants, conferences, and workshops in the field, including the seminal forum hosted by the Rhode Island School of Design (RISD), "Bridging STEM to STEAM: Developing New Frameworks for Art-Science-Design Pedagogy."12 John Maeda, the president of RISD, identifies a direct connection between the approach and the creativity and success of the late Apple co-founder Steve Jobs, with STEAM support "a pathway to enhance U.S. economic competitiveness."13 Proponents go further, arguing the Arts bring both a fundamental literacy and resource to the STEM disciplines, providing "innovations through analogies, models, skills, structures, techniques, methods, and knowledge."14 Consider the findings of a study of Nobel Prize winners in the sciences, members of the Royal Society, and the U.S. National Academy of Sciences; Nobel laureates were:

- twenty-five times as likely as an average scientist to sing, dance, or act;
- seventeen times as likely to be an artist;
- twelve times more likely to write poetry and literature;
- eight times more likely to do woodworking or some other craft;
- four times as likely to be a musician; and
- twice as likely to be a photographer.15

From the standpoint of creative abrasion, welcoming the "A" of Art into the library's support of STEM disciplines increases the diversity of the library and, by extension, the opportunity for creative abrasion. From Aristotle and Pythagoras through Galileo Galilei and Leonardo da Vinci to Benjamin Franklin, Richard Feynman, and Noam Chomsky, a long list of individuals of wide-ranging genius hints at a potential left largely untapped by our traditional approach. Connections between STEM disciplines, Art, and the innovation arising directly out of their creative abrasion surround us: the electronic screens used in a wide range of technology, including computers, televisions, and cell phones, are the result of a collaboration between a series of painter-scientists and post-impressionist artists such as Seurat - a combination of red, green, and blue dots generates full-spectrum images in a way not unlike that of the artistic technique of pointillism. The electricity to drive that technology is understood, in part, due to early work by Franklin - even as he laid the foundations of the free public library with the opening of America's first lending library, and pursued a broad range of parallel interests. The stitches used in medical surgery are the result of Nobel laureate Alexis Carrel taking his knowledge of lace making from a traditional arena into the operating room. Prominent American inventors "Samuel Morse (telegraph) and Robert Fulton (steam ship) were among the most prominent American artists before they turned to inventing."16 In short, "increasing success in science is accompanied by developed ability in other fields such as the fine arts."17 Rather than isolated in monastic study, "almost all Nobel laureates in the sciences are actively engaged in arts as adults."18 Perhaps surprisingly, rather than being rewarded by an ever-increasing focus and hyper-specialization, genius in the sciences seems tied to individuals' activity in the arts and crafts. The study's authors cite three different Nobel prize winners, including J. H.
Van't Hoff, whose 1878 speculation was that scientific imagination is correlated with creative activities outside of science;19 they go on to detail similar findings from general studies dating back over a century. Of even more seminal interest, the authors point to a similar connection for adolescents and young adults, where Milgram and colleagues20 found "having at least one persistent and intellectually stimulating hobby is a better predictor of career success in any discipline than IQ, standardized test scores, or grades."21

Discussion

The connection between individuals holding a multiplicity of interests, trans-disciplinary activity, and success is clear; what is less clear is to what extent we are fostering that connection in our libraries today. The potential is nevertheless tantalizing: a random group of people, thrown together, is not likely to be very creative. By going beyond specialization and wading into the deeper waters of supporting and cultivating creative abrasion and avocation among the membership of our libraries, we are fostering success and innovation beyond what might otherwise occur. The decision to catalyze and foster the cross-curricular collaboration that is STEAM22 is squarely in the hands of the library: in the design of its spaces, and in the interactions of the staff of the library with the communities served. We can choose to actively connect and catalyze across traditional boundaries.

As the head of a science and engineering library, one of the early adopters of makerspace and actively exploring the possibilities of STEAM engagement for several years, I have time and again witnessed the leaps of insight and creativity brought about by creative abrasion. From across disciplines, members are engaging with the resources of the library - and, with our encouragement, one another - in an ever-increasing cycle of knowledge creation, innovation, and success. The impact is particularly dramatic among individuals from strongly differing backgrounds and disciplines: for example, when an engineering student who considers themselves expert with a particular technology witnesses and interacts with an art student using that same technology and accomplishing something truly unexpected, even seemingly magical. Or when a science student approaching a problem from one perspective realizes a practitioner from a different discipline sees the problem from an entirely different, and yet equally valid, point of view. In each case, it's as if the worldview of each suddenly melts: shifting and expanding, never to return to its original shape. Transformative experiences become the order of the day, even as the informal environment offers a wealth of opportunity to engage with and connect end-users to the more traditional resources of the library.

By actively seeking out opportunities to bring art into traditionally STEM-focused activity, and vice-versa, we are deliberately increasing the diversity of the environment. Makerspace services and activities, to the extent they are open and visibly accessible to all, are a natural fit for the spontaneous development of trans-disciplinary collaboration. Within the spaces of the library, opportunities to connect individuals around shared avocational interest might range from music and spontaneous performance areas to spaces salted with LEGO bricks and jigsaw puzzles; the potential connections between our resources and the members of our communities are as diverse as their interests.
Indeed, when a practitioner from one discipline can interact and engage with others from across the STEAM spectrum, the world becomes a richer place – and maybe, just maybe, we can fan the flames of curiosity along the way.

REFERENCES

1. Bohm, D., and F. D. Peat. 1987. Science, Order, and Creativity: A Dramatic New Look at the Creative Roots of Science and Life. London: Bantam.
2. Ibid., 18-19.
3. Hirshberg, Jerry. 1998. The Creative Priority: Driving Innovative Business in the Real World. London: Penguin.
4. Leonard-Barton, Dorothy, and Walter C. Swap. 1999. When Sparks Fly: Harnessing the Power of Group Creativity. Boston, Massachusetts: Harvard Business School Press Books.
5. Johansson, Frans. 2004. The Medici Effect: Breakthrough Insights at the Intersection of Ideas, Concepts, and Cultures. Boston, Massachusetts: Harvard Business School Press, 20.
6. Leonard-Barton, Dorothy, and Walter C. Swap. 1999. When Sparks Fly: Harnessing the Power of Group Creativity. Boston, Massachusetts: Harvard Business School Press Books, 25.
7. Nonaka, Ikujiro. 1994. "A Dynamic Theory of Organizational Knowledge Creation." Organization Science 5 (1): 14–37.
8. Correia de Sousa, Milton. 2006. "The Sustainable Innovation Engine." Vine 36 (4): 398–405, accessed February 14, 2017. https://doi.org/10.1108/03055720610716656.
9. Leonard-Barton, Dorothy, and Walter C. Swap. 1999. When Sparks Fly: Harnessing the Power of Group Creativity. Boston, Massachusetts: Harvard Business School Press Books, 20.
10. Adams, Karlyn. 2005. The Sources of Innovation and Creativity. Education, September 2005, 33. https://doi.org/10.1007/978-3-8349-9320-5.
11. Jolly, Anne. 2014. "STEM vs. STEAM: Do the Arts Belong?" Education Week Teacher. http://www.edweek.org/tm/articles/2014/11/18/ctq-jolly-stem-vs-steam.html?qs=stem+vs.+steam.
12. Rose, Christopher, and Brian K. Smith. 2011. "Bridging STEM to STEAM: Developing New Frameworks for Art-Science-Design Pedagogy." Rhode Island School of Design Press Release.
13. Robelen, Erik W. 2011. "STEAM: Experts Make Case for Adding Arts to STEM." Education Week. http://www.bmfenterprises.com/aep-arts/wp-content/uploads/2012/02/Ed-Week-STEM-to-STEAM.pdf.
14. Root-Bernstein, Robert. 2011. "The Art of Scientific and Technological Innovations – Art of Science Learning." http://scienceblogs.com/art_of_science_learning/2011/04/11/the-art-of-scientific-and-tech-1/.
15. Ibid.
16. Ibid.
17. Root-Bernstein, Robert, Lindsay Allen, Leighanna Beach, Ragini Bhadula, Justin Fast, Chelsea Hosey, Benjamin Kremkow, et al. 2008. "Arts Foster Scientific Success: Avocations of Nobel, National Academy, Royal Society, and Sigma Xi Members." Journal of Psychology of Science and Technology. https://doi.org/10.1891/1939-7054.1.2.51.
18. Ibid.
19. Van't Hoff, Jacobus Henricus. 1967. "Imagination in Science," Molecular Biology, Biochemistry and Biophysics, translated by G. F. Springer, 1, Springer-Verlag, pp. 1-18.
20. Milgram, Roberta M., and Eunsook Hong. 1997. "Out-of-school activities in gifted adolescents as a predictor of vocational choice and work." Journal of Secondary Gifted Education 8, no. 3: 111. Education Research Complete, EBSCOhost (accessed February 26, 2017).
21. Root-Bernstein, Robert, Lindsay Allen, Leighanna Beach, Ragini Bhadula, Justin Fast, Chelsea Hosey, Benjamin Kremkow, et al. 2008.
"Arts Foster Scientific Success: Avocations of Nobel, National Academy, Royal Society, and Sigma Xi Members." Journal of Psychology of Science and Technology. https://doi.org/10.1891/1939-7054.1.2.51.
22. Land, Michelle H. 2013. "Full STEAM Ahead: The Benefits of Integrating the Arts into STEM." Procedia Computer Science 20. Elsevier Masson SAS: 547–52. https://doi.org/10.1016/j.procs.2013.09.317.

A Technology-Dependent Information Literacy Model within the Confines of a Limited Resources Environment

Ibrahim Abunadi

Ibrahim Abunadi (i.abunadi@gmail.com) is an Assistant Professor, College of Computer and Information Sciences, Prince Sultan University, Riyadh, Saudi Arabia.

ABSTRACT

The purpose of this paper is to investigate information literacy as an increasingly evolving trend in computer education. A quantitative research design was implemented, and a longitudinal case study methodology was conducted to measure tendencies in information literacy skill development and to develop a practical information literacy model. It was found that both students and educators believe that combining information literacy instruction with a learning management system is more effective in increasing information literacy and research skills where information resources are limited. Based on the quantitative study, a practical, technology-dependent information literacy model was developed and tested in a case study, resulting in improved information literacy skills among students who majored in information systems. These results are especially important for smaller universities with libraries having limited technology capabilities, located in developing countries.

INTRODUCTION

Many different challenges arise during a graduate's career, and professional life can involve numerous situations and problems that university students are not prepared for during their college studies.1 The use of internet sources to find solutions to real problems depends on students' and graduates' information literacy skills.2 A strong aid to students' learning is the ability to search, analyze, and apply knowledge from different sources, including literature, databases, and the internet.3

One of the issues students face concerning technology is its continuous evolution. Although students learn survival skills in their professional lives, they also require special coping skills. A skill that should be considered for all technology-related courses is information literacy. Lin defines information literacy as a "set of abilities, skills, competencies, or fluencies, which enable people to access and utilize information resources."4 These are part of the lifelong learning skills of students, which put the power of continuous education in their hands. Another issue is the exclusive allocation of responsibility for information literacy skill development in smaller educational institutes to librarians or to instructors who majored in library science.5 This paper takes another approach, whereby specialized educators, such as capable information systems faculty members, facilitate this skill development.

A learning management system (LMS) is a widely used form of technology for course delivery and the organization of subject material. Blackboard, Desire2Learn, Sakai, Moodle, and ANGEL, as common LMS platforms, provide an integrated guidance system to deliver and analyze learning.
These systems can be used to support information literacy instruction. Standard features include assignments and quizzes, while other systems offer tools that allow students to view and comment on other students' portfolios or work, depending on the LMS's features.6 Before the 1990s, face-to-face learning was common within the educational domain. However, the LMS emerged in the twenty-first century as the internet became a suitable alternative to traditional learning. Moodle, an open-source LMS, is an acronym that stands for "Modular Object-Oriented Dynamic Learning Environment." This online education system is intended to make learning available with the necessary guidance for educators. Web services available through Moodle are based on a well-organized structural outline, and they are widely used to perform educational tasks and to analyze statistics helpful to instructors.7

Peter et al. (2015) presented an approach to information literacy instruction in universities and colleges that combines traditional classroom instruction and online learning; this is known as "blended learning."8 The approach involves only one seminar in the classroom; thus, it can replace traditional sessions at universities and colleges with instruction that incorporates information literacy. They recommended adopting a time-efficient method that augments classroom seminars and literacy instruction with online materials. However, the findings of this study showed that students who only use online materials do not show greater progress in their learning than those who follow the blended approach. Another study, by Jackson, examined how to integrate educational services into learning management systems and library resources more effectively.9 Jackson suggested that better implementation was required and recommended using the Blackboard LMS to incorporate information literacy and scaffolding activities into subject-specific courses.

This study intends to determine the most effective method of information literacy education. It evaluates instructors' and students' perceptions of the effectiveness of traditional teaching in comparison to electronic teaching of information literacy. A quantitative research investigation was conducted with participants, and a research model and questionnaire with three underlying latent variables were developed for this purpose. The participants were asked to describe their understanding of learning systems and their preferences in information literacy education. Their requirements varied with their continuing education levels and past educational activities, based on which software or website appeared to be more supportive and compatible with them.10 This study considered the research results, developed a practical information literacy intervention model, and applied it to a case study.

LITERATURE REVIEW

Previously, educational institutions were limited to face-to-face, classroom-based teaching techniques. Face-to-face teaching is the traditional method still used in most educational institutions. In classrooms, the subject is explained, and books or other paper-based materials are read out of class to enhance understanding.11 Face-to-face learning or teaching is limited by the number of physical resources available.
Therefore, it becomes difficult to accommodate the widespread interest in information literacy through face-to-face learning.12 Gathering information using only physical resources can lead to information deficiencies.13 Education has evolved to benefit from advances in technologies by using an LMS and online sources. The effective usage of an LMS and online sources requires the development of information literacy.

Information Literacy
Information literacy includes technological literacy, information ethics, online library skills, and critical literacy.14 Technological literacy is defined as the ability to use common software, hardware, and internet tools to reach a specific goal. This ability is an important component of information literacy that enables a graduate to seek answers by using the internet and digital resources.15 Hauptman defines information ethics as “the production, dissemination, storage, retrieval, security, and application of information within an ethical context.”16 This skill is essential to preserve the original rights of researchers cited in a study, based on the ethical standards of the graduate conducting the study. Another important component of information literacy comprises online library skills, which can be defined as the ability to use online digital sources, including digital libraries, to effectively seek different knowledge resources by using search engines, correctly locating required information, and using online support when needed.17 Critical literacy is a thorough evaluation of online material that allows for an appropriate conclusion to be reached on the suitability of the material for the required investigation.18 Seeking answers from appropriate sources is important to allow graduates to find and report on accurate and valid data. These components of information literacy enable information extraction from topics related to the desired course or field of research.

Students, professors, instructors, employees, learners, and educational policy administrators are the major knowledge seekers who use information literacy skills.19 With improved online resources available for learning, many learning requirements are moving toward services that are exclusively online.20 Gray and Montgomery studied an online information literacy course.21 They found that teaching with the aid of information literacy is helpful for students in obtaining improved instruction. The authors also compared an online information literacy course and face-to-face instruction, focusing primarily on the behaviors and attitudes of teachers and college students toward the online course. The students agreed that the application of information literacy techniques would be particularly helpful to them in clarifying their understanding of complicated instructions. The teachers also indicated that an information literacy course would result in better regulation of academic processes than face-to-face learning. Dimopoulos et al. (2013) measured student performance within an online learning environment, finding that the online learning environment has direct relevance for the completion of challenging tasks within academic settings.22 The findings further indicated that an LMS could improve teaching activities. As an LMS, Moodle was also helpful for students to ensure their development of collaborative problem-solving skills.
They concluded that Moodle includes different useful modules such as digital resource repositories, interactive wikis, and external add-in tools that have been related to student learning when incorporated into the LMS environment, resulting in better performance. Hernández-García and Conde-González focused on learning analytics tools within engineering education, noting that with such tools engineering students are more likely to understand complicated concepts; the application of the information literacy model therefore resulted in better performance.23 Further, educating students about information sources was found to be helpful for instructors in enhancing the students’ learning by improving their online information retrieval skills. This study indicated that students can develop their learning traits more effectively through online learning than through face-to-face learning.

Many researchers in this area have developed models that are only theoretical.24 However, this paper develops a practical information literacy model that can be tested for improvement in information literacy skills. This is especially relevant for computer and information systems courses, which can sometimes fall outside the purview of library-related training or education in universities with limited resources. The inclusion of information literacy training within computer and information systems courses is not regularly done in the information literacy field.25 Additionally, although some information literacy has been implemented practically in research, no other study has developed a practical information literacy model based on educators’ and students’ information literacy dispositions as well as both information literacy theory and practice.26

Moodle as an LMS
Moodle is a useful and accommodating open-source platform with a stable structure of website services that allows instructors and learners to implement a range of helpful plugins. It can be used as a lively online education community and an enhancement to the face-to-face learning process.27 Moodle is used in around 190 countries and offers its services in over seventy languages. It acts as a mediator for instruction and is widely adopted in many institutions. Moodle provides services such as assignments, wikis, messaging, blogs, quizzes, and databases.28 It can provide a more flexible teaching platform than traditional teaching. Health science educational service providers use it to facilitate self-assurance in their learners. Several educational campuses operate by using face-to-face learning strategies, whereby learners obtain their training at on-campus locations. The objective of Moodle is to enable the education of learners through internet access.29 Xing focused on the broad application of the Moodle LMS for developing educational technology within academic settings, suggesting that academic organizations should promote technology as a solution for common problems with students’ learning processes.30 Such suggestions have been supported by Costa et al. (2012), who found that Moodle is significantly helpful for developing an e-learning platform for students. They emphasized that engineering universities must use the Moodle LMS to provide students with extensive technical knowledge.31 Costello
stated that Moodle, when adopted, can significantly help students improve their skills and knowledge.32

METHODOLOGY
In information literacy skill development, there are studies that support using only face-to-face education or only using an LMS. For example, Churkovich and Oughtred found that face-to-face learning leads to better results in information literacy tutorials than online learning.33 At the same time, Anderson and May concluded that the use of an LMS is viewed by students as a better method than face-to-face instruction in information literacy.34 To test which educational pedagogy (traditional or technology-based) is better regarding information literacy, the following two hypotheses were posited:

H1: Face-to-face learning has a significantly positive influence on information literacy disposition.
H2: Moodle learning has a significantly positive influence on information literacy disposition.

To provide a better understanding of the most effective method of information literacy instruction, a quantitative research design was used. The wording of the questionnaire items (shown in table 1) was inspired by the studies of Ng, Horvat et al., Abdullah, and Deng and Tavares.35 Online questionnaires were prepared and distributed to students, teachers, trainers, and professors as well as administrative departments in a small private university located in the Arabian Gulf region. Initially, a pilot study was conducted to test the instrument. This pilot study involved forty-nine participants and fifteen questions on information literacy. It also included demographic questions.

Table 1. Item coding (variable, item code, and item wording).
Face-to-face Education Disposition (FED)
FED1: Information literacy skills are polished through face-to-face learning
FED2: Face-to-face learning accommodates information literacy requirements
FED3: Face-to-face learning is easier than learning management systems
FED4: Face-to-face learning is better than learning management systems
Moodle Usage Disposition (MUD)
MUD1: Moodle is more easily accessible than other online resources
MUD2: Moodle is an effective web server for information literacy
MUD3: Moodle is more reliable than other online resources
MUD4: Moodle enables the provision of an extensive amount of useful information
MUD5: Moodle is used to overcome language, understanding, and communication gaps
Information Literacy Preference (IL)
IL1: Students and teachers prefer online resources
IL2: Inauthentic websites are helpful for students and teachers
IL3: Authentic websites are useful for students and teachers
IL4: Students and teachers prefer published articles, journals, and books
IL5: Online learning is more effective
IL6: Information is essential for individuals’ knowledge

After the pilot study, a full-scale study was conducted, in which the participants were students, professors, and educational administrators. An online questionnaire was sent to the management of an academic institution in the Arabian Gulf region to assess the instruction methodology used to improve students’ information literacy skills. The language used in the survey was Arabic, and the questionnaire was translated into English for this article by a professional translator. A total of five hundred questionnaires were sent, and 398 were returned with complete responses.
The following criteria were used to filter questionnaires that were not appropriate for this study:

Inclusion Criteria
• People currently involved in the education system.
• Students, teachers, or members of an academic department.
• People who understand information literacy. A question was added to the survey about whether the participant was familiar with information literacy; if not, the participant was removed from the sample.

Exclusion Criteria
• People who were not involved in the education system.
• People who were not aware of online learning systems.
• Staff with no role in learning or teaching.

Table 2. Demographic information (frequency and percent).
Gender: Male 186 (46.73%); Female 212 (53.27%); Total 398 (100%)
Qualification: Undergraduate 181 (45.48%); Graduate 98 (24.62%); Masters 119 (29.90%); Total 398 (100%)
Designation: Student 216 (54.27%); Instructor 90 (22.61%); Administrator 92 (23.12%); Total 398 (100%)

Table 3. Questionnaire response distribution for FED and IL (percent: Agree / Neutral / Disagree / Don’t Know).
FED1: 46.8 / 22.8 / 21.3 / 9.1
FED2: 10 / 74.5 / 14.2 / 1.3
FED3: 1.5 / 12.8 / 75.8 / 9.9
FED4: 32 / 30 / 26 / 12
IL1: 38.8 / 21.3 / 1.5 / 38.4
IL2: 0.3 / 1 / 98.7 / --
IL3: 15 / 31 / 53.3 / --
IL4: 49.5 / 30 / 13.0 / 7.5
IL5: 48 / 29.8 / -- / 22.2
IL6: 74 / 11.5 / 1.8 / 12.7

Table 4. Responses to MUD (percent: Yes / No).
MUD1: 65 / 35
MUD2: 73.3 / 26.8
MUD3: 67 / 33
MUD4: 66 / 34
MUD5: 63.7 / 36.3

The reliability statistics showed a high level of consistency for the pilot test because the Cronbach’s alpha for the fifteen items was 0.901, which is above the recommended level of 0.7.36 Cronbach’s alpha is a widely used coefficient measuring the internal consistency of items as a unified group37 (a computational sketch is given below).

Based on the successful pilot study, a full-scale study was conducted. The demographic distribution for the full-scale study is shown in table 2, along with the frequency and percentage of each demographic factor. The distribution of the questionnaire items for the full-scale study is shown in tables 3 and 4. Cronbach’s alpha was used to determine the reliability of the constructed items for the full-scale study. The standard benchmark for the reliability value is a 0.7 threshold, and the Cronbach’s alpha for all constructed items was above this standard value. Thus, all the items had appropriate and adequate reliability.38

RESULTS
The research hypotheses were tested using structural equation modeling (SEM) with the analysis of moment structures (AMOS) approach. SEM includes various statistical methods and computer algorithms that are used to assess latent variables along with observed variables. SEM also indicates the relationships among latent variables, showing the effects of the independent variables on the dependent variables.39 One well-regarded SEM package is AMOS, a multivariate technique that can concurrently assess the relationships between latent variables and their corresponding indicators (the measurement model) as well as the relationships among the model’s variables.40 Highly cited information systems and statistics guidelines were followed for the SEM to ensure the validity and reliability of the data analysis.41
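To make the reliability check reported above concrete, the following is a minimal sketch of how Cronbach’s alpha can be computed from raw item responses. The pilot DataFrame here is simulated (the study’s raw data are not published), so the printed value will not reproduce the reported 0.901; it only illustrates the calculation.

```python
import numpy as np
import pandas as pd

def cronbach_alpha(items: pd.DataFrame) -> float:
    """Cronbach's alpha for Likert-coded items (rows = respondents, cols = items)."""
    items = items.dropna()
    k = items.shape[1]                          # number of items
    item_vars = items.var(axis=0, ddof=1)       # per-item variances
    total_var = items.sum(axis=1).var(ddof=1)   # variance of the summed scale
    return (k / (k - 1)) * (1 - item_vars.sum() / total_var)

# Hypothetical example: 49 pilot respondents x 15 items coded 1-5.
rng = np.random.default_rng(0)
pilot = pd.DataFrame(rng.integers(1, 6, size=(49, 15)),
                     columns=[f"item{i}" for i in range(1, 16)])
print(f"Cronbach's alpha = {cronbach_alpha(pilot):.3f}")
# Values above 0.7 are conventionally treated as acceptable reliability.
```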
Measurement and Structural Model
The measurement model contained fifteen items representing three latent variables: face-to-face education disposition, Moodle usage disposition, and information literacy preference. Before proceeding to this analysis, the data needed to show normality so that the robustness of this parametric SEM could be trusted. Curran et al. suggested skewness and kurtosis below the absolute values of 2 and 7, respectively, to demonstrate the normality of the data.42 All items’ absolute values of skewness and kurtosis were less than the suggested cutoffs, showing a suitable level of normality for conducting the SEM analysis. The overall measurement model showed a high level for the fit indices: GFI=0.99, AGFI=0.98, NFI=0.98, CMIN/DF=0.86, and RMR=0.39. Fit indices above 0.95 show that the theoretical model fits well with the empirical data; CMIN/DF and RMR follow different cutoffs, with CMIN/DF expected to be less than 3 and RMR less than 0.5.43

Table 5 shows that all the items loaded on their corresponding latent variables higher than the suggested cutoff (0.5). As shown in the table, IL6 was the only item that did not load clearly on its latent variable, and thus it was dropped from further analysis.44 An additional method to assess item loading was item loading significance, which was significant at the level of 0.001, indicating that all items loaded on their latent variables.45 The indices of the measurement model suggested that the instrument’s psychometric properties were adequate to proceed to the structural model.

Table 5. Item loadings (item and estimate).
Face-to-face Education Disposition (FED): FED4 0.71; FED3 0.52; FED2 0.66; FED1 0.89
Moodle Usage Disposition (MUD): MUD5 0.93; MUD4 0.92; MUD3 0.92; MUD2 0.73; MUD1 0.93
Information Literacy Preference Disposition (IL): IL6 0.32; IL5 0.91; IL4 0.72; IL3 0.86; IL2 0.81; IL1 0.83

The next step was to assess the structural model, which was used to evaluate the hypothesized relations between the independent variables (face-to-face education disposition [FED] and Moodle usage disposition [MUD]) and the dependent variable (information literacy preference [IL]). Both education methods were tested in the hypotheses to identify the most suitable information literacy delivery mode for students. Both hypotheses were supported, which indicates that no single method of information literacy delivery (either face-to-face instruction or an LMS) is preferred, and a different model can be suggested. Both hypotheses were supported at the level of 0.001, with an effect size for face-to-face education disposition of 0.32, which indicates a medium impact on information literacy preferences. Meanwhile, the Moodle usage disposition had an effect size of 0.70, which is considered a large effect size (Hair et al. 2010). Finally, the model’s explanatory power for information literacy preferences was determined by R2, which was high (0.85). Based on the previous analysis, it can be said that a single method of information literacy delivery is insufficient in developing countries. Thus, a different model for information literacy was developed (figure 1) to improve students’ related competencies.

Figure 1. Information Literacy Intervention Model.
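The normality screening and model specification described above can be sketched as follows. This is an illustration only, not the authors’ analysis: the study used the commercial AMOS package, so the DataFrame `df` of item responses and the lavaan-style model syntax (as accepted by the open-source semopy package) are assumptions made for demonstration.

```python
import pandas as pd
from scipy.stats import kurtosis, skew

# df is a hypothetical DataFrame of the fifteen Likert-coded items (FED1..IL6).
def screen_normality(df: pd.DataFrame, max_skew: float = 2.0, max_kurt: float = 7.0) -> pd.DataFrame:
    """Flag items whose |skewness| or |kurtosis| exceed Curran et al.'s cutoffs."""
    report = pd.DataFrame({
        "skew": df.apply(lambda col: skew(col, nan_policy="omit")),
        "kurtosis": df.apply(lambda col: kurtosis(col, nan_policy="omit")),
    })
    report["ok"] = (report["skew"].abs() < max_skew) & (report["kurtosis"].abs() < max_kurt)
    return report

# Lavaan-style specification of the measurement and structural model:
MODEL_SPEC = """
FED =~ FED1 + FED2 + FED3 + FED4
MUD =~ MUD1 + MUD2 + MUD3 + MUD4 + MUD5
IL  =~ IL1 + IL2 + IL3 + IL4 + IL5
IL  ~ FED + MUD
"""
# IL6 is omitted, mirroring its removal for a low loading (0.32).
# With semopy installed, the model could be fit and fit statistics inspected:
# from semopy import Model, calc_stats
# model = Model(MODEL_SPEC); model.fit(df); print(calc_stats(model).T)
```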
As shown in figure 1, the model includes conducting weekly information literacy sessions that focus on educating students about technological literacy, information ethics, online library skills, and critical literacy. After each session concludes, the instructor creates weekly assignments in an LMS that test the students’ information literacy abilities regarding the subject material. The instructor follows up on the students’ overall performance and fills any identified gaps in subsequent information literacy sessions and assignments. The instructor studies the students’ performance after one month and provides feedback to students. Finally, a “real case project assignment” is used to teach students to solve real problems using the skills they learn. The instructor can further extend reflection on the process of assigning “real case project” grades by creating a course exit survey that asks students about their acquired level of information literacy skills.

LONGITUDINAL CASE STUDY
A small technical university in the Arabian Gulf region faces difficulties in providing adequate library resources to its students because of its limited capabilities. The university has about 4,500 students and five hundred employees. The university library and information technology department are short of adequate staff and resources, resulting in insufficient support for student learning. This has caused a lack of student information literacy education, which is evident in students’ submitted assignments. For example, students are not accustomed to citing the materials used in their assessments, so their use of online materials is viewed suspiciously by their educators. Not knowing how to paraphrase and then cite relevant online materials costs students learning opportunities. Information literacy is a skill that should be considered for all technology-related courses.46 The outcomes of this course will be used to improve the education of students and place the power of learning in their hands.47

Therefore, the objective of this case study is to determine the influence of information literacy practices in improving student performance in solving organizational problems, especially when technology and library resources are scarce. This longitudinal case study was conducted over two semesters: the first was conducted traditionally, without the use of the information literacy intervention model, whereas in the second semester the intervention model was introduced. Finally, the performance and opinions of students in the two semesters were compared using a case study assignment and a course exit survey.

The information literacy intervention model was implemented by providing a series of practical tutorials at the beginning of the semester showing students how to use information from the internet. Then, the students applied the information and used information literacy skills to solve weekly assessments for an enterprise-architecture (EA) course. This course is taught under the information systems program at a private university. Students enrolling in the course are in their second year or higher. The information literacy assessments require students to search for reliable sources of information and cite and reference them. This forces them into the habit of critically examining sources of information, and grasping, analyzing, and using these sources to solve problems.
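Because the weekly assignments and follow-up in figure 1 run through the LMS, an instructor can automate part of the follow-up step by pulling assignment data out of Moodle. The sketch below uses Moodle’s standard REST web-service endpoint; the site URL, token, and course id are hypothetical placeholders, and the functions enabled vary by installation, so treat this as an illustration rather than the study’s actual tooling.

```python
import requests

MOODLE_URL = "https://moodle.example.edu"   # hypothetical site
TOKEN = "REPLACE_WITH_WS_TOKEN"             # issued by the Moodle administrator

def moodle_call(wsfunction: str, **params):
    """Call a Moodle REST web-service function and return the parsed JSON."""
    payload = {
        "wstoken": TOKEN,
        "wsfunction": wsfunction,
        "moodlewsrestformat": "json",
        **params,
    }
    resp = requests.post(f"{MOODLE_URL}/webservice/rest/server.php", data=payload)
    resp.raise_for_status()
    return resp.json()

# List the weekly information literacy assignments in a course (course id 42
# is a placeholder; mod_assign_get_assignments is a core Moodle function).
assignments = moodle_call("mod_assign_get_assignments", **{"courseids[0]": 42})
for course in assignments.get("courses", []):
    for a in course.get("assignments", []):
        print(a["id"], a["name"], a["duedate"])
```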
The information literacy technology pedagogical method was followed to improve students’ knowledge of methods of learning.48 The students were educated through a series of classes on how to use the university’s databases, e-books, and internet resources to solve real-life organizational problems and to apply concepts in different situations, as shown in figure 1. The students were given ten small assessments through the Moodle LMS, in which a concept taught in class needed to be applied after students searched for it and learned more about it from different sources. This included looking in the correct places for reliable resources, online scholarly databases, and online videos that could be of use. Then, students were taught how to critically examine resources and determine which of these could be reliable. For example, students were shown that highly cited papers were more reliable than less cited papers and that online videos from professional organizations (e.g., IBM or Gartner) were more reliable than personal videos. Students were also taught how to use in-text citations and how to create reference lists. In the last quarter of the semester, a case study assignment was provided with real-life problems that students were required to solve using different sources, including the internet.

The performance of semester-1 students (no intervention was conducted) was compared with that of semester-2 students (the information literacy intervention was conducted) taking the same course. An improvement in grades was considered an indicator of success. The comparison point was a major project that required students to solve real-life organizational problems and required greater information literacy. Some of the EA concepts taught in the class required practice to apply. For example, the as-is organizational modeling that is needed before implementing EA would be difficult to understand unless students actually conducted modeling of selected organizations. This enabled students to understand how the concepts related to the real world. The concepts that were focused on related to business tools in information systems (e.g., business process management and requirements elicitation) that are widely used for analysis within organizations. The theory behind these tools was explained in class; applying these theories required students to search many sources of information, including online books and research databases. Students were unaware of these resources until the instructors explained their availability on the internet and in the library.

The students were provided with regular information literacy sessions to improve their skills in this aspect. They were shown how to search; for instance, if they could not find a specific term, they could look for synonyms. They were instructed on how to use search engines and research databases and were shown the relevant electronic journals and books that can aid in solving weekly assessments. The usage of internet multimedia is also important in education.49 The students were shown relevant YouTube channels (e.g., by Harvard and Khan Academy) and relevant massive open online courses (e.g., free courses on Coursera.com and Udemy.com). Weekly tests required students to use these resources to solve the assessment problems. An important outcome of this intervention was an improvement in students’ abilities to use different digital resources.
This was evident in semester-2 students’ usage of suitable reference lists and in-text citations, as compared to a lack of such usage by semester-1 students. An additional measure was the higher average score the students gave in semester 2 (4.15/5), in comparison to semester 1 (3.2/5), for one of the items in the course exit survey relevant to information literacy: “Illustrate responsibility for one’s own learning.” The students were continually taught that information literacy grants a power that comes with responsibility, and no incidents of plagiarism were reported during the semester in which the intervention was conducted. Referencing became a habit through the weekly information literacy assessments.

The students’ grades in the final project were better than in the previous academic semester. The average grade for the project in semester 1 was 15.5/20, while that in semester 2 was 17/20. The difference between the grades for the semester-1 and semester-2 projects was statistically significant at the level of 0.10, indicating significant differences in the students’ grades between the two semesters (a sketch of this comparison is given below). The students could use digital library databases, and some were interested in using external online books. It became habitual for students to use in-text citations, and their references became diversified. Some students, however, still struggled, at times using suitable references in only some paragraphs. This feedback was delivered to students so that they could address the issue in other courses.

DISCUSSION AND CONCLUSION
This study was conducted to investigate the most effective mode of information literacy delivery. The study focused on smaller universities because they do not have adequate library facilities and technological capabilities to provide students with sufficient information literacy competencies during course delivery. A survey was conducted to determine the most suitable form of information literacy delivery. The survey determined that Moodle and face-to-face methods were both favored for information literacy. Thus, the information literacy intervention model was developed and tested in a case study, so that students’ performance would improve. The results of this study have shown that the combination of technology and information literacy instruction is an effective method to improve student skills in using digital resources to seek knowledge. It was found that both face-to-face learning and the use of an LMS increase student performance in assessments that require information literacy. Face-to-face learning is required in order to explain information literacy concepts, while the LMS is used to disseminate the necessary digital resources and to create assessment modules. Thus, the arrangement of both theory and practice in information literacy resulted in better understanding and implementation in knowledge seeking and problem-solving related to information systems. The inclusion of information literacy instruction along with the use of an LMS for information literacy assessments within information systems courses has reduced the pressure on libraries that lack technological resources (such as PCs) and qualified staff.
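The semester comparison above can be reproduced with a standard independent-samples t-test. The grade vectors below are simulated placeholders (the paper reports only the group means, 15.5/20 and 17/20, not the raw per-student scores or class sizes), so this sketch illustrates the test rather than the study’s actual data.

```python
import numpy as np
from scipy import stats

# Hypothetical project grades out of 20, centered on the reported means.
rng = np.random.default_rng(1)
semester1 = np.clip(rng.normal(15.5, 2.0, size=40), 0, 20)  # no intervention
semester2 = np.clip(rng.normal(17.0, 2.0, size=40), 0, 20)  # with intervention

t_stat, p_value = stats.ttest_ind(semester1, semester2)
print(f"t = {t_stat:.2f}, p = {p_value:.3f}")
# The study reports significance at the 0.10 level, a more permissive
# threshold than the conventional 0.05.
if p_value < 0.10:
    print("Difference significant at the 0.10 level")
```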
The results with regard to this study’s hypotheses are in agreement with those of previous studies.50 Hypothesis 1, which posited that face-to-face learning would have a significantly positive influence on information literacy disposition, is congruent with the research of Churkovich and Oughtred.51 Their research focused on student information literacy skill development using library facilities instead of faculty, which is a different approach than the one followed in the present study. However, both the present study and the study of Churkovich and Oughtred found that using face-to-face instruction leads to improved student performance. Hypothesis 2, which posited that Moodle learning would have a significantly positive influence on information literacy disposition, correlates with the research of Anderson and May.52 They found that using an LMS is more effective than face-to-face instruction for information literacy instruction. Similar to Churkovich and Oughtred (and in contrast to the present study), Anderson and May relied on librarians to deliver information literacy instruction online; however, they also relied on faculty staff in addition to librarians.

There are two noteworthy outcomes of the first study. First, the questionnaire measurement model showed that the development of this instrument was successful and that the items and their latent variables can be used in further studies. Second, the results regarding the structural model indicated that both face-to-face instruction and Moodle use influenced information literacy preferences. Other studies have supported these results. The results of Peter et al. (2015) agree with the finding of the present study that the combination of face-to-face instruction and LMS use leads to improved student performance.53 Peter et al., whose participants were psychology students, focused on the time-efficiency of the delivery of information literacy instruction; in contrast, the present study considers information literacy skill development a progressive, long-term process. The information literacy intervention model is not only a learning medium but an interactive method of teaching that adapts to student learning patterns.

The primary limitations of the study were the nature of the sample, the exclusion of some potentially relevant variables, and the simplification of the study’s findings. The sample was limited to students, professors, and people who were aware of the learning programs; it is highly possible that they were more familiar with such technological innovations than the general population. Future studies could retest the hypotheses of the study in a comprehensive manner and impose more control on the respondents. The interaction between people while visiting a site is itself an activity worthy of examination, but it must be either controlled or measured for us to understand the role it plays in shaping attitudes and behaviors. Future studies can apply the developed theoretical model in different settings to determine its interaction with other variables in the information systems field. A quantitative instrument can be developed based on the information literacy intervention model. Alternatively, this model can be applied with qualitative interviews in future studies to develop theoretical themes based on instructors’ and students’ responses.

REFERENCES
1 Harry M.
Kibirige and Lisa DePalo, “The Internet as a Source of Academic Research Information: Findings of Two Pilot Studies,” Information Technology and Libraries 19, no. 1 (2000): 11–15; Debbie Folaron, A Discipline Coming of Age in the Digital Age (Philadelphia: John Benjamins, 2006); N. N. Edzan, “Tracing Information Literacy of Computer Science Undergraduates: A Content Analysis of Students’ Academic Exercise,” Malaysian Journal of Library & Information Science 12, no. 1 (2007): 97–109.

2 Heinz Bonfadelli, “The Internet and Knowledge Gaps,” European Journal of Communication 17, no. 1 (2002): 65–84, http://journals.sagepub.com/doi/abs/10.1177/0267323102017001607; Kibirige and DePalo, “The Internet as a Source of Academic Research Information,” 11–15.

3 Laurie A. Henry, “Searching for an Answer: The Critical Role of New Literacies While Reading on the Internet,” The Reading Teacher 59, no. 7 (2006): 614–27.

4 Peyina Lin, “Information Literacy Barriers: Language Use and Social Structure,” Library Hi Tech 28, no. 4 (2010): 548–68, https://doi.org/10.1108/07378831011096222.

5 Michael R. Hearn, “Embedding a Librarian in the Classroom: An Intensive Information Literacy Model,” Reference Services Review 33, no. 2 (2005): 219–27.

6 Hui Hui Chen et al., “An Analysis of Moodle in Engineering Education: The TAM Perspective” (paper presented at Teaching, Assessment and Learning for Engineering (TALE), 2012 IEEE International Conference on).

7 Edzan, “Tracing Information Literacy of Computer Science Undergraduates,” 97–109.

8 Johannes Peter et al., “Making Information Literacy Instruction More Efficient by Providing Individual Feedback,” Studies in Higher Education (2015): 1–16, https://doi.org/10.1080/03075079.2015.1079607.

9 Pamela Alexondra Jackson, “Integrating Information Literacy into Blackboard: Building Campus Partnerships for Successful Student Learning,” The Journal of Academic Librarianship 33, no. 4 (2007): 454–61, https://doi.org/10.1016/j.acalib.2007.03.010.

10 Manal Abdulaziz Abdullah, “Learning Style Classification Based on Student’s Behavior in Moodle Learning Management System,” Transactions on Machine Learning and Artificial Intelligence 3, no. 1 (2015): 28.

11 Catherine J. Gray and Molly Montgomery, “Teaching an Online Information Literacy Course: Is It Equivalent to Face-to-Face Instruction?,” Journal of Library & Information Services in Distance Learning 8, no. 3–4 (2014): 301–9, https://doi.org/10.1080/1533290X.2014.945876.

12 William Sugar, Trey Martindale, and Frank E. Crawley, “One Professor’s Face-to-Face Teaching Strategies While Becoming an Online Instructor,” Quarterly Review of Distance Education 8, no. 4 (2007): 365–85.

13 Stephann Makri et al., “A Library or Just Another Information Resource? A Case Study of Users’ Mental Models of Traditional and Digital Libraries,” Journal of the Association for Information Science and Technology 58, no. 3 (2007): 433–45.

14 Christine Susan Bruce, “Workplace Experiences of Information Literacy,” International Journal of Information Management 19, no. 1 (1999): 33–47, https://doi.org/10.1016/S0268-4012(98)00045-0; Michael B. Eisenberg, Carrie A. Lowe, and Kathleen L. Spitzer, Information Literacy: Essential Skills for the Information Age (Westport, CT: Greenwood Publishing Group, 2004).
15 Andy Carvin, “More Than Just Access: Fitting Literacy and Content into the Digital Divide Equation,” Educause Review 35, no. 6 (2000): 38–47.

16 Robert Hauptman, Ethics and Librarianship (Jefferson, NC: McFarland, 2002).

17 JaNae Kinikin and Keith Hench, “Poster Presentations as an Assessment Tool in a Third/College Level Information Literacy Course: An Effective Method of Measuring Student Understanding of Library Research Skills,” Journal of Information Literacy 6, no. 2 (2012), https://doi.org/10.11645/6.2.1698; Stuart Palmer and Barry Tucker, “Planning, Delivery and Evaluation of Information Literacy Training for Engineering and Technology Students,” Australian Academic & Research Libraries 35, no. 1 (2004): 16–34, https://doi.org/10.1080/00048623.2004.10755254.

18 Lauren Smith, “Towards a Model of Critical Information Literacy Instruction for the Development of Political Agency,” Journal of Information Literacy 7, no. 2 (2013): 15–32, https://doi.org/10.11645/7.2.1809.

19 Melissa Gross and Don Latham, “What’s Skill Got to Do with It?: Information Literacy Skills and Self‐Views of Ability among First‐Year College Students,” Journal of the American Society for Information Science and Technology 63, no. 3 (2012): 574–83, https://doi.org/10.1002/asi.21681.

20 Bala Haruna et al., “Modelling Web-Based Library Service Quality and User Loyalty in the Context of a Developing Country,” The Electronic Library 35, no. 3 (2017): 507–19, https://doi.org/10.1108/EL-10-2015-0211.

21 Gray and Montgomery, “Teaching an Online Information Literacy Course,” 301–9.

22 Ioannis Dimopoulos et al., “Using Learning Analytics in Moodle for Assessing Students’ Performance” (paper presented at the 2nd Moodle Research Conference, Sousse, Tunisia, 4–6, 2013).

23 Ángel Hernández-García and Miguel Á. Conde-González, “Using Learning Analytics Tools in Engineering Education” (paper presented at LASI Spain, Bilbao, 2016).

24 Hearn, “Embedding a Librarian in the Classroom,” 219–27, https://doi.org/10.1108/00907320510597426; Thomas P. Mackey and Trudi E. Jacobson, “Reframing Information Literacy as a Metaliteracy,” College & Research Libraries 72, no. 1 (2011): 62–78; S. Serap Kurbanoglu, Buket Akkoyunlu, and Aysun Umay, “Developing the Information Literacy Self-Efficacy Scale,” Journal of Documentation 62, no. 6 (2006): 730–43, https://doi.org/10.1108/00220410610714949.

25 Michelle Holschuh Simmons, “Librarians as Disciplinary Discourse Mediators: Using Genre Theory to Move toward Critical Information Literacy,” portal: Libraries and the Academy 5, no. 3 (2005): 297–311, https://doi.org/10.1353/pla.2005.0041; Sharon Markless and David R. Streatfield, “Three Decades of Information Literacy: Redefining the Parameters,” in Change and Challenge: Information Literacy for the 21st Century (Blackwood, South Australia: Auslib Press, 2007): 15–36; Meg Raven and Denyse Rodrigues, “A Course of Our Own: Taking an Information Literacy Credit Course from Inception to Reality,” Partnership: The Canadian Journal of Library and Information Practice and Research 12, no.
1 (2017), https://doi.org/10.21083/partnership.v12i1.3907.

26 Joanne Munn and Jann Small, “What Is the Best Way to Develop Information Literacy and Academic Skills of First Year Health Science Students? A Systematic Review,” Evidence Based Library and Information Practice 12, no. 3 (2017): 56–94, https://doi.org/10.18438/B8QS9M; Sheila Corrall, “Crossing the Threshold: Reflective Practice in Information Literacy Development,” Journal of Information Literacy 11, no. 1 (2017): 23–53, https://doi.org/10.11645/11.1.2241.

27 Liping Deng and Nicole Judith Tavares, “From Moodle to Facebook: Exploring Students’ Motivation and Experiences in Online Communities,” Computers & Education 68 (2013): 167–76, https://doi.org/10.1016/j.compedu.2013.04.028.

28 Ana Horvat et al., “Student Perception of Moodle Learning Management System: A Satisfaction and Significance Analysis,” Interactive Learning Environments 23, no. 4 (2015): 515–27, https://doi.org/10.1080/10494820.2013.788033.

29 Cary Roseth, Mete Akcaoglu, and Andrea Zellner, “Blending Synchronous Face-to-Face and Computer-Supported Cooperative Learning in a Hybrid Doctoral Seminar,” TechTrends 57, no. 3 (2013): 54–59, https://doi.org/10.1007/s11528-013-0663-z.

30 Ruonan Xing, “Practical Teaching Platform Construction Based on Moodle—Taking ‘Education Technology Project Practice’ as an Example,” Communications and Network 5, no. 3 (2013): 631, https://doi.org/10.4236/cn.2013.53B2113.

31 Carolina Costa, Helena Alvelos, and Leonor Teixeira, “The Use of Moodle E-Learning Platform: A Study in a Portuguese University,” Procedia Technology 5 (2012): 334–43, https://doi.org/10.1016/j.protcy.2012.09.037.

32 Eamon Costello, “Opening Up to Open Source: Looking at How Moodle Was Adopted in Higher Education,” Open Learning: The Journal of Open, Distance and e-Learning 28, no. 3 (2013): 187–200, https://doi.org/10.1080/02680513.2013.856289.

33 Marion Churkovich and Christine Oughtred, “Can an Online Tutorial Pass the Test for Library Instruction? An Evaluation and Comparison of Library Skills Instruction Methods for First Year Students at Deakin University,” Australian Academic & Research Libraries 33, no. 1 (2002): 25–38, https://doi.org/10.1080/00048623.2002.10755177.

34 Karen Anderson and Frances A. May, “Does the Method of Instruction Matter? An Experimental Examination of Information Literacy Instruction in the Online, Blended, and Face-to-Face Classrooms,” The Journal of Academic Librarianship 36, no. 6 (2010): 495–500, https://doi.org/10.1016/j.acalib.2010.08.005.

35 Wan Ng, “Can We Teach Digital Natives Digital Literacy?,” Computers & Education 59, no. 3 (2012): 1065–78, https://doi.org/10.1016/j.compedu.2012.04.016; Horvat et al., “Student Perception of Moodle Learning Management System,” 515–27; Abdullah, “Learning Style Classification Based on Student’s Behavior in Moodle Learning Management System,” 28; Deng and Tavares, “From Moodle to Facebook,” 167–76.

36 J. F. Hair, William C. Black, and Barry J. Babin, Multivariate Data Analysis: A Global Perspective, 7th ed. (Upper Saddle River, NJ: Pearson, 2010).

37 L. J.
Cronbach, “Test Validation,” in Educational Measurement, ed. R. L. Thorndike, 2nd ed. (Washington, DC: American Council on Education, 1971).

38 B. Tabachnick and L. Fidell, Using Multivariate Statistics, 5th ed. (New York: Allyn and Bacon, 2007).

39 Hair, Black, and Babin, Multivariate Data Analysis.

40 B. M. Byrne, Structural Equation Modeling with Amos: Basic Concepts, Applications, and Programming, 2nd ed. (New York: Taylor & Francis Group, 2010); Hair, Black, and Babin, Multivariate Data Analysis.

41 T. A. Brown, Confirmatory Factor Analysis for Applied Research (Methodology in the Social Sciences) (New York: Guilford, 2006); Byrne, Structural Equation Modeling with Amos; D. Gefen, D. Straub, and M. Boudreau, “Structural Equation Modeling and Regression: Guidelines for Research Practice,” Communications of the Association for Information Systems 4, no. 7 (2000): 1–77; Hair, Black, and Babin, Multivariate Data Analysis.

42 P. J. Curran, S. G. West, and J. F. Finch, “The Robustness of Test Statistics to Nonnormality and Specification Error in Confirmatory Factor Analysis,” Psychological Methods 1, no. 1 (1996): 16–29, https://doi.org/10.1037/1082-989X.1.1.16.

43 Byrne, Structural Equation Modeling with Amos.

44 Brown, Confirmatory Factor Analysis for Applied Research; Byrne, Structural Equation Modeling with Amos.

45 Hair, Black, and Babin, Multivariate Data Analysis.

46 Michael B. Eisenberg, Carrie A. Lowe, and Kathleen L. Spitzer, Information Literacy: Essential Skills for the Information Age (Westport, CT: Greenwood Publishing Group, 2004).

47 James Elmborg, “Critical Information Literacy: Implications for Instructional Practice,” The Journal of Academic Librarianship 32, no. 2 (2006): 192–99, https://doi.org/10.1016/j.acalib.2005.12.004.

48 Ibid.

49 Anderson and May, “Does the Method of Instruction Matter?,” 495–500; Horvat et al., “Student Perception of Moodle Learning Management System,” 515–27.

50 Horvat et al., “Student Perception of Moodle Learning Management System,” 515–27; Anderson and May, “Does the Method of Instruction Matter?,” 495–500; Raven and Rodrigues, “A Course of Our Own.”

51 Churkovich and Oughtred, “Can an Online Tutorial Pass the Test for Library Instruction?,” 25–38.

52 Anderson and May, “Does the Method of Instruction Matter?,” 495–500.

53 Peter et al., “Making Information Literacy Instruction More Efficient,” 1–16.

9808 ---- December_ITAL_fifarek_final President’s Message: For The Record Aimee Fifarek INFORMATION TECHNOLOGY AND LIBRARIES | MARCH 2017 https://doi.org/10.6017/ital.v36i1.9808

Aimee Fifarek (aimee.fifarek@phoenix.gov) is LITA President 2016-17 and Deputy Director for Customer Support, IT and Digital Initiatives at Phoenix Public Library, Phoenix, AZ.

For a long time, I’ve had an idea that when a new President of the United States is elected, sometime after he's sworn in, amid all of the briefings, a wizened old man sits down with him to have The Talk. In my imagination the messenger is some cross between the Templar Knight from Indiana Jones and the Last Crusade and the International Express man from Neil Gaiman and Terry Pratchett’s Good Omens: officious yet wise.
He tells the new President the why of it all, the real reasons why important things have happened in the ways they have, making all the decisions that once seemed so wrong now seem inevitable. And probably not for the first time the new President thinks to himself “What have I gotten myself into?”

This is clearly reflective of my desire for there to be, if not a reason for everything that happens, then at least some record of it all that can be reviewed, synthesized, and mined for meaning by future leaders. It’s the Librarian in me I suppose. Although being LITA President bears absolutely no resemblance to being President of the United States, I have been thinking about this little imagining of mine a lot lately. This is probably because, now that I am midway through my Presidential cycle (Vice President, President, Past President), I realize how much of what I’ve done has been marked by the absence of such a record. I did not receive a “How to be LITA President” manual along with my gavel, and no one gave me the LITA version of The Talk. The one person who could have done it, LITA Executive Director Jenny Levine, was as new to her position as I was to mine, so we have learned together and asked many questions of those around us with more experience.

We are in the midst of Election season, and will soon have a new President-Elect. Bohyun Kim and David Lee King are both excellent candidates (http://litablog.org/2017/01/meet-your-candidates-for-the-2017-lita-election/); those of you who have not yet voted have a difficult choice. In order to make a little progress toward developing that how-to guide I thought I’d document a few of the things I’ve learned since becoming LITA President.

Being LITA President also means being President of a Division of the American Library Association. When I was elected I expected to manage the business of the Library and Information Technology Association—Board meetings, Committee Appointments, Presidential Programs and LITA Forums. Seeing the Board complete the LITA Strategic Plan (http://www.ala.org/lita/about/strategic) was a great accomplishment at this level. While it’s possible for a Division leader to have minimal interactions with “Big ALA” during their term and still be successful, my priority for my presidential year—increasing the value LITAns receive from membership, especially those who are not able to attend in-person conferences—meant that I needed to learn more about how ALA works. After a year and a half, I have a much better understanding of the Association’s budgeting, publishing, and technology practices, and how all of these are impacted by declines in membership and decreasing revenues. Future LITA leaders are going to need to continue to be engaged at the larger organizational level if we are to be able to use LITA’s technological knowledge and expertise to support ALA’s efforts to maximize efficiency while minimizing costs.

Being LITA President means speaking not just to, but for, an incredibly diverse community. My plan when I became LITA President was to blog on a more regular basis. However, I didn’t expect some of my first communications to be about a mass shooting in Dallas (in advance of the Forum in Ft.
Worth) or working with the Board to craft a statement on inclusivity after the US presidential election. The proverbial curse “may you live in interesting times” has certainly been true this year. Having to speak to the LITA community about those issues made me acutely aware of my responsibility to adequately represent you when we’ve also been asked to weigh in on technology policy issues at the federal level such as the call for increased gun violence research and rescinding ISP regulations on privacy protection. The decision by the Board to include Advocacy and Information Policy as a primary focus for the strategic plan was certainly prescient. We are fortunate that our President-Elect, Andromeda Yelton, is both well-versed in the issues and able to speak eloquently to them.1

Being LITA President means being part of more than one team. I’m continually amazed at the hard work and dedication of Board members (http://www.ala.org/lita/about/board), Committee and Interest Group Chairs (http://www.ala.org/lita/about/committees/chairs), and anyone who fits our Member Involvement persona (http://litablog.org/2017/03/who-are-lita-members-lita-personas/). The success of LITA as an organization is entirely due to the time and passion of this team. But when you become LITA President-Elect you get a new team—the other Division Vice Presidents. This cohort travels to ALA HQ in Chicago in October after they are elected to meet each other and the incoming ALA President and learn about the structure of ALA. I have learned much from the other Presidents this year, and we have had a number of truly productive discussions about how the Divisions can collaborate and learn from each other to more effectively serve our members. LITA is directly benefitting from the expertise of the other groups, and they are in turn looking to us for both our technical skillset and the successes we’ve had over 50 years as an Association.

Consider this a new preface to the How to Be LITA President manual. I hope that my successors find it useful, and that it will serve as an inspiration for any LITAns out there who are thinking about putting their name on the ballot in future years. It has been a marvelous and educational experience. And the gavel is pretty cool, too.

REFERENCES
1. “Making ALA Great Again,” Publishers Weekly, Feb 17, 2017. http://www.publishersweekly.com/pw/by-topic/industry-news/libraries/article/72814-making-ala-great-again.html

9817 ---- June_ITA_Pekala_final Privacy and User Experience in 21st Century Library Discovery Shayna Pekala INFORMATION TECHNOLOGY AND LIBRARIES | JUNE 2017 https://doi.org/10.6017/ital.v36i2.9817

ABSTRACT
Over the last decade, libraries have taken advantage of emerging technologies to provide new discovery tools to help users find information and resources more efficiently. In the wake of this technological shift in discovery, privacy has become an increasingly prominent and complex issue for libraries. The nature of the web, over which users interact with discovery tools, has substantially diminished the library’s ability to control patron privacy. The emergence of a data economy has led to a new wave of online tracking and surveillance, in which multiple third parties collect and share user data during the discovery process, making it much more difficult, if not impossible, for libraries to protect patron privacy. In addition, users are increasingly starting their searches with web search engines, diminishing the library’s control over privacy even further.
While libraries have a legal and ethical responsibility to protect patron privacy, they are simultaneously challenged to meet evolving user needs for discovery. In a world where “search” is synonymous with Google, users increasingly expect their library discovery experience to mimic their experience using web search engines.1 However, web search engines rely on a drastically different set of privacy standards, as they strive to create tailored, personalized search results based on user data. Libraries are seemingly forced to make a choice between delivering the discovery experience users expect and protecting user privacy. This paper explores the competing interests of privacy and user experience, and proposes possible strategies to address them in the future design of library discovery tools.

Shayna Pekala (shayna.pekala@georgetown.edu) is Discovery Services Librarian, Georgetown University Library, Washington, DC.

INTRODUCTION
On March 23, 2017, the internet erupted with outrage in response to the results of a Senate vote to roll back Federal Communications Commission (FCC) rules prohibiting internet service providers (ISPs), such as Comcast, Verizon, and AT&T, from selling customer web browsing histories and other usage data without customer permission. Less than a week after the Senate vote, the House followed suit and similarly voted in favor of rolling back the FCC rules, which were set to go into effect at the end of 2017.2 The repeal became official on April 3, 2017 when the President signed it into law.3 This decision by U.S. lawmakers serves as a reminder that today’s internet economy is a data economy, where personal data flows freely on the web, ready to be compiled and sold to the highest bidder. Continuous online tracking and surveillance has become the new normal.

ISPs are just one of the many players in the online tracking game. Major web search engines, such as Google, Bing, and Yahoo, also collect information about users’ search histories, among other personal information.4 By selling this data to advertisers, data brokers, and/or government agencies, these search engine companies are able to make a profit while providing the search engines themselves for “free.” In addition to profiting off of user data, web search engines also use it to enhance the user experience of their products. Collecting and analyzing user data enables systems to learn user preferences, providing personalized search results that make it easier to navigate the ever-increasing sea of online information.

The collection and sharing of user data that occurs on the open web is deeply troubling for libraries, whose professional ethics embody the values of privacy and intellectual freedom. A user’s search history contains information about a user’s thought process, and the monitoring of these thoughts inhibits intellectual inquiry.5 Libraries, however, would be remiss to dismiss the success of web search engines and their use of data altogether.
MIT’s preliminary report on the future of libraries urges, “While the notion of ‘tracking’ any individual’s consumption patterns for research and educational materials is anathema to the core values of libraries...the opportunity to leverage emerging technologies and new methodologies for discovery should not be discounted.”6 This article examines the current landscape of library discovery and the competing interests of privacy and user experience at play, and it proposes possible strategies to address them in the future design of library discovery tools.

BACKGROUND
Library Discovery in the Digital Age
The advent of new technologies has drastically shaped the way libraries support information discovery. While users once relied on shelf-browsing and card catalogs to find library resources, libraries now provide access to a suite of online tools and interfaces that facilitate cross-collection searching and access to a wide range of materials. In an online environment, many paths to discovery are possible, with the open web playing a newfound and significant role.

Today’s library discovery tools fall into three categories: online catalogs (the patron interface of the integrated library system (ILS)), discovery layers (a patron interface with enhanced functionality that is separate from an ILS), and web-scale discovery tools (an enhanced patron interface that relies on a central index to bring together resources from the library catalog, subscription databases, and digital repositories).7 These tools are commonly integrated with a variety of external systems, including proxy servers, inter-library loan, subscription databases, individual publisher websites, and more; a sketch of one such integration appears below. For the most part, libraries purchase discovery tools from third-party vendors. While some libraries use open source discovery layers, such as Blacklight or VuFind, there are currently no open source options for web-scale discovery tools.8
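As one concrete example of the proxy-server integration mentioned above, many libraries rewrite links in their discovery interfaces so that off-campus users authenticate through a proxy before reaching licensed content. The sketch below builds an EZproxy-style prefixed URL; the proxy hostname is a hypothetical placeholder, and individual installations vary, so this is illustrative rather than a description of any particular vendor’s implementation.

```python
from urllib.parse import quote

# Hypothetical proxy host; real installations use their own hostname.
PROXY_PREFIX = "https://proxy.example.edu/login?url="

def proxied(url: str) -> str:
    """Rewrite a licensed-resource URL so it routes through the library proxy."""
    return PROXY_PREFIX + quote(url, safe=":/?&=")

print(proxied("https://www.jstor.org/stable/12345"))
# https://proxy.example.edu/login?url=https://www.jstor.org/stable/12345
```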
In her book Dragnet Nation, journalist Julia Angwin outlines a detailed taxonomy of trackers, including various types of government, commercial, and individual trackers.14 In the online information discovery process, multiple parties collect user data at different points. Consider the following scenario: a user executes a basic keyword search in Google to access an openly available online resource. In the fifteen seconds it takes the user to get to that resource, information about the user’s search is collected by the internet service provider (ISP), the web browser, the search engine, the website hosting the resource, and any third-party trackers embedded in the website. The search query, along with the user’s Internet Protocol (IP) address, becomes part of the data collector’s profile on the user. In the future, the data collector can sell the user’s profile to a data broker, where it will be merged with profiles from other data collectors to create an even more detailed portrait of the user.15 The data broker, in turn, can sell the complete dataset to the government, law enforcement, commercial businesses, and even criminals. This creates serious privacy concerns, particularly since users have no legal control over how their data is bought and sold.16

Privacy Protection in Libraries
Libraries have deeply rooted values of privacy and strong motivations to protect it. Intellectual freedom, the foundation on which libraries are built, necessarily requires privacy. In its interpretation of the Library Bill of Rights, the American Library Association (ALA) explains, “In a library (physical or virtual), the right to privacy is the right to open inquiry without having the subject of one’s interest examined or scrutinized by others.”17 Many studies support this idea, having found that people who are indiscriminately and secretly monitored censor their behavior and speech.18

Libraries have both legal and ethical obligations to protect patron privacy. While there is no federal legislation that protects privacy in libraries, forty-eight states have regulations regarding the confidentiality of library records, though the extent of these protections varies by state.19 Because these statutes were drafted before the widespread use of the internet, they are phrased in a way that addresses circulation records and neither specifically includes nor excludes internet use records (records with information on sites accessed by patrons) from these protections. Therefore, according to Theresa Chmara, libraries should not treat internet use records any differently than circulation records with respect to confidentiality.20 The library community has established many guiding documents that embody its ethical commitment to protecting patron privacy.
The ALA Code of Ethics states in its third principle, “We protect each library user's right to privacy and confidentiality with respect to information sought or received and resources consulted, borrowed, acquired or transmitted.”21 The International Federation of Library Associations and Institutions (IFLA) Code of Ethics has more specific language about data sharing, stating, “The relationship between the library and the user is one of confidentiality and librarians and other information workers will take appropriate measures to ensure that user data is not shared beyond the original transaction.”22 The library community has also established practical guidelines for dealing with privacy issues in libraries, particularly those relating to digital privacy, including the ALA Privacy Guidelines23 and the National Information Standards Organization (NISO) Consensus Principles on Users’ Digital Privacy in Library, Publisher, and Software-Provider Systems.24 Additionally, the Library Freedom Project was launched in 2015 as an educational resource to teach librarians about privacy threats, rights, and tools, and in 2017 the Library and Information Technology Association (LITA) released a set of seven privacy checklists25 to help libraries implement the ALA Privacy Guidelines.

Personalization of Online Systems
While user data can be used for tracking and surveillance, it can also be used to improve the digital user experience of online systems through personalization. Because the growth of the internet has made it increasingly difficult to navigate the sea of information online, researchers have put significant effort into designing interfaces, interaction methods, and systems that deliver adaptive and personalized experiences.26 Ansgar Koene et al. explain, “The basic concept behind personalization of on-line information services is to shield users from the risk of information overload, by pre-filtering search results based on a model of the user’s preferences… A perfect user model would…enable the service provider to perfectly predict the decision a user would make for any given choice.”27 The authors go on to describe three main flavors of personalization systems:

1. content-based systems, in which the system recommends items based on their similarity to items that the user expressed interest in;
2. collaborative-filtering systems, in which users are given recommendations for items that other users with similar tastes liked in the past; and
3. community-based systems, in which the system recommends items based on the preferences of the user’s friends.28
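To make the second flavor concrete, the following is a minimal, self-contained sketch of user-based collaborative filtering in Python. The toy ratings data, user and item names, and scoring scheme are illustrative assumptions for this article, not a description of any actual discovery product.

```python
# A minimal sketch of collaborative filtering (flavor 2 above).
# All data and names below are hypothetical.
from math import sqrt

# Hypothetical user -> {item: rating} interaction data.
ratings = {
    "user_a": {"item1": 5, "item2": 3, "item3": 4},
    "user_b": {"item1": 4, "item2": 3, "item4": 5},
    "user_c": {"item2": 2, "item3": 5, "item4": 4},
}

def cosine_similarity(u, v):
    """Cosine similarity computed over the items two users have both rated."""
    shared = set(u) & set(v)
    if not shared:
        return 0.0
    dot = sum(u[i] * v[i] for i in shared)
    norm_u = sqrt(sum(r * r for r in u.values()))
    norm_v = sqrt(sum(r * r for r in v.values()))
    return dot / (norm_u * norm_v)

def recommend(target, ratings, top_n=3):
    """Score items the target user has not seen, weighting each other
    user's ratings by that user's similarity to the target."""
    scores = {}
    for other, their_ratings in ratings.items():
        if other == target:
            continue
        sim = cosine_similarity(ratings[target], their_ratings)
        for item, rating in their_ratings.items():
            if item not in ratings[target]:
                scores[item] = scores.get(item, 0.0) + sim * rating
    return sorted(scores, key=scores.get, reverse=True)[:top_n]

print(recommend("user_a", ratings))  # -> ['item4']
```

A production-scale service would of course draw on far richer behavioral signals than this toy example, which is precisely where the tension with patron privacy arises.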
Many popular consumer services, such as Amazon.com, YouTube, Netflix, and Google, have increased (and continue to increase) the level of personalization that they offer.29 One such service in the area of academic resource discovery is Google Scholar’s Updates feature, which analyzes a user’s publication history in order to predict new publications of interest.30 Libraries, in contrast, have refrained from pressing their developers and vendors to personalize their services, favoring privacy instead, even though studies have shown that users expect library tools to mimic their experience using web search engines.31 Some web-scale discovery services do, however, allow researchers to set personalization preferences, such as their field of study, and, according to Roger Schonfeld, it is likely that many researchers would benefit tremendously from increased personalization in discovery.32 In this vein, the American Philosophical Society Library recently launched a new recommendation tool for archives and manuscripts that uses circulation data and user-supplied interests to drive recommendations.33

Opportunities for User Experience in Library Discovery
A major challenge in today’s online discovery environment is that the user is inhibited by an overwhelming number of results. This leads users to rely on relevance rankings and to fail to examine search results in depth. Creating fine-tuned relevance ranking algorithms based on user behavior is one remedy to this problem, but it relies on the use of personal user data.34 However, there may be opportunities to facilitate data-driven discovery while maintaining the user’s anonymity that would be suitable for library (and other) discovery tools. Irina Trapido proposes that relevance ranking algorithms could be designed to leverage the popularity of a resource as measured by its circulation statistics, or to rank popular or introductory materials higher than more specialized ones, to help users make sense of large results sets.35 Michael Schofield proposes “context-driven design” as an intermediary solution, whereby the user opts in to have the system infer context from neutral device or browser information, such as the time of day, business hours, weather, events, holidays, etc.36 Jason Clark describes a search prototype he built that applies these principles, but he questions whether these types of enhancements actually add value to users.37 Rachel Vacek cautions that personalization is not guaranteed to be useful or meaningful, and continuous user testing is key.38
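As a rough illustration of how such anonymity-preserving ranking might work, the following Python sketch boosts results using only aggregate circulation counts, in the spirit of Trapido’s proposal. The field names, weight, and logarithmic damping are assumptions chosen for illustration, not features of any particular discovery product.

```python
# A minimal, hypothetical sketch of anonymized, popularity-aware ranking.
from math import log

def rank_results(results, popularity_weight=0.25):
    """Re-rank search results using only aggregate, non-personal signals:
    a text-relevance score plus a dampened boost from circulation counts."""
    def score(record):
        text_relevance = record["relevance"]        # from the search engine
        circulation = record["circulation_count"]   # aggregate, not per-user
        return text_relevance + popularity_weight * log(1 + circulation)
    return sorted(results, key=score, reverse=True)

results = [
    {"title": "Specialized monograph", "relevance": 2.1, "circulation_count": 3},
    {"title": "Popular introduction", "relevance": 1.9, "circulation_count": 240},
]
for record in rank_results(results):
    print(record["title"])
# The introductory title surfaces first: 1.9 + 0.25*log(241) ≈ 3.27
# beats 2.1 + 0.25*log(4) ≈ 2.45 — without touching any user data.
```

Because the only signal beyond the query is an aggregate count, no individual patron’s behavior can be reconstructed from the ranking; this is the sense in which such designs sidestep the personal-data problem described above.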
DISCUSSION
There are several aspects to consider in the design of future library discovery tools. The integrated, complex nature of the web causes privacy to become compromised during the information discovery process. Library discovery tools have been designed not to retain borrowing records, but they have not yet evolved to mask user behavior, which is invaluable in today’s data economy. It is imperative that all types of library discovery tools have built-in functionality to protect patron privacy beyond borrowing records, while also enabling the ethical use of patron data to improve user experience.

Even if library discovery tools were to evolve so that they themselves were absolutely private (where no data were ever collected or shared), other online parties (ISPs, web browsers, advertisers, data brokers, etc.) would still have access to user data through other means, such as cookies and fingerprinting. The operating reality is such that privacy is not immediately and completely controllable by libraries. Laurie Rinehart-Thompson explains, “In the big picture, privacy is at the mercy of ethical and stewardship choices on the part of all information handlers.”39 While libraries alone cannot guarantee complete privacy for their patrons, they can and should mitigate privacy risks to the greatest extent possible.

At the same time, ignoring altogether the benefits of using patron data to improve the discovery user experience may threaten the library’s viability in the age of Google. Roger Schonfeld explains, “If systems exclude all personal data and use-related data, the resulting services will be one-dimensional and sterile. I consider it essential for libraries to deliver dynamic and personalized services to remain viable in today's environment; expectations are set by sophisticated social networks and commercial destinations.”40 Libraries must find ways to keep up with greater industry trends while adhering to professional ethics.

RECOMMENDATIONS
While libraries have traditionally shied away from collecting data about patron transactions, these conservative tendencies run counter to the library’s mission to provide an outstanding user experience and its need to evolve in a rapidly changing information industry. As the profession adopts new technologies, ethical dilemmas tied to their use present themselves. While several library organizations have issued guidance for libraries about the role of user data in these new technologies, this guidance does not go far enough. The NISO Privacy Principles, for instance, acknowledge that they are merely “a starting point.”41 Examining the substance of these guidelines is important for confronting the privacy challenges facing library discovery in the 21st century, but there are additional steps libraries can take to more fully address the competing interests of privacy and user experience in library discovery and in library technologies more generally.

Holding Third Parties Accountable
Libraries are increasingly at the mercy of third parties when it comes to the development and design of library discovery tools. Unfortunately, these third parties do not have the same ethical obligations to protect patron privacy that librarians do. In addition, the existing guidance for protecting user data in library technologies is directed towards librarians, not third-party vendors. The library community must hold third parties accountable for the ethical design of library discovery tools. One strategy for doing this would be to develop a ranking or certification process for discovery tools based on a community set of standards. The development of HIPAA-compliant records management systems in the medical field sets an example.
Because healthcare providers are required by law to guarantee the privacy of patient data,42 they must select electronic health record (EHR) systems that have been certified by a body authorized by the Office of the National Coordinator for Health Information Technology (ONC).43 In order to be certified, a system must adhere to a set of criteria adopted by the Department of Health and Human Services,44 which includes privacy and security standards.45 Another example is the Consumer Reports standard and testing program for consumer privacy and security, which is currently in development. Consumer Reports explains the reason for developing this new privacy standard: “If Consumer Reports and other public-interest organizations create a reasonable standard and let people know which products do the best job of meeting it, consumer pressure and choices can change the marketplace.”46 Libraries could potentially adapt the Consumer Reports standards and rating system for library discovery tools and other library technologies.

Engaging in UX Research & Design
Libraries should not rely on third parties alone to address privacy and user experience requirements for library discovery tools. Libraries are well poised to become more involved in the design process itself by actively engaging in user experience research and design. The opportunities for “context-driven design” and for personalization based on circulation and other anonymous data are promising for library discovery but require ample user testing to determine their usefulness. Understanding which types of personalization features offer the most value while preserving privacy is key to accelerating the design of library discovery tools. The growth of User Experience Librarian jobs and the emergence of user experience teams and departments in libraries signal an increasing amount of user experience expertise in the field, which can be leveraged to investigate these important questions for library discovery.

Illuminating the Black Box
When librarians adopt new discovery tools without fully understanding their underlying technologies and the data economy in which they operate, they do not serve users well. Librarians’ ethical obligations should require them to thoroughly understand how and when user data is captured by library discovery tools and other web technologies, and how this information is compiled and shared at a higher level. Not only do librarians need to understand the technical aspects of discovery technologies, they also need to understand the related user experience benefits, privacy concerns, and resulting ethical implications. As technology continues to evolve, librarians should be required to engage in continued learning in these areas. Such technology literacy skills could be incorporated into the curriculum of Library and Information Science degree programs, as well as into ongoing professional development opportunities.

Empowering Library Users
Because information discovery in an online environment introduces new privacy risks, communication about this topic between librarians and patrons is paramount. Librarians should proactively discuss with patrons the potential risks to their privacy when conducting research online, whether they are using the open web or library discovery tools.
It is ultimately up to the patron to weigh their needs and preferences in order to decide which tools to use, but it is the librarian’s responsibility to empower patrons to be able to make these decisions in the first place.

CONCLUSION
With the rollback of the FCC privacy rules that would have prohibited ISPs from selling customer search histories without customer permission, understanding digital privacy issues and taking action to protect patron privacy is more important than ever. While privacy and user experience are both necessary and important components of library discovery systems, their requirements are in direct conflict with each other. An absolutely private discovery experience would mean that no user data is ever collected during the search process, whereas a completely personalized discovery experience would mean that all user data is collected and utilized to inform the design and features of the system. It is essential for library discovery tools to have built-in functionality that protects patron privacy to the greatest extent possible and enables the ethical use of patron data to improve user experience. The library community must take action to address these requirements beyond establishing guidelines. Holding third-party providers to higher privacy standards is a starting point. In addition, librarians themselves need to engage in user experience research and design to discover and test the usefulness of possible intermediary solutions. Librarians must also become more educated as a profession on digital privacy issues and their ethical implications in order to educate patrons about their fundamental rights to privacy and empower them to make decisions about which discovery tools to use. Collectively, these strategies enable libraries to address user needs, uphold professional ethics, and drive the future of library discovery.

REFERENCES
1. Irina Trapido, “Library Discovery Products: Discovering User Expectations through Failure Analysis,” Information Technology and Libraries 35, no. 3 (2016): 9-23, https://doi.org/10.6017/ital.v35i3.9190.
2. Brian Fung, “The House Just Voted to Wipe Away the FCC’s Landmark Internet Privacy Protections,” The Washington Post, March 28, 2017, https://www.washingtonpost.com/news/the-switch/wp/2017/03/28/the-house-just-voted-to-wipe-out-the-fccs-landmark-internet-privacy-protections.
3. Jon Brodkin, “President Trump Delivers Final Blow to Web Browsing Privacy Rules,” Ars Technica, April 3, 2017, https://arstechnica.com/tech-policy/2017/04/trumps-signature-makes-it-official-isp-privacy-rules-are-dead/.
4. Nathan Freed Wessler, “How Private is Your Online Search History?,” ACLU Free Future (blog), https://www.aclu.org/blog/how-private-your-online-search-history.
5. Julia Angwin, Dragnet Nation (New York: Times Books, 2014), 41-42.
6. MIT Libraries, Institute-wide Task Force on the Future of Libraries (2016), 12, https://assets.pubpub.org/abhksylo/FutureLibrariesReport.pdf.
7. Trapido, “Library Discovery Products,” 10.
8. Marshall Breeding, “The Future of Library Resource Discovery,” NISO White Papers (Baltimore, MD: NISO, 2015), 4, http://www.niso.org/apps/group_public/download.php/14487/future_library_resource_discovery.pdf.
9. Christine Wolff, Alisa B. Rod, and Roger C. Schonfeld, Ithaka S+R US Faculty Survey 2015 (New York: Ithaka S+R, 2016), 11, https://doi.org/10.18665/sr.277685.
10. Deirdre Costello, “Students and Faculty Research Differently” (presentation, Computers in Libraries, Washington, D.C., March 28, 2017), http://conferences.infotoday.com/documents/221/A103_Costello.pdf.
11. Roger C. Schonfeld, Meeting Researchers Where They Start: Streamlining Access to Scholarly Resources (New York: Ithaka S+R, 2015), https://doi.org/10.18665/sr.241038.
12. Björn Bloching, Lars Luck, and Thomas Ramge, In Data We Trust: How Customer Data Is Revolutionizing Our Economy (London: Bloomsbury Publishing, 2012), 65.
13. Angwin, Dragnet Nation, 21-36.
14. Ibid., 32-33.
15. Natasha Singer, “Mapping, and Sharing, the Consumer Genome,” New York Times, June 16, 2012, http://www.nytimes.com/2012/06/17/technology/acxiom-the-quiet-giant-of-consumer-database-marketing.html.
16. Lois Beckett, “Everything We Know About What Data Brokers Know About You,” ProPublica, June 13, 2014, https://www.propublica.org/article/everything-we-know-about-what-data-brokers-know-about-you.
17. “An Interpretation of the Library Bill of Rights,” American Library Association, amended July 1, 2014, http://www.ala.org/advocacy/intfreedom/librarybill/interpretations/privacy.
18. Angwin, Dragnet Nation, 41-42.
19. Anne Klinefelter, “Privacy and Library Public Services: Or, I Know What You Read Last Summer,” Legal Reference Services Quarterly 26, no. 1-2 (2007): 258-260, https://doi.org/10.1300/J113v26n01_13.
20. Theresa Chmara, Privacy and Confidentiality Issues: A Guide for Libraries and Their Lawyers (Chicago: ALA Editions, 2009), 27-28.
21. “Code of Ethics of the American Library Association,” American Library Association, amended January 22, 2008, http://www.ala.org/advocacy/proethics/codeofethics/codeethics.
22. “IFLA Code of Ethics for Librarians and other Information Workers,” International Federation of Library Associations and Institutions, August 12, 2012, http://www.ifla.org/news/ifla-code-of-ethics-for-librarians-and-other-information-workers-full-version.
23. “Privacy & Surveillance,” American Library Association, approved 2015-2016, http://www.ala.org/advocacy/privacyconfidentiality.
24. National Information Standards Organization, NISO Consensus Principles on Users’ Digital Privacy in Library, Publisher, and Software-Provider Systems (NISO Privacy Principles), December 10, 2015, http://www.niso.org/apps/group_public/download.php/15863/NISO%20Consensus%20Principles%20on%20Users%92%20Digital%20Privacy.pdf.
25. “Library Privacy Checklists,” Library and Information Technology Association, accessed March 7, 2017, http://www.ala.org/lita/advocacy.
26. Panagiotis Germanakos and Marios Belk, “Personalization in the Digital Era,” in Human-Centred Web Adaptation and Personalization: From Theory to Practice (Switzerland: Springer International Publishing, 2016), 16.
27. Ansgar Koene et al., “Privacy Concerns Arising from Internet Service Personalization Filters,” ACM SIGCAS Computers and Society 45, no. 3 (2015): 167.
28. Ibid., 168.
29. Ibid.
30. James Connor, “Scholar Updates: Making New Connections,” Google Scholar Blog, https://scholar.googleblog.com/2012/08/scholar-updates-making-new-connections.html.
31. Schonfeld, Meeting Researchers Where They Start, 2.
32. Roger C. Schonfeld, Does Discovery Still Happen in the Library?: Roles and Strategies for a Shifting Reality (New York: Ithaka S+R, 2014), 10, https://doi.org/10.18665/sr.24914.
33. Abigail Shelton, “American Philosophical Society Announces Launch of PAL, an Innovative Recommendation Tool for Research Libraries,” American Philosophical Society, April 3, 2017, https://www.amphilsoc.org/press/pal.
34. Trapido, “Library Discovery Products,” 17.
35. Ibid.
36. Michael Schofield, “Does the Best Library Web Design Eliminate Choice?,” LibUX, September 11, 2015, http://libux.co/best-library-web-design-eliminate-choice/.
37. Jason A. Clark, “Anticipatory Design: Improving Search UX using Query Analysis and Machine Cues,” Weave: Journal of Library User Experience 1, no. 4 (2016), https://doi.org/10.3998/weave.12535642.0001.402.
38. Rachel Vacek, “Customizing Discovery at Michigan” (presentation, Electronic Resources & Libraries, Austin, TX, April 4, 2017), https://www.slideshare.net/vacekrae/customizing-discovery-at-the-university-of-michigan.
39. Laurie A. Rinehart-Thompson, Beth M. Hjort, and Bonnie S. Cassidy, “Redefining the Health Information Management Privacy and Security Role,” Perspectives in Health Information Management 6 (2009): 4.
40. Marshall Breeding, “Perspectives on Patron Privacy and Security,” Computers in Libraries 35, no. 5 (2015): 13.
41. National Information Standards Organization, NISO Consensus Principles.
42. Joel J. P. C. Rodrigues et al., “Analysis of the Security and Privacy Requirements of Cloud-Based Electronic Health Records Systems,” Journal of Medical Internet Research 15, no. 8 (2013), https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3757992/.
43. Office of the National Coordinator for Health Information Technology, Guide to Privacy and Security of Electronic Health Information, April 2015, https://www.healthit.gov/sites/default/files/pdf/privacy/privacy-and-security-guide.pdf.
44. Office of the National Coordinator for Health Information Technology, “Health IT Certification Program Overview,” January 30, 2016, https://www.healthit.gov/sites/default/files/PUBLICHealthITCertificationProgramOverview_v1.1.pdf.
45. Office of the National Coordinator for Health Information Technology, “2015 Edition Health Information Technology (Health IT) Certification Criteria, Base Electronic Health Record (EHR) Definition, and ONC Health IT Certification Program Modifications Final Rule,” October 2015, https://www.healthit.gov/sites/default/files/factsheet_draft_2015-10-06.pdf.
46. Consumer Reports, “Consumer Reports to Begin Evaluating Products, Services for Privacy and Data Security,” March 6, 2017, http://www.consumerreports.org/privacy/consumer-reports-to-begin-evaluating-products-services-for-privacy-and-data-security/.

9825 ---- Current Trends and Goals in the Development of Makerspaces at New England College and Research Libraries
Ann Marie L. Davis

Ann Marie L. Davis (davis.5257@osu.edu) is Faculty Librarian of Japanese Studies at The Ohio State University.

ABSTRACT
This study investigates why and which types of college and research libraries (CRLs) are currently developing makerspaces (or an equivalent space) for their communities. Based on an online survey and phone interviews with a sample population of CRLs in New England, I found that 26 CRLs had or were in the process of developing a makerspace in this region. In addition, several other CRLs were actively promoting and diffusing the maker ethos. Of these libraries, most were motivated to promote open access to new technologies, literacies, and STEM-related knowledge.
INTRODUCTION AND OVERVIEW
Makerspaces, alternatively known as hackerspaces, tech shops, and fab labs, are trendy new sites where people of all ages and backgrounds gather to experiment and learn. Born of a global community movement, makerspaces bring the do-it-yourself (DIY) approach to communities of tinkerers using technologies including 3D printers, robotics, metal- and woodworking, and arts and crafts.1 Building on this philosophy of shared discovery, public libraries have been creating free programs and open makerspaces since 2011.2 Given their potential for community engagement, college and research libraries (CRLs) have also been joining the movement in growing numbers.3

In recent years, makerspaces in CRLs have generated positive press in popular and academic journals. Despite the optimism, scholarly research that measures their impact is sparse. For example, current library and information science literature overlooks why and how various CRLs choose to create and maintain their respective makerspaces. Likewise, there is scant data on the institutional objectives, frameworks, and experiences that characterize current CRL makerspace initiatives.4

This study begins to fill this gap by investigating why and which types of CRLs are creating makerspaces (or an equivalent room or space) for their library communities. Specifically, it focuses on libraries at four-year colleges and research universities in New England. Throughout this study, makerspace is used interchangeably with other terms, including maker labs and innovation spaces, to reflect the variation in names and objectives that underlie the current trends. In exploring their motives and experiences, this article provides a snapshot of the current makerspace movement in CRLs.

The study finds that the number of CRLs actively involved in the makerspace movement is growing. In addition to more than two dozen that have or are in the process of developing a makerspace, another dozen CRLs have staff who support the diffusion of maker technologies, such as 3D printing and crafting tools that support active learning and discovery, in the campus library and beyond.5 Comprising research and liberal arts schools, public and private, and small and large, the CRLs involved with makerspaces are strikingly diverse. Despite these differences, this population is united by common objectives to promote new literacies, provide open access to new technologies, and foster a cooperative ethos of making.

LITERATURE REVIEW
The body of literature on library makerspaces is brief, descriptive, and often didactic. Given the newness of the maker movement in public and academic libraries, many articles focus on early success stories and on defining the movement vis-à-vis the mission of the library. For instance, Laura Britton, known for having created the first makerspace in a public library (The Fayetteville Free Library’s Fabulous Laboratory), defines a makerspace as “a place where people come together to create and collaborate, to share resources, knowledge, and stuff.”6 This definition, she determines, is strikingly similar to that of the library. Most literature on makerspaces appears in academic blogs, professional websites, and popular magazines.
Among the most frequently cited is TJ McCue’s article, which celebrates Britton’s (née Smedley) FabLab while distilling the intellectual underpinnings of the makerspace ethos.7 Phillip Torrone, editor of Make: magazine, supports Smedley’s project as an example of “rebuilding” or “retooling” our public spaces.8 Within this camp, David Lankes, professor of information studies at Syracuse University, applauds such work as activist and community-oriented librarianship.9

Many authors emphasize the philosophical “fit,” or intersection, of public makerspaces with the principles of librarianship. Building on Torrone’s work, J. L. Balas claims that creating access to resources for learning and making is in keeping with the “library’s historical role of providing access to the ‘tools of knowledge.’”10 Others emphasize the hands-on, participatory, and intergenerational features of the maker movement, which has the potential to bridge the digital divide.11 Still others identify areas of literacy, innovation, and STE(A)M skills where library makerspaces can have a broad impact.

While public libraries often focus on early childhood or adult education, CRLs adopt separate frameworks for information literacy. Like public libraries, they aim to build (meta)literacies and STE(A)M skills. Nevertheless, their programs are often tailored to curricular goals in the arts and sciences or specialized degrees in engineering, education, and business. This is especially true of CRLs situated within large, research-intensive universities. Considering their specific missions and aims, this study seeks to identify the goals and challenges that reinforce the development of makerspaces in undergraduate and research environments.

RESEARCH DESIGN AND METHOD
Data presented in this study was gathered from library directors (or their designees) through an online survey and oral telephone interviews. After choosing a sampling frame of CRLs in New England, I developed a three-path survey, sent invitations, and collected and analyzed data using the online platform SurveyMonkey. The survey was distributed following review by the institutional review board (IRB) at Southern Connecticut State University, where I completed a Master of Library Science (MLS) degree.

Survey Population
To assess generalized findings for the larger population in North America, I chose a cluster-sampling approach that limited the survey population to the CRLs in New England. In generating the sampling frame, I included four-year and advanced-degree institutions based on the assumption that libraries at these schools supported specialized, research, or field-specific degrees. I omitted for-profit and two-year institutions, based on the assumption that they are driven by separate business models. This process generated a contact list of 182 library directors at the designated CRLs in Connecticut, Maine, Massachusetts, New Hampshire, Rhode Island, and Vermont.

Survey Design
The purpose of the survey was to gather basic data about the size and structure of the respondents’ institutions and to gain insights on their views and practices regarding makerspaces (the survey is reproduced in the appendix). The first page of the survey contained a statement of consent, including my contact information and that of my IRB. After a short set of preliminary questions, the survey branched into one of three paths based on respondents’ answers about makerspaces.
The respondents were thus categorized into one of three groups: Path One (P1) for those with no makerspace and no plans to create one, Path Two (P2) for those with plans to develop a makerspace in the near future, and Path Three (P3) for those already running a makerspace in their libraries. P3 was the longest section of the survey, containing several questions about P3 experiences with makerspaces such as staffing, programming, and objectives.

Data Collection
In summer 2015, brief email invitations and two reminders were sent to the targeted population.12 To increase the participation rate, I sometimes wrote personal emails and made direct phone calls to CRLs known to have makerspaces. For cold-call interviews, I developed a script explaining the nature of the online survey. After obtaining informed consent, I proceeded to ask the questions in the online survey and manually enter the participants’ responses at the time of the interview. On a few occasions, online respondents followed up with personal emails volunteering to discuss their library’s experiences in more detail. I took advantage of these invitations, which often provided unique and welcome insights.

In analyzing the responses, I used tabulated frequencies for quantitative results and sorted qualitative data into two different categories. The first category was identified as “short and objective” and was coded and analyzed numerically. The longer, more “subjective and value-driven” data was analyzed for common trends, relationships, and patterns. Within this second category, I also identified outlier responses that suggested possible exceptions to common experiences.
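For illustration, the kind of frequency tabulation described above can be sketched in a few lines of Python. The file name, column name, and use of the pandas library are hypothetical assumptions; the study itself administered and analyzed the survey in SurveyMonkey.

```python
# A minimal, hypothetical sketch of tabulating survey frequencies,
# assuming the responses were exported to a CSV with a "path" column.
import pandas as pd

responses = pd.read_csv("survey_responses.csv")  # hypothetical export

# Count how many respondents fell into each survey path (P1/P2/P3)
counts = responses["path"].value_counts()
percentages = (counts / len(responses) * 100).round(1)

summary = pd.DataFrame({"responses": counts, "percent": percentages})
print(summary)
# e.g.:     responses  percent
#       P1         29     52.7
#       P2         17     30.9
#       P3          9     16.4
```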
RESULTS
The survey closed after one month of data collection. At this time, 55 of 182 potential respondents had participated, yielding a response rate of 30.2%. Among these participants, the survey achieved a 100.0% response rate (9 completed surveys of 9 targeted CRLs) among libraries that were currently operating makerspaces. I created a list of all known CRL makerspaces in New England based on an exhaustive website search of all CRLs in this region. Subsequent interviews with the managers of the makerspaces on this list revealed no other hidden or unknown makerspaces in the region. Of the 55 respondents, 29 (52.7%) were in P1, 17 (30.9%) were in P2, and 9 (16.4%) were in P3. (See figure 1.)

Figure 1. Survey participants’ (n = 55) current CRL efforts and plans to develop and operate a makerspace.

Among respondents in P2 and P3, the majority (13 of 23) indicated that they were from libraries that served a student population of 4,999 people or fewer, while only one library served a population of 30,000 or more (see figure 2). In terms of sheer numbers, makerspaces might seem to be gaining traction at smaller CRLs, but proportionally, one cannot say that smaller CRLs are adopting makerspaces at a higher rate, because the majority of survey participants had student populations of 19,999 or less (51, or 91.1%). Institutions with populations over 20,000 were in a clear minority (5, or 8.9%). (See figure 3.)

Figure 2. P2 and P3 CRLs with makerspaces or concrete plans to develop a makerspace.

Figure 3. The majority of CRLs (67.2%) that participated in the survey had a population of 4,999 students or less. Only 1.8% of schools that participated had a population of 30,000 students or more.

CRLs with No Makerspace (P1 = 29)
In the first part of the survey, the majority of P1 respondents demonstrated positive views toward makerspaces despite having no plans to create one in the near future. Budgetary and space limitations aside, many were relatively open to the possibility of developing a makerspace in a more distant future. In the words of one respondent, “we have several areas within the library that present a heavy demand on our budget. In [the] future, we would love to consider a makerspace, and whether it would be a sensible and appropriate investment that would benefit our students.”

When asked what their reasons were for not having a makerspace, some respondents (8, or 27.6%) said they had not given it much thought, but most (21, or 72.4%) offered specific answers. Among these, the most frequently cited reason (11, or 37.8%) was that a library makerspace would be redundant: such spaces and labs were already offered in other departments within the institution or in the broader community. At one CRL, for example, the respondent said the library did not want to compete with faculty initiatives elsewhere on campus. Other reasons included that makerspaces were expensive and not a priority. Some libraries (5, or 17.2%) preferred to allocate their funds to different types of spaces such as “a very good book arts studio/workshop” or “simulation labs.” Some (6, or 20.6%) shared concerns about a lack of space, staff, or simply “a good culture of collaboration [on campus].” Merging these sentiments, one respondent concluded, “People still need the library to be fairly quiet. . . . Having makerspace equipment in our library would be too distracting.”

While some were skeptical (sharing concerns about potential hazards or that makerspaces were simply “the flavor of the month”), the majority (roughly 60%) were open and enthusiastic. One respondent, in fact, held a leadership position in a community makerspace beyond campus. According to this librarian, 3D printers, scanners, and laser cutters were sure to become more common, and CRLs would no doubt eventually develop “a formal space for making stuff.”

CRLs with Plans for a Makerspace in the Near Future (P2 = 17)
The second section of the survey (P2) focused primarily on the motivations and means by which this cohort planned to develop a makerspace. When asked why they were creating a makerspace, the most common response was to promote learning and literacy (15 respondents, or 88.2%). In addition, a large majority (12 respondents, or 70.6%) felt that makerspaces helped to promote the library as relevant, particularly in the digital age. Three more reasons that earned top scores (10 respondents each, or 58.8%) were being inspired by the ethos of making, creating a complement to digital repositories and scholarship initiatives, and providing access to expensive machines or tools. Additional reasons included building outreach and responding to community requests.13 (See figure 4.)

Figure 4. Rationale behind P2 respondents’ decision to plan a makerspace (n = 17).

While P2 respondents indicated a clear decision to create a makerspace, their timeframes were noticeably different.
I categorized their open responses into one of six timeframes: “within six months,” “within one year,” “within two years,” “within four years,” “within six years,” and “unknown.” The result presented a clear trimodal distribution with three subgroups: six CRLs with plans to open within 18 months, five with plans to open within the next two years, and six with plans to open after three or more years (see figure 5).

In addition to their timeframe, P2 respondents were also asked about their plans for financing their future makerspaces. Based on their open responses, the following six funding sources emerged:

• the library budget, including surplus moneys or capital project funds
• internal funding, including from campus constituents
• donations and gifts
• external grants
• cost recovery plans, including small charges to users
• not sure/in progress

Figure 5. P2 respondents’ timeframe for developing the makerspace (n = 17).

With seven mentions, the most common of the above funding sources was the “library budget.” With two mentions each, the least common sources were “cost recovery” and “not sure/in progress.” Among those who mentioned external grant applications, one respondent mentioned a focus on Women and STEM opportunities, and another specifically discussed attempts at grants from the Institute of Museum and Library Services. (See figure 6.)

Figure 6. P2 respondents’ plans for gathering and financing a makerspace (n = 17).

Regarding target user groups, some respondents focused on opportunities to enhance specific disciplinary knowledge, while others emphasized a general need for creating a free and open environment. One respondent mentioned that at her state-funded library, the space would be “geared to younger [primary and secondary school] ages,” “student teachers,” and “librarians on practicum assignments.” By contrast, another respondent at a large, private, Carnegie R1 university emphasized that the space was earmarked for the undergraduate and graduate students.

In contrast to the cohort in P1, a notable number in P2 chose to create a makerspace despite the existence of maker-oriented research labs elsewhere on campus. As one respondent noted, the university was still “lacking a physical space where people could transition between technologies” and an open environment “where students doing projects for faculty” could come, especially later in the evenings. Another respondent at a similarly large, private institution explained that his colleagues recognized that most labs at their university were earmarked for specific professional schools. As a result, his colleagues came up with a strategy to provide self-service 3D printing stations at the media center, located in the library at the heart of campus.

CRLs with Operating Makerspaces (P3 = 9)
The final section of the survey (P3) focused on the motivations and means by which CRLs with makerspaces already in operation chose to develop and maintain their sites. In addition, this section gathered information on P3 CRL funding decisions, service models, and types of users in their makerspaces. Of the nine respondents in this path, all had makerspaces that had opened within the last three years. Among these, roughly a third (4) had been in operation from one to two years; another third (3) had operated for two to three years; and two had opened within the last year. (See table 1.)
Table 1. Length of time the CRL makerspace has been in operation for P3 respondents (n = 9).

Age of CRL Makerspace or Lab—P3
Answer Options        Responses    %
Less than 6 months    1            11.1
6–12 months           1            11.1
1–2 years             4            44.4
2–3 years             3            33.3
More than 3 years     0            0.0
Total Responses       9            100.0

Priorities and Rationale
The reasons behind P3 decisions to create a makerspace were slightly different from those of P2. While “promoting literacy and learning” was still a top priority, two other reasons, “promoting the maker culture of making” and “providing access to expensive machinery,” were deemed equally important (6 respondents, or 66.7%, for each). Other significant priorities included “promoting community outreach” (4 respondents, or 44.4%), along with “promoting the library as relevant” and “direct response to community requests” (3 respondents, or 33.3%, for each). (See figure 7.)

Figure 7. Rationale behind P3 respondents’ decision to develop and maintain a makerspace (n = 9).

The answer of “other” was also given top priority (5 respondents, or 55.6%). I conclude that this indicated a strong desire among respondents to express in their own words their library’s unique decisions and circumstances. (Their free responses to this question are discussed below.)

A familiar theme in the responses of the five respondents who elaborated on their choice of “other” was the desire to situate a makerspace in the central and open environment of the campus library. As one participant noted, there were “other access points and labs on campus,” but those labs were “more siloed” or cut off from the general population. By contrast, the campus library aimed to serve a broader population and anticipated a general “student need.” Later, the same respondent added that the makerspace was an opportunity to promote social justice, cultivate student clubs, and encourage engagement at the hub of the campus community. This type of ecumenical thinking was manifested in a similar remark that the library’s role was to reinforce other learning environments on campus. One respondent saw the makerspace as an additional resource “that complemented the maker opportunities that we have had in our curriculum resource center for decades.” Likewise, the library makerspace was intended to offer opportunities to a range of users on campus and beyond.

Funding, Staffing, and Service Models
When prompted to discuss how they gathered the resources for their makerspaces, the largest group (4 respondents) stated that a significant means for funding was through gifts and donations. Thus, the largest share of CRL makerspaces in New England depended primarily on contributions from friends of the library, university/college alumni, and donors. The second most common source (3 respondents) was the library budget, including surplus money at the end of the year. Grant money and cost recovery were each mentioned by two library participants, and internal and constituent support was useful for two libraries. (See figure 8.)

Figure 8. P3 methods for gathering and financing a makerspace (n = 9).

Among these, a particularly noteworthy case was a makerspace that had originated from a new student club focused on 3D printing. Originally based in a student dorm, the club was funded by a campus student union, which allocated grant money to students through a budget derived from the college tuition.
As the club quickly grew, it found significant support in the library, which subsequently provided space (on the top floor of the library), staff, and financial support from surplus funds in the library budget. As this example would suggest, the sum of the responses showed that financing the makerspaces depended on a combination of strategies. One participant summarized it best: “We’ve slowly accumulated resources over time, using different funding for different pieces. Some grant funding. Mostly annual budget.”

Regarding service models, more than half of these libraries (five) currently offer a combination of programming and open lab time where users could make appointments or just drop in. By contrast, two of the libraries offered programs only, and did not offer an open lab; another two did the opposite, offering no programming but an open makerspace at designated times. Of the latter, one is open Monday to Friday from 8 a.m. to 4 p.m., and the other is open during regular hours, with spaces that “can be booked ahead for classes or projects.” Most labs supported drop-in visitors and were open evenings and weekends. At one makerspace, where there was increasingly heavy demand, the staff required students to submit proposals with project goals. (See table 2.)

While some libraries brought in community experts, others held faculty programs, and some scheduled lab time for individual classes. One makerspace prioritized not only the campus, but also the broader community, and thus featured programs for local high schools and seniors. Responses from this library emphasized the social justice thread that inspired their work and the community culture that they aimed to foster.

Table 2. Model for services offered in the CRL makerspace or 3D printing lab.

Do you offer programs in the makerspace/lab or is it simply opened at defined times for users to use?
Answer Options                                                                        Responses    %
Yes, we offer the following types of programs.                                        2            22.2
No, we simply leave the makerspace/lab open at the specific times.                    2            22.2
We do both. We offer the programs and leave the makerspace/lab open at specific times. 5           55.6

As this data would suggest, most makerspaces were used by students (undergraduates and graduates) and faculty, in addition to local experts and generational groups. Survey responses showed that undergraduate students were the most common users (9 of 9 respondents checked this group as the most frequent type of user), and faculty and graduate students were the second and third most common (8 of 9 respondents checked these groups as most frequent) user groups in the labs. Local entrepreneurs, artists, designers, craftspeople, and campus and library staff also use the makerspaces. (See figure 9.) When prompted to identify “other” categories, one respondent specifically listed “learners, makers, sharers, studiers, [and] clubs.”

Figure 9. Of the different types of users listed above, P3 respondents ranked them in order of who used the makerspace or equivalent lab most often (n = 9).

The number and type of staff that managed and operated the makerspaces also varied widely at the nine CRLs in P3. Seven of the CRLs employed full-time, dedicated staff, among whom four participants checked off the “dedicated staff”–only options.
Of the remaining two CRLs, one reported staffing the makerspace with only one student, and one reported not having any staff working in the makerspace. I assume that the makerspace with no employees is managed by staff and students who are assigned to other, unspecified library departments or work groups. (See figure 10.)

Figure 10. The staffing situations at the P3 respondents (n = 9), where each respondent is assigned a letter from “A” to “I.”

Library programming was also diverse in terms of targeted audiences, speakers, and learning objectives. Instructional workshops varied from 3D scanning and printing to soldering, felt making, sewing, knitting, robotics, and programming (e.g., Raspberry Pi). The type of equipment contained in each lab is likely correlated to the range in programming; however, investigating these links was beyond the scope of this study. Regarding this equipment, the size and activity of the participant CRLs varied considerably. Some responses were more specific than others, and thus the resulting dataset was incomplete. (See table 3.)

Table 3. The types of tools and equipment used at P3 CRL respondents (n = 8), which are assigned letters from A to H.

Major Equipment Offered by Individual Library Makerspaces or Equivalent Labs—Path 3
CRL Label    Response Text
A    Die cut machine, 3D printer, 3D pens, raspberry pi, arduino, makey makey, art supplies, sewing supplies, pretty much anything anyone asks for we will try to get.
B    2 Makerbot replicators, 1 digital scanner, 1 Othermill
C    3D printing, 3D scanning, and Laser cutting.
D    3D printing, 3D scanning, laser cutting, vinyl cutting, large format printing, cnc machine, media production/postproduction.
E    No response
F    3 CreatorX, 1 Powerspec, 3 M3D, 2 Replicator 2, 1 Replicator2x, 1 Makergear, 1 LeapfrogXL, 1 Ultimaker, 1 Type A, 1 Deltaprinter, 1 Delta Maker, 2 Printrbot, 2 Filabots, 2 X-box kinect for scanning, 2 Oculus rifts, embedded systems cabinet with Soldering stations, solar panels and micro controllers etc, 1 formlabs SLA, 1 Muve SLA, RoVa 5, a bunch of quadcopters
G    3D printers (4 printers, 3 models), 3D scanning/digitizing equipment (3 models), Raspberry Pi, Arduino, a laser cutter and engraving system, poster printer, digital drawing tablets, GoPro, a variety of editing and design software, a number of tools (e.g. Dremel, soldering iron, wrenches, pliers, hammers, etc.), and a number of consumable or misc. items (e.g. paint, electrical tape, acetone, safety equipment, LED lights, screws and nails, etc.)
H    48 printers (all Makerbot brand): 35 Replicator 5th Gen (a moderate size printer), 5 Replicator Z18 printers (larger build size), 5 Replicator Minis, and 3 Replicator 2X; 5 Makerbot digitizers (turntable scanners 8" by 8"); 1 Cubify Sense hand scanner; 7 still cameras for photogrammetry; 21 iMac computers; 2 Mac Pros; 2 Wacom graphics tablets (thinking about complementing other resources at other labs on campus)

Challenges and Philosophies of CRL Makerspaces
The final portion of the survey invited participants to freely offer their thoughts about operating a CRL makerspace. What follows below is a summary of the two most prominent themes that emerged: the challenges of building the lab and the social philosophies that framed these initiatives.

In terms of challenges, the most common hurdle noted was the tremendous learning curve involved in establishing, maintaining, and promoting a makerspace. Setting up some of the 3D printers, for example, required knowledge about electrical networks, computer systems, and safety policies at the federal and local levels. Once the hardware was running, lab managers needed to know how the machines interfaced with different and challenging software applications. Communication skills were also critical; as one respondent reported, “Printing anything and everything takes knowledge, experience.” Communicating with stakeholders and users in accessible and proactive ways required strong teaching and customer service skills.
Another challenge that often came up was that of managing resources. As one respondent warned, CRLs should beware the “early adoption of certain technologies,” which can become “quickly outdated by a rapidly growing field.” For others, it was a challenge to recruit the right staff who could run and fix machines in constant need of repair. In addition to hiring people with manufacturing and teaching skills, a successful lab required individuals who were savvy about outreach and community needs.

Despite such challenges, many respondents were eager to discuss the aspirations and rewards of CRL makerspaces. Above all, respondents focused on the pedagogical opportunities on the one hand, and the potential for outreach and social justice on the other. One participant conceded that measuring advances in literacy and education was “intangible,” but he saw great value in “giving students the experience of seeing their ideas come to fruition.” The excitement that this created for one student manifested in a buzz, and subsequently a “fever” or groundswell, in which more users came in to tinker and learn. Meanwhile, the learning that took place among future professionals on campus was “critical,” even when results did not “go viral.”

The aspiration to create human connections within and beyond campus was another striking theme. According to one respondent, the makerspace had “enabled some incredibly fruitful collaborations with different departments on campus.” This “fantastic outcome” was becoming more and more visible as the maker community grew. Other CRL makerspaces took pride in fostering a type of learning that was explicitly collaborative, exciting, and even “fun” for users. This in turn meant that some libraries were becoming “very popular,” generating a lot of “good PR,” and becoming central in the lives of new types of library users.

Along these lines, some respondents aimed to leverage the power of the makerspace to achieve social justice goals that resonated with core values of librarianship. According to one enthusiastic participant, the ethos of sharing was alive and strong among the staff and the many students who saw their participation in the lab as a lifestyle and culture of collaborating. In another initiative, the respondent looked forward to eventually offering grants to those users who proposed meaningful ways to use the makerspace to create practical value for the community. From this perspective, there was added value in having the 3D printing lab situated specifically on a college or university campus.
According to this respondent, the unique quality of the CRL makerspace was that by virtue of its location amid numerous and energetic young people, it was ripe for exploitation by those “who had great ideas and time and energy to do good.”

DISCUSSION

The aim of this study was to explore why and which types of CRLs had developed makerspaces (or an equivalent space) for their communities. Of the 56 respondents, roughly half (46%) were P2 and P3 libraries that were currently developing or operating a makerspace, respectively. Data from this survey indicated that none of the P2 or P3 CRLs fit a mold or pattern in terms of their size, educational models, or classifications. Upon analyzing the data, I found that the differentiators between the three groups were less clearly defined than originally anticipated. In one example of blurred lines, at least two respondents in P1 indicated that they were more actively engaged with makerspaces than two respondents in P2. Despite not having physical labs within their libraries, these P1 respondents were in the process of actively supporting or making plans for a makerspace within their CRL community. One P1 respondent, for example, served on the planning board for a local community makerspace and had therefore “thoroughly investigated and used” the makerspace at a neighboring university. Based on his knowledge, he decided to develop a complementary initiative (e.g., a book arts workshop) at his university library. Although his library did not yet have a formal makerspace, he felt confident that the diffusion of 3D printers would come to his library in the near future. Another P1 respondent was responsible for administering faculty teaching and innovation grants. Among the recent grant recipients were two faculty collaborators who used the library’s funds to build a makerspace at a campus location that was separate from the library. Although the makerspace was not directly developed by the respondent’s library, it was nevertheless a direct product of his library’s programmatic support. The respondent reported that for this reason, his library did not want to compete with its own faculty initiatives. In another example of blurred distinctions, one librarian in P2 was as deeply immersed in providing access and education on makerspaces as his colleagues in P3. Although he was not clear on when or how his library would finance a future makerspace, his library already offered many of the same services and workshops as P3 libraries. As a “Maker in the Library,” he offered non-credit-bearing 3D printing seminars to students and offered trial 3D printing services in the library for graduates of the 3D printing seminar. In addition, he made appearances at relevant campus events. When the university museum ran a 3D printing day, for instance, he participated as an expert panelist and gave public demonstrations on library-owned 3D printers and a Kinect scanner bar. In sum, despite the respondents’ categorization in P1 and P2, they sometimes shared more in common with the cohorts in P2 and P3, respectively. Given their library’s programmatic involvement in creating and endorsing the maker movement, these respondents were more than just “interested” or “open to” the prospect of creating a makerspace. While only 16% of CRLs (P3 = 9) reported actively operating a makerspace, another 30% (P2 = 17) were involved in developing a makerspace in the near future.
Moreover, the number of CRLs formally involved with the diffusion of maker technologies was not limited to just these two groups. Although some makerspaces were not directly run by the library, they had come to fruition because of library-based funding, grants, and professional support. And although some libraries did not have immediate plans for a makerspace, they were already promoting maker technologies and the maker ethos in other significant ways.

CONCLUSION

This study is one of the first comprehensive and comparative studies on CRL makerspace programs and their respective goals, policies, and outcomes. While the number of current CRL makerspaces is relatively low, the data suggests that the population is increasing; a growing number of CRLs are involved in the makerspace movement. More than two dozen CRLs were planning to develop makerspaces in the near future, helping to diffuse maker technologies through CRL programming, and/or supporting nonlibrary maker initiatives on campus and beyond. In addition, some CRLs were buying equipment, hiring dedicated staff, offering relevant workshops and demonstrations, and supporting community efforts to build labs beyond the library. Although the author aimed to find structural commonalities between CRLs in groups P2 and P3, none were found. Respondents in these groups came from institutions of all sizes, a wide variety of endowment levels, and both public and private funding models, and they ranged in emphasis from the liberal arts to professional certifications and graduate-level research. Although a majority of CRL respondents were not currently making plans to create a makerspace, many respondents were enthusiastic about current trends, and some even promoted the maker movement in unexpected ways. Acknowledging the steady diffusion of 3D printers, many anticipated using such technologies in the future to promote traditional library values and goals. Respondents in P2 and P3 indicated that their primary rationale for developing a makerspace was to promote learning and literacy. Other prominent reasons included promoting library outreach and the maker culture of learning. Data from CRLs with makerspaces indicated that these benefits were often symbiotic and correlated to strong ideas about universal access to emergent tools and practices in learning. Unexpected challenges for developing and operating makerspaces included staffing them with highly skilled, knowledgeable, and service-oriented employees. Learning the necessary skills—including operating the printers, troubleshooting models, and maintaining a safe environment, to name a few—was time-consuming and labor-intensive. The majority of funding for CRLs with or planning maker labs came from internal budgets, gifts and donors, and some grants. While some P1 CRLs indicated that their reason for not developing makerspaces was a lack of community interest, P2 and P3 CRLs were not necessarily motivated by user requests or needs, nor was lack of explicit need or interest a deterrent. On the contrary, a few reported a desire to promote the campus library as ahead of the curve by keeping in front of student and community needs. In a similar contradiction, some P1 respondents reported that their libraries did not want to compete with other labs on campus. Respondents from P2 and P3, however, wanted to offer an alternative to the more siloed or structured model of department- or lab-funded makerspaces.
Although makerspaces were sometimes forming in other parts of campus, some P2 and P3 CRLs felt there was a gap in accessibility and therefore aimed to offer more open and flexible spaces. A final salient theme among P2 and P3 respondents was their commitment to equity of access and issues of social justice. Above all, they saw a unique fit for makerspaces in their CRL philosophies to serve the greater good. Among other advantages, CRLs were in a unique position to leverage the power of the makerspaces to take advantage of campus communities of “cognitive surplus” and millennial aspirations to share and create spontaneous communities of knowledge. Given the amount of resources that are required to create and maintain a makerspace, this research will be useful for CRLs considering such a space in the future. The present data suggests that no one type of library currently has a monopoly on makerspaces; regardless of size or funding levels, the common thread among P2 and P3 CRLs was simply a commitment to providing access to emergent technologies and supporting new literacies. While annual budgets and grant applications were critical for some libraries, the majority of CRLs funded the bulk of their makerspaces through gifts and donations. Future studies on the characteristics and challenges of P2 and P3 populations beyond those in New England will certainly amplify our understanding of these trends.

APPENDIX: SURVEY QUESTIONS

Informed Consent

CURRENT TRENDS IN THE DEVELOPMENT OF MAKERSPACES AND 3D PRINTING LABS AT NEW ENGLAND COLLEGE AND RESEARCH LIBRARIES

Consent for the Participation in a Research Study
Southern Connecticut State University

Purpose
You are invited to participate in a research project conducted by Ann Marie L. Davis, a master’s student in library and information studies at Southern Connecticut State University. The purpose of this project is to investigate the experiences and goals of college and research libraries (CRLs) that currently have or are making plans to have an open makerspace (or an equivalent room or space). The results from this study will be included in a special project report for the MLS degree and will form the basis of an article to be submitted for peer review.

Procedures
If you decide to participate, you will volunteer to take a fifteen-minute online survey.

Risks and Inconveniences
There are no known risks associated with this research; other than taking a short amount of time, the survey should not burden you or infringe on your privacy in any way.

Potential Benefits and Incentive
By participating in this research, you will be contributing to our understanding of current trends and practices with regard to community learning labs in CRLs. In addition, you will be providing useful knowledge that can support other libraries in making more informed decisions as they potentially develop their own makerspaces in the future.

Voluntary Participation
Your participation in this research study is voluntary. You may choose not to participate and you may withdraw your consent to participate at any time. You will not be penalized in any way should you decide not to participate or withdraw from this study.

Protection of Confidentiality
The survey is anonymous and does not ask for sensitive or confidential information.

Contact Information
Before you consent, please ask any questions on any aspect of this study that is unclear to you.
You may contact me at my student email address at any time: xxx@owls.southernct.edu. If you have questions regarding your rights as a research participant, you may contact the Southern Connecticut State Institutional Review Board at (203) xxx-xxxx.

Consent
By proceeding to the next page, you confirm that you understand the purpose of this research, the nature of this survey, and the possible burdens, risks, and benefits that you may experience. By proceeding, you indicate that you have read this consent form, understand it, and give your consent to participate and allow your responses to be used in this research.

ACRL Survey on Makerspaces and 3D Printers

Q1. What is the size of your college or university?
• 4,999 students or less
• 5,000–9,999 students
• 10,000–19,999 students
• 20,000–29,999 students
• 30,000 students or more

Q2. How would you categorize your institution? (Please check all that apply)
• Private
• Public
• Doctorate-Granting University (awards 20 or more doctorates)
• Master’s College or University (awards 50 or more master’s degrees, but fewer than 20 doctorates)
• Liberal Arts and Sciences College
• Other

Q3. Do any of the libraries at your institution have a makerspace or equivalent hands-on learning lab (including a 3D printing station or lab)?
• Yes [if “Yes,” respondents are directed to question 14]
• No [if “No,” respondents are directed to question 4]

Q4. Do any of the libraries at your institution have plans to develop a makerspace or equivalent learning lab in the near future?
• Yes [if “Yes,” respondents are directed to question 8]
• No [if “No,” respondents are directed to question 5]

PATH ONE [CRLs with no makerspace, no plans for a makerspace]

Q5. Are there specific reasons why your institution has decided not to pursue developing a makerspace or equivalent lab in the near future?
• No reasons. We have not given much thought to makerspaces for our library.
• Yes

Q6. Thank you for your participation. Would you like a copy of the results when the report is completed? If yes, please enter your email address in the space provided.
• No
• Yes (please enter your email address below)

Q7. You have almost concluded this survey. Before signing off, please feel free to share your thoughts and comments regarding the makerspace movement in college and research libraries. If no comments, please click “Next” to end the survey.

PATH TWO [CRLs with plans to build a makerspace]

Q8. What are the main goals that motivated your library’s decision to develop a makerspace or equivalent lab? (Please check all that apply)
• promote community outreach
• promote learning and literacy
• promote the library as relevant
• promote the maker culture of making
• provide access to expensive machines or tools
• complement digital repository or digital scholarship projects
• as a direct response to community requests or needs
• other

Q9. Of these goals, please rank them in order of their level of priority for your library. (Choose “N/A” for goals that you did not select in the previous question)
• promote community outreach
• promote learning and literacy
• promote the library as relevant
• promote the maker culture of making
• provide access to expensive machines or tools
• complement digital repository or digital scholarship projects
• as a direct response to community requests or needs
• other
Q10. What is your library’s time frame for developing a makerspace or equivalent lab?

Q11. What are your library’s current plans for gathering and/or financing the resources needed for developing and maintaining the makerspace or equivalent lab?

Q12. Thank you for your participation. Would you like a copy of the results when the report is completed?
• No
• Yes (please enter your email address below)

Q13. You have almost concluded this survey. Before signing off, please feel free to share your thoughts and comments regarding the makerspace movement in college and research libraries. If no comments, please click “Next” to end the survey.

PATH THREE [CRLs with a makerspace]

Q14. How long have you had your makerspace or equivalent learning lab?
• less than 6 months
• 6–12 months
• 1–2 years
• 2–3 years
• more than 3 years

Q15. What were the main goals that motivated your library’s decision to develop a makerspace or equivalent lab? (Please check all that apply)
• promote community outreach
• promote learning and literacy
• promote the library as relevant
• promote the maker culture of making
• provide access to expensive machines or tools
• complement digital repository or digital scholarship projects
• as a direct response to community requests or needs
• other

Q16. Of these goals, please rank them in order of their level of priority for your library. (Choose “N/A” for goals that you did not select in the previous question)
• promote community outreach
• promote learning and literacy
• promote the library as relevant
• promote the maker culture of making
• provide access to expensive machines or tools
• complement digital repository or digital scholarship projects
• as a direct response to community requests or needs
• other

Q17. How did your library gather and/or finance the resources needed for developing and maintaining the makerspace or equivalent learning lab?

Q18. Do you offer programs in the makerspace/lab, or is it simply open at defined times for users to use?
• Yes, we offer the following types of programs:
• No, we simply leave the makerspace/lab open at the following times (please note times and/or if a reservation is required):
• We do both. We offer the following types of programs and leave the makerspace/lab open at the following times (please note types of programs, times open, and if a reservation is required):

Q19. What type of community members tend to use your library’s makerspace or equivalent lab most? (Please check all that apply)
• undergraduate researchers
• graduate researchers
• faculty
• staff
• general public
• local artists, designers, or craftspeople
• local entrepreneurs
• other

Q20. Of the cohorts chosen above, please rank them in order of who uses the makerspace or equivalent lab most often. (Use “N/A” for cohorts that are not relevant to your space or lab)
• undergraduate researchers
• graduate researchers
• faculty
• staff
• general public
• local artists, designers, or craftspeople
• local entrepreneurs
• other

Q21. How many dedicated staff does your library currently employ for the makerspace or equivalent?
• 0
• 1
• 2
• 3
• other

Q22. Where is your makerspace or equivalent lab located?

Q23. What is the title or name of your makerspace or equivalent lab, and, if known, what were the reasons behind this particular name?
Q24. What major equipment and services does your library makerspace or equivalent lab provide?

Q25. What unexpected considerations, challenges, or failures has your library faced in developing and maintaining the makerspace or equivalent lab?

Q26. How would you assess the benefits or “return on investment” of having a makerspace or equivalent lab?

Q27. Thank you for your participation. Would you like a copy of the final results when the report is completed? If yes, please enter your email address in the space provided.
• No
• Yes (please enter your email address below)

Q28. You have almost concluded this survey. Before signing off, please feel free to share your thoughts and comments regarding the makerspace movement in college and research libraries. If no comments, please click “Next” to end the survey.
9878 ---- Digitization of Text Documents Using PDF/A

Yan Han and Xueheng Wan

Yan Han (yhan@email.arizona.edu) is Full Librarian, the University of Arizona Libraries, and Xueheng Wan (wanxueheng@email.arizona.edu) is a student, Department of Computer Science, University of Arizona.

ABSTRACT

The purpose of this article is to demonstrate a practical use case of PDF/A for digitization of text documents following FADGI’s recommendation of using PDF/A as a preferred digitization file format.
The authors demonstrate how to convert and combine TIFFs with associated metadata into a single PDF/A-2b file for a document. Using real-life examples and open source software, the authors show readers how to convert TIFF images, extract associated metadata and International Color Consortium (ICC) profiles, and validate against the newly released PDF/A validator. The generated PDF/A file is a self-contained and self-described container that accommodates all the data from digitization of textual materials, including page-level metadata and ICC profiles. Providing theoretical analysis and empirical examples, the authors show that PDF/A has many advantages over the traditionally preferred file format, TIFF/JPEG2000, for digitization of text documents.

BACKGROUND

PDF has been primarily used as a file delivery format across many platforms in almost every device since its initial release in 1993. PDF/A was designed to address concerns about long-term preservation of PDF files, but there has been little research and few implementations of this file format. Since the first standard (ISO 19005, PDF/A-1) was published in 2005, several articles have discussed the PDF/A family of standards, relevant information, and how to implement PDF/A for born-digital documents.1 There is growing interest in the PDF and PDF/A standards after both the US Library of Congress and the National Archives and Records Administration (NARA) joined the PDF Association in 2017. NARA joined the PDF Association because PDF files are used as electronic documents in every government and business agency. As explained in a blog post, the Library of Congress joined the PDF Association because of the benefits to libraries, including participating in developing PDF standards, promoting best-practice use of PDF, and access to the global expertise in PDF technology.2 Few articles, if any, have been published about using this file format for preservation of digitized content. Yan Han published a related article in 2015 about theoretical research on using PDF/A for text documents.3 In this article, Han discussed the shortcomings of the widely used TIFF and JPEG2000 as master preservation file formats and proposed using the then-emerging PDF/A as the preferred file format for digitization of text documents. Han further analyzed the requirements of digitization of text documents and discussed the advantages of PDF/A over TIFF and JPEG2000. These benefits include platform independence, smaller file size, better compression algorithms, and metadata encoding. In addition, the file format reduces workload and simplifies post-digitization processing such as quality control, adding and updating missing pages, and creating new metadata and OCR data for discovery and digital preservation. As a result, PDF/A can be used in every phase of a digital object in an Open Archival Information System (OAIS)—for example, as a Submission Information Package (SIP), Archive Information Package (AIP), and Dissemination Information Package (DIP). In summary, a PDF/A file can be a structured, self-contained, and self-described container allowing a simpler one-to-one relationship between an original physical document and its digital surrogate.
In September 2016, the Federal Agencies Digital Guidelines Initiative (FADGI) released its latest guidelines for digitization related to raster images, Technical Guidelines for Digitizing Cultural Heritage Materials.4 These guidelines are the de facto best practices for digitization; they provide guidance to federal agencies and have been used in many cultural heritage institutions. Both the PDF Association and the authors welcomed the recognition of PDF/A as the preferred master file format for digitization of text documents such as unbound documents, bound volumes, and newspapers.5

GOALS AND TASKS

Since Han has previously provided theoretical methods of coding raster images, metadata, and related information in PDF/A, the goals of this article are threefold:

1. present real-life experience of converting TIFFs/JPEG2000s to PDF/A and back, along with image metadata
2. test open source libraries to create and manipulate images, image metadata, and PDF/A
3. validate generated PDF/As with the first legitimate PDF/A validator

The tasks included the following:

● Convert all the master files in TIFF/JPEG2000 from digitization of text documents into single PDF/A files losslessly: one document, one PDF/A file.
● Evaluate and extract metadata from each TIFF/JPEG2000 image and encode it along with its image when creating the corresponding PDF/A file.
● Demonstrate the runtimes of the above tasks for feasibility evaluation.
● Validate the PDF/A files against the newly released open source PDF/A validator veraPDF.
● Extract each digital image from the PDF/A file back to its original master image files along with associated metadata.
● Verify the extracted image files in the back-and-forth conversion process against the original master image files.

Choices of PDF/A Standards and Conformance Level

This article demonstrates using PDF/A-2b as a self-contained, self-describing file format. Currently, there are three related PDF/A standards (PDF/A-1, PDF/A-2, and PDF/A-3), each with three conformance levels (a, b, and u). The reasons for choosing PDF/A-2 (instead of PDF/A-1 or PDF/A-3) are the following:

● PDF/A-1 is based on PDF 1.4. In this standard, images coded in PDF/A-1 cannot use JPEG2000 compression (named in PDF/A as JPXDecode). One can still convert TIFFs to PDF/A-1 using other lossless compression methods such as LZW. However, the space-saving benefits of JPEG2000 compression over other methods would not be utilized.
● PDF/A-2 and PDF/A-3 are based on PDF 1.7. One significant feature of PDF 1.7 is that it supports JPEG2000 compression, which saves 40–60 percent of space for raster images compared to uncompressed TIFFs.
● PDF/A-3 has one major feature that PDF/A-2 does not have, which is to allow arbitrary files to be embedded within the PDF file. In this case, there is no file to be embedded.

The authors chose conformance level b for simplicity.

● b is basic conformance, which requires only the components necessary (e.g., all fonts embedded in the PDF) for reproduction of a document’s visual appearance.
● a is accessible conformance, which means conformance level b plus additional accessibility (structural and semantic features such as document structure). One can add tags to convert PDF/A-2b to PDF/A-2a.
● u represents a conformance level with the additional requirement that all text in the document have Unicode equivalents.
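As a minimal sketch of where this choice surfaces in code (assuming iText 5.x, the library adopted below, and using a placeholder file name; this is not the authors' production code), the conformance level is fixed when the writer is created, and a conforming file must also carry XMP metadata and a declared ICC output intent:

// Sketch: creating an (almost empty) PDF/A-2b file with iText 5.x.
// PDF_A_2A or PDF_A_2U could be substituted for PDF_A_2B.
import java.io.FileOutputStream;

import com.itextpdf.text.Document;
import com.itextpdf.text.pdf.ICC_Profile;
import com.itextpdf.text.pdf.PdfAConformanceLevel;
import com.itextpdf.text.pdf.PdfAWriter;

public class PdfA2bSkeleton {
    public static void main(String[] args) throws Exception {
        Document doc = new Document();
        PdfAWriter writer = PdfAWriter.getInstance(doc,
                new FileOutputStream("skeleton-2b.pdf"),  // placeholder name
                PdfAConformanceLevel.PDF_A_2B);
        writer.createXmpMetadata();  // XMP metadata is mandatory in PDF/A
        doc.open();
        // PDF/A also requires a declared output intent; reuse the JVM's
        // built-in sRGB profile rather than shipping an .icc file.
        ICC_Profile icc = ICC_Profile.getInstance(
                java.awt.color.ICC_Profile.getInstance(
                        java.awt.color.ColorSpace.CS_sRGB).getData());
        writer.setOutputIntents("Custom", "", "http://www.color.org",
                "sRGB IEC61966-2.1", icc);
        // Image pages would be added here (see the pseudocode later in this
        // article); setPageEmpty(false) keeps this skeleton's blank page.
        writer.setPageEmpty(false);
        doc.close();
    }
}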
This article does not cover any post-processing of additional manual or computational features, such as adding OCR text to the generated PDF/A files. These features do not help faithfully capture the look and feel of original pages in digitization, and they can be added or updated later without any loss of information. In addition, OCR results rely on the availability of OCR engines for the document’s language, and results can vary between different OCR engines over time. OCR technology is getting better and will produce better results in the future. For example, current OCR technology for English gives very reliable (more than 90 percent) accuracy. In comparison, traditional Chinese manuscripts and Pashto/Persian give unacceptably low accuracy (less than 60 percent). Cutting-edge OCR engines have started to utilize artificial intelligence networks, and the authors believe that a breakthrough will happen soon.

Data Source

The University of Arizona Libraries (UAL) and the Afghanistan Center at Kabul University (ACKU) have been partnering to digitize and preserve ACKU’s permanent collection held in Kabul. This collaborative project created the largest Afghan digital repository in the world. Currently the Afghan digital repository (http://www.afghandata.org) contains more than fifteen thousand titles and 1.6 million pages of documents. Digitization of these text documents follows the previous version of the FADGI guideline, which recommended scanning each page of a text document into a separate TIFF file as the master file. These TIFFs were organized by directories in a file system, where each directory represents a corresponding document containing all the scanned pages of that title. An example of the directory structure can be found in Han’s article.

PDF/A and Image Manipulation Tools

There are a few open source and proprietary PDF software development kits (SDKs). Adobe PDF Library and Foxit SDK are the most well-known commercial tools for manipulating PDFs. To show readers that they can manipulate and generate PDF/A documents themselves, open source software, rather than commercial tools, was used. Currently, only a very limited number of open source PDF SDKs are available, including iText and PDFBox. iText was chosen because it has good documentation and provides a well-built set of APIs supporting almost all the PDF and PDF/A features. iText was initially written in 1998 as an in-house project by Bruno Lowagie (who was in the ISO PDF standard working group); Lowagie later started his own company, iText, and published iText in Action with many code examples.6 Moreover, iText has Java and C# coding options with good code documentation. It is worth mentioning that iText has different versions. The authors used iText 5.5.10 and 5.4.4. Using an older version in our implementation generated a noncompliant PDF/A file because it was not aligned with the PDF/A standard.7 For image processing, there were a few popular open source options, including ImageMagick and GIMP. ImageMagick was chosen because of its popularity, stability, and cross-platform implementation. Our implementation identified one issue with ImageMagick: the current version (7.0.4) could not retrieve all the metadata from TIFF files, as it did not extract certain information such as the Image File Directory and color profile.
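One way to see such gaps is to compare, for the same file, what ImageMagick itself reports against what a dedicated metadata tool reports. The following is a minimal sketch (not from the article's codebase) assuming ImageMagick's identify utility and ExifTool are installed and on the PATH; the file name is a placeholder.

// Minimal sketch: print what ImageMagick and ExifTool each report for one TIFF.
// Assumes both command-line tools are installed; "page001.tif" is a placeholder.
import java.io.IOException;

public class MetadataProbe {

    static void run(String... cmd) throws IOException, InterruptedException {
        // inheritIO() streams the tool's stdout/stderr to this console
        new ProcessBuilder(cmd).inheritIO().start().waitFor();
    }

    public static void main(String[] args) throws Exception {
        run("identify", "-verbose", "page001.tif"); // ImageMagick's view of the file
        run("exiftool", "-X", "page001.tif");       // full metadata as RDF/XML
    }
}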
These metadata are critical because they are part of the original data from digitization. Unfortunately, the authors observed that some image editors were unable to preserve all the metadata from the image files during the conversion process. Hart and de Vries used case studies to show the vulnerability of metadata, demonstrating that metadata elements in a digital object can be lost or corrupted by use or by conversion of a file to another format. They suggested that action is needed to ensure proper metadata creation and preservation, so that all types of metadata are captured and preserved to achieve the most authentic, consistent, and complete digital preservation for future use.8

Metadata Extraction Tools and Color Profiles

As we digitize physical documents and manipulate images, color management is important. The goal of color management is to obtain a controlled conversion between the color representations of various devices such as image scanners, digital cameras, and monitors. A color profile is a set of data that controls the input and output of a color space. The International Color Consortium (ICC) standards and profiles were created to bring various manufacturers together, because embedding color profiles into images is one of the most important color management solutions. Image formats such as TIFF and JPEG2000 and document formats such as PDF may contain embedded color profiles. The authors identified a few open source tools to extract TIFF metadata, including ExifTool, Exiv2, and tiffinfo. ExifTool is an open source tool for reading, writing, and manipulating metadata of media files. Exiv2 is another free metadata tool supporting different image formats. The tiffinfo program is widely used on the Linux platform, but it has not been updated for at least ten years. Our implementations showed that ExifTool was the tool that most easily extracted the full ICC profiles and other metadata from TIFF and JPEG2000 files. ImageMagick and other image processing software were examined in van der Knijff’s article discussing JPEG2000 for long-term preservation.9 He found that ICC profiles were lost in ImageMagick. Our implementation showed that the current version of ImageMagick has fixed this issue. A metadata sample can be found in appendix A.

IMPLEMENTATION

Converting and Ordering TIFFs into a Single PDF/A-2 File

When ordering and combining all the individual TIFFs of a document into a single PDF/A-2b file, the authors intended to preserve all information from the TIFFs, including the raster image data streams and the metadata stored in each TIFF’s header. The raster image data streams are the main images reflecting the original look and feel of the pages, while the metadata (including technical and administrative metadata such as BitsPerSample, DateTime, and Make/Model/Software) tells us important digitization and provenance information. Both are critical for delivery and digital preservation. The TIFF images were first converted to JPEG2000 with lossless compression using the open source ImageMagick software. Our tests of ImageMagick demonstrated that it can handle different color profiles and will convert images correctly if the original TIFF comes with a color profile. This gave us confidence that past concerns about JPEG2000 and ImageMagick had been resolved.
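Concretely, the per-page step can be scripted as in the following sketch (not the authors' code). It assumes ExifTool and ImageMagick are on the PATH, uses placeholder file names, and assumes that, with a JPEG 2000 delegate such as OpenJPEG, omitting any quality or rate option yields a lossless JP2 (an assumption that should be verified for a given ImageMagick build). The pseudocode below abstracts these two steps.

// Sketch of the per-page derivation step, not the authors' production code.
// Assumptions: ExifTool and ImageMagick are installed; on ImageMagick 7 the
// binary is "magick" rather than "convert". File names are placeholders.
import java.io.File;
import java.io.IOException;

public class PageDerivatives {

    public static void derive(String tiffPath) throws IOException, InterruptedException {
        String base = tiffPath.replaceFirst("\\.tiff?$", "");

        // ExifTool's -X option prints the full metadata, including the ICC
        // header fields, as RDF/XML on stdout; capture it in <base>.xml.
        ProcessBuilder exif = new ProcessBuilder("exiftool", "-X", tiffPath);
        exif.redirectOutput(new File(base + ".xml"));
        if (exif.start().waitFor() != 0) throw new IOException("exiftool failed");

        // Derive the JPEG2000 master from the TIFF (lossless settings assumed;
        // verify for your build and delegate).
        ProcessBuilder im = new ProcessBuilder("convert", tiffPath, base + ".jp2");
        if (im.inheritIO().start().waitFor() != 0) throw new IOException("convert failed");
    }

    public static void main(String[] args) throws Exception {
        derive("page001.tif"); // placeholder page file
    }
}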
An alternative is to directly code the TIFF’s image data stream into a PDF/A file, but this approach would miss one benefit of PDF/A-2: tremendous file-size reduction with JPEG2000. The following is the pseudocode for ordering and combining all the TIFFs in a text document into a single PDF/A-2 file.

CreatePDFA2(queue TiffList) {
    Create an empty queue XMLQ;
    Create an empty queue JP2Q;
    /* TiffList is a pre-sorted queue based on the original page order */
    /* Convert each TIFF to JPEG2000 losslessly, then add each JPEG2000 and its metadata into a queue */
    while (TiffList is NOT empty) {
        String TiffFilePath = TiffList.dequeue();
        String xmlFilePath = <TIFF metadata extracted using ExifTool>;
        XMLQ.enqueue(xmlFilePath);
        String jp2FilePath = <JPEG2000 file converted from the TIFF by ImageMagick>;
        JP2Q.enqueue(jp2FilePath);
    }
    /* Convert each image’s metadata to XMP, then add each JPEG2000 and its metadata
       into the PDF/A-2 file based on its original order */
    Document pdf2b = new Document();
    /* create PDF/A-2b conformance level */
    PdfAWriter writer = PdfAWriter.getInstance(pdf2b,
            new FileOutputStream(PdfAFilePath), PdfAConformanceLevel.PDF_A_2B);
    writer.createXmpMetadata(); // create root XMP
    pdf2b.open();
    while (JP2Q is NOT empty) {
        Image jp2 = Image.getInstance(JP2Q.dequeue());
        Rectangle size = new Rectangle(jp2.getWidth(), jp2.getHeight()); // PDF page size setting
        pdf2b.setPageSize(size);
        pdf2b.newPage(); // create a new page for a new image
        byte[] bytearr = XmpManipulation(XMLQ.dequeue()); // convert original metadata based on the XMP standard
        writer.setPageXmpMetadata(bytearr);
        pdf2b.add(jp2);
    }
    pdf2b.close();
}

Converting PDF/A-2 Files back to TIFFs and JPEG2000s

To ensure that we can extract raster images from the newly created PDF/A-2 file, the authors also wrote code to convert a PDF/A-2 file back to the original TIFF or JPEG2000 format. This implementation was a reverse of the above operation. Once the reverse conversion process was completed, the authors verified that the image files created from the PDF/A-2 file were the same as before the conversion to PDF/A-2. Note that we generated MD5 checksums to verify the image data streams. The image data streams are the same, but metadata locations can vary because of the inconsistent TIFF tags used over the years. When converting one TIFF to another TIFF, ImageMagick has its own implementation of metadata tags. The code can be found in appendix B.

PDF/A Validation

PDF/A is one of the most recognized digital preservation formats, specially designed for long-term preservation and access. However, no commonly accepted PDF/A validator was available in the past, although several commercial and open source PDF preflight and validation engines (e.g., Acrobat) were available. Validating a PDF/A file against the PDF/A standards is a challenging task for a few reasons, including the complexity of the PDF and PDF/A formats. The PDF Association and the Open Preservation Foundation recognized the need and started a project to develop an open source PDF/A validator and build a maintenance community. Their result, veraPDF, is an open source validator designed for all PDF/A parts and conformance levels. Released in January 2017, veraPDF aims to become the commonly accepted PDF/A validator.10 Our generated PDF/As have been validated with veraPDF 1.4 and Adobe Acrobat Pro DC Preflight. Both products validated the PDF/A-2b files as fully compatible.
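That check is easy to script; the following is a sketch (not the authors' code) assuming the veraPDF command-line launcher is installed on the PATH. Its --flavour option selects the PDF/A part and conformance level to validate against, and the file name is a placeholder.

// Sketch: validate a generated file against PDF/A-2b with the veraPDF CLI.
// Assumes the "verapdf" launcher is installed; the validation report streams
// to the console via inheritIO().
import java.io.IOException;

public class ValidatePdfA2b {
    public static void main(String[] args) throws IOException, InterruptedException {
        int exit = new ProcessBuilder("verapdf", "--flavour", "2b", "document.pdf")
                .inheritIO()
                .start()
                .waitFor();
        System.out.println("veraPDF exit code: " + exit);
    }
}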
Our implementations showed that veraPDF 1.4 verified more cases than Acrobat DC Preflight. Figure 1 shows a PDF file structure and its metadata.

Figure 1. A PDF object tree with root-level metadata.

RUNTIME AND CONCLUSION

The time complexity of our code is O(n log n) because of the sorting algorithm used. TIFFs were first converted to JPEG2000. When JPEG2000 images are added to a PDF/A-2 file, no further image manipulation is required because the generated PDF/A-2 uses the JPEG2000 data directly (in other words, it uses the JPXDecode filter). Tables 1 and 2 show the performance comparison run in our computer hardware and software environment (Intel Core i7-2600 CPU @ 3.4 GHz, 8 GB DDR3 RAM, 3 TB 7,200-RPM 64 MB-cache hard disk, running Ubuntu 16.10).

Table 1. Runtimes of converting grayscale TIFFs to JPEG2000s and to PDF/A-2b

No. of Files | Total File Size (MB) | Image Conversion Runtime (TIFFs to JP2s, in seconds) | Total Runtime (TIFFs to JP2s to a single PDF/A-2b, in seconds)
1 | 9.1 | 3.61 | 3.98
10 | 91.1 | 35.63 | 36.71
20 | 182.2 | 71.83 | 73.98
50 | 455.5 | 179.06 | 184.63
100 | 910.9 | 358.3 | 370.91

Table 2. Runtimes of converting color TIFFs to JPEG2000s and to PDF/A-2b

No. of Files | Total File Size (MB) | Image Conversion Runtime (TIFFs to JP2s, in seconds) | Total Runtime (TIFFs to JP2s to a single PDF/A-2b, in seconds)
1 | 27.3 | 14.80 | 14.94
10 | 273 | 150.51 | 151.55
20 | 546 | 289.95 | 293.21
50 | 1,415 | 741.89 | 749.75
100 | 2,730 | 1490.49 | 1509.23

The results show that (a) the majority of the runtime (more than 95 percent) is spent converting TIFFs to JPEG2000 using ImageMagick (see figure 2); (b) the average runtime of converting a TIFF has a constant positive relationship with the file’s size (see figure 2); (c) in comparison, the runtime of converting a color TIFF is significantly higher than that of converting a greyscale TIFF (see figure 2); and (d) it is feasible in terms of time and resources to convert the existing master images of digital document collections to PDF/A-2b. For example, converting 1 TB of color TIFFs would take 552,831 seconds (153.5 hours; 6.398 days) on the above hardware. The authors have already processed more than 600,000 TIFFs using this method.

The authors conclude that PDF/A gives institutions the advantages of the newly preferred master file format for digitization of text documents over TIFF/JPEG2000. The above implementation demonstrates the ease, the reasonable runtime, and the availability of open source software to perform such conversions. From both the theoretical analysis and the empirical evidence, the authors show that PDF/A has advantages over the traditionally preferred file format, TIFF, for digitization of text documents. Following best practice, a PDF/A file can be a self-contained and self-described container that accommodates all the data from digitization of textual materials, including page-level metadata and ICC profiles.

SUMMARY

The goal of this article is to demonstrate empirical evidence of using PDF/A for digitization of text documents. The authors evaluated and used multiple open source software programs for processing raster images, extracting image metadata, and generating PDF/A files. These PDF/A files were validated using the up-to-date PDF/A validators veraPDF and Acrobat Preflight.
The authors also calculated the time complexity of the program and measured the total runtime in multiple testing cases. Most of the runtime was spent on image conversions from TIFF to JPEG2000. The creation of the PDF/A-2b file with associated page-level metadata accounted for less than 5 percent of the total runtime. The runtime of converting a color TIFF was much higher than that of converting a greyscale one. Our theoretical analysis and empirical examples show that using PDF/A-2 presents many advantages over the traditionally preferred file format (TIFF/JPEG2000) for digitization of text documents.

Figure 2. File size, greyscale and color TIFFs, and runtime ratio.

APPENDIX A: SAMPLE TIFF METADATA WITH ICC HEADER

[The XML tags of this ExifTool listing did not survive text extraction; only the values remain. The sample described a 3400 × 4680 pixel, 8-bits-per-sample, uncompressed RGB TIFF at 400 × 400 dpi (chunky planar configuration) with an embedded “EPSON sRGB” display ICC profile (version 2.2.0, dated 2006:02:02, rendering intent Perceptual, copyright SEIKO EPSON CORPORATION 2000–2006), including its white point, colorant chromaticities, and binary tone-curve data.]

APPENDIX B: SAMPLE CODE TO CONVERT PDF/A-2 BACK TO JPEG2000S

/* Assumption: The PDF/A-2b file was specifically generated from image objects
   converted from TIFF images with JPXDecode, along with page-level metadata */
public static void parse(String src, String dest) throws IOException {
    PdfReader reader = new PdfReader(src);
    PdfObject obj;
    int counter = 0;
    for (int i = 1; i <= reader.getXrefSize(); i++) {
        obj = reader.getPdfObject(i);
        if (obj != null && obj.isStream()) {
            PRStream stream = (PRStream) obj;
            byte[] b;
            try {
                b = PdfReader.getStreamBytes(stream);
            } catch (UnsupportedPdfException e) {
                // JPXDecode streams cannot be decoded by iText; keep the raw JP2 bytes
                b = PdfReader.getStreamBytesRaw(stream);
            }
            PdfObject pdfsubtype = stream.get(PdfName.SUBTYPE);
            FileOutputStream fos = null;
            if (pdfsubtype != null && pdfsubtype.toString().equals(PdfName.XML.toString())) {
                fos = new FileOutputStream(dest + "_xml/" + counter + ".xml");
                System.out.println("Page metadata extracted!");
            }
            if (pdfsubtype != null && pdfsubtype.toString().equals(PdfName.IMAGE.toString())) {
                counter++;
                fos = new FileOutputStream(dest + "_jp2/" + counter + ".jp2");
            }
            if (fos != null) {
                fos.write(b);
                fos.flush();
                fos.close();
                System.out.println("JPEG2000 conversion from PDF completed!");
            }
        }
    }
}
/* Then use the ImageMagick library to convert the JPEG2000s to TIFFs */

REFERENCES

1 PDF-Tools.com and PDF Association, “PDF/A—The Standard for Long-Term Archiving,” version 2.4, white paper, May 20, 2009, http://www.pdf-tools.com/public/downloads/whitepapers/whitepaper-pdfa.pdf; Duff Johnson, “White Paper: How to Implement PDF/A,” Talking PDF, August 24, 2010, https://talkingpdf.org/white-paper-how-to-implement-pdfa/; Alexandra Oettler, “PDF/A in a Nutshell 2.0: PDF for Long-Term Archiving,” Association for Digital Standards, 2013, https://www.pdfa.org/wp-content/until2016_uploads/2013/05/PDFA_in_a_Nutshell_211.pdf;
Library of Congress, “PDF/A, PDF for Long-Term Preservation,” last modified July 27, 2017, https://www.loc.gov/preservation/digital/formats/fdd/fdd000318.shtml.
2 Library of Congress, “The Time and Place for PDF: An Interview with Duff Johnson of the PDF Association,” The Signal (blog), December 12, 2017, https://blogs.loc.gov/thesignal/2017/12/the-time-and-place-for-pdf-an-interview-with-duff-johnson-of-the-pdf-association/.
3 Yan Han, “Beyond TIFF and JPEG2000: PDF/A as an OAIS Submission Information Package Container,” Library Hi Tech 33, no. 3 (2015): 409–23, https://doi.org/10.1108/LHT-06-2015-0068.
4 Federal Agencies Digital Guidelines Initiative, Technical Guidelines for Digitizing Cultural Heritage Materials (Washington, DC: Federal Agencies Digital Guidelines Initiative, 2016), http://www.digitizationguidelines.gov/guidelines/FADGI%20Federal%20%20Agencies%20Digital%20Guidelines%20Initiative-2016%20Final_rev1.pdf.
5 Duff Johnson, “US Federal Agencies Approve PDF/A,” PDF Association, September 2, 2016, http://www.pdfa.org/new/us-federal-agencies-approve-pdfa/.
6 Bruno Lowagie, iText in Action, 2nd ed. (Stamford, CT: Manning, 2010).
7 “iText 5.4.4,” iText, last modified September 16, 2013, http://itextpdf.com/changelog/544.
8 Timothy Robert Hart and Denise de Vries, “Metadata Provenance and Vulnerability,” Information Technology and Libraries 36, no. 4 (2017), https://doi.org/10.6017/ital.v36i4.10146.
9 Johan van der Knijff, “JPEG 2000 for Long-Term Preservation: JP2 as a Preservation Format,” D-Lib 17, no. 5/6 (2011), https://doi.org/10.1045/may2011-vanderknijff.
10 PDF Association, “How veraPDF Does PDF/A Validation,” 2016, http://www.pdfa.org/how-verapdf-does-pdfa-validation/.
9953 ---- Mobile Website Use and Advanced Researchers: Understanding Library Users at a University Marine Sciences Branch Campus

Mary J. Markland, Hannah Gascho Rempel, and Laurie Bridges

Mary J. Markland (mary.markland@oregonstate.edu) is Head, Guin Library; Hannah Gascho Rempel (hannah.rempel@oregonstate.edu) is Science Librarian and Coordinator of Graduate Student Success Services; and Laurie Bridges (laurie.bridges@oregonstate.edu) is Instruction and Outreach Librarian, Oregon State University Libraries and Press.

ABSTRACT

This exploratory study examined the use of the Oregon State University Libraries website via mobile devices by advanced researchers at an off-campus branch location. Branch campus–affiliated faculty, staff, and graduate students were invited to participate in a survey to determine their research behaviors via mobile devices, including the frequency of their mobile library website use and the tasks they were attempting to complete. Findings showed that while these advanced researchers do periodically use the library website via mobile devices, mobile devices are not their primary mode of searching for articles and books or of reading scholarly sources. Mobile devices are most frequently used for viewing the library website when these advanced researchers are at home or in transit. Results of this survey will be used to address knowledge gaps around library resources and research tools and to generate more ways to study advanced researchers’ use of library services via mobile devices.

INTRODUCTION

As use of mobile devices has expanded in the academic environment, so has the practice of gathering data from multiple sources about what mobile resources are and are not being used. This data informs the design decisions and resource investments libraries make in mobile tools. Web analytics is one tool that allows researchers to discover which devices patrons use to access library webpages. But web analytics data do not show what patrons want to do and what hurdles they face when using the library website via a mobile device. Web analytics also lacks nuance in that it cannot distinguish user characteristics, such as whether users are novice or advanced researchers, which may affect how these users interact with a mobile device. User surveys are another tool for gathering data on mobile behaviors. User surveys help overcome some of the limitations of web analytics data by directly asking users about their perceived research skills and the resources they use on a mobile device. As is the case at most libraries, Oregon State University Libraries serves a diverse range of users. We were interested in learning whether advanced researchers—particularly advanced researchers who work at a branch campus—use the library’s resources differently than main
campus users. We were chiefly interested in these advanced researchers because of the mobile nature of their work. They are graduate students and faculty in the field of marine science who work in a variety of locations, including their offices, their labs, and the field (which can include rivers, lakes, and the ocean). We focused on the use of the library website via mobile devices as one way to determine whether specific library services should be adapted to best meet the needs of this targeted user community. Oregon State University (OSU) is Oregon’s land-grant university; its home campus is in Corvallis, Oregon. Hatfield Marine Science Center (HMSC) in Newport is a branch campus that includes a branch library. Guin Library at HMSC serves OSU students and faculty from across the OSU colleges along with the co-located federal and state agencies of the National Oceanic and Atmospheric Administration (NOAA), US Fish and Wildlife Service, Environmental Protection Agency (EPA), United States Geological Survey (USGS), United States Department of Agriculture (USDA), and the Oregon Department of Fish and Wildlife. The Guin Library is in Newport, which is forty-five miles from the main campus. Like many other branch libraries, Guin Library was established at a time when providing a print collection close to where researchers work was paramount, but today it must adapt its services to meet the changing information needs of its user base. Branch libraries are typically designed to serve a specific clientele or subject area, which can create an institutional culture different from that of the main library. Guin Library serves advanced undergraduates, graduate students, and scientific researchers. HMSC’s distance from Corvallis, the small size of the researcher community, and the shared focus on a research area—marine sciences—create a distinct culture. While Guin Library is often referred to as the “heart of HMSC,” the number of in-person library users is decreasing. This decline is not unexpected, as numerous studies have shown that faculty and graduate students have fewer needs that require an in-person trip to the library.1 Studies have also shown that faculty and graduate students can be unaware of the services and resources that libraries provide, thereby continuing the cycle of underuse.2 To learn more about the needs of HMSC’s advanced researchers, this exploratory study examined their research behaviors via mobile devices. The goals of this study were to

• determine if and with what frequency advanced researchers at HMSC use the OSU Libraries website via mobile devices;
• gather a list of tasks advanced users attempt to accomplish when they visit the OSU Libraries website on a mobile device; and
• determine whether the mobile behaviors of these advanced researchers are different from those of researchers from the main OSU campus (including undergraduate students), and if so, whether these differences warrant alternative modes of design or service delivery.
LITERATURE REVIEW
The conversation about how best to design mobile library websites has shifted over the past decade. Early in the mobile-adoption process some libraries focused on creating special websites or apps that worked with mobile devices.3 While libraries globally might still be creating mobile-specific websites and apps,4 US libraries are trending toward responsively designed websites as a more user-friendly option and a simpler solution for most libraries with limited staff and budgets.5
Most of the literature on mobile-device use in higher education is focused on undergraduates across a wide range of majors who are using a standard academic library.6 To help provide context for how libraries have designed their websites for mobile users, some of those specific findings will be shared later. But because our study focused on graduate students and faculty in a science-focused branch library, we will begin with a discussion of what is known about more advanced researchers’ use of library services and their mobile-device habits.
Several themes emerged from the literature on graduate students’ relationships with libraries. In an ironic twist, faculty think graduate students are being assisted by the library while librarians think faculty are providing graduate students with the help they need to be successful.7 As a result, many graduate students end up using their library’s resources in an entirely disintermediated way. Graduate students, especially those in the sciences, visit the physical library less often and use online resources more than undergraduate students.8 Most graduate students start their research process with assistance from academic staff, such as advisors and committee members,9 and are unaware of many library services and resources.10 As frequent virtual-library users who receive little guidance on how to use the library’s tools, graduate students need a library website that is clear in scope and purpose, offers help, and has targeted services.11
Compared to reports on undergraduate use of mobile devices to access their library’s website, relatively few studies have focused on graduate-student or faculty mobile behaviors. A recent survey of Japanese Library and Information Science (LIS) students compared undergraduate and graduate students’ usage of mobile devices to access library services and found slight differences.
However, both groups reported accessing libraries as last on their list of preferred smartphone uses.12 Aharony examined the mobile use behaviors of Israeli LIS graduate students and found that approximately half of these graduate students used smartphones, perceived them to be useful and easy tools in their everyday lives, and could transfer those habits to library searching behaviors.13 When looking specifically at how patrons use library services via a mobile device, Rempel and Bridges found the top reason graduate students at their main campus used the OSU Libraries website via mobile devices was to find information on library hours, followed by finding a book and researching a topic.14 Barnett-Ellis and Vann surveyed their small university and found that both undergraduate and graduate students were more than twice as likely to use mobile devices as were their faculty and staff; a majority of students also indicated they were likely to use mobile devices to conduct research.15 Finally, survey results showed graduate students in Hofstra University’s College of Education reported accessing library materials via a mobile device twice as often as other student groups. In addition, these graduate students reported being comfortable reading articles up to five pages long on their mobile devices. Graduate students were also more likely to be at home when using their mobile device to access the library, a finding the authors attributed to education graduate students frequently being employed as full-time teachers.16
Research on how faculty members use library resources characterizes a population that is confident in their literature-searching skills, prefers to search on their own, and has little direct contact with the library.17 Faculty researchers highly value convenience;18 they rely primarily on electronic access to journal articles but prefer print access to monographs.19 Faculty tend to be self-trained at using search tools, such as PubMed or other online databases, and therefore are not always aware of the more in-depth functionality of these tools.20 In contrast to graduate students, Rempel and Bridges found that faculty using the library website via mobile devices were less interested in information about the physical library, such as library hours, and were more likely to be researching a topic.21
Medical faculty are one of the few faculty groups whose mobile-research behaviors have been specifically examined. A survey administered by Bushhousen et al. at a medical university revealed that a third of respondents used mobile apps for research-related activities.22 Findings by Boruff and Storie indicate that one of the biggest barriers to mobile use in health-related academic settings was wireless access.23 Thus apps that did not require the user to be connected to the internet were highly desired. Faculty and graduate students in health-related academic settings saw a role for the library in advocating for better wireless infrastructure, providing access to a targeted set of heavily used resources, and providing online guides or in-person tutorials on mobile apps or procedures specific to their institution.24
According to the literature, most design decisions for library mobile sites have been made on the basis of information collected about undergraduate students’ behavior at main campuses.
To help inform our understanding of how recent decisions have been made, the remainder of the literature review focuses on what is known about undergraduate students’ mobile behavior. Undergraduate students are very comfortable using mobile technologies and perceive themselves to be skilled with these devices. According to the 2015 EDUCAUSE Center for Research and Analysis’ (ECAR) study of undergraduate students and information technology, most undergraduate students consider themselves sophisticated technology users who are engaged with information technologies.25 Undergraduate students mainly use their smartphones for non-class activities. But students indicate they could be more effective technology users if they were more skilled at tools such as the learning management system, online collaboration tools, e-books, or laptops and smartphones in class. Of interest to libraries is the ECAR participants’ top area of reported interest, “search tools to find reference or other information online for class work.”26
However, when a mobile library site is in place, usage rates have been found to be lower than anticipated. In a study of undergraduate science students, Salisbury et al. found only 2 percent of respondents reported using their cell phones to access library databases or the library’s catalog every hour or daily, despite 66 percent of the students browsing the internet using their mobile phone hourly or daily. Salisbury et al. speculated that users need to be told about mobile-optimized library resources if libraries want to increase usage.27
Rempel and Bridges used a pop-up interrupt survey while users were accessing the OSU Libraries mobile site.28 This approach allowed a larger cross-section of library users to be surveyed. It also reduced memory errors by capturing their activities in real time. Activities that had been included in the mobile site because of their perceived usefulness in a mobile environment, such as directions, asking a librarian a question, and the coffee shop webcam, were rarely cited as a reason for visiting the mobile site.
The OSU Libraries branch at HMSC is entering a new era. A Marine Studies Initiative will result in the building of a new multidisciplinary research campus at HMSC that aims to serve five hundred undergraduate students. The change in demographics and the increase in students who will need to be served have prompted Guin Library staff to explore how the current population of advanced researchers interacts with library resources. In addition, examining the ways undergraduate students at the main campus use these tools will help with planning for the upcoming changes in the user community.
METHODS
This study used an online Qualtrics survey to gather information about how frequently advanced researchers (graduate students, faculty, and affiliated scientists at a branch library for marine science) use the OSU Libraries website via mobile devices, what they search for, and other ways they use mobile devices to support their research behaviors.
A recruitment email with a link to the survey was sent to three discussion lists used by the HMSC community in Spring 2016. The survey was available for four weeks, and a reminder email was sent one week before the survey closed. The invitation email included a link to an informed-consent document. Once the consent document had been reviewed, users were taken to the survey via a second link.
Respondents could provide an email address to receive a three-dollar coffee card for participating in the study, but their email address was recorded in a separate survey location to preserve their anonymity. The invitation email indicated that this survey was about using the website via a mobile device, and the first survey question asked users if they had ever accessed the library website on a mobile device. If they answered “no,” they were immediately taken to the end of the survey and were not recorded as a participant in the study.
A similar survey was conducted with users from OSU’s main campus in 2012–13 and again in 2015. The results from 2012–13 have been published previously,29 but the results from 2015 have not. While the focus of the present study is on the mobile behaviors of advanced researchers in the HMSC community, data from the 2015 main-campus study is used to provide a comparison to the broader OSU community. OSU main-campus respondents in 2015 and HMSC participants in 2016 both answered closed- and open-ended questions that explored participants’ general mobile-device behaviors and behaviors specific to using the OSU Libraries website via mobile devices. However, the HMSC survey also asked questions about behaviors related to using the OSU (nonlibrary) website via a mobile device and participants’ mobile scholarly reading and writing behaviors. The survey concluded with several demographic questions. The survey data was analyzed using Qualtrics’ cross-tab functionality and Microsoft Excel to observe trends and potential differences between user groups. Open-ended responses were examined for common themes.
Twenty-three members of the HMSC community completed the survey, whereas one hundred participants responded to the 2015 main campus survey. Participation in the 2015 survey was capped at one hundred respondents because limited incentives were available. The participation difference between the two surveys reflects several differences between the two sampled communities. The most obvious difference is size. The OSU community comprises more than thirty-six thousand students, faculty, and staff; the HMSC community is approximately five hundred students, researchers, and faculty—some of whom are also included as part of the larger OSU community. The second factor influencing response rates relates to the difference in size between the two communities, but is more striking in the HMSC community: the survey relied on a self-selected group of users who indicated they had a history of using the library website via a mobile device. Therefore, it is not possible to estimate the population size of mobile-device library-website users specific to the branch library or the main campus library. This limitation means that the results from this study cannot be used to generalize findings to all users who visit a library website via mobile devices; instead the results are intended to present a case that other libraries may compare with behaviors observed on their own campuses. Sharing the behaviors of advanced researchers at a branch campus is particularly valuable as this population has historically been understudied.
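As an illustration of the cross-tabulation step described above, the sketch below shows how exported responses could be tabulated by affiliation and reported frequency of use. The file name and column names are hypothetical; this stands in for, rather than reproduces, the authors’ actual Qualtrics and Excel workflow.

import pandas as pd

# Hypothetical export of the survey responses, one row per respondent,
# with an "affiliation" column (graduate student, faculty, ...) and a
# "frequency" column (e.g., "At least once a week").
responses = pd.read_csv("hmsc_mobile_survey.csv")

# Cross-tabulate affiliation against reported frequency of mobile
# library-website use; normalizing by row shows proportions per group
# rather than raw counts.
table = pd.crosstab(
    responses["affiliation"],
    responses["frequency"],
    normalize="index",
).round(2)
print(table)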
RESULTS AND DISCUSSION
Participant Demographics and Devices Used
Of the twenty-three respondents to the HMSC mobile behaviors survey, 13 (62 percent) were graduate students, 7 (34 percent) were faculty (this category includes faculty researchers and courtesy faculty), and one respondent was an NOAA employee. Two participants declined to declare their affiliation. Of the 97 respondents to the 2015 OSU main-campus survey who shared their affiliation, 16 (16 percent) were graduate students, 5 (5 percent) were faculty members, and 69 (71 percent) were undergraduates.
Respondents varied in the types of mobile devices they used when doing library research. Smartphones were used by 78 percent (18 respondents) and 22 percent (5 respondents) used a tablet. Apple (15 respondents) was the most common device brand used, although six of the respondents used an Android phone or tablet. Compared to the general population’s device ownership, these respondents are more likely to own Apple devices, but the two major device types owned (Apple and Android) match market trends.30
Frequency of Library Site Use on Mobile Devices
Most of the HMSC respondents are infrequent users of the library website via mobile devices: 50 percent (11 respondents) did so less than once a month; 41 percent (9 respondents) did so at least once a month; and 9 percent (2 respondents) did so at least once a week. The low level of library website usage via mobile devices was especially notable as this population reports being heavy users of the library website via laptops or desktop computers, with 82 percent (18 respondents) visiting the library website via those tools at least once a week. Researchers at HMSC used the library website via mobile devices much less often than the 2015 main-campus respondents (undergraduates, graduate students, and faculty). No HMSC respondents visited the mobile site daily compared to 10 percent of main-campus users, and only 9 percent of HMSC respondents visited weekly compared to 28 percent of main-campus users (see Figure 1).
Figure 1. 2016 HMSC participants vs. 2015 OSU main-campus participants reported frequency of library website visits via a mobile device by percent of responses.
While HMSC advanced researchers share some mobile behaviors with main-campus students, this exploratory study demonstrates they do not use the library website via mobile devices as frequently. One possible reason is that researchers rarely spend time coming and going from classes and therefore do not have small gaps of time to fill throughout their day. Instead, their daily schedule involves being in the field or in the lab collecting and analyzing data. Alternatively, they are frequently involved in writing-intensive projects such as drafting journal articles or grant proposals. They carve out specific periods to do research and do not appear to be filling time with short bursts of literature searching. They can work on laptops and do not need to multitask on a phone or tablet between classes or in other situations.
Mobile-device ownership among HMSC graduate students might also be limited because of personal budgets that do not allow for owning multiple mobile devices or for having the most recent model. In addition, this group of scientists may not be on the front edge of personal technologies, especially compared to medical researchers, because few mobile apps are designed specifically for the research needs of marine scientists.
Where Researchers Are When Using Mobile Devices for Library Tasks
Because mobile devices facilitate connecting to resources from many locations, and because advanced researchers conduct research in a range of settings—including the field, the office, and home—we asked respondents where they were most likely to use the library website via a mobile device. Thirty-two percent were most likely to be at home, 27 percent in transit, 18 percent at work, and 9 percent in the field. The popularity of using the library website via mobile devices while in transit was somewhat unexpected, but perhaps should not have been because many people try to maximize their travel time by multitasking on mobile devices. The distance from the main campus might explain this finding because a local bus service provides an easy way to travel to and from the main campus, and the hour-long trip would provide opportunities for multitasking via a mobile device.
Relatively few respondents used mobile devices to access the library website while at work. Previous studies show that a lack of reliable campus wireless internet access can affect students’ ability to use mobile technology.31 HMSC also struggles to provide consistent wireless access, and signals are spotty in many areas of our campus. Despite signal boosters in Guin Library, wireless access is still limited at times. In addition, cell phone service is equally spotty both at HMSC and up and down the coast of Oregon. It is much less frustrating to work on a device that has a wired connection to the internet while at HMSC. These respondents did use mobile devices while at home, which might indicate they had a better wireless signal there. Alternatively, working from home on a mobile device might indicate that they compartmentalize their library-research time as an activity to do at home instead of in the office. Researchers used their mobile devices to access the library while in the field less than originally expected, but upon further reflection, it made sense that researchers would be less likely to use library resources during periods of data collection for oceanic or other water-based research projects because of their focused involvement during that stage. The water-based research also increases the risk of losing mobile devices.
Library Resources Accessed via Mobile Devices
To learn more about how these respondents used the library website, we asked them to choose what they were searching for from a list of options. Respondents could choose as many options as applied to their searching behaviors. HMSC respondents’ primary reason for visiting the library’s site via a mobile device was to find a specific source: 68 percent looked for an article, 45 percent for a journal, 36 percent for a book, and 14 percent for a thesis.
Many of the HMSC respondents also looked for procedural or library-specific information: 36 percent looked for hours, 32 percent for My Account information, 18 percent for interlibrary loan, 14 percent for contact information, 9 percent for how to borrow and request books, 9 percent for workshop information, and 9 percent for Oregon estuaries bibliographies—a unique resource provided by the HMSC library. Fifty-five percent of searches were for a specific source and 43 percent were for procedural or library-specific information. Notably missing from this list were respondents who reported searching via their mobile device for directions to the library.
Compared to the 2015 OSU Libraries main-campus survey respondents, HMSC respondents were much more likely to visit the library website via a mobile device to look for an article (68 percent vs. 37 percent), find a journal (45 percent vs. 23 percent), access My Account information (32 percent vs. 7 percent), use interlibrary loan (18 percent vs. 5 percent), or find contact information (14 percent vs. 1 percent). However, unlike HMSC participants, who do not have access to course reserves at the branch library, 7 percent of OSU main-campus respondents used their mobile devices to find course reserves on the library website. See Figure 2.
Figure 2. 2016 HMSC vs. 2015 OSU main-campus participants reported searches while visiting the library website via a mobile device by percent of responses.
It is possible that HMSC users with different affiliations might use the library site via a mobile device differently. These exploratory findings show that graduate students used the greatest variety of content via mobile devices. Graduate students as a group reported using 11 of the 14 provided content choices via a mobile device while faculty reported using 8 of the 14. Graduate students were the largest group (62 percent of respondents), which might explain why as a group they searched for more types of content via mobile devices. Interestingly, faculty members and faculty researchers reported looking for a thesis via a mobile device, but no graduate students did. Perhaps these graduate students had not yet learned about the usefulness of referencing past theses as a starting point for their own thesis writing. Or perhaps they were only familiar with searching for journal articles on a topic. In contrast, faculty members might have been searching for specific theses for which they had provided advising or mentoring support.
To help us make decisions about how to best direct users to library content via mobile devices, we asked respondents to indicate their searching behaviors and preferences. Of the 16 HMSC respondents who answered this question, 12 (75 percent) used our web-scale discovery search box via mobile devices; 4 (25 percent) reported that they did not. Presumably these latter searchers were navigating to another database to find their sources. Of 16 respondents, only 6 (38 percent) indicated that they looked for a specific library database (as opposed to the discovery tool) when using a mobile device. Those respondents who were looking for a database tended to be looking for the Web of Science database, which makes sense for their field of study.
When conducting searches for sources on their mobile devices, HMSC respondents employed a variety of search strategies: the 12 respondents who replied used a combination of author (75 percent), journal title (67 percent), keyword (67 percent), and book title (50 percent) searches when starting at the mobile version of the discovery tool. When asked about their preferred way to find sources, a majority of HMSC respondents reported that they tended to prefer a combination of searching and menu navigation while using the library website from mobile devices, while the remainder were evenly divided between preferring menu-driven and search-driven discovery.
While OSU Libraries does not currently provide links to any specific apps for source discovery, such as PubMed Mobile or JSTOR Browser, 13 (62 percent) of the HMSC respondents indicated they would be somewhat or very likely to use an app to access and use library services. This finding connects to the issue of reliable wireless access. Medical graduate students had a wider array of apps available to them, but the primary reason they wanted to use these apps was because they provided a better searching experience in hospitals that had intermittent wireless access—an experience to which researchers at HMSC could relate.32
University Website Use Behaviors on Mobile Devices
To help situate respondents’ library use behaviors on mobile devices in comparison to the way they use other academic resources on mobile devices, we asked HMSC respondents to describe their visits to resources on the OSU (nonlibrary) website via mobile devices. Compared to their use of the library site on a mobile device, respondents’ use of university services was higher: 43 percent (9 respondents) visited the university’s website via a mobile device at least once a week compared to only 9 percent (2 respondents) who visited the library site with that frequency. This makes sense because of the integral function many of these university services play in most university employees’ regular workflow. Respondents indicated visiting key university sites including MyOSU (a portal webpage, visited by 60 percent of respondents), the HMSC webpage (55 percent), Canvas (the university’s learning management system, visited by 50 percent of respondents), and webmail (45 percent). See Figure 3.
Figure 3. University webpages HMSC respondents access on a mobile device by percent of responses.
University resources such as campus maps, parking locations, and the graduate school website were frequently used by this population. The use of the first two makes sense as HMSC users are located off-site and need to use maps and parking guidance when they visit the main campus. The use of the graduate school website makes sense because the respondents were primarily graduate students and graduate school guidelines are a necessary source of information. Interestingly, our advanced users are similar to undergraduates in that they primarily read email, information from social networking sites, and news on their mobile devices.33
Other Research Behaviors on Mobile Devices
We wanted to know what other research-related behaviors the HMSC respondents are engaged in via mobile devices to determine if there might be additional ways to support researchers’ workflows.
We specifically asked about respondents’ reading, writing, and note-taking behaviors to learn how well these respondents have integrated them with their mobile usage behaviors. All respondents reported reading on their mobile device (see Figure 4). Email represented the most common reading activity (95 percent), followed by “quick reading” activities, such as reading social networking posts (81 percent), current news (81 percent), and blog posts (62 percent). Smaller numbers used their mobile devices for academic or long-form reading, such as reading scholarly articles (33 percent) or books (19 percent). Of those respondents who read articles and books on their mobile devices, only a few highlighted or took notes using their mobile device. Seven respondents used a citation manager on their mobile device: three used EndNote, one used Mendeley, one used Pages, and one used Zotero. One respondent used Evernote on their mobile device, and one advanced user reported using specific data and database management software, websites, and apps related to their projects. More advanced and interactive mobile-reading features, such as online spatial landmarks, might be needed before reading scholarly articles on mobile devices becomes more common.34
Figure 4. What HMSC respondents reported reading on a mobile device by percent of responses.
LIMITATIONS
This exploratory study had several limitations, most of which reflect the nature of doing research with a small population at a branch campus. This study had a small sample size, which limited observations of this population; however, future studies could use research techniques such as interviews or ethnographic studies to gather deep qualitative information about mobile-use behaviors in this population. A second limitation was that previous studies of the OSU Libraries mobile website used Google Analytics to compare survey results with what users were actually doing on the library website. Unfortunately, this was not possible for this study. Because of how HMSC’s network was set up, anyone at HMSC using the OSU internet connections is assigned an IP address that shows a Corvallis, Oregon, location rather than a Newport, Oregon, location, which rendered parsing HMSC-specific users in Google Analytics impossible. The research behaviors of advanced researchers at a branch campus have not been well examined; despite its limitations, this study provides beneficial insights into the behaviors of this user population.
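To make the network constraint concrete: had branch traffic arrived from its own address range, page views could have been segmented by IP prefix. The sketch below illustrates that idea with an invented private address range and log layout; it does not reflect OSU’s actual network configuration or Google Analytics itself.

import ipaddress

import pandas as pd

# Hypothetical page-view log; in reality all HMSC traffic egressed through
# addresses that geolocate to Corvallis, so no branch-specific range exists.
views = pd.DataFrame({
    "page": ["/", "/hours", "/databases"],
    "ip": ["10.1.2.14", "10.1.2.20", "10.9.0.5"],
})

# Invented range standing in for a distinct branch-campus subnet.
branch_subnet = ipaddress.ip_network("10.1.2.0/24")

# Flag page views whose source address falls inside the branch range.
views["is_branch"] = views["ip"].apply(
    lambda addr: ipaddress.ip_address(addr) in branch_subnet
)
print(views[views["is_branch"]])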
CONCLUSION
Focusing on how advanced researchers at a branch campus use mobile devices while accessing library and other campus information provides a snapshot of key trends among this user group. These exploratory findings show that these advanced researchers are infrequent users of library resources via mobile devices and, contrary to our initial expectations, are not using mobile devices as a research resource while conducting field-based research. Findings showed that while these advanced researchers do periodically use the library website via mobile devices, mobile devices are not the primary mode of searching for articles and books or for reading scholarly sources. Mobile devices are most frequently used for viewing the library website when these advanced researchers are at home or in transit.
The results of this survey will be used to address the HMSC knowledge gaps around use of library resources and research tools via mobile devices. Both graduate students and faculty lack awareness of library resources and services and have unsophisticated library research skills.35 While the OSU main campus has library workshops for graduate students and faculty, these workshops have been inconsistently duplicated at the Guin Library. Because the people working at HMSC come from such a wide variety of departments across OSU that focus on marine sciences, HMSC has never had a library orientation. The results indicate possible value in devising ways to promote Guin Library’s resources and services locally, which could include highlighting the availability of mobile library access. While several participants mentioned using research tools like Evernote, Pages, or Zotero on their mobile devices, most participants did not report enhancing their mobile research experience with these mobile-friendly tools. Workshops specifically modeling how to use mobile-friendly tools and apps such as Dropbox, Evernote, GoodReader, or Browzine could help introduce the benefits of these tools to these advanced researchers. Because wireless access is even more of a concern for researchers at this branch location than for researchers at the main campus, database-specific apps will be explored to determine if the use of searching apps could help alleviate inconsistent wireless access. If database apps that are appropriate for marine science researchers are available, these will be promoted to this user population.
Future research might involve follow-up interviews, focus groups, or ethnographic studies, which could expand the knowledge of these researchers’ mobile-device behaviors and their perceptions of mobile devices. Exploring the technology usage by these advanced researchers in their labs, including electronic lab notebooks or other tools, might be an interesting contrast to their use of mobile devices. In addition, as the HMSC campus grows with the expansion of the Marine Studies Initiative, increasing numbers of undergraduates will use Guin Library. The ECAR 2015 statistics show that current undergraduates own multiple internet-capable devices.36 Presumably, these HMSC undergraduates will be likely to follow the trends seen in the ECAR data. Certainly, the plans to expand HMSC’s internet and wireless infrastructure will affect all its users.
Our mobile survey gave us insights into how a sample of the HMSC population uses the library’s resources and services. These observations will allow Guin Library to expand its services for the HMSC campus. We encourage other librarians to explore their unique user populations when evaluating services and resources.
REFERENCES
1. Maria Anna Jankowska, “Identifying University Professors’ Information Needs in the Challenging Environment of Information and Communication Technologies,” Journal of Academic Librarianship 30, no. 1 (2004): 51–66, https://doi.org/10.1016/j.jal.2003.11.007; Pali U. Kuruppu and Anne Marie Gruber, “Understanding the Information Needs of Academic Scholars in Agricultural and Biological Sciences,” Journal of Academic Librarianship 32, no. 6 (2006): 609–23;
Lotta Haglund and Per Olsson, “The Impact on University Libraries of Changes in Information Behavior among Academic Researchers: A Multiple Case Study,” Journal of Academic Librarianship 34, no. 1 (2008): 52–59, https://doi.org/10.1016/j.acalib.2007.11.010; Nirmala Gunapala, “Meeting the Needs of the ‘Invisible University’: Identifying Information Needs of Postdoctoral Scholars in the Sciences,” Issues in Science and Technology Librarianship, no. 77 (Summer 2014), https://doi.org/10.5062/F4B8563P.
2. Tina Chrzastowski and Lura Joseph, “Surveying Graduate and Professional Students’ Perspectives on Library Services, Facilities and Collections at the University of Illinois at Urbana-Champaign: Does Subject Discipline Continue to Influence Library Use?,” Issues in Science and Technology Librarianship no. 45 (Winter 2006), https://doi.org/10.5062/F4DZ068J; Kuruppu and Gruber, “Understanding the Information Needs of Academic Scholars in Agricultural and Biological Sciences”; Haglund and Olsson, “The Impact on University Libraries of Changes in Information Behavior Among Academic Researchers.”
3. Ellyssa Kroski, “On the Move with the Mobile Web: Libraries and Mobile Technologies,” Library Technology Reports 44, no. 5 (2008): 1–48, https://doi.org/10.5860/ltr.44n5.
4. Paula Torres-Pérez, Eva Méndez-Rodríguez, and Enrique Orduna-Malea, “Mobile Web Adoption in Top Ranked University Libraries: A Preliminary Study,” Journal of Academic Librarianship 42, no. 4 (2016): 329–39, https://doi.org/10.1016/j.acalib.2016.05.011.
5. David J. Comeaux, “Web Design Trends in Academic Libraries—A Longitudinal Study,” Journal of Web Librarianship 11, no. 1 (2017), 1–15, https://doi.org/10.1080/19322909.2016.1230031; Zebulin Evelhoch, “Mobile Web Site Ease of Use: An Analysis of Orbis Cascade Alliance Member Web Sites,” Journal of Web Librarianship 10, no. 2 (2016): 101–23, https://doi.org/10.1080/19322909.2016.1167649.
6. Barbara Blummer and Jeffrey M. Kenton, “Academic Libraries’ Mobile Initiatives and Research from 2010 to the Present: Identifying Themes in the Literature,” in Handbook of Research on Mobile Devices and Applications in Higher Education Settings, ed. Laura Briz-Ponce, Juan Juanes-Méndez, and José Francisco García-Peñalvo (Hershey, PA: IGI Global, 2016), 118–39.
7. Jankowska, “Identifying University Professors’ Information Needs in the Challenging Environment of Information and Communication Technologies.”
8. Chrzastowski and Joseph, “Surveying Graduate and Professional Students’ Perspectives on Library Services, Facilities and Collections at the University of Illinois at Urbana-Champaign.”
9. Carole A. George et al., “Scholarly Use of Information: Graduate Students’ Information Seeking Behaviour,” Information Research 11, no. 4 (2006), http://www.informationr.net/ir/11-4/paper272.html.
10. Kristin Hoffman et al., “Library Research Skills: A Needs Assessment for Graduate Student Workshops,” Issues in Science and Technology Librarianship 53 (Winter-Spring 2008), https://doi.org/10.5062/F48P5XFC; Hannah Gascho Rempel and Jeanne Davidson, “Providing Information Literacy Instruction to Graduate Students through Literature Review Workshops,” Issues in Science and Technology Librarianship 53 (Winter-Spring 2008), https://doi.org/10.5062/F44X55RG.
11. Jankowska, “Identifying University Professors’ Information Needs in the Challenging Environment of Information and Communication Technologies.”
12. Ka Po Lau et al., “Educational Usage of Mobile Devices: Differences Between Postgraduate and Undergraduate Students,” Journal of Academic Librarianship 43, no. 3 (May 2017), 201–8, https://doi.org/10.1016/j.acalib.2017.03.004.
13. Noa Aharony, “Mobile Libraries: Librarians’ and Students’ Perspectives,” College & Research Libraries 75, no. 2 (2014): 202–17, https://doi.org/10.5860/crl12-415.
14. Hannah Gascho Rempel and Laurie M. Bridges, “That Was Then, This Is Now: Replacing the Mobile-Optimized Site with Responsive Design,” Information Technology and Libraries 32, no. 4 (2013): 8–24, https://doi.org/10.6017/ital.v32i4.4636.
15. Paula Barnett-Ellis and Charlcie Pettway Vann, “The Library Right There in My Hand: Determining User Needs for Mobile Services at a Medium-Sized Regional University,” Southeastern Librarian 62, no. 2 (2014): 10–15.
16. William T. Caniano and Amy Catalano, “Academic Libraries and Mobile Devices: User and Reader Preferences,” Reference Librarian 55, no. 4 (2014), 298–317, https://doi.org/10.1080/02763877.2014.929910.
17. Haglund and Olsson, “The Impact on University Libraries of Changes in Information Behavior Among Academic Researchers.”
18. Kuruppu and Gruber, “Understanding the Information Needs of Academic Scholars in Agricultural and Biological Sciences.”
19. Christine Wolff, Alisa B. Rod, and Roger C. Schonfeld, “Ithaka S+R US Faculty Survey 2015,” Ithaka S+R, April 4, 2016, http://www.sr.ithaka.org/publications/ithaka-sr-us-faculty-survey-2015/.
20. M. Macedo-Rouet et al., “How Do Scientists Select Articles in the PubMed Database? An Empirical Study of Criteria and Strategies,” Revue Européenne de Psychologie Appliquée/European Review of Applied Psychology 62, no. 2 (2012): 63–72.
21. Rempel and Bridges, “That Was Then, This Is Now.”
22. Ellie Bushhousen et al., “Smartphone Use at a University Health Science Center,” Medical Reference Services Quarterly 32, no. 1 (2013): 52–72, https://doi.org/10.1080/02763869.2013.749134.
23. Jill T. Boruff and Dale Storie, “Mobile Devices in Medicine: A Survey of How Medical Students, Residents, and Faculty Use Smartphones and Other Mobile Devices to Find Information,” Journal of the Medical Library Association 102, no. 1 (2014): 22–30, https://doi.org/10.3163/1536-5050.102.1.006.
24. Bushhousen et al., “Smartphone Use at a University Health Science Center”; Boruff and Storie, “Mobile Devices in Medicine.”
25. Eden Dahlstrom et al., “ECAR Study of Students and Information Technology, 2015,” research report, EDUCAUSE Center for Analysis and Research, 2015, https://library.educause.edu/~/media/files/library/2015/8/ers1510ss.pdf?la=en.
26. Ibid., 24.
27. Lutishoor Salisbury, Jozef Laincz, and Jeremy J. Smith, “Science and Technology Undergraduate Students’ Use of the Internet, Cell Phones and Social Networking Sites to Access Library Information,” Issues in Science and Technology Librarianship 69 (Spring 2012), https://doi.org/10.5062/F4SB43PD.
28. Rempel and Bridges, “That Was Then, This Is Now.”
29. Ibid.
30. “Mobile/Tablet Operating System Market Share,” NetMarketShare, March 2017, https://www.netmarketshare.com/operating-system-market-share.aspx?qprid=8&qpcustomd=1.
31. Boruff and Storie, “Mobile Devices in Medicine”; Patrick Lo et al., “Use of Smartphones by Art and Design Students for Accessing Library Services and Learning,” Library Hi Tech 34, no. 2 (2016): 224–38, https://doi.org/10.1108/LHT-02-2016-0015.
32. Boruff and Storie, “Mobile Devices in Medicine.”
33. Dahlstrom et al., “ECAR Study of Students and Information Technology, 2015.”
34. Caroline Myrberg and Ninna Wiberg, “Screen vs. Paper: What Is the Difference for Reading and Learning?” Insights 28, no. 2 (2015): 49–54, https://doi.org/10.1629/uksg.236.
35. Barnett-Ellis and Vann, “The Library Right There in My Hand”; Haglund and Olsson, “The Impact on University Libraries of Changes in Information Behavior Among Academic Researchers”; Hoffman et al., “Library Research Skills”; Kuruppu and Gruber, “Understanding the Information Needs of Academic Scholars in Agricultural and Biological Sciences”; Lau et al., “Educational Usage of Mobile Devices”; Macedo-Rouet et al., “How Do Scientists Select Articles in the PubMed Database?”
36. Dahlstrom et al., “ECAR Study of Students and Information Technology, 2015.”
9959 ---- Everyone’s Invited: A Website Usability Study Involving Multiple Library Stakeholders
Elena Azadbakht, John Blair, and Lisa Jones
INFORMATION TECHNOLOGY AND LIBRARIES | DECEMBER 2017
Elena Azadbakht (elena.azadbakht@usm.edu) is Health and Nursing Librarian and Assistant Professor, John Blair (john.blair@usm.edu) is Web Services Coordinator, and Lisa Jones (lisa.r.jones@usm.edu) is Head of Finance and Information Technology, University of Southern Mississippi, Hattiesburg, Mississippi.
ABSTRACT
This article describes a usability study of the University of Southern Mississippi Libraries website conducted in early 2016. The study involved six participants from each of four key user groups—undergraduate students, graduate students, faculty, and library employees—and consisted of six typical library search tasks, such as finding a book and an article on a topic, locating a journal by title, and looking up hours of operation. Library employees and graduate students completed the study’s tasks most successfully, whereas undergraduate students performed relatively simple searches and relied on the Libraries’ discovery tool, Primo. The study’s results revealed several problematic features that affected each user group, including library employees. These results increased internal buy-in for usability-related changes to the library website in a later redesign.
INTRODUCTION
Within the last decade, usability testing has become a common way for libraries to assess their websites. Eager to gain a better understanding of how users experience our website, we assembled a two-person team and conducted the first usability study of the University of Southern Mississippi Libraries website in February 2016. The Web Advisory Committee—which is tasked with developing, maintaining, and enhancing the Libraries’ online presence—wanted to determine if the content on the website was organized in a way that made sense to users and facilitated the efficient use of the Libraries’ online resources. Our usability study involved six participants from each of the following library user groups: undergraduate students, graduate students, faculty, and library employees. Student and faculty participants represented several academic disciplines and departments. All of the library employees involved in the study work in public-facing roles.
The Web Advisory Committee and Libraries’ administration wanted to know how each of these groups differed in their website use and whether they had difficulty with the same architecture or features. Usability testing helped illuminate which aspects of the website’s design might be hindering users from accomplishing key tasks, thereby identifying where and how improvement needed to be made. We included library employees in this study to compare their approach to the website to that of other users in the hope of increasing internal stakeholders’ buy-in for recommendations resulting from this study. This article will discuss the usability study’s design, results, and recommendations as well as the implications of the study’s findings for similarly situated academic libraries. We will give special consideration to how the behavior of library employees compared to that of other groups.
LITERATURE REVIEW
The literature on library-website user experience and usability is extensive. In 2007, Blummer conducted a literature review of research related to academic-library websites, including usability studies. Her article provides an overview of the goals and outcomes of early library-website usability studies.1 More recent articles focus on a portion or aspect of a library’s website such as the homepage, federated search or discovery tool, or subject guides. Fagan published an article in 2010 that reviews user studies of faceted browsing and outlines several best practices for designing studies that focus on next-generation catalogs or discovery tools.2
Other library-website studies have reported on the habits of user groups, with undergraduates being the most commonly studied constituent group. Emde, Morris, and Claassen-Wilson observed University of Kansas faculty and graduate students’ use of the library website, which had been recently redesigned, including a new federated search tool.3 Many of the study’s participants gravitated toward the subject-specific resources they were familiar with and either missed or avoided using the website’s new features. When asked for their opinions on the federated search tool, several participants said that while it was not a tool they saw themselves using, they did see how it might be helpful for undergraduate students who were still new to research. The researchers also provided the participants with an article citation and asked them to locate it using the library’s website or online resources. While half the participants did use the website’s “E-Journals” link, others were less successful. Some who had the most difficulty “search[ed] for the journal title in a search box that was set up to search database titles.”4 This led Emde, Morris, and Claassen-Wilson to observe that “locating journal articles from known citations is a difficult concept even for some advanced researchers.”
Turner’s 2011 article describes the result of a usability study at Syracuse University Library that included both students and library staff. Participants were asked to start at the library’s homepage and complete five tasks designed to emulate the types of searches a typical library user might perform, such as finding a specific book, a multimedia item, an article in the journal Nature, and primary sources pertaining to a historic event.5
When asked to find Toni Morrison’s Beloved, most staff members used the library’s traditional online catalog whereas students almost always began their searches with the federated search tool located on the homepage. Participants of both types were less successful at locating a primary source, although this task highlighted key differences in each group’s approach to searching the library website. Since library staff were more familiar than students with the library’s collections and online search tools, they relied more on facets and limiters to narrow their searches, and some even began their searches by navigating to the library’s webpage for special collections. Library staff tended to be more persistent; draw upon their greater knowledge of the library’s collections, website, and search tools; and use special syntax in their searches, like inverting an author’s first and last names. “Library staff took more time, on average, to locate materials,” writes Turner, because of their “interest in trying alternative strategies.”6 Students, on the other hand, usually included more detail than necessary in their search queries (such as adding a word related to the format they were searching for after their keywords) and could not always differentiate various types of catalog records, for example, the record for a book review and the record for the book itself. Turner concludes that the students’ mental models for searching online and their experiences with other web-search environments influence their expectations of how library search tools work and that library-website design should take these mental models into consideration.
Research on the search behaviors of students versus more experienced researchers or subject experts also has implications for library website design. Two recent articles explore the different mental models or mindsets students bring to a search. The students in Asher and Duke’s 2012 study “generally treated all search boxes as the equivalent of a Google search box” and used very simple keyword searches.7 This tracked with Holman’s 2010 study, which likewise found that the students she observed relied on simple search strategies and did not understand how search interfaces and systems are structured.8
METHODS
Our research team consisted of the Libraries’ health and nursing librarian and the web services coordinator. We worked closely with the head of finance and information technology in designing and running the usability study. A two-week period in mid-February 2016 was chosen for usability testing to avoid losing potential participants to midterms or spring break. We posted a call for participants to two university discussion lists, on the Libraries website, and on social media (Facebook and Twitter). We also reached out directly to faculty in academic departments we regularly work with and emailed library employees directly. We directed nonlibrary participants to a web form on the Libraries website to provide their name, contact information, university affiliation/class standing, and availability. The health and nursing librarian followed up with and scheduled participants on the basis of their availability. Each student participant received a ten-dollar print card and each faculty participant received a ten-dollar Starbucks gift card. To record the testing sessions, we needed a free or low-cost software option.
Because the Libraries already had a subscription to Screencast-O-Matic for developing video tutorials, and because the tool allows for simultaneous screen, audio, and video capture, we decided to use it to record all testing sessions. We also used a spare laptop with an embedded camera and microphone.
The health and nursing librarian served as both facilitator and note-taker for most usability testing sessions. Participants were given six tasks to complete. We encouraged participants to narrate as they completed each task. The sessions began with simple, secondary navigational questions like the following:
• How late is our main library open on a typical Monday night?
• How could you contact a librarian for help?
• Where would you find more information about services offered by the library?
Next, we asked the participants to complete tasks designed to assess their ability to search for specific library resources and to illuminate any difficulty users might have navigating the website in the process. Each of the three tasks focused on a particular library-resource type, including books, articles, and journals:
• Find a book about rabbits.
• Find an article about rabbits.
• Check to see if we have a subscription/access to a journal called Nature.
After the usability testing was complete, we reviewed the recordings and notes and coded them. For each task, we calculated time to completion and documented the various paths participants took to answer each question, noting any issues they encountered. We also compared the four user groups in our analysis.
Limitations
Although we controlled for user type (undergraduate, graduate, faculty, or library employee) in the recruitment of study participants, we did not screen by academic discipline. Doing so would have hindered our team’s ability to include enough graduate students and faculty members in the study, as nearly all the volunteers from these two groups were from humanities or social science fields. The results might have differed slightly had the study successfully managed to include more faculty from the so-called hard sciences and allied health fields. Additionally, the order in which we asked participants to attempt the tasks might have affected how they approached some of the later tasks. If a participant chose to search for a book using the Primo discovery tool, for example, they might be more inclined to use it to complete the next task (find an article) rather than navigate to a different online resource or tool. Despite these limitations, usability testing has helped improve the website in key ways. We plan to correct for these limitations in future studies.
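Because the coded recordings yielded per-task timings and completion outcomes, the comparison across user groups amounts to a grouped summary. The sketch below shows that summary step using invented column names and values rather than the study’s actual coded data.

import pandas as pd

# Invented coded-session data: one row per participant per task.
sessions = pd.DataFrame({
    "group": ["undergraduate", "graduate", "faculty", "library employee"],
    "task": ["find a book"] * 4,
    "seconds": [95, 60, 470, 180],
    "completed": [False, True, True, True],
})

# Mean time to completion and completion rate for each group on each task.
summary = sessions.groupby(["task", "group"]).agg(
    mean_seconds=("seconds", "mean"),
    completion_rate=("completed", "mean"),
)
print(summary)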
RESULTS
Every group included a participant who failed to complete at least one of the six tasks. An adequate answer to each of the study’s six tasks can be found within one or two pages/clicks from the Libraries homepage (Figure 1). The average distance to a solution remained at about two page loads across all of the study’s participants, despite a few individual “website safaris.”
Figure 1. University of Southern Mississippi Libraries’ homepage.
Graduate students tended to complete tasks the quickest and were generally as successful as library employees. They preferred to use Primo for finding books but tended to favor the list of scholarly databases on the “Articles & Databases” page to find articles and journals. Undergraduates were the second fastest group, but many struggled to complete one or more of the six tasks. They had the most trouble finding books and locating the journal by title. Undergraduates generally performed simple searches and had trouble recovering from missteps. They were heavy users of Primo, relying on the discovery tool more than any other group. The other two user groups, faculty and library employees, were slower at completing tasks. Of the two, faculty took the longest to complete any task and failed to complete tasks at a similar rate as undergraduates. Likewise, this group favored Primo nearly as often. In contrast, library employees took almost as long as faculty to complete tasks but were much more successful. As a group, library employees demonstrated the different paths users could take to complete each task but favored those paths they identified as the “preferred” method for finding an item or resource over the fastest route.
The majority of study participants across all user groups had little trouble with the first three tasks. Although most participants favored the less direct path to the Libraries’ hours—missing the direct link at the top of the homepage (Figure 2)—they spent relatively little time on this task. Likewise, virtually all participants took note of the links to our “Ask-A-Librarian” and “Services” pages located in our homepage’s main navigation menu. This portion of the usability study alerted us to the need for a more prominent display of our opening hours on the homepage.
Figure 2. Link to “Hours” from the homepage.
Of the second set of tasks—find a book, find an article, and determine if we have access to Nature—the first and last proved the most challenging for participants. One undergraduate was unable to complete the book task, and one faculty member took nearly eight minutes to do so—the longest time to completion of any task by any user in the study. Primo was the most preferred method for finding a book. Although an option for searching our Classic Catalog (which uses Innovative Interfaces’ Millennium integrated library system) is contained within a search widget on the homepage, Primo is the default search option and therefore users’ default choice. Interestingly, even after statements from some faculty such as “I don’t love Primo,” “Primo isn’t the best,” and “the [Classic Catalog] is better,” these participants proceeded to use Primo to find a book. Library employees were evenly split between Primo and Classic Catalog.
One undergraduate student, one graduate student, and one library employee were unable to determine whether we have access to Nature. This task was the most time-consuming for library employees because there are multiple ways to approach this question and library employees tended to favor the most consistently successful yet most time-consuming options (e.g., searching within the Classic Catalog). Lacking a clear option in the main navigation bar, the most popular path started with our “Articles & Databases” page, but the answer was most often successfully found using Primo. Several participants tried using the “Search for Databases” search box on the “Articles & Databases” page, which yielded no results because it searches only our database list. The search widget on the homepage that includes Primo has an option for searching e-journals by title, as shown in Figure 3.
However, nearly all participants who were not library employees missed this feature. Participants from both the undergraduate and graduate student user groups had trouble with this task, including those who were ultimately successful. Unfortunately, many of the undergraduates could not differentiate a journal from an article, and while graduate students were aware of the distinction, a few indicated that they were not used to the idea of finding articles from a specific journal.
Figure 3. E-journals search tab.
When it came to finding articles, undergraduates, as well as several faculty and a few library employees, gravitated toward Primo. Others, particularly graduate students and library employees, opted to search a specific database—most often Academic Search Premier or JSTOR. However, those who used Primo to answer this question arrived at an answer two to three times faster because of the discovery tool's accessibility from a search widget on the homepage. Regardless of the tool or resource they used, most participants found a sufficient result or two.
Common Breakdowns
Despite the clear label "Search for Databases," at least one participant from each user group, including library employees, attempted to enter a book title, journal name, or keyword into the LibGuides' database search tool on our "Articles & Databases" page (Figure 4). Some participants attempted this repeatedly despite getting no results. Others did not try a search but stated, with confidence, that entering a journal, book, or article title into the "Search for Databases" field would yield a relevant result. A few participants also attempted this with the search box on our Research Guides (LibGuides) page, which searches only within the content of the LibGuides themselves.
Across all groups, when not starting at the homepage, many participants had difficulty finding books because no clear menu option exists for finding books as one does for articles (our "Articles & Databases" page). This difficulty was compounded by many participants struggling to return to the Libraries homepage from within the website's subpages. Those participants who were able to navigate back to the homepage were reminded of the Primo search box located there and used it to search for books.
Figure 4. "Search for Databases" box on the "Articles & Databases" page.
Another source of breakdowns was the "Help & FAQ" page (Figure 5). Participants who turned there for help at any point in the study spent a relatively long time trying to find a usable answer and often ended up more confused than before. In fact, only one in three participants managed to use "Help & FAQ" successfully because the FAQ consists of many questions with answers on many different pages and subpages. This portion of the website had not been updated in several years and therefore the questions were not listed in order of frequency.
Figure 5. The answer to the "How do I find books?" FAQ item leads to several subpages.
DISCUSSION
Using the results of the study, we made several recommendations to the Libraries' Web Advisory Committee and administration: (1) display our hours of operation on the homepage; (2) remove the search boxes from the "Articles & Databases" and "Research Guides" pages; (3) condense the "Help & FAQ" pages; and (4) create a "Find Books" option on the homepage.
All of these recommendations were taken into account during a recent redesign of the website. We also considered each user group's performance and its implications for website design as well as instruction and outreach efforts.
First, our team suggested that the current day's hours of operation be featured prominently on the website's front page. Despite "How late is our main library open on a typical Monday night?" being one of two tasks that had a 100 percent completion rate, this change is easy to make, adds convenience, and addresses a long-voiced complaint. Several participants expressed a desire to see this change implemented. Moreover, this is something many of our peer libraries provide on their websites.
The team's next recommendation was to remove the "Find Databases by Title" search box from the "Articles & Databases" page. During the study, participants who had a particular database in mind opted to navigate directly to that database rather than search for it. Another such search box exists on the "Research Guides" page. Although most of the participants did not encounter this search box during the study, those who did also mistook it for a general search tool. Participants from all groups, especially undergraduate students, assumed that any search box on the Libraries' website was designed to search for and within resources like article databases and the online catalog, regardless of how the search box was labeled. Given our findings, libraries with similar search boxes might also consider removing these from their websites.
Another recommended change was to condense the "Help & FAQ" section of the website considerably. The "Help & FAQ" section was too large and unwieldy for participants to use successfully without becoming visibly frustrated, defeating its purpose. Moreover, Google Analytics showed that only nine of the more than one hundred "Help & FAQ" pages were used with any regularity. Going forward, we will work to identify the roughly ten most important questions to feature in this section.
The final major recommendation was to consider adding a top-level menu item called "Find Books" that would provide users with a means to escape the depths of the site and direct them to Primo or the Classic Catalog. When participants got stuck on the book-finding task, they looked for a parallel to the "Articles & Databases" menu option. A "Getting Started" page or LibGuide could take this idea a step further by also including brief, straightforward instructions on finding articles and journals by title. In effect, this option would be another way to condense and reinvent some of the topics originally addressed in the "Help & FAQ" pages.
Comparing each user group's average performance helped illuminate the strengths and weaknesses of the website's design. We suspect that graduate students were the fastest and nearly the most successful group because they are early in their academic careers and doing a great deal of their own research (as compared to faculty). Many of them are also responsible for teaching introductory courses and are working closely with first-year students who are just learning how to do research. Faculty, because their research tends to be on narrower topics, were familiar with the specific resources and tools they use in their work but were less able to efficiently navigate the parts of the website with which they have less experience.
Moreover, individual faculty varied widely in their comfort level with technology, and this affected their ability to complete certain tasks.
CONCLUSION
The results of our website usability study echo those found elsewhere in the literature. Students approach library search interfaces as if they were Google and generally conduct very simple searches. Without knowledge of the Libraries' digital environment and without the research skills library employees possess, undergraduates in our study tended to favor the most direct route to the answer—if they could identify it. This group had the most trouble with library and academic terminology or concepts like the difference between an article and a journal. Though not as quick as the graduate students, undergraduates completed tasks swiftly, mainly because of their reliance on the Primo discovery tool. However, undergraduate students were less able to recover from missteps; more of them confused the "Find Databases by Title" search tool for an article search tool than participants from any other group. Since undergraduates compose the bulk of our user base and are the least experienced researchers, we decided to focus our redesign on solutions that will help them use the website more easily.
Although all of the library employees in our study work in public-facing roles, not all of them provide regular research help or teach information literacy. Since most of them are very familiar with our website and online resources, they approached the tasks more methodically and thoroughly than other participants. Library employees tended to choose the search strategy or path to discovery that would yield the highest-quality result or they would demonstrate multiple ways of completing a given task, including any necessary workarounds.
The inclusion of library employees yielded the most powerful tool in our research team's arsenal. Holding this group's "correct" methods side by side with equally valid methods of discovery helped shake loose rigid thinking, and the fact that some library employees were unable to complete certain tasks shocked all parties in attendance when we presented our findings to stakeholders. Any potential argument that student, faculty, and staff missteps were the result of improper instruction and not of a usability issue was countered by evidence that the same missteps were sometimes made by library staff. Not only was this an eye-opening revelation to our entire staff, but it also served as the evidence our team needed to break through entrenched resistance to making any changes. We were met with almost instant, even enthusiastic, buy-in to our redesign recommendations from the Libraries' administration. Therefore, we highly recommend that other academic libraries consider including library staff as participants in their website usability studies.
REFERENCES
1 Barbara A. Blummer, "A Literature Review of Academic Library Web Page Studies," Journal of Web Librarianship 1, no. 1 (2007): 45–64, https://doi.org/10.1300/J502v01n01_04.
2 Jody Condit Fagan, "Usability Studies of Faceted Browsing: A Literature Review," Information Technology and Libraries 29, no. 2 (2010): 58–66, https://ejournals.bc.edu/ojs/index.php/ital/article/view/3144/2758.
3 Judith Z. Emde, Sara E. Morris, and Monica Claassen-Wilson, "Testing an Academic Library Website for Usability with Faculty and Graduate Students," Evidence Based Library and Information Practice 4, no. 4 (2009): 24–36, https://doi.org/10.18438/B8TK7Q.
4 Ibid., 30.
5 Nancy B. Turner, "Librarians Do It Differently: Comparative Usability Testing with Students and Library Staff," Journal of Web Librarianship 5, no. 4 (2011): 286–98, https://doi.org/10.1080/19322909.2011.624428.
6 Ibid., 295.
7 Andrew D. Asher and Lynda M. Duke, "Searching for Answers: Student Behavior at Illinois Wesleyan University," in College Libraries and Student Culture: What We Now Know (Chicago: American Library Association, 2012), 77–78.
8 Lucy Holman, "Millennial Students' Mental Models of Search: Implications for Academic Librarians and Database Developers," Journal of Academic Librarianship 37, no. 1 (2011): 21–23, https://doi.org/10.1016/j.acalib.2010.10.003.
9966 ---- Untitled
A Case Study on the Path to Resource Discovery
Beth Guay
INFORMATION TECHNOLOGY AND LIBRARIES | SEPTEMBER 2017
ABSTRACT
A meeting in April 2015 explored the potential withdrawal of valuable collections of microfilm held by the University of Maryland, College Park Libraries. This resulted in a project to identify OCLC record numbers (OCN) for addition to OCLC's Chadwyck-Healey Early English Books Online (EEBO) KBART file.1 Initially, the project was an attempt to adapt cataloging workflows to a new environment in which the copy cataloging of e-resources takes place within discovery system tools rather than traditional cataloging utilities and MARC record set or individual record downloads into online catalogs. In the course of the project, it was discovered that the microfilm and e-version bibliographic records contained metadata which had not been utilized by OCLC to improve its link resolution and discovery services for digitized versions of the microfilm resources. This metadata may be advantageous to OCLC and to others in their work to transition from MARC to linked data on the Semantic Web. With MARC record field indexing and linked data implementations, this collection and others could better support scholarly research.
Collections, Discovery Tools, and Metadata Services
The University of Maryland, College Park Libraries' (the Libraries; UM Libraries) collections include 3.45 million print books and 1.2 million eBooks, 17,000 electronic journals, and 352 electronic databases.2 In late 2011, the Libraries implemented WorldCat Local, OCLC's single-search-box interface to the WorldCat database of cataloged resources and a central index of metadata provided by publishers, Abstracting and Indexing Services, institutional repositories, and so on.
With WorldCat Local, and later, WorldCat Discovery, OCLC utilizes a knowledge base in managing e-resource discovery and access.3 Knowledge bases are "associated with link resolvers and electronic resource management systems" and "contain title-level metadata, linking syntax rules, publication ranges and other data."4 KBART files are so named to represent files compliant with the NISO recommended practice, "Knowledge Bases and Related Tools (KBART)."5 KBART files, created and supplied by content providers, are used to transmit this title-level metadata to knowledge base vendors and discovery service providers.6 Since OCLC enhances these files with OCLC numbers (OCN) in order to provide automated holdings maintenance on WorldCat bibliographic records, the Libraries' Metadata Services Department (MSD) adopted a policy in 2012 to provide access to e-resources only via WorldCat when such files are available.
Beth Guay (baguay@umd.edu) is Continuing Resources Librarian, University of Maryland Libraries, University of Maryland, College Park.
Space Planning
Early on, the Libraries' collection policies targeted duplicate copies of print monographs and print journals held electronically in trusted repositories, e.g., JSTOR, for deselection. By March 2014, the Libraries' Collection Development Council discussed moving microfilm collections to the yet to be opened Severn Library, slated to "house lesser used materials … in order to free up much needed space for users and the development of new collaborative learning spaces."7, 8 A year later, in April 2015, a meeting was called by the Assistant Head, Collection Development, to investigate microfilm collection retention decisions. This time the Libraries were considering the withdrawal of microfilm resources for which equivalent versions were held online. A caveat placed on the withdrawal of the microfilm by the collection managers was that prior to their withdrawal and subsequent deletion of the Libraries' holdings on the WorldCat bibliographic records, the equivalent e-version resources should be made discoverable in WorldCat UMD (the Libraries' WorldCat Discovery implementation) by the addition of the Libraries' holdings on e-version bibliographic records corresponding to the microfilm version records.
Following the meeting, the Librarian for English, Latin American, & Latina/o Studies and Second Language Acquisition provided the Continuing and Electronic Resources Cataloger (C-ER Cataloger) with a list of eight valuable microfilm collections and, for each, the name of the comparable online collection (or e-collection) to which the Libraries subscribed. It was agreed that the C-ER Cataloger would investigate to determine if any of those microfilm collections could be withdrawn in compliance with the collection managers' caveat. In other words, the C-ER Cataloger's mission was to ensure a one-to-one correspondence of electronic and microfilm version bibliographic records for the equivalent versions of the resources.
One of the e-collections added to the WorldCat Knowledge Base (WCKB) by the Libraries was Gale's The Making of the Modern World, 1450-1850: Part I collection (MOMW).
This collection comprises digitized versions of Gale's microfilm resources in the series, The Goldsmiths'-Kress library of economic literature.9 A KBART file was derived from the Libraries' MOMW MARC record set and uploaded to the WCKB sandbox, where it supports the Libraries' access to the e-version resources. The MOMW MARC record set had been reviewed and vetted by the Libraries prior to its purchase, and upon its purchase, Gale had set the Libraries' holdings on the WorldCat bibliographic records representing the resources. With this information in mind, the C-ER Cataloger determined that the MOMW e-resource bibliographic records were comparable to those representative of the Libraries' corresponding Goldsmiths'-Kress library of economic literature microfilm collection, thus meeting the collection managers' criteria for deselection. The 3,380 reels that could be withdrawn comprised a small but not insignificant allotment of physical space in the library.
Providing discoverability for equivalent e-versions of resources held in other collections proved difficult. For example, the corresponding microfilm collections represented in the WCKB's British Periodicals Collections I and II were held in the series, Early British periodicals and English literary periodicals.10 The Libraries had cataloged 186 individual serial titles in the microfilm series, Early British periodicals in 2002, but none in the series, English literary periodicals. Thus the objective would have been to ensure discoverability for the equivalent electronic versions of the Libraries' 186 cataloged microfilm versions in the Early British periodicals series. At the time of this investigation, there were 580 British Periodicals I and II KBART file title entries, 390 of which had OCN. Whereas the OCN of The Making of the Modern World, 1450-1850: Part I WCKB collection were known entities, the OCN of the remaining e-collections had yet to be vetted. Thus the British Periodicals Collections I and II records were spot checked for evaluation. The quality of the 390 OCLC records ranged from excellent, e.g., OCLC record #297425799, to poor, e.g., #818401694 (see Figures 1-4). MARC record images in Figures 1-4 are sourced from OCLC's Connexion cataloging client interface to the WorldCat bibliographic database. Figures 1 and 2 represent a microfilm version record and a comparable "excellent" quality record given for the resource in the WorldCat Knowledge Base, while Figures 3 and 4 represent a microfilm version and comparable "poor" quality record given for the resource in the WCKB. Note that the C-ER Cataloger's definition of an excellent quality e-version record was one which provided metadata comparable to those of its equivalent microfilm version record; likewise, a poor quality record lacked comparable metadata. In other words, an excellent quality record was viewed as a guarantor of a discoverable resource, while a poor quality record was viewed as an obstacle to discovery. For this WCKB collection, the C-ER Cataloger determined that staff expertise with serial bibliographic records was required, and due to MSD staffing limitations, moved ahead to examine the other collections.
Figure 1. Microfilm version record
Figure 2. Excellent quality e-version record — OCN in the KB file
Figure 3. Microfilm version record
Figure 4. Poor quality e-version record — OCN in the KB file
In an investigation into OCLC's Chadwyck-Healey Early English Books Online (EEBO) KBART file, for which equivalent e-versions of microfilm resources in the series Early English books, 1475-1640 and Early English books, 1641-1700 are held, it was found that the availability of comparable e-version bibliographic records was optimal.11 In consultation with the MSD department head, a project to ensure the discoverability of equivalent e-versions of the Libraries' 5,062 cataloged microfilm resources in the series, Early English books, 1475-1640 was initiated. The C-ER Cataloger had hoped to follow with a similar effort for the Libraries' resources in the series Early English books, 1641-1700 (represented by 41,306 records in the Libraries' Integrated Library System).
Background: EEBO, Related Resources and Bibliographic Records
Much has been written on EEBO's inception and continuing development as a collection of digital reproductions of microfilm reproductions of pre-1700 print resources, and on its scholarly value (Kichuk, 2007; Martin, 2007; Gadd, 2009; Mak, 2014; Folger Shakespeare Library, 2015).12 Alfred Pollard and Gilbert Redgrave's A short-title catalogue of books printed in England, Scotland, & Ireland and of English books printed abroad, 1475-1640 ("STC"), and the "companion" volume, Donald Wing's Short-title catalogue of books printed in England, Scotland, Ireland, Wales, and British America, and of English books printed in other countries, 1641-1700 ("Wing"), respectively, were used in selecting the print resources for filming.13 Gadd (2009, 683) pinpointed the STC as "a catalogue of editions (or more accurately, editions and issues) not copies although, of course, the information about any edition is derived primarily from the surviving copies … Each entry gives the location of known copies …"14 The "successor" to STC and Wing, the English Short Title Catalog (ESTC), "includes records for every item listed in STC, every item in Wing, every item in the Eighteenth Century Short Title Catalogue … and newspapers and other serials which began publication before 1801" and is freely available online from the British Library.15, 16 Gadd (2009, 685-686) offered this critique concerning EEBO's bibliographic data and relationship to the ESTC:
EEBO's relationship with the original STC and Wing is straightforward and clear; EEBO's relationship with electronic ESTC, on the other hand, is less well-known. A series of agreements made between ESTC and University Microfilms/ProQuest between 1989 and 1997 allowed EEBO to draw directly on ESTC's existing bibliographical data … EEBO heavily edited ESTC's data for its own purposes; certain categories of data were removed (e.g. collations, Stationer's Register entrances), some information was amended (e.g., subject headings), and some was added (e.g. microfilm specific details). Second, there is no formal mechanism for synchronizing the data between the two resources.
Occasionally, snapshots of data are sent by EEBO to ESTC but there is no guarantee that a correction or revision made to an ESTC entry will be replicated in the corresponding EEBO or vice-versa: neither ESTC nor EEBO will necessarily know when the other made a correction.17
Gadd posited that "as both resources continue to amend and expand their bibliographical data for their own purposes, there is an increasing likelihood of significant discrepancy between the two resources."18 He did not further address the quality of the bibliographic records describing the EEBO versions of the resources; perhaps he was unaware of the sources of the EEBO bibliographic data.
Microfilm version bibliographic records serve as the basis of the metadata describing the EEBO version resources. According to ProQuest, "MARC records (from which EEBO Bibliographic records derive) are produced for the microfilm collection Early English Books (EEB) after they are filmed."19 OCLC's cataloging database has served as one source of microfilm version records for titles in the series since the 1980s. In 1984, the Association of Research Libraries (1984, p. J-3) reported that one library had "input an indeterminate amount [of bibliographic records] into OCLC" for Early English books, 1475-1640, and that one had "input records for an indeterminate percentage of the set into OCLC" for resources in the series, Early English books 1641-1700.20 The cataloging sources of these microfilm resources have varied over time, from cooperative projects to UMI/ProQuest staff to individual libraries; however, adherence to standards has characterized the totality of the efforts invested. Joachim (1993, p. 111) described the cooperative effort begun in 1984 by the Indiana University Libraries, University of California, Riverside, University of Delaware, and the University of Utah to catalog microfilm version resources cataloged by Wing:
In order to maintain standards and consistency among the five libraries, the project director prepared a "Wing STC Project manual." The manual includes general information, information on authority work, a bibliography, a discussion of special cataloging problems and procedures, sample records, and database input guidelines.21
OCLC's MARC records for the microfilm and EEBO version resources contain note fields identifying the locations of the print copies filmed and subsequently reproduced digitally by UMI/ProQuest. Gadd (2009, p. 686) emphasized the importance of this information to scholars in stating that "different copies from the same edition might vary, sometimes markedly."22 As to Gadd's (2009) critique concerning the lack of a formal synchronization mechanism and increasing likelihood of discrepancies between EEBO and ESTC, further examination of EEBO and ESTC bibliographic record displays such as those shown in Figures 5 and 6 suggests that the British Library is working with ProQuest to align their data. It appears a focus of the British Library may be to inform the scholar of the availability of the microfilm and electronic versions of the print resources.
In its ESTC overview, the British Library states that "the existence of selected … printed and digital surrogates within products such as Early English Books Online … is … noted" in its records and that its records "act as an index to several major research microform series … including Early English Books, 1475-1640 … [and] Early English books, 1641-1700."23
Figure 5. EEBO bibliographic record for the resource cited by STC 2nd edition entry 9164 and reproduced from the copy held at the Society of Antiquaries, London.
Figure 6. ESTC catalog record for STC 2nd edition, entry 9164 (http://estc.bl.uk/S3614). The code "Lsa," given as "Loc. of filmed copy," is the British Library's MARC code for the Society of Antiquaries Library.24
Finally, to add to this mix of print, microfilm, and EEBO digitized images, XML/SGML versions of the resources are being created by the Text Creation Partnership (TCP), formed in 1999 by the university libraries of Michigan and Oxford, ProQuest, and the Council on Library and Information Resources, to provide full-text search capability.25 Catalog records describing TCP versions are available in WorldCat. According to the TCP, "the TCP does not have the resources to create new catalog records for each text we produce (though you are welcome to do so, and if you are willing to share them we would be very glad to know about it)."26
The UM Libraries' EEBO Project
The OCLC EEBO KBART file, which contained 129,544 title entries when downloaded, 58,518 of which lacked OCN, was combined with a file extracted from the 5,062 MARC records that represented the microfilm resources. The merged file was to be used as a tool in identifying the OCN of the equivalent e-versions of the microfilm resources held. The plan was to add the e-version OCN to the EEBO KBART file via OCLC's OCN correction form.27 Significant time was spent developing and documenting procedures by which staff could perform the work of identifying OCN for addition to the EEBO file. The basic procedures are as follows: (1) via the OCLC Connexion cataloging client, search and retrieve the e-version record using the microfilm version record data; (2) use titles and/or OCN of the microfilm version record to identify the comparable EEBO resource in the KBART file; (3) view the EEBO resource record using the URL in the file; and (4) record the OCN of the matching e-version record in the appropriate row/column of the file.28 Subsequently, two MSD staff members were recruited to assist in the effort. In early November and mid-December, 2015, training sessions were held with both staff members, followed by an individual session with each. Before the year's end, each staff member had successfully completed an assigned number of "titles" for review. Importantly, from the initial investigative work, a KBART file with 50 OCN was compiled and submitted to OCLC. OCLC Customer Support confirmed that the file would be loaded. Due to the ongoing developmental status of OCLC's services, the OCN were not loaded into the WCKB until June 2016. However, a second file sent in April 2016 was loaded in June as well. The number of OCN added to the WorldCat Knowledge Base from the project's inception through 2016 was small due to staffing issues.
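The file-merge step behind these procedures lends itself to partial automation. The following is a minimal sketch, assuming a tab-delimited KBART file with the standard publication_title and title_url columns plus OCLC's oclc_number column, and a hypothetical eeb_microfilm_titles.csv exported from the Libraries' microfilm MARC records; the file names, the export's columns (microfilm_ocn, title), and the title-based match key are illustrative, not the project's actual tooling.
```python
import csv

def match_key(title):
    """Crude normalized key: lowercase, keep alphanumerics, collapse spaces."""
    kept = "".join(ch for ch in title.lower() if ch.isalnum() or ch.isspace())
    return " ".join(kept.split())

# Titles and OCN exported from the 5,062 microfilm MARC records
# (assumed columns: microfilm_ocn, title).
microfilm = {}
with open("eeb_microfilm_titles.csv", newline="", encoding="utf-8") as fh:
    for row in csv.DictReader(fh):
        microfilm.setdefault(match_key(row["title"]), []).append(row)

# Walk the KBART file (tab-delimited, per the NISO recommended practice) and
# queue entries that lack an OCN but whose title matches a cataloged
# microfilm title.
queue = []
with open("eebo_kbart.txt", newline="", encoding="utf-8") as fh:
    for entry in csv.DictReader(fh, delimiter="\t"):
        if entry.get("oclc_number"):
            continue  # entry already carries an OCN; nothing to review
        for row in microfilm.get(match_key(entry["publication_title"]), []):
            queue.append((entry["publication_title"], entry["title_url"], row["microfilm_ocn"]))

print(f"{len(queue)} KBART entries queued for cataloger review")
```
A queue like this only automates step 2; a cataloger still performs steps 1, 3, and 4, retrieving the e-version record in Connexion, viewing the EEBO record at the title_url, and recording the OCN of the verified e-version match.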
The average staff time to complete a microfilm/equivalent e-version title entry in the KBART file was 13 minutes.29 As the project progressed, staff following the procedures confirmed that some OCN in the EEBO KBART file were incorrect. Most often, the "errors" stemmed from the attribution of TCP or German language of cataloging record OCN to the EEBO version resources. These TCP and German language of cataloging records correctly corresponded to matching EEBO version resources; however, TCP version records refer to XML/SGML encoded text editions, and OCLC attempts to prefer English language of cataloging records over others in its knowledge base.30
Other OCN errors seriously detract from the value of the WCKB's EEBO file. For example, WorldCat record number 606541404 describes the "fourth edition very much enlarged" of "A Most exact catalogue of the Lords spirituall and temporall, as peers of the realme, in the higher House of Parliament, according to their dignities, offices, and degrees: some other called thither for their assistance, & officers of their attendances …" yet this OCN in the WorldCat Knowledge Base's EEBO KBART file links to an EEBO record describing the "third edition much enlarged." See Figure 8, which illustrates the WorldCat UMD record linking to an EEBO resource record describing the "third edition much enlarged." Note that the OCLC record (as seen in the Connexion client view of the record in Figure 9) is cited by STC (2nd ed.) 7746.3 while the EEBO version record linked to is cited by STC (2nd ed.) 7746.2. To make matters worse, the author determined that the image associated with the EEBO catalog record cited by STC 7746.2 and displayed at the site corresponded to neither the resource cited as STC 7746.2 nor that cited as STC 7746.3. These were both printed in 1628, but the image provided at the EEBO site was of a resource printed in 1640 (see Figure 10).
Figure 8. WorldCat UMD record OCN 606541404 linking to the wrong version of a resource in EEBO.
Figure 9. Connexion client view of OCN 606541404
Figure 10. Digital image linked to from EEBO record describing the "third edition much enlarged" of a resource printed in 1628. http://gateway.proquest.com/openurl?ctx_ver=Z39.88-2003&res_id=xri:eebo&rft_id=xri:eebo:image:23639
Further investigation identified OCN in the KBART file misattributed to EEBO version records describing copies of editions filmed at locations other than those noted in the corresponding OCLC records. For example, the EEBO resource, "By the King. A proclamation for the adiournement of part of Trinitie terme," identified in the WCKB as associated with OCN 71492075, links the scholar to a resource described by the EEBO version record as the copy filmed at the British Library. OCLC record 71492075, however, indicates that the copy it describes was the copy filmed at the Henry E. Huntington Library and Art Gallery. See Figures 11-13.
Figure 11. The WCKB associates OCN 71492075 with the EEBO resource, "By the King. A proclamation for the adiournement of part of Trinitie terme," described by the EEBO website as the copy filmed at the British Library.
Figure 12. The EEBO resource record linked from OCN 71492075 by the OCLC EEBO KBART file indicates the copy filmed was held by the British Library.
Figure 13. OCN 71492075 indicates it describes a copy of the resource, "By the King : a proclamation for the adiournement of part of Trinitie terme," filmed at the Henry E. Huntington Library and Art Gallery.
Evaluation
The UM Libraries' EEBO project procedures revealed that the match points for equivalent microfilm and e-version records were the names of the institutions holding the filmed copies and the STC citations to the resources.31 STC citations are carried in the MARC 510 fields of the bibliographic records in two subfields: (1) in subfield "a," the names of citing works, given in a brief form, e.g., "STC" to represent Pollard and Redgrave's Short-title catalogue; and (2) in subfield "c," the location (e.g., page number or volume) within the citing works, e.g., "8626."32 Figure 14 displays a Connexion client view of OCN 33150534, cited as STC 9170, and Figure 15 shows the same record in the WorldCat display view. Unfortunately, the MARC 510 fields are neither indexed by OCLC nor displayed in WorldCat.33 OCLC could enable the identification and collocation of records for equivalent print, microfilm, and electronic versions by indexing the MARC 510 fields and subfields.34
Figure 14. Microfilm version record OCN 33150534, cited as STC 9170.
Figure 15. WorldCat.org view of OCN 33150534, STC 9170 (http://www.worldcat.org/oclc/33150534). The underlying MARC 510 field metadata is not displayed.
Investigation by the author revealed that TCP version records supply these metadata elements in duplicate in two different MARC fields: one a free-text note field, the other a number/code field, the 024. The 024 field is defined to carry a "standard number or code published on an item which cannot be accommodated in another field (e.g., field 020 (International Standard Book Number)."35 It should be noted that use of the 024 field to carry a number that is not published on the item is not in accordance with the field's definition. The TCP records use the 024 field with a first indicator value "8," conveying that the number is an unspecified type of standard number or code.36 Subfield "a" of the 024 field, which carries the STC numbers in the TCP version records, is indexed by OCLC. In the TCP version records, however, these elements are ensconced within strings of text, e.g., "(stc) STC (2nd ed.) 9170."37 A search on standard number, "9170," in WorldCat will therefore fail to retrieve the appropriate record. See Figure 16 for an example of a TCP version record of a resource cited as STC 9170.
With respect to the MARC field definitions, should there be a need to retrieve bibliographic records representing TCP versions of resources via STC citations, these numbers should be entered in "a" subfields, and the brief abbreviated names of the citing source, e.g., "STC (2nd ed.)," "Wing," etc., in the "2" subfield, which is defined to carry the "Source of number or code."38 Should OCLC choose to index the MARC 510 fields as described above, the Text Creation Partnership records would be missed.
Figure 16. Text Creation Partnership version OCN 832931179, STC 9170
Indexing of the MARC 510 fields/subfields by OCLC, combined with use of other MARC field/subfield values, such as language of cataloging, to limit results to desired OCN, could support elimination of EEBO KBART file OCN errors and identification of thousands of new OCN for addition to this and perhaps other similar files.39 As a point of reference, according to OCLC's "MARC Usage in WorldCat" webpages, as of January 1, 2016, there were 6,382,317 instances of MARC 510 "a" subfields and 4,082,280 instances of the "c" subfields.40 It should be noted, however, that there are five first indicator values available for use in MARC 510 fields and only one of them is used to convey the information that the location in the source data is given in the field. Also worth noting, 024 data at the "MARC Usage in WorldCat" webpages shows that there were 4,633,776 occurrences of subfield "2" of the 024 field, and 43,711,819 occurrences of subfield "a."41
510 field indexing to support identification of OCN for addition to the EEBO KBART file may require the participation of the content provider, ProQuest. The 510 field elements are indexed in its Early English Books Online collection. ProQuest could add these data to its EEBO KBART file in support of OCN matching. The KBART Recommended Practice allows content providers "to include any extra data fields after the last KBART utilized position."42 Finally, it should be noted that reconciliation of errors in the WCKB EEBO file pertaining to the locations of the filmed copies as noted in OCLC records but found to be different at the EEBO site would require more complex steps than 510 field matching. Furthermore, catalogers working on the EEBO project were not instructed to check the images at the EEBO website but only to confirm the STC citation match points in the EEBO version records. A closer examination of EEBO in light of this paper's finding of an EEBO record linked to a resource printed 12 years later is an area calling for further study. In respect of the needs of scholars, as eloquently described by Gadd (2009), the accuracy of the WorldCat Knowledge Base OCN must improve if WorldCat Discovery is to provide reliable access.
MARC 510 Elements: Opportunities for Linked Data Applications?
OCLC is actively engaged in research and collaboration with the greater library community to transition its metadata to linked data; however, MARC 510 metadata is lacking from its linked data record display views (compare the Connexion client view of a record in Figure 14 with the WorldCat linked data display view in Figure 17).43, 44 On the other hand, in its work to transfer its English Short Title Catalog, a "MARC based … vendor-supplied ILS," to "ESTC21," a "native linked data resource," it appears the British Library combines the MARC 510 subfield values, e.g., "Bristol, B7384," into a single resource property value (Figures 18 and 19).45, 46 "Bristol, B7384" represents entry number 7384 in Roger P. Bristol's Supplement to Charles Evans' American bibliography (see Figure 20, WorldCat OCLC record number 88701).47 As presented in Figure 19 (Stahmer, 2014), "Bristol, B7384" may be comprehensible to a well-versed scholar, librarian, or archivist, but not to a computer.
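What a finer-grained, machine-actionable representation might look like can be sketched with Python's rdflib. Everything in the ex: namespace below, the property names, and the item URI are hypothetical stand-ins, since no established vocabulary for these citation relationships is assumed here; only the citing work's OCN, 88701, comes from Figure 20.
```python
from rdflib import Graph, Literal, Namespace, URIRef
from rdflib.namespace import RDFS

EX = Namespace("http://example.org/vocab/")  # hypothetical vocabulary

g = Graph()
g.bind("ex", EX)

item = URIRef("http://example.org/resource/some-estc-item")  # stand-in for the cited resource
bristol = URIRef("http://www.worldcat.org/oclc/88701")       # the citing work itself (Figure 20)
citation = URIRef("http://example.org/citation/1")           # one 510 field, reified as its own node

g.add((bristol, RDFS.label, Literal("Bristol, Supplement to Charles Evans' American Bibliography")))

# "Bristol, B7384" split at the finest granularity: the citing work becomes
# a resolvable identifier, and the location within it a distinct value.
g.add((item, EX.hasCitation, citation))
g.add((citation, EX.citingWork, bristol))
g.add((citation, EX.locator, Literal("B7384")))

print(g.serialize(format="turtle"))
```
Reifying each citation as its own node keeps multiple 510 fields on one record distinct, so a query can collocate every record whose citation shares the same citing work and locator, which is precisely the collocation of equivalent versions argued for above.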
Hillmann, Dunsire, and Phipps (2013) posited that "it would be useful if all managers of schemas and other standards were to develop element sets and value vocabulary representations that match the source semantics at the finest granularity and make them available along with maps of the internal ontologies."48 Could a Semantic Web implementation of MARC 510 metadata at the finest granularity, with resource identifiers representing citing works such as "Bristol" and with property values such as "7384" representing locations within citing works, offer benefits to scholarship? It has been demonstrated in this paper that the consistent match points across bibliographic records representing equivalent versions of these resources have been the metadata contained in MARC 510 fields. Ultimately, a linked data implementation of the MARC bibliographic 510 field should lead the scholar to every known print copy comprising every edition, according to Gadd's definition of an edition, above, and to the institutional holdings of equivalent microform, digitized images, or digitized full-text versions, giving the scholar the path to the resources of interest.49 OCLC, the British Library, members of the TCP, and other stakeholders may want to consider further exploration of use case scenarios to determine or rule out additional benefits of transforming MARC 510 field metadata to linked data.
Figure 17. Linked data view of OCLC #33150534, http://www.worldcat.org/oclc/33150534
Figure 18. MARC 510 field data in ESTC
Figure 19. MARC 510 metadata in structured data view in ESTC21
Figure 20. Print version of OCN 88701, Supplement to Charles Evans' American bibliography by Roger P. Bristol, http://www.worldcat.org/oclc/88701.
CONCLUSION
At the current pace, given available staffing and the number of EEBO resources lacking OCN, the Libraries' Metadata Services Department staff will need years to reach the goal of adding OCN to the OCLC EEBO KBART file, however well spent that time and effort may be. A collective effort in this endeavor by the WCKB community of users is welcomed by this author.50 A combined effort by OCLC and ProQuest to improve discovery and link resolution services for these valuable scholarly resources could dramatically increase their discoverability, allowing MSD staff to spend more time creating and enhancing the metadata that will lead researchers to the uncatalogued EEBO resources they seek. As to the transition of MARC 510 field metadata to linked data, OCLC, the British Library, members of the TCP, and other stakeholders should consider their options before moving forward without it.
ACKNOWLEDGEMENT
The author wishes to thank Karen Coyle for reading and advising on earlier versions of this paper; Becky Culbertson, Nathan Putnam, and Patricia Herron for supporting the project; and Joshua Westgard for converting the data to get the project underway. Special thanks are due to staff members of the UM Libraries, Donna King, Roselin Becker, Erica Hemsley, Yeo-Hee Koh, and Tanisha Lee, and to Freeda Brook, Luther College, for their work on the project.
REFERENCES
1. A KBART file is a file compliant with the NISO recommended practice, Knowledge Bases and Related Tools (KBART). See KBART Phase II Working Group, Knowledge Bases and Related Tools (KBART): Recommended Practice: NISO RP-9-2014 (Baltimore, MD: National Information Standards Organization (NISO), 2014), accessed March 14, 2017, http://www.niso.org/publications/rp/rp-9-2014/.
2. University of Maryland Libraries, "About," last updated July 28, 2016, http://www.lib.umd.edu/about.
3. In 2015, the Libraries implemented WorldCat Discovery, intended to be a replacement for WorldCat Local.
4. Marshall Breeding, The Future of Library Resource Discovery (Baltimore, MD: NISO, 2015), 17, accessed February 18, 2017, http://www.niso.org/apps/group_public/download.php/14487/future_library_resource_discovery.pdf.
5. KBART Phase II Working Group, Knowledge Bases and Related Tools (KBART): Recommended Practice: NISO RP-9-2014 (Baltimore, MD: NISO, 2014), accessed April 13, 2017, http://www.niso.org/publications/rp/rp-9-2014/.
6. Open Discovery Initiative Working Group, Open Discovery Initiative: Promoting Transparency in Discovery: NISO RP-19-2014 (Baltimore, MD: NISO, 2014), 13, accessed March 14, 2017, http://www.niso.org/publications/rp/rp-9-2014/.
7. University of Maryland Libraries Collection Development Council, "Meeting Notes," March 4, 2014.
8. "University of Maryland Libraries Master Space Plan," Nov. 2015, June 2016 update.
9. See Gale's web page, "The Making of the Modern World (MOMW) FAQ," at http://find.galegroup.com/mome/component/researchtools/xml/FAQ.xml, accessed February 18, 2017, for details about the collection. WorldCat Knowledge Base collections may be created by libraries and uploaded to the Knowledge Base. Details on the process are available at http://www.oclc.org/support/services/collection-manager/documentation.en.html#knowledgebase, accessed February 18, 2017.
10. ProQuest's British Periodicals collection "offers facsimile page images and searchable full text for nearly 500 British periodicals published from the 17th century through to the early 21st" and "is available in four separate collections, British Periodicals Collections I, II, III, and IV, each of which can be purchased separately." ProQuest British Periodicals product description page, http://search.proquest.com/britishperiodicals/productfulldescdetail?accountid=14696, accessed Jan. 29, 2017.
11. Details about resources available in EEBO are provided by ProQuest at its website, "EEBO: About EEBO," accessed January 29, 2017, http://eebo.chadwyck.com/marketing/about.htm.
12. Diana Kichuk, "Metamorphosis: Remediation in Early English Books Online (EEBO)," Literary and Linguistic Computing 22:3 (2007): 291-303; Shawn Martin, "EEBO, Microfilm, and Umberto Eco: Historical Lessons and Future Directions for Building Electronic Collections," Microform & Imaging Review 36:4 (2007): 159-164; Ian Gadd, "The Use and Misuse of Early English Books Online," Literature Compass 6:3 (2009): 680-692; Bonnie Mak, "Archaeology of a Digitization," Journal of the Association for Information Science and Technology 65:8 (2014): 1515-1526; Folger Shakespeare Library, "History of Early English Books Online," http://folgerpedia.folger.edu/History_of_Early_English_Books_Online, last modified on 26 August 2015.
13. A.W. Pollard and G. R. Redgrave, A short-title catalogue of books printed in England, Scotland, & Ireland and of English books printed abroad, 1475-1640, rev. ed. (London: The Bibliographical Society, 1976–1991); Donald Wing, Short-title catalogue of books printed in England, Scotland, Ireland, Wales, and British America, and of English books printed in other countries, 1641-1700, 2d ed., newly rev. and enl. (New York: Modern Language Association of America, 1972-<1994>).
14. Gadd, "The Use and Misuse of Early English Books Online," 683.
15. "About EEBO."
16. Details on the ESTC are provided by the British Library at http://www.bl.uk/reshelp/findhelprestype/catblhold/estccontent/estccontent.html, viewed March 12, 2017.
17. Gadd, "The Use and Misuse of Early English Books Online," 685-686.
18. Gadd, "The Use and Misuse of Early English Books Online," 686.
19. EEBO, "Frequently Asked Questions," accessed February 18, 2017, http://eebo.chadwyck.com/help/faqs.htm.
20. Association of Research Libraries, Microform Sets in U.S. and Canadian Libraries (Washington, D.C.: Association of Research Libraries, 1984), J-3.
21. Martin D. Joachim, "Cooperative Cataloging of Microform Sets," in Cooperative Cataloging: Past, Present, and Future (New York: The Haworth Press, 1993), 111.
22. Gadd, "The Use and Misuse of Early English Books Online," 686.
23. British Library, "Catalogs of British Library Holdings: English Short Title Catalogue - content," accessed February 18, 2017, http://www.bl.uk/reshelp/findhelprestype/catblhold/estccontent/estccontent.html.
24. The British Library's ESTC codes for filmed copy locations are difficult to translate. See Meaghan J. Brown's finding aid, "STC Location Code Transcription," wherein she offers details on STC and ESTC location codes and the problem her finding aid addresses. Brown explains, "… it is currently possible to search the ESTC for items using MARC codes, but not the location codes familiar from the STC," accessed February 18, 2017, http://www.meaghan-brown.com/stc-location-codes/.
25. Text Creation Partnership, accessed January 25, 2017, http://www.textcreationpartnership.org/home/.
26. Text Creation Partnership, accessed January 25, 2017, http://www.textcreationpartnership.org/catalog-records/.
27. OCLC's form is available at https://www.oclc.org/content/dam/support/knowledge-base/ocn_report.xlsx, accessed October 18, 2016.
28. See Appendix 1 for the procedures.
29. With streamlined KBART search features introduced by a Metadata Services Department colleague, it is expected this time may be reduced moving forward.
30. A June 9, 2015 email from an OCLC staff member to the KB-L@oclc.org listserv reported on OCLC's efforts to match OCN in its KBART files to English language of cataloging records, when available.
31. UM Libraries' staff use this metadata in the equivalent OCLC microfilm and e-version and EEBO resource records as match points. Staff do not verify that the images linked to the EEBO version records correspond to those in the aforementioned bibliographic records. It is hoped that ProQuest will investigate the case described in this paper in which the EEBO resource differs from its corresponding record.
32. "510 Citation/Reference Note," OCLC, Bibliographic Formats and Standards, 4th edition, last revised August 22, 2016, https://www.oclc.org/bibformats/en/5xx/510.html.
33. As of January 29, 2017, the MARC 510 field has not been indexed by OCLC. See http://www.oclc.org/support/help/SearchingWorldCatIndexes/#05_FieldsAndSubfields/5xx_fields.htm.
34. E.g., OCLC indexes "internet resources" using a combination of MARC data elements. These are laid out in "Searching WorldCat Indexes" at http://www.oclc.org/support/help/SearchingWorldCatIndexes/#06_Format_Document_Type_Codes/Format_Document_type_codes.htm. MARC 21 Bibliographic at https://www.loc.gov/marc/bibliographic/bdleader.html provides the Leader position 06 code for "Language material." MARC Code List for Languages (http://www.loc.gov/marc/languages/) contains the language codes contained in the language of cataloging field/subfield (MARC 040 field, subfield "b").
35. "024 Other Standard Identifier," in OCLC, Bibliographic Formats and Standards, 4th edition, accessed January 25, 2017, https://www.oclc.org/bibformats/en/0xx/024.html.
36. Ibid.
37. OCLC, Searching WorldCat Indexes, accessed February 18, 2017, http://www.oclc.org/support/help/SearchingWorldCatIndexes/#05_FieldsAndSubfields/0xx_fields.htm%3FTocPath%3DFields%2520and%2520subfields%7C_____2.
38. See "024 Other Standard Identifier," OCLC, Bibliographic Formats and Standards, 4th edition, https://www.oclc.org/bibformats/en/0xx/024.html, viewed January 25, 2017.
39. An Oct. 18, 2016 review of OCLC's all-collections-list, available at https://www.oclc.org/content/dam/support/knowledge-base/all-collections-list.xlsx, indicates that 38.5 percent of the 129,498 resources on the EEBO KBART file have OCLC number coverage.
40. http://experimental.worldcat.org/marcusage/510.html.
41. http://experimental.worldcat.org/marcusage/024.html.
42. KBART Phase II Working Group, Knowledge Bases and Related Tools (KBART): Recommended Practice: NISO RP-9-2014 (Baltimore, MD: NISO, 2014), 18, http://www.niso.org/workrooms/kbart.
43. https://www.oclc.org/worldcat/data-strategy.en.html, viewed Jan. 26, 2017.
44. The image of the linked data view of Figure 14 was captured on February 18, 2017.
45. Carl Stahmer, "Making MARC Agnostic: Transforming the English Short Title Catalogue for the Linked Data Universe," in Linked Data for Cultural Heritage (Chicago: ALA Editions), 23-25.
46. The assertion about the ESTC transformation of MARC 510 field metadata is based solely on Carl Stahmer, "The ESTC as a 21st Century Research Tool," presentation given at the 2014 conference of the Text Encoding Initiative, viewed February 19, 2017, https://figshare.com/articles/ESTC21_at_TEI_2014/1558057.
47. Roger P. Bristol, Supplement to Charles Evans' American Bibliography (Charlottesville: University Press of Virginia, 1970).
48. Diane Hillmann, Gordon Dunsire, and Jon Phipps, "Maps and Gaps: Strategies for Vocabulary Design and Development," in DCMI International Conference on Dublin Core and Metadata Applications, 2013: 88, accessed February 18, 2017, http://dcpapers.dublincore.org/pubs/article/view/3673/1896.
49. See Reference 14 above.
50. A discussion and invitation to collaborate on this work took place in late 2016 on the OCLC WorldCat KB listserv (see http://listserv.oclc.org/scripts/wa.exe?SUBED1=kb-l&A=1). To date, the Preus Library, Luther College, will be working with the Libraries on this project.
9987 ---- It is Our Flagship: Surveying the Landscape of Digital Interactive Displays in Learning Environments
Lydia Zvyagintseva
INFORMATION TECHNOLOGY AND LIBRARIES | JUNE 2018
Lydia Zvyagintseva (lzvyagintseva@epl.ca) is the Digital Exhibits Librarian at the Edmonton Public Library in Edmonton, Alberta.
ABSTRACT
This paper presents the findings of an environmental scan conducted as part of a Digital Exhibits Intern Librarian Project at the Edmonton Public Library in 2016. As part of the Library's 2016–2018 Business Plan objective to define the vision for a digital exhibits service, this research project aimed to understand the current landscape of digital displays in learning institutions globally. The resulting study consisted of 39 structured interviews with libraries, museums, galleries, schools, and creative design studios. The environmental scan explored the technical infrastructure of digital displays, their user groups, various uses for the technologies within organizational contexts, the content sources, scheduling models, and resourcing needs for this emergent service. Broader themes surrounding challenges and successes were also included in the study. Despite the variety of approaches taken among learning institutions in supporting digital displays, the majority of organizations have expressed a high degree of satisfaction with these technologies.
INTRODUCTION
In 2020, the Stanley A. Milner Library, the central branch of the Edmonton (Alberta) Public Library (EPL), will reopen after extensive renovations to both the interior and exterior of the building. As part of the interior renovations, EPL will have installed a large digital interactive display wall modeled after The Cube at Queensland University of Technology (QUT) in Brisbane, Australia. To prepare for the launch of this new technology service, EPL hired a digital exhibits intern librarian in 2016, whose role consisted of conducting research to inform the library in defining the vision for a digital display wall serving as a shared community platform for all manner of digitally accessible and interactive exhibits. As a result, the author carried out an environmental scan and a literature review related to digital displays, as well as their consequent service contexts. For the purposes of this paper, "digital displays" refers to the technology and hardware used to showcase information, whereas "digital exhibits" refers to content and software used on those displays. Wherever the service of running, managing, or using this technology is discussed, it is framed as "digital display service" and concerns both technical and organizational aspects of using this technology in a learning institution.
METHOD
The data were collected between May 30 and August 20, 2016. A series of structured interviews were conducted by Skype, phone, and email. The study population was built by searching Google and Google News for keywords such as "digital interactive AND library," "interactive display," "public display," or "visualization wall" to identify organizations that have installed digital displays. A list of the study population was expanded by reviewing websites of creative studios specializing in interactive experiences and through a snowball effect once the interviews had begun.
A small number of vendors, consisting primarily of creative agencies specializing in digital interactive services, were also included in the study population. Participants were then recruited by email. The goal of this project was to gain a broad understanding of the emergent technology, content, and service-model landscape related to digital displays. As a result, structured interviews were deemed the most appropriate method of data collection because of their capacity to generate a large amount of qualitative and quantitative data. In total, 39 interviews were conducted. The interview questions are included in appendix A, and a complete list of the study population can be found in appendix B. Predominantly, organizations from Canada, the United States, Australia, and New Zealand are represented in this study.

LITERATURE REVIEW

Definitions

• Public displays, a term used in the literature to refer to a particular type of digital display, can refer to “small or large sized screens that are placed indoor . . . or outdoor for public viewing and usage” and which may be interactive “to support information browsing and searching activities.”1 In public displays, a large proportion of users are passers-by and thus first-time users.2 In academic environments, these technologies may be referred to as “video walls” and have been characterized as display technologies with little interactivity and input from users, often located in high-traffic, public areas with content prepared ahead of time and scheduled for display according to particular priorities.3

• Semi-public displays, on the other hand, can be understood as systems intended to be used by “members of a small, co-located group within a confined physical space, and not general passers-by.”4 In academic environments, they have been referred to as “visualization spaces” or “visualization studios,” and can be defined as workspaces with real-time content displayed for analysis or interpretation, often placed in libraries or research departments.5 For the purposes of this paper, “digital displays” refers to both public and semi-public displays, as organizations interviewed as part of this study had both types of displays, occasionally simultaneously.

• Honeypot effect describes how people interacting with an information system, such as a public display, stimulate other users to observe, approach, and engage in interaction with that system.6 This phenomenon extends beyond digital displays to tourism, art, or retail environments, where a site of interest attracts the attention of passers-by and draws them to participate in that site.

Interactivity

The area of interactivity with public displays has been studied by many researchers, with three commonly used modes of interaction clearly identified: touch, gesture, and remote.

• Touch (or multi-touch): This is the most common way users interact with personal mobile devices such as smartphones and tablets. Multi-touch interaction on public displays should support many individuals interacting with the digital screen simultaneously, since many users expect immediate access and will not take turns. For example, some technologies studied in this report support up to 30 touch points at any given time, while others, like QUT’s The Cube, allow for a near infinite number of touch points.
Though studies show that this technique is fast and natural, it also requires additional physical effort from the user.7 While touch interaction using infrared sensors has a high touch-recognition rate, its shortcomings are its expense and its susceptibility to light interference around the touch screen.8 (A minimal sketch of this kind of simultaneous pointer tracking appears after this list.)

• Gesture: This is interaction through movement of the user’s hands, arms, or entire body, recognized by sensors such as the Microsoft Kinect or Leap Motion systems. Although studies show that this type of interaction is quick and intuitive, it also brings “a cognitive load to the users together with the increased concern of performing gestures in public spaces.”9 Specifically, body gestures were found not to be well suited to passing-by interaction, unlike hand gestures, which can be performed while walking and have an acceptable mental, physical, and temporal workload.10 Research into gesture-based interaction shows that “more movement can negatively influence recall” and is therefore not suited for informational exhibits.11 Similarly, people consider gestures to be too much work “when they require two hands and large movements” to execute.12 Not surprisingly, research suggests that the gestures deemed socially acceptable for public spaces, and the ones most likely to be adopted by users, are small, unobtrusive ones that mimic everyday actions.

• Remote: These are interactions using another device, such as mobile phones, tablets, virtual-reality headsets, game controllers, and other special devices. Connection protocols may include Bluetooth, SMS messaging, near-field communication, radio-frequency identification, wireless-network connectivity, and other methods. Mobile-based interaction with public displays has received a lot of attention in research, media, and commercial environments because this mode allows users to interact from a variable distance with minimal physical effort. However, users often find mobile interaction with a public display “too technical and inconvenient” because it requires sophisticated levels of digital literacy in addition to access to a suitable device.13 Some suggest that using personal devices for input also helps “avoid occlusion and offers interaction at a distance” without requiring multi-touch or gesture-based interactions.14 As well, subjects in studies on mobile interaction often indicate a preference for this mode because of its low mental effort and low physical demand. However, it is possible that these studies focused on users with high degrees of digital literacy rather than the general public, who have varying degrees of access to and comfort with mobile technologies.
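The multi-touch behavior described in the first bullet above, where many visitors place fingers on the screen at once rather than taking turns, maps naturally onto the browser’s standard Pointer Events API when exhibit content is web-based. The following is only a minimal illustrative sketch, not a description of any surveyed system; the element id and variable names are invented.

```typescript
// Minimal sketch: tracking simultaneous touch points on a web-based
// exhibit surface with the standard Pointer Events API.
// The "exhibit-surface" element id is invented for illustration.
const surface = document.getElementById("exhibit-surface")!;
const activeTouches = new Map<number, { x: number; y: number }>();

surface.addEventListener("pointerdown", (e: PointerEvent) => {
  // Each finger receives its own pointerId, so several visitors
  // can interact at the same time without taking turns.
  activeTouches.set(e.pointerId, { x: e.clientX, y: e.clientY });
});

surface.addEventListener("pointermove", (e: PointerEvent) => {
  if (activeTouches.has(e.pointerId)) {
    activeTouches.set(e.pointerId, { x: e.clientX, y: e.clientY });
  }
});

const release = (e: PointerEvent) => activeTouches.delete(e.pointerId);
surface.addEventListener("pointerup", release);
surface.addEventListener("pointercancel", release);
```

In practice the size of the map is capped by the hardware: an infrared frame that reports 30 touch points will never produce more than 30 concurrent entries.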
User Engagement

Attracting user attention is not guaranteed simply by virtue of having a public display. According to research, the most significant factors influencing user engagement with public digital displays are age, display content, and social context.

Age

Hinrichs found that children were the first to engage in interaction with public displays and would often recruit the adults accompanying them toward the installation.15 Adults, on the other hand, were more hesitant in approaching the installation: “they would often look at it from a distance before deciding to explore it further.”16 These findings suggest that designing for children first is an effective strategy for enticing interaction from users of all ages.

Display Content

Studies on engagement in public digital display environments indicate that both passive and active types of engagement exist with digital displays, and the role of emotion in the content displayed cannot be overlooked. Specifically, Clinch et al. state that people typically pay attention to displays “only when they expected the content to be of interest to them” and that they are “more likely to expect interesting content in a university context rather than within commercial premises.”17 In other words, the context in which the display is situated affects user expectations and primes them for interaction.

The dominant communication pattern in existing display and signage systems has been narrowcast, a model in which displays are essentially seen as distribution points for centrally created content without much consideration for users. This model of messaging exists in commercial spaces, such as malls, but also in public areas like transit centers, university campuses, and other spaces where crowds of people may gather or pass by. Observational studies indicate that people tend to perceive this type of content as not relevant to them and ignore it.18 For public displays to be engaging to end users, in other words, “there needs to be some kind of reciprocal interaction.”19 In public spaces, interactive displays may be more successful than non-interactive displays in engaging viewers and making city centers livelier and more attractive.20

In terms of precise measures of attention to such displays, studies of average attention time correlate age with responsiveness to digital signage: children (1–14 years) are more receptive than adults, and men spend more time observing digital signage than women.21 Studies also indicate significantly higher average attention times for dynamic content than for static content.22 Scholars like Buerger suggest that designers of applications for public digital displays should assume that viewers are not willing “to spend more than a few seconds to determine whether a display is of interest.”23 Instead, they recommend presenting informational content with minimal text and in such a way that the most important information can be determined in two to three seconds. In a museum context, the average interaction time with the digital display was between two and five minutes, which was also the average time people spent exploring analog exhibits.24 The dynamic, game-like exhibits at The Cube incorporate all of the above findings: they make interaction interesting and short, and they draw the attention of children first.

Social Context

Social context is another aspect that has been studied extensively in the field of human-computer interaction, and it provides many valuable lessons for applying evidence-based practices to technology service planning in libraries. Many scholars have observed the honeypot effect in interaction with digital displays in public settings. This effect describes how users who are actively engaged with the display perform two important functions: they entice passers-by to become actively engaged users themselves, and they demonstrate how to interact with the technology without formal instruction.
Many argue that a conducive social context can “overcome a poor physical space, but an inappropriate social context can inhibit interaction” even in physical spaces where engagement with the technology is encouraged.25 This finding extends to the use of gestures on public displays: researchers found that contextual social factors, such as age and being around others in a public setting, do in fact influence the choice of multi-touch gestures. Hinrichs suggests enabling a variety of gestures for each action—accommodating different hand postures and a large number of touch points, for example—to support fluid gesture sequences and social interactions.26 A major deterrent to users’ interaction with large public displays is the potential for social embarrassment.27 The authors of that study therefore suggest positioning the display along thoroughfares of traffic and improving how the interaction principles of the display are communicated implicitly to bystanders, thus continually instructing new users on techniques of interaction.28

FINDINGS

Technical and Hardware Landscape

The average age of the public displays studied was around three years, indicating an early stage of development for this type of service among learning institutions. Such technologies first appeared in Europe more than 10 years ago (the most widely cited early example of a public display is the CityWall in Helsinki in 2007),29 but adoption in North America did not start until around 2013. The median year of installation among the organizations studied in this report is 2014. Among the public institutions represented in the study population, such as public libraries and museums, digital displays were most frequently installed in 2015.

While most organizations have only one display space, it was not unusual to find several within a single organization. For the purposes of this study, the researcher counted The Cube as three display spaces, as documentation and promotional literature on the technology cites “3 separate display zones.” As a result, the average number of display spaces in the population of this study is 1.75.

The following modes of interaction beyond displaying video content were observed in the study population, in descending order of frequency:

• Sound (79%). While research on human-computer interaction is inconclusive about best practices for incorporating sound into digital interactive displays, it is clear among the organizations interviewed in the environmental scan that sound is a major component of digital exhibits and should not be overlooked.

• Touch or multi-touch (46%). This finding highlights that screens capable of supporting multi-user interaction are not consistently present across the study population.

• Gesture (25%). These include tools such as Microsoft Kinect, Leap Motion, or other systems that detect movement for interaction.

• Mobile (14%). While some researchers in the human-computer interaction field suggest mobile is the most effective way to bridge the divide between large public displays, personalization of content, and user engagement, mobile interactivity is not frequently used to engage with digital displays in the study population.

One outlier is North Carolina State University Library, which takes a holistic, “massively responsive design” approach in which responsive web design principles are applied to content that can be displayed effectively at once online, on digital display walls, and on mobile devices, while optimizing the institutional resources dedicated to supporting visualization services.
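As a rough illustration of that idea (a generic sketch under assumed breakpoint values, not NCSU’s actual implementation), the same web content can classify its viewport and let presentation logic branch on the result:

```typescript
// Sketch: one content source, three presentation densities.
// Breakpoint values are invented for illustration.
type DisplayClass = "mobile" | "desktop" | "videowall";

function classifyViewport(widthPx: number): DisplayClass {
  if (widthPx >= 7680) return "videowall"; // e.g., a multi-panel wall
  if (widthPx >= 1024) return "desktop";
  return "mobile";
}

function applyLayout(): void {
  const mode = classifyViewport(window.innerWidth);
  document.body.dataset.display = mode; // stylesheet keys off this attribute
}

window.addEventListener("resize", applyLayout);
applyLayout();
```

The appeal of this pattern is economy: one authored exhibit can serve the website, the wall, and a phone, rather than requiring three separately maintained versions.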
Further, as in the broader personal computing environment, the Microsoft Windows operating system dominates display systems, with 61% of the organizations choosing a Windows machine to power their digital display. A fifth (21%) of all organizations have some form of networked computing infrastructure, such as The Cube with its capacity to process exhibit content using 30 servers; the majority (79%) of organizations interviewed instead have a single computer powering the display. This finding is perhaps not surprising, given that few institutions have dedicated IT teams to support a single technology service like The Cube.

Users and Use Cases

Understanding primary audiences was also important for this study, as the organizational user base defines the context for digital exhibits. The breakdown of these audiences is summarized in figure 1. For example, the University of Oregon Ford Alumni Center’s digital interactive display focuses primarily on showcasing the success of its alumni, with a goal of recruiting new students to the university. However, the interactive exhibits also serve the general public through tours and events on the University of Oregon campus. Other organizations with digital displays, such as All Saints Anglican School and the Philadelphia Museum of Art, also target specific audiences, so planning for exhibits may be easier in those contexts than in organizations like the University of Waterloo Stratford Campus, whose display wall at the downtown campus receives visitor traffic from students, faculty, and the public.

Figure 1. Audience types for digital displays in the study population (academic 44%; public 33%; both public and academic 22%).

Digital displays serve various purposes, which depend on the context of the organization in which they exist, their technical functionality, their primary audience, their service design, and other factors. Interview participants were asked about the various uses for these technologies at their institutions. A single display could have multiple functions within a single institution. The following list summarizes these uses:

1. Educational (67%), such as displaying digital collections, archives, historical maps, and other informational content. These activities can be summarized in the words of one participant as “education via browse”—in other words, self-guided discovery rather than formal instruction.

2. Fun or entertainment (56%), including art exhibitions, film screenings, games, playful exhibits, and other engaging content to entice users.

3. Communication (47%), which can be considered a form of digital signage to promote library or institutional services and marketing content. Displays can also deliver presentations and communicate scholarly work.

4. Teaching (42%), including formal and semi-formal instruction, workshops, student presentations, and student course-work showcases.
5. Events (31%), such as public tours, conferences, guest speakers, special events, galas, and other social activities near or using the display.

6. Community engagement (28%), including participation from community members through content contribution, showing local content, using the display technology as an outreach tool, and other strategies to build relationships with user communities.

7. Research (22%), where the display functions as a tool that facilitates scholarly activities like data collection, analysis, and peer review. Many study participants acknowledged challenges in using digital displays for this purpose and identified other services that might support this use more effectively.

Content Types and Management

In the words of Deakin University librarians, “Content is critical, but the message is king,” so it was particularly important for the author to understand the current digital display landscape as it relates to content.30 Specifically, the research project examined the variety of content used on digital displays as well as how that content is created, managed, shared, and received by the audiences of the organizations interviewed in this study. As can be observed in figure 2, all organizations supported 2D content, such as images, video, audio, presentation slides, and other visual and textual material. However, dynamic forms of content, such as social media feeds, interactive maps, and websites, were less prevalent.

Figure 2. Types of content supported by digital displays in the study population (static 2D 100%; dynamic web 61%; dynamic 3D 57%).

Interest in emergent, immersive, and dynamic 3D content such as games and virtual and augmented reality also came up frequently in the study interviews, and the researcher found that these types of content were supported in only 16 (57%) of the 28 total cases. This number is lower than the total number of interviewees because not all organizations interviewed had content to manage or display. In addition, many organizations recognized that they would likely be exploring ways to present 3D games or immersive environments through their digital display in the near future. Not surprisingly, the creative agencies included in this study revealed an awareness and active development of content of this nature, noting “rising demand and interest in 3D and game-like environments.” Furthermore, projects involving motion detection, the Internet of Things, and other sensor-based interactions are also seeing a rise in demand, according to study participants.

Figure 3. Content management systems for digital displays.

In terms of managing these various types of content, 20 (71%) of the organizations interviewed used some form of content management system (CMS), while the rest did not use any tool to manage or organize content. Of the organizations that used a CMS, 15 (75%) relied on a vendor-supplied system, such as tools by FourWinds Interactive, Visix, or NEC Live, while the remaining 5 (25%) created a custom solution without going to a vendor. This finding suggests that since the majority of content supported by organizations with digital displays is 2D, current vendor solutions for managing that content are sufficient for the study population at this point.
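A CMS that must span static 2D assets and dynamic 3D applications needs a content model that records both what an item is and how it runs. A minimal, hypothetical record might look like the following sketch; the field names are invented for illustration and are not drawn from FourWinds Interactive, Visix, NEC Live, or any other vendor system.

```typescript
// Hypothetical content record for a digital-exhibit CMS.
// Field names are illustrative, not taken from any vendor product.
interface ExhibitItem {
  id: string;
  title: string;
  kind: "image" | "video" | "audio" | "slides" | "web" | "app3d";
  source: "in-house" | "commissioned" | "user-supplied" | "partner";
  uri: string;                           // file path, URL, or executable
  durationSeconds?: number;              // for timed 2D content
  touchEnabled: boolean;                 // whether the item expects interaction
  schedule?: { start: Date; end: Date }; // display window, if any
}
```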
It is unclear how the rise in demand for dynamic, game-like content will be supported by vendors in the coming years. Table 1 reflects the distribution of approaches to managing content observed in the study population.

Table 1. Content management in study population

Content Management | Responses | %
Vendor-supplied system | 15 | 54
In-house created system | 5 | 18
No system | 5 | 18
Unknown | 3 | 10

Middleware, Automation, and Exhibit Management

Middleware can be described as the layer of software between the operating system and the applications running on the display, especially in a networked computing environment. For example, most organizations studied in the environmental scan supported a Windows environment with a range of exhibit applications, like slideshows, web browsers, and executable files such as games. Middleware can simplify and automate the process of starting up, switching between, and shutting off display applications on a set schedule (a schematic of this pattern appears at the end of this section).

As figure 4 demonstrates, the majority of the organizations in the study population (17, or 61%) did not have a middleware solution. However, this group was heterogeneous: 14 organizations (50%) did not require a middleware solution because they ran content semi-permanently or relied on user-supplied content, in which case the display functioned as a teaching tool. The remaining three organizations (11%) manually managed scheduling and switching between exhibit content; in such cases a middleware solution would be valuable, especially as the number of applications grows, but it was not present in these organizations. Comparatively, 10 organizations (36%) used a custom solution, such as a combination of Windows or Linux scripts, to manage the automation and scheduling of content on the display. One organization (3%) did not specify its approach to managing content. These findings suggest that no formalized solution for automating and managing display software currently exists among the study population.

In addition to organizing content, digital-exhibit services involve scheduling or automating content to meet user needs according to the time of day, special events, or seasonal relevance. As a result, a middleware solution supports sustainable management of displays and predictable sharing of content for end users. This environmental scan revealed that digital exhibits and interactive experiences are still in the early days of development. It is possible that new solutions for managing content at both the application and the middleware level will emerge in the coming years, but they are currently limited.

Figure 4. Middleware solutions in the study population (none 61%; custom 36%; unknown 3%).
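None of the custom middleware reported by participants is publicly documented, so the following is only a schematic of the pattern they described (scripts that start, switch between, and stop exhibit applications on a timetable), written here as a small Node/TypeScript loop. The commands, arguments, and slot times are invented.

```typescript
// Schematic middleware loop: launch whichever exhibit application is
// scheduled for the current hour, and replace it when the slot changes.
// Commands and slot times are invented for illustration.
import { spawn, ChildProcess } from "child_process";

interface Slot { startHour: number; command: string; args: string[] }

const slots: Slot[] = [
  { startHour: 9,  command: "exhibit-browser", args: ["https://example.org/collections"] },
  { startHour: 13, command: "exhibit-game",    args: ["--fullscreen"] },
  { startHour: 17, command: "exhibit-slides",  args: ["evening-playlist.json"] },
];

let running: ChildProcess | null = null;
let currentSlot: Slot | null = null;

function slotFor(hour: number): Slot {
  // Pick the latest slot whose start hour has passed;
  // overnight, wrap around to the last slot of the day.
  const started = slots.filter((s) => s.startHour <= hour);
  return started.length > 0 ? started[started.length - 1] : slots[slots.length - 1];
}

function tick(): void {
  const next = slotFor(new Date().getHours());
  if (next === currentSlot) return; // still within the same slot
  if (running) running.kill();      // stop the outgoing exhibit
  running = spawn(next.command, next.args, { stdio: "ignore" });
  currentSlot = next;
}

tick();
setInterval(tick, 60_000); // re-check the schedule every minute
```

Even a loop this small covers the three manual chores participants described: starting up, switching between, and shutting off exhibit applications on a set schedule.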
Sources of Content

To find content for their digital displays, the organizations interviewed used multiple strategies simultaneously. Table 2 brings together the findings related to this theme.

Table 2. Content sources for digital exhibits

Content Source | %
External/commissioned | 64
User-supplied | 64
Internal/in-house | 50
Collaborative with partner | 43

For example, many organizations rely on their users to generate and submit material (18, or 64%); others commission vendors to create exhibits for them (18, or 64%). In 50% of all cases, organizations also produce content for exhibits in-house. In other words, most organizations used a combination of sources to generate content for their digital displays. Only a few use a single source of content, such as the semi-permanent historical exhibit at Henrico County Public Library. Others, like the Duke Media Wall, rely entirely on their users to supply content, employing a “for students by students” model of content creation.

Additionally, only 12 (43%) of the organizations interviewed had explored or established some form of partnership for creating exhibits. Primarily, these partnerships existed with departments, centers, institutes, campus units, and/or students in academic settings, such as the computer science department, the faculty of graduate studies, and international studies. Other partnerships were with similar civic, educational, cultural, and heritage organizations, such as municipal libraries, historical societies, art galleries, museums, and nonprofits. Examples included study participants working with Ars Electronica, local symphony orchestras, Harvard Space Science, and NASA on digital exhibits. Clearly, a variety of approaches were taken in the study population to source digital exhibit content.

Content Creation Guidelines

Seven organizations (19%) in the study population publicly shared content guidelines aimed at simplifying the process of engaging users in creating exhibits. These guidelines were analyzed, and the key elements users need to know in order to contribute in a meaningful way, thereby lowering the barrier to participation, were identified. These elements include the resolution of the display screen(s), touch capability, ambient light around the display space, required file formats, and maximum file size. A complete list of organizations with such guidelines, along with the websites where these guidelines can be found, is included in appendix C. Based on the analysis of this limited sample, the bare minimum for community participation guidelines would include clearly outlining

• the scope, purpose, audience, and curatorial policy of the digital exhibits service;
• the technical specifications, such as the resolution, aspect ratio, and file formats supported by the display;
• the design guidelines, such as colors, templates, and other visual elements;
• the contact information of the digital exhibits coordinator; and
• the online or email submission form.

It should be noted, however, that such specifications are primarily useful when a CMS exists and the content solicited from users is at least somewhat standardized. For example, images, slides, or webpages may be easier for community partners to contribute than video games or 3D interactive content. No examples of guidelines for the latter were observed in the study.
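These guideline elements translate naturally into an automated intake check. The sketch below validates a hypothetical submission against display specifications; all limits shown are invented examples rather than any institution’s published values.

```typescript
// Sketch: validating a community submission against hypothetical
// display specifications. All limits here are invented examples.
interface Submission {
  fileName: string;
  fileSizeMB: number;
  widthPx: number;
  heightPx: number;
}

const spec = {
  allowedFormats: ["png", "jpg", "mp4"],    // required file formats
  maxFileSizeMB: 500,                       // maximum file size
  resolution: { width: 3840, height: 2160 } // resolution of the display
};

function validate(s: Submission): string[] {
  const problems: string[] = [];
  const ext = s.fileName.split(".").pop()?.toLowerCase() ?? "";
  if (!spec.allowedFormats.includes(ext))
    problems.push(`unsupported format: ${ext || "none"}`);
  if (s.fileSizeMB > spec.maxFileSizeMB)
    problems.push(`file too large: ${s.fileSizeMB} MB`);
  const targetRatio = spec.resolution.width / spec.resolution.height;
  if (Math.abs(s.widthPx / s.heightPx - targetRatio) > 0.01)
    problems.push("aspect ratio does not match the display");
  return problems; // an empty list means the submission passes
}
```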
Content Scheduling

Whereas the middleware section of this study examined the technical approaches to content management and automation, this section explores the frequency of exhibit rotation from a service-design perspective. As can be observed in figure 5, no consistent or dominant model for exhibit scheduling was identified in the study population. Generally, approaches to scheduling digital exhibits reflect organizational contexts. For example, museums typically design an exhibit and display it on a permanent basis, while academic institutions change displays of student work or scholarly communication once per semester. The following scheduling models emerged, in descending order of frequency in the study population.

Figure 5. Content scheduling distribution in the study population.

1. Unstructured (29%): no formal approach, policy, or expectation is identified by the organization regarding displaying exhibits. This model is largely related to the early stage of service development in this domain, lack of staff capacity to support the service, and/or responsiveness to user needs. One study participant, for example, described this loose approach by noting that “no formalized approach and no official policy exists.” Institutions may have frameworks for what types of content are acceptable but no specific requirements on content subjects. Institutions adopting a lab-space model (see figure 6) for digital displays largely belong to this category: content is created on the fly through workshops, data analysis, and other situations as needed by users, so no formal scheduling is required apart from space reservations.

2. Seasonal (29%): a period from three to six months, including semester-based scheduling in academic institutions. Many organizations operate on a quarterly basis, so it is logical that content refresh cycles reflect the broader workflow of the organization.

3. Permanent (21%): in the case of museums, permanent exhibits may mean displaying content indefinitely or until the next hardware refresh, which might reconfigure the entire interactive display service. No specific date ranges were cited for this model.

4. Monthly (10%): this pattern was observed among academic libraries, with production of “monthly playlists” featuring curated book lists or other monthly specials.

5. Weekly (7%): North Carolina State University and Deakin University Libraries aim to have fresh content up once per week; they achieve this in part by formalizing the roles needed to support their digital display and visualization services.

6. Daily (4%): only Griffith University ensures that new content is available every day on its #SeeMore display; it does this largely by relying on standardized external and internal inputs, such as weather updates and content from the university marketing department.

Staffing and Skills

One key element of the digital exhibits research project was investigating the staffing models required to support a service of this nature. Not surprisingly, the theme of resource needs for digital exhibits emerged in most interviews conducted. Several participants noted that one “can’t just throw up content and leave it,” while others advised to “have expertise on staff before tech is installed.” The data gathered show that the average full-time equivalent (FTE) needed to support digital display services in the organizations interviewed was 2.97—around three full-time staff members. In addition, 74% of the organizations studied had maintenance or support contracts with various vendors, including AV integrators, CMS specialists, creative studios that produced original content, or hardware suppliers. Hardware and AV integrators typically provided a 12-month contract for technical troubleshooting, while creative studios ensured a 3-month support contract for the digital exhibits they designed.
The average time to create an original, interactive exhibit was between 9 and 12 months, according to the data provided by creative agencies, The Cube teams, and learning organizations that have in-house teams creating exhibits regularly. This length of time varies with the complexity of the interaction designed, the depth of the exhibit “narrative,” and the modes of input supported by the exhibit application.

Additionally, it was important to understand the curatorial labor behind digital exhibits; the author did not necessarily speak with the curator of exhibits, and this work may be carried out by multiple individuals within organizations with digital displays or creative studios. In 20 (57%) of the cases, the person interviewed also curated some or all of the content for the digital display at their institution. In five (14%) of the cases, the individual interviewed was not a curator for any of the content because there was no need for curation in the first place; displays in these cases were used for analysis or teaching and therefore did not require prepared content. In the rest of the cases (10, or 29%), a creative agency vendor, another member of the team, or a community partner was responsible for the curation of exhibit content. This finding suggests that, while a significant number of organizations outsource the design and curation of exhibits, the majority retain control over this process. Therefore, dedicating resources to the curation, organization, and management of exhibit content is deemed significant by the organizations represented in the study.

In terms of the capacity to carry out digital display services, study participants identified the following skills as important to supporting work of this nature:

1. technical skills (such as the ability to troubleshoot), general interest in technology, and flexibility and willingness to learn new things (74%)
2. design, visual, and creative sensibility (40%), as this type of work is primarily a visual experience
3. software-development or programming-language knowledge (31%)
4. communication, collaboration, and relationship-building (25%)
5. project management (20%)
6. audiovisual and media skills (14%), as digital exhibits are “as much an AV experience as an IT experience,” according to one study participant
7. curatorial, organizational, and content-management skills (11%)

The most frequently mentioned dedicated roles are shown in table 3.

Table 3. Types of roles significant to digital exhibits work

Position | Responses | %
Developer/programmer | 11 | 31
Project manager | 8 | 23
Graphic designer | 6 | 17
User experience or user interface designer | 4 | 11
IT systems administrator | 4 | 11
AV or media specialist | 4 | 11

The relatively low percentages in this table suggest that the skills mentioned above are distributed among various team members, or that multiple skills are combined in a single role, as may be the case in small institutions or those without formalized services and dedicated roles. Nevertheless, the presence of specific job titles indicates an understanding of the various skill sets needed to run a service that uses digital displays.

Challenges and Successes

Study participants identified many challenges related to initiating and supporting a service that uses digital displays for learning. Clearly, multiple challenges could be associated with digital display services within a single organization.
However, interviewees also shared many successes and lessons learned, often overlapping with the identified challenges. This pattern suggests that some organizations pursue strategies that address challenges faced by their library or museum colleagues while perhaps lacking resources or capacity in other areas related to this type of service. For example, some organizations observed a lack of user engagement because of the limited interactivity of the technology solution they used; others achieved successful user engagement largely by investing in technology solutions that provide a range of modes of interaction. It is important to learn from both areas in order to anticipate possible pain points and to capitalize on successes that lead to industry recognition and engagement from library customers. Table 4 summarizes the range of challenges identified.

Table 4. Challenges related to digital display services

Challenge Identified | Responses | %
Technical | 14 | 41
Content | 11 | 33
Costs | 11 | 33
User expectations | 11 | 33
Workflow | 10 | 29
Service design | 9 | 26
Time | 8 | 24
Organizational culture | 8 | 24
User engagement | 7 | 20

As reflected in table 4, several key challenges were discussed:

1. Technical, such as troubleshooting the technology, keeping up with new technologies or upgrades, and finding software solutions appropriate for the hardware selected.

2. Content, such as coming up with original content or curating existing sources. In the words of one participant, “quality and refresh of content is key—it has to be meaningful, interesting, and new.” This clearly presents a resource requirement.

3. Costs, such as the financial commitment to the service, the unseen costs in putting exhibits together, software licensing, and hardware upgrades.

4. User expectations, such as keeping the service at its full potential and using the maximum functionality of the hardware and software solutions. According to study participants, users “may not want what they think or they say they want,” and to some extent, “such technologies are almost an expectation now, and not as exciting for users.”

5. Workflow or project-management strategies, specifically those related to emergent multimedia experiences that require new cycles of development and testing.

6. Time to plan, source, create, troubleshoot, launch, and improve exhibits.

7. Service design, such as thinking holistically about the functions of the technology within the larger organizational structure. As one study participant stated, organizations “cannot disregard the reality of the service being tied to a physical space,” in that these types of technologies are both a virtual and a physical customer experience.

8. Organizational culture and policy, in terms of adapting project-based approaches to planning and resourcing services, getting institutional support, and educating all staff about the purpose, function, and benefits of the service.

9. User engagement, particularly keeping users interested in the exhibits and continually finding new and exciting content. Various participants found that “linger time is between 30 seconds to few minutes” and that the content being displayed needs to be “something interesting, unique, and succinct, but not a destination in itself.”

Despite the clear challenges in delivering digital exhibit services, organizations that participated in this study also identified keys to success (see table 5).
Table 5. Successes and lessons learned in using digital displays

Successful Approach or Lesson Identified | Responses | %
User engagement and interactivity | 16 | 47
Service design | 14 | 41
“Wow” factor | 12 | 35
Organizational leadership | 12 | 35
Technology solution | 10 | 29
Flexibility | 10 | 29
Communication and collaboration | 10 | 29
Project management | 9 | 26
Team and skill sets | 9 | 26

As reflected in table 5, several successful approaches were discussed:

• User engagement and interactivity, particularly for those institutions that invested in highly interactive and immersive experiences; the rewards are seen in the interest and enthusiasm of their user groups.

• Service design: organizations that carefully planned the service found that the technology successfully served the needs of their user communities.

• Promotion and the “wow” factor that brought attention to the organization and the service. It is not surprising that digital displays are central points on tours for dignitaries, political figures, and external guests. Further, many commented that they “did not imagine a library could be involved in such an innovative experiment,” and others added that their digital displays have “created new conversations that did not exist before.”

• Leadership and vision at the organizational level, which secures support and resources as well as defines the scope of the service to ensure its sustainability and success: “Money is not necessarily the only barrier to doing this service, but risk taking, culture.”

• Technology solution, where “everything works” and both the organization and the users of the service are happy with the functionality, features, and performance of the chosen solution.

• Flexibility and willingness to learn new things, including being open to agile project-management methods, taking risks, and continually learning new tools, technologies, and processes as the service matures.

• Communication and collaboration, both internally among stakeholders and externally through building community partnerships, new audiences, and user participation in content creation. For example, one study participant noted that the technology “has contributed to giving the museum a new audience of primarily young people and families—a key objective held in 2010 at the commencement of the gallery refurbishments.”

• Workflow and project management, for those embracing the new approaches required to bring multiple skill sets together to create engaging new exhibits. As one participant put it, “These types of approaches require testing, improvement, a new workflow and lifecycle for the projects.”

• Having the right team with appropriate skills to support the service, though this theme was rated as less significant than designing services effectively and securing institutional support for the technology service. In other words, study participants noted that having in-house programming or design skills is not enough without a proper definition of success for digital exhibit services.

Perceptions

Institutional and user reception of digital displays as a service to pursue in learning organizations was overwhelmingly positive, with 87% of the organizations noting positive feedback. For example, one study participant described the positive attention the digital display received from the wider community, stating “it is our flagship and people are in general impressed by both the potential and some of the existing content.”
Some participants went as far as to say that the reception among users has been “through the roof” and that they have “never had a negative feedback comment” about their display. This finding indicates a high degree of satisfaction with such technologies among the organizations that pursued a digital display. Table 6 further explores the range of perceptions observed in the study.

Table 6. Perception of digital display services

Perception | Responses | %
Positive | 20 | 87
Hesitation or uncertainty | 7 | 30
Concerns about purpose | 4 | 17
Concerns about user engagement | 4 | 17
Concerns about costs | 3 | 13
Negative | 3 | 13

A minority (13%) noted some negative perceptions, largely related to concerns about the costs or functionality of the technology, and 30% observed uncertainty and hesitation on the part of staff and users, both in terms of engagement and in interrogating the technology’s purpose in the organization. One study participant summarized this mixed sentiment by saying, “The perception is that it’s really neat and worthwhile for exploring new ways of teaching, but that the same features and functions could be achieved with less (which we think is a good thing!).” It is helpful to note this trend in perception, as any new service will likely bring a mixture of excitement, hesitation, and occasional opposition. Interestingly, these reactions originated both from the staff of the organizations interviewed and from their communities of users.

DISCUSSION

The findings from this study indicate that the functions of digital displays are highly dependent on the organizational context in which the displays exist. This context, in turn, defines the nature of the services delivered through the digital display. Figure 6 classifies the various ways digital displays appear in the study population, from research- and teaching-oriented lab spaces to public spaces with passive messaging or active, immersive, game-like digital experiences.

Figure 6. Types of digital displays in the study population.

As such, visualization walls might belong in the “lab spaces” category, which typically appears in academic libraries or research units and does not require content planning and scheduling. What we might call “digital interactive exhibits” tend to appear in museums and galleries with a primarily public audience and may have a permanent, seasonal, or monthly rotation schedule. However, despite the range of approaches taken to providing content and using these technologies, many organizations share resourcing needs and challenges, such as troubleshooting the technology solution, creating engaging content, and managing the costs of interactive projects. Despite these common concerns, digital exhibit services were perceived as overwhelmingly satisfactory in all types of organizations included in this study because they brought new audiences to the organization and were often seen as “showpieces” in the broader community.

The data gathered in the environmental scan demonstrate that there is currently little consistency among digital displays in learning environments. This lack of consistency is seen in study participants’ content-development methods, programming, content management, technology solutions, and even the naming of the display (and, by extension, the display service).
For example, this study revealed that evidently no “open platform” for managing content at the application or the middleware level currently exists. A small number of software tools are used by organizations to support digital displays, but their use is in no way standardized, in contrast to nearly every other area of library services. There is some indication that digital display services may become more standardized in the coming years, with more tools, solutions, vendors, and communities of practice becoming available. For example, many signage CMSs are currently on the market, and the number of companies producing game-like immersive experiences is growing, suggesting an extension of these services to libraries in the coming years. Only a few software tools for creating exhibits exist, such as IntuiFace and TouchDesigner, and no free, open-source exhibit software is currently available. As well, the growing number of digital exhibit and interactive media companies currently focuses on turnkey solutions rather than software-as-a-service or platform solutions.

In contrast, some consistency exists in the staffing needs and skills required to support a digital exhibits service. A majority of the organizations interviewed agreed that design, software-development, systems-administration, and project-management skills are needed to ensure digital exhibit services run sustainably in a learning organization.

In addition, the lack of public library representation in this study makes it challenging to draw parallels to the library context. Adapting museum practices is also not necessarily reliable, as museums rarely have a mandate to engage communities and partner on content creation, as libraries do. For example, only the El Paso (Texas) Museum of History engages the local community to source and organize content. These findings suggest that digital displays are a growing domain, and more solutions are likely to emerge in the coming years.

The Cube, compared to the rest of the study population, is a unique service model because it successfully brings together most of the elements examined in the environmental scan. For example, to ensure continual engagement with the digital display, The Cube schedules exhibits on a regular basis and employs user interface designers, systems administrators, software engineers, and project managers. It also extends the content through community engagement, public tours, and STEM programming. It has created an in-house middleware solution to simplify exhibit delivery and has chosen Unity3D as its platform of choice for exhibit development.

LIMITATIONS

Only organizations from English-speaking countries were interviewed as part of the environmental scan. It is therefore unclear whether access to organizations from non–English-speaking countries would have produced new themes and significantly different results. In addition, as with all environmental scans, the data are limited by the degree of understanding, knowledge, and willingness to share information of the individuals interviewed. In particular, the individuals with whom the author spoke may or may not have been the technology or service leads for the digital display at their respective institutions. Thus, the study participants had a range of understanding of the hardware specifications, functionality, and service-design components associated with digital displays.
For example, having access to technology leads would likely have provided more nuanced responses about middleware solutions and the underlying technical infrastructure required to support this service.

A small number of vendors were also interviewed as part of the environmental scan, even though vendors did not necessarily have digital displays or service models parallel to those of libraries or museums. They are included in appendix B. Nevertheless, gathering data from this group was deemed relevant to the study, as creative agencies have formalized staffing models and have clearly identified the skill sets necessary to support services of this nature. In addition, this group possesses knowledge of best practices, workflows, and project-management processes related to exhibit development. Finally, this environmental scan did not capture any interaction with direct users of digital displays, whose experiences and perceptions of these technologies may or may not support the findings gathered from the organizations interviewed. These limitations were addressed by increasing the sample size of the study within the time and resource constraints of the research project.

CONCLUSION

The findings of this study show that the functions of digital display technologies and their related services are highly dependent on the organizational context in which they exist. However, despite the range of approaches taken to providing content and using these technologies, many organizations share resourcing needs and challenges, such as troubleshooting the technology solution, creating engaging content, and managing the costs of interactive projects. Despite these common concerns, digital displays were perceived overwhelmingly positively in all types of organizations interviewed in this study, as they brought new audiences to the organization and were often seen as “showpieces” in the broader community. The successes and lessons learned from the study population are meant to provide a broader perspective on this maturing domain as well as to help inform planning processes for future digital exhibits in learning organizations.

APPENDIX A. ENVIRONMENTAL SCAN QUESTIONS

Digital Exhibits Environmental Scan Interview Questions: Museums, Libraries, Public Organizations

1. What are the technical specifications of the digital interactive technology at your institution?
2. Who are the primary users of this technology (those interacting with the platform)? Is there anyone you thought would use it and isn’t?
3. What are the primary uses for the technology (events, presentations, analysis, workshops)?
4. What types of content are supported by the technology (video, images, audio, maps, text, games, 3D, all of the above)?
5. Where is content created and how is this content managed?
6. What is the schedule for the content and how is it prioritized?
7. Can you estimate the FTE (full-time equivalent) of staff members involved in supporting this technology/service, both directly and indirectly? What does indirect support for this technology entail?
8. In your experience, what kinds of skills are necessary in order to support this service?
9. Have partnerships with other organizations producing content to be exhibited been established or explored?
10. What challenges have you encountered in providing this service?
11. What have been some keys to the successes in supporting this service?
12. What has been the biggest success of this service and what has been the biggest disappointment?
13. What is the perception of this technology in the institution more broadly?
14. Are there any other institutions you suggest we contact to learn more about similar technologies?

Digital Exhibits Environmental Scan Interview Questions: Vendors

1. What is the relationship between the creative studio and hardware/fabrication? Do you do everything, or do you work with AV integrators to put together touch interactives?
2. Who have been the primary users of the interactive exhibits and projects you have completed?
3. Who writes the use cases when creating a digital interactive exhibit?
4. What types of content are supported by the technology (video, images, audio, maps, text, games, 3D, all of the above)? Do you see a rise in interest for 3D and game-like environments, and do you have internal expertise to support it?
5. Where is content created for the exhibits and how is this content managed? Who curates?
6. What timespan or lifecycle do you design for?
7. How big is your team? How long do projects typically take to create?
8. What types of expertise do you have in house? What might a project team look like?
9. To what extent is there a goal of sharing knowledge back with the company from clients or users?
10. What challenges have you encountered in providing this service?
11. What have been some keys to the successes in supporting this service?

APPENDIX B. STUDY POPULATION IN ENVIRONMENTAL SCAN

Organization | Location | Date Interviewed
All Saints Anglican School | Merrimac, Australia | July 25, 2016
Anode | Nashville, TN | July 22, 2016
Belle & Wissell | Seattle, WA | July 26, 2016
Bradman Museum | Bowral, Australia | July 10, 2016
Brown University Library | Providence, RI | June 3, 2016
University of Calgary Library and Cultural Resources | Calgary, AB | June 2, 2016
Deakin University Library | Geelong, Australia | June 14, 2016
University of Colorado Denver Library | Denver, CO | June 24, 2016
Duke University Library | Durham, NC | August 17, 2016
El Paso Museum of History | El Paso, TX | June 24, 2016
Georgia State University Library | Atlanta, GA | June 10, 2016
Gibson Group | Wellington, New Zealand | July 16, 2016
Henrico County Public Library | Henrico, VA | August 9, 2016
Ideum | Corrales, NM | July 26, 2016
Indiana University Bloomington Library | Bloomington, IN | May 31, 2016
Interactive Mechanics | Philadelphia, PA | August 2, 2016
Johns Hopkins University Library | Baltimore, MD | June 20, 2016
Nashville Public Library | Nashville, TN | July 22, 2016
North Carolina State University Library | Raleigh, NC | June 8, 2016
University of North Carolina at Chapel Hill Library | Chapel Hill, NC | June 2, 2016
University of Nebraska Omaha | Omaha, NE | June 16, 2016
Omaha Do Space | Omaha, NE | July 11, 2016
University of Oregon Alumni Center | Eugene, OR | June 7, 2016
Philadelphia Museum of Art | Philadelphia, PA | August 10, 2016
Queensland University of Technology | Brisbane, Australia | June 30, July 29, and August 16, 2016
Société des Arts Technologiques | Montreal, QC | August 8, 2016
Second Story | Portland, OR | July 28, 2016
St. Louis University | St. Louis, MO | July 4, 2016
Stanford University Library | Stanford, CA | July 22, 2016
University of Illinois at Chicago | Chicago, IL | June 22, 2016
University of Mary Washington | Fredericksburg, VA | July 7, 2016
Visibull | Waterloo, ON | August 12, 2016
University of Waterloo Stratford Campus | Stratford, ON | June 22, 2016
Yale University Center for Science and Social Science Information | New Haven, CT | July 13, 2016

APPENDIX C. DIGITAL CONTENT PUBLISHING GUIDELINES

Organization Name | Guidelines Website
Deakin University Library | http://www.deakin.edu.au/library/projects/sparking-true-imagination
Duke University | https://wiki.duke.edu/display/LMW/LMW+Home
Griffith University | https://intranet.secure.griffith.edu.au/work/digital-signage/seemore
North Carolina State University Library | http://www.lib.ncsu.edu/videowalls
University of Colorado Denver | http://library.auraria.edu/discoverywall
University of Calgary Library and Cultural Resources | http://lcr.ucalgary.ca/media-walls
University of Waterloo Stratford Campus | https://uwaterloo.ca/stratford-campus/research/christie-microtiles-wall

REFERENCES

1. Flora Salim and Usman Haque, “Urban Computing in the Wild: A Survey on Large Scale Participation and Citizen Engagement with Ubiquitous Computing, Cyber Physical Systems, and Internet of Things,” International Journal of Human-Computer Studies 81 (September 2015): 31–48, https://doi.org/10.1016/j.ijhcs.2015.03.003.

2. Peter Peltonen et al., “It’s Mine, Don’t Touch! Interactions at a Large Multi-touch Display in a City Center,” Proceedings of the SIGCHI Conference on Human Factors in Computing Systems, Florence, Italy, April 5–10, 2008, 1285–94, https://doi.org/10.1145/1357054.1357255.

3. Shawna Sadler, Mike Nutt, and Renee Reaume, “Managing Public Video Walls in Academic Libraries” (presentation, CNI Spring 2015 Meeting, Seattle, Washington, April 13–14, 2015), http://dro.deakin.edu.au/eserv/DU:30073322/sadler-managing-2015.pdf.

4. Peltonen et al., “It’s Mine, Don’t Touch!”

5. John Brosz, E. Patrick Rashleigh, and Josh Boyer, “Experiences with High Resolution Display Walls in Academic Libraries” (presentation, CNI Fall 2015 Meeting, Washington, DC, December 13–14, 2015), https://www.cni.org/wp-content/uploads/2015/12/cni_experiences_brosz.pdf; Bryan Sinclair, Jill Sexton, and Joseph Hurley, “Visualization on the Big Screen: Hands-On Immersive Environments Designed for Student and Faculty Collaboration” (presentation, CNI Spring 2015 Meeting, Seattle, Washington, April 13–14, 2015), https://scholarworks.gsu.edu/univ_lib_facpres/29/.

6. Niels Wouters et al., “Uncovering the Honeypot Effect: How Audiences Engage with Public Interactive Systems,”
DIS ’16: Proceedings of the 2016 ACM Conference on Designing Interactive Systems, Brisbane, Australia, June 4–8, 2016, 5–16, https://doi.org/10.1145/2901790.2901796.

7 Gonzalo Parra, Joris Klerkx, and Erik Duval, “Understanding Engagement with Interactive Public Displays: An Awareness Campaign in the Wild,” Proceedings of the International Symposium on Pervasive Displays, Copenhagen, Denmark, June 3–4, 2014, 180–85, https://doi.org/10.1145/2611009.2611020; Ekaterina Kurdyukova, Mohammad Obaid, and Elisabeth Andre, “Direct, Bodily or Mobile Interaction?,” Proceedings of the 11th International Conference on Mobile and Ubiquitous Multimedia, Ulm, Germany, December 4–6, 2012, https://doi.org/10.1145/2406367.2406421; Tongyan Ning et al., “No Need to Stop: Menu Techniques for Passing by Public Displays,” Proceedings of the 2011 Annual Conference on Human Factors in Computing Systems, Vancouver, British Columbia, May 7–12, 2011, https://www.gillesbailly.fr/publis/BAILLY_CHI11.pdf.

8 Jung Soo Lee et al., “A Study on Digital Signage Interaction Using Mobile Device,” International Journal of Information and Electronics Engineering 5, no. 5 (2015): 394–97, https://doi.org/10.7763/IJIEE.2015.V5.566.

9 Parra, Klerkx, and Duval, “Understanding Engagement,” 181.

10 Parra, Klerkx, and Duval, “Understanding Engagement,” 181; Robert Walter, Gilles Bailly, and Jörg Müller, “StrikeAPose: Revealing Mid-Air Gestures on Public Displays,” Proceedings of the SIGCHI Conference on Human Factors in Computing Systems, Paris, France, April 27–May 2, 2013, 841–50, https://doi.org/10.1145/2470654.2470774.

11 Philipp Panhey et al., “What People Really Remember: Understanding Cognitive Effects When Interacting with Large Displays,” Proceedings of the 2015 International Conference on Interactive Tabletops & Surfaces, Madeira, Portugal, November 15–18, 2015, 103–6, https://doi.org/10.1145/2817721.2817732.

12 Christopher Ackad et al., “An In-the-Wild Study of Learning Mid-air Gestures to Browse Hierarchical Information at a Large Interactive Public Display,” Proceedings of the 2015 ACM International Joint Conference on Pervasive and Ubiquitous Computing, Osaka, Japan, September 7–11, 2015, 1227–38, https://doi.org/10.1145/2750858.2807532.

13 Parra, Klerkx, and Duval, “Understanding Engagement,” 181; Kurdyukova, Obaid, and Andre, “Direct, Bodily or Mobile Interaction?”

14 Jouni Vepsäläinen et al., “Web-Based Public-Screen Gaming: Insights from Deployments,” IEEE Pervasive Computing 15, no. 3 (2016): 40–46, https://ieeexplore.ieee.org/document/7508836/.
15 Uta Hinrichs, Holly Schmidt, and Sheelagh Carpendale, “EMDialog: Bringing Information Visualization into the Museum,” IEEE Transactions on Visualization and Computer Graphics 14, no. 6 (November 2008): 1181–88, https://doi.org/10.1109/TVCG.2008.127.

16 Hinrichs, Schmidt, and Carpendale, “EMDialog.”

17 Sarah Clinch et al., “Reflections on the Long-term Use of an Experimental Digital Signage System,” Proceedings of the 13th International Conference on Ubiquitous Computing, Beijing, China, September 17–21, 2011, 133–42, https://doi.org/10.1145/2030112.2030132.

18 Elaine M. Huang, Anna Koster, and Jan Borchers, “Overcoming Assumptions and Uncovering Practices: When Does the Public Really Look at Public Displays?,” Proceedings of the 6th International Conference on Pervasive Computing, Sydney, Australia, May 19–22, 2008, 228–43, https://doi.org/10.1007/978-3-540-79576-6_14; Jörg Müller et al., “Looking Glass: A Field Study on Noticing Interactivity of a Shop Window,” Proceedings of the SIGCHI Conference on Human Factors in Computing Systems, Austin, Texas, May 5–10, 2012, 297–306, https://doi.org/10.1145/2207676.2207718.

19 Salim and Haque, “Urban Computing in the Wild,” 35.

20 Mettina Veenstra et al., “Should Public Displays Be Interactive? Evaluating the Impact of Interactivity on Audience Engagement,” Proceedings of the 4th International Symposium on Pervasive Displays, Saarbruecken, Germany, June 10–12, 2015, 15–21, https://doi.org/10.1145/2757710.2757727.

21 Clinch et al., “Reflections.”

22 Robert Ravnik and Franc Solina, “Audience Measurement of Digital Signage: Qualitative Study in Real-World Environment Using Computer Vision,” Interacting with Computers 25, no. 3 (2013), https://doi.org/10.1093/iwc/iws023.

23 Neal Buerger, “Types of Public Interactive Display Technologies and How to Motivate Users to Interact,” in Media Informatics Advanced Seminar on Ubiquitous Computing, ed. Doris Hausen et al. (Munich: University of Munich, Department of Computer Science, Media Informatics Group, 2011), https://pdfs.semanticscholar.org/533a/4ef7780403e8072346d574cf288e89fc442d.pdf.

24 C. G. Screven, “Information Design in Informal Settings: Museums and Other Public Spaces,” in Information Design, ed. Robert E. Jacobson (Cambridge, MA: MIT Press, 2000), 131–92.

25 Parra, Klerkx, and Duval, “Understanding Engagement,” 181.

26 Uta Hinrichs and Sheelagh Carpendale, “Gestures in the Wild: Studying Multi-touch Gesture Sequences on Interactive Tabletop Exhibits,” Proceedings of the SIGCHI Conference on Human Factors in Computing Systems, Vancouver, British Columbia, May 7–12, 2011, 3023–32, https://doi.org/10.1145/1978942.1979391.
27 Harry Brignull and Yvonne Rogers, “Enticing People to Interact with Large Public Displays in Public Spaces,” in INTERACT ’03: Proceedings of the International Conference on Human-Computer Interaction, Zurich, Switzerland, September 1–5, 2003, ed. Matthias Rauterberg, Marino Menozzi, and Janet Wesson (Tokyo: IOS Press, 2003), 17–24, http://www.idemployee.id.tue.nl/g.w.m.rauterberg/conferences/interact2003/INTERACT2003-p17.pdf.

28 Peltonen et al., “It’s Mine, Don’t Touch!”

29 Peltonen et al., “It’s Mine, Don’t Touch!”

30 Anne Horn, Bernadette Lingham, and Sue Owen, “Library Learning Spaces in the Digital Age,” Proceedings of the 35th Annual International Association of Scientific and Technological University Libraries Conference, Espoo, Finland, June 2–5, 2014, http://docs.lib.purdue.edu/iatul/2014/libraryspace/2.