aghassibake-supporting-2020 ---- Supporting Data Visualization Services in Academic Libraries

Issue Eighteen, December 10, 2020

Negeen Aghassibake, University of Washington Libraries
Justin Joque, University of Michigan Library
Matthew L. Sisk, Navari Family Center for Digital Scholarship, University of Notre Dame

Abstract

Data visualization in libraries is not a part of traditional forms of research support, but it is an emerging area that is increasingly important given the growing prominence of data in, and as a form of, scholarship. In an era of misinformation, visual and data literacy are necessary skills for the responsible consumption and production of data visualizations and the communication of research results. This article summarizes the findings of Visualizing the Future, an IMLS National Forum Grant (RE-73-18-0059-18) project to develop a literacy-based instructional and research agenda for library and information professionals, with the aim of creating a community of praxis focused on data visualization. The grant aims to create a diverse community that will advance data visualization instruction and use beyond hands-on, technology-based tutorials toward a nuanced, critical understanding of visualization as a research product and form of expression. This article will review the need for data visualization support in libraries, review environmental scans on data visualization in libraries, emphasize the need for a focus on the people involved in data visualization in libraries, discuss the components necessary to set up these services, and conclude with the literacies associated with supporting data visualization.

Introduction

Now, more than ever, accurately assessing information is crucially important to discourse, both public and academic. Universities play an important role in teaching students how to understand and generate information. But at many institutions, learning how to effectively communicate findings from the research process is considered idiosyncratic to each field or the express domain of a particular department (e.g., applied mathematics or journalism). Data visualization is the use of spatial elements and graphical properties to display and analyze information, and this practice may follow disciplinary customs. However, there are many commonalities in how we visualize information and data, and the academic library, at the heart of the university, can play a significant role in teaching these skills. In the following article, we identify a number of challenges in teaching complex technological and methodological skills like visualization, and we outline a rationale for, and a strategy to implement, these types of services in academic libraries. However, the same argument can be made for any academic support unit, whether college, library, or independently based.

Why Do We Need Data Visualization Support in Libraries?

In many ways, the argument for developing data visualization services in libraries mirrors the discussion surrounding the inclusion and extension of digital scholarship support services throughout universities. In academic settings, libraries serve as a natural hub for services that can be used by many departments and fields.
Often, data visualization expertise (like GIS or text-mining expertise) is tucked away in a particular academic department, making it difficult for students and researchers from other fields to access it. As libraries already play a key role in advocacy for information literacy and ethics, they may also serve as unaffiliated, central places to gain basic competencies in associated information and data skills. Training patrons to accurately analyze, assess, and create data visualizations is a natural enhancement to this role. Building competencies in these areas will aid patrons in their own understanding and use of complex visualizations. It may also help to create a robust learning community and knowledge base around this form of visual communication.

In an age of “fake news” and “post-truth politics,” visual literacy, data literacy, and data visualization have become exceedingly important. Without knowing the ways that data can be manipulated, patrons are less capable of assessing the utility of the information being displayed or making informed decisions about the visual story being told. Presently, many academic libraries are investing resources in data services and subscriptions. Training students, faculty, and researchers in ways of effectively visualizing these data sources increases their use and utility. Finally, having data visualization skills within the library also comes with an operational advantage, allowing more effective sharing of data about the library.

We are the Visualizing the Future Symposia, an Institute of Museum and Library Services National Forum Grant-funded group created to develop instructional and research materials on data visualization for library professionals and a community of practice around data visualization. The grant was designed to address the lack of community around data visualization in libraries. More information about the grant is available at the Visualizing the Future website. While we have only included the names of the three main authors, this work is a product of the entire cohort, which includes Delores Carlito, David Christensen, Ryan Clement, Sally Gore, Tess Grynoch, Jo Klein, Dorothy Ogdon, Megan Ozeran, Alisa Rod, Andrzej Rutkowski, Cass Wilkinson Saldaña, Amy Sonnichsen, and Angela Zoss. We are currently halfway through our grant work and, in addition to providing publicly available resources for teaching visualization, are in the process of synthesizing and collecting shared insights into developing and providing data visualization instruction. The present article represents some of the key findings of our grant work.

Current Environment

In order to identify some broad data visualization needs and values, we reviewed three environmental scans. The first was carried out by Angela Zoss, one of the co-investigators on the grant, at Duke University (2018), based on a survey that received 36 responses from 30 separate institutions. The second, by S.K. Van Poolen (2017), focuses on an overview of the discipline and includes results from a survey of Big Ten Academic Alliance institutions and others. The final report, by Ilka Datig for Primary Research Group Inc. (2019), provides a number of in-depth case studies. While none of the studies claim to provide an exhaustive list of every person or institution providing data visualization support in libraries, in combination they provide an effective overview of the state of the field.
Institutions

The combined environmental scans represent around thirty-five institutions, primarily academic libraries in the United States. However, the Zoss survey also includes data from the Australian National University, a number of Canadian universities, and the World Bank Group. The universities represented vary greatly in size and include large research institutions, such as the University of California, Los Angeles, and small liberal arts schools, such as Middlebury and Carleton Colleges. Some appointments were full-time, while others reported visualization as a part of other job responsibilities. In the Zoss survey, roughly 33% of respondents reported having the word “visualization” in their job title.

Types of activities

The combined scans include a variety of services and activities. According to the Zoss survey, the two most common activities (i.e., the activities that the most respondents said they engaged in) were providing consultations on visualization projects and giving short workshops or lectures on data visualization. Other services offered include providing internal data visualization support for analyzing and communicating library data; training on visualization hardware and spaces (e.g., large-scale visualization walls, 3D CAVEs); and managing such spaces and hardware.

Resources needed

These three environmental scans also collectively identify a number of resources that are critical for supporting data visualization in libraries. One of the key elements is training for new librarians, or librarians new to this type of work, on visualization itself and on teaching and consulting on data visualization. They also note that resources are required to effectively teach and support visualization software, including access to the software and learning materials, as well as ample time for librarians to learn, create, and experiment themselves so that they can be effective teachers. Finally, they outline the need for communities of practice across institutions and shared resources to support visualization.

It’s About the People

In all of our work and research so far, one important element seems worth stressing and calling out on its own: it is the people who make data visualization services work. Even visualization services focused on advanced instructional spaces or immersive, large-scale displays require expertise to help patrons learn how to use the space, maintain and manage technology, schedule events to create interest, and, especially in the case of advanced spaces, create and manage content to suggest the possibilities. An example of this is the North Carolina State University Libraries’ Andrew W. Mellon Foundation-funded project “Immersive Scholar” (Vandegrift et al.), which brought visiting artists to produce immersive artistic visualization projects in collaboration with staff for the large-scale displays at the library.

We encourage any institution that is considering developing or expanding data visualization services to start by defining the skill sets and services they wish to offer rather than the technology or infrastructure they intend to build. Some of these skills may include programming, data preparation, and designing for accessibility, which can support a broad range of services to meet user needs. Unsupported infrastructure (stale projects, broken technology, etc.)
is a continuing problem in providing data visualization services, and starting any conversation around data visualization support by thinking about the people needed is crucial to creating sustainable, ethical, and useful services. As evidenced by both the information in the environmental scans and the experiences of Visualizing the Future fellows, one of the most consistently important ways that libraries are supporting visualization is through consultations and workshops that span technologies from Excel to the latest virtual reality systems. Moreover, using these techniques and technologies effectively requires more than just technical know-how; it requires in-depth consideration of design aesthetics, sustainability, and the ethical use and re-use of data. Responsible and effective visualization design requires a variety of literacies (discussed below) and critical consideration of where data comes from and how best to represent it—all elements that are difficult to support and instruct without staff who have appropriate time and training.

Services

Data visualization services in libraries exist both internally and externally. Internally, data visualization is used for assessment (Murphy 2015), marketing librarians’ skills and demonstrating the value of libraries (Bouquin and Epstein 2015), collection analysis (Finch 2016), internal capacity building (Bouquin and Epstein 2015), and in other areas of libraries that primarily benefit the institution. External services, in contrast, support students, faculty, researchers, non-library staff, and community members. Some examples of services include individual consultations, workshops, creating spaces for data visualization (both physical and virtual), and providing support for tools. Some libraries extend visualization services into additional areas, like the New York University Health Sciences Library’s “Data Visualization Clinic,” which provides a space for attendees to share and receive feedback on their data visualizations from their peers (Zametkin and Rubin 2018), and the North Carolina State University Libraries’ Coffee and Viz Series, “a forum in which NC State researchers share their visualization work and discuss topics of interest” that is also open to the public (North Carolina State University Libraries 2015).

In order to offer these services, libraries need staff who have some interest and/or experience with data visualization. Some models include functional roles, such as data services librarians or data visualization librarians. These functional librarian roles ensure that the focus is on data and data visualization, and that there is dedicated, funded time available to work on data visualization learning and support. It is important to note that if there is a need for research data management support, it may require a position separate from data visualization. Data services are broad and needs can vary, so some assessment of the community’s greatest needs would help focus functional librarian positions.

Functional librarian roles may lend themselves to external-facing support and community building around data visualization beyond internal staff. A needs assessment can help identify user-centered services, outreach, and support that could help create a community around data visualization for students, faculty, researchers, non-library staff, and members of the public. Having a community focused on data visualization will make sure that services, spaces, and tools are utilized and meet user needs.
There is also room to develop non-librarian, technical data visualization positions, such as data visualization specialists or tool-specific specialist positions. These positions may not always have an outreach or community-building focus and may be best suited for internal library data visualization support and production. Offering data visualization support as a service to users is separate from data visualization support as a part of library operations, and the decision on how to frame these positions can largely be determined by library needs.

External data visualization services can include workshops, training sessions, consultations, and classroom instruction. These services can be focused on specific tools, such as Tableau, R, Gephi, and so on. They can be focused on particular skills, such as data cleaning and normalizing, dashboard design, and coding. They can also address general concerns, such as data visualization transparency and ethics, which may be folded into all of the services. There are some challenges in determining which services to offer:

- Is there an interest in data visualization in the community? This question should be answered before any services are offered, to ensure services are utilized. If there are any liaison or outreach librarians at your institution, they may have deeper insight into user needs and connections to the leaders of their user groups.
- Are there staff members who have dedicated time to effectively offer these services and support your users?
- Is there funding for the tools you want to teach?
- Do you have a space in which to offer these services? This does not have to be anything more complicated than a room with a projector, but if these services begin to grow, it is important to consider their effectiveness with a larger population. For example, a cap on the number of attendees for a tool-specific workshop might be needed to ensure the attendees receive enough individual support throughout the session.

If all of these areas are not addressed, there will be challenges in providing data visualization services and support. Successful data visualization services have adequate staffing, access to the required tools and data, space to offer services (not necessarily a data wall or makerspace, but simply a space with sufficient room to teach and collaborate), and a community that is already interested in and in need of data visualization services.

Literacies

The skills that are necessary to provide good data visualization services are largely practical. We derive the following list from our collective experience, both as data visualization practitioners and as part of the Visualizing the Future community of practice. While the list is not meant to be exhaustive, these are the core competencies that should be developed to offer data visualization services, either by an individual or as part of a team.

- A strong design sense: Without an understanding of how information is effectively conveyed, it is difficult to create or assess visualizations. Thus, data visualization experts need to be versed in the main principles of design (e.g., Gestalt, accessibility) and how to use these techniques to effectively communicate visual information.
- Awareness of the ethical implications of data visualizations: Although the finer details are usually assessed on a case-by-case basis, a data visualization expert should be able to recognize when a visualization is misleading and should have the agency to decline to create biased products. This is a critical part of enabling the practitioner to be an active partner in the creation of visualizations.
- An understanding of, if not expertise in, a variety of visualization types: network visualizations, maps, glyphs, and Chernoff faces, for example. There are many specialized forms of data visualization, and no individual can be an expert in all of them, but a data visualization practitioner should at least be conversant in many of them. Although universal expertise is impractical, a working knowledge of when particular techniques should be used is a very important literacy.
- A similar understanding of a variety of tools: some examples include Tableau, PowerBI, Shiny, and Gephi. There are many different tools in current use for creating static graphics and interactive dashboards. Again, universal expertise is impractical, but a competent practitioner should be aware of the tools available and capable of making recommendations outside their expertise.
- Familiarity with one or more coding languages: Many complex data visualizations happen at least partially at the command line, so an effective practitioner needs to be at least familiar with the languages most commonly used (likely R or Python). Not every data visualization expert needs to be a programmer, but familiarity with the potential of these tools is necessary.

Conclusion

The challenges inherent in building and providing data visualization instruction in academic libraries provide an opportunity to address larger pedagogical issues, especially around emerging technologies, methods, and roles in libraries and beyond. In public library settings, the need for services may be even greater, with patrons unable to find accessible training sources when they need to analyze, assess, and work with diverse types of data and tools. While the focus of our grant work has been on data visualization, the findings reflect the general difficulty of balancing the need and desire to teach tools and invest in infrastructure with the value of teaching concepts and investing in individuals. It is imperative that work teaching and supporting emerging technologies and methods focus on supporting people and the development of literacies rather than just teaching the use of specific tools. To do so requires the creation of spaces and networks to share information and discoveries.

Bibliography

Bouquin, Daina and Helen-Ann Brown Epstein. 2015. “Teaching Data Visualization Basics to Market the Value of a Hospital Library: An Infographic as One Example.” Journal of Hospital Librarianship 15, no. 4: 349–364. https://doi.org/10.1080/15323269.2015.1079686.

Datig, Ilka. 2019. Profiles of Academic Library Use of Data Visualization Applications. New York: Primary Research Group Inc.

Finch, Jannette L. and Angela R. Flenner. 2016. “Using Data Visualization to Examine an Academic Library Collection.” College & Research Libraries 77, no. 6: 765–778. https://doi.org/10.5860/crl.77.6.765.

“Immersive Scholar.” Accessed June 26, 2020. https://www.immersivescholar.org/.

LaPolla, Fred Willie Zametkin and Denis Rubin. 2018. “The ‘Data Visualization Clinic’: A Library-led Critique Workshop for Data Visualization.” Journal of the Medical Library Association 106, no. 4: 477–482. https://doi.org/10.5195/jmla.2018.333.

Murphy, Sarah Anne. 2015. “How Data Visualization Supports Academic Library Assessment.” College & Research Libraries News 76, no. 9: 482–486. https://doi.org/10.5860/crln.76.9.9379.

North Carolina State University Libraries.
“Coffee & Viz.” Accessed December 4, 2019. https://www.lib.ncsu.edu/news/coffee–viz.

Van Poolen, S.K. 2017. “Data Visualization: Study & Survey.” Practicum study at the University of Illinois.

Zoss, Angela. 2018. “Visualization Librarian Census.” TRLN Data Blog. Last modified June 16, 2018. https://trln.github.io/data-blog/data%20visualization/survey/visualization-librarian-census/.

About the Authors

Negeen Aghassibake is the Data Visualization Librarian at the University of Washington Libraries. Her goal is to help library users think critically about data visualization and how it might play a role in their work. Negeen holds an MS in Information Studies from the University of Texas at Austin.

Matthew Sisk is a spatial data specialist and Geographic Information Systems Librarian based in Notre Dame’s Navari Family Center for Digital Scholarship. He received his PhD in Paleolithic Archaeology from Stony Brook University in 2011 and has worked extensively in GIS-based archaeology and ecological modeling. His research focuses on human-environment interactions, the spatial scale of environmental toxins, and community-based research.

Justin Joque is the Visualization Librarian at the University of Michigan. He completed his PhD in Communications and Media Studies at the European Graduate School and holds a Master of Science in Information from the University of Michigan.

This entry is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International license.
altman-building-2021 ---- Chapter 8: Building a Machine Learning Pipeline

Audrey Altman, Digital Public Library of America

As a new machine learning (ML) practitioner, it is important to develop a mindful approach to the craft. By mindful, I mean possessing the ability to think clearly about each individual piece of the process, and understanding how each piece fits into the larger whole. In my experience, there are many good tutorials available that will help you work with an individual tool, deploy a specific algorithm, or complete a single task. It is more difficult to find guidelines for building a holistic system that supports the entire ML workflow. My aim is to help you build just such a system, so that you are free to focus on inquiry and discovery rather than struggling with infrastructure and process. I write this as a software developer who has, at one time or another, been on the wrong end of all the recommendations presented here, and hopes to save you from similar headaches. Many of the examples and design choices are drawn from my experiences at the Digital Public Library of America, where I have worked alongside a very talented team of developers. This is by no means an exhaustive text, but rather a bit of pragmatic advice and a jumping-off point for further research, designed to give you a clearer idea of which questions to ask throughout your practice.

This article reviews the basic machine learning workflow, discussing design considerations along the way. It offers recommendations for data storage, guidelines on selecting and working with ML algorithms, and questions to guide tool selection. Finally, it describes some challenges with scaling up. My hope is that the insight presented here, combined with your good judgement, will empower you to get started with the actual practice of designing and executing a machine learning project.

Algorithm selection

As you begin ingesting and preparing data, you’ll want to explore possible machine learning algorithms to perform on your dataset. Choose an algorithm that fits your research question and data. If you’re not sure which algorithm to choose and not constrained by time, experiment with several different options and see which one yields the best results. Start by determining what general type of learning algorithm you need, and proceed from there to research and select one that specifically addresses your research question.
In supervised learning, you train a model to predict an output condition based on given input conditions; for example, predicting whether or not a patient has some disease based on their symptoms, or the topic of a news article based on keywords in the text. In order for supervised learning to work, you need labeled training data, meaning data in which the outcome is already known. Examples include records of symptoms in patients who were known to have the disease (or not), or news articles that have already been assigned topics. Classification and regression are both types of supervised learning. In a classification problem, you are predicting among a discrete number of possible outcomes. For example, “based on what I know about this book, will it make the New York Times Best Seller list?” is a classification problem because there are two discrete outcomes: yes or no. Classification algorithms include naive Bayes, decision trees, and k-nearest neighbor. Regression problems try to predict an outcome from a continuum of possibilities, e.g., “based on what I know about this book, what will its retail price be?” Regression algorithms include linear regression and regression trees.

In unsupervised learning, the ML algorithm discovers a new pattern. The training data is unlabeled, meaning there is no indication of how the data should be organized at the outset. A common example is clustering, in which the algorithm groups items together based on features it finds mathematically significant. Perhaps you have a collection of news articles (with no existing topic labels), and you want to discover common themes or topics that appear throughout the collection. The algorithm will not tell you what the themes or topics are, but it will show which articles group together. It is then up to the researcher to work out the common thread.

In addition to serving your research question, your algorithm should also be a good fit for your data. Specific considerations will vary for each dataset and algorithm, so make sure you know the strengths and weaknesses of your algorithm and how they relate to the unique qualities of your dataset. For example, algorithms differ in their abilities to handle datasets with a very large number of features, handle datasets with high variance, efficiently process very large datasets, and glean meaningful intelligence from very small datasets. Is it important that your algorithm be easy to explain? Some algorithms, such as neural nets, function as black boxes, and it is difficult to decipher how they arrive at their decisions. Other algorithms, such as decision trees, are easy to understand. Can you prepare your data for the algorithm with a reasonable amount of pre-processing? Can you find examples of success (or failure) from people using similar datasets with the same algorithm? Asking these sorts of questions will help you to choose an algorithm that works well for your data, and will also inform how you prepare your data for optimal use.

Finally, consider whether or not you are constrained by time, hardware, or available toolsets. Different algorithms require different amounts of time and memory to train and/or execute. Different ML tools offer implementations of different algorithms.
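To make the classification/regression distinction concrete, here is a minimal sketch using scikit-learn's decision tree implementations; the book-related features, labels, and prices are invented toy data, not examples from this chapter.

    # Classification predicts a discrete outcome; regression predicts a
    # continuous one. Toy features: [page count, author's prior sales].
    import numpy as np
    from sklearn.tree import DecisionTreeClassifier, DecisionTreeRegressor

    X = np.array([[320, 15000], [180, 200], [450, 90000], [210, 50]])

    # Classification: will the book make the best seller list (1) or not (0)?
    y_class = np.array([1, 0, 1, 0])
    clf = DecisionTreeClassifier().fit(X, y_class)
    print(clf.predict([[300, 40000]]))  # a discrete label: 0 or 1

    # Regression: what will the book's retail price be?
    y_price = np.array([24.99, 12.50, 29.99, 9.99])
    reg = DecisionTreeRegressor().fit(X, y_price)
    print(reg.predict([[300, 40000]]))  # a continuous value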
The machine learning pipeline

The metaphor of a pipeline is often used for a machine learning workflow. This metaphor captures the idea of data channeled through a series of sequential transformations. However, it is important to note that each stage in the process will need to be repeated and honed throughout the course of your project. Therefore, don’t think of yourself as building a single intelligent model, such as a decision tree or clustering algorithm. Instead, build a pipeline with pieces that can be swapped in and out as needed. Data flows through the pipeline and outputs a version of a decision tree, clustering algorithm, or other intelligent model. Throughout your process, you will tweak your pipeline, making many intelligent models. Eventually you will select the best model for your use case. To use another metaphor: don’t build a car, build an assembly line for making cars.

While the final output of a machine learning workflow is some sort of intelligent model, there are many factors that make repetition and iteration necessary. ML processes often involve subjective decisions, such as which data points to ignore, or which configurations to select for your algorithm. You will want to test different possibilities to see what works best. As you learn more about your dataset throughout the course of the project, you will go back and tweak parts of your process. You may discover biases in your data or algorithms that need to be addressed. If you are working collaboratively, you will be incorporating asynchronous feedback from members of your team. At some point, you may need to introduce new or revised data, or try a new tool or algorithm. It is also prudent to expect and plan for errors. Human errors are inevitable, and hardware errors, such as network timeouts or memory overloads, are common. For all of these reasons, you will be well served by a pipeline composed of modular, repeatable steps, each with discrete and stable output.

A modular pipeline supports a batch-processing workflow, in which whole datasets undergo a series of transformations. During each step of the process, a large amount of data (possibly the entire dataset) is transformed all at once and then incrementally stored. This can be contrasted with a real-time workflow, in which individual records are transformed instantaneously (e.g., a librarian updates a single record in a library catalog), or a streaming workflow, in which a continuous flow of data is pushed through an entire pipeline, often without incremental storage along the way (e.g., performing analysis on a continuous stream of new tweets). Batch processing is common in the research and development phase of an ML project, and may also be a good choice for a production system.

When designing any step in the batch-processing pipeline, assume that at some point you will need to repeat it, either exactly as is or with modifications. Documenting your process lets you compare the outputs of different variations and communicate the ways in which your choices impact the final results. If you’re writing code, version control software can help. If you’re doing more manual data manipulations, such as editing data in spreadsheets, you will need an intentional system of documenting exactly which transformations you are applying to your data. It is generally preferable to automate processes wherever possible so that you can repeat them with ease and consistency.
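One way to realize this "assembly line" in code is to make each stage a small function that reads one immutable input and writes one discrete output. The sketch below is only a skeleton under assumed names: the file paths, the "subject" column, and the placeholder train() step are illustrative, not drawn from the chapter.

    # Each step is swappable; data flows through and is stored incrementally.
    import pandas as pd

    def acquire(raw_path: str) -> pd.DataFrame:
        """Read a raw snapshot; the snapshot itself is never edited."""
        return pd.read_csv(raw_path)

    def prepare(df: pd.DataFrame) -> pd.DataFrame:
        """One example cleanup step; a real pipeline chains several."""
        return df.dropna(subset=["subject"])

    def train(df: pd.DataFrame) -> dict:
        """Placeholder for fitting a model and returning the artifact."""
        return {"n_records": len(df)}

    if __name__ == "__main__":
        raw = acquire("data/raw_20240101.csv")
        clean = prepare(raw)
        clean.to_csv("data/clean_20240101.csv", index=False)  # stable output
        model = train(clean)

To swap in a different cleanup rule or algorithm, you replace one function and rerun the batch; everything downstream is reproduced from the stored intermediate files.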
A concrete example from my own experience demonstrates the importance of a pipeline that supports repetition. In my first ever ML project, I worked with a set of XML library data converted to CSV. I did most of my data cleanup by hand using spreadsheet software, and was not careful about preserving the formulas for each step of the process; instead, I deleted and wrote over many important intermediate computations, saving only the final results. This whole process took me countless hours, and when an updated dataset became available, there was no way to reproduce my painstaking cleanup process. I was stuck with outdated data, and my final output was doomed to grow more and more irrelevant as time wore on. Since then, I have always written repeatable scripts for all my data cleanup tasks.

Each decision you make will have an impact on the final results, so it is important to keep clear documentation and to verify your assumptions and hypotheses wherever possible. Sometimes there will be explicit tests to perform; at other times, you may just need to look at the data—make a quick visualization, perform a simple calculation, or glance through a sample of records. Be cognizant of the potential to introduce error or bias. For example, you could remove a field that you don’t think is important, but that would, in fact, have a meaningful impact on the final result. All of these precautions will strengthen confidence in your final outcomes and make them intelligible to your collaborators and other audiences.

The pipeline for a machine learning project generally comprises five stages: data acquisition, data preparation, model training and testing, evaluation and analysis, and application of results.

Data acquisition

The first step is to acquire the data that you will be using for your machine learning project. You may need to combine data from several different sources. There are many ways to acquire data, including downloading files, querying a database or API, or scraping web pages. Depending on the size of the source data and how it is made available, this can be a quick and simple step or the most challenging bottleneck in your pipeline. However you get your initial data, it is generally a good idea to save a copy in the rawest possible form and treat that copy as immutable, at least during the initial phase of testing different algorithms or configurations. Having a raw, immutable copy of your initial dataset (or datasets) ensures that you can always go back to the beginning of your ML process and start over with exactly the same input. It will also save you from the possibility that the source data will change from beneath you, thereby compromising your ability to compare the outputs of different operations (for more on this, see the section on data storage). If possible, it’s often worthwhile to learn about how the original data was created, especially if you are getting data from multiple sources that differ in subtle ways.
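A minimal sketch of this step might look like the following; the API endpoint and directory name are hypothetical assumptions, and the point is simply that the raw response is written once, with a timestamp, and never edited afterward.

    import datetime
    import os
    import requests

    url = "https://api.example.org/records?format=marcxml"  # invented endpoint
    response = requests.get(url, timeout=60)
    response.raise_for_status()

    # Save the rawest possible form of the data as an immutable copy.
    os.makedirs("acquisitions", exist_ok=True)
    stamp = datetime.datetime.now().strftime("%Y%m%d_%H%M%S")
    with open(f"acquisitions/marc_{stamp}.xml", "wb") as f:
        f.write(response.content)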
Data preparation

Data preparation involves cleaning data and transforming it into an appropriate format for subsequent machine learning tasks. This is often the part of the process that requires the most work, and you should expect to iterate over your data preparations many times, even after you’ve started training and testing models.

The first step of data preparation is to parse your acquired data and transform it into a common, usable schema. Acquired data often comes in file formats that are good for data sharing, such as XML, JSON, or CSV. You can parse these files into whatever schema makes sense to manage the various transformations you want to perform, but it can help to have a sense of where you are headed. Your eventual choice of data format will likely be dictated by your ML algorithms; likely candidates include multidimensional arrays, tensors, matrices, and DataFrames. Look ahead to specific functions in the specific libraries you plan to use, and see what type of input data is required. You don’t have to use these same formats during your data preparations, though it can simplify the process.

Data cleanup and transformation is an art. Data is messy, and the messier the data, the harder it is to analyze and uncover underlying patterns. Yet we are only human, and perfect data is far beyond our reach. To strike a workable balance, focus on those cleanup tasks that you know (or strongly suspect) will have a significant impact on the final product. Cleanup and transformation operations include removing punctuation or stopwords from textual data, standardizing date and number formats, replacing missing or dummy values with a meaningful default, and excluding data that is known to be erroneous or atypical. You will select relevant data points, and you may need to represent them in a new way: a birth date becomes an age range; a place name becomes geo-coordinates; a text document becomes a word density vector. There are many possible normalizations to perform, depending on your dataset and which algorithm(s) you plan to use. It’s not a bad idea to ensure that there’s a genuinely unique identifier for each record (even if you don’t see an immediate need for one). This is also a good time to reflect on any biases that might be inherent in your data, and whether or not you can adjust for them; even if you cannot, understanding how they might impact the ML process will help you conduct a more nuanced analysis and frame your final results. At the very least, you can record biases in the documentation so that future researchers will be aware of them and react accordingly.

As you become more familiar with the data, you will likely hone your cleanup process and iterate through the steps multiple times. The more you can learn about the data, the better your preparations will be. During the data preparation phase, practitioners often make use of visualizations and query frameworks to picture their data holistically, identify patterns, and find errors or outliers. Some ML tools support these features out of the box, or are intentionally interoperable with external query and visualization tools. For a lightweight tool, consider spreadsheet or notebook software. Depending on your use case, it may be worthwhile to put your data into a temporary database or search index so that you can make use of a more sophisticated query interface.
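The sketch below strings together a few of the cleanup operations mentioned above using pandas; the file name and column names ("pub_date", "pages", "birth_date") are invented for illustration.

    import pandas as pd

    df = pd.read_csv("data/raw_records.csv")  # hypothetical input snapshot

    # Standardize date formats; unparseable values become missing (NaT).
    df["pub_date"] = pd.to_datetime(df["pub_date"], errors="coerce")

    # Replace missing values with a meaningful default.
    df["pages"] = df["pages"].fillna(0).astype(int)

    # Represent a data point in a new way: a birth date becomes an age range.
    birth_year = pd.to_datetime(df["birth_date"], errors="coerce").dt.year
    df["age_range"] = pd.cut(pd.Timestamp.now().year - birth_year,
                             bins=[0, 18, 40, 65, 120])

    # Ensure a genuinely unique identifier for each record.
    df["record_id"] = range(len(df))

    df.to_csv("data/clean_records.csv", index=False)  # new immutable snapshot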
Model testing and training

During the testing and training phase, you will build multiple models and determine which one gives you the best results. One of the main ways you will tune your model is by trying multiple combinations of hyperparameters. A hyperparameter is a value that you set before you run the learning process, and which impacts how the learning process works. Hyperparameters control things like the number of learning cycles an algorithm will iterate through, the number of layers in a neural net, the characteristics of a cluster, or the number of decision trees in a forest. Often, you will also want to circle back to your data preparation steps to try different configurations, apply new enhancements, or address new problems and particularities that you’ve uncovered.

The process is deceptively simple: try out different configurations until you get a good result. The challenge comes when you try to define what constitutes a good (or good-enough) result. Measuring the quality of a machine learning model takes finesse. Start by asking: What would you expect to see if the model learned perfectly? Equally important, what would you expect to see if the model didn’t learn anything at all? You can often utilize randomness as a stand-in for no learning, e.g., “if a result was selected at random, the probability of the desired outcome would be X.” These two questions will help you to set benchmarks at both extremes of the realm of possible outcomes. Perfection is elusive, and the return on investment dwindles after a while, so be prepared to stop training once you’ve arrived at an acceptably good model.

In a supervised learning problem, the dataset is split into training and testing datasets. The algorithm uses the training data to “learn” a set of rules that it can subsequently apply to new, unseen data to predict the outcome. The testing dataset (also called a validation dataset) is used to test how well the model performs. Often, a third dataset is held out as well, reserved for final testing after the model has been trained. This third dataset provides an additional bulwark against bias and overfitting. Results are typically evaluated based on some statistical measurement that is directly relevant to your research question. In a classification problem, you might optimize for recall or precision. In a regression problem, you can use formulas such as the root-mean-square deviation to measure how well the regression line matches the actual data points. How you choose to optimize your model will depend on your specific context and priorities.

Testing an unsupervised model is not as straightforward, since there is no preconceived notion of correct and incorrect categorization. You can sometimes rely on a known pattern in the underlying dataset that you would reasonably expect to be reflected in a successful model. There may also be characteristics of the final model that indicate success. For example, if you are working with a clustering algorithm, models with dense, well-defined clusters are probably better than sparse clusters with vague boundaries. In unsupervised learning, you may want to hold back some portion of your data to perform an independent validation of your results, or you may use the entire dataset to build the model—it depends on what type of testing you want to perform.
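For the supervised case, the split-train-evaluate loop might look like the following sketch; it assumes a feature matrix X and binary labels y have already been prepared, and the particular model and metrics are stand-ins for whatever fits your research question.

    from sklearn.model_selection import train_test_split
    from sklearn.tree import DecisionTreeClassifier
    from sklearn.metrics import precision_score, recall_score

    # Hold out a final validation set, then split the rest for train/test.
    X_rest, X_final, y_rest, y_final = train_test_split(
        X, y, test_size=0.2, random_state=42)
    X_train, X_test, y_train, y_test = train_test_split(
        X_rest, y_rest, test_size=0.25, random_state=42)

    # max_depth is one hyperparameter you might vary between runs.
    model = DecisionTreeClassifier(max_depth=5).fit(X_train, y_train)

    predictions = model.predict(X_test)
    print("precision:", precision_score(y_test, predictions))
    print("recall:", recall_score(y_test, predictions))
    # X_final / y_final stay untouched until a last check of the chosen model.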
Application of results

As the final step of your workflow, you will use your intelligent model to perform some task. Perhaps you will use it for scholarly analysis of a dataset, or perhaps you will integrate it into a software product. If it is the former, consider how to export any final data and preserve the artifacts of your project. If it is the latter, consider how the model, its outputs, and its continued maintenance will fit into existing systems and workflows. Planning for interoperability may influence decisions from tool selection to data formats and storage.

Immutable data storage

Immutable data storage can benefit the batch-processing ML pipeline, especially during the initial research and development phase. This type of data storage supports iteration and allows you to compare the results of many different experiments. Treating data as immutable means that after each significant change or set of changes to your data, you save a new snapshot of the dataset that is never edited or changed. It also allows you to be flexible and adaptive with your data model. Immutable data storage has become a popular choice for data-intensive or “big data” applications as a way to easily assemble large quantities of data, often from multiple sources, without having to spend time upfront crafting a strict data model. You may have heard the term “data lake” used to refer to such large, unstructured collections of data. This can be contrasted with a “data warehouse,” which usually indicates a highly structured, centralized repository such as a relational database.

To demonstrate how immutable storage supports iteration and experimentation, consider the following scenario: You start with an input file my_data.csv, and then perform some cleanup operation over the data, such as converting all measurements in miles to kilometers, rounded to the nearest whole number. If you were treating your data as mutable, you might overwrite the original contents of my_data.csv with the transformed values. The problem with this approach comes if you want to test some alteration of your cleanup operation. Say, for example, you wanted to round all your conversions to the nearest tenth instead. Since you no longer have your original data, you would have to start the entire ML process from the top. If you instead treated your data as immutable, you would keep my_data.csv in its original state, and save the output of your cleanup operation in a new file, say my_clean_data.csv. That way, you could return to my_data.csv as many times as you wished, try different operations on this data, and easily compare the results of these operations knowing the source data was exactly the same for each one. Think of each immutable dataset as a place in your process that you can safely reset to anytime you want to try something new or correct for some bias or failure.
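In code, the scenario above might look like this sketch, where the input file is read but never overwritten; the "distance" column is an assumed name for illustration.

    import pandas as pd

    df = pd.read_csv("my_data.csv")  # original snapshot, never overwritten

    # First experiment: miles to kilometers, rounded to the nearest whole number.
    df.assign(distance=(df["distance"] * 1.60934).round(0)).to_csv(
        "my_clean_data.csv", index=False)

    # Revised experiment: round to the nearest tenth instead. Because
    # my_data.csv is intact, we simply rerun from it and compare outputs.
    df.assign(distance=(df["distance"] * 1.60934).round(1)).to_csv(
        "my_clean_data_tenths.csv", index=False)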
To illustrate the benefits of a flexible data model, consider a mutable data store, such as a relational database. Before you put any data into the database, you would first need to design a system of tables with set fields and datatypes, and the relationships between those tables. This can feel like putting the cart before the horse, especially if you are starting with a dataset with which you are not yet intimately familiar, and you want the ability to experiment with different algorithms, all of which might require slightly different transformations of the original dataset. Revisiting the example in the previous paragraph, you might initially have defined your distance datatype as an integer (when you were rounding to the nearest whole number), and would later have to change it to a floating point number (when you were rounding to the nearest tenth). Making this change would mean altering the database schema and migrating all of the existing data to the new type, which is a nontrivial task—especially if you later decide to revert back to the original type. By contrast, if you were working with immutable CSV files, it would be much easier to write out two files, one with each data type, and keep whichever one ultimately proved most effective.

Throughout your ML process, you can create several incremental datasets that are essentially read-only. There’s no one correct data storage format, but ideally you would use something simple and space-efficient with the capacity to interoperate with different tools, such as flat files (plain text files without extraneous markup, such as TXT, CSV, or Parquet). Even if your data is ultimately destined for a different kind of datastore, such as a relational database or triplestore, consider using simple, immutable storage as an intermediary to facilitate iteration and experimentation. If you’re concerned about overwhelming your local drive, cloud storage is a good option, especially if you can read and write directly from your programs or software services.

One final benefit of immutable storage relates to scale. Batch processing workflows and immutable data storage work well with distributed data processing frameworks, such as MapReduce and Spark. Therefore, if you need to scale your ML project using distributed processing, the integration will be more seamless (for more, see the section on scaling up).

Organizing Immutable Data

Organizing immutable data stores can be a challenge, especially with multiple users. A little planning can save you from losing track of your experiments and results. A well-ordered directory structure, informative and consistent file names, liberal use of timestamps, and disciplined note-taking are simple but effective strategies. For example, say you were acquiring MARCXML records from an API feed, parsing out subject terms, and building a clustering algorithm around these terms. Let us explore one possible way that you could organize your data outputs through each step of the machine learning pipeline.

To enforce a naming convention, create a helper method that generates the output path for each run of a particular data process. This output path includes the date and timestamp of the run—that way you won’t have to think about naming each individual file, and can avoid the phenomenon of a mess of files called my_clean_data.csv, my_cleaner_data.csv, my_final_cleanest_data.csv, etc. Your file path for the acquired data might be in the format:

myProject/acquisitions/marc_YYYYMMDD_HHMMSS.xml

In this case, “YYYYMMDD” represents the date and “HHMMSS” represents the timestamp. Your file path for prepared and cleaned data might be:

myProject/clean_datasets/subjects_YYYYMMDD_HHMMSS.csv

Finally, each clustering model you build could be saved using the file path pattern:

myProject/models/cluster_YYYYMMDD_HHMMSS

Following this general pattern, you can organize all of the outputs for your entire project. Using date and timestamps in the file name also enables easy sorting and retrieval of the most recent output.

For each data output, you will want to maintain a record of the exact input, any special attributes of the process (e.g., “this time I rounded decimals to the nearest hundredth”), and metrics that will help you determine success or failure of the process. If you can generate this information automatically for each process, all the better for ensuring an accurate record.
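A sketch of such a naming-convention helper appears below; the function itself is illustrative rather than code from the chapter, though the directory layout mirrors the example paths above.

    import datetime
    import os

    def output_path(project: str, stage: str, name: str, ext: str = "") -> str:
        """Build a timestamped output path such as
        myProject/clean_datasets/subjects_20240101_120000.csv."""
        stamp = datetime.datetime.now().strftime("%Y%m%d_%H%M%S")
        filename = f"{name}_{stamp}" + (f".{ext}" if ext else "")
        path = os.path.join(project, stage, filename)
        os.makedirs(os.path.dirname(path), exist_ok=True)
        return path

    print(output_path("myProject", "acquisitions", "marc", "xml"))
    print(output_path("myProject", "models", "cluster"))  # no extension needed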
One strategy is to include a second helper method in your program that will generate and write out a companion file to each data output. The companion file contains information that will help evaluate results, detect errors, perform optimizations, and differentiate between any two data outputs. In the example project, you could accompany the acquisition output with a text file detailing the exact API call used to fetch the data, the number of records acquired, and the runtime for the process. Keeping companion files as close as possible to their outputs helps prevent accidental separation, so save it to:

myProject/acquisitions/marc_YYYYMMDD_HHMMSS.txt

In this case, the date and timestamp should exactly match those of the companion XML file. When running processes that test and train models, you can include information in your companion file about hyperparameters and whatever metrics you are using to evaluate the quality of the model. In our example, the companion file to each cluster model may contain the file path for the cleaned input data, the number of clusters, and a measure of cluster variance.
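A sketch of this companion-file idea follows; the metadata fields echo the example above, but the function and the values passed to it are illustrative assumptions.

    import json
    import os

    def write_companion(data_path: str, **metadata) -> None:
        """Write a .txt companion whose name exactly matches its data output."""
        companion = os.path.splitext(data_path)[0] + ".txt"
        os.makedirs(os.path.dirname(companion) or ".", exist_ok=True)
        with open(companion, "w") as f:
            json.dump(metadata, f, indent=2)

    write_companion(
        "myProject/acquisitions/marc_20240101_120000.xml",
        api_call="https://api.example.org/records?format=marcxml",
        record_count=5000,
        runtime_seconds=312,
    )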
Working with machine learning algorithms

New technologies and software advances make machine learning more accessible to “lay” users, by which I mean those of us without advanced degrees in mathematics or data science. Yet the algorithms are complex, and you need at least an intuitive understanding of how they work if you hope to implement them correctly. I use the following three questions as a guide for understanding an algorithm. Keep in mind that any one project will likely make use of several complex algorithms along the way. These questions help ensure that I have the information I truly need, and avoid getting bogged down with details best left to mathematicians.

• What do the inputs and outputs of the algorithm mean? There are two parts to answering this question. First is the data structure, e.g., “this is a vector with 300 integers.” Second is knowing what this data describes, e.g., “each vector represents a document, and each integer specifies the number of times a particular word appears in that document.” You also need to be aware of specific implementation details—perhaps the input needs to be normalized in some way, or perhaps the output has been smoothed (a technique that compensates for noisy data or outliers). This may seem straightforward, but it can be a lot to keep track of once you’ve gone through several layers of processing and abstraction.

• What effect do different hyperparameters have on the algorithm? Part of the machine learning process is tuning hyperparameters, or trying out multiple configurations until you get satisfying results. Part of the frustration is that you can’t try every possible configuration, so you have to do some intelligent guesswork. Twiddling hyperparameters can feel enigmatic and unintuitive, since it can be difficult to predict their impact on the final outcome. The better you understand hyperparameters and their roles in the ML process, the more likely you are to make reasonable guesses and adjustments—though you should always be prepared for a surprise.

• Can you explain how this algorithm works to a layperson, and why it’s beneficial to the project? There are two benefits to articulating a response to this question. First, it ensures that you really understand the algorithm yourself. And second, you will likely be called on to give this explanation to co-collaborators and other stakeholders. A good explanation will build excitement around the project, while a befuddling one could sow doubt or disinterest. It can be difficult to strike a balance between general summary and technical equations, since your stakeholders will likely include people with diverse backgrounds, so do your best and look for opportunities for people with different areas of expertise to help refine your team’s understanding of the algorithm.

Learning more about the underlying math can help you make better, more nuanced decisions about how to deploy the algorithm, and it is fascinating in its own right—but in most cases I have found that the above three questions provide a solid foundation for machine learning research.

Tool selection

Tool selection is an important part of your process and should be approached thoughtfully. A good approach is to articulate and prioritize the needs of your team, and make selections that meet those needs. I’ve listed some possible questions for consideration below, many of which you will recognize as general concerns for any tool selection process.

• What sorts of features and interfaces do the tools offer? If you require a specific algorithm, the ability to make data visualizations, or query interfaces, you can find tools to meet these specific needs.

• How well do the tools interoperate with one another, or with other parts of your existing systems? One of the advantages of a well-designed pipeline is that it will enable you to swap out software components if the need arises. For example, if your data is in a format that is interoperable with many systems, it frees you from being tied down to any specific tool.

• How do the tools align with the skill sets and comfort levels of your team? For example, consider which coding languages your collaborators know, and whether or not they have the capacity to learn a new one. If you have someone who is already a wiz with a preferred spreadsheet program, see if you can export data into a compatible file format.

• Are the tools stable, well-documented, and well-supported? Machine learning is a fast-changing field, with new algorithms, services, and software features being developed all the time. Something new and exciting that hasn’t yet been road-tested may not be worth the risk if there is a more dependable alternative. Furthermore, there tends to be more scholarship, more documented use cases, and more tutorials for older, more widely adopted tools.

• Are you concerned about speed and scale? Don’t get bogged down with these considerations if you’re just trying to get a working pilot off the ground, but it can help to at least be aware of how problems are likely to manifest as your volume of data increases, or as you integrate into time-sensitive workflows.

You and your team can work through these questions and articulate additional requirements relevant to your specific context.

Scaling up

Scaling up in machine learning generally means that you need to work with a larger volume of data, or that you need processes to execute faster. Recent advances in hardware and software make the execution of complex computations magnitudes faster and more efficient than they were even a decade ago, and you can often achieve quite a bit by working on a personal computer. Yet time is valuable, and it can be difficult to iterate and experiment effectively when individual processes take too long to execute.
There are many ML software packages that can help you make efficient use of whatever hardware you have, including your personal computer. Some examples at the time of writing are Apache Spark, TensorFlow, Scikit-learn, and Microsoft Cognitive Toolkit, each with its own strengths and applications. In addition to providing libraries for building and testing models, these software packages optimize algorithmic performance, memory resources, data throughputs, and/or parallel computations. They can make a remarkable difference in both processing speed and the amount of data you can comfortably handle. There are also services that allow you to submit executable code and data to the cloud for processing, such as Google AI Platform.

Managing your own hardware upgrades is not without challenge. You may be lucky enough to have access to a high-powered computer capable of accelerated processing. A common example is a computer with GPUs (graphics processing units), which break complex processes into many small tasks and run them in parallel. However, these powerful machines can be prohibitively expensive. Another scaling technique is distributed or cluster computing, in which complex processes are distributed across multiple computers, often in the cloud. A cloud cluster can bring significant cost savings, but managing one requires specialized knowledge and the learning curve can be rather steep. It is also important to note that different algorithms require different scaling techniques. Some clustering algorithms, for example, scale well with GPUs but not with distributed computing.

Even with the right hardware and software, scaling up can be a tricky business. ML processes tend to have dramatic spikes in memory or network use, which can tax your systems. Not all ML algorithms scale well, causing memory use or execution time to grow exponentially as more data is added. Sometimes you have to add additional, complexity-reducing steps to your pipeline to handle data at scale (one such step is sketched below). Some of the more common machine learning languages, such as Python and R, execute relatively slowly, putting the onus on developers to optimize operations for efficiency. In anticipation of these and other challenges, it is often a good idea to start with a scaled-down pilot or proof of concept, and not to underestimate the time and resources necessary to scale up from there.
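As one hedged illustration of such a complexity-reducing step, the sketch below uses scikit-learn’s MiniBatchKMeans to cluster data one chunk at a time instead of loading everything into memory at once; the chunk loader, sizes, and cluster count are assumptions for the example, not recommendations.

```python
import numpy as np
from sklearn.cluster import MiniBatchKMeans

# Incremental (out-of-core) clustering: feed the model one chunk at a
# time, so memory use stays flat as the total volume of data grows.
model = MiniBatchKMeans(n_clusters=10, batch_size=1024, random_state=42)

def data_chunks(n_chunks=50, chunk_size=1024, dim=300):
    """Stand-in loader that yields feature matrices chunk by chunk;
    in practice these would be read from disk or a database."""
    rng = np.random.default_rng(0)
    for _ in range(n_chunks):
        yield rng.normal(size=(chunk_size, dim))

for chunk in data_chunks():
    model.partial_fit(chunk)  # updates the cluster centers incrementally

print(model.cluster_centers_.shape)  # -> (10, 300)
```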
Conclusion

New technologies make it possible for more researchers and developers to leverage the power of machine learning. Building an effective machine learning system means supporting the entire workflow, from data acquisition to final analysis. Practitioners must be mindful of how each implementation decision and subjective choice—from the way you structure and store your data to the algorithms you use to the ways you validate your results—will impact the efficiency of operations and the quality of learned intelligence. This article has offered some practical guidelines for building ML systems with modular, repeatable processes and intelligible, verifiable results. There are many resources available for further research, both online and in your libraries, and I encourage you to consult with subject specialists, data scientists, mathematicians, programmers, and data engineers. May your data be clean, your computations efficient, and your results profound.

Further Reading

I include here a few suggestions for further reading on key topics. I have also found that in the fast-changing world of machine learning technologies, blogs, internet communities, and online classes can be a great source of information that is current, introductory, and/or geared toward practitioners.

Tan, Pang-Ning, Michael Steinbach, and Vipin Kumar. 2005. Introduction to Data Mining. Boston: Pearson Addison Wesley. See chapter 2 for data preparation strategies. Later chapters introduce common classification and clustering algorithms.

Marz, Nathan and James Warren. 2015. Big Data: Principles and Best Practices of Scalable Real-Time Data Systems. Shelter Island: Manning. “Part 1: Batch Layer” discusses immutable storage in depth.

Kleppmann, Martin. 2017. Designing Data-Intensive Applications: The Big Ideas Behind Reliable, Scalable, and Maintainable Systems. Boston: O’Reilly. “Chapter 10: Batch Processing” is especially relevant if you are interested in scaling up.

bandyopadhyay-beyond-2021 ---- Beyond Node Embedding: A Direct Unsupervised Edge Representation Framework for Homogeneous Networks

Sambaran Bandyopadhyay1 and Anirban Biswas2 and Narasimha Murty3 and Ramasuri Narayanam4

Abstract. Network representation learning has traditionally been used to find lower dimensional vector representations of the nodes in a network. However, there are very important edge driven mining tasks of interest to the classical network analysis community, which have mostly been unexplored in the network embedding space. For applications such as link prediction in homogeneous networks, the vector representation (i.e., embedding) of an edge is derived heuristically just by using simple aggregations of the embeddings of the end vertices of the edge. Clearly, this method of deriving edge embeddings is suboptimal, and there is a need for a dedicated unsupervised approach for embedding edges by leveraging edge properties of the network. Towards this end, we propose a novel concept of converting a network to its weighted line graph, which is ideally suited to find the embedding of edges of the original network. We further derive a novel algorithm to embed the line graph, by introducing the concept of collective homophily. To the best of our knowledge, this is the first direct unsupervised approach for edge embedding in homogeneous information networks, without relying on the node embeddings. We validate the edge embeddings on three downstream edge mining tasks. Our proposed optimization framework for edge embedding also generates a set of node embeddings, which are not just the aggregation of edges. Further experimental analysis shows the connection of our framework to the concept of node centrality.

1 Introduction

Network representation learning (also known as network embedding) has gained significant interest over the last few years. Traditionally, network embedding [22, 12, 28] maps the nodes of a homogeneous network (where nodes denote entities of similar type) to lower dimensional vectors, which can be used to represent the nodes. It has been shown that such continuous node representations outperform conventional graph algorithms [2] on several node based downstream mining tasks like node classification, community detection, etc. Edges are also important components of a network. From the point of view of downstream network mining analytics, there are plenty of network applications - such as computing edge betweenness centrality [20] and information diffusion [24] - which heavily depend on the information flow in the network.
Compared to the conventional downstream node embedding tasks (such as node classification), these tasks are more complex in nature. But similar to node based analytics, there is a high chance to improve the performance of these tasks in a continuous lower dimensional vector space. Thus, it makes sense to address these problems in the context of network embedding via direct representation of the edges of a network. As a first step towards this direction, it is important to design dedicated edge embedding schemes and validate the quality of those embeddings on some basic edge-centric downstream tasks.

1 IBM Research & IISc, Bangalore, email: sambband@in.ibm.com
2 Indian Institute of Science, Bangalore, email: anirbanb@iisc.ac.in
3 Indian Institute of Science, Bangalore, email: mnm@iisc.ac.in
4 IBM Research, Bangalore, email: ramasurn@in.ibm.com

Figure 1: Edge Visualization: (a) We created a small synthetic network with two communities. So, there are three types of edges: green (or red) edges with both end points belonging to the green (or red, respectively) community; blue edges with end points belonging to two different communities. (b) node2vec embeddings (8 dimensional) of the edges, obtained by taking the average of the embeddings of the end vertices and visualized with t-SNE. (c) Direct edge embeddings (8 dimensional) obtained by line2vec, visualized with t-SNE. Clearly, line2vec is superior: it visually separates the edge communities, in contrast to the conventional way of aggregating node embeddings to obtain edge representations.

In the literature, there are indirect ways to compute the embedding of an edge in an information network. For tasks like link prediction, where a classifier needs to be trained on both positive (existing) and negative (non-existing) edge representations, a simple aggregation function [12] such as the vector average or Hadamard product has been used on the representations of the two end vertices to derive the vector representation of the corresponding edge. Typically, node embedding algorithms use the homophily property [18] by respecting different orders of node proximities in a network. As the inherent objective functions of these algorithms are focused on the nodes of the network, using an aggregation function on these node embeddings to get the edge embedding could be suboptimal. We demonstrate the shortcoming of this approach in Figure 1, where the visualization of the edge embeddings derived by aggregating node embeddings (taking the average of the two end nodes) from node2vec [12] on a small synthetic graph does not maintain the edge community structure of the network. In contrast, a direct edge embedding approach, line2vec, to be proposed in this paper, completely adheres to the community structure, as edges of different types are visually segregated in the t-SNE plot shown in Figure 1(c). So there is a need to develop algorithms for directly embedding edges (i.e., not via aggregating node embeddings) in information networks. We address this research gap in this paper in a natural way. Following are the contributions:

• We propose a novel edge embedding framework, line2vec, for homogeneous social and information networks. To the best of our knowledge, this is the first work to propose a dedicated unsupervised edge embedding scheme which avoids aggregation of the end node embeddings.
• We exploit the concept of the line graph for edge representation by converting the given network to a weighted line graph. We further introduce the concept of collective homophily to embed the line graph and produce the embedding of the edges of the given network.

• We conduct experiments on three edge-centric downstream tasks. Though our approach is proposed for embedding edges, we further analyze it to show that a set of robust node embeddings, which are not just the aggregation of edges, is also generated in the process.

• We experimentally discover the non-trivial connection of the classical concept of node centrality with the optimization framework of line2vec. The source code of line2vec is available at https://bit.ly/2kfiS2l to ease the reproducibility of the results.

Though edge centric network mining tasks such as edge centrality, network diffusion and link prediction can benefit from edge embeddings, applying edge embeddings to tackle them is non-trivial and needs a separate body of work. For example, finding central edges in the network amounts to detecting a subset of points in the embedding space which are diverse among each other and represent a majority of the other points. We leave these to be addressed in future work.

2 Related Work and Research Gaps

Node embedding in information networks has received great interest from the research community. We refer the readers to the survey articles [33] for a comprehensive survey on network embedding and cite only some of the more prominent works in this paragraph. DeepWalk [22] and node2vec [12] are two node embedding approaches which employ different types of random walks to capture the local neighborhood of a node and maximize the likelihood of the node context. Struc2vec [23] is another random walk based strategy which finds similar embeddings for nodes which are structurally similar. A deep autoencoder based node embedding technique (SDNE) that preserves structural proximity is proposed in [31]. Different types of node embedding approaches for attributed networks are also present in the literature [35, 3, 9]. A semi-supervised graph convolution network based node embedding approach is proposed in [14] and further extended in GraphSAGE [13], which learns the node embeddings with different types of neighborhood aggregation methods on attributes. Recently, node embedding based on semi-supervised attention networks [28], on maximizing mutual information [29], and in the presence of outliers [4] have been proposed.

Compared to the above, representing edges in information networks is significantly less mature. Some preliminary works exist which use random walks on edges for community detection in networks [15] or to classify large-scale documents into large-scale hierarchically-structured categories [11]. [1] focuses on the asymmetric behavior of the edges in a directed graph for deriving node embeddings, but it represents a potential edge just by a scalar which determines its chance of existence. [25, 30] derive embeddings for different types of edges in a heterogeneous network, but their proposed method essentially uses an aggregation function inside the optimization framework to generate edge embeddings from the node embeddings. For knowledge bases, embedding entities and relation types in a low dimensional continuous vector space [5, 7, 10] has been shown to be useful. But several fundamental concepts of graph embedding, such as homophily, are not directly applicable to them.
[19] proposes a dual-primal GCN based semi-supervised node embedding approach which first aggregates edge features by convolution, and then learns the node embeddings by employing graph attention on the incident edge features of a node. To the best of our knowledge, [36] is the only work which proposes a supervised approach, based on adversarial training and an auto-encoder, purely for edge representation learning in homogeneous networks. But their framework needs a large amount of labelled edges to train the GAN, which makes it restrictive for real world applications. Hence, in this paper, we propose a task-independent, unsupervised, dedicated edge embedding framework for homogeneous information networks to address these research gaps.

3 Problem Description

An information network is typically represented by a graph G = (V, E, W), where V = {v_1, v_2, ..., v_n} is the set of nodes (a.k.a. vertices), each representing a data object. E ⊆ {(v_i, v_j) | v_i, v_j ∈ V} is the set of edges; we assume |E| = m. Each edge e ∈ E is associated with a weight w_{v_i,v_j} > 0 (1 if G is unweighted), which indicates the strength of the relation. The degree of a node v is denoted d_v, which is the sum of the weights of the incident edges. N(v) is the set of neighbors of the node v ∈ V. For the given network G, edge representation learning is to learn a function f : e ↦ x ∈ R^K, i.e., it maps every edge e ∈ E to a K dimensional vector called the edge embedding, where K < m. These edge embeddings should preserve the underlying edge semantics of the network, as described below.

Edge Importance: Not all the edges in a network are equally important. For example, in a social network, millions of fans can be connected to a movie star, but any two fans of a movie star may not be similar to each other. So this type of connection is weaker compared to an edge which connects two friends who individually have a much smaller number of connections [16].

Edge Proximity: Edges which are close to each other in terms of their topography or semantics should have similar embeddings. Similar to the concepts of node proximities [31], it is easy to define first and higher order edge proximities via the incidence matrix.

4 Solution Approach: line2vec

We propose an elegant solution (referred to as line2vec) to embed each edge of the given network. First we map the network to a weighted line graph, where each edge of the original network is transformed into a node. Then we propose a novel approach for embedding the nodes of the line graph, which essentially provides the edge embeddings of the original network. For simplicity of presentation, we assume that the given network is undirected. Nevertheless, it can trivially be generalized to directed graphs.

4.1 Line Graph Transformation

Given an undirected graph G = (V, E), the line graph L(G) is the graph such that each node of L(G) is an edge in G, and two nodes of L(G) are neighbors if and only if their corresponding edges in G share a common endpoint vertex [32]. Formally, L(G) = (V_L, E_L), where V_L = {(v_i, v_j) : (v_i, v_j) ∈ E} and E_L = {((v_i, v_j), (v_j, v_k)) : (v_i, v_j) ∈ E, (v_j, v_k) ∈ E}. Figure 2 shows how to convert a graph into its line graph [8]. Hence the line graph transformation induces a bijection l : e ↦ v from the set of edges of the given graph to the set of nodes of the line graph, where for every e ∈ E there exists a v ∈ V_L, and if two edges e_i, e_j ∈ E are adjacent there is a corresponding edge e ∈ E_L in the line graph.
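As an illustrative aside (not part of the paper's code release), this transformation is available off the shelf: networkx's line_graph function implements exactly the bijection described above.

```python
import networkx as nx

# A small undirected graph G.
G = nx.Graph([(1, 2), (1, 3), (1, 4), (2, 3)])

# Each node of L(G) is an edge of G; two nodes of L(G) are adjacent
# if and only if the corresponding edges of G share an endpoint.
L = nx.line_graph(G)

print(sorted(L.nodes()))    # [(1, 2), (1, 3), (1, 4), (2, 3)]
print(L.number_of_edges())  # one edge per adjacent pair of edges in G
```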
Figure 2: Transformation process of a graph into its line graph. (a) represents an information network G. (b) Each edge in the original graph has a corresponding node in the line graph; here the green edges represent the nodes in the line graph. (c) For each adjacent pair of edges in G there exists an edge in L(G); the dotted lines here are the edges in the line graph. (d) The line graph L(G) of the graph G.

4.2 Weighted Line Graph Formation

We propose to construct a weighted line graph for our problem even if the original graph is unweighted. These weights help the random walk in the later stage of line2vec to focus more on the relevant nodes in the line graph. It is evident from Section 4.1 that a node of degree k in the original graph G produces k(k−1)/2 edges in the line graph L(G). Therefore high degree nodes in the original graph may get over-represented in the line graph. Often many of these incident edges are not that important to the concerned node in the given network, but they can potentially change the movement frequency of a random walk in the line graph. We follow a simple strategy to overcome this problem. The goal is to ensure that the line graph not only reflects the topology of the original graph G (which is guaranteed by the Whitney graph isomorphism theorem [32] in almost all cases) but also that the dynamics of the graph are not affected by the transformation process.

The edge weights are defined to facilitate a random walk on L(G), as described in Section 4.3.1. Intuitively, if we start a random walk from a node v_ij ≡ (v_i, v_j) ∈ L(G) and want to traverse to v_jk ≡ (v_j, v_k) ∈ L(G), then it is equivalent to selecting the node v_j ∈ G from (v_i, v_j) and moving to v_k ∈ G. If G is undirected, we define the probability of choosing v_j to be proportional to $\frac{d_{v_i}}{d_{v_i} + d_{v_j}}$. Here, d_{v_i} and d_{v_j} are the degrees of the end point nodes of the edge (v_i, v_j), and an edge in general is more important to the endpoint node having lower degree than to the endpoint with a higher degree [16]. Then selecting v_k is proportional to the edge weight of e_jk ≡ (v_j, v_k) ∈ E. Hence, for any two adjacent edges e_ij ≡ (v_i, v_j) and e_jk ≡ (v_j, v_k), we define the edge weight for the edge (e_ij, e_jk) of the line graph L(G) as follows:

$$ w(e_{ij}, e_{jk}) = \frac{d_i}{d_i + d_j} \times \frac{w_{jk}}{\sum_{r \in N(v_j)} w_{jr} - w_{ij}} \qquad (1) $$

This completes the formation of the weighted line graph from any given network.
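A minimal sketch of this construction, assuming an unweighted input graph (so every w_{jk} defaults to 1) and using networkx; this follows our reading of Eq. 1 and is not the authors' released implementation.

```python
import networkx as nx

def weighted_line_graph(G):
    """Build L(G) with edge weights following Eq. 1. The weight attached
    to the pair ((i, j), (j, k)) models a walk that leaves edge (i, j)
    through the shared endpoint j and continues along (j, k), so the
    weights are direction dependent and L(G) is built as a digraph."""
    L = nx.DiGraph()
    L.add_nodes_from(tuple(sorted(e)) for e in G.edges())
    for edge in G.edges():
        for i, j in (edge, edge[::-1]):     # both traversal directions
            d_i, d_j = G.degree(i), G.degree(j)
            denom = sum(G[j][r].get("weight", 1.0) for r in G.neighbors(j))
            denom -= G[i][j].get("weight", 1.0)
            if denom <= 0:                  # j has no other neighbors
                continue
            for k in G.neighbors(j):
                if k == i:
                    continue
                w = (d_i / (d_i + d_j)) * (G[j][k].get("weight", 1.0) / denom)
                L.add_edge(tuple(sorted((i, j))), tuple(sorted((j, k))),
                           weight=w)
    return L

L = weighted_line_graph(nx.karate_club_graph())
```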
4.3 Embedding the Line Graph

Here we propose a novel approach to embed the nodes of the line graph. A line graph is a special type of graph which comes with some nice properties. Below is one important observation that we exploit in embedding the line graph.

Lemma 1. Each (non-isolated) node in the graph G induces a clique in the corresponding line graph L(G).

Proof 1. Assume that a (non-isolated) node v in the graph G has n_v edges connected to it. These n_v edges are neighbors of each other. Hence in the corresponding line graph L(G), each of these edges is mapped to a node, and each of these nodes is connected to all the other n_v − 1 nodes. Thus there is a clique of size n_v induced in the line graph by node v.

This can be visualized in Fig. 2, where node 1 in (a), with degree 3, induces a clique of size 3 comprising the nodes (1,2), (1,3) and (1,4) in the corresponding line graph in (d). Lemma 1 is interesting because it tells us that the nodes of the line graph exhibit a collective property, rather than just a pairwise property. To clarify: in the given network, two nodes are pairwise connected by an edge, but in the line graph, a group of nodes forms a clique. Pairwise homophily [18], which has been the backbone of many standard embedding algorithms [31], is not sufficient for embedding the line graph. Hence we propose a new concept, 'collective homophily', applicable to the line graph. We explain it below.

Figure 3: Collective Homophily ensures that the embeddings of the edges which are connected via a common node in the network stay within a sphere of small radius.

4.3.1 Collective Homophily and Cost Function Formulation

We emphasize that all the nodes which are part of a clique in a line graph should be close to each other in the embedding space. One way to enforce collective homophily is to introduce a sphere (of small radius R ∈ R) in the embedding space and ensure that the embeddings of the nodes (in the line graph) which are part of a clique remain within the sphere. Hence any two embeddings within a sphere are at most 2R apart from each other. The concept is explained in Fig. 3. The smaller the radius R, the closer the embeddings of the neighboring edges are to each other, and hence the better the enforcement of collective homophily. Note that a sum of pairwise homophily losses in the embedding space may lead to some pairs being very close to each other while others remain quite far apart. So, we formulate the objective function to embed the (weighted) line graph as follows.

Let us introduce some notation. Boldface letters like u (or v) denote a node in the line graph L(G), which can also be denoted by u_{uv} when the correspondence with the edge (u, v) ∈ E in the original graph G is required. Normal face letters like u, v denote nodes in the given graph. x_v ∈ R^K (equivalently x_{uv}) denotes the embedding of the node v_{uv} in the line graph (or of the edge (u, v) ∈ E).

To map the nodes of the line graph to vectors, first we want to preserve different orders of node proximities in the line graph. For this, a truncated random walk based sampling strategy S is used to provide a set of nodes N_S(v) as context to a node v in the network. Here we employ the random walk proposed by [12], which balances between the BFS and DFS search strategies in the graph. As the generated line graph is weighted, we consider the weights of the edges while computing the node transition probabilities. Let X denote the matrix with each row being the embedding x_v of a node v of the line graph. Assuming conditional independence of the nodes, we seek to maximize (w.r.t. X) the log likelihood of the context of a node:

$$ \sum_{\mathbf{v} \in V_L} \log P(N_S(\mathbf{v}) \mid x_{\mathbf{v}}) = \sum_{\mathbf{v} \in V_L} \sum_{\mathbf{v}' \in N_S(\mathbf{v})} \log P(\mathbf{v}' \mid x_{\mathbf{v}}) $$

Each of the above probabilities can be represented using the standard softmax function parameterized by the dot product of x_{v'} and x_v. As usual, we also approximate the computationally expensive denominator of the softmax function using a negative sampling strategy N̄(v) for any node v. The above equation, after simple algebraic manipulations, leads to maximizing the following:

$$ \sum_{\mathbf{v} \in V_L} \left[ \sum_{\mathbf{v}' \in N_S(\mathbf{v})} x_{\mathbf{v}'} \cdot x_{\mathbf{v}} - |N_S(\mathbf{v})| \log \Big( \sum_{\bar{\mathbf{v}} \in \bar{N}(\mathbf{v})} \exp(x_{\bar{\mathbf{v}}} \cdot x_{\mathbf{v}}) \Big) \right] \qquad (2) $$

Next, we implement the concept of collective homophily as proposed above. Each node u ∈ V (in the original network) induces a clique in the line graph (Lemma 1). An edge (u, v) ∈ E corresponds to the node v_{uv} ∈ V_L in the line graph. So we want all the nodes of the form v_{uv} ∈ V_L, where v ∈ N(u) (the neighbors of u), to belong to a sphere centered at c_u ∈ R^K and of radius R_u.
As collective homophily suggests that the embeddings of these nodes must be close to each other, we minimize the sum of all such radii. This, together with Eq. 2, gives the final cost function of line2vec:

$$ \min_{X, R, C} \; \sum_{\mathbf{v} \in V_L} \left[ |N_S(\mathbf{v})| \log \Big( \sum_{\bar{\mathbf{v}} \in \bar{N}(\mathbf{v})} \exp(x_{\bar{\mathbf{v}}} \cdot x_{\mathbf{v}}) \Big) - \sum_{\mathbf{v}' \in N_S(\mathbf{v})} x_{\mathbf{v}'} \cdot x_{\mathbf{v}} \right] + \alpha \sum_{u \in V} R_u^2 $$
$$ \text{such that } \; \|x_{uv} - c_u\|_2^2 \le R_u^2, \; \forall v \in N(u), \forall u \in V; \qquad R_u \ge 0, \; \forall u \in V \qquad (3) $$

Here, α is a positive weight factor. The constraint ‖x_{uv} − c_u‖_2^2 ≤ R_u^2 ensures that nodes of the form x_{uv} belong to the sphere of radius R_u centered at c_u. We use R and C to denote the sets of all such radii and centers, respectively.

4.3.2 Solving the Optimization

Equation 3 is a non-convex constrained optimization problem. We use the penalty function technique [6] to convert it to an unconstrained optimization problem as follows:

$$ \min_{X, R, C} \; \sum_{\mathbf{v} \in V_L} \left[ |N_S(\mathbf{v})| \log \Big( \sum_{\bar{\mathbf{v}} \in \bar{N}(\mathbf{v})} \exp(x_{\bar{\mathbf{v}}} \cdot x_{\mathbf{v}}) \Big) - \sum_{\mathbf{v}' \in N_S(\mathbf{v})} x_{\mathbf{v}'} \cdot x_{\mathbf{v}} \right] + \alpha \sum_{u \in V} R_u^2 + \lambda \sum_{u \in V} \sum_{v \in N(u)} g(\|x_{uv} - c_u\|_2^2 - R_u^2) + \sum_{u \in V} \gamma_u \, g(-R_u) \qquad (4) $$

Here the function g : R → R is defined as g(t) = max(t, 0). It imposes a penalty on the cost function in Eq. 4 when the argument of g is positive, i.e., when there is a violation of the constraints in Eq. 3. We use a linear penalty g(t) as its gradient does not vanish even when t → 0+. To solve the unconstrained optimization in Eq. 4, we use stochastic gradient descent, computing gradients w.r.t. each of X, R and C. We take a subgradient of g(t) at t = 0. All the penalty parameters λ and γ_u corresponding to the penalty functions are positive. When there is any violation of a constraint (or sum of constraints), the corresponding penalty parameter is increased to impose more penalty. We give more importance to constraints of the type R_u ≥ 0, as their violation may change the intuition of the solution; we therefore use a different penalty parameter γ_u for each of them, so that a different penalty can be imposed on each such constraint. One can show that, under appropriate assumptions, any convergent subsequence of solutions to the unconstrained penalized problems must converge to a solution of the original constrained problem [6]. Very small values of the penalty parameters might lead to the violation of constraints, and very large values would make the gradient descent algorithm oscillate. So we start with smaller values of λ and the γ_u and keep increasing them until all the constraints are satisfied or the gradients become too large, causing abrupt function changes. Note that, theoretically, some of the constraints in Eq. 3 may still be violated, but experimentally we found them satisfied to a large extent (Section 5). In the final solution, x_v gives the vector representation of node v of the line graph, which is essentially the embedding of the corresponding edge in the original network.
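To make the penalty terms concrete, here is a small numpy sketch of the constraint-violation part of Eq. 4 (the skip-gram term is omitted, and a single scalar gamma stands in for the per-node γ_u); the data structures are our own illustrative choices, not the authors' implementation.

```python
import numpy as np

def g(t):
    """Linear penalty g(t) = max(t, 0): positive only on violation."""
    return np.maximum(t, 0.0)

def penalty_objective(x, centers, radii, incident,
                      alpha=0.1, lam=1.0, gamma=1.0):
    """Sum the alpha-, lambda- and gamma-weighted terms of Eq. 4.

    x        -- dict: edge (u, v) -> embedding vector x_uv
    centers  -- dict: node u -> sphere center c_u
    radii    -- dict: node u -> radius R_u
    incident -- dict: node u -> list of edges (u, v) incident on u
    """
    sphere = sum(
        g(np.sum((x[e] - centers[u]) ** 2) - radii[u] ** 2)
        for u, edges in incident.items()
        for e in edges
    )
    nonneg = sum(g(-r) for r in radii.values())
    size = sum(r ** 2 for r in radii.values())
    return alpha * size + lam * sphere + gamma * nonneg
```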
4.4 Key Observations and Analysis

Both the edge embedding properties mentioned in Section 3 are preserved in the construction and embedding of the weighted line graph. In particular, if two edges have a common incident node in the original network, the corresponding two nodes in the transformed line graph are neighbors. Also, two edges having similar neighborhoods in the original network lead to two nodes having similar neighborhoods in the transformed line graph. The random walk and collective homophily preserve both pairwise and collective node proximity of the line graph in the embedding space. Thus different orders of edge proximities of the original network are captured well in the edge embeddings. Also, the construction of the edge weights in the line graph (Sec. 4.2) ensures that the underlying importance of the edges of the original network is preserved in the transformed line graph, and hence in the embeddings, through the truncated random walk.

Time Complexity: Edge embedding is computationally more difficult than node embedding, as the number of edges in a real life network exceeds the number of nodes. From Lemma 1, each node u in the original network induces a clique of size d_u (the degree of u in G). Hence the total number of edges in the line graph is

$$ m_L = \sum_{u \in V} \binom{d_u}{2} = \sum_{u \in V} \frac{d_u (d_u - 1)}{2} \le |V| d^2, $$

where d is the maximum degree of a node in the given network. So the construction of the line graph takes O(|V|d²) time. Next, we use an alias table for fast computation of the corpus of node sequences in O(m_L log(m_L)) = O(|V|d² log(|V|d)) by the random walks, assuming the number of random walks on the line graph, the maximum length of a random walk, the context window size and the number of negative samples for each node to be constant, as they are hyperparameters of the skip-gram model. Then, the first term (under the sum over the nodes in V_L) of Eq. 4 can be computed in O(|V_L|) = O(|E|) time. Next, the term weighted by α can be computed in O(|V|) time. Then, for the term weighted by λ, we need to visit each node in V and, for each such node, its neighbors in the original graph, which can be computed in a total of O(|E|) time. The last term of Eq. 4 can be computed in O(|V|) time. As we use penalty methods to solve it, the runtime of solving Eq. 4 is O(|E| + |V|). Hence the total runtime complexity of line2vec is O(|V|d² log(|V|d)). In the worst case (e.g., a nearly complete graph), the runtime complexity is O(|V|³ log |V|). But for most real life social networks, the maximum degree can be considered a constant (i.e., it does not grow with the number of nodes), and for them the runtime complexity is O(|V| log |V|).

5 Experimental Evaluation

We conduct detailed experiments on three downstream edge centric network mining tasks and thoroughly analyze the proposed optimization framework of line2vec.

5.1 Design of Baseline Algorithms

Unsupervised direct edge embedding for information networks is itself a novel problem. Existing approaches only aggregate the embeddings of the two end nodes to find the embedding of an edge. So as baselines, we only consider the publicly available implementations of a set of popular unsupervised node embedding algorithms which can work using only the link structure of the graph: DeepWalk, node2vec, SDNE, struc2vec and GraphSAGE (official unsupervised implementation for un-attributed networks). We have considered different types of node aggregation methods, such as taking the average, the Hadamard product, and vector norms of the two end node embeddings [12], to generate the edge embeddings for the baseline algorithms. It turns out that the average aggregation method performs best among them. So we report the performance of the baseline methods with average node aggregation, where the embedding of an edge (u, v) is computed by taking the average of the node embeddings of u and v.

5.2 Datasets Used and Setting Hyper-parameters

We used five real world publicly available datasets for the experiments. A summary of the datasets is given in Table 1. For Zachary's karate club and the Dolphin social network (http://www-personal.umich.edu/~mejn/netdata/), there are no ground truth community labels given for the nodes.
So we use a modularity based community detection algorithm and label the nodes based on the communities they belong to. For Cora, Pubmed (https://linqs.soe.ucsc.edu/data) and MSA [26], the ground truth node communities are available. The ground truth edge labels are derived as follows. If an edge connects two nodes of the same community (an intra community edge), the label of that edge is the common community label. If an edge connects nodes of different communities (an inter community edge), then that edge is not considered when calculating the accuracy of downstream tasks. Note that all the edges (both intra and inter community) are considered for learning the edge embeddings. We also provide the size of the generated weighted line graphs in Table 1. Note that line graphs are still extremely sparse in nature, which enables the application of efficient data structures and computation on sparse graphs here.

Table 1: Summary of the datasets used.

Dataset                  #Nodes  #Edges  #Edge-Labels  #Nodes in L(G)  #Edges in L(G)
Zachary's Karate club        34      78             3              78             528
Dolphin social network       62     159             4             159             923
Cora                       2708    5278             7            5278           52301
Pubmed                    19717   44327             3           44327          699385
MSA                       30101  204926             3          204926         6149555

We set the parameter α in Eq. 3 to 0.1 in the experiments. At that value, the two components of the cost function in Eq. 3 contribute roughly the same to the total cost in the first iteration of line2vec. The dimension (K) of the embedding space is set to 8 for the Karate club and Dolphin social networks, as they are small in size, and to 128 for the other three larger datasets (for all the algorithms). For faster convergence of SGD, we set the initial learning rate higher and decrease it over the iterations. We vary the penalty parameters in Eq. 4 over the iterations, as discussed in Section 4.3.2, to ensure that the constraints are satisfied at large.

5.3 Penalty Errors of line2vec Optimization

We show the values of two different penalty errors (constraint violation errors of the penalty method based optimization) over the iterations of line2vec in Figure 5. For all the datasets, the total spherical error $\sum_{u \in V} \sum_{v \in N(u)} g(\|x_{uv} - c_u\|_2^2 - R_u^2)$ converges to a small value very close to zero, and the negative error $\sum_{u \in V} g(-R_u)$ remains zero. This means that almost all the constraints of the line2vec formulation are satisfied in the final solution.

5.4 Downstream Edge Mining Tasks

Edge visualization: It is important to understand whether the edge embeddings are able to separate the communities visually. We use the embeddings of the edges as input in R^K and use t-SNE [17] to plot the edge embeddings in a 2 dimensional space. Fig. 4 shows the edge visualizations by line2vec, along with the baseline algorithms, on the Cora dataset. Note that line2vec is able to visually separate the communities well compared to all the other baselines. The same trend was observed in Fig. 1 for the small synthetic network. Line2vec, being a direct approach for edge embedding via collective homophily, outperforms all the baselines, which aggregate node embeddings to generate the embeddings for the edges.

Edge Clustering: Like node clustering, edge clustering is also important to understand the flow of information within and between the communities. For clustering the embeddings of the edges, we apply the KMeans++ algorithm.
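A hedged sketch of this clustering step with scikit-learn, whose KMeans implementation uses k-means++ seeding; the embedding matrix below is a random stand-in with Cora-like dimensions (5278 edges, K = 128, 7 labels), not actual line2vec output.

```python
import numpy as np
from sklearn.cluster import KMeans

# Stand-in for the learned edge embeddings: one K-dimensional row per edge.
edge_embeddings = np.random.default_rng(0).normal(size=(5278, 128))

# scikit-learn's KMeans uses the k-means++ seeding strategy by default.
kmeans = KMeans(n_clusters=7, init="k-means++", n_init=10, random_state=42)
cluster_labels = kmeans.fit_predict(edge_embeddings)
```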
To evaluate the quality of clustering, we use unsupervised clustering accuracy [34], which considers different permutations of the labels and chooses the assignment which gives the best possible accuracy. Figure 6a shows that line2vec outperforms all the baselines for edge clustering on all the datasets. DeepWalk and node2vec also perform well among the baselines.

Multi-class Edge Classification: We use only 10% of the edges with ground truth labels (as generated in Section 5.2) as the training set, because getting labels is expensive in networks. A logistic regression classifier is trained on the edge embeddings generated by the different algorithms. The performance on the test set is reported using the Micro F1 score. Figure 6b shows that line2vec is better than or highly competitive with the state-of-the-art embedding algorithms. node2vec and DeepWalk follow line2vec closely. On the Dolphin dataset, node2vec outperforms line2vec marginally. The performance of line2vec for edge classification again shows the superiority of a direct edge embedding scheme over the node aggregation approaches.

Figure 4: Edge visualization on the Cora dataset for (a) DeepWalk, (b) node2vec, (c) SDNE, (d) struc2vec, (e) GraphSAGE and (f) line2vec. Different colors represent different edge communities.

Figure 5: Both the spherical error $\sum_{u \in V} \sum_{v \in N(u)} g(\|x_{uv} - c_u\|_2^2 - R_u^2)$ and the non-negative error $\sum_{u \in V} g(-R_u)$ in the penalty function based optimization of line2vec converge to zero very fast on all the datasets.

5.5 Ablation Study of line2vec

The idea of line2vec is to embed the line graph to generate the edge embeddings of a given network. There are two main novel components in line2vec: first, the construction of the weighted line graph; and second, more importantly, the concept of collective homophily on the weighted line graph. In this subsection, we show the incremental benefit of each component through a small edge visualization experiment on the Dolphin dataset, as shown in Fig. 7. We use node2vec (N2V) as the starting point because the skip-gram objective component of line2vec (L2V) is similar to node2vec. Though visually there is not much difference between Sub-figures 7a and 7b, there is some improvement when we apply node2vec on the weighted line graph (without using collective homophily) in Sub-fig. 7c. Finally, the superiority of line2vec due to using collective homophily on the weighted line graph is clear from Sub-fig. 7d. Thus, both the novel components of line2vec have their incremental benefits for the overall algorithm.

5.6 Parameter Sensitivity of line2vec

Figure 8 shows the sensitivity of line2vec with respect to the hyper-parameter α (in Eq. 3) on the Karate and Dolphin datasets. We show the variation of performance for classification (both micro and macro F1 scores) and clustering (unsupervised accuracy). From the figure, one can observe that optimal performance in most of the cases is obtained when the value of α is between 0.05 and 0.1. Around these values, the losses from both components of line2vec in Eq. 3 are close to each other. For our other experiments, we fix α = 0.1 for all the datasets.

5.7 Interpretation of c_u as Node Embedding

line2vec is dedicated to direct edge embedding in information networks. Lemma 1 suggests that each node in the given network G induces a clique in the line graph L(G).
Based on the concept of collective homophily, corresponding to a node u in G, the clique in the line graph is enclosed by a sphere centered at c_u ∈ R^K (Eq. 3). Intuitively, the center acts as a point which is close to the embeddings of all the nodes in the clique induced by u (or equivalently, of all the edges incident on u in G). Hence the role of this center in the embedding space is similar to the role of the node relative to its adjacent edges in the graph. This motivates us to consider c_u as the node embedding of u ∈ V in G. If (u, v) ∈ E, then the edge embedding of (u, v) should be close to both c_u and c_v, which in turn pulls c_u and c_v close to each other. Thus, node proximities are also captured in c_u.

We use clustering of the nodes (a.k.a. community detection) of the given network to validate the quality of the node embeddings obtained from the centers of the line2vec optimization. We use k-means++ clustering, as before, on the set of points c_u, ∀u ∈ V, and validate the clustering quality using unsupervised accuracy [4]. Figure 6c shows that line2vec, though designed specifically for edge embedding, performs really well on a node based mining task. Specifically, for the Karate and Dolphins networks, the gain is significantly more than the best of the baselines. This result is interesting: we aimed to find edge embeddings, but also generated a set of efficient node embeddings, which are not just the aggregation of the incident edges.

5.8 Connection of Node Centrality with R_u

This subsection analyzes the interpretation of the radius R_u of the sphere enclosing the clique induced by node u ∈ V in the embedding space. When a node u has few incident edges, and the neighbors are very close to each other in the embedding space (e.g., they are all from the same sub-community), a small radius R_u is enough to enclose all the edges incident on u. But when the neighbors of the node u are diverse in nature, the corresponding edges also differ in strength and semantics. For example, an influential researcher may be directly connected to many other researchers in a research network, but only a few of them can be direct collaborators. Hence a larger sphere is needed to enclose the clique in the line graph induced by such a node. This intuition connects the radius R_u of a sphere in the embedding space of the line graph to the centrality [27] of the node u in the given network. A node which is loosely connected (i.e., has few or very similar neighbors) in the network is less central, and a node which is strongly connected (many or a diverse set of neighbors) is considered highly central.

As real life networks are noisy [4], we first experiment with a small synthetic graph, shown in Figure 9, to demonstrate the connection between R_u and the centrality of the node u ∈ V. It has three communities, and there is a central (red colored) node connecting all the communities. Each community has three sub-communities, which are connected via the green colored nodes. The degree of each node in this network is kept roughly the same. We use closeness centrality [21], which is widely used in the network analysis literature. The closeness centrality of the nodes is plotted in Fig. 9(b). The nodes on the y-axis are sorted by their closeness centrality values and, as expected, the red node tops the list, as it is well connected to all the communities, followed by the green nodes, with the yellow nodes placed at the bottom.
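The centrality side of this experiment, and the correlations reported next, can be sketched with networkx and scipy; the graph and radii below are placeholders, since the actual R_u values come from running line2vec.

```python
import networkx as nx
import numpy as np
from scipy.stats import pearsonr

G = nx.karate_club_graph()  # stand-in for the synthetic graph
nodes = list(G.nodes())

closeness = nx.closeness_centrality(G)
betweenness = nx.betweenness_centrality(G)

# Placeholder radii; in the actual experiment these are the R_u values
# learned by line2vec for each node u.
rng = np.random.default_rng(0)
radii = {u: rng.random() for u in nodes}

r_vec = [radii[u] for u in nodes]
print(pearsonr(r_vec, [closeness[u] for u in nodes]))    # (r, p-value)
print(pearsonr(r_vec, [betweenness[u] for u in nodes]))  # (r, p-value)
```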
We run line2vec on this synthetic graph and plot the radius R_u for each node u in Fig. 9(c). Here also, the nodes are sorted in the same order as in sub-figure 9(b). As one can see, the red node has the highest value of the radius. As this node is connected to a diverse set of nodes in the network, it needs a larger sphere to enclose the induced clique in the line graph. We also observe that most of the green nodes have higher values of R_u than the yellow nodes. The correlation coefficient between closeness centrality and the radius R_u is 0.56. A more prominent trend can be observed for betweenness centrality [27], where the correlation coefficient with the radius R_u is 0.86.

On all the real-world datasets, we show the correlation of R_u with the two centrality metrics for all the nodes in Table 2. The high positive correlation between them suggests that the radius R_u of a node is roughly proportional to the centrality of the node u in the network. However, a detailed analysis is required to see the scope of introducing a new type of node centrality based on the values of R_u.

Figure 6: Performance Comparisons: (a) Edge Clustering with KMeans++. (b) Micro F1 Score of Edge Classification. (c) Node Clustering with KMeans++, where we use c_u as the embedding of the node u in the given network.

Figure 7: Edge visualization on the Dolphin dataset by t-SNE. Edge embeddings are obtained (a) by using node2vec on the input graph and then taking the average of the end node embeddings for each edge (N2V), (b) by using node2vec on an unweighted (conventional) line graph (N2V+LG), (c) by using node2vec on our proposed weighted line graph (N2V+WLG), and (d) by line2vec (L2V). Clearly, there is an incremental improvement in quality from using the weighted line graph and then collective homophily, as reflected in (c) and (d) respectively.

Figure 8: Sensitivity of line2vec with respect to the hyper-parameter α (in Eq. 3) on the Karate and Dolphin datasets: the variation of performance for edge classification (Micro F1 score) and edge clustering (unsupervised accuracy).

Figure 9: Relationship between the radius R_u associated with each node and closeness centrality in a synthetic graph. (a) shows the structure of the synthetic network. (b) shows the closeness centrality of the nodes, where on the y-axis the nodes are sorted by their centrality values. (c) shows R_u for all the nodes; the nodes on the y-axis of (c) are sorted in the same order as in (b). The colors of the lines in (b) and (c) correspond to the three different types of nodes (colored accordingly) in (a). This figure also shows the high overlap between the top few nodes in both lists.

Table 2: Pearson correlation coefficient (CC) values between the radius R_u and the centrality values of the nodes for the different networks. The centrality measures considered here are betweenness and closeness centrality.

Dataset         Karate  Dolphins  Cora  Pubmed  MSA
Betweenness CC    0.81      0.66  0.29    0.26  0.35
Closeness CC      0.68      0.78  0.79    0.59  0.72

6 Discussion and Future Work

We proposed a novel unsupervised dedicated edge embedding framework for homogeneous information and social networks. We convert the given network to a weighted line graph and introduce the concept of collective homophily to embed the weighted line graph. Our framework is quite generic.
The skip-gram based component in the objective function of line2vec can easily be replaced with any other approach, such as graph convolution on the weighted line graph. Besides, we also plan to extend this methodology to heterogeneous information networks and knowledge bases. There are several edge centric applications in networks. This work, being the first towards a direct edge embedding, can serve as a basis for solving some of them in the context of network embedding and help to move network representation learning beyond node embedding.

REFERENCES

[1] Sami Abu-El-Haija, Bryan Perozzi, and Rami Al-Rfou, 'Learning edge representations via low-rank asymmetric projections', in Proceedings of the 2017 ACM on Conference on Information and Knowledge Management, pp. 1787–1796. ACM, (2017).
[2] Lada A Adamic and Eytan Adar, 'Friends and neighbors on the web', Social Networks, 25(3), 211–230, (2003).
[3] Sambaran Bandyopadhyay, Harsh Kara, Aswin Kannan, and M Narasimha Murty, 'FSCNMF: Fusing structure and content via non-negative matrix factorization for embedding information networks', arXiv preprint arXiv:1804.05313, (2018).
[4] Sambaran Bandyopadhyay, N Lokesh, and M Narasimha Murty, 'Outlier aware network embedding for attributed networks', in Proceedings of the AAAI Conference on Artificial Intelligence, volume 33, pp. 12–19, (2019).
[5] Antoine Bordes, Nicolas Usunier, Alberto Garcia-Duran, Jason Weston, and Oksana Yakhnenko, 'Translating embeddings for modeling multi-relational data', in Advances in Neural Information Processing Systems, pp. 2787–2795, (2013).
[6] Kurt Bryan and Yosi Shibberu, 'Penalty functions and constrained optimization', Dept. of Mathematics, Rose-Hulman Institute of Technology, http://www.rosehulman.edu/~bryan/lottamath/penalty.pdf, (2005).
[7] Muhao Chen and Chris Quirk, 'Embedding edge-attributed relational hierarchies', SIGIR, (2019).
[8] Tim S Evans and Renaud Lambiotte, 'Line graphs of weighted networks for overlapping communities', The European Physical Journal B, 77(2), 265–272, (2010).
[9] Hongchang Gao and Heng Huang, 'Deep attributed network embedding', in IJCAI, volume 18, pp. 3364–3370, (2018).
[10] Zheng Gao, Gang Fu, Chunping Ouyang, Satoshi Tsutsui, Xiaozhong Liu, Jeremy Yang, Christopher Gessner, Brian Foote, David Wild, Ying Ding, et al., 'edge2vec: Representation learning using edge semantics for biomedical knowledge discovery', BMC Bioinformatics, 20(1), 306, (2019).
[11] Mohammad Golam Sohrab, Toru Nakata, Makoto Miwa, and Yutaka Sasaki, 'Edge2vec: Edge representations for large-scale scalable hierarchical learning', Computación y Sistemas, 21(4), 569–579, (2017).
[12] Aditya Grover and Jure Leskovec, 'node2vec: Scalable feature learning for networks', in Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, pp. 855–864. ACM, (2016).
[13] Will Hamilton, Zhitao Ying, and Jure Leskovec, 'Inductive representation learning on large graphs', in Advances in Neural Information Processing Systems, pp. 1025–1035, (2017).
[14] Thomas N Kipf and Max Welling, 'Semi-supervised classification with graph convolutional networks', arXiv preprint arXiv:1609.02907, (2016).
[15] Suxue Li, Haixia Zhang, Dalei Wu, Chuanting Zhang, and Dongfeng Yuan, 'Edge representation learning for community detection in large scale information networks', in International Workshop on Mobility Analytics for Spatio-temporal and Social Data, pp. 54–72. Springer, (2017).
[16] David Liben-Nowell and Jon Kleinberg, 'The link-prediction problem for social networks', Journal of the American Society for Information Science and Technology, 58(7), 1019–1031, (2007).
[17] Laurens van der Maaten and Geoffrey Hinton, 'Visualizing data using t-SNE', Journal of Machine Learning Research, 9(Nov), 2579–2605, (2008).
[18] Miller McPherson, Lynn Smith-Lovin, and James M Cook, 'Birds of a feather: Homophily in social networks', Annual Review of Sociology, 27(1), 415–444, (2001).
[19] Federico Monti, Oleksandr Shchur, Aleksandar Bojchevski, Or Litany, Stephan Günnemann, and Michael M Bronstein, 'Dual-primal graph convolutional networks', arXiv preprint arXiv:1806.00770, (2018).
[20] M.E.J. Newman, Networks: An Introduction, Oxford University Press, Oxford, UK, 2010.
[21] Tore Opsahl, Filip Agneessens, and John Skvoretz, 'Node centrality in weighted networks: Generalizing degree and shortest paths', Social Networks, 32(3), 245–251, (2010).
[22] Bryan Perozzi, Rami Al-Rfou, and Steven Skiena, 'DeepWalk: Online learning of social representations', in Proceedings of the 20th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, pp. 701–710. ACM, (2014).
[23] Leonardo FR Ribeiro, Pedro HP Saverese, and Daniel R Figueiredo, 'struc2vec: Learning node representations from structural identity', in Proceedings of the 23rd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, pp. 385–394. ACM, (2017).
[24] E. Rogers, Diffusion of Innovations, Free Press, New York, USA, 1995.
[25] Yu Shi, Qi Zhu, Fang Guo, Chao Zhang, and Jiawei Han, 'Easing embedding learning by comprehensive transcription of heterogeneous information networks', in Proceedings of the 24th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining, pp. 2190–2199. ACM, (2018).
[26] Arnab Sinha, Zhihong Shen, Yang Song, Hao Ma, Darrin Eide, Bo-June Paul Hsu, and Kuansan Wang, 'An overview of Microsoft Academic Service (MAS) and applications', in Proceedings of the 24th International Conference on World Wide Web, pp. 243–246. ACM, (2015).
[27] Oskar Skibski, Talal Rahwan, Tomasz P Michalak, and Makoto Yokoo, 'Attachment centrality: An axiomatic approach to connectivity in networks', in Proceedings of the 2016 International Conference on Autonomous Agents & Multiagent Systems, pp. 168–176. International Foundation for Autonomous Agents and Multiagent Systems, (2016).
[28] Petar Veličković, Guillem Cucurull, Arantxa Casanova, Adriana Romero, Pietro Liò, and Yoshua Bengio, 'Graph attention networks', in International Conference on Learning Representations, (2018).
[29] Petar Veličković, William Fedus, William L Hamilton, Pietro Liò, Yoshua Bengio, and R Devon Hjelm, 'Deep graph infomax', in International Conference on Learning Representations, (2019).
[30] Janu Verma, Srishti Gupta, Debdoot Mukherjee, and Tanmoy Chakraborty, 'Heterogeneous edge embedding for friend recommendation', in European Conference on Information Retrieval, pp. 172–179. Springer, (2019).
[31] Daixin Wang, Peng Cui, and Wenwu Zhu, 'Structural deep network embedding', in Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, pp. 1225–1234. ACM, (2016).
[32] H. Whitney, 'Congruent graphs and the connectivity of graphs', American Journal of Mathematics, 54(1), 150–168, (1932).
[33] Zonghan Wu, Shirui Pan, Fengwen Chen, Guodong Long, Chengqi Zhang, and Philip S Yu, 'A comprehensive survey on graph neural networks', arXiv preprint arXiv:1901.00596, (2019).
[34] Junyuan Xie, Ross Girshick, and Ali Farhadi, 'Unsupervised deep embedding for clustering analysis', in International Conference on Machine Learning, pp. 478–487, (2016).
[35] Cheng Yang, Zhiyuan Liu, Deli Zhao, Maosong Sun, and Edward Y Chang, 'Network representation learning with rich text information', in IJCAI, pp. 2111–2117, (2015).
[36] Yang Zhou, Sixing Wu, Chao Jiang, Zijie Zhang, Dejing Dou, Ruoming Jin, and Pengwei Wang, 'Density-adaptive local edge representation learning with generative adversarial network multi-label edge classification', in 2018 IEEE International Conference on Data Mining (ICDM), pp. 1464–1469. IEEE, (2018).

bielak-attre2vec-2021 ---- AttrE2vec: Unsupervised Attributed Edge Representation Learning
AttrE2vec: Unsupervised Attributed Edge Representation Learning

Piotr Bielak (a), Tomasz Kajdanowicz (a), Nitesh V. Chawla (a, b)
(a) Department of Computational Intelligence, Wroclaw University of Science and Technology, Poland
(b) Department of Computer Science and Engineering, University of Notre Dame, Notre Dame, IN, USA

Abstract

Representation learning has overcome the often arduous and manual featurization of networks through (unsupervised) feature learning, as it results in embeddings that can apply to a variety of downstream learning tasks. Representation learning on graphs has focused mainly on shallow (node-centric) or deep (graph-based) learning approaches. While there have been approaches that work on homogeneous and heterogeneous networks with multi-typed nodes and edges, there is a gap in learning edge representations. This paper proposes a novel unsupervised inductive method called AttrE2vec, which learns a low-dimensional vector representation for edges in attributed networks. It systematically captures the topological proximity, attribute affinity, and feature similarity of edges. Contrary to current advances in edge embedding research, our proposal extends the body of methods providing representations for edges, capturing graph attributes in an inductive and unsupervised manner. Experimental results show that, compared to contemporary approaches, our method builds more powerful edge vector representations, reflected by higher quality measures (AUC, accuracy) in downstream tasks such as edge classification and edge clustering. This is also confirmed by analyzing low-dimensional embedding projections.

Keywords: representation learning, graphs, edge embedding, random walk, neural network, attributed graph

1. Introduction

Complex networks, including attributed and heterogeneous networks, are ubiquitous: from recommender systems to citation networks and biological systems [1]. These networks present a multitude of machine learning problem statements, including node classification, link prediction, and community detection. A fundamental aspect of any such machine learning (ML) task, transductive or inductive, is the availability of featurized data. Traditionally, researchers have identified several network characteristics suited to specific ML tasks and used them for the learning algorithm. This practice is arduous, as it often entails customizing features to each specific ML task, and it is also limited to characteristics that can be computed.
This has led to a surge in (unsupervised) algorithms and methods that learn embeddings from networks, such that these embeddings form the featurized representation of the network for the ML tasks [2, 3, 4, 5, 6]. This area of research is generally referred to as representation learning on networks. The embeddings generated by representation learning methods are typically agnostic to the end use case, as they are generated in an unsupervised fashion.

Figure 1: Our proposed AttrE2vec model compared to other methods in the task of attributed graph embedding. Colors denote edge features. On the left is a graph whose features are aligned with substructures of the graph. On the right, the features were shuffled (ca. 50%). Traditional approaches fail to build robust representations, whereas our method includes feature information to construct the embedding vectors.

Traditionally, the focus was on representation learning on homogeneous networks, i.e., networks that have a single type of node and edge and do not have attributes attached to nodes or edges [4]. Existing representation learning models mainly focus on transductive learning, where a model can only be trained using the entire input graph. This means that the model requires all the nodes and a fixed structure of the network in the training phase, e.g., Node2vec [7], DeepWalk [8] and, to some extent, GCN [9]. Besides, there have been methods focused on heterogeneous networks that incorporate differently typed nodes and edges in a network, as well as content at each node [10, 11].

On the other hand, a less explored and exploited approach is the inductive setting. In this approach, only a part of the network is used to train the model, which then infers embeddings for new nodes. Several attempts have been made in the inductive setting, including EP-B [12], GraphSAGE [13], GAT [14], SDNE [15], TADW [16], AHNG [17] and PVECB [18]. There is also recent progress on heterogeneous graph embedding, e.g., MIFHNE [19] or models based on graph neural networks [20].

State-of-the-art network embedding techniques are mostly unsupervised, i.e., they aim at learning low-dimensional representations that preserve the structure of an input graph, e.g., GraphSAGE [13], DANE [21], line2vec [22], RCAN [23]. Nevertheless, semi-supervised or supervised methods can learn vector representations for a specific downstream prediction task, e.g., TADW [16] or FSCNMF [24]. It has hence been shown in the literature that not much supervision is required to learn the embeddings.

Models proposed in recent years mainly focus on graphs that do not contain attributes related to nodes and edges [4]. This is especially noticeable for edge attributes: the majority of proposed approaches consider node attributes only, omitting the richness of the edge feature space while learning the representation. Nevertheless, models such as DANE [21], GraphSAGE [13], SDNE [15] and CAGE [25] make use of node features, and EGNN [26], NEWEE [27] and EGAT [28] consume edge attributes.

Table 1: Comparison of the most representative graph embedding methods with their abilities to learn the representation, with or without attributes, their reasoning types, and short characteristics. The most prominent and appropriate methods selected for comparison with AttrE2vec in the experiments are marked with bold text.
The capability marks per row cover Representation (Nodes, Edges), Attributed (Nodes, Edges) and Reasoning (Transductive, Inductive), followed by the method family:

Supervised methods:
  ECN [29] (2016)            X X            neigh. aggr.
  GCN [9] (2017)             X X X X        GCN/GNN
  ECC [30] (2017)            X X X          GCN, DL
  FSCNMF [24] (2018)         X X X          GCN
  GAT [14] (2018)            X X X X        AE, DL
  Planetoid [31] (2018)      X X X X        GNN
  EGNN [26] (2019)           X X X X X X    GNN
  EdgeConv [32] (2019)       X X            GNN
  EGAT [28] (2019)           X X X X X X    GNN
  Attribute2vec [33] (2020)  X X X          GCN
Unsupervised methods:
  DeepWalk [8] (2014)        X X            RW, skip-gram
  TADW [16] (2015)           X X X          RW, MF
  LINE [34] (2015)           X X            RW, skip-gram
  Node2vec [7] (2016)        X X            RW, skip-gram
  SDNE [15] (2016)           X X X X        AE
  GraphSAGE [13] (2017)      X X X X        RW
  EP-B [12] (2017)           X X X X        AE
  Struc2vec [35] (2017)      X X            RW, skip-gram
  DANE [21] (2018)           X X X X        AE
  Line2vec [22] (2019)       X X            RW, skip-gram
  NEWEE [27] (2019)          X X X X        RW, skip-gram
  AttrE2vec (2020)           X X X X X      RW, AE, DL

Both node-based embedding methods and methods inspired by graph neural networks fail to generalize effectively to both transductive and inductive settings, especially when there are attributes associated with edges. This work is motivated by the idea of unsupervised learning on networks with attributed edges, such that the embeddings are generalizable across tasks and are inductive.

To that end, we develop AttrE2vec, a novel unsupervised learning model that adapts an auto-encoder and a self-attention network with the use of feature reconstruction and a graph structural loss. To learn an edge representation, AttrE2vec splits the edge neighborhood into two parts, one for each end node of the edge, and then generates random edge walks in both neighborhoods. All walks are then aggregated over the node and edge attributes using one of the proposed strategies (Avg, Exp, GRU, ConcatGRU). These aggregates are combined with the original node and edge features and fed to an attention and dense layer to encode the edge. The embeddings are subsequently inferred via a two-part loss function covering both feature reconstruction and graph structure. As a consequence, AttrE2vec can explicitly incorporate feature information from nodes and edges many hops away to effectively produce plausible edge embeddings in the inductive setting.

In summary, our main contributions are as follows:

• we propose a novel unsupervised AttrE2vec method, which learns a low-dimensional vector representation for attributed edges;
• we exploit the concept of graph-topology-driven edge feature aggregation, from simple schemes to learnable GRU-based ones, which captures edge topological proximity and the similarity of edge features;
• the proposed method is inductive and can produce representations for edges not present in the training phase;
• we conduct various experiments and show that our AttrE2vec method has superior performance over all baseline methods on edge classification and clustering tasks.

2. Related work and Research Gap

Embedding information networks has received significant interest from the research community. We refer the reader to the survey articles for a comprehensive overview of network embedding [4, 5, 3, 2] and cite only some of the most prominent relevant works. Unsupervised network embedding methods use only the network structure or the original attributes of nodes and edges to construct embeddings. The most common method is DeepWalk [8], which, in two phases, constructs node neighborhoods by performing fixed-length random walks and then employs the skip-gram [7] model to preserve the co-occurrences between nodes and their neighbors.
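As a concrete illustration of this two-phase recipe, here is a minimal sketch assuming networkx for the graph and gensim for the skip-gram model; the function name and parameter values are illustrative, not taken from any of the cited implementations:

```python
import random
import networkx as nx
from gensim.models import Word2Vec

def random_walks(G, num_walks=10, walk_len=20):
    """Phase 1: build node 'sentences' via uniform fixed-length random walks."""
    walks = []
    for _ in range(num_walks):
        for start in G.nodes():
            walk, node = [str(start)], start
            for _ in range(walk_len - 1):
                node = random.choice(list(G.neighbors(node)))
                walk.append(str(node))
            walks.append(walk)
    return walks

G = nx.karate_club_graph()
# Phase 2: skip-gram (sg=1) over the walks, treating node ids as words.
model = Word2Vec(random_walks(G), vector_size=64, window=5, sg=1, min_count=0)
print(model.wv["0"].shape)  # (64,) embedding of node 0
```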
This two-phase framework later inspired further network embedding methods that propose different strategies for constructing node neighborhoods or modeling co-occurrences between nodes, e.g., node2vec [7], Struc2vec [35], GraphSAGE [13], line2vec [22] and NEWEE [27]. Another group of unsupervised methods utilizes auto-encoders or graph neural networks to obtain embeddings. SDNE [15] uses an auto-encoder architecture to preserve first- and second-order proximities by jointly optimizing the loss in neighborhood reconstruction. Other auto-encoder-based representatives are EP-B [12] and DANE [21].

Supervised network embedding methods are constructed as end-to-end methods for particular tasks like node classification or link prediction. These methods require the network structure, the attributes of nodes and edges (if the method is capable of using them), and some annotated target like a node class. Representatives are ECN [29], ECC [30], FSCNMF [24], GAT [14], Planetoid [31], EGNN [26], GCN [9], EdgeConv [32], EGAT [28] and Attribute2vec [33].

Edge representation learning has already been tackled by several methods, i.e., ECN [29], EGNN [26], line2vec [22], EdgeConv [32] and EGAT [28]. However, none of these methods is able to directly take edge attributes into account while also performing the learning in an unsupervised manner. The characteristics of the representative node and edge representation learning methods are summarized in Table 1.

3. Method

3.1. Motivation

In the following paragraphs, we explain our three-fold motivation for proposing AttrE2vec.

Edge embeddings. For a decade, network processing approaches have gathered more and more attention, as graph data is produced in an increasing number of systems. Network embedding has traditionally provided the notion of vectorizing nodes, which is used in node classification or clustering. The problem of edge representation learning, however, has not received enough attention and has usually been handled through node embedding transformation [36]. Such an approach is problematic: for instance, inferring an edge type from the embeddings of neighboring nodes may not be the best choice for edge type classification in heterogeneous social networks. We claim that efficient edge clustering, edge attribute regression, or link prediction tasks require dedicated and specific edge representations. We expect that a representation learning approach devoted strictly to edges provides more powerful vector representations than traditional methods that require node embeddings trained upfront and then transform them to represent edges.

Inductive embedding methods. A vast majority of contemporary network representation learning methods are transductive (see Table 1). This means that any change to the graph requires full retraining of the method to provide predictions for unseen cases. Such a property limits the applicability of these methods due to high computational costs. By contrast, the inductive approach builds a predictive ability that can be applied to unseen cases without retraining; in general, inductive methods have a lower computation cost. Considering these advantages, we expect modern edge embedding methods to be inductive.

Encoding graph attributes in embeddings. Much real-world data exhibits rich attribute sets or metadata that contain crucial information, e.g., about the similarity of nodes or edges. Traditionally, graph representation learning has focused on exploiting the network structure, omitting the related content.
Thus, we may expect attributes to act as a regularizer over the structure. This would allow overcoming the limitation that arises when the only edge-discriminating ability is encoded in the edges' attributes rather than in the graph's structure; relying only on the network would then produce inconclusive embeddings.

3.2. Attributed graph edge embedding

We denote an attributed graph as G = (V, E), where V is a set of nodes and E = {(u, v) ∈ V × V} is a set of edges. Every node u and every edge e = (u, v) has associated features: m_u ∈ R^{d_V} and f_uv ∈ R^{d_E}, where M ∈ R^{|V| × d_V} and F ∈ R^{|E| × d_E} are the node and edge feature matrices, respectively. By d_V we denote the dimensionality of the node feature space and by d_E the dimensionality of the edge feature space. The edge embedding task is defined as learning a function g: E → R^d, which takes an edge and outputs its low-dimensional vector representation. Note that the embedding dimension d should be much smaller than the original edge feature dimensionality d_E, i.e., d << d_E. More specifically, we aim at using the topological structure of the graph together with the node and edge attributes: f: (E, F, M) → R^d.

Figure 2: Overview of the AttrE2vec model. The model first computes edge random walks on the two neighborhoods of a given edge (u, v). The walks of each neighborhood are aggregated into S_u and S_v. Both are combined with the edge features f_uv using an Encoder module, which results in the edge embedding vector h_uv. The loss function consists of two parts: a structural loss (L_cos) and a feature reconstruction loss (L_MSE).

3.3. AttrE2vec

In contrast to traditional node embedding methods, we shift the focus from nodes to edges and consider the graph from an edge perspective. Given any edge e = (u, v), we can observe three natural sources of knowledge: the edge attributes themselves and the two neighborhoods, N_u and N_v, located behind nodes u and v, respectively. In AttrE2vec, we exploit all three sources jointly.

First, we obtain aggregations (summaries) S_u and S_v of the two neighborhoods N_u and N_v. We want to capture the topological structure of the neighborhood, so we perform k edge random walks of length L, which start from node u (or v, respectively) and use a uniformly distributed neighbor sampling approach (DeepWalk-like) to obtain the next edge. Each i-th walk w^i_u started from node u is hence a sequence of edges:

  RW(G, k, L, u) → {w^1_u, w^2_u, ..., w^k_u}
  w^i_u ≡ (u, u_2), (u_3, u_4), ..., (u_{L-1}, u_L)

Next, we take the attributes of the edges (and nodes, if applicable) in each random walk and aggregate them into a single vector using the walk aggregation model Agg_w:

  S^i_u = Agg_w(w^i_u, F, M)

Later, the aggregated walks are combined using the neighborhood aggregation model Agg_n, which summarizes the neighborhood S_u (and S_v, respectively). The proposed implementations of these aggregations are given in Section 3.4.

  S_u = Agg_n({S^1_u, S^2_u, ..., S^k_u})

Finally, we obtain the low-dimensional edge embedding h_uv using an encoder module Enc. It combines the edge attributes f_uv with the summarized neighborhood information S_u and S_v. We employ a simple Multilayer Perceptron (MLP) with 3 inputs (each of size equal to the edge feature dimensionality) and an attention mechanism over these inputs, to check how much of the information from each input is used to create the embedding vector (see Figure 3):

  h_uv = Enc(f_uv, S_u, S_v)

Figure 3: Encoder module architecture.

The overall method is illustrated in Figure 2, and the inference algorithm is shown in Algorithm 1.
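A minimal sketch of the uniform edge random walk RW(G, k, L, u) described above, assuming networkx; the code is illustrative rather than the authors' implementation:

```python
import random
import networkx as nx

def RW(G, k, L, u):
    """Run k uniform edge random walks of length L starting at node u.
    Each walk is a list of edges; the next node is sampled uniformly from
    the current node's neighbors (DeepWalk-like sampling)."""
    walks = []
    for _ in range(k):
        walk, node = [], u
        for _ in range(L):
            nxt = random.choice(list(G.neighbors(node)))
            walk.append((node, nxt))
            node = nxt
        walks.append(walk)
    return walks

G = nx.karate_club_graph()
print(RW(G, k=2, L=4, u=0))  # two walks, each a list of 4 edges
```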
3.4. Aggregation models

For the neighborhood aggregation model Agg_n, we use an average over the vectors S^i_u, as there is no particular ordering of these vectors (each one was generated by an equally important random walk). For the walk aggregation model Agg_w, we propose the following variants:

Algorithm 1: AttrE2vec inference algorithm
  Data: graph G, edge list x_e, edge features F, node features M
  Params: number of random walks per node k, random walk length L
  Result: edge embedding vectors h_uv
  begin
    foreach (u, v) in x_e do
      foreach i in (1 ... k) do
        w^i_u = RW(G, L, u);  S^i_u = Agg_w(w^i_u, F, M)
        w^i_v = RW(G, L, v);  S^i_v = Agg_w(w^i_v, F, M)
      end
      S_u = Agg_n({S^1_u, ..., S^k_u})
      S_v = Agg_n({S^1_v, ..., S^k_v})
      h_uv = Enc(f_uv, S_u, S_v)
    end
  end

• average – computes a simple average of the edge attribute vectors in the random walk:

  S^i_u = (1/L) Σ_{n=1}^{L} f_{u_n u_{n+1}}

• exponential – computes a weighted average, where the weights are exponentials of the negated position in the random walk, so that edges further away are less important than near ones:

  S^i_u = (1/L) Σ_{n=1}^{L} e^{-n} f_{u_n u_{n+1}}

• GRU – uses a Gated Recurrent Unit [37] architecture, where the hidden and input dimensions are equal to the edge attribute dimension; the aggregated representation is the output of the last hidden vector; the aggregation starts at the end of the random walk and proceeds to the beginning:

  S^i_u = GRU({f_{u_n u_{n+1}}, f_{u_{n-1} u_n}, ..., f_{u_1 u_2}})

• ConcatGRU – similar to the GRU-based aggregator, but it also uses the node feature information by concatenating the node attributes with the edge attributes; hence the GRU input size is equal to the sum of the edge and node feature dimensions; if no node features are available, one could use network-specific features, like degree or betweenness, or more advanced techniques like Node2vec; the hidden dimension size and the aggregation direction are unchanged:

  S^i_u = ConcatGRU({f_{u_n u_{n+1}} ⊕ m_{u_n}, ..., f_{u_1 u_2} ⊕ m_{u_1}})

3.5. Learning AttrE2vec's parameters

AttrE2vec is designed to make the most use of edge attributes and of information about the structure of the network. Therefore, we propose a loss function that consists of two main parts:

• a structural loss L_cos, which computes a cosine embedding loss; this function tries to minimize the cosine distance between a given embedding h and the embeddings of edges sampled from the random walks h+ (positives), and simultaneously to maximize the cosine distance between h and the embeddings of edges sampled from the set of all edges in the graph h- (negatives), except for those in the random walks:

  L_cos = (1/|B|) Σ_{h_uv ∈ B} [ Σ_{h+_uv} (1 - cos(h_uv, h+_uv)) + Σ_{h-_uv} cos(h_uv, h-_uv) ]

  where B denotes a minibatch of edges and |B| the minibatch size;

• a feature reconstruction loss L_MSE, which computes the mean squared error between the actual edge features and the outputs of a decoder (implemented as a 3-layer MLP, see Figure 4) that reconstructs the edge features from the edge embeddings:

  L_MSE = (1/|B|) Σ_{(h_uv, f_uv) ∈ B} (DEC(h_uv) - f_uv)^2

Figure 4: Decoder module architecture.

We combine the values of the above loss functions using a mixing parameter λ ∈ [0, 1]. The higher the value of this parameter, the more structural information is preserved and the less focus is put on feature reconstruction. The total loss of AttrE2vec is given as:

  L = λ * L_cos + (1 - λ) * L_MSE
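A minimal PyTorch sketch of this two-part objective; the tensor shapes and the sampling layout are assumptions for illustration, not the authors' implementation:

```python
import torch
import torch.nn.functional as F

def attre2vec_loss(h, h_pos, h_neg, f_true, f_rec, lam=0.5):
    """Two-part AttrE2vec-style objective (cf. Section 3.5). Assumed shapes:
    h      (B, d)     minibatch edge embeddings
    h_pos  (B, P, d)  embeddings of positive edges sampled from the walks
    h_neg  (B, N, d)  embeddings of negative edges sampled from the rest
    f_true (B, d_E)   original edge features
    f_rec  (B, d_E)   decoder reconstruction DEC(h)
    """
    pos = 1 - F.cosine_similarity(h.unsqueeze(1).expand_as(h_pos), h_pos, dim=-1)
    neg = F.cosine_similarity(h.unsqueeze(1).expand_as(h_neg), h_neg, dim=-1)
    l_cos = (pos.sum(dim=1) + neg.sum(dim=1)).mean()   # structural loss
    l_mse = F.mse_loss(f_rec, f_true)                  # feature reconstruction loss
    return lam * l_cos + (1 - lam) * l_mse

B, P, N, d, d_E = 4, 5, 10, 64, 260
loss = attre2vec_loss(torch.randn(B, d), torch.randn(B, P, d),
                      torch.randn(B, N, d), torch.randn(B, d_E), torch.randn(B, d_E))
print(loss.item())
```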
4. Experiments

To evaluate the proposed model's performance, we perform three tasks: edge classification, edge clustering, and embedding visualization, on three real-world datasets. We first train our model on a small subset of edges (the inductive setting). Then we use the model to infer embeddings for edges from the test set. Finally, we evaluate them in all downstream tasks: by predicting the class of edges in citation graphs (edge classification), by applying the K-means++ algorithm (edge clustering, as defined in [22]), and by the dimensionality reduction method T-SNE (embedding visualization). We compare our model to several baselines and contemporary methods in all experiments (see Table 1). Finally, we check the influence of AttrE2vec's hyperparameters and perform an ablation study on artificially generated datasets. We implement our model in the popular deep learning framework PyTorch. All experiments were performed on an NVIDIA GTX 1080Ti. Upon acceptance in the journal, we will make our code available at https://github.com/attre2vec/attre2vec and include our DVC [38] pipeline so that all experiments can be easily reproduced.

4.1. Datasets

Table 2: Datasets used in the experiments.

  Name      Initial features  Pre-processed features  Nodes    Edges    Classes  Training instances
            node     edge     node    edge                                       inductive  transductive
  Cora      1 433    0        32      260             2 485    5 069    7+1      160        5 069
  Citeseer  3 703    0        32      260             2 110    3 668    6+1      140        3 668
  Pubmed    500      0        32      260             19 717   44 324   3+1      80         44 324

To make the gathered evaluation evidence comparable, we focus on well-known datasets from the literature: Cora [39], Citeseer [39] and Pubmed [40]. These are citation networks of scientific papers in several research areas, where nodes are the papers and edges denote citations between papers. We summarize basic statistics about the datasets before and after pre-processing in Table 2.

The raw datasets contain node features only, in the form of high-dimensional sparse bags of words. For Cora and Citeseer, these are binary vectors indicating which of the most popular words were used in a given paper; for Pubmed, the features are TF-IDF vectors. To adjust the datasets to our problem setting, we apply the following pre-processing steps to obtain edge-level features, which are used to train and evaluate our AttrE2vec model:

• we create dense vector representations of the nodes' features by applying Doc2vec [41] in the PV-DBOW variant with a target dimension size of 128;
• for each edge (u, v) and its symmetrical version (v, u) (necessary to perform uniform, undirected random walks) we extract the following features:
  – 1 feature: the cosine similarity of the raw node features of u and v (binary BoW; for Pubmed transformed from TF-IDF to binary BoW),
  – 2 features: the ratios of the number of used words (number of ones in the BoW) to all possible words in the document (length of the BoW vector) for each of the papers u and v,
  – 256 features: the concatenation of the Doc2vec features of nodes u and v,
  – 1 feature: a binary indicator that denotes whether this is an original edge (1) or its symmetrical counterpart (0);
• we apply standardization (StandardScaler in Scikit-Learn [42]) to the edge feature matrix.

Moreover, we extract additional node features as 32-dimensional Node2vec embeddings to make evaluation possible for one of our model variants (AttrE2vec with the ConcatGRU aggregator), which generalizes over both edge and node attributes.
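The sketch below illustrates the shape of this feature pipeline with gensim's Doc2Vec (dm=0 selects PV-DBOW) and scikit-learn's StandardScaler on toy data; all inputs and helper names are fabricated for illustration:

```python
import numpy as np
from gensim.models.doc2vec import Doc2Vec, TaggedDocument
from sklearn.preprocessing import StandardScaler

# Toy corpus: one bag-of-words document per node (illustrative data).
docs = [TaggedDocument(words=w, tags=[i])
        for i, w in enumerate([["graph", "edge"], ["node", "graph"], ["cite", "paper"]])]
d2v = Doc2Vec(docs, vector_size=128, dm=0, min_count=1, epochs=20)  # dm=0 -> PV-DBOW

def edge_features(u, v, bow_u, bow_v, is_original):
    cos = bow_u @ bow_v / (np.linalg.norm(bow_u) * np.linalg.norm(bow_v) + 1e-9)
    ratios = [bow_u.mean(), bow_v.mean()]          # share of used words per paper
    return np.concatenate([[cos], ratios, d2v.dv[u], d2v.dv[v], [is_original]])

bow = np.eye(3, 10)                                # fake binary BoW vectors
X = np.stack([edge_features(0, 1, bow[0], bow[1], 1),
              edge_features(1, 2, bow[1], bow[2], 1)])
X = StandardScaler().fit_transform(X)              # standardize the edge feature matrix
print(X.shape)                                     # (2, 260) = 1 + 2 + 256 + 1
```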
The raw datasets provide each node labeled with the research area the paper comes from. To apply this knowledge in the edge classification setting, we use the following rule: if an edge connects two nodes from the same class (research area), the edge receives this class; if the two nodes have different classes, the edge between them is assigned a cross-domain citation class. To ensure a fair comparison, we follow the dataset preparation scheme from EP-B [12], i.e., for each dataset (Cora, Citeseer, Pubmed) we sample 10 train/validation/test splits, where the train set consists of 20 edges per class and the validation and test sets contain 1 000 randomly chosen edges each. When reporting the resulting metrics, we show the mean values over these ten sampled sets (together with the standard deviation).

4.2. Baselines

We compare our method against several baseline methods. In the simplest case, we use the edge features obtained during the pre-processing phase for all datasets (further referred to as Doc2vec). Many standard approaches employ simple transformations of node embeddings to obtain edge embeddings. The authors of Node2vec [36] proposed binary operators like averaging, the Hadamard product, or the L1 and L2 norms of vector differences. Here, we use the following methods to obtain node embeddings: DeepWalk [8], Node2vec [36], SDNE [43] and Struc2vec [35]. In preliminary experiments, we evaluated these methods and found that the Average operator and an embedding size of 64 give the best results. We use these models in two setups: (a) Avg(M, M) – using only the averaged node features; (b) Avg(M, M) ⊕ F – as before, but concatenated with the edge features from the dataset (324-dimensional vectors in total). We also tried computing a 64-dimensional PCA reduction of the concatenated features to obtain vector sizes comparable with the 64-dimensional embedding of our model, but this turned out to perform poorly. Note that SDNE has the capability of inductive reasoning, but due to the unavailability of such an implementation, we evaluate this method in the transductive scheme (which works in favor of the method).

Figure 5: Architecture of the MLP(M, M).
Figure 6: Architecture of the MLP(M, M, F).

We also extend our body of baselines with two more sophisticated, dense autoencoder architectures. In the first setting, MLP(M, M), we train a model (see Figure 5) that reconstructs the concatenated embeddings of the connected nodes. In the second baseline, MLP(M, M, F), the autoencoder (see Figure 6) is extended with the edge attributes. In both settings, we employ the mean squared error as the model loss function. The output of the encoders (the embeddings) is used in the downstream tasks. The input node embeddings are obtained using the methods mentioned above, i.e., DeepWalk, Node2vec, SDNE and Struc2vec. The last baseline is Line2vec [22], which is directly dedicated to edges; we use an embedding size of 64.
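A small numpy sketch of these node-to-edge binary operators and the two Avg-based setups; the random vectors stand in for real node embeddings and edge features:

```python
import numpy as np

def edge_embedding(z_u, z_v, op="average"):
    """Derive an edge vector from the two endpoint node embeddings."""
    ops = {"average":  (z_u + z_v) / 2.0,
           "hadamard": z_u * z_v,
           "l1":       np.abs(z_u - z_v),
           "l2":       (z_u - z_v) ** 2}
    return ops[op]

rng = np.random.default_rng(0)
z_u, z_v = rng.normal(size=64), rng.normal(size=64)   # node embeddings (toy)
f_uv = rng.normal(size=260)                            # pre-processed edge features

avg_mm = edge_embedding(z_u, z_v)                      # Avg(M, M): 64-dim
avg_mm_f = np.concatenate([avg_mm, f_uv])              # Avg(M, M) ⊕ F: 324-dim
print(avg_mm.shape, avg_mm_f.shape)
```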
4.3. Edge classification

To evaluate our model in the inductive setting, we need to make sure that the test edges are unseen during the model training procedure, so we remove them from the graph. Note that all baselines (except for GraphSAGE, see Table 1) require all edges during the training phase (i.e., they are transductive methods). After each training epoch of AttrE2vec, we evaluate the embeddings using an L2-regularized Logistic Regression (LR) classifier and compute the AUC. The regression model is trained on the edge embeddings of the train set and evaluated on the edge embeddings of the validation set. We take the model with the highest AUC value on the validation set.

Table 3: AUC values for edge classification. F denotes the edge attributes (also referred to as "Doc2vec"), M the node attributes (e.g., embeddings computed using Node2vec), ⊕ the concatenation operator, Avg(M, M) the average operator on node embeddings, and MLP(·) the encoder output of an MLP autoencoder trained on the given attributes. AUC in bold shows the highest value and AUC in italic the second highest.

  Method group / name                Vector size   Citeseer        Cora            Pubmed
  Transductive:
    Edge features only; F (Doc2vec)  260           86.13 ± 0.95    88.67 ± 0.51    79.15 ± 1.41
    Line2vec                         64            86.19 ± 0.28    91.75 ± 1.07    84.88 ± 1.19
    Avg(M,M)  DeepWalk               64            58.40 ± 1.08    59.98 ± 1.32    51.04 ± 1.23
    Avg(M,M)  Node2vec               64            58.26 ± 0.89    59.59 ± 1.11    51.03 ± 1.01
    Avg(M,M)  SDNE                   64            54.28 ± 1.57    55.91 ± 1.11    50.00 ± 0.00
    Avg(M,M)  Struc2vec              64            61.29 ± 0.86    61.30 ± 1.58    54.67 ± 1.46
    MLP(M,M)  DeepWalk               64            55.88 ± 1.68    57.87 ± 1.53    51.23 ± 0.77
    MLP(M,M)  Node2vec               64            55.35 ± 2.26    57.44 ± 0.87    51.48 ± 1.55
    MLP(M,M)  SDNE                   64            55.56 ± 0.93    56.02 ± 1.22    50.00 ± 0.00
    MLP(M,M)  Struc2vec              64            59.93 ± 1.43    59.76 ± 1.80    53.27 ± 1.32
    Avg(M,M)⊕F  DeepWalk             324           86.13 ± 0.95    88.67 ± 0.51    79.15 ± 1.41
    Avg(M,M)⊕F  Node2vec             324           86.13 ± 0.95    88.67 ± 0.51    79.15 ± 1.41
    Avg(M,M)⊕F  SDNE                 324           86.14 ± 1.03    88.70 ± 0.51    79.15 ± 1.41
    Avg(M,M)⊕F  Struc2vec            324           86.21 ± 0.97    88.73 ± 0.48    79.24 ± 1.36
    MLP(M,M,F)  DeepWalk             64            84.58 ± 1.11    86.47 ± 0.87    78.60 ± 1.84
    MLP(M,M,F)  Node2vec             64            84.65 ± 1.05    86.71 ± 0.68    78.84 ± 1.71
    MLP(M,M,F)  SDNE                 64            84.32 ± 1.13    85.99 ± 0.77    78.34 ± 1.07
    MLP(M,M,F)  Struc2vec            64            83.95 ± 1.16    85.54 ± 0.96    77.19 ± 1.42
  Inductive:
    Avg(M,M)  GraphSage              64            54.84 ± 1.90    55.16 ± 1.36    51.14 ± 1.64
    MLP(M,M)  GraphSage              64            55.19 ± 1.04    55.47 ± 1.66    50.36 ± 1.54
    Avg(M,M)⊕F  GraphSage            324           86.14 ± 0.95    88.68 ± 0.51    79.16 ± 1.41
    MLP(M,M,F)  GraphSage            64            84.63 ± 1.11    86.14 ± 0.45    78.00 ± 1.85
    AttrE2vec (our)  Avg             64            88.97 ± 0.82    93.43 ± 0.56    87.68 ± 1.25
    AttrE2vec (our)  Exp             64            88.91 ± 1.10    92.80 ± 0.38    86.18 ± 1.41
    AttrE2vec (our)  GRU             64            88.92 ± 1.13    93.06 ± 0.63    86.39 ± 1.21
    AttrE2vec (our)  ConcatGRU       64            88.56 ± 1.34    92.93 ± 0.61    86.34 ± 1.18

Moreover, an early stopping strategy is implemented: if the validation AUC does not improve for more than 15 epochs, learning is terminated. Our approach to model selection is aligned with the scheme proposed in [44], as it is more natural than relying on the loss function. This is repeated for all 10 data splits (see Section 4.1 for details). We report the mean and standard deviation of the AUC over the 10 test sets (see Table 3).

We choose AdamW [45] with a learning rate of 0.001 to optimize our model's parameters. We also set the number of positive samples to |h+| = 5 and the number of negative samples to |h-| = 10 in the cosine embedding loss. The mixing coefficient is set to λ = 0.5, equally weighting the influence of the features and of the topological graph structure. We choose an embedding size of 64 as a reasonable value when dealing with edge features of size 260.
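A minimal sketch of this evaluation step with scikit-learn, using random arrays in place of real embeddings; the one-vs-rest multi-class AUC variant is an assumption, as the paper does not state which averaging it uses:

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score

rng = np.random.default_rng(0)
Z_tr, y_tr = rng.normal(size=(140, 64)), rng.integers(0, 7, 140)    # train embeddings
Z_va, y_va = rng.normal(size=(1000, 64)), rng.integers(0, 7, 1000)  # validation embeddings

clf = LogisticRegression(penalty="l2", C=1.0, max_iter=1000).fit(Z_tr, y_tr)
auc = roc_auc_score(y_va, clf.predict_proba(Z_va), multi_class="ovr")
print(f"validation AUC: {auc:.4f}")  # track per epoch; stop if no gain for 15 epochs
```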
In Table 3, we summarize the AUC values for the baseline methods and for our model. Even though the original dimensionality of the vectors is relatively high (260), good results are already obtained using only the edge features (Doc2vec). However, adding structural information about the graph can improve the results further. Representations from node embedding methods transformed to edge embeddings using the average operator Avg(M, M) achieve poor results of about 50-60% AUC. When combined with the edge features from the datasets, Avg(M, M) ⊕ F, the AUC values increase significantly, to about 86%, 88% and 79% for Citeseer, Cora and Pubmed, respectively. Unfortunately, this comes at the cost of an even higher vector dimensionality (324).

The MLP-based approaches lead to similar conclusions. Using only node embeddings, MLP(M, M), we obtain quite poor results of about 50% (on Pubmed) up to 60% (on Cora). With the MLP(M, M, F) approach, we observe that the edge features improve the classification results. The AUC values are still slightly worse than those of the concatenation operator (Avg(M, M) ⊕ F), but the edge embedding size is reduced to 64.

The Line2vec [22] algorithm achieves very good results without considering edge feature information: about 86%, 92% and 85% AUC for Citeseer, Cora and Pubmed, respectively. These values are higher than for any other baseline approach.

Our model performs best among all evaluated methods. For Citeseer, we gain about 3 percentage points compared to the best baselines: Line2vec, Struc2vec (Avg(M, M) ⊕ F) and GraphSage (Avg(M, M) ⊕ F). Note that the algorithm is trained on only 140 edges in the inductive setting, whereas all transductive baselines require the whole graph for training. The gain on Cora is 2 pp, and on Pubmed we achieve up to 4 pp (and up to 8 pp when compared only to GraphSage (Avg(M, M) ⊕ F)). Our model with the Average (Avg) aggregator works best, and the Gated Recurrent Unit (GRU) aggregator achieves the second-best results.

4.4. Edge clustering

Similarly to Line2vec [22], we apply the K-Means++ algorithm to the resulting embedding vectors and compute the unsupervised clustering accuracy [46]. We summarize the results in Table 4. Our model performs best in all but one case and achieves significantly better results than the other baseline methods. The only exception is the Pubmed dataset, where Line2vec achieves the best clustering accuracy. The other baseline methods perform similarly as in the edge classification task, so we do not discuss them in detail and encourage the reader to go through the full results.
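A sketch of this clustering evaluation: K-Means++ from scikit-learn plus an unsupervised clustering accuracy computed with the Hungarian algorithm (scipy), which is one common reading of the accuracy metric of [46]; the embeddings here are random placeholders:

```python
import numpy as np
from scipy.optimize import linear_sum_assignment
from sklearn.cluster import KMeans

def clustering_accuracy(y_true, y_pred):
    """Unsupervised clustering accuracy: the best one-to-one mapping between
    cluster ids and class labels, found with the Hungarian algorithm."""
    k = int(max(y_true.max(), y_pred.max())) + 1
    cost = np.zeros((k, k), dtype=int)
    for t, p in zip(y_true, y_pred):
        cost[p, t] += 1
    rows, cols = linear_sum_assignment(cost.max() - cost)
    return cost[rows, cols].sum() / len(y_true)

rng = np.random.default_rng(0)
H = rng.normal(size=(1000, 64))        # edge embeddings (random placeholders)
y = rng.integers(0, 7, 1000)           # edge classes, e.g. 6+1 for Citeseer
pred = KMeans(n_clusters=7, n_init=10).fit_predict(H)  # init='k-means++' is default
print(clustering_accuracy(y, pred))
```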
4.5. Embedding visualization

For all tested baseline methods and for our proposed AttrE2vec method, we compute 2-dimensional projections of the produced embeddings using the T-SNE [47] method and visualize them in Figure 7. In our subjective opinion, these plots correspond to the AUC scores reported in Table 3: the higher the AUC, the better the group separation. In detail, the raw edge features (Doc2vec) seem to form groups, but these unfortunately overlap to some degree. We cannot observe any pattern in the node embedding-based settings (Avg(M, M) and MLP(M, M)); they appear quasi-random. When concatenated with the edge attributes (Avg(M, M) ⊕ F and MLP(M, M, F)), we observe slightly better grouping, which is still not satisfying. The AttrE2vec model produces much better-formed groups with only a little overlap. To summarize, based on the observed group separability and the AUC metrics, our approach works best among all methods.

Figure 7: 2-D T-SNE projections of the embedding vectors for all evaluated methods. Columns denote the aggregation approach, besides F, which denotes the edge attributes, and g(E), which is an edge embedding obtained from the graph structure only. Rows correspond to particular methods.

Table 4: Accuracy of edge clustering. F denotes the edge attributes (also referred to as "Doc2vec"), M the node attributes (e.g., embeddings computed using Node2vec), ⊕ the concatenation operator, Avg(M, M) the average operator on node embeddings, and MLP(·) the encoder output of an MLP autoencoder trained on the given attributes. Accuracy in bold shows the highest value and accuracy in italic the second highest.

  Method group / name                Vector size   Citeseer        Cora            Pubmed
  Transductive:
    Edge features only; F (Doc2vec)  260           54.13 ± 2.73    54.64 ± 5.86    46.33 ± 1.53
    Line2vec                         64            54.73 ± 2.56    63.50 ± 1.92    55.26 ± 1.36
    Avg(M,M)  DeepWalk               64            28.89 ± 1.06    21.93 ± 0.86    27.24 ± 0.50
    Avg(M,M)  Node2vec               64            26.82 ± 0.67    21.32 ± 0.62    27.17 ± 0.74
    Avg(M,M)  SDNE                   64            21.01 ± 0.50    17.97 ± 0.47    31.38 ± 0.69
    Avg(M,M)  Struc2vec              64            25.21 ± 1.33    20.15 ± 0.64    32.02 ± 1.49
    MLP(M,M)  DeepWalk               64            26.36 ± 1.37    21.06 ± 0.57    27.40 ± 0.93
    MLP(M,M)  Node2vec               64            26.37 ± 1.64    21.31 ± 0.98    27.67 ± 0.78
    MLP(M,M)  SDNE                   64            22.27 ± 0.76    17.15 ± 0.36    28.44 ± 1.21
    MLP(M,M)  Struc2vec              64            24.22 ± 0.83    19.56 ± 0.49    31.31 ± 1.70
    Avg(M,M)⊕F  DeepWalk             324           54.13 ± 2.73    54.70 ± 5.85    46.33 ± 1.53
    Avg(M,M)⊕F  Node2vec             324           54.13 ± 2.73    54.70 ± 5.85    46.33 ± 1.53
    Avg(M,M)⊕F  SDNE                 324           55.29 ± 2.06    55.43 ± 4.63    46.33 ± 1.53
    Avg(M,M)⊕F  Struc2vec            324           55.59 ± 1.51    52.47 ± 6.52    46.32 ± 1.29
    MLP(M,M,F)  DeepWalk             64            48.74 ± 4.03    47.38 ± 4.72    46.49 ± 1.20
    MLP(M,M,F)  Node2vec             64            50.80 ± 2.30    48.48 ± 3.38    46.15 ± 1.43
    MLP(M,M,F)  SDNE                 64            46.17 ± 3.15    44.87 ± 3.54    45.74 ± 1.89
    MLP(M,M,F)  Struc2vec            64            47.35 ± 3.73    44.38 ± 3.04    45.40 ± 1.72
  Inductive:
    Avg(M,M)  GraphSage              64            18.79 ± 0.62    17.70 ± 1.05    27.04 ± 0.71
    MLP(M,M)  GraphSage              64            18.92 ± 0.98    17.89 ± 0.85    27.09 ± 0.81
    Avg(M,M)⊕F  GraphSage            324           54.06 ± 2.54    54.82 ± 6.86    46.49 ± 1.64
    MLP(M,M,F)  GraphSage            64            48.79 ± 4.04    47.49 ± 5.41    45.15 ± 1.54
    AttrE2vec (our)  Avg             64            59.82 ± 3.30    65.42 ± 1.71    48.86 ± 2.46
    AttrE2vec (our)  Exp             64            59.07 ± 4.65    66.36 ± 3.62    48.02 ± 2.55
    AttrE2vec (our)  GRU             64            60.16 ± 2.25    66.15 ± 3.71    49.41 ± 1.49
    AttrE2vec (our)  ConcatGRU       64            60.71 ± 2.75    66.00 ± 2.21    50.27 ± 3.75

5. Hyperparameter Sensitivity of AttrE2vec

We investigate the effect of the hyperparameters by considering each of them independently, i.e., setting a given parameter and keeping the default values for all other parameters. The evaluation is applied to two inductive variants of our model: with the Average aggregator and with the GRU aggregator. We use all three datasets (Cora, Citeseer, Pubmed) and report the AUC values. We choose the following hyperparameter value sets (values with an asterisk denote the default value of the parameter):

• length of a random walk: L = {4, 8*, 16},
• number of random walks: k = {4, 8, 16*},
• embedding size: d = {16, 32, 64*},
• mixing parameter: λ = {0, 0.25, 0.5*, 0.75, 1}.

Figure 8: Effects of hyperparameters on the Cora, Citeseer and Pubmed datasets.

The results of all experiments are summarized in Figure 8. The trends are similar for both aggregation variants, Avg and GRU, so we discuss them based only on the Average aggregator. In general, the higher the number of random walks k and the length of a single random walk L, the better the results. One could push these parameters even higher, but that significantly increases the random walk computation time and the model training time itself. Unsurprisingly, the embedding size (embedding dimension) follows the same trend: with more dimensions, we can fit more information into the created representations. However, as the goal of an embedding is to find low-dimensional vector representations, we should keep the dimensionality reasonable. Our chosen values (16, 32, 64) seem plausible when working with 260-dimensional edge features.
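Before turning to λ, here is a sketch of such a one-at-a-time sweep harness; train_and_eval_auc is a hypothetical stand-in for training AttrE2vec with a given setting and returning the validation AUC:

```python
defaults = {"L": 8, "k": 16, "d": 64, "lam": 0.5}        # starred defaults above
grid = {"L": [4, 8, 16], "k": [4, 8, 16], "d": [16, 32, 64],
        "lam": [0.0, 0.25, 0.5, 0.75, 1.0]}

def train_and_eval_auc(params):
    # placeholder: plug in the actual training/evaluation pipeline here
    return 0.0

results = {}
for name, values in grid.items():        # vary one parameter at a time,
    for value in values:                 # keeping all others at their defaults
        params = {**defaults, name: value}
        results[(name, value)] = train_and_eval_auc(params)
```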
As for the loss mixing parameter λ, we observe that too high values negatively influence the model's performance. The greater the value, the more important the structural loss becomes, while the feature reconstruction loss simultaneously becomes less relevant. Choosing λ = 0 causes the loss function to consider feature reconstruction only and to completely ignore the embedding loss; this yields significantly worse results and confirms that our approach of combining the feature reconstruction and structural embedding losses is justified. In general, the best values are achieved when both loss factors have equal influence (λ = 0.5).

6. Ablation study

We perform an ablation study to check whether our method AttrE2vec is robust to noise introduced into an artificially generated network. We use a barbell graph, which consists of two fully connected graphs and a path that connects them (see Figure 1). The graph has seven nodes in each complete graph and seven nodes in the path, giving a total of 50 edges. Next, we generate features from 3 clusters in a 200-dimensional space using isotropic Gaussian blobs. We assign the features to the 3 parts of the graph: the first cluster to the edges in one of the complete graphs, the second to the edges in the path, and the third to the edges in the other complete graph. The edge classes match the feature clusters (i.e., there are three classes). The structure is therefore aligned with the features, so any good structure-based embedding method can fit this data very well (see Figure 1). A problem occurs when the features (and hence the classes) are shuffled within the graph structure: methods that employ only a structural loss function will fail. We want to check how our model AttrE2vec, which includes both a structural and a feature-based loss, performs under different amounts of such noise.

Figure 9: AttrE2vec performance for various noise levels p and mixing parameter values λ ∈ {0, 0.5, 1}.

Figure 10: 2-D representations of ideal and noisy graph edges using AttrE2vec with λ ∈ {0, 0.5, 1}.

We use the graph described above and introduce noise by shuffling p% of all edge pairs that belong to different classes, i.e., an edge with class 2 (originally located in the path) may be swapped with one from the complete graphs (classes 1 or 3). We use our AttrE2vec model with the Average aggregator in the transductive setting (due to the graph size) and report the edge classification AUC for different values of p ∈ {0, 0.1, ..., 0.9, 1} and λ ∈ {0, 0.5, 1}. The values of the mixing parameter λ allow us to check how the model behaves when working only with the feature-based loss (λ = 0), only with the structural loss (λ = 1), and with both losses at equal importance (λ = 0.5). We train our model for five epochs and repeat the computations ten times for every (p, λ) pair due to the randomness of the shuffling procedure. We report the mean and standard deviation of the AUC in Figure 9.

Using only the feature loss or a combination of both losses allows us to achieve nearly 100% AUC in the classification task. The fluctuations appear due to the low number of training epochs and the local optima problem. The performance of the model that uses only the structural loss (λ = 1) decreases with higher shuffling probabilities, and from a certain point it starts improving slightly, because shuffling eventually results in a complete swap of two classes, i.e., all features and classes from one part of the graph are exchanged with all features and classes from another part of the graph.
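A sketch of this setup: networkx's barbell_graph(7, 7) has exactly the 50 edges described, and scikit-learn's make_blobs generates the isotropic Gaussian features; the shuffling helper is an illustrative reading of the p% swap procedure:

```python
import numpy as np
import networkx as nx
from sklearn.datasets import make_blobs

rng = np.random.default_rng(0)
G = nx.barbell_graph(7, 7)            # two K7's joined by a 7-node path: 50 edges
edges = list(G.edges())

# Features: 3 isotropic Gaussian blobs in 200-D, one blob per graph part.
X_blob, blob_id = make_blobs(n_samples=200, n_features=200, centers=3, random_state=0)
pools = {c: iter(X_blob[blob_id == c]) for c in range(3)}
y = np.array([0 if max(u, v) <= 6 else 2 if min(u, v) >= 14 else 1
              for u, v in edges])     # class = graph part (bell / path / bell)
X = np.stack([next(pools[c]) for c in y])

def shuffle_features(X, y, p):
    """Swap features and classes for ~p * |E| / 2 edge pairs from different classes."""
    X, y = X.copy(), y.copy()
    for _ in range(int(p * len(y) / 2)):
        i, j = rng.choice(len(y), size=2, replace=False)
        if y[i] != y[j]:
            X[[i, j]], y[[i, j]] = X[[j, i]], y[[j, i]]
    return X, y

X_noisy, y_noisy = shuffle_features(X, y, p=0.5)
```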
We also demonstrate how our method reacts to noisy data for various λ ∈ {0, 0.5, 1}. We consider two graphs: one where the features are aligned with the substructures of the graph, and a second where the features were shuffled (ca. 50%); see Figure 10. AttrE2vec with λ = 0.5 is able to represent even the noisy graph fairly well.

7. Conclusions and future work

We introduced AttrE2vec, a novel unsupervised and inductive embedding model that learns attributed edge embeddings by leveraging a self-attention network with an auto-encoder over the attribute space and a structural loss over aggregated random walks. AttrE2vec can directly aggregate feature information from edges and nodes many hops away to infer embeddings, not only for present nodes but also for new ones. Extensive experimental results show that AttrE2vec obtains state-of-the-art results in edge classification and clustering on Cora, Pubmed and Citeseer.

Acknowledgments

The work was partially supported by the National Science Centre, Poland, grants No. 2016/21/D/ST6/02948 and 2016/23/B/ST6/01735, as well as by the statutory funds of the Department of Computational Intelligence, Wrocław University of Science and Technology.

References

[1] W. Hu, M. Fey, M. Zitnik, Y. Dong, H. Ren, B. Liu, M. Catasta, J. Leskovec, R. Barzilay, P. Battaglia, Y. Bengio, M. Bronstein, S. Günnemann, W. Hamilton, T. Jaakkola, S. Jegelka, M. Nickel, C. Re, L. Song, J. Tang, M. Welling, R. Zemel, Open graph benchmark: Datasets for machine learning on graphs (May 2020). arXiv:2005.00687.
[2] D. Zhang, J. Yin, X. Zhu, C. Zhang, Network representation learning: A survey, IEEE Transactions on Big Data 6 (1) (2018) 3–28. doi:10.1109/tbdata.2018.2850013.
[3] Z. Wu, S. Pan, F. Chen, G. Long, C. Zhang, P. S. Yu, A comprehensive survey on graph neural networks, IEEE Transactions on Neural Networks and Learning Systems (2019) 1–21. doi:10.1109/TNNLS.2020.2978386.
[4] B. Li, D. Pi, Network representation learning: a systematic literature review, Neural Computing and Applications 32 (21) (2020) 16647–16679. doi:10.1007/s00521-020-04908-5.
[5] I. Chami, S. Abu-El-Haija, B. Perozzi, C. Ré, K. Murphy, Machine learning on graphs: A model and comprehensive taxonomy (2020). arXiv:2005.03675.
[6] S. Bahrami, F. Dornaika, A. Bosaghzadeh, Joint auto-weighted graph fusion and scalable semi-supervised learning, Information Fusion 66 (2021) 213–228.
[7] A. Grover, J. Leskovec, node2vec: Scalable feature learning for networks, in: Proceedings of the ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, 2016, pp. 855–864. doi:10.1145/2939672.2939754.
[8] B. Perozzi, R. Al-Rfou, S. Skiena, DeepWalk: Online learning of social representations, in: Proceedings of the 20th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (KDD '14), ACM, New York, 2014, pp. 701–710. doi:10.1145/2623330.2623732.
[9] T. N. Kipf, M. Welling, Semi-supervised classification with graph convolutional networks, in: 5th International Conference on Learning Representations (ICLR 2017), 2017, pp. 1–14. arXiv:1609.02907.
[10] Y. Dong, N. V. Chawla, A. Swami, metapath2vec: Scalable representation learning for heterogeneous networks, in: Proceedings of the ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, ACM, New York, 2017, pp. 135–144. doi:10.1145/3097983.3098036.
[11] S. Wang, V. V. Govindaraj, J. M. Górriz, X. Zhang, Y. Zhang, Covid-19 classification by FGCNet with deep feature fusion from graph convolutional network and convolutional neural network, Information Fusion 67 (2021) 208–229.
[12] A. García-Durán, M. Niepert, Learning graph representations with embedding propagation, in: Advances in Neural Information Processing Systems, 2017, pp. 5120–5131.
[13] W. L. Hamilton, R. Ying, J. Leskovec, Inductive representation learning on large graphs, in: Advances in Neural Information Processing Systems, 2017, pp. 1025–1035.
[14] P. Veličković, A. Casanova, P. Liò, G. Cucurull, A. Romero, Y. Bengio, Graph attention networks, in: 6th International Conference on Learning Representations (ICLR 2018), 2018, pp. 1–12. arXiv:1710.10903.
[15] D. Wang, P. Cui, W. Zhu, Structural deep network embedding, in: Proceedings of the ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, 2016, pp. 1225–1234. doi:10.1145/2939672.2939753.
[16] C. Yang, Z. Liu, D. Zhao, M. Sun, E. Y. Chang, Network representation learning with rich text information, in: IJCAI International Joint Conference on Artificial Intelligence, 2015, pp. 2111–2117.
[17] M. Liu, J. Liu, Y. Chen, M. Wang, H. Chen, Q. Zheng, AHNG: Representation learning on attributed heterogeneous network, Information Fusion 50 (2019) 221–230.
[18] L. Lan, P. Wang, J. Zhao, J. Tao, J. Lui, X. Guan, Improving network embedding with partially available vertex and edge content, Information Sciences 512 (2020) 935–951. doi:10.1016/j.ins.2019.09.083.
[19] B. Li, D. Pi, Y. Lin, I. Khan, L. Cui, Multi-source information fusion based heterogeneous network embedding, Information Sciences 534 (2020) 53–71. doi:10.1016/j.ins.2020.05.012.
[20] C. Zhang, D. Song, C. Huang, A. Swami, N. V. Chawla, Heterogeneous graph neural network, in: Proceedings of the ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, ACM, New York, 2019, pp. 793–803. doi:10.1145/3292500.3330961.
[21] H. Gao, H. Huang, Deep attributed network embedding, in: IJCAI International Joint Conference on Artificial Intelligence, 2018, pp. 3364–3370. doi:10.24963/ijcai.2018/467.
[22] S. Bandyopadhyay, A. Biswas, N. Murty, R. Narayanam, Beyond node embedding: A direct unsupervised edge representation framework for homogeneous networks (2019). arXiv:1912.05140.
[23] Y. Chen, T. Qian, Relation constrained attributed network embedding, Information Sciences 515 (2020) 341–351. doi:10.1016/j.ins.2019.12.033.
[24] S. Bandyopadhyay, H. Kara, A. Kannan, M. N. Murty, FSCNMF: Fusing structure and content via non-negative matrix factorization for embedding information networks (2018). arXiv:1804.05313.
[25] D. Nozza, E. Fersini, E. Messina, CAGE: Constrained deep attributed graph embedding, Information Sciences 518 (2020) 56–70. doi:10.1016/j.ins.2019.12.082.
[26] J. Kim, T. Kim, S. Kim, C. D. Yoo, Edge-labeling graph neural network for few-shot learning, in: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2019, pp. 11–20. arXiv:1905.01436, doi:10.1109/CVPR.2019.00010.
[27] Q. Li, Z. Cao, J. Zhong, Q. Li, Graph representation learning with encoding edges, Neurocomputing 361 (2019) 29–39. doi:10.1016/j.neucom.2019.07.076.
[28] L. Gong, Q. Cheng, Exploiting edge features for graph neural networks, in: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2019, pp. 9203–9211. doi:10.1109/CVPR.2019.00943.
[29] C. Aggarwal, G. He, P. Zhao, Edge classification in networks, in: 2016 IEEE 32nd International Conference on Data Engineering (ICDE 2016), IEEE, 2016, pp. 1038–1049. doi:10.1109/ICDE.2016.7498311.
[30] M. Simonovsky, N. Komodakis, Dynamic edge-conditioned filters in convolutional neural networks on graphs, in: Proceedings of the 30th IEEE Conference on Computer Vision and Pattern Recognition (CVPR 2017), 2017, pp. 29–38. doi:10.1109/CVPR.2017.11.
[31] T. D. Bui, S. Ravi, V. Ramavajjala, Neural graph learning: Training neural networks using graphs, in: Proceedings of the 11th ACM International Conference on Web Search and Data Mining (WSDM 2018), 2018, pp. 64–71. doi:10.1145/3159652.3159731.
[32] Y. Wang, Y. Sun, M. M. Bronstein, J. M. Solomon, Z. Liu, S. E. Sarma, Dynamic graph CNN for learning on point clouds, ACM Transactions on Graphics 38 (5) (2019) 146. doi:10.1145/3326362.
[33] T. Wanyan, C. Zhang, A. Azad, X. Liang, D. Li, Y. Ding, Attribute2vec: Deep network embedding through multi-filtering GCN (April 2020). arXiv:2004.01375.
[34] J. Tang, M. Qu, M. Wang, M. Zhang, J. Yan, Q. Mei, LINE: Large-scale information network embedding, in: Proceedings of the 24th International Conference on World Wide Web (WWW 2015), 2015, pp. 1067–1077. doi:10.1145/2736277.2741093.
[35] L. F. Ribeiro, P. H. Saverese, D. R. Figueiredo, struc2vec: Learning node representations from structural identity, in: Proceedings of the ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, 2017, pp. 385–394. doi:10.1145/3097983.3098061.
[36] A. Grover, J. Leskovec, node2vec: Scalable feature learning for networks, in: Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, ACM, 2016, pp. 855–864.
[37] J. Chung, C. Gulcehre, K. Cho, Y. Bengio, Empirical evaluation of gated recurrent neural networks on sequence modeling (December 2014). arXiv:1412.3555.
[38] R. Kuprieiev, D. Petrov, R. Valles, P. Redzyński, C. da Costa-Luis, A. Schepanovski, I. Shcheklein, S. Pachhai, J. Orpinel, F. Santos, A. Sharma, Zhanibek, D. Hodovic, P. Rowlands, Earl, A. Grigorev, N. Dash, G. Vyshnya, maykulkarni, Vera, M. Hora, xliiv, W. Baranowski, S. Mangal, C. Wolff, nik123, O. Yoktan, K. Benoy, A. Khamutov, A. Maslakov, DVC: Data Version Control - Git for data & models (May 2020). doi:10.5281/zenodo.3859749.
[39] P. Sen, G. Namata, M. Bilgic, L. Getoor, B. Galligher, T. Eliassi-Rad, Collective classification in network data, AI Magazine 29 (3) (2008) 93. doi:10.1609/aimag.v29i3.2157.
[40] G. Namata, B. London, L. Getoor, B. Huang, Query-driven active surveying for collective classification, in: Proceedings of the Workshop on Mining and Learning with Graphs, Edinburgh, Scotland, UK, 2012, pp. 1–8.
[41] Q. Le, T. Mikolov, Distributed representations of sentences and documents, in: 31st International Conference on Machine Learning (ICML 2014), 2014, pp. 2931–2939. arXiv:1405.4053.
[42] F. Pedregosa, G. Varoquaux, A. Gramfort, V. Michel, B. Thirion, O. Grisel, M. Blondel, P. Prettenhofer, R. Weiss, V. Dubourg, J. Vanderplas, A. Passos, D. Cournapeau, M. Brucher, M. Perrot, E. Duchesnay, Scikit-learn: Machine learning in Python, Journal of Machine Learning Research 12 (2011) 2825–2830.
[43] D. Wang, P. Cui, W. Zhu, Structural deep network embedding, in: Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (KDD '16), ACM, New York, 2016, pp. 1225–1234. doi:10.1145/2939672.2939753.
[44] D. Q. Nguyen, T. D. Nguyen, D. Phung, A self-attention network based node embedding model (June 2020). arXiv:2006.12100.
[45] I. Loshchilov, F. Hutter, Decoupled weight decay regularization (November 2017). arXiv:1711.05101.
[46] J. Xie, R. Girshick, A. Farhadi, Unsupervised deep embedding for clustering analysis, in: Proceedings of the 33rd International Conference on Machine Learning, Vol. 48 of Proceedings of Machine Learning Research, PMLR, 2016, pp. 478–487.
[47] L. van der Maaten, G. Hinton, Visualizing data using t-SNE, Journal of Machine Learning Research 9 (2008) 2579–2605.

1 Introduction
2 Related work and Research Gap
3 Method
3.1 Motivation
3.2 Attributed graph edge embedding
3.3 AttrE2vec
3.4 Aggregation models
3.5 Learning AttrE2vec's parameters
4 Experiments
4.1 Datasets
4.2 Baselines
4.3 Edge classification
4.4 Edge clustering
4.5 Embedding visualization
5 Hyperparameter Sensitivity of AttrE2vec
6 Ablation study
7 Conclusions and future work

bahnemann-transforming-2021 ---- Transforming Metadata into Linked Data to Improve Digital Collection Discoverability: A CONTENTdm Pilot Project

OCLC RESEARCH REPORT

Transforming Metadata into Linked Data to Improve Digital Collection Discoverability: A CONTENTdm Pilot Project

Greta Bahnemann, Minnesota Digital Library
Michael Carroll, Temple University Libraries
Paul Clough, University of Miami Libraries
Mario Einaudi, The Huntington Library, Art Museum, and Botanical Gardens
Chatham Ewing, Cleveland Public Library
Jeff Mixter, OCLC Research
Jason Roy, Minnesota Digital Library
Holly Tomren, Temple University Libraries
Bruce Washburn, OCLC Research
Elliot Williams, University of Miami Libraries

© 2021 OCLC. This work is licensed under a Creative Commons Attribution 4.0 International License.
January 2021. OCLC Research, Dublin, Ohio 43017 USA. www.oclc.org
ISBN: 978-1-55653-185-9
DOI: 10.25333/fzcv-0851
OCLC Control Number: 1230259668

ORCID iDs:
Greta Bahnemann, Minnesota Digital Library: https://orcid.org/0000-0002-5823-7217
Michael Carroll, Temple University Libraries: https://orcid.org/0000-0003-3736-0678
Paul Clough, University of Miami Libraries: https://orcid.org/0000-0001-6939-2805
Mario Einaudi, The Huntington Library, Art Museum, and Botanical Gardens: https://orcid.org/0000-0002-6859-594X
Chatham Ewing, Cleveland Public Library: https://orcid.org/0000-0002-8402-0652
Jeff Mixter, OCLC Research: https://orcid.org/0000-0002-8411-2952
Jason Roy, Minnesota Digital Library: https://orcid.org/0000-0002-3644-1970
Holly Tomren, Temple University Libraries: https://orcid.org/0000-0002-6062-1138
Bruce Washburn, OCLC Research: http://orcid.org/0000-0003-4396-7345
Elliot Williams, University of Miami Libraries: https://orcid.org/0000-0001-6925-7144

Please direct correspondence to: OCLC Research, oclcresearch@oclc.org

Suggested citation: Bahnemann, Greta, Michael Carroll, Paul Clough, Mario Einaudi, Chatham Ewing, Jeff Mixter, Jason Roy, Holly Tomren, Bruce Washburn, and Elliot Williams. 2021. Transforming Metadata into Linked Data to Improve Digital Collection Discoverability: A CONTENTdm Pilot Project. Dublin, OH: OCLC Research. https://doi.org/10.25333/fzcv-0851.

CONTENTS

Acknowledgments
Executive Summary
Introduction
Three-Phase Project Plan
  Phase 1: Mapping textual metadata to entities
  Phase 2: Tools for managing metadata in Wikibase
  Phase 3: Wikibase entities drive discovery
The Wikibase Environment
Developing a Data Model
  Describing the “type” of a creative work at three levels
  Distinguishing between instances of concepts and ontological classes
  Managing the data model in Wikibase
  Managing source metadata outside of the data model
Gathering and Transforming Metadata
  Selecting and analyzing collections from pilot partner CONTENTdm sites
  Optimizing tools and workflows for reconciliation and transformation
  Adding related entities to the CONTENTdm Wikibase from external sources
    Creating entities in advance for anticipated matches
    Testing an alternative OpenRefine reconciliation endpoint
    Creating placeholder entities for things that could not be reconciled
Representing Compound Objects
Syndicating Data in Standard Schemas
Wikibase Ecosystem Advantages
  Implementing authority control
  Decreasing cataloging inefficiencies, increasing descriptive quality
  Generating data visualizations
User Interface Extensions
  MediaWiki gadgets
    Adding the Mirador viewer
    Showing contextual information from Wikidata
    Revealing constraint violations
  CONTENTdm custom pages
    Embedding Schema.org JSON-LD in CONTENTdm pages
    Showing contextual information for headings based on Wikibase data
New Applications
  The Image Annotator
    User study results
  The Retriever
  The Describer
  The Explorer and the Transportation Hub
  The Field Analyzer
Cohort Communication
Partner Reflections
  Cleveland Public Library
  The Huntington Library, Art Museum, and Botanical Gardens
  Minnesota Digital Library
    Invitation
    Development of three tools by OCLC
    Leveraging the power of linked data
    Concluding thoughts
  Temple University Libraries
  University of Miami Libraries
Key Findings and Conclusions
  Testing the linked data value proposition
  Evaluating a shared data model
  Selecting and transforming metadata
  Continuing the journey to linked data
  Working partnerships represent strength in numbers
Notes

FIGURES

FIGURE 1 Planned project phases
FIGURE 2 The Wikibase Ecosystem
FIGURE 3 A CONTENTdm class hierarchy data model
FIGURE 4 Example type, classification used, and process or format properties and values for a description of a postcard
FIGURE 5 A depicts statement for the concept of “Dogs”
FIGURE 6 A type classification of “dog” for a specific dog
FIGURE 7 The “dog” class is defined by the concept of “Dogs”
FIGURE 8 Wikibase templates for proposing new properties
FIGURE 9 Unmapped CONTENTdm metadata displayed in the Wikibase user interface using a Gadget extension
FIGURE 10 Wikibase Discussion page for a collection review
FIGURE 11 CONTENTdm collection metadata in an OpenRefine project
FIGURE 12 A “placeholder” entity for a person without an established identity
FIGURE 13 Example “has creative work part” statements and sequencing for the first four parts of an album
FIGURE 14 Other names associated with the Los Angeles Dodgers entity
FIGURE 15 First parts of the description of Jasper Wood
FIGURE 16 SPARQL Query map visualization of places depicted in works from a collection
FIGURE 17 Mirador image viewer embedded in the Wikibase user interface
FIGURE 18 Contextual data and image from DBPedia and Wikimedia Commons embedded in the Wikibase user interface
FIGURE 19 A constraint violation indicating that the “occupation” property should only be used for instances of the type “person”
FIGURE 20 Schema.org data evaluated using Google’s Structured Data Testing Tool
FIGURE 21 Additional contextual information displayed in CONTENTdm based on entity descriptions in the pilot Wikibase
FIGURE 22 Image Annotator initial view of an image and subjects
FIGURE 23 Image Annotator cropping an image of a person
FIGURE 24 Image Annotator after adding more depicted subjects
FIGURE 25 Wikibase item updated with illustrated depicts statements
FIGURE 26 Retriever search results from Wikidata, VIAF, and FAST for “Lake Vermilion”
FIGURE 27 Retriever entity editor
FIGURE 28 Wikibase entity created by the Retriever
FIGURE 29 Editing essential details for an entity in the Describer
FIGURE 30 Explorer home page
FIGURE 31 Explorer Transportation Hub and related collections
FIGURE 32 Explorer search results for “strike”
FIGURE 33 Explorer view of a truck bringing employees home during a PTC walkout
FIGURE 34 Explorer view of a protest against the Philadelphia Transportation Company
FIGURE 35 Explorer view of an 1899 Cleveland transit strike in Public Square
FIGURE 36 Explorer view of streetcars parked on the street during a transit strike
FIGURE 37 Field Analyzer field usage chart
FIGURE 38 Field Analyzer list of field values

ACKNOWLEDGMENTS

The OCLC CONTENTdm Linked Data Pilot project team consisted of the following OCLC staff: Hanning Chen, Eric Childress, Shane Huddleston, Jeff Mixter, Mercy Procaccini, and Bruce Washburn. The Linked Data project team wishes to thank the project partners who enthusiastically and generously collaborated with us in this endeavor. Your vision for and commitment to a linked data future have been illuminating and inspiring.
OCLC particularly appreciates the efforts of those who contributed to or co-authored this report:
• Cleveland Public Library: Chatham Ewing, Rachel Senese, Amia Wheatley
• The Huntington Library, Art Museum, and Botanical Gardens: Mario Einaudi
• Minnesota Digital Library: Greta Bahnemann, Jolie Graybill, Jason Roy
• Temple University Libraries: Michael Carroll, Stefanie Ramsay, Holly Tomren
• University of Miami Libraries: Paul Clough, Elliot Williams

The team also acknowledges the consultation, guidance, and support provided by our OCLC colleagues: Dave Collins, Rachel Frick, Marti Heyman, Erik Mayer, Carolyn Morgan, Andrew Pace, Taylor Surface, and Diane Vizine-Goetz. Thank you to Jeanette McNicol for the excellent design of this report and to Erica Melko for her skillful editing.

EXECUTIVE SUMMARY

In the CONTENTdm Linked Data Pilot project, OCLC partnered with institutions that manage their digital collections with OCLC’s CONTENTdm service to investigate methods for—and the feasibility of—transforming metadata into linked data to improve the discoverability and management of digitized cultural materials and their descriptions. This report, Transforming Metadata into Linked Data to Improve Digital Collection Discoverability, describes the course of the project and its primary areas of investigation and summarizes key findings and conclusions generated by the collaborative study.

The project was designed to help the OCLC team and the pilot participants better understand the following questions:
• How divergent are the descriptive data practices across the institutions using CONTENTdm, and what tools are needed to make that assessment?
• Can a shared and extensible data model be developed to support the differing needs and demands for a range of material types and institution types?
• What is the right mix of human attention and automation to effectively reconcile metadata headings to linked data entities?
• What types of tools can help extend the description of cultural materials to subject matter experts?
• After metadata from different institutions and collections is transformed, are there new discovery tools that can help researchers find new—or previously hidden—connections through a centralized discovery system?
• What are the institutional and individual interests in the paradigm shift of moving to linked data?

Five organizations representing a cross-section of different types of institutions—The Huntington Library, Art Museum, and Botanical Gardens; the Cleveland Public Library; the Minnesota Digital Library; Temple University Libraries; and University of Miami Libraries—participated in the project.

The pilot focused on developing efficient workflows for transforming metadata, evaluating existing interfaces to leverage linked data, and testing applications built in the Wikibase environment for managing the newly created linked data. Over the course of the pilot, the project team and partners observed improved metadata management and discovery in action and reflected on the potential benefits: higher-quality and richer metadata can be managed with greater efficiency by staff, and linked data can be used to add contextual information and to create a network of connections that better reflects knowledge in the real world.
This context and these connections can help researchers achieve a fuller understanding of collection materials, inviting increased engagement and use by community members.

While the pilot project findings are based on a limited set of institutions and collections, they strongly suggest that there is significant potential for improved discovery and more efficient data management when the materials that have been digitized are described using a shared data model, where headings are associated with linked data entities and relationships, and when the entities and relationships are brought together into a single aggregation.

An overarching question driving the linked data project was, for a paradigm shift of this magnitude, how can the foundational changes be made more scalable, affordable, and sustainable? The project showed that the scope and magnitude of the effort required to completely analyze, transform, and reconcile all current descriptive metadata into consistently modeled linked data is beyond the reach of a single centralized agency. It will require substantial and shared resource commitments from a decentralized community of practitioners who will need to be supplied with easily accessible tools and workflows for carrying out the transition. Evidence gathered during the project and detailed in this report about data modeling, metadata reconciliation, and data analysis provides new knowledge about how these tools and workflows could be designed and used.

INTRODUCTION

The CONTENTdm Linked Data Pilot project (also referred to throughout this report as the “Linked Data project”) is the latest (as of 2020) in a series of investigations that OCLC has organized and led over several years in the interest of developing a shared understanding of how libraries, archives, and museums can make the transition to linked data. OCLC works in partnership with these institutions to increase researchers’ ability to discover, evaluate, and use digitized cultural materials, principally through its support of the CONTENTdm service for building, preserving, and showcasing a library’s unique digital collections. This Linked Data project was focused on envisioning and evaluating scalable and affordable systems and workflows that will be needed to produce rich linked data representations of entities and relationships, which will then help to make visible connections that were formerly invisible.

The project was grounded in the context of the linked data value proposition, which states that these best practices for publishing structured data on the web—using URIs (Uniform Resource Identifiers) as names for things, using HTTP URIs so that people can look up those names, providing useful information using standards when someone looks up a URI, and including links to other URIs so that people can discover more things—lead to an interconnected global network of data that can serve both developers and researchers.
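To make those four practices concrete, here is a minimal, illustrative sketch in Python using the rdflib library. The entity and the subject link are invented for illustration; they are not records from the pilot Wikibase.

    from rdflib import Graph, Literal, Namespace, URIRef

    EX = Namespace("http://example.org/entity/")   # HTTP URIs serve as names for things
    SCHEMA = Namespace("https://schema.org/")

    g = Graph()
    photo = EX["Q100"]  # a URI names this (hypothetical) photograph
    g.add((photo, SCHEMA.name, Literal("Transit strike photograph")))
    # A link to another HTTP URI lets people discover more things:
    g.add((photo, SCHEMA.about, URIRef("http://example.org/concept/strikes")))
    # Looking up the URI should yield useful information in a standard form (here, Turtle):
    print(g.serialize(format="turtle"))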
Five organizations representing a cross-section of different types of institutions—The Huntington Library, Art Museum, and Botanical Gardens; the Cleveland Public Library; the Minnesota Digital Library; Temple University Libraries; and University of Miami Libraries—participated as partners in the project. The pilot participants collaborated with OCLC on a range of focused studies, including developing efficient workflows for transforming source metadata into linked data, evaluating CONTENTdm interface customizations to leverage linked data for discovery and syndication, and testing new applications built in the Wikibase environment for data retrieval, image annotation, editing, metadata analysis, and discovery. This report describes the course of the CONTENTdm Linked Data Pilot project and its primary areas of investigation, shares the experiences of the five participating partner institutions, and summarizes key findings and conclusions generated by that collaborative study.

The Linked Data project’s focus on sustainability and scalability posed many questions to pursue, including:

How divergent are the descriptive data practices across the institutions using CONTENTdm, and what tools are needed to make that assessment? The large volume of cultural material descriptive metadata stored in CONTENTdm offered an excellent test bed for evaluating a large-scale transition to linked data. Additionally, the outcomes and findings from OCLC’s Metadata Refinery project completed in 2016 and its Project Passage linked data prototype completed in 2018 provided important insights into how to implement a system to facilitate the mapping, reconciliation, storage, and retrieval of structured data for unique digital materials. This pilot project built on those insights and successes. The sections below that describe the Wikibase environment, the steps for gathering and transforming metadata, and the prototype “Field Analyzer” application highlight the challenges of applying this work at scale.

Can a shared and extensible data model be developed to support the differing needs and demands for a range of material types and institution types? The wide variety of data models and descriptive practices currently used across CONTENTdm could be significantly easier for staff to manage if there were a shared data model available, and if that shared model could also support rich discovery for researchers in a single, aggregated discovery system. This project set out to develop a shared data model, building on existing standards but allowing for extensions as evidence surfaced in the source metadata for additional classes and relationships. The section below on developing the data model provides an overview and examples of the results of that work.

What is the right mix of human attention and automation to effectively reconcile metadata headings to linked data entities? The project spent substantial time and effort on testing reconciliation workflows and prototyping new tools to make this work more efficient while maintaining quality. Prototyping a new metadata reconciliation endpoint helped us understand the potential for improving the performance of what can be a time-consuming automated process. The development of the “Retriever” web application for finding related entities in other systems and transforming them into new Wikibase entities addressed a cataloger workflow stumbling block. Both prototypes are described below.

What types of tools can help extend the description of cultural materials to subject matter experts?
The project team developed—and the participants tested—an “Image Annotator” prototype application that could be used by either library staff or subject matter experts from outside the library to associate subject headings with depicted entities in images, envisioning how the transformed data along with new tools could open the door to more and richer descriptions from an engaged community. The description below of the Image Annotator includes a summary of its usability test results.

After metadata from different institutions and collections is transformed, are there new discovery tools that can help researchers find new, or previously hidden, connections through a centralized discovery system? The “Explorer” prototype application, developed during the project and described below, demonstrated the ability to search across data from a range of repositories, with searching and faceting powered by entities derived from authority files and from vocabularies created by librarians. And the “Transportation Hub” virtual collection included in the Explorer gave the project team and participants a way to test linked data discovery in action, working with thematically related item descriptions that were supplied by a cross-section of institutions and collections and transformed into separate entities and relationships.

What are the institutional and individual interests in the paradigm shift of moving to linked data? The close collaboration between OCLC and the pilot project partners was one of the most rewarding aspects of the project. Given that most of the project was carried out as people and the organizations they work for were experiencing transformative disruptions to their lives and work as the 2020 COVID-19 pandemic began and unfolded, it was unclear at first what relative priority and attention the pilot could receive. But attention and participation from the participants—and support from OCLC—never wavered, and we mutually benefited greatly from the endeavor. Look to the following sections on cohort communication and the partner reflections for more insights and perspectives on the impact of this project and the partners’ first-hand views on the implications for our shared futures.

The CONTENTdm Linked Data Pilot project is another stage in a growing body of linked data research and development that OCLC has undertaken over the past decade. The findings of the project—detailed in this report—about data modeling, metadata reconciliation, and data analysis provide new knowledge about how these tools and workflows could be designed and used, which we anticipate will inform future linked data investigations and developments from the library, archives, and museum communities.
Three-Phase Project Plan

The pilot project was planned as a one-year effort to be carried out in three phases (figure 1) so that the project could address the most pressing questions first and allow for reconsideration and adjustments to the plan as it progressed:

• Phase 1: Concentrated on mapping metadata for digital collections to descriptions of related entities: works, people, organizations, places, concepts, and events. Three partner institutions joined the project in Phase 1: The Huntington Library, Art Museum, and Botanical Gardens; the Cleveland Public Library; and the Minnesota Digital Library.
• Phase 2: Focused on a needs assessment and prototypes for managing metadata in the Wikibase environment. Two more partner institutions joined the project in Phase 2, after the OCLC team had developed a better understanding of the institutional support requirements, and to expand representation of materials from academic research libraries: Temple University Libraries and University of Miami Libraries.
• Phase 3: Anticipated testing an end-user discovery experience based entirely on the data and tools developed within the Wikibase environment.

FIGURE 1. Planned project phases.

PHASE 1: MAPPING TEXTUAL METADATA TO ENTITIES

In the first phase, the plan was to focus on the systems and workflows needed to clean up, analyze, and reconcile CONTENTdm metadata for input into a linked data environment. Building on the project team’s prior experience with the environment in OCLC’s earlier linked data pilot, Project Passage, the Wikibase extension to the MediaWiki platform was selected as the project’s linked data environment. The Wikibase environment of related databases, indexes, and services is described in fuller detail below. In this phase, linked data was expected to be shown in the CONTENTdm interface, delivering data from the pilot project Wikibase using CONTENTdm’s custom JavaScript feature.

PHASE 2: TOOLS FOR MANAGING METADATA IN WIKIBASE

In the second phase, the work was expected to focus on the Wikibase editing interface and on supplementary tools that could be used to extend that environment. These tools would help bridge the gap between CONTENTdm staff user expectations and the features and limitations of the Wikibase environment. The design and development of mechanisms for returning data from the Wikibase to the production CONTENTdm environment were also expected to be part of this phase.

PHASE 3: WIKIBASE ENTITIES DRIVE DISCOVERY

The focus of phase three was intended to be on a discovery interface that relied solely on data within the Wikibase, to evaluate the features that could be part of a redesigned CONTENTdm discovery system.

As the project unfolded, the project team made adjustments to the original plan, responding to new findings from its early phases. For example, work on some staff tools for editing Wikibase data began during Phase 1 (planned for Phase 2). On the other hand, the initial plan included the prototyping of a user interface for entity editing as an alternative to the Wikibase user interface, but the team did not completely build and test that prototype before the project ended.
The Phase 2 plan also anticipated loading data from the Wikibase back into the CONTENTdm system using its “Catcher” web service, which can add and edit metadata using a standard XML-based method. But given the conditions of the pilot project, the project partners could not be sure if the modified headings would conflict with their ongoing data management work.

The first two phases concentrated on building and evaluating workflows for analyzing, transforming, and reconciling CONTENTdm metadata into Wikibase linked data with as complete and lossless a result as was feasible. In the third phase, a new course was charted to see how much linked data could be generated from CONTENTdm with minimal human intervention and evaluate the results in a front-end discovery application to more clearly demonstrate the linked data value proposition. These types of adjustments to the project plan are expected in a research-oriented pilot, where the issues and questions that will naturally surface over time are not fully defined at the outset.

The Wikibase Environment

To proceed through the planned phases, the project needed an effective and proven platform for working with linked data. Based on the successful results from Project Passage (2018), the CONTENTdm Linked Data Pilot project used the Wikibase environment, which includes several interrelated APIs, databases, indexes, and services (figure 2):

• MediaWiki is the primary software platform, the same software on which the Wikipedia encyclopedia and other “wikis” operate.
• To handle structured data, the Wikibase extension to MediaWiki is used, which is the same software that supports the Wikidata knowledge base. Together, MediaWiki and Wikibase provide both a user interface for searching and editing and a range of APIs for access to authentication and editing services.
• To support linked data, a parallel system is synchronized with the Wikibase data, including its own linked database, or triplestore, that can be accessed using a linked data query language called SPARQL. A SPARQL Query service user interface is also provided.

These powerful tools are the product of years of open source software development and support provided by the Wikidata and Wikimedia communities.

FIGURE 2. The Wikibase Ecosystem.

Developing A Data Model

CONTENTdm repositories employ a wide range of vocabularies and institution-specific data dictionaries. Some institutions apply patterns to their data descriptions that are consistent across all their collections, while others use different patterns for different collections, either due to evolution of their institutional preferences over time and the effort required to maintain and revise “legacy” patterns in previously described collections, or to account for special characteristics in the data and use cases associated with specific collections. For the Linked Data project Wikibase, a single data model was needed that could reflect the variations seen in the metadata across CONTENTdm sites.
Rather than selecting an existing data model to which CONTENTdm metadata would be forced to conform, the pilot project tested the theory that, through sampling current metadata and looking for general patterns, a model could be developed that was driven by data and that avoided speculation. Where appropriate, the properties and classes defined in the project data model were linked to equivalent properties and classes in other ontologies and vocabularies.

FIGURE 3. A CONTENTdm class hierarchy data model.

This work began by looking across an inventory of CONTENTdm metadata for the most common metadata practices, leveraging CONTENTdm’s ability to help institutions relate their local vocabulary terms to the Dublin Core element set and associated controlled vocabularies. This step identified the classes and relationships that would be encountered most frequently in the pilot participants’ data and gave a starting point for building the pilot project data model. A field analysis survey was conducted for about 13 million records, selected from all CONTENTdm sites, that evaluated the most frequently used fields to identify important properties for creative works. From that same CONTENTdm survey, the most frequently used terms were extracted to build an initial class taxonomy for creative works. This method was later revised based on conversations with partners and colleagues. The class hierarchy from the project’s data model is illustrated in figure 3.

DESCRIBING THE “TYPE” OF A CREATIVE WORK AT THREE LEVELS

As analysis of the pilot project participants’ data began, new classes and relationships were encountered and were evaluated as possible extensions to the data model. One part of the model that changed substantially was the Creative Work taxonomical branch. It was originally populated with the “types” of creative works based on how they were described in the source metadata, but that resulted in a large and unstructured list of classes. After consulting with the pilot partners and with colleagues at the J. Paul Getty Trust, the team decided to revise the model using a three-level approach. At the top level, creative work “type” classes were mapped to the Dublin Core DCMI Type terms. An immediate benefit of that decision was the ability to neatly facet results across the different DCMI Types, a common way of providing a high-level filter for search and retrieval of digital collections. To refine the top-level DCMI Type classes, a second-level “classification used” property was created that was associated with 25 “classification” entities. The set of classification entities was developed based on work done in the Linked.Art project as well as through consultation with colleagues at the pilot partner Minnesota Digital Library. If more detail was needed, a third-level “process or format” property could be used to connect the item to any conceptual entity. An example of this revision to the data model is illustrated in figure 4 for a postcard, which is a type of “image,” uses the classification “Prints,” and adds a “process or format” of “Postcards.”

FIGURE 4. Example type, classification used, and process or format properties and values for a description of a postcard.
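Expressed as simple subject-property-value statements, the three levels for the postcard example might look like the following Python sketch. The property names and values come from the report; the entity identifier is an invented placeholder rather than the pilot Wikibase's actual one.

    postcard = {
        "entity": "Q_postcard",  # placeholder identifier for the postcard item
        "statements": [
            ("type", "Image"),                  # level 1: DCMI Type term
            ("classification used", "Prints"),  # level 2: one of 25 classification entities
            ("process or format", "Postcards"), # level 3: any conceptual entity
        ],
    }
    for prop, value in postcard["statements"]:
        print(f'{postcard["entity"]} --{prop}--> {value}')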
DISTINGUISHING BETWEEN INSTANCES OF CONCEPTS AND ONTOLOGICAL CLASSES

Distinguishing between instances of concepts and ontological classes presented a data modeling challenge. This challenge is related to how, in the library domain, controlled vocabularies have been developed and translated to ontology-based systems. Concepts derived from a controlled vocabulary can be used both as conceptual entities for subject headings and as ontological classifications for a specific instance of the subject. A good example of this dual use is the concept entity of “Dogs.” As a concept it can be used to describe what a photograph depicts, as seen in figure 5.

FIGURE 5. A depicts statement for the concept of “Dogs.”

But “dog” can also be used as an ontological class to describe specific dogs, such as the dog named “Buck” who appears in a photograph (figure 6).

FIGURE 6. A type classification of “dog” for a specific dog.

To distinguish the conceptual entity of “Dogs” from the ontological class “dog” in the pilot data model, an “is defined by” property was created, based on the property “isDefinedBy” found in the linked data modeling vocabulary RDF Schema, to connect the class to the conceptual entity that describes it (figure 7).

FIGURE 7. The “dog” class is defined by the concept of “Dogs.”
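The distinction can be sketched in RDF terms. The following Python/rdflib fragment mirrors figures 5 through 7; the URIs are invented for illustration and are not the pilot Wikibase's actual data.

    from rdflib import Graph, Namespace, RDF, RDFS

    EX = Namespace("http://example.org/entity/")
    g = Graph()
    g.add((EX.photo42, EX.depicts, EX.Dogs))     # figure 5: a photo depicts the concept "Dogs"
    g.add((EX.Buck, RDF.type, EX.dog))           # figure 6: Buck is an instance of the class "dog"
    g.add((EX.dog, RDFS.isDefinedBy, EX.Dogs))   # figure 7: the class is defined by the concept
    print(g.serialize(format="turtle"))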
MANAGING THE DATA MODEL IN WIKIBASE

OCLC staff took advantage of the components built into the Wikibase infrastructure to manage the process of developing the data model, using a template form to submit and review proposals for new properties and classes. This approach helped illustrate the expected advantages that these additions to the model would bring and provided a history to look back on as the project proceeded. OCLC staff found that these templates and the proposal/review/acceptance workflow were an effective way for a small but distributed team to manage the process, and recommends this approach to others who are building a system using the Wikibase software platform (figure 8).

FIGURE 8. Wikibase templates for proposing new properties.

MANAGING SOURCE METADATA OUTSIDE OF THE DATA MODEL

Some of the CONTENTdm source metadata fell outside of the information that was expected to be accounted for in the data model. The data model was intended to support the description of cultural materials, but the source metadata also included technical information about their digital representations and administrative data associated with the cataloging process. To prevent this additional information from being lost in the transformation process, the associated fields and values were indexed in a system separate from the Wikibase. The indexed data included the identifier for the associated entity in the project Wikibase. This allowed unmapped elements to be displayed in the Wikibase user interface (illustrated in figure 9) without disrupting the data model with entities and relationships that were administrative or technical in nature.

FIGURE 9. Unmapped CONTENTdm metadata displayed in the Wikibase user interface using a Gadget extension.

Gathering and Transforming Metadata

The primary focus of the first phase of the linked data project involved assembling metadata describing digitized cultural materials and transforming it to descriptions of related entities. The following notes provide a detailed view of that work, including how metadata was selected for inclusion and analyzed, the development of tools and workflows to manage the transformation, and how the database was enriched to build more connections between entities.

SELECTING AND ANALYZING COLLECTIONS FROM PILOT PARTNER CONTENTDM SITES

Pilot project participants were asked to suggest a small group of CONTENTdm collections that they wanted to work with. OCLC suggested working with collections of varying sizes and content types but emphasized that the described materials should be primarily visual (photographs, prints, maps, etc.) rather than finding aids or PDF documents. In some cases, for very large collections, OCLC chose to represent a subset of the entire collection, given the pilot project’s resource constraints.

FIGURE 10. Wikibase Discussion page for a collection review.

OCLC staff exported CONTENTdm metadata for each suggested collection, created an entity description for it in the CONTENTdm Wikibase, and used the Wikibase “Discussion Page” feature to develop a metadata crosswalk, analyzing fields used in the collection and mapping them to Wikibase properties and classes. After OCLC staff created the initial crosswalk, individual meetings were held with each pilot site to review the initial analysis and address questions. This process highlighted the importance of domain expertise when thinking through the metadata transformation process, as institution-specific, and sometimes collection-specific, cataloging practices cannot always be discerned by others outside the institution.
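A collection-level crosswalk of this kind can be sketched as a simple mapping. The field and property names below are invented for illustration, since each collection's fields differed.

    # One hypothetical collection's crosswalk: CONTENTdm field -> data-model target.
    crosswalk = {
        "Title":        ("title", "string value"),
        "Photographer": ("photographer", "link to a Person entity"),
        "Subject":      ("depicts", "link to a Concept entity"),
        "Date Created": ("date created", "date value"),
        "Extent":       ("height/width", "parsed numeric values with units"),
    }
    for field, (prop, target) in crosswalk.items():
        print(f"{field!r} maps to {prop!r} as {target}")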
OPTIMIZING TOOLS AND WORKFLOWS FOR RECONCILIATION AND TRANSFORMATION

After analyzing collection fields and reviewing the analysis with the pilot participants, OCLC created a project for each collection in the program OpenRefine (figure 11), which provides tools for data analysis, cleanup, and reconciliation. OpenRefine has a significant learning curve but is a tool OCLC has used frequently for metadata analysis; it was a natural fit for this project and proved to be an effective platform.

FIGURE 11. CONTENTdm collection metadata in an OpenRefine project.

As OCLC staff gained more experience with CONTENTdm metadata, reusable OpenRefine recipes were developed for carrying out generic data transformation tasks, which helped speed up the data processing for OCLC staff. For example, one recipe looked up an item’s Wikibase identifier using its IIIF Manifest URL (“IIIF” is an image interoperability standard, and it defines a “manifest” that represents the digital content associated with a collection or item) and retrieved data from the pilot project’s linked data “triplestore” database; another converted personal names from indirect order to direct order; and a third extracted and formatted individual height and width values, and the corresponding unit, from extent data text strings. The code for each recipe was documented and stored in a Wikibase Help page for sharing and reuse by OCLC staff.

An important advantage of the OpenRefine platform is its ability to reconcile strings of text against external vocabularies to obtain a persistent identifier for the thing that the text string describes. The reconciliation feature is built into OpenRefine and can be configured to compare strings against external OpenRefine-compatible reconciliation endpoints. OCLC staff worked with the OpenRefine reconciliation endpoint software developed for the Wikidata community and reconfigured it as an endpoint for the project Wikibase. That way, OpenRefine could be used to reconcile text strings against matches found through the OpenRefine reconciliation endpoint for the CONTENTdm Wikibase, and could also use the similar endpoint supported by the Wikidata community to reconcile strings against Wikidata. OCLC also made use of OpenRefine endpoints developed and hosted by others to reconcile against the OCLC FAST subject terminology system, the VIAF authority file service, and the GeoNames service for geographic data.

After cleaning up and reconciling the CONTENTdm metadata, OCLC staff exported the data from OpenRefine and used locally developed scripts written in the Python scripting language to restructure the data to match the format specified for the Wikidata QuickStatements application. This is a tab-separated format with a set of rules for adding data to a Wikibase, with each row representing a single component of the item’s description. OCLC then used the Pywikibot library to develop another application that could read the QuickStatements data and load it into the Wikibase.
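As a rough sketch of that hand-off, the restructuring step might emit rows like these. The property and item identifiers are invented placeholders, though "Len" is the QuickStatements convention for an English label.

    import csv

    rows = [
        ("Q1001", "Len", "Mill workers outside the factory gate"),  # English label
        ("Q1001", "P1",  "Q5"),      # e.g., type -> Image (placeholder IDs)
        ("Q1001", "P57", "Q2002"),   # e.g., photographer -> a Person entity
    ]
    with open("statements.tsv", "w", newline="") as f:
        csv.writer(f, delimiter="\t").writerows(rows)
    # A Pywikibot-based loader can then read each row and write one statement.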
ADDING RELATED ENTITIES TO THE CONTENTDM WIKIBASE FROM EXTERNAL SOURCES

The most significant barrier to quickly transforming and loading CONTENTdm metadata into the project Wikibase was the absence, especially in the early stages, of Wikibase entity descriptions for the people, organizations, places, concepts, and events that are represented in the CONTENTdm records. In a linked data environment, each of those related entities must have its own entity description in the system, so that relationships can be defined between the entities. For example, when transforming a CONTENTdm record for a photograph, a “photographer” property should be added to the entity describing the photograph, with a link to a separate entity for the photographer.

Unless those related entities are already in the project Wikibase and can be matched through OpenRefine reconciliation, the data loading process stalls until data for the related entities can be found, transformed, and loaded. To move the process along and create entities as quickly as possible, OCLC staff initially created entities just for the creative work and its direct string-based properties (e.g., its title, description, height, width, IIIF Manifest URL, etc.). Once that step was completed, OpenRefine and the project’s SPARQL Query Service were used to look up the newly created Wikibase identifier for each item, and those identifiers were added into a new column in the OpenRefine project. That step was followed by the creation of one or more new OpenRefine projects focused on reconciling strings for related entities and making connections between those entities and the creative work entities in the Wikibase.
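The identifier lookup step can be sketched as a SPARQL query issued from Python. The endpoint URL and property identifier below are invented placeholders, not the pilot project's actual configuration.

    import requests

    ENDPOINT = "https://example-wikibase.org/query/sparql"              # placeholder
    manifest = "https://cdm.example.org/iiif/coll/1234/manifest.json"   # placeholder
    query = f"""
    SELECT ?item WHERE {{
      ?item <http://example-wikibase.org/prop/direct/P200> <{manifest}> .
    }}
    """
    resp = requests.get(ENDPOINT, params={"query": query, "format": "json"})
    bindings = resp.json()["results"]["bindings"]
    print(bindings[0]["item"]["value"] if bindings else "no match")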
Creating entities in advance for anticipated matches

OCLC also created Wikibase entity descriptions in advance for concepts and places that were anticipated to be mentioned in the CONTENTdm source data, so the OpenRefine reconciliation process would find something to match against. Entities for concepts were based on a set of headings from OCLC’s FAST subject vocabulary. Staff selected subject headings that are widely used in other databases, with the expectation that these would represent headings that would also occur in CONTENTdm metadata. The subject headings were transformed and loaded into the Wikibase as concept entities. This created an initial set of about 75,000 concept entities. In a second step, the FAST data was analyzed to find “broader concept” relationships for the 75,000 concept entities, and new concept entities were created for all of the “broader concept” FAST headings. Adding broader concept entities resulted in a total of over 100,000 concept entities being added to the Wikibase to support the CONTENTdm metadata matching process.

Entities for places that were anticipated to be found in CONTENTdm metadata were created based on information from the GeoNames geographical database, beginning with data describing cities with a population larger than 15,000 along with other place descriptions from administratively higher levels (countries, states, provinces, territories, counties, etc.). The GeoNames data processing produced about 70,000 Place entities for reconciliation.

This step of prepopulating the Wikibase with descriptions of entities for anticipated CONTENTdm headings helped reduce the barriers for entity creation. But the limits that were applied to the external sources meant that there were potential matches still to be found in FAST or GeoNames that had not been included, and potentially additional or richer data available from VIAF and Wikidata. Unmatched headings were reconciled against those services in OpenRefine, and if matches were found, the external source data was retrieved, converted, and loaded into the project Wikibase, and reconciliation was attempted again. OCLC also developed a separate application called the “Retriever,” described in more detail later in this report, that staff used to search for matches in Wikidata, VIAF, and FAST and create new entities with a simple web interface.

Testing an alternative OpenRefine reconciliation endpoint

During the second phase of the pilot, OCLC prototyped a new reconciliation endpoint for matching against headings in the project Wikibase, relying on separate indexes of entity data to speed up the reconciliation process. The performance metrics for this prototype service were very encouraging, as it does not rely on SPARQL queries and the triplestore for matching, which can slow the process down. The index response times were consistently much faster in OCLC tests. This efficiency gain comes at the cost of replicating and synchronizing data from the Wikibase in another index, but for this project the costs were easily managed. OCLC staff provided a detailed presentation on this prototype work as part of the OCLC DevConnect Online 2020 series. The DevConnect webinar sparked interest from developers who work on OpenRefine and the OpenRefine Reconciliation Service API, and OCLC has consulted with them to determine if any of our optimizations can be incorporated into their projects.

Creating placeholder entities for things that could not be reconciled

Some entities mentioned in CONTENTdm records could not be found in the controlled vocabularies and authority control systems that were used by OCLC for reconciliation. This was anticipated; indeed, one of the points of carrying out this pilot was to better understand how these references appear and how to account for them in a Wikibase, where the established identity of the entity is of great importance. The solution the team settled on was to create a “placeholder” entity with as much information about the referenced entity as could be extracted from the CONTENTdm description, for instance its type (person, organization, etc.), birth and death dates (if present), occupation, and a consistently applied component of the Wikibase description that would help suggest, for potential future matches during reconciliation, that the entity’s identity had not yet been established (figure 12).

FIGURE 12. A “placeholder” entity for a person without an established identity.
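A placeholder of this kind might be represented roughly as follows; the field names and values are invented for illustration.

    placeholder_entity = {
        "label": "J. A. Smith",
        # A consistently applied description marker flags the unestablished identity
        # for potential future matches during reconciliation:
        "description": "person; identity not established",
        "statements": {
            "instance of": "person",
            "occupation": "photographer",   # only when present in the source record
            "date of birth": None,          # omitted when the record lacks it
        },
    }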
Representing Compound Objects

Descriptions of cultural materials that consist of multiple parts, such as a photograph album or the recto and verso views of a postcard, can be structured in CONTENTdm as “compound objects.” Compound objects maintain the sequential order of related digitized items and can include an “object description” of the whole item, along with more detailed “item descriptions” about each part. The project team tested two ways to maintain this structure and descriptive detail in Wikibase entities that were created from CONTENTdm compound object metadata.

In the most granular and detailed approach, for a photograph album, an entity for the album was created along with separate entities for the album cover and its individual pages. Each cover or page entity has a “part of creative work” property linking back to the album entity. While this approach acknowledges the whole-part relationship of pages to the album, that relationship on its own cannot represent the sequential order of the parts. To document their sequential order, “has creative work part” statements were added to the description of the album, linking to the related parts, and each statement was qualified with a “series ordinal” property to represent the numeric sequence of the pages (figure 13).

FIGURE 13. Example “has creative work part” statements and sequencing for the first four parts of an album.

In reviewing other compound objects, a more typical pattern emerged: they included very little item-level descriptive data beyond a default caption such as “Page 1,” “Page 2,” etc. It was also noted that the IIIF Manifest that is present for compound objects maintains the structure, sequence, and captions of related images. That led to a decision to describe most compound objects as a single entity, without separate entities for the items in the compound object, relying on access to the structure, sequence, and caption-level metadata in the corresponding IIIF Manifest.
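The granular approach can be sketched as follows. The property names come from the report; the Q identifiers are invented placeholders.

    album = {
        "entity": "Q500",   # the album itself
        "has creative work part": [
            {"part": "Q501", "series ordinal": 1},   # cover
            {"part": "Q502", "series ordinal": 2},   # page 1
            {"part": "Q503", "series ordinal": 3},   # page 2
            {"part": "Q504", "series ordinal": 4},   # page 3
        ],
    }
    # Each part carries a reciprocal "part of creative work" statement -> Q500.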
Wikibase Ecosystem Advantages

The selection of the MediaWiki environment and its Wikibase extension brings several advantages right out of the box. Without custom software development or user interface design and testing, these can be employed to produce new data management and user experience benefits.

IMPLEMENTING AUTHORITY CONTROL

CONTENTdm currently has a traditional record-oriented data model, where headings for various entities are based on a single string. Varying cataloging practices and sources for controlled vocabularies can, in that approach, create obstacles to searching for the name of a person, organization, concept, place, or event if you do not know the exact form of the heading. But in the Wikibase environment, any number of different heading strings, in any number of languages, can be associated with an entity, greatly increasing the effectiveness of recall while strongly supporting precision as well.

FIGURE 14. Other names associated with the Los Angeles Dodgers entity. View a larger image online: https://researchworks.oclc.org/cdmld/screenshots/entity-Q166325.png

For example, in CONTENTdm a precise search to find works associated with the Los Angeles Dodgers baseball team may (depending on the cataloging practices of the institution) need to use the Library of Congress (LC) heading "Los Angeles Dodgers (Baseball team)." But in the Wikibase environment, the entity describing that organization could be found using that LC preferred form, or any of several current colloquial names or previous official names, including "LA Dodgers," "Brooklyn Dodgers," "Trolley Dodgers," "Brooklyn Grays," and others (figure 14). In the Wikibase environment each entity is registered with, and retrievable by, its own unique identifier, separate from any and all names with which it may be associated.
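Recording those alternate names is a one-call operation in the Wikibase software. A minimal Pywikibot sketch follows; the item ID and site configuration are hypothetical, and the names are simply those listed above.

import pywikibot

site = pywikibot.Site("en", "projectwikibase")  # assumed configuration
repo = site.data_repository()

# Q42 stands in for the project Wikibase's ID for the organization entity.
team = pywikibot.ItemPage(repo, "Q42")
team.editAliases(
    {"en": ["LA Dodgers", "Brooklyn Dodgers", "Trolley Dodgers", "Brooklyn Grays"]},
    summary="Record colloquial and previous official names",
)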
DECREASING CATALOGING INEFFICIENCIES, INCREASING DESCRIPTIVE QUALITY

In a record-oriented system like CONTENTdm, if a cataloger wants to include biographic or other descriptive information about an entity associated with a work, such as information about the photographer of an image or about a depicted person, that information needs to be added as the value of a field in every record where it is applicable. Then, if information about the related entity needs to change, all the associated records need to be updated to keep that information current and synchronized. This data management overhead may be one reason why descriptions of related entities are not common in traditional cataloging environments.

FIGURE 15. First parts of the description of Jasper Wood. View a larger image online: https://researchworks.oclc.org/cdmld/screenshots/entity-Q147700.png

In the Wikibase environment, entities for works and for things associated with the work are maintained separately. The description of the photographer, or of the depicted person, can be entered and maintained in one entity description, as illustrated in figure 15, and any changes to that description can be immediately seen through the relationships the entity has to other entities. This efficiency improvement could encourage richer descriptions of related entities, including context and relations that are not typically added in existing record-oriented systems.

GENERATING DATA VISUALIZATIONS

As the system architecture diagram included in this report represents, the Wikibase ecosystem includes a component that watches for changes in the Wikibase entity descriptions, retrieves that data in the form of linked data triples, updates the data in a linked data database or "triplestore," and provides a separate user interface for querying that data using the SPARQL query language. The user interface has built-in tools for constructing SPARQL queries and determining how the results can be visualized. The SPARQL query language is a powerful tool for making connections between and across entities, producing results that would be difficult, and in some cases not feasible, to obtain in a traditional record-oriented system. As shown in figure 16, a simple SPARQL query can retrieve all of the entities for places that are said to be depicted by works in a collection and, using the geographic coordinates in the place entity descriptions, locate each place in a map visualization along with information about the related work.

FIGURE 16. SPARQL Query map visualization of places depicted in works from a collection. View a larger image online: https://researchworks.oclc.org/cdmld/screenshots/sparql-visualization.png
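A query of the kind behind figure 16 might look like the following sketch. The endpoint URL is hypothetical, and the P180 ("depicts") and P625 ("coordinate location") identifiers follow Wikidata conventions as stand-ins for the project data model's equivalents; the #defaultView:Map comment is the query service's built-in hint to render results as a map.

import requests

ENDPOINT = "https://example.org/query/sparql"  # hypothetical project endpoint

QUERY = """
#defaultView:Map
SELECT ?work ?workLabel ?place ?placeLabel ?coords WHERE {
  ?work wdt:P180 ?place .    # work depicts a place
  ?place wdt:P625 ?coords .  # place has geographic coordinates
  SERVICE wikibase:label { bd:serviceParam wikibase:language "en" . }
}
"""

rows = requests.get(
    ENDPOINT, params={"query": QUERY, "format": "json"}, timeout=30
).json()["results"]["bindings"]
for row in rows:
    print(row["workLabel"]["value"], "->", row["placeLabel"]["value"])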
User Interface Extensions

MEDIAWIKI GADGETS

The MediaWiki platform for Wikibase provides a Gadgets extension that can be used to develop and add custom features to the user interface. OCLC staff took advantage of this feature to extend the interface, both to alter the user experience and to provide quality assurance tools.

Adding the Mirador viewer

Mirador is a configurable, extensible, and easy-to-integrate image viewer that enables image annotation and comparison of images from repositories dispersed around the world. It can interpret the metadata and images that are included in IIIF Presentation Manifests. CONTENTdm generates IIIF Manifests for all of its image-based content, so Mirador was a great fit for this pilot project. Without an embedded image viewer, the Wikibase item entity displays are limited to text and are static. The Mirador viewer adds a degree of interactivity to the user experience: images can be viewed in detail and, for compound objects, pages can be turned, all without leaving the Wikibase user interface, as shown in figure 17.

FIGURE 17. Mirador image viewer embedded in the Wikibase user interface. View a larger image online: https://researchworks.oclc.org/cdmld/screenshots/entity-Q165895.png

Showing contextual information from Wikidata

One of the most important value propositions of working with linked data is for entities to link to related things in other systems, leveraging the network to obtain more contextual data "on the fly" instead of duplicating data across systems. In the linked data project Wikibase, many entities included identifiers for descriptions of the same entity in Wikidata. OCLC developers created a Wikibase gadget that could detect the presence of the related Wikidata identifier in an entity description, connect to Wikidata in real time to find an associated Wikipedia article link, and use the Wikipedia link to obtain a summary description of the entity and, in many cases, a related image from Wikimedia Commons.

OCLC developers found that this MediaWiki Gadget was simple to write. But the Gadget depended on a separate and more complex application created by OCLC developers that made all the system connections, carried out the database searches for contextual information, and cached that information so as not to overburden the other shared services. The resulting contextual information included in a Wikibase entity description of San Francisco is illustrated in figure 18.

FIGURE 18. Contextual data and image from DBPedia and Wikimedia Commons embedded in the Wikibase user interface. View a larger image online: https://researchworks.oclc.org/cdmld/screenshots/entity-Q71945.png

Revealing constraint violations

Constraints are a Wikibase quality assurance feature that can be defined for properties and classes to describe their expected or allowed uses. For example, the property for "birthplace" might have a type constraint indicating that the property should only be used on items that are instances of the class "Person," another constraint indicating that the object of a birthplace statement should be an instance of the class "Place" or one of its subclasses, and a cardinality constraint indicating that an entity should not have more than one birthplace statement. Leveraging the project's SPARQL Query Service and its triplestore, OCLC developed a gadget that can compare the properties set for an item with any constraints set for those properties and return a list of "constraint violations." In some cases these will represent errors in the description that should be corrected. In other cases they can point to adjustments that may be needed in the data model.

As illustrated in figure 19, a constraint violation is noted for the Soviet space dog "Laika." An occupation property has been set to "Astronauts," but the type constraint for the occupation property indicates that it should only be used for instances of the type "person." This view helps the project team see the violations generated by unexpected data and decide whether to modify the constraints, in this case based on what the community decides about occupations and whether they can be associated with beings other than persons.

FIGURE 19. A constraint violation indicating that the "occupation" property should only be used for instances of the type "person." View a larger image online: https://researchworks.oclc.org/cdmld/screenshots/entity-Q73246.png
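A constraint check like the one behind figure 19 can be expressed directly in SPARQL. The following query string, run against the project endpoint in the same way as the earlier sketch, is illustrative only; P106, P31, and Q5 follow Wikidata conventions and stand in for whatever identifiers the project data model assigns to "occupation," "instance of," and "person."

# Find items that carry an occupation statement but are not declared
# instances of "person"; each result is a candidate constraint violation.
CONSTRAINT_QUERY = """
SELECT ?item ?itemLabel WHERE {
  ?item wdt:P106 ?occupation .
  FILTER NOT EXISTS { ?item wdt:P31 wd:Q5 . }
  SERVICE wikibase:label { bd:serviceParam wikibase:language "en" . }
}
"""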
CONTENTDM CUSTOM PAGES

A very useful feature of the CONTENTdm system is the ability to create Custom Pages using CSS and Javascript to adjust and extend the default user interface features. A wide array of examples can be found in the CONTENTdm Customization Cookbook site. The CONTENTdm pilot used this customization feature to test how linked data from the pilot project Wikibase could power two enhancements to the production CONTENTdm system's item displays.

Embedding Schema.org JSON-LD in CONTENTdm pages

Schema.org is a collaborative, community activity with a mission to create, maintain, and promote schemas for structured data on the internet, on web pages, in email messages, and beyond. By mapping the CONTENTdm Wikibase data model to Schema.org classes and properties, and by developing a conversion program to generate Schema.org-compatible descriptions of entities in the Wikibase, OCLC developed a CONTENTdm customization that embeds the Schema.org data within a CONTENTdm item page, formatted as JSON-LD, to make the content of the page easier for search engines to find and interpret (table 1).

TABLE 1. Example Schema.org JSON-LD for a CONTENTdm entity

The visibility to search engines of this embedded JSON-LD Schema.org metadata can be evaluated using applications like Google's Structured Data Testing Tool (figure 20).

FIGURE 20. Schema.org data evaluated using Google's Structured Data Testing Tool. View a larger image online: https://researchworks.oclc.org/cdmld/screenshots/google-structured-data-testing-tool.png

Showing contextual information for headings based on Wikibase data

Similar to the Wikibase user interface gadget that adds contextual information about a single entity by connecting through Wikidata to obtain related information from Wikipedia, DBPedia, and Wikimedia Commons, an application was written that could be called by a CONTENTdm Custom Javascript component. Using the CONTENTdm item identifier to find the related entity for the work in the pilot Wikibase, the application also finds other entities related to the work (the collection of which it is a part, subjects that it is about, its creator, and more) and, for each of those entities, looks for and displays more information, including an abstract and a thumbnail image. This customization, shown in figure 21, was demonstrated to the pilot participants, and there was interest in applying it to some of their collections, but the project did not see a production implementation of it, beyond OCLC's testing, before the pilot period ended.

FIGURE 21. Additional contextual information displayed in CONTENTdm based on entity descriptions in the pilot Wikibase. View a larger image online: https://researchworks.oclc.org/cdmld/screenshots/cdm15725-p16003coll7-14.png
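The core lookup chain that both of these contextual-information features depend on, from a Wikidata identifier to a Wikipedia article summary and thumbnail, can be sketched with the public Wikidata and Wikipedia APIs as follows. This is a simplification of OCLC's application, which also cached results to avoid overburdening the shared services.

import requests

def context_for(qid, lang="en"):
    # Fetch the Wikidata entity and find its Wikipedia sitelink.
    entity = requests.get(
        "https://www.wikidata.org/wiki/Special:EntityData/%s.json" % qid,
        timeout=30,
    ).json()["entities"][qid]
    sitelink = entity.get("sitelinks", {}).get(lang + "wiki")
    if not sitelink:
        return None
    # Use the Wikipedia REST API to get a summary and thumbnail image.
    summary = requests.get(
        "https://%s.wikipedia.org/api/rest_v1/page/summary/%s"
        % (lang, sitelink["title"].replace(" ", "_")),
        timeout=30,
    ).json()
    return {
        "extract": summary.get("extract"),
        "thumbnail": summary.get("thumbnail", {}).get("source"),
    }

# e.g. context_for("Q62") returns summary text and an image for San Francisco.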
New Applications

The Linked Data project was well served by the "out of the box" features and functions of the MediaWiki platform, its Wikibase extension, the SPARQL Query Service interface, the MediaWiki Gadgets component, and CONTENTdm Custom Pages. But for the more complex investigations carried out during the project, the following prototype applications were developed:

• The Image Annotator, to evaluate how subject matter experts could assist catalogers in describing images
• The Retriever, to make the process of finding and adding new entity descriptions more efficient
• The Describer, to investigate alternatives to the default Wikibase editing interface
• The Explorer and the Transportation Hub, to demonstrate the value of aggregation and new discovery system features that maximize the value of linked data
• The Field Analyzer, to assist metadata managers with analyzing their current collections

THE IMAGE ANNOTATOR

The CONTENTdm metadata transformation and reconciliation process produced descriptions of creative works that included, among other statements, relationships to other entities that the creative work either depicted or was, in a more general sense, "about." The distinction between these two relationships was not always certain, and a project goal was to better understand how this distinction is discerned by those managing digital collections. There was also an interest in testing whether the Wikibase platform could serve as the basis for new application development, in this case an interface that would let domain experts and others augment the transformed CONTENTdm metadata with new annotations.

The Image Annotator application was developed and tested to investigate those questions. Given the Wikibase entity identifier for a creative work, it initially presents the work's image along with a list of the "about" or "depicts" statements that are part of the entity description. This selective presentation of just some of the elements of the entity description was designed to give focus to the questions at hand: What is the image about, what does it depict, and can portions of the image be associated with depicted things? A user can review the statements that were created as part of the CONTENTdm metadata conversion process and quickly update any statements that need adjusting, for example changing an "about" statement to a "depicts" statement if they determine that the related entity is truly depicted in the image (figure 22).

FIGURE 22. Image Annotator initial view of an image and subjects. View a larger image online: https://researchworks.oclc.org/cdmld/screenshots/image-annotator-1.png

And for any "depicts" statements, the user can apply the image cropping tool to associate the appropriate portion of the image with the depicted entity, providing a much finer-grained reckoning of the item and supplementing the Wikibase with new images associated with other entities (figure 23). The IIIF Image API supports the management of these selections and the persistent retrieval of the associated images.

FIGURE 23. Image Annotator cropping an image of a person. View a larger image online: https://researchworks.oclc.org/cdmld/screenshots/image-annotator-2.png
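The persistence of those cropped selections comes from the IIIF Image API's URL syntax, in which the selected region is part of the image URL itself. A small sketch follows; the image service base URL is hypothetical.

def iiif_crop_url(image_service, x, y, w, h):
    # IIIF Image API pattern: {service}/{region}/{size}/{rotation}/{quality}.{format}
    # The x,y,w,h region selects the cropped portion in pixel coordinates.
    return "%s/%d,%d,%d,%d/full/0/default.jpg" % (image_service, x, y, w, h)

# A 400 x 300 pixel region starting at (1200, 450):
print(iiif_crop_url("https://example.org/iiif/photo123", 1200, 450, 400, 300))
# -> https://example.org/iiif/photo123/1200,450,400,300/full/0/default.jpg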
The subject relationships shown in the Image Annotator are based on the CONTENTdm source data, but this new application gives users the opportunity to supplement the entity description with more "about" or "depicts" statements by searching for related entities and adding the new connections, with another cropped image if appropriate, as illustrated in figure 24 with the addition of the subjects "Baseball umpires" and "Catchers (Baseball)" and associated images.

FIGURE 24. Image Annotator after adding more depicted subjects. View a larger image online: https://researchworks.oclc.org/cdmld/screenshots/image-annotator-3.png

Once the changes have been made in the Image Annotator, they can be saved to the Wikibase, where they are immediately visible in its user interface (figure 25).

FIGURE 25. Wikibase item updated with illustrated depicts statements. View a larger image online: https://researchworks.oclc.org/cdmld/screenshots/entity-Q148552.png

User study results

The usability of the Image Annotator was tested in November and December 2019 in three separate "Think Aloud" user studies. In this type of study, test participants use the system while continuously thinking aloud, that is, verbalizing their thoughts as they move through the user interface. Reactions to and suggestions about the Image Annotator were for the most part very positive, but the studies also identified user interface and indexing improvements that would need to be implemented before it could become a truly productive tool.

The test results indicated that the Image Annotator application was usable, as everyone was able to complete the exercise steps. The results also helpfully revealed usability issues that had not been encountered during OCLC staff testing. The wrap-up discussions provided especially good feedback and suggestions for improvements. Test participants noted that the Image Annotator would be a useful tool for cleaning up metadata and would provide an easy way to bring subject matter experts from outside the library into the process of describing cultural materials.

The test results also identified several areas needing improvement before the Image Annotator would be ready for regular use, including search and retrieval, scalability, user interface issues, and guidelines for descriptive practice. In the area of search and retrieval, it was unclear to participants that "free text" subject annotations that did not match a heading would not be retained when the entity was updated.
Participants found that expected search results were not returned; for example, a search for "kite" did not show a match for "kites." And some participants hoped that vocabularies from nonlibrary domains could be included as sources for related entities: "That's ultimately what makes digital collections meaningful." Scalability of the manual effort was a noted concern, in that it takes time to make annotations for individual objects, and collections can include thousands of objects. Some wondered whether crowdsourcing, under certain management and controls, could address that concern.

The Image Annotator's mechanism for adding and removing annotations presented some usability obstacles: the absence of the "camera" button for "about" headings was confusing, a cropping icon would be preferable to a camera icon for that button, the camera and "add a depiction" buttons compete for attention, and there isn't a way to delete a cropped image without deleting the entire depicts statement. Some participants wondered how many subjects are "enough," and which subjects are notable enough to deserve annotation. They wished for easier access to other descriptive metadata for the work, to help identify additional depicted subjects. The gray area between "about" and "depicts" was discussed by all without a clear consensus on when to select one or the other, though participants generally felt that "depicts" should carry more of a guarantee that the depicted thing is visible in the object, in its entirety or as a significant portion. For example, a photograph of Public Square in Cleveland would depict Public Square but be about Cleveland.

All the test subjects noted that the Image Annotator was enjoyable to use:

• "I like this little system."
• "Once you get going it's actually kind of fun."
• "One of the things you're offering is a way to have fun—quite literally a window into new ways of thinking about what we do."

THE RETRIEVER

When describing an entity using the Wikibase interface, the workflow can come to a halt if you are trying to establish a relationship between the entity you are describing and some other entity that is not yet in the Wikibase. To fill this gap, OCLC developed an application called the "Retriever" that can quickly search for an entity described in other systems and transfer those descriptions into the Wikibase as a new entity.

For example, suppose you are describing a photograph of Lake Vermilion in Minnesota and want to add a "depicts" statement linking the photograph entity to that place entity. If there isn't already an entity in the Wikibase describing Lake Vermilion, you would need to stop editing the photograph's description, switch to a new Wikibase editing window to create an entity for the lake, and then return to the photograph entity description to add the statement claiming that the photograph depicts the lake. That kind of disruption to the workflow can be reduced if there is a way to quickly add the missing entity's description to the Wikibase. When a Wikibase is in its early stages, unless it is prepopulated with entity descriptions from another source, this situation will be commonplace.

In many instances, however, the missing entity is already described in some other authority control system or vocabulary. A tool that can find those descriptions, transform the data to align with the classes and properties in the Wikibase, provide an opportunity for human review and correction of the transformed data, and then automatically load the result into the Wikibase as a new entity can help bridge the gap and keep the cataloging work flowing.
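A minimal sketch of that lookup-and-load pattern for the Wikidata leg, using the public wbsearchentities API for the search step and Pywikibot for the load step, follows; VIAF and FAST have their own search APIs, and the labels, description text, and site configuration here are illustrative assumptions.

import requests
import pywikibot

def wikidata_candidates(term, lang="en", limit=5):
    # Keyword search against the public Wikidata API.
    r = requests.get(
        "https://www.wikidata.org/w/api.php",
        params={
            "action": "wbsearchentities",
            "search": term,
            "language": lang,
            "format": "json",
            "limit": limit,
        },
        timeout=30,
    )
    return r.json().get("search", [])

def load_entity(repo, label, description):
    # After human review, create the new entity in the project Wikibase.
    item = pywikibot.ItemPage(repo)
    item.editEntity(
        {"labels": {"en": label}, "descriptions": {"en": description}},
        summary="Create entity from external source data",
    )
    return item

for hit in wikidata_candidates("Lake Vermilion"):
    print(hit["id"], hit.get("label"), "-", hit.get("description", ""))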
OCLC designed the Retriever to provide a simple keyword search interface to look for matching items in Wikidata, VIAF, and FAST (figure 26), a user interface for reviewing and editing data extracted from those sources (figure 27), and a back-end process for loading the transformed data into the Wikibase (figure 28). This application was originally developed, for the same use case, in OCLC's Project Passage. The user interface component was rewritten in the Linked Data pilot to use a different Javascript framework, but the functionality was generally the same.

FIGURE 26. Retriever search results from Wikidata, VIAF, and FAST for "Lake Vermilion." View a larger image online: https://researchworks.oclc.org/cdmld/screenshots/retriever-1.png

FIGURE 27. Retriever entity editor. View a larger image online: https://researchworks.oclc.org/cdmld/screenshots/retriever-2.png

FIGURE 28. Wikibase entity created by the Retriever. View a larger image online: https://researchworks.oclc.org/cdmld/screenshots/entity-Q221424.png

A server-based part of the Retriever application takes search requests from the browser, handles the mapping of external source data elements to the project Wikibase properties and classes, and uses the Python Pywikibot library to carry out data loading into the Wikibase.

THE DESCRIBER

The Linked Data project's goals included testing editing interface alternatives to the default Wikibase user interface. OCLC began development of a prototype web application named the "Describer" that aspired to provide a guided mode for cataloging entities for works, illustrated in figure 29. The user experience in the Describer would begin by prompting the cataloger to choose the type and classification of the material being described. Based on those selections, the Describer would prompt for additional details that would be common or expected for entities of that type and classification, factoring in property constraints and other details of the underlying data model. The Describer could also incorporate capabilities and features that had been previously tested in the Image Annotator and the Retriever.

Work on the Describer prototype was not completed before the end of the pilot, but initial testing suggested promise while also revealing the importance of carefully documenting the data model constraints in order to drive the user experience. Though not part of this pilot, a related investigation that could prove similarly illuminating would be to evaluate a language designed to express the shape of the data, such as SHACL or ShEx, as the mechanism for defining how the data model works and how it relates to user interface development.
FIGURE 29. Editing essential details for an entity in the Describer. View a larger image online: https://researchworks.oclc.org/cdmld/screenshots/describer-1.png

THE EXPLORER AND THE TRANSPORTATION HUB

An important value proposition for making the transition to linked data is the ability to browse or navigate across the graph of data connections to find important related entities and to reveal relationships that would be hard to see in a more traditional record-oriented search and retrieval system. To evaluate this potential, OCLC developed a prototype web application named the "Explorer" to focus on the most frequently occurring connections between entities, see relationships that were described by different institutions for different items in different collections, look for thematically related content, and follow the graph-based connections to locate important related entities.

The home page of the Explorer lists entities organized across a subset of categories, sorted by frequency, to help researchers jump into the browsing experience and quickly see what the pilot project data is mostly "about" (figure 30). The Explorer also has a keyword search interface.

The collections selected for evaluation in the first two phases of the pilot project were all interesting and, as a group, gave us a good idea of the range of data transformation and reconciliation challenges we would likely encounter when working with other CONTENTdm sites and collections. They were not, however, chosen with any special attention to how the materials they describe might thematically overlap. To generate more topically related connections across the pilot participants' data, OCLC assembled a new selection of CONTENTdm metadata records based on the topic of transportation. Using a general search for transportation-related subjects (the subject terms used were "streetcars," "transportation," "roads," "highways," "airports," "railroads," "automobiles," "ferries," "rockets," "ships," "boats," "streets," and "paths"), OCLC staff searched across all collections in each pilot participant's CONTENTdm site, gathered the resulting metadata records, and transformed the data for loading into Wikibase, reconciling as many headings to related entities as could be done without significant amounts of human attention (a simple sketch of this subject-gathering step appears after figure 30). This step provided more data for assessing the scalability of the data transformation process, as this more automated and streamlined effort could be compared with the very thorough and largely manual process applied to the initial set of pilot project collections.

FIGURE 30. Explorer home page. View a larger image online: https://researchworks.oclc.org/cdmld/screenshots/explorer-1.png
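The gathering step referenced above can be imagined as a simple subject filter over exported metadata records. The sketch below is illustrative rather than the pilot's actual process; it assumes records represented as dictionaries whose "subject" field is a semicolon-delimited string, a common CONTENTdm convention.

TRANSPORT_TERMS = {
    "streetcars", "transportation", "roads", "highways", "airports",
    "railroads", "automobiles", "ferries", "rockets", "ships", "boats",
    "streets", "paths",
}

def is_transport_related(record):
    # Split the delimited subject field and compare, case-insensitively,
    # against the transportation term list used for the Hub.
    subjects = {s.strip().lower() for s in record.get("subject", "").split(";")}
    return bool(subjects & TRANSPORT_TERMS)

# Tiny sample standing in for metadata gathered from each pilot site:
exported_records = [
    {"subject": "Streetcars; Labor strikes"},
    {"subject": "Portraits"},
]
hub_records = [r for r in exported_records if is_transport_related(r)]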
OCLC established a new virtual collection entity in the Wikibase for a "CONTENTdm Transportation Hub" and associated all the new Wikibase items for the related works with this collection, along with their original source collections. In the Explorer, the Transportation Hub collection can be selected as the starting point for browsing and selection, with facets helping to narrow the scope to different topics, things depicted, source collections, and more (figure 31).

FIGURE 31. Explorer Transportation Hub and related collections. View a larger image online: https://researchworks.oclc.org/cdmld/screenshots/explorer-2.png

The Transportation Hub can also be used to narrow a keyword search. For example, a keyword search for "strike" shown in figure 32 matches descriptions of items associated with labor strikes of various kinds (among other things), and narrowing the keyword search result to the Transportation Hub collection can highlight images and other works associated with transit strikes.

FIGURE 32. Explorer search results for "strike." View a larger image online: https://researchworks.oclc.org/cdmld/screenshots/explorer-3.png

For a researcher interested in that topic, the Explorer can return very different perspectives on a particular transit strike: for example, a Philadelphia Evening Bulletin newspaper photograph depicting the effect of the Philadelphia Transit Company strike of August 1944 on transportation options for workers (figure 33) contrasts with a John W. Mosley Photograph Collection image of a protest from the previous year in support of hiring African American trolley drivers (figure 34).

FIGURE 33. Explorer view of a truck bringing employees home during a PTC walkout. View a larger image online: https://researchworks.oclc.org/cdmld/screenshots/explorer-4.png

FIGURE 34. Explorer view of a protest against the Philadelphia Transportation Company. View a larger image online: https://researchworks.oclc.org/cdmld/screenshots/explorer-5.png

The Transportation Hub also helps to find images of transit strikes and their impacts in collections across institutions, including a Cleveland Public Library photograph of crowds surrounding a trolley car during a transit strike in 1899 (figure 35) and a University of Miami photograph of parked trolley cars during a strike in Havana (figure 36).

FIGURE 35. Explorer view of an 1899 Cleveland transit strike in Public Square. View a larger image online: https://researchworks.oclc.org/cdmld/screenshots/explorer-6.png
FIGURE 36. Explorer view of streetcars parked on the street during a transit strike. View a larger image online: https://researchworks.oclc.org/cdmld/screenshots/explorer-7.png

THE FIELD ANALYZER

Late in the pilot project, the OCLC developers saw a need for a new tool that could visualize how CONTENTdm fields are defined across different collections for participating institutions. This field-level analysis had been carried out in earlier phases of the project as a largely manual process, using CONTENTdm APIs and custom applications to gather data and reformat it for analysis in OpenRefine. After those manual processes had been ironed out, OCLC staff found that they could be automated and extended to provide different views of how fields are defined and used across collections in a simple web application.

FIGURE 37. Field Analyzer field usage chart. View a larger image online: https://researchworks.oclc.org/cdmld/screenshots/field-analyzer-1.png

Many participants found the Field Analyzer to be a useful addition to their CONTENTdm toolkit, giving them a cross-collection view of how their collection vocabularies are defined. During the pilot, the data that could be listed and visualized by the Field Analyzer was based on a "snapshot" of records copied from CONTENTdm and needed to be periodically refreshed to reflect any subsequent changes. Pilot participants expressed interest in having the Field Analyzer maintained after the end of the pilot for ongoing use, with access to "live" or frequently synchronized data, and for all collections.

FIGURE 38. Field Analyzer list of field values. View a larger image online: https://researchworks.oclc.org/cdmld/screenshots/field-analyzer-2.png

Cohort Communication

Communication is key to the success of any project, and it was vital for effectively collaborating with the Linked Data project participants. In addition to using the CONTENTdm Community Center for addressing questions and tracking progress, the project participants and OCLC staff met every two weeks for an "office hour." These sessions covered work in progress, planning for future stages, and demonstrations of new applications and processes. Apart from those regularly occurring topics, many sessions included a more open-ended group exploration of other questions, including:

• What local authority sources are used for reconciling headings?
• What user research practices have been applied to evaluate your systems?
• How are access, use, and reuse rights managed at your institution? Are these rights documented in CONTENTdm? Are there different rights assigned for physical vs. digital materials?
• How is CONTENTdm technical and administrative metadata managed?
• How and when should a "placeholder" entity description be created for things that lack an established identity?
• What are current local practices for metadata cleanup, and do the work location changes made in response to the COVID-19 pandemic affect the priority of that work?
• Who is using the CONTENTdm Catcher utility, and who else might use it that isn't yet?
• How could advancing racial equity in CONTENTdm descriptive metadata be facilitated?

The office hours served as a key point of communication and connection over the course of the project. Engaging in discussions about the challenges and day-to-day work of managing digital collection metadata, and receiving real-time feedback about the applications and tools under development, provided OCLC staff with critical insights that informed and improved project outputs. Exploring the questions listed above as a group helped participants share experiences and learn from one another. The regular connection points that the sessions provided were especially valuable with the onset of the COVID-19 pandemic and the ensuing facility closures. Amid many disruptions, the meetings included periodic check-ins for project participants to discuss and reflect on the effects of the pandemic on their work and their libraries in the near to long term; the reported impacts were varied and substantial.

Partner Reflections

At the end of the Linked Data project, the project partners provided their perspectives, representing both complementary and contrasting views of their experiences, the benefits returned, and the implications for the future.

CLEVELAND PUBLIC LIBRARY (CHATHAM EWING)

Cleveland Public Library (CPL) has partnered with OCLC on metadata issues over the last several years, beginning with Project Passage in 2018-19 and continuing in 2019-20 with this Linked Data project. CPL believed the projects would have several potential benefits. The projects presented an opportunity to motivate our staff and institution to revisit, revise, and improve our metadata content and structure. They had the potential to lead to better and more accurate description and, consequently, improved discovery for our customers. The projects also provided motivation to rethink how we might enable more effective sharing through platforms such as DPLA or WorldCat.

Over the course of the projects, our digital library staff engaged with and learned from other partners and OCLC's team. OCLC's team helped us deeply consider how linked data could have an impact on our descriptive work. CPL staff presented on collections and processes using our scrapbooks project as an example, were recorded live for a user interface usability study, submitted CPL collections for analysis through the Field Analyzer and got back a useful matrix that enabled analysis of our work, frequently conferred with OCLC team members as well as other partners, and more. Our staff worked diligently to raise questions related to public library practice. Though the latter part of the Linked Data project happened during COVID-19, hampering our ability to explore some socially oriented goals with CPL digital library partners, staff were grateful for the intellectual lifeline the Linked Data project provided during the lockdown, and we eagerly anticipate working with project tools in the future to keep exploring some of the community-oriented possibilities for metadata brought up by the project. We believe that much of what we anticipated did happen, but additional insights emerged from the process.
The experience and results strongly validate implementing more, and more effective, approaches to the use of linked data in digital library contexts in public libraries, and we strongly support the report's call for further investigation into using linked data. We also agree with the recognition that tools for reconciling data, particularly data such as name authorities and discipline-specific thesauri, should be an integral part of any advance in digital library tools within OCLC's suite of digital library applications.

But we feel there is another observation to make: digital collections described using linked data might be able to help explain what is "uniquely the same" about Cleveland as a place in the United States and the world. Representing what is locally unique, yet making local information legible to outsiders and creating a mechanism where differences can be understood, bridged, and linked, is an important part of what public libraries using newer descriptive systems can surface. Because we already involve diverse community members and community partners in the creation of digital items for our collections, it seemed natural to think about how we might include our partners in description as well.

During the project we looked at several examples of collecting information about and from CPL digital collections. We looked at our digital collections of scrapbooks, local newspapers, local theaters, and the library archives, and we tried to identify places where we had drawn description from the language of the communities we were working with rather than from internationally scoped name authority lists or cataloging thesauri. And that was promising, we thought. But it was when the project took a turn and collections like these were juxtaposed against one another that things became interesting.

The "Transportation Hub" was a useful example of how this concept was explored by the project. Each institution's collecting around transportation was pulled together into a gathered collection, and the project implemented a platform that offered a glimpse of how to explore the role the digitized items played in each of the communities documented by the separate collections. The project team at OCLC discussed the challenges of normalizing and reconciling the data in the collections for the Transportation Hub, and the process highlighted typical challenges in reconciling data and enabling searching across multiple institutional collections. And, as we mentioned before, the process also highlighted the labor-intensive nature of such work and spotlighted a long-standing need for more robust tools within the OCLC suite of applications for managing controlled vocabularies across collections. However, the OCLC staff's discussion of the process also raised the question of what to do about significant local variances in uncontrolled description language for digitized items. Perhaps we should also uncover and share different communities' understandings of more generalized concepts?
It would seem that a linked data system holds out the promise of capturing some locally generated data that reflects local variances while also offering traditional authoritative descriptive data. We feel that a linked data system that includes a broader, more locally oriented mechanism for participation in description would be a powerful tool for our digitization work in our community. For us at CPL, this meant beginning to consider how we might describe collections that not only made use of authorized names, subjects, and thesaurus terms to describe Cleveland's unique and local digital collections, but that also described our city and region's uniqueness and made it legible through networks of links to alternative information, including local lingo, alternate lists of names, and diverse uses of descriptive language found in lists and resources that might supplement more standard descriptions.

A natural extension of this kind of thinking is that (for public libraries at least) tools need to be easy to use not only for professional catalogers but for community experts as well. A wiki offers a simple, public-facing user interface, but other interfaces might also be designed to be more inclusive and accessible, allowing digital projects to easily incorporate local, grounded expertise that librarians cannot often be expected to have. Linked data systems can facilitate that inclusivity by creating and making connections between related or synonymous local terminology and concepts. Perhaps using linked data for description even has the potential to decenter hierarchies and master narratives about cultural heritage that may be implicit in approved authoritative descriptive practice, allowing alternative hierarchies and assumptions to surface and enrich descriptive practice. And the local/global break in epistemological understanding revealed through the Transportation Hub implies other breaks that could be drawn out from other collection gatherings, breaks that might also contribute to rich, differentiated hierarchies of description enabling more diverse access through richer and more inclusive community-generated description.

Perhaps the goal is a system that is usable by expert catalogers (because solid hierarchies are a backbone of effective access), comprehensible by local metadata experts (because local historians have awesome expertise), and open enough to capture (and sift) the kinds of description generated at the level of the general user. This might not only lead to higher-quality and more comprehensive metadata but also, if handled well, create opportunities for deep community listening. We could generate access points to our information based upon empirical observation of how our communities create links within our information. This kind of engagement might enable libraries to engage patrons and learners, using digitization as a process for delving into what really makes their communities wonderful and unique.

THE HUNTINGTON LIBRARY, ART MUSEUM, AND BOTANICAL GARDENS (MARIO EINAUDI)

The invitation to join this project in August 2019 came as the Huntington Library was reviewing the digital collections accessible in the Huntington Digital Library (HDL), which had been launched in 2011. In 2018 it had been determined that an overhaul and a full review of the metadata and structure of the 23 collections was needed.
We had hoped that the Linked Data project might aid us in this endeavor. Following the initial ingest of materials selected from three of our collections, the review and initial cleanup work, and the testing done by Bruce Washburn to feed our metadata into the Wikibase, it quickly became apparent that this pilot would not be able to help that cleanup directly. Rather, the Linked Data project provided us the context to better understand our workflows, our metadata, and how we structured that metadata.

Importantly, this project did demonstrate the incredible value of linked data as a way of creating and maintaining metadata. Linked data in the Wikibase enabled the creation of a web of connections and context that is lacking in many other systems. A good example of the power of the tools developed using linked data was the Image Annotator. This tool allows the user to highlight a section of an image and then apply one of the known entities to that highlighted section. This creates links between that image and other images in the collection that would not otherwise exist, unless the cataloger remembered that x also appeared in y and z. It provided a tantalizing look at a new tool for cataloging materials.

It would have been good to test some of these tools outside of the pilot. The Image Annotator, if reconfigured for use with a future CONTENTdm, would be a great improvement. Subject specialists could be brought on board a project and asked to identify people or places, with the linked data providing added indexing in a controlled environment in the background. The Explorer tool, too, would enhance discovery across collections, both internally to the library and, as part of a larger linked data universe, across other libraries, large or small.

While the project was focused on the benefits of linked data and how to create it, one tool grew out of the need to analyze the extant data in the participants' systems. That tool, the Field Analyzer, proved so useful that it stands above all the others. It enabled us to review all our collections systematically and plan cleanup more effectively, and it has allowed us to pursue our goal of descriptive uniformity across all our CONTENTdm collections. A companion tool that would replace, or build on, the Catcher interface, allowing the cleaned-up metadata to be pushed back into our CONTENTdm site, would also have been a real boon. But the complexities faced in cleaning up the data, along with the entity-based structure within Wikibase, foreclosed that option.

Throughout this Linked Data pilot project, OCLC staff were incredible, providing guidance, soliciting input, posing questions, and seeking solutions that engaged all the participants. The tools developed and the cleanup done by Bruce Washburn and Jeff Mixter show all the power and promise of linked data, as well as some of the hurdles. Yet this is a path that should be followed, especially as CONTENTdm shows its age. The leap forward to a new solution has been greatly helped by the solid work done by all on this project. We will use the knowledge gained from this project to rethink our workflows and our descriptive metadata with an eye toward the promise of linked data.

MINNESOTA DIGITAL LIBRARY (GRETA BAHNEMANN AND JASON ROY)

Invitation

In July of 2019, the Minnesota Digital Library (MDL) was asked to join the CONTENTdm Linked Data Pilot project. Initially, we were one of three pilot partners.
This invitation was an opportunity for us to see the practical application of Wikidata to MDL's collection of images. MDL is a collection of digitized cultural heritage materials comprising images, text-based materials, cartographic materials, and more, with 67% of the collection represented by images. Given our high percentage of images, we were especially interested to see how our image metadata would reconcile and work with Wikidata. Would MDL's metadata withstand this kind of work?

Development of three tools by OCLC

During the Linked Data project, OCLC developed three tools to assist the project participants:

1. Retriever—designed to help pilot partners search for and create entity descriptions; especially helpful for those new to the process
2. Image Annotator—a subject analysis tool that has the potential to change how we describe cultural heritage materials
3. Field Analyzer—developed in response to the needs of the pilot project participants but useful beyond the pilot. This tool provides partners with a back-end look at their data and gives a comprehensive view of how data is mapped, the field names used, etc. It quickly shows the inconsistencies in a collection's data regarding field names, mapping, and more

The Image Annotator has the most potential to change the user's understanding of digital content. With its capacity to provide both a layer of subject analysis and descriptive details to images in CONTENTdm, it is no less than groundbreaking. For example, an albumen photograph of a late 19th-century home in Minneapolis can be "about" the concepts/subjects of "Richardsonian Romanesque Style Architecture" and/or "Rock-face Construction," but it can also "depict" things found in the image, such as a horse-drawn wagon, a fire hydrant, pedestrians, named individuals, etc. This added layer of meaning and contextualization can only add to the user's understanding of the image. This type of analysis is traditionally associated with the fields of fine art, architecture, and urban planning, and it has the potential to add more nuanced description to cultural heritage materials and change how users understand them. While this tool is valuable and has huge potential for changing how we describe cultural heritage materials, it can be a labor-intensive process that may not be sustainable on a large scale.

Leveraging the power of linked data

In terms of linked data support, a lot of initial effort was spent discussing how controlled vocabularies might best be ingested and stored within the CONTENTdm framework. The rationale behind this, presumably, was to ensure that they would better integrate with our more hyperlocal vocabularies and taxonomies; that is, how best to blend national vocabularies alongside locally created terms to best describe the source material. Unfortunately, by bringing this "national" data into our local systems and storing it there, we take away some of the power of linked data, power that comes in the form of networked vocabularies that work best in a layer above our localized instances.
Linked data is powerful in part because it is not tied to any one system; rather, it integrates content across collections, thereby creating user-discoverable connections across collections and, more importantly, repositories. A path forward may be an opportunity for CONTENTdm to create web services that call upon these linked data sources at the point of need. This would allow catalogers and metadata creators to align their local descriptive practices more closely with national and international initiatives. CONTENTdm would store the URI, not the term itself, thus creating linkages that would allow for more accurate and consistent sharing without "hardwiring" terms into the CONTENTdm data store.

It is, ultimately, the data store itself that is the most valuable piece of information. From this building block we can construct user interfaces and applications, share out our metadata for others to package, and scale out across multiple, shared repositories. We consume this data to create our local, default CONTENTdm view, but this same data can be packaged and shared in new ways. Applications such as the one created by the Minnesota Digital Library consume the same data but build it out in different ways; additionally, this same data is openly shared with and aggregated by the Digital Public Library of America for use in their national initiative. Same data, different views. Ultimately, it is the data that must remain interoperable enough to work across systems and alongside other data sources. Within the limited timeframe of our project, OCLC was able to provide a proof of concept of the potential for enhancing CONTENTdm metadata through linked data integrations, by way of a single new view that builds upon the existing CONTENTdm user-facing discovery layer.

Concluding thoughts

We believe that this work should result in the further decoupling of some of these tight integrations in order to achieve our desired results: separating the data store from the data view layer; leveraging the URI for further linkages out toward reliable and trustworthy linked data sources within the data store itself; and allowing for the open sharing of our data (and our assets as well, through the existing IIIF infrastructure) with others, to achieve large scales of discovery and to better network our data alongside that of our colleagues.

Included in all of this should be a discussion of the future application of the tools OCLC developed for this project. The Image Annotator and Field Analyzer could be integrated into the CONTENTdm package and workflow to help CONTENTdm users (both administrators and crowdsourced end users) provide more robust, nuanced description via the Image Annotator. The Field Analyzer can also help CONTENTdm adopters see their data, across multiple collections, in a single interface. Both tools should be developed further, thereby making CONTENTdm more user-friendly for both administrators and end users.

The Minnesota Digital Library was excited to be a part of this pilot project. In addition to learning more about the practical application of Wikidata, it was a great opportunity to get to know staff at OCLC and to speak about the potential future of CONTENTdm in a collaborative environment.
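The store-the-URI approach suggested above implies a small resolution step at display or indexing time. As a sketch, for a stored Wikidata entity URI the label can be fetched on demand from the public EntityData endpoint; other vocabularies would need their own resolvers, and a production service would cache the results.

import requests

def label_for_uri(uri, lang="en"):
    # Works for Wikidata entity URIs such as http://www.wikidata.org/entity/Q42.
    qid = uri.rstrip("/").rsplit("/", 1)[-1]
    data = requests.get(
        "https://www.wikidata.org/wiki/Special:EntityData/%s.json" % qid,
        timeout=30,
    ).json()
    return data["entities"][qid]["labels"].get(lang, {}).get("value")

print(label_for_uri("http://www.wikidata.org/entity/Q42"))  # "Douglas Adams"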
TEMPLE UNIVERSITY LIBRARIES (HOLLY TOMREN AND MICHAEL CARROLL)

In 2019 we joined the CONTENTdm Linked Data project. Compared with our previous experience with Project Passage, the key difference was that while Project Passage focused on one-by-one original description, the CONTENTdm pilot was about batch transformation of existing metadata. OCLC staff consulted with us about how they planned to map our metadata and answered our questions, and we provided feedback about the mappings as well as questions about the data model, which was now much expanded from what we had started with in Project Passage. This gave us a sense of what a future data migration would look like, and of how migration to a linked data model can be even more complex than a migration from one flat metadata model to another. After OCLC transformed our data, we evaluated it to see how this could help us look at our metadata differently: where is there room for further data enrichment, and what new relationships and connections can we create with a system that is built to do so? One thing that particularly stood out was the different ways we could browse our data using the Explorer tool that OCLC developed. At Temple, our customized library discovery layer is built on three concepts: Search, Browse, and Recommend. But so far, we have only implemented Search. As we have thought internally about Browse features, we have struggled to find an approach that differs from the standard Title, Author, Subject browse of the past. The CONTENTdm Explorer offers a model that provides a variety of different starting points for browsing and then allows a user to traverse a graph of relationships, which is inspiring as we continue to develop our local discovery environment. Linked data also provides different opportunities for how we can search our CONTENTdm metadata, particularly through more indirect relationships between entities in the system. For example, we considered the use case of an “On This Day” feature to post on social media. We were able to develop queries in the SPARQL endpoint to find images that depict people born on a certain day, or images that depict people born in Philadelphia, which could help us select featured images for different scenarios. Participating in the project introduced us to the Wikibase interface and to exciting tools for enhancing the discovery of, and engagement with, digital records. The Wikibase offered a glimpse into what a digital collections database that employs linked data might look like and how the cataloging process might change. For instance, the inclusion of clickable headings for each entity has the potential to make it even easier for student catalogers to understand the context of the terms they use to describe an image. The Describer prototype tool provided a simplified visual interface that enables cataloging based on the resource type classification, and it felt more approachable than the Wikibase interface. The text box of the Describer automatically suggested verified terms, much like the controlled vocabularies in CONTENTdm, but it felt more intuitive and tailored to what the user was typing.
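The Temple team's “On This Day” queries ran against the pilot's own SPARQL endpoint, whose vocabulary is not reproduced here. As a hedged stand-in, the sketch below shows the same shape of query against the public Wikidata endpoint, using Wikidata's depicts (P180) and place-of-birth (P19) properties and Philadelphia's identifier (Q1345); the pilot Wikibase's actual property identifiers differed.

```python
from SPARQLWrapper import SPARQLWrapper, JSON

# A sketch of the "find images depicting people born in Philadelphia" query,
# run against public Wikidata rather than the pilot's private endpoint.
sparql = SPARQLWrapper("https://query.wikidata.org/sparql",
                       agent="linked-data-sketch/0.1 (example)")
sparql.setQuery("""
PREFIX wd:  <http://www.wikidata.org/entity/>
PREFIX wdt: <http://www.wikidata.org/prop/direct/>
SELECT ?work ?person WHERE {
  ?work   wdt:P180 ?person .   # work depicts a person
  ?person wdt:P19  wd:Q1345 .  # person's place of birth: Philadelphia
}
LIMIT 10
""")
sparql.setReturnFormat(JSON)
for row in sparql.query().convert()["results"]["bindings"]:
    print(row["work"]["value"], row["person"]["value"])
```

Swapping the birthplace triple for a filter on birth date would give the “born on this day” variant; the indirect hop from work to depicted person to biographical fact is exactly the kind of relationship a string-heading index cannot traverse.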
It was very useful from a cataloging perspective to have access—supported by IIIF standards and viewers—to the image, and to be able to zoom in to see details while describing it. We also thought the Image Annotator had a lot of potential for associating a part of an image with a specific depicts or subject property, and it would be interesting to see how that could be incorporated into the end-user discovery experience. One of the potential impacts of this project would be to rethink our cataloging workflows in accordance with a linked data structure. The Temple University team described existing images as a group exercise, which proved challenging without the original objects in front of us. It became clear during this exercise that we would also need to generate more nuanced descriptions when cataloging in order to develop a richer network of relationships between entities.

UNIVERSITY OF MIAMI LIBRARIES (PAUL CLOUGH AND ELLIOT WILLIAMS)

Participating in the Linked Data project was an opportunity for us to understand more concretely what it would take to transform our existing collections into linked data and what a linked data version of CONTENTdm might look like. Interacting with our metadata in the Wikibase environment raised valuable questions about how our existing metadata practices, such as a lack of standardization of elements and inconsistent use of existing vocabularies, might complicate the transition to linked data, and it inspired us to focus more on data normalization and consistency. Some of the insights and tools that came out of the project, such as the Field Analyzer, will be immediately useful for our work in CONTENTdm, even outside of the transition to linked data. Participating in a cohort with other CONTENTdm users and OCLC staff was also a great opportunity to learn from and with our peers. The Linked Data project demonstrated the amount of work involved in the transition to linked data, but also that the tools exist and that the workflows can be developed. While we appreciate the promise of linked data, we believe that more work still needs to be done to show that the effort will be worth it.

KEY FINDINGS AND CONCLUSIONS

The Linked Data project reaffirmed some prior lessons learned and provided new insights across a range of concerns, including the expected benefits of working in a linked data environment, the potential to develop a shared data model, a reality check on the effort required to transform metadata to linked data, and the essential benefits of a strong partnership.

TESTING THE LINKED DATA VALUE PROPOSITION

The project confirmed key aspects of the linked data value proposition: that cultural material discovery and data management can be significantly improved when the materials are described using a shared and extensible data model, when metadata string-based headings are transformed into linked data entities and relationships, and when those entities and relationships are brought together into a single discovery system.
In this environment, the technology works in service both to the staff, who can more easily and accurately impart the expertise they have about the collections they steward, and to the researcher, who can see more robust connections between—and context about—the cultural materials that make up CONTENTdm collections. In project prototype applications, entities can be retrieved by searches that use a persistent identifier rather than a string heading. This capability provides integrated authority control for the entities and greatly improves precision and recall in discovery. As CONTENTdm string headings are reconciled and converted to entities, additional information from external data sources can automatically and efficiently enrich the entity description. This supports new discovery and data visualization capacities that would be expensive or impossible to achieve in the current CONTENTdm system. For example, place entity descriptions can be enriched with geographic coordinates, which can then be used to generate map-based visualizations of places depicted in cultural materials. In an entity-oriented system like Wikibase, different types of entities have their own distinct representation. This design contrasts with record-oriented systems, where the creative work is the primary entity and other types of things are present only as statements representing notes and headings associated with the work. Data management and maintenance efficiencies are gained by transforming these statements into entities. For example, a biographical statement about a person can be associated with that person’s entity description, rather than repeated as a note in every record that is in some way about that person.

EVALUATING A SHARED DATA MODEL

Building an initial data model with a high-level structure informed by other standards, including Dublin Core and Schema.org, provided a solid set of initial classes and properties. The model could be effectively and responsively expanded based on new entities and relationships represented in the source metadata. The metadata and mapping discussions with pilot partners helped OCLC develop the data model, as data was encountered in the CONTENTdm sources that OCLC had not anticipated.

SELECTING AND TRANSFORMING METADATA

Data transformation tools should be shared and the workflows decentralized. This will be essential to making the conversion scalable, as the workload is too great for a central agency to carry out. Domain expertise is needed to determine how locally defined fields are used at the institution level and sometimes at the collection level. Though it required considerable manual effort, most headings for concepts and places found in CONTENTdm source metadata could be reconciled to matching entities described in other sources, including the Wikidata knowledge base, the VIAF authority file, FAST, and GeoNames. Not surprisingly, given the relative lack of notability of some of the represented people and organizations, those headings often could not be found in any of the external sources OCLC used for reconciliation, which led to manual data entry for a “placeholder” entity. Other than the initial field mapping review, pilot participants did not get a more in-depth “behind-the-scenes” view of the data processing workflows, which could have been offered as “office hour” homework or a workshop. In retrospect, that appears to be a missed opportunity.
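The reconcile-or-placeholder pattern described above can be sketched in a few lines. This is our illustration of the general pattern, not OCLC's pipeline: it queries Wikidata's public wbsearchentities API for a heading and, when nothing matches, mints a local placeholder identifier (the `local:` scheme is an invented stand-in) to be curated later.

```python
import uuid
import requests

def reconcile_heading(heading: str) -> dict:
    """Match a string heading to a Wikidata entity when possible; otherwise
    mint a local 'placeholder' entity. (Sketch only, not OCLC's workflow.)"""
    resp = requests.get(
        "https://www.wikidata.org/w/api.php",
        params={"action": "wbsearchentities", "search": heading,
                "language": "en", "type": "item", "format": "json"},
        timeout=30,
    )
    resp.raise_for_status()
    hits = resp.json().get("search", [])
    if hits:
        # Take the top candidate; a real workflow would review alternatives.
        return {"heading": heading, "uri": hits[0]["concepturi"],
                "placeholder": False}
    # No external match: record a placeholder to revisit with local expertise.
    return {"heading": heading, "uri": f"local:placeholder/{uuid.uuid4()}",
            "placeholder": True}

for h in ["Philadelphia Transportation Company", "Jasper Wood"]:
    print(reconcile_heading(h))
```

A production workflow would consult several sources (Wikidata, VIAF, FAST, GeoNames, as the report notes) and would surface low-confidence matches for human review rather than accepting the first hit, which is where the domain expertise discussed above becomes indispensable.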
CONTINUING THE JOURNEY TO LINKED DATA

Substantial resource commitments will be required to carry out these data transformations across all CONTENTdm institutions and collections, but the community does not need to wait for the transformation to linked data to be fully completed before it can see benefits. Data management and discovery benefits from this work apply in the current CONTENTdm environment, and downstream linked data transformation efficiencies accrue as metadata makes greater use of shared vocabularies and persistent identifiers. For the transition to linked data to be comprehensive and complete, a set of new CONTENTdm tools is called for that can be applied to transformation and reconciliation workflows in a decentralized way, along with fundamental changes to the centralized CONTENTdm system. A paradigm shift of this scale will necessarily take time to carry out and calls for long-term strategies and planning. Several of the prototype applications developed during the pilot point the way to advantageous additions to the CONTENTdm toolkit. In particular, the Image Annotator encourages domain experts to enrich material descriptions, and the Field Analyzer helps CONTENTdm users make sense of the variations in field definitions and uses across their collections (a prerequisite for more holistic data rationalization and transformation). The project participants encouraged OCLC to pursue these and other improvements as part of CONTENTdm’s evolution into a linked data platform.

WORKING PARTNERSHIPS REPRESENT STRENGTH IN NUMBERS

The value of library participants as partners in this project cannot be overstated. As colleagues and thought partners in the work, participants connected with project staff in regularly scheduled office hours throughout the project. Through these meetings and regular communications, project participants shared their thoughts on topics ranging from philosophical approaches and concepts to technical details, and they provided ongoing feedback that steered the project work toward tools and applications of greatest practical value for library staff and researchers. Recognizing the critical insights contributed by the project partners confirms the importance of involving library staff in this manner in similar technical research projects.

NOTES

1. OCLC’s CONTENTdm digital content management service overview: https://www.oclc.org/en/contentdm.html.
2. An overview of OCLC’s history of linked data research projects: https://www.oclc.org/research/areas/data-science/linkeddata/linked-data-outputs.html.
3. W3C. “Linked Data.” https://www.w3.org/wiki/LinkedData.
4. An overview of OCLC’s linked data pilot Project Passage: https://www.oclc.org/research/areas/data-science/linkeddata/linked-data-prototype.html. See also: Godby, Jean, Karen Smith-Yoshimura, Bruce Washburn, Kalan Davis, Karen Detling, Christine Fernsebner Eslao, Steven Folsom, Xiaoli Li, Marc McGee, Karen Miller, Honor Moody, Holly Tomren, and Craig Thomas. 2019. Creating Library Linked Data with Wikibase: Lessons Learned from Project Passage. Dublin, OH: OCLC Research. https://doi.org/10.25333/faq3-ax08.
5. The Wikibase environment includes several components: the MediaWiki platform: https://www.mediawiki.org/wiki/MediaWiki; MediaWiki. “Wikibase:Overview”—MediaWiki extension for managing structured data. Updated 29 December 2020, at 19:51. https://www.mediawiki.org/wiki/Wikibase; Wikipedia. “Triplestore: [...] a purpose-built database for the storage and retrieval of triples through semantic queries.” Updated 12 November 2020, at 18:12 (UTC). https://en.wikipedia.org/wiki/Triplestore; Wikipedia. “SPARQL” (query service for reading data from the triplestore). Updated 3 January 2021, at 14:42 (UTC). https://en.wikipedia.org/wiki/SPARQL.
6. CONTENTdm Linked Data Planned Project Phases diagram: https://researchworks.oclc.org/cdmld/screenshots/phase-diagram.png.
7. OCLC. 2020. “Guide to the CONTENTdm Catcher.” Updated 7 August 2020. https://help.oclc.org/Metadata_Services/CONTENTdm/CONTENTdm_Catcher/Guide_to_the_CONTENTdm_Catcher.
8. Wikipedia: https://www.wikipedia.org/.
9. Wikidata: The free knowledge base. Updated 30 December 2019, at 04:00. https://www.wikidata.org/wiki/Wikidata:Main_Page.
10. SPARQL [linked data] query language for RDF. W3C Recommendation 15 January 2008. https://www.w3.org/TR/rdf-sparql-query/.
11. Wikibase system architecture diagram: https://researchworks.oclc.org/cdmld/screenshots/wikibase-system-architecture.png.
12. CONTENTdm Class data model visualization: https://researchworks.oclc.org/cdmld/screenshots/class-ontology.png.
13. Background on the Dublin Core Metadata Initiative and the Dublin Core element set: https://dublincore.org.
14. The Dublin Core Metadata Initiative DCMI Type Vocabulary: https://www.dublincore.org/specifications/dublin-core/dcmi-type-vocabulary/.
15. Linked Art project: https://linked.art/.
16. Example use of type, classification, and process or format properties in the description of a postcard: https://researchworks.oclc.org/cdmld/screenshots/entity-Q73226.png.
17. A depicts statement for the concept of “Dogs”: https://researchworks.oclc.org/cdmld/screenshots/entity-Q147731.png.
18. A type statement of “dog” for a specific dog: https://researchworks.oclc.org/cdmld/screenshots/entity-Q142481.png.
19. The RDF linked data modeling vocabulary “RDF Schema”: https://www.w3.org/TR/rdf-schema/.
20. The class “dog” is defined by the concept “Dogs”: https://researchworks.oclc.org/cdmld/screenshots/entity-Q73829.png.
21. Wikibase templates for proposing new properties: https://researchworks.oclc.org/cdmld/screenshots/cdm-property-proposal.png; https://researchworks.oclc.org/cdmld/screenshots/cdm-property-proposal-is-defined-by.png.
22. Unmapped CONTENTdm metadata displayed in the Wikibase user interface with a Gadget extension: https://researchworks.oclc.org/cdmld/screenshots/entity-Q143578.png.
23. Collections evaluated for the pilot project:
• Cleveland Public Library:
º Cleveland Picture Collection: https://cplorg.contentdm.oclc.org/digital/collection/p4014coll18/search/searchterm/cleveland%20picture%20collection/field/collec/mode/exact/conn/and/order/sortda/ad/asc;
º Jasper Wood photos of Cleveland: https://cdm16014.contentdm.oclc.org/digital/collection/p4014coll18/search/searchterm/jasper+wood/field/creato/mode/all/conn/and;
º John G. White Collection of Chess and Checkers, Chess Player Portraits Collection: https://cdm16014.contentdm.oclc.org/digital/collection/p4014coll20/search/searchterm/Chess Portraits.
• The Huntington Library, Art Museum, and Botanical Gardens:
º Edwin Hubble Papers: https://cdm16003.contentdm.oclc.org/digital/collection/p15150coll2/search/searchterm/Edwin%20Hubble%20Papers/field/physic/mode/exact/conn/and;
º Palmer Conner Collection of Color Slides of Los Angeles, 1950 - 1970: https://hdl.huntington.org/digital/collection/p15150coll2/search/searchterm/Palmer+Conner+Collection+of+Color+Slides+of+Los+Angeles%2C+1950+-+1970/field/physic/mode/all/conn/and/order/nosort;
º Photographs of the California Missions by William Henry Jackson: https://hdl.huntington.org/digital/collection/p15150coll2/search/searchterm/Photographs%20of%20the%20California%20Missions%20by%20William%20Henry%20Jackson/field/physic/mode/exact/conn/and;
º Verner Collection of Panoramic Negatives: https://hdl.huntington.org/digital/collection/p15150coll2/search/searchterm/Verner+Collection+of+Panoramic+Negatives/field/physic/mode/all/conn/and/order/title.
• Minnesota Digital Library:
º American Swedish Institute: https://reflections.mndigital.org/?f%5Bcollection_name_ssi%5D%5B%5D=American+Swedish+Institute;
º Becker County Historical Society: https://reflections.mndigital.org/?f%5Bcollection_name_ssi%5D%5B%5D=Becker+County+Historical+Society;
º Kanabec County Historical Society: https://reflections.mndigital.org/?f%5Bcollection_name_ssi%5D%5B%5D=Kanabec+County+Historical+Society.
• Temple University:
º John W. Mosley Photograph Collection: https://digital.library.temple.edu/digital/collection/p15037coll17;
º Temple History in Photographs. Templana Event Album Collection: https://cdm16002.contentdm.oclc.org/digital/collection/p245801coll0/search/searchterm/Templana%20Event%20Album%20Collection/field/reposa/mode/exact/conn/and;
º Temple History in Photographs. Templana Photograph Collection: https://cdm16002.contentdm.oclc.org/digital/collection/p245801coll0/search/searchterm/Templana%20Photograph%20Collection/field/reposa/mode/exact/conn/and;
º Temple History in Photographs. Temple Times Photographs: https://cdm16002.contentdm.oclc.org/digital/collection/p245801coll0/search/searchterm/Temple%20Times%20Photographs/field/reposa/mode/exact/conn/and;
º Temple University Libraries. YWCA Philadelphia Branches Records: https://digital.library.temple.edu/digital/search/collection/p16002coll6!p15037coll19!p15037coll14!p16002coll2/searchterm/YWCA%20Philadelphia%20Branches%20Records/field/digitb/mode/exact/conn/and.
• University of Miami:
º Cuban Map Collection: https://merrick.library.miami.edu/cubanHeritage/chc0468/;
º Latin American and Caribbean Photograph Collection: https://merrick.library.miami.edu/cdm/search/collection/asm0304;
º Rosenstiel School of Marine & Atmospheric Science Photograph Collection: https://merrick.library.miami.edu/rsmas/rsmasphotos/.
24. Wikibase Discussion page for a collection review: https://researchworks.oclc.org/cdmld/screenshots/cdm-item-talk-Q148309.png.
25. The OpenRefine software for cleaning up, analyzing, and reconciling metadata: https://openrefine.org/.
26. CONTENTdm collection metadata viewed in OpenRefine: https://researchworks.oclc.org/cdmld/screenshots/openrefine-project.png.
27. IIIF International Image Interoperability Framework website: https://iiif.io/.
28. A triplestore is a database for managing linked data “triples,” each a combination of a subject, predicate, and object: https://en.wikipedia.org/wiki/Triplestore.
29. Wikidata OpenRefine reconciliation endpoint software. See Delpeuch, Antonin. (2017) 2020. “Wetneb/Openrefine-Wikibase.” Python. https://github.com/wetneb/openrefine-wikibase.
30. OCLC’s FAST (Faceted Application of Subject Terminology) system: https://www.oclc.org/research/areas/data-science/fast.html.
31. VIAF OpenRefine reconciliation endpoint service: http://iphylo.org/~rpage/phyloinformatics/services/reconciliation_viaf.php.
32. The GeoNames service for geographic data: https://www.geonames.org/.
33. The Python scripting language: https://www.python.org/.
34. “Help:QuickStatements - Wikidata.” Edited on 4 January 2021, at 10:41. https://www.wikidata.org/wiki/Help:QuickStatements.
35. Pywikibot Python library overview. See “Manual:Pywikibot/Overview - MediaWiki.” n.d. Accessed 7 January 2021. https://www.mediawiki.org/wiki/Manual:Pywikibot/Overview.
36. OCLC DevConnect Online 2020 presentation on the alternative OpenRefine reconciliation endpoint software developed during the pilot project. See Mixter, Jeff, and Bruce Washburn. 2020.
“Building an OpenRefine Reconciliation Endpoint for a Wikibase Project: Lessons Learned.” Produced by OCLC, 20 May 2020. MP4 video presentation, 58:01. https://www.oclc.org/en/events/2020/devconnect-online-2020/devconnect-2020-creating-linked-descriptive-data-for-contentdm.html.
37. A “placeholder” entity for a person without an established identity: https://researchworks.oclc.org/cdmld/screenshots/entity-Q144548.png.
38. An example CONTENTdm compound object for a photograph album. See University of Miami Libraries. “Album Documenting a Sea Journey to Trinidad, Venezuela, and Grenada.” Latin American and Caribbean Photograph Collection. Digital Collections. Accessed 7 January 2021. https://cdm17191.contentdm.oclc.org/digital/collection/asm0304/id/1311.
39. Example “has creative work part” statements for the parts of an album: https://researchworks.oclc.org/cdmld/screenshots/entity-Q73586.png.
40. RDF Resource Description Framework standard for linked data. See W3C Semantic Web. “RDF: Resource Description Framework.” Updated 15 March 2014, at 21:35. https://www.w3.org/RDF/.
41. RDF Turtle textual syntax. See Beckett, David, Tim Berners-Lee, Eric Prud’hommeaux, and Gavin Carothers. 2014. “RDF - Semantic Web Standards.” https://www.w3.org/TR/turtle/.
42. RDF N-Triples plain text syntax. See W3C Semantic Web. 2014. “RDF 1.1 N-Triples: A Line-based Syntax for an RDF Graph.” https://www.w3.org/TR/n-triples/.
43. RDF JSON-LD format for linked data. See Sporny, Manu, Dave Longley, Gregg Kellogg, Markus Lanthaler, Pierre-Antoine Champin, and Niklas Lindström. 2020. “JSON-LD 1.1: A JSON-based Serialization for Linked Data.” W3C Editor’s Draft. Edited by Gregg Kellogg, Pierre-Antoine Champin, and Dave Longley. Posted 14 November 2020. https://w3c.github.io/json-ld-syntax/.
44. JSON (JavaScript Object Notation) data format. See Wikipedia. “JSON.” Updated 31 December 2020, at 22:32 (UTC). https://en.wikipedia.org/wiki/JSON.
45. The PHP Group. “Object Serialization: Serializing Objects - Objects in Sessions.” PHP Manual. Accessed 7 January 2021. https://www.php.net/manual/en/language.oop5.serialization.php.
46. DPLA Metadata Application Profile documentation: https://pro.dp.la/hubs/metadata-application-profile.
47. Schema.org metadata schema documentation. See “Organization of Schemas.” 2021. https://schema.org/docs/schemas.html.
48. W3C Semantic Web.
“Web Ontology Language (OWL).” Updated 11 December 2013, at 11:38. https://www.w3.org/OWL/.
49. Kellogg, Gregg (ed). 2020. “JSON-LD Best Practices: W3C Editor’s Draft 20 February 2020.” W3C (MIT, ERCIM, Keio, Beihang). https://w3c.github.io/json-ld-bp/.
50. Appleby, Michael, Tom Crane, Robert Sanderson, Jon Stroop, and Simeon Warner. 2018. “JSON-LD Design Patterns.” Chap. 3 in IIIF Design Patterns. International Image Interoperability Framework Consortium. https://iiif.io/api/annex/notes/design_patterns/#json-ld-design-patterns.
51. Other names associated with the Los Angeles Dodgers entity: https://researchworks.oclc.org/cdmld/screenshots/entity-Q166325.png.
52. First parts of the description of Jasper Wood: https://researchworks.oclc.org/cdmld/screenshots/entity-Q147700.png.
53. SPARQL Query map visualization of places depicted in works from a collection: https://researchworks.oclc.org/cdmld/screenshots/sparql-visualization.png.
54. Wikibase Gadgets extension documentation. See MediaWiki. “Extension:Gadgets.” Updated 16 October 2020, at 11:36. https://www.mediawiki.org/wiki/Extension:Gadgets.
55. Mirador IIIF-compatible image viewer project website: https://projectmirador.org/.
56. Mirador image viewer embedded in the Wikibase user interface: https://researchworks.oclc.org/cdmld/screenshots/entity-Q165895.png.
57. Contextual data and image from DBpedia and Wikimedia Commons embedded in the Wikibase user interface: https://researchworks.oclc.org/cdmld/screenshots/entity-Q71945.png.
58. Constraints quality assurance Wikibase mechanism documentation. See MediaWiki. “Extension:Wikibase Quality Extensions.” Archived 7 January 2019, at 13:45. https://www.mediawiki.org/wiki/Extension:Wikibase_Quality_Extensions.
59. A constraint violation indicating that the “occupation” property should only be used for instances of the type “person”: https://researchworks.oclc.org/cdmld/screenshots/entity-Q73246.png.
60. OCLC CONTENTdm Custom Pages with CSS and JavaScript documentation. Updated 28 June 2018. https://help.oclc.org/Metadata_Services/CONTENTdm/Advanced_website_customization/Custom_pages/Custom_pages_with_CSS_and_JavaScript.
61. OCLC CONTENTdm Advanced Website Customization Cookbook website: https://help.oclc.org/Metadata_Services/CONTENTdm/Advanced_website_customization/Customization_cookbook.
62. Google’s Structured Data Testing Tool: https://search.google.com/structured-data/testing-tool. [Google has announced that this tool is being discontinued.]
63. CONTENTdm Schema.org data evaluated using the Google Structured Data Testing Tool: https://researchworks.oclc.org/cdmld/screenshots/google-structured-data-testing-tool.png.
64. Additional contextual information displayed in CONTENTdm based on entity descriptions in the pilot Wikibase: https://researchworks.oclc.org/cdmld/screenshots/cdm15725-p16003coll7-14.png.
65. Image Annotator initial view with subjects: https://researchworks.oclc.org/cdmld/screenshots/image-annotator-1.png.
66. Image Annotator cropped image of a person: https://researchworks.oclc.org/cdmld/screenshots/image-annotator-2.png.
67. Image Annotator after adding more depicted subjects: https://researchworks.oclc.org/cdmld/screenshots/image-annotator-3.png.
68. Wikibase item updated with depicted subjects and associated cropped images: https://researchworks.oclc.org/cdmld/screenshots/entity-Q148552.png.
69. Nielsen, Jakob. 2012. “Thinking Aloud: The #1 Usability Tool.” Nielsen Norman Group. Posted 15 January 2012. https://www.nngroup.com/articles/thinking-aloud-the-1-usability-tool/.
70. Retriever search results from Wikidata, VIAF, and FAST for “lake vermilion”: https://researchworks.oclc.org/cdmld/screenshots/retriever-1.png.
71. Retriever entity editor: https://researchworks.oclc.org/cdmld/screenshots/retriever-2.png.
72. Wikibase entity created by the Retriever: https://researchworks.oclc.org/cdmld/screenshots/entity-Q221424.png.
73. Knublauch, Holger, and Dimitris Kontokostas (eds). 2017. “Shapes Constraint Language (SHACL): W3C Recommendation 20 July 2017.” W3C. https://www.w3.org/TR/shacl/.
74. ShEx Shape Expressions Language W3C Recommendation. See Prud’hommeaux, Eric, Iovka Boneva, Jose Labra Gayo, and Gregg Kellogg. 2017. “Shape Expressions Language 2.0: Draft Community Group Report 27 March 2017.” W3C. http://shex.io/shex-semantics-20170327/.
75. Editing essential details for an entity in the Describer: https://researchworks.oclc.org/cdmld/screenshots/describer-1.png.
76. Explorer home page: https://researchworks.oclc.org/cdmld/screenshots/explorer-1.png.
77. Explorer Transportation Hub and related collections: https://researchworks.oclc.org/cdmld/screenshots/explorer-2.png.
78. Explorer search results for “strike”: https://researchworks.oclc.org/cdmld/screenshots/explorer-3.png.
79. Explorer view of a truck bringing workers home during a PTC walkout: https://researchworks.oclc.org/cdmld/screenshots/explorer-4.png.
80. Explorer view of a protest against the Philadelphia Transportation Company: https://researchworks.oclc.org/cdmld/screenshots/explorer-5.png.
81. Explorer view of an 1899 Cleveland transit strike in Public Square: https://researchworks.oclc.org/cdmld/screenshots/explorer-6.png.
82. Explorer view of streetcars parked on the street during a transit strike: https://researchworks.oclc.org/cdmld/screenshots/explorer-7.png.
83. Field Analyzer field usage chart: https://researchworks.oclc.org/cdmld/screenshots/field-analyzer-1.png.
84. Field Analyzer list of field values: https://researchworks.oclc.org/cdmld/screenshots/field-analyzer-2.png.
85. Minnesota Digital Library website. See University of Minnesota. “Minnesota Reflections.” https://reflections.mndigital.org/.
86. Digital Public Library of America website: https://dp.la/.
cohen-machine-2021 ---- Chapter 12 Machine Learning + Data Creation in a Community Partnership for Archival Research Jason Cohen, Berea College; Mario Nakazawa, Berea College

Introduction: Cultural Heritage and Archival Preservation in Eastern Kentucky

In this chapter, two researchers, Jason Cohen and Mario Nakazawa, describe the contexts for an archivally focused project that emerged from a partnership between the Pine Mountain Settlement School (PMSS; see http://pinemountainsettlementschool.com) in Harlan County, Kentucky, and scholars and students at Berea College. In this process, we have entered into a critical dialogue with our sources and knowledge production that Roopika Risam calls for in “self-reflexive” investigations in the digital humanities (2015, para. 16). Risam’s intervention, nevertheless, does not explicitly distinguish questions of class and the concomitant geographic constraints that often accompany the economic and social disadvantages of poverty (Ahmed et al. 2018). Our work demonstrates how class and geography are tied, even in digital archives, to the need for reflexive and diverse approaches to humanist materials. For instance, a recent invited contribution to Proceedings of the IEEE articulates a need for diversity in computing and technology without mentioning class or region as factors shaping these related issues of diversity (Stephan et al. 2012, 1752–5). Given these constraints, perhaps it is also pertinent to acknowledge that the machine learning application we describe in this chapter is itself not particularly novel in scope or method—we describe our data acquisition and preparation, and two parallel implementations of commercially available tools for facial recognition. What stands out as unique are the ethical and practical concerns tied to bringing unique archival materials out of their local contexts into a larger conversation about computer vision as a tool that helps liberate, and at the same time possibly endanger, a subaltern cultural heritage. In that light, we enter our archival investigation into what Bruno Latour has productively named “actor-network theory” (2007, 11–13) because, as we suggest below, our actions were highly conditioned not only by the physical and social spaces our research occupies and where its events occur, but also because the historical artifacts themselves act powerfully to shape our work in these contexts. Moreover, the partnership model of curation and archiving that we pursued in this project complicates the very concept of agency, because the actions forming the project emerged from a continuing dialogue rather than any one decision or hierarchy. As we suggest later, a distributed model for decisions (Sabharwal 2015, 52–5) also revealed the limitations of using a participatory and identity-based model for archival development and management. Indeed, those historical artifacts will exert influence on this network of relations long after any one of us involved in the current project has ceased to pursue them.
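The facial-recognition work mentioned above is detailed later in the chapter; purely as an illustration of the general pattern, and not of the project's actual pipeline or its commercial tools, matching faces across archival photographs to surface candidate identity links might look like the following sketch with the open-source face_recognition library (file names are hypothetical).

```python
import face_recognition

# Sketch only: encode faces found in two archival photographs and compare
# them. Each detected face yields a 128-dimension encoding; encodings that
# fall within the tolerance suggest the same person appears in both images.
photo_a = face_recognition.load_image_file("pmss_photo_1915.jpg")
photo_b = face_recognition.load_image_file("pmss_photo_1922.jpg")

encodings_a = face_recognition.face_encodings(photo_a)
encodings_b = face_recognition.face_encodings(photo_b)

for i, enc in enumerate(encodings_b):
    matches = face_recognition.compare_faces(encodings_a, enc, tolerance=0.6)
    if any(matches):
        print(f"Face {i} in the second photo may match a face in the first")
```

In a partnership model like the one described here, such automated matches would be candidates for review by people who know the community, not assertions of identity.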
When we came to this project, we asked a version of a classic question that has arisen in a variety of forms beginning with very early efforts by Bell Laboratories, among others, to translate data structures to suit the often flexible needs of humanist data: “what aspects of life are formalizable?” (Weizenbaum 1976, 12). We discovered that while an ontology may represent a formalized relationship of an archive to a database or finding aid, it also raises questions about the ethical implications of what information and embedded relationships can be adequately formalized by an abstract schema.

The Promises and Realities of Technology After Coal in Eastern Kentucky

Despite the longstanding threat of having to adapt to a post-coal economy, Harlan County, Kentucky continues to rely on coal and the mountains from which that coal is extracted as two of the cornerstones that shape the identity of the territory as well as the people who call it home. The mountains of Eastern Kentucky, like much of Appalachia, are by turns beautiful and devastated, and both authors of this essay have found conversations with Eastern Kentucky’s citizens about the role the mountains play and the traditions that emerge from them both insightful and, at times, heartbreaking. This dramatic landscape, with its drastic challenges, may not sound like a place likely to find uses for machine learning. You would not be alone in your assumption. Standing far from urban centers of technology and mobility, Eastern Kentucky combines deeply structural problems of generational poverty with a hard-won understanding that, since the moment of the region’s colonization, outsiders have taken resources and made uninformed decisions about what the region needs, or where it should turn in order to gain a better purchase on the narrative of American progress, self-improvement, and the unavoidable allures of development-driven capitalism. Suspicion of outsiders is endemic here. And unfortunately, economic and social conditions, such as the high workplace injury rates associated with mining and extraction-related industries, the effects of the pharmaceutical industry’s abuse of prescription opioids to treat a wide array of medical pain symptoms without treating the underlying causal conditions, and the systematic dismantling of federal- and state-level social support programs, have become increasingly acute concerns today. But this trajectory is not new: when President Lyndon B. Johnson announced the beginning of the War on Poverty in 1964, he landed an hour away in Martin County and subsequently drove through Harlan on a regional tour to inaugurate the initiative. Successive generations have sought to leave a mark, and all the while, the residents have been collecting their own local histories of their place. Our project, centered on recovering a latent social network of historical families represented by the images held in one local archive, mobilizes this tension between insiders’ persistence and outsiders’ interventions to think about how, as Bruno Latour puts it, we can “reassemble the social” while still respecting the local (2007, 191–2). PMSS occupies a unique position in this social and physical landscape: both local in its emplacement and attention, and a site of philanthropic work that attracted outside money as well as human and cultural capital, PMSS is at once of Harlan County and beyond it.
As we suggest in the later sections of this essay, PMSS’s position, at once within local boundaries and straddling regional ones, complicates the network we identified. More than that, however, its split position complicates the relationships of power and filiation embedded in its historical social network.

While an economy centered on coal continues to define the Eastern Kentucky regional identity, a second history can be told about this place and its people, one centered on resilience, independence, simplicity, and beauty, both of the land and its people. This second history has made outsiders’ recent appeals for the region to court technology as a potential solution for what comes “after coal” particularly attractive to a region that prides itself on its capacity to sustain, outlast, and overcome obstacles. While that techno-utopian vision offers another version of the self-aggrandizing Silicon Valley bootstraps success story J.D. Vance narrates in Hillbilly Elegy (2016), like Vance’s story itself, those narratives most often get told by outsiders to outsiders, using regional stereotypes as the grounds for a sales pitch. In reality, however, those efforts have largely proven difficult to sustain and, at times, have become the sources of potentially explosive accusations of fraud and malfeasance. Recently, for instance, organizations including Mined Minds (see http://www.minedminds.org/) have been accused by residents aiming to prepare for a post-coal economy of misleading students, at least, and of fraud at worst. As with the timber, coal, and gas extraction industries that preceded these software development firms’ aspirations, the promises of technology have not been kind to Eastern Kentucky; in particular, as with the extraction industries that preceded it, the technological-industrial complex making its pitch in Kentucky’s mountains has not returned resources to the region’s residents whom the work was intended, at least nominally, to support (Hochschild 2018; Campbell 2019; Bailey 2017).

In this context of technology, culture, and the often controversial position machine learning occupies in generating obscure metrics for classifiers that may embed bias, our project aims to activate the School’s archival holdings and to bring critical awareness to the question of how to actively engage with a paper archive of a local place as we venture further into our pervasively digital moment. The School operates today as a regional cultural heritage institution; it opened in 1913 as a residential school and operated as an educational institution until 1974, at which point it transformed itself into an environmental and cultural outreach institution focused on developing its local community and maintaining the richness of the region’s cultural resources and heritage. Every year since 1974, PMSS has brought hundreds of students and citizens onto its campus to learn about nature and the landscape, traditional crafts and artistic practices, and musical and dance forms, among many other programs. Similarly, it has created a space for locals to come together for social events, community celebrations, and festival days, and at the same time, it has become a destination for national-level events that create community from shared interests including foodways, wildflowers, traditional dance forms, and other wide-ranging attractions.
Project Background: Preserving Cultural Heritage in Harlan County

The archives of the Pine Mountain Settlement School emerge from its shifting history. The majority of its papers relate to its time as a traditional institution of education, including student records (which continue to be restricted for several reasons, including FERPA constraints and personal and community interests in privacy), minutes of its board meetings (again, partially restricted), and financial and narrative accounts of its many activities across each year. The school's records are unique because they provide a snapshot, year by year and month by month, of the region's interests and challenges during key years of the twentieth century, spanning the First World War to Vietnam. In addition, they detail the relations the School maintained with a philanthropic base of donors who helped to support and shape it, and, beyond its local relations, they place it into contact with a larger set of cultural interactions than a boarding school that relied on tuition or other profit-driven means to sustain its operations would have encountered. While the archival holdings continued to be informally developed by its directors and staff, who kept the official papers organized roughly by year, the archive itself sat largely neglected after 1974. Beginning around the turn of the millennium, a volunteer archivist named Helen Wykle began digitizing items one by one, and soon hosted a curated selection of those digital surrogates along with interpretive and descriptive narration on a WordPress installation, The Pine Mountain Settlement School Collections.[3] The PMSS Collections WordPress site has been continuously running and frequently updated by Wykle and the volunteer community members she has organized since 1999.[4] Together with her collaborators and volunteers, Wykle has grown the WordPress site to over 2,200 pages, including over 30,000 embedded images that include photographs and newspapers; scanned memos, meeting minutes, and other textual material (in JPG and PDF formats); HTML transcriptions and bibliographies hard-coded into the pages; scanned images of 3-D collections objects like textile looms or wood carving tools; partially scanned runs of serial publications; and other composite visual material. None of those objects was hosted within a regular and complete metadata hierarchy or ontology: no regular scheme of fields or file-naming convention was followed, no controlled vocabulary was maintained, no object types were defined, no specific fields were required prior to posting, and, perhaps unsurprisingly as a result, the search and retrieval functions of the site had deteriorated noticeably.

In 2016, Jason Cohen approached PMSS with the idea of using its archives as the basis for curricular development at Berea College.[5]

[3] See https://pinemountainsettlement.net/.
[4] Jason Cohen and Mario Nakazawa wish to extend a note of appreciation to Helen Hays Wykle, Geoff Marietta, the former director of PMSS, and Preston Jones, its current director, for welcoming us and enabling us to access the physical archives at PMSS from 2016-20.
[5] Jason Cohen would like to recognize the support this project received from the National Endowment for the Humanities' "Humanities Connections" grant. See grant number AK-255299-17, description online at https://securegrants.neh.gov/publicquery/main.aspx?f=1&gn=AK-255299-17.
Working in collaboration beginning in 2017, Mario Nakazawa and Cohen developed two courses in digital and computational humanities, led a team-directed study in augmented reality in coordination with Pine Mountain, contributed materials and methods for a new course in Appalachian Studies, and promoted the use of PMSS archival materials in several other extant courses in history and art history, among others. These new college courses each make use of PMSS historical documents as a shared core of visual and textual material in a digital and computational humanities concentration that clusters around critical archival and textual studies.[6]

The success of that initial collaboration and course development seeded the potential in 2019-2021 for a Whiting Public Engagement[7] fellowship focused on developing middle and high school curricula for use in Kentucky public schools with PMSS archival materials. That Whiting-funded project has generated over 80 lessons keyed to Kentucky state standards; these lessons are currently in use at nine schools across eight school districts, and each school is using PMSS materials to highlight its own regional and local interests. The work we have done with these archives has thus far reached the classrooms of at least eleven different middle and high school teachers and, as a result, touched over 450 students in eastern and central Kentucky public schools.

We mention these numbers in order to demonstrate that our collaboration has been neither shallow nor fleeting. We have come to know these archives quite well, and because they are not adequately cataloged, the only way to get to know them is to spend time reading through the materials one page at a time. An ancillary consequence of this durable collaboration and partnership across the public-academic divide is the shared recognition, early in 2019, that the PMSS archival database and its underlying data structure (a flat SQL database generated by the WordPress interface) would provide inadequate stability for records management and quality control in future development. In addition, we discovered that the interpretive materials and metadata associated with the WordPress installation were also insufficient for linked metadata across the objects in this expanding digital archive, for reasons discussed below.

As partners, we decided together to migrate to a ContentDM instance hosted by the Kentucky Virtual Library,[8] a consortium to which Berea College belongs, and which is open to future membership from PMSS. That decision led a team of Berea College undergraduate and faculty researchers to scrape the data from the PMSS archive site and supplement the images and transcriptions it contains with available textual metadata drawn from the site.[9] Alongside the WordPress instance as our reference, we were also granted access to a Dropbox account that hosted higher-resolution versions of the images featured on the blog. The scraper pulled 19,228 unique images (and located over 11,000 duplicate images in the process), 732 document transcriptions for scanned texts on the site, and 380 subject and person bibliographies, including Library of Congress Subject Headings that had been hard-coded into the site's HTML. We also extracted the unique object identifiers and labels associated with each image, which in WordPress are not associated with the image objects themselves.

[6] In the original version of the collaboration, we had planned also to teach basic computer programming to high school students during a summer program that would have used that same set of materials, but with the paired departures of the original co-PI as well as the former director, that plan has thus far remained unfulfilled.
[7] See https://www.whiting.org/content/jason-cohen.
[8] See https://kdl.kyvl.org/.
[9] Jason Cohen wishes to thank Mario Nakazawa, Bethanie Williams, and Tradd Schmidt for undertaking this project with him. The GitHub repo for the PMSS scraper is hosted at https://github.com/Tradd-Schmidt/PMSS_Scraper.
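As a rough illustration of the scraping pass just described, the sketch below pulls image URLs, hard-coded subject headings, and transcription text from a single page. The page URL and CSS selectors are hypothetical placeholders; the actual scraper is in the GitHub repository cited in note 9.

    import requests
    from bs4 import BeautifulSoup

    # Fetch one archive page; the real scraper walked the site's full page list.
    page = requests.get("https://pinemountainsettlement.net/?page_id=1234")  # hypothetical URL
    soup = BeautifulSoup(page.text, "html.parser")

    # Embedded image URLs, later de-duplicated (over 11,000 duplicates surfaced).
    image_urls = [img["src"] for img in soup.find_all("img") if img.get("src")]

    # Subject headings and transcriptions were hard-coded into the page HTML;
    # the selectors below are invented for this sketch.
    subject_headings = [a.get_text(strip=True) for a in soup.select("a.lcsh-heading")]
    transcription = "\n".join(p.get_text(strip=True)
                              for p in soup.select("div.entry-content p"))

Pairing output of this kind with the higher-resolution Dropbox copies is what produced the skeleton described next.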
We used that data to populate the ContentDM instance and produced a sparse but stable skeleton for future archival development. In the process, we also learned a great deal about how a future implementation of a controlled vocabulary, an image acquisition and processing pipeline, and object documentation standards should work in the next stages of our collaborative PMSS archival development.

As we developed and refined this new point of entry to the digital archives using the ContentDM hosting and framework, some of the ethical issues surrounding this local archive came more clearly into focus. A parallel set of questions arose in response, in the first instance, to J.D. Vance's work and, in the second, to outsiders' claims for technological solutions to the deterioration of local and cultural heritage. Because we were creating virtual archival surrogates for materials housed at Pine Mountain, for instance, questions arose from the PMSS board members related to privacy and the use of historical materials. Further, the board was concerned that even historical materials could bear on families present in the community today. We found that while profession-wide responses to archival constraints are shaped predominantly by discussions of copyright and fair use, issues of personal privacy are often left tacit. This gap between legal use and public interests in privacy reveals how tasks executed using techniques in machine learning may impinge upon ethical constraints of public trust and civic obligation.[10]

Similarly, as the ownership of historical images suddenly extended to include present-day community members, and as these questions of access and serving a local public were inextricably bound up with interactions with members of that shared public whose family names and faces appear in the images we were making available, we began to consider the ways in which our archival work was tied to what Ryan Calo calls the "historical validation" of primary source materials (2017, 424-5). When an AI system recognizes an object, Calo remarks, that object is validated. But how should one handle the lack of a specific vocabulary within a given training set? One answer, of course, would be to train a new set, but that response is becoming increasingly prohibitive for smaller cultural heritage projects like ours: the time and computational power required to execute the training is non-negligible. In addition, training resources (such as data sets, algorithms, and platforms) are increasingly becoming monetized, and we do not have the margins to buy access to new data for training.
As a consequence, questions stemming from how one labels material in a controlled vocabulary were also at issue. We encountered a failure in historical validation when, for instance, our AI system labeled a "spinning wheel" as a wheel but did not detect its historical relationship to weaving and textiles. That validation was further obscured when the system also failed to categorize a second form of "spinning wheel," which refers locally to a home-made merry-go-round.[11] In other words, not only did the system flatten a spinning wheel into a generic wheel, it also missed the regional homology between textile production and play, a cultural crux that reveals how this place envisions an intersection between work and recreation. By breaking the associations between two forms of "spinning wheel," our system erased a small but significant site of cultural inheritance. How, we asked, should one handle such instances of effacement? At one level, one would expect an archival system to be able to identify the primitive machine for spinning wool, flax, or other raw materials into usable thread for textiles, but what about the merry-go-round? And what should one do when a system neglects both of these meanings and reduces the object to the same status as a wheel on a tractor, car, or carriage?

Similarly, when competing naming conventions arose for landmarks, we were careful to consider which name should be granted priority as the default designation, and we asked how one should designate a local or historical name, whether for a road, waterway, knob, or other feature, in relationship to a more widely accepted nomenclature such as state route designations or standardized toponyms. As we attempted to address the challenge of multiple naming conventions, we encountered some of the same challenges that archivists find in dealing with indigenous peoples and their textual, material, and physical artifacts.[12] Following an example derived from the Passamaquoddy people, we implemented a small set of "traditional knowledge labels"[13] to describe several forms of information, including (a) restrictions on images that should not be shown to strangers (to protect family privacy), (b) places that should remain undisclosed (for instance, wild ginseng, ramp, orchid, or morel mushroom patches), and (c) educational materials focused on "how it was done" as related to local skills and crafts that have more modern implementations, but for which the traditional practices have remained meaningful. This included cases such as Maypole dancing and festivals, which remain endowed with ritual significance. In the final analysis, neither the framework supplied by copyright and fair use nor the one supplied by data validation proved singularly adequate to our purposes, but they did provide guidelines from which our facial recognition project could proceed, as we discuss below.

[10] The professional conversation in archive and collections management has not been as rich as the one emerging in AI contexts more broadly. For a recent discussion of the conflict in the roles of public trust and civic service that emerge from the context of the powers artificial intelligence holds for image recognition in policing applications, see Elizabeth Joh, "Artificial Intelligence and Policing: First Questions," Seattle University Law Review 41: 1139-44.
[11] See "Spinning Wheel" in Cassidy 1985-2012.
[12] One well-documented digital approach to handling indigenous archival materials is the Mukurtu platform for indigenous cultural heritage: https://mukurtu.org/.
[13] For the original traditional knowledge labels, see https://passamaquoddypeople.com/passamaquoddy-traditional-knowledge-labels.
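To make the idea of such labels concrete, a record carrying them might look like the following sketch. The field names and label strings here are our invention for illustration, not PMSS's actual schema.

    # Hypothetical archival record carrying traditional knowledge (TK) labels.
    # Downstream display logic would check these labels before rendering.
    record = {
        "identifier": "pmss_photo_0001",  # hypothetical identifier
        "title": "Families at a community festival",
        "tk_labels": [
            "TK: Family Privacy (do not display to strangers)",
            "TK: Undisclosed Location (protect gathering sites)",
            "TK: Traditional Practice (educational 'how it was done' use)",
        ],
    }

    def may_display_publicly(rec: dict) -> bool:
        """Return False when any TK label restricts open display."""
        restricted = ("Family Privacy", "Undisclosed Location")
        return not any(flag in label
                       for label in rec["tk_labels"]
                       for flag in restricted)

    print(may_display_publicly(record))  # False: this record stays restricted

The point of the design is that restrictions travel with the object's metadata, so any downstream display or machine-learning step can honor them.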
Machine Learning in a Local Archive

These preliminary discussions of ethics and convention may seem unrelated to the focus this collection adopts toward machine learning and artificial intelligence in the archive. However, as we have begun to suggest, the data migration to ContentDM opened the door to machine learning for this project, and those initial steps framed the pitfalls that we continue to navigate as we move forward. As we suggested at the outset, the technical machine-learning task that we set for ourselves is not cutting-edge research so much as an application of existing technologies to a new aspect of archival investigation. We proposed (and succeeded with) an application of commercial facial recognition software to identify the persons in historic photographs in the PMSS archives. We subsequently proposed, and are currently working, to identify the photographs sharing common but unnamed faces and, in coordination with photographs of known people, to re-create the social network of this historic institution across slices of its history. We describe the next steps briefly below, but let us tarry for a moment with the question of how the ethical concerns we navigated up to this point also influenced our approach to facial recognition.

The first of those concerns has to do with commercial and public access to archival materials that, as we suggested above, include materials that are designated as restricted use in some way. We demonstrated to the local members at Pine Mountain how our use case and its constraints for digital archives fit with the current standards for the fair use of copyrighted materials based on the "substantive transformation" of reproduced objects (Levendowski 2018, 622-9). Since we are not making available large bodies of materials still protected by copyright, and since our use of select materials shifts the context within which they are presented, we were able to negotiate with PMSS to allow us to design a system for facial recognition using the ContentDM instance as our image source. What that negotiation did not consider, however, is when fair use does not provide a sufficiently high standard of control for the institution involved in the application of algorithms to institutional memory or its technological dependencies.

First, to test the facial recognition processes, we reached back to the most primitive and local version of facial recognition software that we could find: Google's retired platform, the Picasa Web Albums API, which was retired in May 2016 and fully deprecated as of March 2018 (Sabharwal 2016). We chose Picasa because it is a self-contained software application that operates using a locally hosted script and locally hosted images. Given its deprecated status and its location on a local machine, we were confident that no cloud services would be ingesting the images we fed into the system for our trial.
This meant that we could test small data examples without fear of having to upload an entire corpus of material that could subsequently be incorporated into commercial facial recognition engines or pop up unexpectedly in search results. We thus began by upholding a high threshold for privacy and insisting on finding ways for PMSS to maintain control over these images within the grasp of its local directories.

The Picasa system produced surprisingly good results within the scope we allowed it. It was highly successful at matching the small group of known faces we supplied as test materials. While it would be difficult to supply a numerical match rate, first because of this limited test set and second because we have not expanded the test to a broad sample using another platform, we were anecdotally surprised at how robust Picasa's matching was in practice. For instance, Picasa matched the images of a single person's face, Celia Cathcart, from pictures of her as a teenager to images of her as a grandmother. It recognized Cathcart in a group of basketball players, and it also identified her face from side-view and off-center angles, as in a photograph of her looking down at her newborn child. The most immediate limitation of Picasa lies in its tagging, which required manual entry of every name and did not allow any automation.

Following the success of that hand-tagging and cross-image identification process, we discussed with our partners whether the next step, using Amazon Web Services' computer vision and facial recognition platform, Rekognition, would be acceptable. They agreed, and we ran the images through the AWS application, testing our results against samples pulled from our Picasa run to verify the results. Perhaps unsurprisingly, AWS Rekognition fared even better with those test cases. Using a single photograph, the AWS application identified all of the Picasa matches as well as three new images that had not previously been tagged with Cathcart's name. The same pattern held for other images in our sample group: Katherine Pettit was positively identified across more likenesses than had been previously tagged, and Alice Cobb was also positively tracked across images. This positive attribution also reveals a limitation of the metadata: while these three women we have named are important historical figures at PMSS, and while they are widely acknowledged in the archive and well represented in the photographic record, not all of the photographs have been well tagged or fully documented in the archive. The newly tagged images that we found would enrich the metadata available to the archive not because these images include surprising faces but because the tagging has been inconsistent and, over time, previously known faces have become less easy to discern.

As in other recent discussions of private materials disclosed within systems trained for matching and similarity, we found that the ethics of using private materials for this non-private purpose provoked strong reactions. While some of the reaction was positive, with community members happy to have more images of the School's founding director, Katherine Pettit, identified, those same community members were not comfortable with our role as researchers identifying people in the photographs in their community's archive, unsupervised. They wanted instead to verify each positive identification, a point that we agreed with, but which also hindered the process of moving through 19,000 images.
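For readers curious about the mechanics, the heart of such a pipeline is a face-comparison call like the minimal boto3 sketch below. The bucket and file names are hypothetical, and this is our illustration rather than the project's actual code.

    import boto3

    rekognition = boto3.client("rekognition", region_name="us-east-1")

    # Compare a reference portrait against a candidate archival photograph.
    # Images may be passed as S3 objects (shown here) or as raw bytes.
    response = rekognition.compare_faces(
        SourceImage={"S3Object": {"Bucket": "pmss-archive",          # hypothetical bucket
                                  "Name": "pettit_portrait.jpg"}},   # hypothetical file
        TargetImage={"S3Object": {"Bucket": "pmss-archive",
                                  "Name": "group_photo_1921.jpg"}},
        SimilarityThreshold=80,
    )

    # Each match carries a similarity score and a bounding box, which a human
    # reviewer can then confirm or reject.
    for match in response["FaceMatches"]:
        print(match["Similarity"], match["Face"]["BoundingBox"])

In practice, a human verification step of the kind our partners requested would sit between this call and any metadata update.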
They wanted to maintain authority, and while we saw our efforts as contributions to their goals of better describing their archival holdings, it turns out that the larger scope of automation we brought to the project was intimidating. While its legal status and direct ethics seemed settled before the beginning of the project, ultimately, this project contributed to a sense among some individuals at PMSS that they were losing control of their own archive.[14] That fear of a loss of control led to another reckoning with the project, as we discuss in the next section.

[14] See, for another example of the ethical quandaries that may be associated with legal applications of machine learning techniques, Ema et al. 2019.

What Machine Learning Cannot Learn: An Ethics of the Archive

It became clear, at the same moment we validated our test case, that our research goals and those of our partners had quickly diverged. We had discussed the scope and use of PMSS materials with our partners at PMSS and laid out our shared goals in the project in a formally drafted Memorandum of Understanding (MOU) adapted from the US Department of Justice (2008; 2017). As we described in the MOU, both partners considered it mutually beneficial for the archive and its metadata to be able to identify faces of named as well as unnamed people. We aimed to capture single-person images as well as groups in order to enrich the archive with cross-links to other photographs or archival materials with a shared subject heading, and we hoped to increase the number of names included in object attributes. Despite those conversations and multiple revisions of the MOU draft, what we discovered was ultimately different from the path our planning had indicated. Instead of creating an historical social network using the five decades of photographs we had prepared, we found that the history of the social network and the family and kinship relationships detailed through those images was deeply personal for the community living in the region today. We found out the hard way that those kinships reflected economic changes in status and power, realignments among families and their communities, and new patterns in the social fabric formed by the warp of personal relationships and the weft of local institutions (schools, hospitals, and local governance). Revealing those changes was not always something that our partners wanted us to do, and these were not patterns we had sought to discover: they are simply there, embedded in the images and the relations among images.

These social changes in local alignments, tied in complex ways to marriages and separations, legal conflicts and resolutions, changes in ownership of residential and commercial interests, and other material reflections of that social fabric, remain highly charged; for those continuing to live in the area, they revealed potentially unexpected parts of the lived realities and values of the place. As a result, even though we had an MOU that worked for the technical details of the project, we could not find common ground for how to handle the competing social and ethical values of the project.

As we problem-solved, we tried to describe new forms of restriction and to generate appropriately sensitive guidelines to handle future use and access, but it turned out that all of these approaches were threatening to the values of a tightly knit community. They, rightly, want to tell their story, and so many people have told it so poorly for so long that they wish to have sole access to the materials from which the narratives are assembled.
As researchers interested in open access and stable platform management, we disagree with the scholarly and archival implications of this decision, but we ultimately respect the resolve and underlying values that accompany the difficult choices PMSS makes about its public audiences and the corresponding goals it maintains for its collections. Interestingly, Wykle has come to view our work with PMSS collections as another form of the material and cultural extraction that has dominated the region for generations. While we see our work in light of preservation and access as well as our lasting commitment to PMSS and the region, we have also come to recognize the powerful explanatory force that the idea of "extraction" carries for the communities in a region that has suffered many forms of extraction industries' negative effects.

In acknowledging the limitations of our own efforts, we would posit that our case study offers a counter-example to works that suggest how AI systems can be designed automatically to meet the needs of their constituents (Winfield et al. 2019). We tried to use a design approach to address our research goals and our partners' needs, and it turned out that the dynamically constructed and evolving nature of those needs outstripped the capacity we could build into our available system of machine learning.

The divergence of our goals has led the collaboration to an impasse. Given that we had already outlined further steps in our initial documents that could not be satisfied after the partners identified their divergent intentions, the collaborative scope the partners initially described was not completely fulfilled. The divergence became stark: as researchers interested in the relevance and sustainability of these archives, we were moving the collections toward a more accessible and comprehensive platform with open documentation and protocols for future development. By contrast, the PMSS staff were moving toward more stringent and local controls over access to the archives in order to limit dissemination. At this juncture, we had some negotiating to do. First, we made the ContentDM instance a password-protected, not publicly accessible (private) sandbox rather than a public instance of a virtual digital collection. As PMSS owns the material, they decided shortly thereafter to issue a take-down order for the ContentDM instance, and we complied. Since the ContentDM materials were ultimately accessible in the public domain on PMSS's live site, this decision revealed how personal the challenges had become. Nothing included in the take-down order was unique or new material; rather, the ContentDM site simply provided a more accessible format for existing primary material on the WordPress site, stripped of its interpretive and secondary contexts.

If there is a silver lining, it lies in this context for use: the "academic divorce" we underwent by discontinuing our collaboration has made it possible for us to continue conducting research on the publicly available archival materials without being obligated to host a live and dynamic repository for further materials. As a result, we can test best approaches without having to worry about pushing them to a live production site.
Within this constraint, we aim to continue re-creating the historical social network without compromising our partners' needs for privacy and control of their production site. The mutual decision to terminate further partnership activities based in archival development arose because of these differing paths forward. That decision meant that any further enrichment of the archival materials would not become publicly available, which we saw as a penalty against using the archive at a moment when archives need as much advocacy and visible support as possible.

Under these constraints of private accessibility, we have continued to work on the AWS Rekognition pipeline and have successfully identified all of the faces of named people featured in the archive, with face and name labels now associated with over 1,900 unique images. Our next step, delayed to Spring 2021 as a result of the COVID-19 pandemic, includes the creation of an associative network that first identifies unnamed faces in each image using unique identifiers. The second element of that process will be to generate an historical social network using the co-occurrence among those faces as well as the faces of named people in the available images. Given that our metadata enrichment has already included date associations for most of the images, we are confident that we will be able to reconstruct historically specific networks for a given year or range of years, and moreover, that the association between dates and named people will help us to identify further members of the community who are not currently named in the photographs, because of the small groups involved in activities and clubs as well as the generally limited student and teacher populations during any given year. We are now far more sensitive to how the local concerns of this community shape our research methods and outcomes.

The longer-term hope, one it is not at all clear we will be allowed to pursue, would be to use natural language processing tools on the archive's textual materials, particularly named entity recognition and word vectors, to search and match images where known names occur proximate to the names of unmatched faces. The present goal, however, remains to create a more replete and densely connected network of faces and the places they occupied when they were living in the gentle shadows of Pine Mountain. In order to abide by PMSS community wishes for privacy, we will be using anonymized aggregate results without identifying individuals in the photographs. While this method has the drawback of not being able to reveal the complexity of the historical relations at the granular level of individuals, it will allow us to report on the persistence or variation in network metrics, such as network density, centrality, path length, and betweenness measures, among others. In this way, we aim to be able to measure and report on the network and its changes over time without reporting on individuals. We arrived at an anonymizing method as a solution to the dissolved partnership by asking about the constraints of FERPA as well as by looking back at federal and commercial facial recognition practices. In each case, the dark side of these technological tools remains one associated with surveillance and, in the language of Eastern Kentucky, extraction.
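As a small illustration of that aggregate-only reporting, the sketch below builds a co-occurrence network from anonymized face identifiers and reports only network-level measures. The identifiers and photo groupings are hypothetical.

    import itertools
    import networkx as nx

    # Each photograph contributes a clique among the anonymized face IDs that
    # co-occur in it; the IDs here are hypothetical placeholders.
    photos = [
        {"face_017", "face_042", "face_103"},
        {"face_042", "face_103"},
        {"face_017", "face_250"},
    ]

    G = nx.Graph()
    for faces in photos:
        for a, b in itertools.combinations(sorted(faces), 2):
            G.add_edge(a, b)

    # Report only aggregate measures, never individual identities.
    print("density:", nx.density(G))
    print("avg path length:", nx.average_shortest_path_length(G))
    print("mean betweenness:",
          sum(nx.betweenness_centrality(G).values()) / G.number_of_nodes())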
We mention this not only to be transparent about our recognition of these limitations, but also in the hopes of opening a new dialogue with our partners that might stem from generating interesting discoveries without compromising their sense of the local ownership of their archival materials. Nonetheless, in order to report on the most interesting aspects, the actual people and their local histories of place, the work to be done would remain more at a human level than at a technical one.

Conclusion

In conclusion, our project describes a success that remains imbricated with a shortcoming in machine learning. The machine learning tasks and algorithms our project implemented serve a mimetic function in the distilled picture of the community they reflect. By matching historical faces to names, the project embraces a form of digital surrogacy: we have aimed to produce a meta-historical account of the present institution's social and cultural function as a site of social networking and local knowledge transmission. As Robyn Caplan and danah boyd have recently suggested, the "bureaucratic functions" these algorithms promote can be understood by the ways in which they structure users' behaviors (2018, 3). We would like to supplement Caplan and boyd's insight regarding the potential coercions involved in how data structures implicitly shape their contents as well as their users' behaviors. Not only do algorithms promote a kind of bureaucracy, to ends that may be positive and negative, and sometimes both at once, but further, those same structures may reflect or shape public behaviors and interactions beyond a single platform.

As we move between digital and public spheres, our work similarly shifts its scope. The research that we intended to have positive community effects was instead read by that very same set of people as an attempt to displace a community from the center of its own history. In other words, the bureaucratic functions embedded in PMSS as an institution saw our new approach to their storytelling as an unwanted and external intervention. As their response suggests, the internal and extant structures for governing their community, its stories, and the people who tell them saw our contribution as an effort to co-opt their control. Where we thought we were offering new tools for capturing, discovering, and telling stories, they saw what Safiya Noble has recently characterized in a specifically racialized context as "algorithms of oppression" (2018). Here the oppression would be geographic, socio-economic, and cultural, rather than racial; nevertheless, the perception that one is being oppressed by systems set into place by agents working beyond one's own community remains a shared foundation in Noble's argument and in the unexpected reception of our project. As we move forward with our own project into unknown territories, in which our work products may never see the light of day because of the value conflicts bound up in making archival objects public and accessible, we have found a real and lasting respect for the institutional dependencies and emplacements within which we all do our work. We hope to channel some of those functions of emplacement to create new forms of accountability and restraint that will allow us to move forward, but at least for now, we have found with our project one limitation of machine learning, and it is not the machine.

References

Ahmed, Manan, Maira E. Álvarez, Sylvia A.
Fernández, Alex Gil, Rachel Hendery, Moacir P. de Sá Pereira, and Roopika Risam. 2018. "Torn Apart / Separados." Group for Experimental Methods in Humanistic Research. https://xpmethod.plaintext.in/torn-apart/volume/2/.

Bailey, Ronald. 2017. "The Noble, Misguided Plan to Turn Coal Miners Into Coders." Reason, November 25, 2017. https://reason.com/2017/11/25/the-noble-misguided-plan-to-tu/.

Calo, Ryan. 2017. "Artificial Intelligence Policy: A Primer and Roadmap." University of California, Davis Law Review 51: 399-435.

Caplan, Robyn, and danah boyd. 2018. "Isomorphism through Algorithm: Institutional Dependencies in the Case of Facebook." Big Data & Society (January-June): 1-12. https://doi.org/10.1177/2053951718757253.

Cassidy, Frederic G., et al., eds. 1985-2012. Dictionary of American Regional English. Cambridge, MA: Belknap Press. https://www.daredictionary.com.

Ema, Arisa, et al. 2019. "Clarifying Privacy, Property, and Power: Case Study on Value Conflict Between Communities." Proceedings of the IEEE 107, no. 3 (March): 575-80. https://doi.org/10.1109/JPROC.2018.2837045.

Harkins, Anthony, and Meredith McCarroll, eds. 2019. Appalachian Reckoning: A Region Responds to Hillbilly Elegy. Morgantown, WV: West Virginia University Press.

Hochschild, Arlie. 2018. "The Coders of Kentucky." The New York Times, September 21, 2018. https://www.nytimes.com/2018/09/21/opinion/sunday/silicon-valley-tech.html.

Joh, Elizabeth. 2018. "Artificial Intelligence and Policing: First Questions." Seattle University Law Review 41 (4): 1139-44.

Latour, Bruno. 2007. Reassembling the Social: An Introduction to Actor-Network-Theory. New York: Oxford University Press.

Levendowski, Amanda. 2018. "How Copyright Law Can Fix Artificial Intelligence's Implicit Bias Problem." Washington Law Review 93 (2): 579-630.

Mukurtu CMS. https://mukurtu.org/. Accessed December 12, 2019.

Noble, Safiya. 2018. Algorithms of Oppression: How Search Engines Reinforce Racism. New York: NYU Press.

Passamaquoddy People. "Passamaquoddy Traditional Knowledge Labels." https://passamaquoddypeople.com/passamaquoddy-traditional-knowledge-labels. Accessed December 12, 2019.

Risam, Roopika. 2015. "Beyond the Margins: Intersectionality and the Digital Humanities." DHQ: Digital Humanities Quarterly 9 (2). http://digitalhumanities.org/dhq/vol/9/2/000208/000208.html.

Robertson, Campbell. 2019. "They Were Promised Coding Jobs in Appalachia. Now They Say It Was a Fraud." The New York Times, May 12, 2019. https://www.nytimes.com/2019/05/12/us/mined-minds-west-virginia-coding.html.

Sabharwal, Anil. 2016. "Moving on from Picasa." Google Photos Blog. Last modified March 26, 2018. https://googlephotos.blogspot.com/2016/02/moving-on-from-picasa.html.

Sabharwal, Arjun. 2015. Digital Curation in the Digital Humanities: Preserving and Promoting Archival and Special Collections. Boston: Chandos.

Stephan, Karl D., Katina Michael, M.G.
Michael, Laura Jacob, and Emily P. Anesta. 2012. "Social Implications of Technology: The Past, the Present, and the Future." Proceedings of the IEEE 100, Special Centennial Issue (May): 1752-1781. https://doi.org/10.1109/JPROC.2012.2189919.

United States Department of Justice. 2008. "Guidelines for a Memorandum of Understanding." https://www.justice.gov/sites/default/files/ovw/legacy/2008/10/21/sample-mou.pdf.

United States Department of Justice. 2017. "Sample Memorandum of Understanding." http://www.doj.state.or.us/wp-content/uploads/2017/08/mou_sample_guidelines.pdf.

Vance, J.D. 2016. Hillbilly Elegy: A Memoir of a Family and Culture in Crisis. New York: Harper.

Weizenbaum, Joseph. 1976. Computer Power and Human Reason: From Judgment to Calculation. New York: W.H. Freeman and Co.

Winfield, Alan F., Katina Michael, Jeremy Pitt, and Vanessa Evers. 2019. "Machine Ethics: The Design and Governance of Ethical AI and Autonomous Systems." Proceedings of the IEEE 107, no. 3 (March): 509-17. https://doi.org/10.1109/JPROC.2019.2900622.

hansen-can-2021 ---- Chapter 14 Can a Hammer Categorize Highly Technical Articles?

Samuel Hansen, University of Michigan

When everything looks like a nail...

I was sure I had the most brilliant research project idea for my course in Digital Scholarship techniques. I would use the Mathematics Subject Classification (MSC) values assigned to the publications in MathSciNet[1] to create a temporal citation network which would allow me to visualize how new mathematical subfields were created, and perhaps even predict them while they were still in their infancy. I thought it would be an easy enough project. I already knew how to analyze network data, and the data I needed already existed; I just had to get my hands on it. I even sold a couple of my fellow coursemates on the idea, and they agreed to work with me. Of course nothing is as easy as that, and numerous requests for data went without response. Even after I reached out to personal contacts at MathSciNet, we came to understand we would not be getting the MSC data the entire project relied upon. Not that we were going to let a little setback like not having the necessary data stop us. After all, this was early 2018, and there had already been years of stories about how artificial intelligence, machine learning in particular, was going to revolutionize every aspect of our world (Kelly 2014; Clark 2015; Parloff 2016; Sangwani 2017; Tank 2017).

[1] See https://mathscinet.ams.org/.
All the coverage made it seem like AI was not only a tool with as many applications as a hammer, but that it also magically turned all problems into nails. While none of us were AI experts, we knew that machine learning was supposed to be good at classification and categorization. The promise seemed to be that if you had stacks of data, a machine learning algorithm could dive in, find the needles, and arrange them into neatly divided piles of similar sharpness and length. Not only that, but there were pre-built tools that made it so almost anyone could do it. For a group of people whose project was on life support because we could not get the categorization data we needed, machine learning began to look like our only potential savior. So, machine learning is what we used.

I will not go too deep into the actual process, but I will give a brief outline of the techniques we employed. Machine-learning-based categorization needs data to classify, which in our case were mathematics publications. While this can be done with titles and abstracts, we wanted to provide the machine with as much data as we could, so we decided to work with full-text articles. Since we were at the University of Wisconsin at the time, we were able to connect with the team behind GeoDeepDive,[2] who have agreements with many publishers to provide the full text of articles for text and data mining research ("GeoDeepDive: Project Overview" n.d.). GeoDeepDive provided us with the full text of 22,397 mathematics articles, which we used as our corpus. In order to classify these articles, which were already pre-processed by GeoDeepDive with CoreNLP,[3] we first used the Python package Gensim[4] to process the articles into a Python-friendly format and to remove stopwords. Then we randomly sampled 1/3 of the corpus to create a topic model using the MALLET[5] topic modeling tool. Finally, we applied the model to the remaining articles in our corpus. We then coded the words within the generated topics to subfields within mathematics and used those codes to assign articles a subfield category. In order to make sure our results were not just a one-off, we repeated this process multiple times and checked for variance in the results. There was none; the results were uniformly poor.

That might not be entirely fair. There were interesting aspects to the results of the topic modeling, but when it came to categorization they were useless. Of the subfield codes assigned to articles, only two were ever the dominant result for any given article: Graph Theory and Undefined. Even that does not tell the whole story, as Undefined was the runaway winner in the article classification race, with more than 70% of articles classified as Undefined in each run, including one for which it hit 95%. The topics generated by MALLET were often plagued by gibberish caused by equations in the mathematics articles, and there was at least one topic in each run that was filled with the names of months and locations. Add to this that the technical language of mathematics is filled with words that have non-technical definitions (for example, map or space) and words which have their own subfield-specific meanings (such as homomorphism or degree), both of which frustrate attempts to code a subfield.

[2] See https://geodeepdive.org/.
[3] See https://stanfordnlp.github.io/CoreNLP/.
[4] See https://radimrehurek.com/gensim/.
[5] See http://mallet.cs.umass.edu/topics.php.
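As a rough sketch of the pipeline just described, the following shows Gensim-style preprocessing and topic modeling. The corpus stub and topic count are hypothetical, and we substitute Gensim's built-in LDA here, where the project used the external MALLET tool.

    import random
    from gensim.utils import simple_preprocess
    from gensim.parsing.preprocessing import STOPWORDS
    from gensim.corpora import Dictionary
    from gensim.models import LdaModel

    # Hypothetical stand-in for the 22,397 full-text articles from GeoDeepDive.
    raw_articles = ["full text of article one ...", "full text of article two ...",
                    "full text of article three ..."]

    # Tokenize each article and drop stopwords.
    docs = [[t for t in simple_preprocess(text) if t not in STOPWORDS]
            for text in raw_articles]

    # Build the topic model from a random 1/3 sample of the corpus.
    sample = random.sample(docs, max(1, len(docs) // 3))
    dictionary = Dictionary(sample)
    lda = LdaModel([dictionary.doc2bow(d) for d in sample],
                   id2word=dictionary, num_topics=50)  # topic count is arbitrary here

    # Apply the model to every article and keep its dominant topic, which a
    # human coder then maps to a mathematical subfield (or to "Undefined").
    for doc in docs:
        topics = lda.get_document_topics(dictionary.doc2bow(doc))
        dominant = max(topics, key=lambda pair: pair[1]) if topics else None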
These vocabulary issues help make it clear why so many articles ended up as "Undefined." Even for the one subfield which had a vocabulary unique enough for our topic model to partially identify, Graph Theory, the results were marginally positive at best. We were able to obtain Mathematics Subject Classification (MSC) values for around 10% of our corpus. When we compared the articles we categorized as Graph Theory to the articles which had been assigned the MSC value for Graph Theory (05Cxx), we found we had a textbook recall-versus-precision problem. We could either correctly categorize nearly all of the Graph Theory articles with a very high rate of false positives (high recall and low precision) or almost never incorrectly categorize an article as Graph Theory but miss over 30% that we should have categorized as Graph Theory (high precision and low recall). Needless to say, we were not able to create the temporal subfield network I had imagined. While we could reasonably claim that we learned very interesting things about the language of mathematics and its subfields, we could not claim we even came close to automatically categorizing mathematics articles. When we had to report back on our work at the end of the course, our main result was that basic, off-the-shelf topic modelling does not work well when it comes to highly technical articles from subjects like mathematics. It was also a welcome lesson in not believing the hype of machine learning, even when a problem looks exactly like the kind machine learning was supposed to excel at solving. While we had a hammer and our problem looked like a nail, it seemed that the former was a ball peen and the latter a railroad tie. In the end, even in the land of hammers and nails, the tool has to match the task.

Though we failed to accomplish automated categorization of mathematics, we were dilettantes in the world of machine learning. I believe our project is a good example of how machine learning is still a long way from being the magic tool that some, though not all (Rahimi and Recht 2017), have portrayed it to be. Let us look at what happens when smarter and more capable minds tackle the problem of classifying mathematics and other highly technical subjects using advanced machine learning techniques.

Finding the Right Hammer

To illustrate the quest to find the right hammer, I am going to focus on three different projects that tackled the automated categorization of highly technical content: two that also attempted to categorize mathematical content, and one that looked to categorize scholarly works in general. These three projects provide examples of many of the approaches and practices employed by experts in automated classification and demonstrate the two main paths that these types of projects follow to accomplish their goals. Since we have been discussing mathematics, let us start with those two projects.
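Before doing so, since microaveraged F1 scores anchor the comparisons that follow, it is worth recalling the standard definitions, with TP, FP, and FN counting true positives, false positives, and false negatives (microaveraging pools these counts across all categories before computing the scores):

    \[
    \mathrm{precision} = \frac{TP}{TP + FP}, \qquad
    \mathrm{recall} = \frac{TP}{TP + FN}, \qquad
    F_1 = \frac{2 \cdot \mathrm{precision} \cdot \mathrm{recall}}{\mathrm{precision} + \mathrm{recall}}.
    \]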
Both projects began because the participants were struggling to categorize mathematics publications so they would be properly indexed and searchable in digital mathematics databases: the Czech Digital Mathematics Library (DML-CZ)[6] and NUMDAM[7] in the case of Radim Řehůřek and Petr Sojka (Řehůřek and Sojka 2008), and Zentralblatt MATH (zbMath)[8] in the case of Simon Barthel, Sascha Tönnies, and Wolf-Tilo Balke (Barthel, Tönnies, and Balke 2013). All of these databases rely on the aforementioned MSC[9] to aid in indexing and retrieval, and so their goal was to automate the assignment of MSC values to lower the time and labor cost of requiring humans to do this task. The main differences between their tasks related to the number of documents they were working with (thousands for Řehůřek and Sojka, millions for Barthel, Tönnies, and Balke), the amount of each work available (full text for Řehůřek and Sojka; titles, authors, and abstracts for Barthel, Tönnies, and Balke), and the quality of the data (mostly OCR scans for Řehůřek and Sojka, mostly TeX for Barthel, Tönnies, and Balke). Even with these differences, both projects took a similar approach, and it is the first of the two main pathways toward classification I spoke of earlier: using a predetermined taxonomy and a set of pre-categorized data to build a machine learning categorizer.

In the end, while both projects determined that the use of Support Vector Machines (Gandhi 2018)[10] provided the best categorization results, their implementations were different. The Řehůřek and Sojka SVMs were trained with terms weighted using augmented term frequency[11] and dynamic decision threshold[12] selection using s-cut[13] (Řehůřek and Sojka 2008, 549), and Barthel, Tönnies, and Balke's with term weighting using term frequency-inverse document frequency[14] and Euclidean normalization[15] (Barthel, Tönnies, and Balke 2013, 88), but the main difference was how they handled formulae. In particular, the Barthel, Tönnies, and Balke group split their corpus into words and formulae and mapped them to separate vectors, which were then merged together into a combined vector used for categorization. Řehůřek and Sojka did not differentiate between words and formulae in their corpus, and they did note that their OCR scans' poor handling of formulae could have hindered their results (Řehůřek and Sojka 2008, 555). Ultimately, not having the ability to handle formulae separately did not seem to matter, as Řehůřek and Sojka claimed microaveraged F1 scores of 89.03% (Řehůřek and Sojka 2008, 549) when classifying the top-level MSC category with their best performing SVM.

[6] See https://dml.cz/.
[7] See http://www.numdam.org/.
[8] See https://zbmath.org/.
[9] Mathematics Subject Classification (MSC) values in MathSciNet and zbMath are a particularly interesting categorization set to work with, as they are assigned and reviewed by a subject area expert editor and an active researcher in the same, or a closely related, subfield as the article's content before they are published. This multi-step process of review yields a built-in accuracy check for the categorization.
[10] Support Vector Machines (SVMs) are machine learning models which are trained using a pre-classified corpus to split a vector space into a set of differentiated areas (or categories) and then attempt to classify new items by where in the vector space the trained model places them. For a more in-depth, technical explanation, see https://towardsdatascience.com/support-vector-machine-introduction-to-machine-learning-algorithms-934a444fca47.
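To give a concrete picture of this family of approaches, here is a minimal scikit-learn sketch of a tf-idf-weighted linear SVM classifier. The training snippets and labels are hypothetical; neither project used scikit-learn, and the sketch omits the augmented-term-frequency and s-cut refinements described above.

    from sklearn.feature_extraction.text import TfidfVectorizer
    from sklearn.svm import LinearSVC

    # Hypothetical training data: article text paired with an MSC-style label.
    texts = ["every planar graph admits a proper coloring with few colors",
             "the k-theory of this operator algebra follows from a spectral sequence"]
    labels = ["Graph Theory", "K-Theory"]

    # Tf-idf weighting with Euclidean (L2) normalization, as in Barthel,
    # Tönnies, and Balke's word vectors.
    vectorizer = TfidfVectorizer(norm="l2")
    X = vectorizer.fit_transform(texts)

    # Linear SVMs, trained one-vs-rest across the categories.
    clf = LinearSVC().fit(X, labels)
    print(clf.predict(vectorizer.transform(["a new bound on the chromatic number"])))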
When this is compared to the microaveraged F1 of 67.3% obtained by Barthel, Tönnies, and Balke (Barthel, Tönnies, and Balke 2013, 88), it would seem that either Řehůřek and Sojka's implementation of SVMs or their access to full text led to a clear advantage. This advantage becomes less clear when one takes into account that Řehůřek and Sojka were only working with top-level MSCs for which they had at least 30 (60 in the case of their best result) articles, and their limited corpus meant that many top-level MSC categories would not have been included. Looking at the work done by Barthel, Tönnies, and Balke makes it clear that these less common MSC categories, such as K-Theory or Potential Theory, for which Barthel, Tönnies, and Balke achieved microaveraged F1 measures of 18.2% and 24% respectively, have a large impact on the overall effectiveness of the automated categorization. Remember, this is only for the top level of MSC codes, and the work of Barthel, Tönnies, and Balke suggests it would get worse when trying to apply the second and third levels for full MSC categorization to these less common categories. This leads me to believe that in the case of categorizing highly technical mathematical works to an existing taxonomy, people have come close to identifying the overall size of the machine learning hammer, but are still a long way away from finding the right match for the categorization nail.

Now let us shift from mathematics-specific categorization to subject categorization in general and look at the work Microsoft has done assigning Fields of Study (FoS) in the Microsoft Academic Graph (MAG), which is used to create their Microsoft Academic article search product.[16] While the MAG FoS project is also attempting to categorize articles for proper indexing and search, it represents the second path taken by automated categorization projects: using machine learning techniques both to create the taxonomy and to classify.

Microsoft took a unique approach in the development of their taxonomy.

[11] Augmented term frequency refers to the number of times a term occurs in the document divided by the number of times the most frequently occurring term appears in the document.
[12] The decision threshold is the cut-off for how close to a category the SVM must determine an item to be in order for it to be assigned that category. Řehůřek and Sojka's work varied this threshold dynamically.
[13] Score-based local optimization, or s-cut, allows a machine-learning model to set different thresholds for each category, with an emphasis on local (per-category) rather than global performance.
[14] Term frequency-inverse document frequency weights terms according to how frequently they occur across the corpus. A term which occurs rarely across the corpus but with a high frequency within a single document will have a higher weight when classifying the document in question.
[15] A Euclidean norm provides the distance from the origin to a point in an n-dimensional space. It is calculated by taking the square root of the sum of the squares of all coordinate values.
[16] See https://academic.microsoft.com/.
Instead of relying on the corpus of articles in the MAG to develop it, they relied primarily on Wikipedia for its creation. They generated an initial seed by referencing the Science-Metrix classification scheme[17] and a couple thousand FoS Wikipedia articles they identified internally. They then used an iterative process to identify more FoS in Wikipedia based on whether candidate articles were linked to Wikipedia articles already identified as FoS and whether they represented valid entity types (e.g., an entity type of protein would be added and an entity type of person would be excluded) (Shen, Ma, and Wang 2018, 3). This work allowed Microsoft to develop a list of more than 200,000 Fields of Study for use as categories in the MAG.

Microsoft then used machine learning techniques to apply these FoS to their corpus of over 140 million academic articles. The specific techniques are not as clear as they were with the previous examples, likely due to Microsoft protecting their specific methods from competitors, but the article their researchers published to the arXiv (Shen, Ma, and Wang 2018) and the write-up on the MAG website do make it clear they used vector-based convolutional neural networks which relied on Skip-gram embeddings (Mikolov et al. 2013) and bag-of-words/entities features to create their vectors ("Microsoft Academic Increases Power of Semantic Search by Adding More Fields of Study" 2018). One really interesting part of the machine learning method used by Microsoft was that it did not rely only on information from the article being categorized. It also utilized the article's citation and reference information in the MAG, using the FoS assigned to those citations and references to influence the FoS of the original article.

The identification of potential FoS and their assignment to articles was only a part of Microsoft's purpose. In order to fully index the MAG and make it searchable, they also wished to determine the relationships between the FoS; in other words, they wanted to build a hierarchical taxonomy. To achieve this they used the article categorizations and defined a Field of Study A as the parent of B if the articles categorized as B were close to a subset of the articles categorized as A (a more formal definition can be found in Shen, Ma, and Wang 2018, 4). This work, which created a six-level hierarchy, was mostly automated, but Microsoft did inspect and manually adjust the relationships between FoS on the highest two levels.

To evaluate the quality of their FoS taxonomy and categorization work, Microsoft randomly sampled data at each of the three steps of the project and used human judges to assess their accuracy. The accuracy assessments of the three steps were not as complete as they would be with the mathematics categorization, as that approach would evaluate terms across the whole of their data sets, but the projects are of very different scales, so different methods are appropriate.
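Before turning to the evaluation numbers, the parent-child rule is worth making concrete. A toy sketch follows; the 0.8 threshold is our invention, and the actual definition in Shen, Ma, and Wang (2018) is more involved.

    # Toy version of the FoS hierarchy rule: A is a candidate parent of B when
    # the articles tagged B are (nearly) a subset of the articles tagged A.
    def is_parent(docs_a: set, docs_b: set, threshold: float = 0.8) -> bool:
        if not docs_b or len(docs_b) >= len(docs_a):
            return False
        return len(docs_a & docs_b) / len(docs_b) >= threshold

    # Hypothetical article ID sets for two fields of study.
    machine_learning = {1, 2, 3, 4, 5, 6}
    deep_learning = {2, 3, 4}
    print(is_parent(machine_learning, deep_learning))  # True: nested field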
To evaluate the quality of their FoS taxonomy and categorization work, Microsoft randomly sampled data at each of the three steps of the project and used human judges to assess their accuracy. The accuracy assessments of the three steps were not as complete as they would be with the mathematics categorization, as that approach would evaluate terms across the whole of their data sets, but the projects are of very different scales, so different methods are appropriate. In the end Microsoft estimates the accuracy of the FoS at 94.75%, the article categorization at 81.2%, and the hierarchy at 78% (Shen, Ma, and Wang 2018, 5). Since MSC was created by humans there is no meaningful way to compare the FoS accuracy measurements, but the categorization accuracy falls somewhere between that of the two mathematics projects. This is a very impressive result, especially when the aforementioned scale is taken into account. Instead of trying to replace the work of humans categorizing mathematics articles indexed in a database, which for 2018 was 120,324 items in MathSciNet18 and 97,819 in zbMath,19 the FoS project is trying to replace the human categorization of all items indexed in MAG, which was 10,616,601 in 2018.20
17. See http://science-metrix.com/?q=en/classification.
18. See https://mathscinet.ams.org/mathscinet/search/publications.html?dr=pubyear&yrop=eq&arg3=2018.
19. See https://zbmath.org/?q=py%3A2018.
20. See https://academic.microsoft.com/publications/33923547.
Both zbMath and MathSciNet were capable of providing the human labor to do the work of assigning MSC values to the mathematics articles they indexed in 2018.21 Therefore using an automated categorization, which at best could only get the top level right with 90% accuracy, was not the right approach. On the other hand, it seems clear that no one could feasibly provide the human labor to categorize all articles indexed by MAG in 2018, so an 80% accurate categorization is a significant accomplishment. To go back to the nail and hammer analogy, Microsoft may have used a sledgehammer, but they were hammering a rather giant nail.
Are You Sure it's a Nail?
I started this chapter talking about how we have all been told that AI and machine learning were going to revolutionize everything in the world. That they were the hammers and all the world's problems were nails. I found that this was not the case when we tried to employ it, in an admittedly rather naive fashion, to automatically categorize mathematical articles. From the other examples I included, it is also clear computational experts find the automatic categorization of highly technical content a hard problem to tackle, one where success is very much dependent on what it is being measured against. In the case of classifying mathematics, machine learning can do a decent job, but not well enough to compete with humans. In the case of classifying everything, scale gives machines an edge, as long as you have the computational power and knowledge wielded by a company like Microsoft.
This collection is about the intersection of AI, machine learning, deep learning, and libraries. While there are definitely problems in libraries where these techniques will be the answer, I think it is important to pause and consider if artificial intelligence techniques are the best approach before trying to use them. Libraries, even those like the one I work in, which are lucky enough to boast of incredibly talented IT departments, do not tend to have access to a large amount of unused computational power or numerous experts in bleeding-edge AI.
They are also rather notoriously limited budget-wise and would likely have to decide between existing budget items and developing an in-house machine learning program. Those realities, combined with the legitimate questions which can be raised about the efficacy of machine learning and AI with respect to the types of problems a library may encounter, such as categorizing the contents of highly technical articles, make me worry. While there will be many cases where using AI makes sense, I want to be sure libraries are asking themselves a lot of questions before starting to use it. Questions like: is this problem large enough in scale to substitute machines for human labor, given that machines will likely be less accurate? Or: will using machines to solve this problem cost us more in equipment and highly technical staff than our current solution, and does that cost factor in the people and services a library may need to cut to afford them? Or: does the data we have to train a machine contain bias, and will it therefore produce a biased model which will only serve to perpetuate existing inequities and systemic oppression? Not to mention: is this really a problem, or are we just looking for a way to employ machine learning to say that we did? In the cases where the answers to these questions are yes, it will make sense for libraries to employ machine learning. I just want libraries to look really carefully at how they approach problems and solutions, to make sure that their problem is, in fact, a nail, and then to look even closer and make sure it is the type of nail a machine-learning hammer can hit.
21. When an article is indexed by MathSciNet it receives initial MSC values from a subject area editor, who then passes the article along to an external expert reviewer who suggests new MSC values, completes partial values, and provides potential corrections to the MSC values assigned by the editors (“Mathematical Reviews Guide For Reviewers” 2015); the subject area editors then make the final determination in order to make sure internal styles are followed. zbMath follows a similar procedure.
References
Barthel, Simon, Sascha Tönnies, and Wolf-Tilo Balke. 2013. “Large-Scale Experiments for Mathematical Document Classification.” In Digital Libraries: Social Media and Community Networks, edited by Shalini R. Urs, Jin-Cheon Na, and George Buchanan, 83–92. Cham: Springer International Publishing.
Clark, Jack. 2015. “Why 2015 Was a Breakthrough Year in Artificial Intelligence.” Bloomberg, December 8, 2015. https://www.bloomberg.com/news/articles/2015-12-08/why-2015-was-a-breakthrough-year-in-artificial-intelligence.
Gandhi, Rohith. 2018. “Support Vector Machine—Introduction to Machine Learning Algorithms.” Medium, July 5, 2018. https://towardsdatascience.com/support-vector-machine-introduction-to-machine-learning-algorithms-934a444fca47.
“GeoDeepDive: Project Overview.” n.d. Accessed May 7, 2018. https://geodeepdive.org/about.html.
Kelly, Kevin. 2014. “The Three Breakthroughs That Have Finally Unleashed AI on the World.” Wired, October 27, 2014. https://www.wired.com/2014/10/future-of-artificial-intelligence/.
“Mathematical Reviews Guide For Reviewers.” 2015. American Mathematical Society. February 2015. https://mathscinet.ams.org/mresubs/guide-reviewers.html.
“Microsoft Academic Increases Power of Semantic Search by Adding More Fields of Study.” 2018. Microsoft Academic (blog). February 15, 2018. https://www.microsoft.com/en-us/research/project/academic/articles/microsoft-academic-increases-power-semantic-search-adding-fields-study/.
Mikolov, Tomas, Ilya Sutskever, Kai Chen, Greg S. Corrado, and Jeff Dean. 2013. “Distributed Representations of Words and Phrases and Their Compositionality.” In Advances in Neural Information Processing Systems 26, edited by C. J. C. Burges, L. Bottou, M. Welling, Z. Ghahramani, and K. Q. Weinberger, 3111–3119. Curran Associates, Inc. http://papers.nips.cc/paper/5021-distributed-representations-of-words-and-phrases-and-their-compositionality.pdf.
Parloff, Roger. 2016. “From 2016: Why Deep Learning Is Suddenly Changing Your Life.” Fortune, September 28, 2016. https://fortune.com/longform/ai-artificial-intelligence-deep-machine-learning/.
Rahimi, Ali, and Benjamin Recht. 2017. “Back When We Were Kids.” Presentation at the NIPS 2017 Conference. https://www.youtube.com/watch?v=Qi1Yry33TQE.
Řehůřek, Radim, and Petr Sojka. 2008. “Automated Classification and Categorization of Mathematical Knowledge.” In Intelligent Computer Mathematics, edited by Serge Autexier, John Campbell, Julio Rubio, Volker Sorge, Masakazu Suzuki, and Freek Wiedijk, 543–57. Berlin: Springer Verlag.
Sangwani, Gaurav. 2017. “2017 Is the Year of Machine Learning. Here’s Why.” Business Insider, January 13, 2017. https://www.businessinsider.in/2017-is-the-year-of-machine-learning-heres-why/articleshow/56514535.cms.
Shen, Zhihong, Hao Ma, and Kuansan Wang. 2018. “A Web-Scale System for Scientific Knowledge Exploration.” Paper presented at the 56th Annual Meeting of the Association for Computational Linguistics, Melbourne, July 2018. http://arxiv.org/abs/1805.12216.
Tank, Aytekin. 2017. “This Is the Year of the Machine Learning Revolution.” Entrepreneur, January 12, 2017. https://www.entrepreneur.com/article/287324.
harper-generative-2021 ---- Chapter 2 Generative Machine Learning
Charlie Harper, PhD, Case Western Reserve University
Introduction
Generative machine learning is a hot topic. With the 2020 election approaching, Facebook and Reddit have each issued their own bans on the category of machine-generated or -altered content that is commonly termed “deep fakes” (Cohen 2020; Romm, Harwell, and Stanley-Becker 2020). Calls for regulation of the broader, and very nebulous, category of fake news are now part of US political debates, too. Although well known and often discussed in newspapers and on TV because of their dystopian implications, deep fakes are just one application of generative machine learning. There is a remarkable need for others, especially humanists and social scientists, to become involved in discussions about the future uses of this technology, but this first requires a broader awareness of generative machine learning's functioning and power. Many articles on the subject of generative machine learning exist in specialized, highly technical literature, but there is little that covers this topic for a broader audience while retaining important high-level information on how the technology actually operates.
This chapter presents an overview of generative machine learning with particular focus on generative adversarial networks (GANs). GANs are largely responsible for the revolution in machine-generated content that has occurred in the past few years, and their impact on our future extends well beyond that of producing purposefully-deceptive fakes. After covering generative learning and the working of GANs, this chapter touches on some interesting and significant applications of GANs that are not likely to be familiar to the reader. The hope is that this will serve as the start of a larger discussion on generative learning outside of the confines of technical literature and sensational news stories.
Figure 2.1: The three most-common letters following “F” in two Markov chains trained on an English and Italian dictionary. Three examples of generated words are given for each Markov chain that show how the Markov chain captures high-level information about letter arrangements in the different languages.
What is Generative Machine Learning?
Machine learning, which is a subdomain of Artificial Intelligence, is roughly divided into three paradigms that rely on different methods of learning: supervised, unsupervised, and reinforcement learning (Murphy 2012, 1–15; Burkov 2019, 1–8). These differ in the types of datasets used for learning and the desired applications. Supervised and unsupervised machine learning use labeled and unlabeled datasets, respectively, to assign unseen data to human-generated labels or statistically-constructed groups. Both supervised and unsupervised approaches are commonly used for classification and regression problems, where we wish to predict categorical or quantitative information about new data.
A combined form of these two paradigms, called semi-supervised learning, which mixes labeled and unlabeled data, also exists. Reinforcement learning, on the other hand, is a paradigm in which an agent learns how to function in a specific environment by being rewarded or penalized for its behavior. For example, reinforcement learning can be used to train a robot to successfully navigate around obstacles in a physical space.
Generative machine learning, rather than being a specific learning paradigm, encompasses an ever-growing variety of techniques that are capable of generating new data based on learned patterns. The process of learning these patterns can engage both supervised and unsupervised learning. A simple, statistical example of one type of generative learning is a Markov chain. From a given set of data, a Markov chain calculates and stores the probabilities of a following state based on a current state. For example, a Markov chain can be trained on a list of English words to store the probabilities of any one letter occurring after another letter. These probabilities chain together to represent the chance of moving from the current letter state (e.g. the letter q) to a succeeding letter state (e.g. the letter u) based on the data from which it has learned. If another Markov chain were trained on Italian words instead of English, the probabilities would change, and for this reason, Markov chains can capture important high-level information about datasets (Figure 2.1). They can then be sampled to generate new data by starting from a random state and probabilistically moving to succeeding states. In figure 2.1, you can see the probability that the letter “F” transitions to the three most common succeeding letters in English and Italian. A few examples of “words” generated by two Markov chains trained on an English and Italian dictionary are also given. The example words are generated by sampling the probability distributions of the Markov chain, letter by letter, so that the generated words are statistically random, but guided by the learned probability of one letter following another. The different probabilities of letter combinations in English and Italian result in distinctly different generated words. This exemplifies how a generative model can capture specific aspects of a dataset to create new data. The letter combinations are nonsense, but they still reflect the high-level structure of Italian and English words in the way letters join together, such as the different utilization of vowels in each language. These basic Markov chains demonstrate the essence of generative learning: a generative approach learns a distribution over a dataset, or in other words, a mathematical representation of a dataset, which can then be sampled to generate new data that exists within the learned structure of that dataset. How convincing the generated data appears to a human observer depends on the type and tuning of the machine learning model chosen and the data upon which the model has been trained.
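A minimal sketch of such a letter-level Markov chain is given below. The tiny word list stands in for the full English and Italian dictionaries used for the chapter's figures.

```python
# A minimal sketch of the letter-level Markov chain described above.
# The tiny word list is a placeholder for a full dictionary; "^" and
# "$" mark the start and end of a word.
import random
from collections import defaultdict

words = ["cat", "cart", "care", "cast", "star", "start"]  # placeholder corpus

transitions = defaultdict(list)
for word in words:
    letters = ["^"] + list(word) + ["$"]
    for current, following in zip(letters, letters[1:]):
        transitions[current].append(following)

def generate_word():
    # Walk the chain from the start state, sampling each next letter in
    # proportion to how often it followed the current letter in training.
    letter, out = "^", []
    while True:
        letter = random.choice(transitions[letter])
        if letter == "$":
            return "".join(out)
        out.append(letter)

print([generate_word() for _ in range(5)])
```

Trained on an Italian word list instead, the same code would store different transition probabilities and so generate distinctly Italian-looking nonsense, which is the point of the comparison in Figure 2.1.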
So, what happens if we build a comparable Markov chain with image data1 instead of words, and then sample, pixel by pixel, from it to generate new images? The results are just noise, and the generated images reveal no hint of a wine bottle or circle to the human eye (Figure 2.2). The very simple generative statistical model we have chosen to use is incapable of capturing the distribution of the underlying images sufficiently to produce realistic new images.
Figure 2.2: Images generated with a simple statistical model appear as noise as the model is insufficient to capture the structure of the real data (Markov chains trained using wine bottles and circles from Google's QuickDraw dataset).
Other types of generative statistical models, like Naive Bayes or a higher-order Markov chain,2 could perhaps capture a bit more information about the training data, but they would still be insufficient for real-world applications like this.3 Image, video, and audio are complicated; it is hard to reduce them to their essence with basic statistical rules in the way we were able to with the ordering of letters in English and Italian. Capturing the intricate and often-inscrutable distributions that underlie real-world media, like full-sized photographs of people, is where deep (i.e. using neural networks) generative learning shines and where generative adversarial networks have revolutionized machine-generated content.
1. In many examples, I have used the Google QuickDraw Dataset to highlight features of generative machine learning. The dataset is freely available (https://github.com/googlecreativelab/quickdraw-dataset) and licensed under CC BY 4.0.
2. The order of a Markov chain reflects how many preceding states are taken into account. For example, a 2nd-order Markov chain would look at the preceding two letters to calculate the probability of a succeeding letter. Rudimentary autocomplete is a good example of Markov chains in application.
Generative Adversarial Networks
The problem of capturing the complexity of an image so that a computer can generate new images leads directly to the emergence of Generative Adversarial Networks, which are a neural-network-based model architecture within the broader sphere of generative machine learning. Although prior deep learning approaches to generating data, particularly variational autoencoders, already existed, it was a breakthrough in 2014 that changed the fabric and power of generative machine learning. Like every big development, it has an origin story that has moved into legend with its many retellings. According to the handed-down tale (Giles 2018), in 2014 doctoral student Ian Goodfellow was at a bar with friends when the topic of generating photos arose. His friends were working out a method to create realistic images by using complex statistical analyses of existing images. Goodfellow countered that it would not work; there were too many variables at play within such data. Instead, he put forth the idea of pairing two neural networks against each other in a type of zero-sum game where the goal was to generate believable fake images. According to the story, he developed this idea into working code that night, and his paired neural network architecture produced results the very first time. This was the birth of Generative Adversarial Networks, or GANs. Goodfellow's work was quickly disseminated in what is one of the most influential papers in the recent history of machine learning (Goodfellow et al. 2014).
GANs have progressed in almost miraculous ways since 2014, but the crux of their architecture remains the coupling of two neural networks. Each neural network has a specific function in the pairing. The first network, called the generator, is tasked with generating fake examples of some dataset.
To produce this data it randomly samples from an n-dimensional latent space, often labeled Z. In simple terms, the generator takes random noise (really a random list of n numbers, where n is the dimensionality of the latent space) as its input and outputs its attempt at a fake piece of data, such as an image, clip of audio, or row of tabular information. The second neural network, called the discriminator, takes both fake and real data as input. Its role is to correctly discriminate between fake and real examples.4 The generator and discriminator networks are then coupled together as adversaries, hence “adversarial” in the name. The output from the generator flows into the discriminator, and information on the success or failure of the discriminator to identify fakes (i.e. the discriminator's loss) flows back through the network so that the generator and discriminator each knows how well it is performing compared to the other. All of this happens automatically, without any need for human supervision. When the generator finds it is doing poorly, it learns to produce better examples by updating its weights and biases through traditional backpropagation (see especially Langr and Bok 2019, 3–16 for a more detailed summary of this). As backpropagation updates the generator network's weights and biases, the generator inherently begins to map regions of the randomly sampled Z space to characteristics found in the real dataset. Contrarily, as the discriminator finds that it is not identifying better fakes accurately, it learns to separate these out in new ways.
Figure 2.3: At the heart of a GAN are two neural networks, the generator and the discriminator. As the generator learns to produce fake data, the discriminator learns to separate it out. The pairing of the two in an adversarial structure forces each to improve at its given task.
Figure 2.4: A GAN being trained on wine bottle sketches from Google's QuickDraw dataset (https://github.com/googlecreativelab/quickdraw-dataset) shows the generator learning how to produce better sketches over time. Moving from left to right, the generator begins by outputting random noise and progressively generates better sketches as it tries to trick the discriminator.
At first, the generator outputs random data and the discriminator easily catches these fakes (Figure 2.4). As the results of the discriminator feed back into the generator, however, the generator learns to trick its foe by creating more convincing fakes. The discriminator consecutively learns to better separate out these more convincing fakes. Turn after turn, the two networks drive one another to become better at their specialized tasks, and the generated data becomes increasingly like the real data.5 At the end of training, ideally, it will not be possible to distinguish between real and fake (Figure 2.5).
In the original publication, the first GANs were trained on sets of small images, like the Toronto Face Dataset, which contains 32 × 32 pixel grayscale photos of faces and facial expressions (Goodfellow et al. 2014). Although the generator's results were convincing when compared to the originals, the fake images were still small, colorless, and pixelated.
3. This is not to imply that these models do not have immense practical applications in other areas of machine learning.
4. Its function is exactly that of any other binary classifier found in machine learning.
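The adversarial loop just described fits in a short sketch. The code below is a minimal illustration written with PyTorch; the tiny fully-connected networks, the 64-dimensional latent space, and the flattened 28 × 28 "images" are assumptions for the sake of example, not the architecture of any published GAN.

```python
# A minimal sketch of the adversarial training loop described above.
# Network sizes, latent dimensionality, and image shape are illustrative.
import torch
import torch.nn as nn

latent_dim, image_dim = 64, 28 * 28

generator = nn.Sequential(
    nn.Linear(latent_dim, 128), nn.ReLU(),
    nn.Linear(128, image_dim), nn.Tanh(),   # fake image scaled to [-1, 1]
)
discriminator = nn.Sequential(
    nn.Linear(image_dim, 128), nn.LeakyReLU(0.2),
    nn.Linear(128, 1),                      # real-vs-fake logit
)
opt_g = torch.optim.Adam(generator.parameters(), lr=2e-4)
opt_d = torch.optim.Adam(discriminator.parameters(), lr=2e-4)
loss_fn = nn.BCEWithLogitsLoss()

def training_step(real_images):             # real_images: (batch, 784)
    batch = real_images.size(0)
    real_label = torch.ones(batch, 1)
    fake_label = torch.zeros(batch, 1)

    # Discriminator turn: learn to call real data real and fakes fake.
    fakes = generator(torch.randn(batch, latent_dim)).detach()
    d_loss = loss_fn(discriminator(real_images), real_label) + \
             loss_fn(discriminator(fakes), fake_label)
    opt_d.zero_grad(); d_loss.backward(); opt_d.step()

    # Generator turn: the discriminator's verdict on fresh fakes flows
    # back through both networks, so the generator learns to fool it.
    fakes = generator(torch.randn(batch, latent_dim))
    g_loss = loss_fn(discriminator(fakes), real_label)
    opt_g.zero_grad(); g_loss.backward(); opt_g.step()
    return d_loss.item(), g_loss.item()
```

Calling training_step repeatedly on batches of real data reproduces the turn-by-turn dynamic in Figure 2.4: early fakes are noise, and each side improves in response to the other.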
Since then an explosion of research into GANs and increased computational power has led to strikingly realistic images. The most recent milestone was reached in 2019 by researchers with NVIDIA, who built a GAN that generates high-quality photo-realistic images of people (Karras, Laine, and Aila 2019). When contrasted with the results of 2014 (Figure 2.6), the stunning progression of GANs is self-evident, and it is difficult to believe that the person on the right does not exist.
5. See https://poloclub.github.io/ganlab/ (accessed Jan 17, 2020) (Kahng et al. 2019).
Figure 2.5: The fully trained generator from Figure 2.4 produces examples that are not readily distinguishable from real-world data. The top row of sketches were produced by the GAN and the bottom row were drawn by humans.
Figure 2.6: An image of a generated face from the original GAN publication (left) and the 2019 milestone (right) shows how the ability of GANs to produce photo-realistic images has evolved since 2014.
Some Applications of Generative Adversarial Networks
Over the past five years, many papers on implementations of GANs have been released by researchers (Alqahtani, Kavakli-Thorne, and Kumar 2019; Wang, She, and Ward 2019). The list of applications is extensive and ever growing, but it is worth pointing out some of the major examples as of 2019 and why they are significant. These examples highlight the vast power of GANs and underscore the importance of understanding and carefully scrutinizing this type of machine learning.
Data Augmentation
One major problem in machine learning has always been the lack of labeled datasets, which are required by supervised learning approaches. Labeling data is time consuming and expensive. Without good labeled data, trained models are limited in their power to learn and in their ability to generalize to real-world problems. Services, such as Amazon's Mechanical Turk, have attempted to crowdsource the tedious process of manually assigning labels to data, but labeling has remained a bottleneck in machine learning. GANs are helping to alleviate this bottleneck by generating new labeled data that is indistinguishable from the real data. This process can grow a small labeled dataset into one that is larger and more useful for training purposes. In the area of medical imaging and diagnostics this may have profound effects (Yi, Walia, and Babyn 2019). For example, GANs can produce photorealistic images of skin lesions that expert dermatologists are able to separate from real images only slightly over 50% of the time (Baur, Albarqouni, and Navab 2018), and they can synthesize high-resolution mammograms for training better cancer detection algorithms (Korkinof et al. 2018).
A corollary effect of these developments in medical imaging is the potential to publicly release large medical datasets and thereby expand researchers' access to important data. Whereas the dissemination of traditional medical images is constrained by strict health privacy laws, generated images may not be governed by such rules. I qualify this statement with “may”, because any restrictions or ethical guidelines for the use of medical data that is generated from real patient data require extensive discussion and legal reviews that have not yet happened.
Under certain conditions, it may also be possible to infer original data from a GAN (Mukherjee et al. 2019). How institutional review boards, professional medical organizations, and courts weigh in on this topic will be seen in the coming years.
In addition to generating entirely new data, a GAN can augment datasets by expanding their coverage to new domains. For example, autonomous vehicles must cope with an array of road and weather conditions that are unpredictable. Training a model to identify pedestrians, street signs, road lines, and so on with images taken on a sunny day will not translate well to variable real-world conditions. Using one dataset, in a process known as style transfer, GANs can translate one image to other domains (Figure 2.7). This can include creating night road scenes from day scenes (Romera et al. 2019) and producing images of street signs under varying lighting conditions (Chowdhury et al. 2019). This added data permits models to account for greater variability under operating conditions without the high cost of photographing all possible conditions and manually labeling them. Beyond medicine and autonomous vehicles, generative data augmentation will progressively impact other imaging-heavy fields (Shorten and Khoshgoftaar 2019) like remote sensing (L. Ma et al. 2019; D. Ma, Tang, and Zhao 2019).
Figure 2.7: The images on the left are originals and the images on the right have been modified by a GAN with the ability to translate images between the domains of “dirty lens” and “clean lens” on a vehicle (from Uřičář et al. 2019, fig. 11).
Creativity and Design
The question of whether machines can possess creativity or artistic ability is philosophically difficult to answer (Mazzone and Elgammal 2019; McCormack, Gifford, and Hutchings 2019). Still, in 2018, Christie's auctioned off its first piece of GAN art for $432,500 (Cohn 2018), and GANs are increasingly assisting humans in the creative process for all forms of media. Simple models, like CycleGAN, are already able to stylize images in the manner of Van Gogh or Monet (Zhu et al. 2017), and more varied stylistic GANs are emerging.
GauGAN, a beta tool released by NVIDIA, is a great example of GAN-assisted creativity in action. GauGAN allows you to rough out a scene using a paint brush for different categories, like clouds, flowers, and houses (Figure 2.8). It then converts this into a photo reflecting what you have drawn. The online demo6 remains limited, but the underlying model is powerful and has massive potential (Park et al. 2019).
6. See http://nvidia-research-mingyuliu.com/gaugan/ (last accessed January 12, 2019).
Figure 2.8: This example of GauGAN in action shows a sketched-out scene on the left turned into a photo-realistic landscape on the right. *If any representatives of Christie's are reading, the author would be happy to auction this piece.
Recently, Martin Scorsese's The Irishman made headlines for its digital de-aging of Robert De Niro and other actors. Although this process did not involve GANs, it is highly likely that in the future GANs will become a major part of cinematic post-production (Giardina 2019) through assistive tools like GauGAN. Fashion and product design are also being impacted by the use of GANs. Text-to-image synthesis, which can take free text or categories as input to generate a photo-realistic image, has promising potential (Rostamzadeh et al. 2018).
By accepting text as input, GANs can let designers rapidly generate new ideas or visualize concepts for products at the start of the design process. For example, a recently published GAN for clothing design accepts basic text and outputs modeled images of the described clothing (Banerjee et al. 2019; Figure 2.9). In an example of automotive design, a single sketch can be used to generate realistic photos of multiple perspectives of a vehicle (Radhakrishnan et al. 2018). The many fields that rely on quick sketching or visual prototyping, such as architecture or web design, are likely to be influenced by the use of GAN-assisted design software in coming years.
Figure 2.9: Text-to-image synthesis can generate images of new fashions based on a description. From the input “maroon round neck mini print a-line bodycon short sleeves” a GAN has produced these three photos (from Banerjee et al. 2019, fig. 11).
In a similar vein, GANs have an upcoming role in the creation of new medicines, chemicals, and materials (Zhavoronkov 2018). By training a GAN on existing chemical and material structures, research is showing that novel chemicals and materials can be designed with particular properties (Gómez-Bombarelli et al. 2018; Sanchez-Lengeling and Aspuru-Guzik 2018). This is facilitated by how information is encoded in the GAN's latent space (the n-dimensional space from which the generator samples; see “Z” in Figure 2.3). As the generator learns to produce realistic examples, certain aspects of the original data become encoded in regions of the latent space. By moving through this latent space or sampling particular areas, new data with desired properties can then be generated. This can be seen by periodically sampling the latent space and generating an image as one moves between two generated images (Figure 2.10). In the same way, by moving in certain directions or sampling from particular areas of the latent space, new chemicals or medicines with specific properties can be generated.7 A minimal code sketch of this kind of interpolation follows the figure.
7. This is also relevant to the facial manipulation discussed below.
Figure 2.10: Two examples of linearly-spaced mappings across the latent space between generated images A and B. Note that by taking one image and moving closer to another, you can alter properties in the image, such as adding steam, removing a cup handle, or changing the angle of view. These characteristics of the dataset are learned by the generator during training and encoded in the latent space. (GAN built on coffee cup sketches from Google's QuickDraw dataset)
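The walk across the latent space shown in Figure 2.10 amounts to linear blending between two latent vectors. The sketch below reuses the generator and latent_dim variables from the earlier training-loop sketch; the eight evenly spaced steps are an arbitrary, illustrative choice.

```python
# A sketch of the latent-space walk shown in Figure 2.10, reusing the
# `generator` and `latent_dim` from the training-loop sketch above.
import torch

z_a = torch.randn(1, latent_dim)       # latent point behind image A
z_b = torch.randn(1, latent_dim)       # latent point behind image B

frames = []
with torch.no_grad():                  # inference only, no gradients needed
    for t in torch.linspace(0.0, 1.0, steps=8):
        z = (1 - t) * z_a + t * z_b    # slide from A's region of Z toward B's
        frames.append(generator(z))    # generated image morphs gradually
```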
Impersonation and the Invisible
I have reserved some of the more dystopian, and likely more well-heard-of, applications of GANs for last. This is the area where GANs' ability to generate convincing media is challenging our perceptions of reality and raising extreme ethical questions (Harper 2018). Deep fakes are, of course, the most well known of these. This can include the creation of fake images, videos, and audio of an individual, or the modification of any media to alter what someone appears to be doing or saying. In images and video in particular, GANs make it possible to swap the identity of an individual and manipulate facial attributes or expressions (Tolosana et al. 2020). A large portion of technical literature is, in fact, now devoted to detecting faked and altered media (see Tolosana et al. 2020, Table IV and V). It remains to be seen how successful any approaches will be. From a theoretical perspective, anything that can detect fakes can also be used to train a better generator, since the training process of a GAN is founded on outsmarting a detector (i.e. the discriminator network).
One shocking extension of deep fakes that has emerged is transcript-to-video creation, which generates a video of someone speaking from a written text. If you want to see this at work, you can view clips of Nixon giving the speech written in the case of an Apollo 11 disaster.8 As of now, deep fakes like this remain choppy and are largely limited to politicians and celebrities because they require large datasets and additional manipulation, but this limitation is not likely to last. If the evolution of GANs for images is any predictor, the entire emerging field of video generation is likely to progress rapidly. One can imagine the incorporation of text-to-image and deep fakes enabling someone to produce an image of, say, “politician X doing action Y,” simply by typing it.
8. See http://news.mit.edu/2019/mit-apollo-deepfake-art-installation-aims-to-empower-more-discerning-public-1125.
An application of GANs that parallels deep fakes, and is likely more menacing in the short term, is the infilling or adding of hidden, invisible, or predicted information to existing media. One nascent use is video prediction from an image. For example, in 2017, researchers were able to build a GAN that produced 1-second video clips from a single starting frame (Vondrick and Torralba 2017). This may not seem impressive, but video is notoriously difficult to work with because the content of a succeeding frame can vary so drastically from the preceding frame (for other examples of on-going research into video prediction, see Cai et al. 2018; Wen et al. 2019). For still images, occluded object reconstruction, in which a GAN is trained to produce a full image of a person or object that is partially hidden behind something else, is progressing (Fulgeri et al. 2019; see Figure 2.11). For some applications, like autonomous driving, this could save lives, as it would help to pick out when a partially-occluded pedestrian is about to emerge from behind a parked car. On the other hand, for surveillance technology, it can further undermine anonymity. Indeed, such GANs are already being explicitly studied for surveillance purposes (Fabbri, Calderara, and Cucchiara 2017). Lastly, I would be remiss if I did not mention that researchers have designed a GAN that can generate an image of what you are thinking about, using EEG signals (Tirupattur et al. 2018).
Figure 2.11: GANs are providing a method to reconstruct hidden images of people and objects. Images 1–3 show reconstructions as compared to an input occluded image (OCC) and a ground truth image (GT) (from Fulgeri et al. 2019, fig. 6).
GANs and the Future
The tension between the creation of more realistic generated data and the technology to detect maliciously generated information is only beginning.
The machine learning and data science platform Kaggle is replete with publicly-accessible Python code for building GANs and detecting fake data. Money, too, is freely flowing in this domain of research; the 2019 Deepfake Detection Challenge sponsored by Facebook, AWS, and Microsoft boasted one million dollars in prizes (https://www.kaggle.com/c/deepfake-detection-challenge, accessed April 20, 2020). Meanwhile, industry leaders, such as NVIDIA, continue to fund the training of better and more convincing GANs. The structure of a GAN, with its generator and detector paired adversarially, is now being mirrored in society as groups of researchers competitively work to create and discern generated data. The path that this machine-learning arms race will take is unpredictable, and, therefore, it is all the more important to scrutinize it and make it comprehensible to the broader publics whom it will affect.
References
Alqahtani, Hamed, Manolya Kavakli-Thorne, and Gulshan Kumar. 2019. “Applications of Generative Adversarial Networks (GANs): An Updated Review.” Archives of Computational Methods in Engineering, December. https://doi.org/10.1007/s11831-019-09388-y.
Banerjee, Rajdeep H., Anoop Rajagopal, Nilpa Jha, Arun Patro, and Aruna Rajan. 2019. “Let AI Clothe You: Diversified Fashion Generation.” In Computer Vision—ACCV 2018 Workshops, edited by Gustavo Carneiro and Shaodi You, 75–87. Cham: Springer International Publishing.
Baur, Christoph, Shadi Albarqouni, and Nassir Navab. 2018. “Generating Highly Realistic Images of Skin Lesions with GANs.” September. https://arxiv.org/abs/1809.01410.
Burkov, Andriy. 2019. The Hundred-Page Machine Learning Book. Self-published, Amazon.
Cai, Haoye, Chunyan Bai, Yu-Wing Tai, and Chi-Keung Tang. 2018. “Deep Video Generation, Prediction and Completion of Human Action Sequences.” In Computer Vision—ECCV 2018, edited by Vittorio Ferrari, Martial Hebert, Cristian Sminchisescu, and Yair Weiss, 374–90. Lecture Notes in Computer Science. Cham: Springer International Publishing. https://doi.org/10.1007/978-3-030-01216-8_23.
Chowdhury, Sohini Roy et al. 2019. “Automated Augmentation with Reinforcement Learning and GANs for Robust Identification of Traffic Signs Using Front Camera Images.” In 53rd Asilomar Conference on Signals, Systems & Computers, 79–83. N.p.: IEEE. https://doi.org/10.1109/IEEECONF44664.2019.9049005.
Cohen, Libby. 2020. “Reddit Bans Deepfakes with ‘Malicious’ Intent.” The Daily Dot, January 10, 2020. https://www.dailydot.com/layer8/reddit-deepfakes-ban/.
Cohn, Gabe. 2018. “AI Art at Christie’s Sells for $432,500.” The New York Times, October 25, 2018. https://www.nytimes.com/2018/10/25/arts/design/ai-art-sold-christies.html.
Fabbri, Matteo, Simone Calderara, and Rita Cucchiara. 2017. “Generative Adversarial Models for People Attribute Recognition in Surveillance.” In 14th IEEE International Conference on Advanced Video and Signal Based Surveillance (AVSS). N.p.: IEEE. https://doi.org/10.1109/AVSS.2017.8078521.
Fulgeri, Federico, Matteo Fabbri, Stefano Alletto, Simone Calderara, and Rita Cucchiara. 2019. “Can Adversarial Networks Hallucinate Occluded People With a Plausible Aspect?” Computer Vision and Image Understanding 182 (May): 71–80.
Giardina, Carolyn. 2019. “Will Smith, Robert De Niro and the Rise of the All-Digital Actor.” The Hollywood Reporter, August 10, 2019. https://www.hollywoodreporter.com/behind-screen/rise-all-digital-actor-1229783.
Giles, Martin. 2018. “The GANfather: The Man Who’s Given Machines the Gift of Imagination.” MIT Technology Review 121, no. 2 (March/April): 48–53.
Gómez-Bombarelli, Rafael, Jennifer N. Wei, David Duvenaud, José Miguel Hernández-Lobato, Benjamín Sánchez-Lengeling, Dennis Sheberla, Jorge Aguilera-Iparraguirre, Timothy D. Hirzel, Ryan P. Adams, and Alán Aspuru-Guzik. 2018. “Automatic Chemical Design Using a Data-Driven Continuous Representation of Molecules.” ACS Central Science 4, no. 2 (February): 268–76. https://doi.org/10.1021/acscentsci.7b00572.
Goodfellow, Ian, Jean Pouget-Abadie, Mehdi Mirza, Bing Xu, David Warde-Farley, Sherjil Ozair, Aaron Courville, and Yoshua Bengio. 2014. “Generative Adversarial Nets.” In Advances in Neural Information Processing Systems, edited by Z. Ghahramani, M. Welling, C. Cortes, N. Lawrence, and K. Q. Weinberger, 27:2672–2680. Curran Associates, Inc. https://proceedings.neurips.cc/paper/2014/file/5ca3e9b122f61f8f06494c97b1afccf3-Paper.pdf.
Harper, Charlie. 2018. “Machine Learning and the Library or: How I Learned to Stop Worrying and Love My Robot Overlords.” Code4Lib Journal, no. 41 (August). https://journal.code4lib.org/articles/13671.
Kahng, Minsuk, Nikhil Thorat, Duen Horng Polo Chau, Fernanda B. Viegas, and Martin Wattenberg. 2019. “GAN Lab: Understanding Complex Deep Generative Models Using Interactive Visual Experimentation.” IEEE Transactions on Visualization and Computer Graphics 25, no. 1 (January): 310–320. https://doi.org/10.1109/tvcg.2018.2864500.
Karras, Tero, Samuli Laine, and Timo Aila. 2019. “A Style-Based Generator Architecture for Generative Adversarial Networks.” In 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 4396–4405. N.p.: IEEE. https://doi.org/10.1109/CVPR.2019.00453.
Korkinof, Dimitrios, Tobias Rijken, Michael O’Neill, Joseph Yearsley, Hugh Harvey, and Ben Glocker. 2018. “High-Resolution Mammogram Synthesis Using Progressive Generative Adversarial Networks.” Preprint, submitted July 9, 2018. https://arxiv.org/abs/1807.03401.
Langr, Jakub, and Vladimir Bok. 2019. GANs in Action: Deep Learning with Generative Adversarial Networks. Shelter Island, NY: Manning Publications.
Ma, Dongao, Ping Tang, and Lijun Zhao. 2019. “SiftingGAN: Generating and Sifting Labeled Samples to Improve the Remote Sensing Image Scene Classification Baseline In Vitro.” IEEE Geoscience and Remote Sensing Letters 16, no. 7 (July): 1046–1050. https://doi.org/10.1109/lgrs.2018.2890413.
Ma, Lei, Yu Liu, Xueliang Zhang, Yuanxin Ye, Gaofei Yin, and Brian Alan Johnson. 2019. “Deep Learning in Remote Sensing Applications: A Meta-Analysis and Review.” ISPRS Journal of Photogrammetry and Remote Sensing 152 (June): 166–77. https://doi.org/10.1016/j.isprsjprs.2019.04.015.
Mazzone, Marian, and Ahmed Elgammal. 2019. “Art, Creativity, and the Potential of Artificial Intelligence.” Arts 8, no. 1 (March): 1–9. https://doi.org/10.3390/arts8010026.
McCormack, Jon, Toby Gifford, and Patrick Hutchings. 2019. “Autonomy, Authenticity, Authorship and Intention in Computer Generated Art.” In Computational Intelligence in Music, Sound, Art and Design, edited by Anikó Ekárt, Antonios Liapis, and María Luz Castro Pena, 35–50. Cham: Springer International Publishing.
Mukherjee, Sumit, Yixi Xu, Anusua Trivedi, and Juan Lavista Ferres. 2019. “Protecting GANs against Privacy Attacks by Preventing Overfitting.” Preprint, submitted December 31, 2019. https://arxiv.org/abs/2001.00071v1.
Murphy, Kevin P. 2012. Machine Learning: A Probabilistic Perspective. Adaptive Computation and Machine Learning Series. Cambridge, Mass: MIT Press.
Park, Taesung, Ming-Yu Liu, Ting-Chun Wang, and Jun-Yan Zhu. 2019. “Semantic Image Synthesis with Spatially-Adaptive Normalization.” In 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2332–2341. N.p.: IEEE. https://doi.org/10.1109/CVPR.2019.00244.
Radhakrishnan, Sreedhar, Varun Bharadwaj, Varun Manjunath, and Ramamoorthy Srinath. 2018. “Creative Intelligence – Automating Car Design Studio with Generative Adversarial Networks (GAN).” In Machine Learning and Knowledge Extraction, edited by Andreas Holzinger, Peter Kieseberg, A Min Tjoa, and Edgar Weippl, 160–75. Cham: Springer International Publishing.
Romera, Eduardo, Luis M. Bergasa, Kailun Yang, Jose M. Alvarez, and Rafael Barea. 2019. “Bridging the Day and Night Domain Gap for Semantic Segmentation.” In 2019 IEEE Intelligent Vehicles Symposium (IV), 1312–18. N.p.: IEEE. https://doi.org/10.1109/IVS.2019.8813888.
Romm, Tony, Drew Harwell, and Isaac Stanley-Becker. 2020. “Facebook Bans Deepfakes, but New Policy May Not Cover Controversial Pelosi Video.” The Washington Post, January 7, 2020. https://www.washingtonpost.com/technology/2020/01/06/facebook-ban-deepfakes-sources-say-new-policy-may-not-cover-controversial-pelosi-video/.
Rostamzadeh, Negar, Seyedarian Hosseini, Thomas Boquet, Wojciech Stokowiec, Ying Zhang, Christian Jauvin, and Chris Pal. 2018. “Fashion-Gen: The Generative Fashion Dataset and Challenge.” Preprint, submitted June 21, 2018. https://arxiv.org/abs/1806.08317.
Sanchez-Lengeling, Benjamin, and Alán Aspuru-Guzik. 2018. “Inverse Molecular Design Using Machine Learning: Generative Models for Matter Engineering.” Science 361, no. 6400 (July): 360–365. https://doi.org/10.1126/science.aat2663.
Shorten, Connor, and Taghi M. Khoshgoftaar. 2019. “A Survey on Image Data Augmentation for Deep Learning.” Journal of Big Data 6 (60): 1–48. https://doi.org/10.1186/s40537-019-0197-0.
Tirupattur, Praveen, Yogesh Singh Rawat, Concetto Spampinato, and Mubarak Shah. 2018. “Thoughtviz: Visualizing Human Thoughts Using Generative Adversarial Network.” In Proceedings of the 26th ACM International Conference on Multimedia, 950–958. New York: Association for Computing Machinery. https://doi.org/10.1145/3240508.3240641.
Tolosana, Ruben, Ruben Vera-Rodriguez, Julian Fierrez, Aythami Morales, and Javier Ortega-Garcia. 2020. “DeepFakes and Beyond: A Survey of Face Manipulation and Fake Detection.” Preprint, submitted January 1, 2020. https://arxiv.org/abs/2001.00179.
Uřičář, Michal, Pavel Křížek, David Hurych, Ibrahim Sobh, Senthil Yogamani, and Patrick Denny. 2019. “Yes, We GAN: Applying Adversarial Techniques for Autonomous Driving.” In IS&T International Symposium on Electronic Imaging, 1–16. Springfield, VA: Society for Imaging Science and Technology. https://doi.org/10.2352/ISSN.2470-1173.2019.15.AVM-048.
Vondrick, Carl, and Antonio Torralba. 2017. “Generating the Future with Adversarial Transformers.” In 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2992–3000. N.p.: IEEE. https://doi.org/10.1109/CVPR.2017.319.
Wang, Zhengwei, Qi She, and Tomas E. Ward. 2019. “Generative Adversarial Networks: A Survey and Taxonomy.” Preprint, submitted June 4, 2019. https://arxiv.org/abs/1906.01529.
Wen, Shiping, Weiwei Liu, Yin Yang, Tingwen Huang, and Zhigang Zeng. 2019. “Generating Realistic Videos From Keyframes With Concatenated GANs.” IEEE Transactions on Circuits and Systems for Video Technology 29 (8): 2337–48. https://doi.org/10.1109/TCSVT.2018.2867934.
Yi, Xin, Ekta Walia, and Paul Babyn. 2019. “Generative Adversarial Network in Medical Imaging: A Review.” Medical Image Analysis 58 (December): 1–20. https://doi.org/10.1016/j.media.2019.101552.
Zhavoronkov, Alex. 2018. “Artificial Intelligence for Drug Discovery, Biomarker Development, and Generation of Novel Chemistry.” Molecular Pharmaceutics 15, no. 10 (October): 4311–13. https://doi.org/10.1021/acs.molpharmaceut.8b00930.
Zhu, Jun-Yan, Taesung Park, Phillip Isola, and Alexei A. Efros. 2017. “Unpaired Image-to-Image Translation Using Cycle-Consistent Adversarial Networks.” In 2017 IEEE International Conference on Computer Vision (ICCV), 2242–2251. N.p.: IEEE. https://doi.org/10.1109/ICCV.2017.244.
hintze-artificial-2021 ---- Chapter 1 Artificial Intelligence in the Humanities: Wolf in Disguise, or Digital Revolution?
Arend Hintze, Dalarna University; Jorden Schossau, Michigan State University
Introduction
Artificial Intelligence, with its ability to machine learn coupled to an almost human-like understanding, sounds like the ideal tool for the humanities. Instead of using primitive quantitative methods to count words or catalogue books, current advancements promise to reveal insights that otherwise could only be obtained by years of dedicated scholarship. But are these technologies imbued with intuition or understanding, and do they learn like humans? Are they capable of developing their own perspective, and can they aid in qualitative research?
In the 80s and 90s, as home computers were becoming more common, Hollywood was sensationalizing the idea of smart or human-like Artificial Intelligent machines (AI) through movies such as Terminator, Blade Runner, Short Circuit, and Bicentennial Man. At the same time, the home experience of personal computing highlighted the difference between Hollywood intelligent machines and the reality of how “dumb” machines really were. Home, or even industry, machines could not answer simple natural language questions of anything but the simplest complexity. Instead, users or programmers needed to painstakingly implement an algorithm to address their question. Then, the user was required to wait for the machine to slavishly follow each instruction that was programmed while hoping that whoever entered the instructions did not make a mistake. Despite the Hollywood intelligent machines sensation, people understood that computers did not and could not think like humans, but that they do excel at performing repetitive tasks with extreme speed and fidelity. This shaped the expectations for interacting with computers. Computers became efficient tools that required specific instruction in order to achieve a desired outcome.
Computational technology and user experience drastically changed over the next 20 years.
Technology became much more intuitive to use while it also became much more powerful at handling large data sets. For instance, Google can return search results for websites as a response to even the silliest or sparsest request, with a decent chance that the results are relevant to the question asked. Did you read a manual before you used your smartphone, or did you, like everyone else, just “figure it out”? Or, as a consequence of modern-day media and its on-demand services, children ask to skip a song playing through radio broadcast. The older technologies quickly feel archaic.
These technological advancements go hand in hand with the developments in the field of machine learning and artificial intelligence. The automotive industry is on the cusp of fully self-driving cars. Electronic assistants are not only keeping track of our dates and responding to spoken language, they will also soon start making our appointments by speaking to other humans on our behalf. Databases are getting new voice-controlled intuitive interfaces, changing a typical incomprehensible “SELECT AVG(salary) FROM employeeList WHERE yearHired > 2012;” to a spoken “Average salary of our employees hired after 2012?”
Another phenomenon is the trend in many disciplines to go from “qualitative” to “quantitative” research, or to think about the “system” rather than the “components.” The field that probably experienced this trend first was biology. While obviously descriptive about species of organisms, biologists also always wanted to understand the mechanisms that drive life on Earth, spanning micro to macro scales. Consequently, a lot is known about the individual chemical components that constitute our metabolism, the components that drive cell division and DNA replication, and which genes are involved in, for example, developmental processes. However, in many cases, our scientific knowledge only covers single functions of single components. In the context of the cell, the state of the organism and how other components interact matter a lot. Cancer, for example, cannot be explained by a single mutation on a single gene but involves many complex interactions (Hanahan and Weinberg 2011). Ecosystems don't collapse because a single insect dies, but because indirect changes in the food chain interact in complex ways (for a review of the different theories, see Tilman 1996). As a result, systems biology emerged. Systems biologists use large data sets and are often dependent on computer models to understand phenomena on the systems level.
The field of bioinformatics is one example of an entire field that emerged as a result of using computers to study entire systems that were otherwise humanly intractable. The Human Genome Project to sequence the complete human genome finished in 2003, a time when consumer data storage was limited by the amount of data that fit on a DVD (4.7 GB). While the human genome itself fits on a DVD (roughly 3.2 billion bases stored at one byte per base is about 3.2 GB), the data that came from the sequencing machines was much larger. Short repetitive sequences first needed assembly, which at that time was a high-performance computing task.
Other fields have since undergone their own computational revolutions, and now the humanities begin their computational revolution. Computers have been a part of core library infrastructure and experience for some time now, by cataloging entries in a database and allowing intuitive user exploration of that database. However, the digital humanities go beyond this (Fitzpatrick 2012).
The ability to analyze (crawl) extremely large corpora of different sources, monitor the internet using the Internet of Things as large sensor arrays, and detect patterns by using sophisticated algorithms can each produce a treasure trove of quantitative data about things that, until this point, could only be described or analyzed qualitatively. Additionally, artificial intelligence promises models of the human mind (Yampolskiy and Fox 2012). Machine learning allows us to learn from these data sets in ways that exceed human capabilities, while an artificial brain would eventually allow us to objectively describe a subjective experience (through quantifying neural activations or positively and negatively associated memories). This would ultimately close the gap between quantitative and qualitative approaches by allowing an inspection of experience.

However, this bridging between quantitative and qualitative methods creates a possible tension for the humanities, which historically defines itself by qualitative methodologies. When qualitative experiences or responses can be finely quantified, such as the sadness caused by reading a particular passage or the curiosity caused by viewing certain works of art, then the field will undergo a revolution. When this happens, we will be able to quantify and discuss how sadness was learned by reading, or how much surprise was generated by viewing an artwork.

This is exactly the point where the metaphors break down. Current computational models of the mind are not sophisticated enough to allow these kinds of inferences. Machine learning algorithms work well for what they do but have nothing to do with what a person would call learning. Artificial intelligence is a broad, encompassing field. It includes methods that might have appeared to be magic only a couple of years ago (such as generative adversarial networks). Algorithmic finesse resulting from these advances is capable of beating humans in chess (Campbell, Hoane Jr, and Hsu 2002), but it is only a very specialized algorithm that has nothing to do with the way humans play or learn chess. This means we are back to the problem we had in the 80s. Instead of being disappointed by the difference between modern technology and Hollywood technology, we are disappointed by the difference between modern technology and the experience implied by the labels given to those technologies. Applying misnomer terminology, such as "smart," "intelligent," "search," and "learning," to modern technologies that have little to do with those terms is misleading. It is possible that such technology was deliberately branded with these terms for improved marketing and sales, effectively redefining them and obscuring their original meaning. Consequently, we are again disappointed by the mismatch between our expectations of our computing infrastructure and the reality of our experiences.

The following paragraphs will explore current machine learning and artificial intelligence technologies, explain how quantitative or qualitative they really are, and explore the possible implications for the future of the digital humanities.

Learning: Phenomenon versus Mechanism

Learning is an electrochemical process that involves cells, their genetic makeup, and how they are interconnected.
Some interplay between external stimuli and receptor proteins in specialized sensory neurons leads to electrochemical signals propagating over a network of interconnected cells, which themselves respond with physical and genetic changes to said stimuli, probably also dependent on previous stimuli (Kandel, Schwartz, and Jessell 2000). This concoction of elaborate terms might suggest that we know in principle which parts are involved and where they are, but we are far from an understanding of the learning mechanism. The description above is as generic as saying that a city functions because cars drive on streets. Even though we might know a lot about long-term potentiation, or about the mechanism of neurons that fire together wiring together (aka Hebbian learning), neither of these processes actually mechanistically explains how learning works. Neuroscience, neurophysiology, and cognitive science have not been able to discover this complete process in such a way that we can replicate it, though some inroads are being made (El-Boustani et al. 2018). Similarly, we find promising new interdisciplinary efforts like "cognitive computational neuroscience" that try to bridge the gap between neuroscience, cognitive science, and computation (Kriegeskorte and Douglas 2018). So, unfortunately, while the components involved can be identified, the question of "how learning works" cannot be answered mechanistically.

However, a lot is known about the phenomenon of learning. It happens during the lifetime of an organism. What happens between the lifetimes of related organisms is an adaptive process called evolution: inheritance, variation, and natural selection over many generations (up to 3.5 billion years here on Earth) enabled populations of organisms to succeed in their environments in any way they could. Evolutionary forces found ways for organisms to adapt to their environment during their own lifetimes. While this can take many forms, such as storing energy, seeking shelter, or having a fight-or-flight response, it has led to the phenomenon we now call learning. Instead of discussing the diversity of learning in the animal kingdom, we will discuss the richest example: human learning. Here, learning is defined as the cognitive adaptation to external stimulus.

The phenomenon of learning can be observed as an increase in performance over time. Learning makes the organism better at doing something. In humans, because we have language and a much higher degree of abstract thinking, an improvement in performance can be facilitated very quickly. While it takes time to learn how to juggle, the ability to find the mean of a series of samples can be quickly communicated by reading Wikipedia. Both types of lifetime adaptations are called learning. However, these lifetime adaptations are facilitated by two different cognitive processes: explicit or implicit learning.[1]

[1] There are more than these two mechanisms, but these are the two major ones.

Explicit learning, or episodic memory, is fact-based memory. What you did yesterday, what happened in your childhood, or the list of things you should buy when you go shopping are all memories of this kind. Currently, the engram theory best explains this mechanism (Poo et al. 2016 elaborates on the origins of the term).
Explicit memory can be retrieved relatively easily and then used to inform future decisions: "Press the green button if the capital of Italy is Paris, otherwise press the red." The rate of learning for explicit memory can be much higher than for implicit memory, and it can also be communicated more quickly. Abstract communication, such as "I saw a wolf," allows us to transfer the experience of seeing a wolf quickly to other individuals, even though the explicit memory it evokes in them might not be identical to ours.

Learning by using implicit memory, sometimes called procedural memory, is facilitated by much slower processes (Schacter, Chiu, and Ochsner 1993). It is generally based on the idea that learning is a combination of expectation, observation or action, and internal model changes. For example, a recovering hospital patient who has suffered a stroke is handed an apple. In this exchange, the patient forms an expectation of where his hand will be to accept the apple. He engages his muscles to move his forearm and hand to accept the apple, which is his action. Then the patient observes that his arm did not arrive at the correct position (due to neurological damage). This discrepancy between expectation and action-outcome drives internal changes so that the patient's brain learns how to adequately control the arm. Presumably, everything considered a skill is based on this process. While very flexible, this form of memory is not easily communicated, nor fast to acquire. For instance, while juggling can be described, it cannot be communicated in such a way that the recipient can juggle without additional training.

This description of explicit and implicit learning is an amalgamation of many different hypotheses and observations, and these processes are not as well segregated in practice as outlined here. What is important is what these two learning mechanisms are based on: observations lead to memory, and internal predictions together with exploration lead to improved models about the world. Lastly, these learning processes only exist in organisms because they previously conferred an evolutionary advantage: organisms that could memorize and then act on those memories had more offspring than those that did not. This interaction of learning and evolution is called the Baldwin effect (Weber and Depew 2003). Organisms that could explore the environment, make predictions about it, and use observations to optimize their internal models were similarly more capable than organisms that could not.

Machines do not Learn; They are Trained

Now prepared with a proper intuition about learning, we can turn our attention to machine learning. After all, our intuitions should be meaningful in the computational domain as well, because learning always follows the same pattern. One might be disappointed when looking over the table of contents of a machine learning book to find only methods for creating static transformation functions (see Russell and Norvig 2016, one of the putative foundations of machine learning and AI). There will typically be a distinction between supervised and unsupervised learning, between categorical and continuous data, and maybe a section about other "smart" algorithms. You will not find a discussion of implicit and explicit memory, let alone methods for implementing these concepts.
So, if these important sections in our imaginary machine learning book do not discuss the mechanisms of learning, then what are they discussing?

Unsupervised learning describes algorithms that report information based on associations within the data. Clustering algorithms are a popular example of unsupervised learning. These use similarity between data points to form, and report on, distinct groups of data. Clustering is a very important method, but it is only a well-designed algorithm, not an adaptive one.

Supervised learning describes algorithms that refine a transformation function to convert a certain input into a certain output. The idea is to balance specificity and generality in the refinement, such that the transformation function correctly transforms all known examples but generalizes enough to work well on new variations. For example, we would like the machine to transform image data into textual labels, such as "house" or "car." The input is an image and the output is a label. The input image data are provided to the machine, and small adjustments to the machine's function are made depending on how well it provided the correct output. Many iterations later, this will ideally result in a machine that can transform all image data to correct labels, and even operate correctly on new variations of images not provided before. Supervised learning is extremely powerful and has yet to be fully explored. However, supervised learning is quite dissimilar to actual learning. A common argument is that supervised learning uses feedback in a "student-teacher" paradigm of making changes with feedback until proper behavior is achieved, so it could be considered learning. But this feedback is external, objective, and not at all similar to our prediction-and-comparison model, which, for instance, operates without an all-knowing oracle whispering "good" or "bad" into our ears. Humans and other organisms instead compare predictions with outcomes, and the choices are driven by an intersection of desire and prediction.

What seems astonishing is the diverse and specialized capabilities that these two rather simple types of computation, clustering and classification, can produce. Their economic impact is enormous, and we are still finding new ways to combine neural networks and exploit deep learning techniques to create amazing data transformations, such as deep fake videos. But so far, each astounding example of AI, through machine learning or some other method, is not showcasing all these capabilities as one machine; instead, each is an independently achieved computational marvel. Each of these examples does only exactly what it was trained to do in a narrow domain and no more. Siri, or any other voice assistant for that matter, does not drive a car (López, Quesada, and Guerrero 2017), Watson does not play chess (Ferrucci et al. 2013), and Google's AlphaGo cannot understand spoken language (Gibney 2016). Even hybrid approaches, such as combining speech recognition, chess playing, and autonomous driving, would only be a combination of specialty strategies, not an entity trained from the ground up. Modern machine learning gives us an amazing collection of very applicable, but extremely specialized, computational tools that may be customized to particular data sets, but the resulting machines do not learn autonomously as you or I do.
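To make the two workhorse computations named above concrete, here is a minimal sketch using scikit-learn; the iris data set and the particular model choices are illustrative assumptions, not anything the chapter prescribes.

```python
from sklearn.datasets import load_iris
from sklearn.cluster import KMeans
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

X, y = load_iris(return_X_y=True)

# Unsupervised: group samples purely by similarity; no labels are used.
clusters = KMeans(n_clusters=3, n_init=10, random_state=0).fit_predict(X)

# Supervised: adjust a transformation function against known labels.
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)
clf = LogisticRegression(max_iter=1000).fit(X_train, y_train)

print("cluster assignments:", clusters[:10])
print("held-out accuracy:", clf.score(X_test, y_test))
```

Note that the clustering step never sees `y`, while the classifier is corrected against known labels at every step; that correction is exactly the external, oracle-like feedback the chapter contrasts with human learning.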
There are cutting-edge technologies, such as so-called neuromorphic chips (Nawrocki, Voyles, and Shaheen 2016) and other computational brain models that more closely mimic brain function, but they are not what has been sensationalized in the media as machine learning or AI, and they have yet to showcase competence on difficult problems competitive with standard supervised learning.

Curiously, many people in the machine learning community defend the term "learning," arguing there is no difference between learning and training. In traditional machine learning, the trained algorithm is deployed as a service, after which it no longer improves. If the data set ever changes, then a new training set including correct labels needs to be generated and a new training phase initiated. However, if the teacher can be forever bundled with the learner and training continued during the deployment phase, even on new, never-before-seen data, then indeed the delineation between learning and training is far less clear. Approaches to such lifelong learning exist, but they struggle with what is called catastrophic forgetting: the phenomenon that only the most recent experiences are learned, at the expense of older ones (French 1999). Such continued training during deployment is also the objective of Continuous Delivery for machine learning. Unfortunately, creating a new training set is typically the most expensive endeavor in standard supervised machine learning development. Adequate training then becomes difficult or impossible without involving thousands or millions of human inputs to keep up with training and using the online machine on an ever-evolving data set. Some have tried to use such "human-in-the-loop" methods, but the resulting machine then becomes only a slight extension of the humans who are forever caught in the loop. Is it an intelligent machine, or a human trapped in a machine?

To combat this problem of generating the training set, researchers altered the standard supervised learning paradigm of flexible learner and rigid teacher to make the teacher likewise flexible, generating new data that continually probes the bounds of the student machine. This is the method of Generative Adversarial Networks, or GANs (Goodfellow et al. 2014). The teacher generates training examples, and the student discerns between those generated examples and the original labeled training data. After many iterations, the teacher is improved to better fool the student, and the student is improved to better discern generated training data. As amazing as they are, GANs only partially mitigate the problematic requirement for human-labeled training data, because GANs can only mimic a known labeled distribution. If that distribution ever changes, then new labeled data must be generated, and again we have the same problem as before. Unfortunately, GANs have been sensationalized as magic, and the public and hobbyist expectation is that GANs are a way toward much better artificial intelligence. Disappointment is inevitable, because GANs only allow us to explore what it would be like to have more training data from the same data sets we were using before.
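The teacher/student loop just described can be sketched in a few dozen lines of PyTorch. This toy version, with an assumed one-dimensional "real" data distribution and arbitrary layer sizes, is meant only to show the alternating objectives, not any production GAN.

```python
import torch
import torch.nn as nn

# Generator ("teacher") maps noise to samples; discriminator ("student")
# scores how likely a sample is to come from the real data.
G = nn.Sequential(nn.Linear(8, 16), nn.ReLU(), nn.Linear(16, 1))
D = nn.Sequential(nn.Linear(1, 16), nn.ReLU(), nn.Linear(16, 1), nn.Sigmoid())
opt_g = torch.optim.Adam(G.parameters(), lr=1e-3)
opt_d = torch.optim.Adam(D.parameters(), lr=1e-3)
bce = nn.BCELoss()

for step in range(2000):
    real = torch.randn(64, 1)        # stand-in "real" data: N(0, 1) samples
    fake = G(torch.randn(64, 8))     # generated samples from random noise

    # Student step: label real samples 1 and generated samples 0.
    loss_d = bce(D(real), torch.ones(64, 1)) + \
             bce(D(fake.detach()), torch.zeros(64, 1))
    opt_d.zero_grad()
    loss_d.backward()
    opt_d.step()

    # Teacher step: adjust G so the student calls its fakes real.
    loss_g = bce(D(fake), torch.ones(64, 1))
    opt_g.zero_grad()
    loss_g.backward()
    opt_g.step()

# If training succeeded, generated samples mimic the real distribution:
print(fake.mean().item(), fake.std().item())  # should approach 0 and 1
```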
These expectations are important for machine learning and AI. We are very familiar with learning, to the point where our whole identity as humans could be generously defined as the result of being monkeys with an exceptional proclivity for learning. If we now approach AI and machine learning with the expectation that these technologies learn as we do, or are an equally general-purpose intelligence, then we will be bitterly disappointed. The best example of such a discrepancy is how easily neural networks trained by deep learning can be fooled. Images that seem identical to a human, differing only by a few pixels, are grossly misclassified, a mistake no human would make (Nguyen, Yosinski, and Clune 2015). Fortunately, we know about these biases and the possible shortcomings of these methods. As long as we have the right expectations, we can take their flaws into account and still enjoy the prospects they provide.

Trained Machines: Tool or Provocation?

On one side we have the natural sciences, characterized by hypothesis-driven experimentation that reduces reality to an abstract model of causal interactions. This approach can inform us about the consequences of our possible actions, but only as far into the future as the model can adequately predict. With machine learning and AI, we can move this temporal horizon of prediction farther into the future. While weather models might still struggle to predict precipitation 7 days in advance, global climate models predict in detail the effects of global warming in 100 years. But these models are nihilistic, void of values, and cannot themselves answer the question of whether humans would prefer to live in one possible future or another. Is sunshine better than rain? The humanities, on the other hand, are home to exactly these problems. What are our values? How do we understand what is essential? Now that we know the facts, how should we choose? Do we speak for everyone? The questions seem to be endless, but they are what makes our human experience so special, and what separates the humanities from the sciences.

Labels such as learning or intelligence are too easily anthropomorphized. A technology branded in this way suggests human-like properties: intelligence, common sense, or even subjective opinion. From a name like "deep learning" we expect a system that develops a deep and intuitive understanding, with insights more profound than our own. However, these systems do not provide an alternative perspective; as explained above, they are only as good, or as biased, as the scientist selecting their training data. Just because humans and machine learning systems are both black boxes, in the sense that their inner workings are opaque, does not mean they share other qualities. For instance, having labeled the ML training process "learning" does not imply that ML algorithms are curious and learn from observations. While these new computerized quantitative measures might be welcomed by some scholars, others will view them as an existential threat to the very nature of the humanities. Are these quantitative methods sneaking into the humanities disguised by anthropomorphic terms, like a wolf shrouded in a sheep's fleece? From this viewpoint, having the wrong expectations is not only provoking a disappointment, but flooding the humanities with sophisticated technologies that dilute and muddy the nature of qualitative research that makes the humanities special.

However, this imminent clash between quantitative and qualitative research also provides a unique opportunity. Suppose there is a question that can only be answered subjectively and qualitatively. If so, it would define a hard boundary against the aforementioned reductionism of the purely causal quantitative approach.
At the same time, such a boundary presents the perfect target for an artificially intelligent system to prove its utility. If a computational human analog can be created, then it must be capable of performing the same tasks as a humanities researcher. In other words, it must be able to answer subjective and qualitative questions, regardless of its computational and quantitative construction. Failing at such a task would be equivalent to failing the famous Turing test, thereby proving the AI is not yet human-like enough. In this way, the qualitative nature of the humanities poses a challenge, and maybe a threat, to artificially intelligent systems. While some might say the threat is mutual, past successes of interdisciplinary research suggest otherwise: the digital humanities could become the forefront of AI research.

Beyond machine training, towards general purpose intelligence

Currently, machines do not learn but must be trained, typically with human-labeled data. ML algorithms are not smart as we are, but they can solve specific tasks in sophisticated ways. Perhaps sentience will be a product of enough time and training data, but the path to sentience probably requires more than time and data. The process that gave rise to human intelligence was evolution. This opportunistic process optimized brains over endless generations to perform ever-changing tasks, and it is the only known example of a process that resulted in such complex intelligence. None of the computational methods described earlier even remotely follow this paradigm: researchers designed ad hoc algorithms that solved well-defined problems. The next iteration of these methods is either an incremental improvement of existing code, a new methodological invention, or an application to a new data set. These improvements do not compound to make AI tools better generalists; instead, they contribute to the diversity of the existing tools.

One approach that does not suffer from these shortcomings is neuroevolution (Floreano, Dürr, and Mattiussi 2008). The field of neuroevolution is still in its infancy, but finding new and creative solutions to otherwise unsolved problems, such as controlling robots or driving cars, is a popular area of focus (Lehman et al. 2020). At the same time, memory formation (Marstaller, Hintze, and Adami 2013), information integration in the brain (Tononi 2004), and how systems evolve the ability to learn (Sheneman, Schossau, and Hintze 2019) are also being researched, as they are building blocks of general purpose intelligence. While it is not clear how thinking machines will ultimately emerge, they are on the horizon. The dualism of a quantitative system that can be subjective and understand the qualitative nature of existence makes it a strange artifact that cannot be ignored.

References

Campbell, Murray, A. Joseph Hoane Jr, and Feng-hsiung Hsu. 2002. "Deep Blue." Artificial Intelligence 134 (1–2): 57–83.
El-Boustani, Sami, Jacque P. K. Ip, Vincent Breton-Provencher, Graham W. Knott, Hiroyuki Okuno, Haruhiko Bito, and Mriganka Sur. 2018. "Locally Coordinated Synaptic Plasticity of Visual Cortex Neurons in Vivo." Science 360 (6395): 1349–54.
Ferrucci, David, Anthony Levas, Sugato Bagchi, David Gondek, and Erik T. Mueller. 2013. "Watson: Beyond Jeopardy!" Artificial Intelligence 199: 93–105.
Fitzpatrick, Kathleen. 2012. "The Humanities, Done Digitally." In Debates in the Digital Humanities, edited by Matthew K. Gold, 12–15.
Minneapolis: University of Minnesota Press.
Floreano, Dario, Peter Dürr, and Claudio Mattiussi. 2008. "Neuroevolution: From Architectures to Learning." Evolutionary Intelligence 1 (1): 47–62.
French, Robert M. 1999. "Catastrophic Forgetting in Connectionist Networks." Trends in Cognitive Sciences 3 (4): 128–35.
Gibney, Elizabeth. 2016. "Google AI Algorithm Masters Ancient Game of Go." Nature News 529 (7587): 445.
Goodfellow, Ian, Jean Pouget-Abadie, Mehdi Mirza, Bing Xu, David Warde-Farley, Sherjil Ozair, Aaron Courville, and Yoshua Bengio. 2014. "Generative Adversarial Nets." In Advances in Neural Information Processing Systems 27 (NIPS 2014), edited by Z. Ghahramani, M. Welling, C. Cortes, N. D. Lawrence, and K. Q. Weinberger, 2672–80. N.p.: Neural Information Processing Systems Foundation.
Hanahan, Douglas, and Robert A. Weinberg. 2011. "Hallmarks of Cancer: The Next Generation." Cell 144 (5): 646–74.
Kandel, Eric R., James H. Schwartz, and Thomas M. Jessell. 2000. Principles of Neural Science. 4th ed. New York: McGraw-Hill.
Kriegeskorte, Nikolaus, and Pamela K. Douglas. 2018. "Cognitive Computational Neuroscience." Nature Neuroscience 21: 1148–60.
Lehman, Joel, et al. 2020. "The Surprising Creativity of Digital Evolution: A Collection of Anecdotes from the Evolutionary Computation and Artificial Life Research Communities." Artificial Life 26 (2): 274–306.
López, Gustavo, Luis Quesada, and Luis A. Guerrero. 2017. "Alexa vs. Siri vs. Cortana vs. Google Assistant: A Comparison of Speech-Based Natural User Interfaces." In International Conference on Applied Human Factors and Ergonomics, edited by Isabel L. Nunes, 241–50. Cham: Springer.
Marstaller, Lars, Arend Hintze, and Christoph Adami. 2013. "The Evolution of Representation in Simple Cognitive Networks." Neural Computation 25 (8): 2079–2107.
Nawrocki, Robert A., Richard M. Voyles, and Sean E. Shaheen. 2016. "A Mini Review of Neuromorphic Architectures and Implementations." IEEE Transactions on Electron Devices 63 (10): 3819–29.
Nguyen, Anh, Jason Yosinski, and Jeff Clune. 2015. "Deep Neural Networks Are Easily Fooled: High Confidence Predictions for Unrecognizable Images." In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 427–36. N.p.: IEEE.
Poo, Mu-ming, et al. 2016. "What Is Memory? The Present State of the Engram." BMC Biology 14: 1–18.
Russell, Stuart J., and Peter Norvig. 2016. Artificial Intelligence: A Modern Approach. Malaysia: Pearson Education Limited.
Schacter, Daniel L., C.-Y. Peter Chiu, and Kevin N. Ochsner. 1993. "Implicit Memory: A Selective Review." Annual Review of Neuroscience 16 (1): 159–82.
Sheneman, Leigh, Jory Schossau, and Arend Hintze. 2019. "The Evolution of Neuroplasticity and the Effect on Integrated Information." Entropy 21 (5): 1–15.
Tilman, David. 1996. "Biodiversity: Population versus Ecosystem Stability." Ecology 77 (2): 350–63.
Tononi, Giulio. 2004. "An Information Integration Theory of Consciousness." BMC Neuroscience 5: 1–22.
Weber, Bruce H., and David J. Depew. 2003. Evolution and Learning: The Baldwin Effect Reconsidered. Cambridge, MA: MIT Press.
Yampolskiy, Roman V., and Joshua Fox. 2012. "Artificial General Intelligence and the Human Mental Model." In Singularity Hypotheses: A Scientific and Philosophical Assessment, edited by Ammon H. Eden, James H. Moor, Johnny H. Søraker, and Erik Steinhart, 129–45. Heidelberg: Springer.
janco-machine-2021 ---- Chapter 4
Machine Learning in Digital Scholarship
Andrew Janco, Haverford College

Introduction

We are entering an exciting time, when research on machine learning and innovation no longer requires background knowledge in programming, mathematics, or data science. Tools like RunwayML, the Teachable Machine, and Google AutoML allow researchers to train project-specific classification and object detection models. Other tools, such as Prodigy or INCEpTION, provide the means to train custom named entity recognition and named entity linking models. Yet without a clear way to communicate the value and potential of these solutions to humanities scholars, those scholars are unlikely to incorporate them into their research practices.

Since 2014, dramatic innovations in machine learning have occurred, providing new capabilities in computer vision, natural language processing, and other areas of applied artificial intelligence. Scholars in the humanities, however, are often skeptical. They are eager to realize the potential of these new methods in their research and scholarship, but they do not yet have the means to do so. They need to make connections between machine capabilities, research in the sciences, and tangible outcomes for humanities scholarship, but very often, drawing these connections is more a matter of chance than deliberate action. Is it possible to make such connections deliberately and identify how machine learning methods can benefit a scholar's research?

This article outlines a method for connecting the technical possibilities of machine learning with the intellectual goals of academic researchers in the humanities. It argues for a reframing of the problem. Rather than appropriating innovations from computer science and artificial intelligence, this approach starts from humanities-based methods and practices. This shift allows us to work from the needs of humanities scholars, in terms that are familiar and have recognized value to their peers. Machines can augment scholars' tasks with greater scale, precision, and reproducibility than are possible for a single scholar alone. However, only relatively basic and repetitive tasks can presently be delegated to machines.

This article argues that John Unsworth's concept of "scholarly primitives" is an effective tool for identifying basic tasks that can be completed by computers in ways that advance humanities research (2000). As Unsworth writes, primitives are "basic functions common to scholarly activity across disciplines, over time, and independent of theoretical orientation." They are the building blocks of research and analysis. As the roots and foundations of our work, "primitives" provide an effective starting point for the augmentation of scholarly tasks. Here it is important to note that the end goal is not the automation of scholarship, but rather the delegation of appropriate tasks to machines. As François Chollet recently noted,

Our field isn't quite "artificial intelligence" — it's "cognitive automation": the encoding and operationalization of human-generated abstractions / behaviors / skills. The "intelligence" label is a category error. (2020)

This view shifts our focus from the potential intelligence of machines towards their ability to complete useful tasks for human ends. Specifically, they can augment scholars' work by performing repetitive tasks at scale with superhuman speed and precision.
I proceed from this understanding to argue for an experimental and interpretive approach to machine learning that highlights the value of the interaction between the scholar and the machine rather than what machines can produce.

***

Unsworth's notion of the "scholarly primitive" takes its meaning from programming, where it refers to the most basic operations and data types of a programming language. Primitives form the building blocks for all other components and operations of the language. This borrowing of terminology also suggests that primitives are not universal. A sequence of characters called a string is a primitive in Python, but not in Java or C. The architecture of a language's primitives changes over time and evolves with community needs. The Python and C communities, for example, have embraced Unicode as a standard to allow strings in every human language (including emojis). Other communities continue to use a range of character encodings, which grants greater flexibility to the individual programmer and avoids the notion that there should be a common standard. For scholarship, the term offers a metaphor and point of departure. It poses a question: what are the most basic elements of scholarly research and analysis? Unsworth offers several initial examples of primitives to illustrate their value, without claiming that they are comprehensive: discovering, annotating, comparing, referring, sampling, illustrating, and representing. These terms offer a "list of functions (recursive functions) that could be the basis for a manageable but also useful tool-building enterprise in humanities computing." Primitives can thus guide us in the creation of computational tools for scholarship.

For example, with the primitive of comparison, a scholar might study different editions of a text, searching for similarities and differences that often lead to new insights or highlight ideas that would otherwise be taken for granted. As a tool, comparison can (but does not always) reveal new information. For an assignment in graduate school, I compared a historical calendar that showed the days of the week against entries in Stalin's appointment book. The simple juxtaposition revealed that none of Stalin's appointments were on a Sunday. This example raises questions for further investigation and interpretation. If Stalin was an atheist who worked at all times of the day and night, why wouldn't he schedule meetings on Sundays? Perhaps it was a legacy of Stalin's youth spent in seminary? Is there a similar pattern in other periods of Stalin's life? The craft of humanities research relies on many such simple initial queries. It should be noted that these little experiments are just the beginning of a research project. Nonetheless, the utility of comparison is clear. If anything, it seems so basic as to go unnoticed. This particular comparison offered an insight and new knowledge that led to further research questions.

Such beginnings are often a matter of luck. However, machine learning offers an opportunity to increase the dimensionality of comparisons. The similarities and differences between two editions of a text can easily be quantified using Levenshtein distance.[1] However, that will only capture differences at the level of characters on a page.

[1] Named after the Soviet mathematician Vladimir Levenshtein, Levenshtein distance uses the number of changes that would be needed to make two objects identical as a measure of their similarity.
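The character-level measure described in the footnote is simple enough to sketch in a few lines of pure Python; the sample strings are invented for illustration.

```python
def levenshtein(a: str, b: str) -> int:
    """Minimum number of single-character insertions, deletions, or
    substitutions needed to turn string a into string b."""
    prev = list(range(len(b) + 1))  # distances from "" to prefixes of b
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            curr.append(min(
                prev[j] + 1,               # delete ca
                curr[j - 1] + 1,           # insert cb
                prev[j - 1] + (ca != cb),  # substitute ca -> cb
            ))
        prev = curr
    return prev[-1]

# Two "editions" differing by one inserted character:
print(levenshtein("the first edition", "the firste edition"))  # 1
```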
With machine learning, however, we can train embeddings that account for semantics, authors, time periods, genders, and other features of a text and its contents simultaneously. We can quantify similarity in new ways that facilitate new forms of comparison. This approach builds on the original meaning and purpose of comparison as a "scholarly primitive," but opens additional directions for research and opportunities for insight. Rather than relying on happenstance or intuition to find productive comparisons, we can systematically search and compare research materials.

The second "scholarly primitive" that lends itself well to augmentation is annotation. This activity takes different forms across disciplines. A literary scholar might underline notable sections of a text or write a note in the margins. A historian transcribes information from an archival source into a notebook. At their core, these actions add observations and associations to the original materials. These steps in the research process are the first, most basic ones that connect information in a source to a larger set of research materials. We add context and meaning to materials, and so make them part of a larger collection.

When working with texts or images, machine learning models are presently capable of making simple annotations and associations. For example, named entity recognition (NER) models are able to recognize person names, place names, and other key words in text. Each label is an annotation that makes a claim about the content of the text: "Steamboat Springs" or "New York City" is linked to an entity called PLACE. Once again, we are speaking about the most basic first steps that scholars perform during research. I know that Steamboat Springs is a place. It's where I grew up. However, another scholar, one less versed in small mountain towns in Colorado, might not recognize the town name. They might identify it as a spring or a ski resort, or perhaps a volcanic field in Nevada. The idea of "scholarly primitives" forces us to confront the importance of domain knowledge and the role that it plays in the interpretation of materials. To teach a machine to find entities, we must first explain everything in very specific terms. We can train the machine to use surrounding contextual information in order to predict, correctly, whether "Steamboat Springs" refers to the town, the spring, or the ski resort.

As part of a project with Philip Gleissner, I trained a model that correctly identifies Soviet journal names in diary entries. For instance, the machine uses contextual clues to identify when the term Volga refers to the journal by that name and not to the river or the automobile. Where is the mention of "October" a journal name and not a month, a factory name, or the revolution? The trained model makes it possible to identify references to journals in a corpus of over 400,000 diary entries, which in turn makes it possible to research the diaries with a focus on reader reception. Normally, this would be a laborious and time-consuming task. Each time the machine predicts an entity in the text, it adds annotations. What was simply text is now marked as an entity. As part of this project, we had to define the relevant entities, create training data, and train the model to accomplish a specific task.
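A compressed sketch of that workflow, using spaCy (the library behind Prodigy), follows. The JOURNAL label, the single English training sentence, and its character offsets are invented stand-ins for the project's actual Russian-language training data.

```python
import spacy
from spacy.training import Example

nlp = spacy.blank("en")          # spaCy v3 API
ner = nlp.add_pipe("ner")
ner.add_label("JOURNAL")         # our project-specific entity type

# One toy annotation: characters 27-34 of the text form a JOURNAL entity.
TRAIN_DATA = [
    ("She submitted the story to October last spring.",
     {"entities": [(27, 34, "JOURNAL")]}),
]

optimizer = nlp.initialize()
for epoch in range(30):          # a real model needs far more varied examples
    for text, annotations in TRAIN_DATA:
        example = Example.from_dict(nlp.make_doc(text), annotations)
        nlp.update([example], sgd=optimizer)

doc = nlp("He mailed the manuscript to October in 1956.")
print([(ent.text, ent.label_) for ent in doc.ents])
```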
This process has tangible value for scholarship because it forces us to break down complicated research processes into their most basic tasks.

As noted before, annotation can be an act of association and linking. Natural language processing is capable of not only recognizing entities in a text, but also associating that text with a record in a knowledge base. This capability is called named entity linking. Using embeddings, a statistical language model can predict not only that "Steamboat Springs" is a town, but that it is a specific town with the record Q984721 in DBpedia. This association opens a wealth of contextual information about the place, including its population, latitude and longitude, and elevation. A scholar might have ample knowledge and experience reading literature, specifically Milton. A machine does not, but it has access to contextual information that enriches analysis and permits associations. The result is a reading of a literary work that accounts for contextual knowledge. To be sure, named entity linking is not a replacement for domain knowledge. However, it is able to augment a scholar's contextual knowledge of materials and make that information available for study during research. At this point, we are asking the machine not only to sort or filter data, but to reason actively about its contents.

Machine learning offers the potential to automate humanities annotation tasks at scale. This is true of basic tasks, such as recognizing that a given text is a letter. It is also true of object recognition tasks, such as identifying a state seal in a letterhead or other visual attributes. A Haverford College student was doing research on documents in a digital archive that we are building with the Grupo de Apoyo Mutuo (GAM), comprising more than three thousand case investigations of disappeared persons during the Guatemalan Civil War. They noticed that many of the documents were signed with a thumbprint. The student and I trained an image classification model to identify those documents, thus providing the capability to search the entire collection for this visual attribute. The thumbprints provided a proxy for literacy and allowed the student to study the collection in new ways. Similarly, documents containing the state seal of Guatemala are typically letters from the government in reply to GAM's requests for information about disappeared persons.

At present, several excellent tools exist to facilitate machine annotation of images and texts. Google's Teachable Machine offers an intuitive web application that humanities faculty and students can use to train classification models for images, sounds, and poses. To take the example above, the user would upload images of correspondence. They would then upload images of documents that are not letters.[2] Once training begins, a base model is loaded and trained on the new categories. Because the model already has existing training on image categories, it is able to learn the new category with only a few examples. This process is called transfer learning. For more advanced tasks, Google offers AutoML Vision and Natural Language, which are able to process large collections of text or images and to deploy trained models using Google cloud infrastructure. Similar products are available from Amazon, IBM, and other companies.

[2] In the Google Cloud Terms of Service there is specific assurance that your data will not be shared or used for any purpose other than the training of the model. More expert analysis may find concerns, and caution is always warranted. At present, there seems to be no more risk in using cloud services for ML tasks than there is for using cloud services more generally. See https://cloud.google.com/terms/.
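To give a rough sense of what such tools do under the hood, the sketch below freezes a pretrained vision backbone and trains only a small new classification head on the scholar's two categories. It uses torchvision with random tensors standing in for real letter images; none of this is the Teachable Machine's actual implementation.

```python
import torch
import torchvision

# Transfer learning: reuse an ImageNet-trained backbone, train a new head.
model = torchvision.models.resnet18(weights="IMAGENET1K_V1")  # torchvision >= 0.13
for param in model.parameters():
    param.requires_grad = False                    # freeze pretrained layers
model.fc = torch.nn.Linear(model.fc.in_features, 2)  # new letter/not-letter head

optimizer = torch.optim.Adam(model.fc.parameters(), lr=1e-3)
loss_fn = torch.nn.CrossEntropyLoss()

# Stand-in batch: 8 RGB images (3x224x224) with labels letter=1 / other=0.
images = torch.randn(8, 3, 224, 224)
labels = torch.randint(0, 2, (8,))

loss = loss_fn(model(images), labels)
optimizer.zero_grad()
loss.backward()                                    # gradients reach only model.fc
optimizer.step()
```

Because the backbone's features were already learned on millions of images, only the tiny head needs data, which is why a few examples per category can suffice.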
Runway ML offers a locally installed program with more advanced capabilities than the Teachable Machine. It works with a wide range of machine learning models and is an excellent way for scholars to explore their capabilities without having to write code.[3] The accessibility of tools like Runway allows for low-stakes experimentation and exploration. It is also a particularly good way for scholars to explore new methods and discover new materials.

[3] Teachable Machine, https://teachablemachine.withgoogle.com/; Google AutoML, https://cloud.google.com/automl/; RunwayML, https://runwayml.com/.

For Unsworth, discovery is largely the process of identifying new resources. We can find new sources in a library catalog, on the shelf, or in a conversation. These activities require a human in the loop because it is the person's incomplete knowledge of a source that makes it a "discovery" when found. Given that machines reason about the content of text and images in ways that are quite unlike those of humans, machine learning opens new possibilities for discovery. When it comes to the differences between our own habits of mind and the computational processes of artificial networks, we may speak of "neurodiversity." Scholars can benefit from these differences, since the strengths of machine thinking complement our needs.

Machine learning models offer a variety of ways to identify similarity and difference within research materials. Yale's PixPlot, for example, uses a convolutional network to train image embeddings, which are then plotted relative to one another in two-dimensional space with t-distributed stochastic neighbor embedding (t-SNE) (Duhaime n.d.).[4] PixPlot creates a striking visualization of hundreds or thousands of images, organized and clustered by their relative visual similarity. As a research tool, PixPlot and similar projects offer a quick means to identify statistically relevant similarities and clusters. This visualization reveals what patterns are most evident to the machine and provides a discovery tool for associations that might not be evident to a human researcher. Ben Schmidt has applied a comparable process to "machine read" and visualize fourteen million texts in the HathiTrust (n.d., 2018).[5] Using the relative co-occurrence of words in a book, Schmidt is able to train book embeddings. Schmidt's vectors provide an original way to organize and label texts based purely on the machine's "reading" of a book. These machine-generated labels and clusters can be compared against human-generated metadata.

[4] See also https://artsexperiments.withgoogle.com/tsnemap/.
[5] At time of writing, Schmidt's digital monograph Creating Data (n.d.) is a work in progress, with most sections empty until the official publication.

The value of this work lies in the human investigation of what machine models find significant in a collection of research materials. For example, with topic modeling, a scholar must interpret what a particular algorithm has identified as a statistically significant topic by reading a cryptic chain of words. The topic "menu, platter, coffee, ashtray" is likely related to a diner. In these efforts, Scattertext offers an effective tool to visualize which terms are most distinctive of a text category. In a given corpus, I can identify which words are most exemplary of poetry and which are most exemplary of prose. Scattertext creates a striking and useful visualization, or it can be used in the terminal to process large collections of text.
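A minimal sketch of that poetry-versus-prose workflow in Scattertext follows; the four tiny documents, the genre labels, and the output file name are invented for illustration, and a real study would load thousands of texts.

```python
import pandas as pd
import spacy
import scattertext as st

nlp = spacy.load("en_core_web_sm")  # requires: python -m spacy download en_core_web_sm

# Tiny stand-in corpus with one category column and one text column.
df = pd.DataFrame({
    "genre": ["poetry", "poetry", "prose", "prose"],
    "text": [
        "O wild west wind, thou breath of autumn's being",
        "Shall I compare thee to a summer's day?",
        "The committee met on Tuesday to review the budget.",
        "She walked to the station and bought a ticket.",
    ],
})

corpus = st.CorpusFromPandas(df, category_col="genre",
                             text_col="text", nlp=nlp).build()

# Interactive HTML scatterplot of terms most distinctive of each category.
html = st.produce_scattertext_explorer(
    corpus,
    category="poetry",
    category_name="Poetry",
    not_category_name="Prose",
)
open("poetry_vs_prose.html", "w", encoding="utf-8").write(html)
```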
Conclusion

As a conceptual tool, "scholarly primitives" has considerable promise to connect the intellectual goals of academic researchers in the humanities with the technical possibilities of machine learning. Rather than focusing on the capabilities of machine learning methods and the priorities of machine learning researchers, this method offers a means to build from the existing research practices of humanities scholars. It allows us to identify what kinds of tasks would benefit from being augmented. Using "primitives" shifts the focus away from large abstract goals, such as research findings and interpretive methods, to the micro-methods and actions of humanities research. By augmenting these activities, we are able to benefit from the scale and precision afforded by computational methods, as well as the valuable interplay between scholars and machines as humanities research practices are made explicit and reproducible.

References

Chollet, François. 2020. "Our Field Isn't Quite 'Artificial Intelligence' — It's 'Cognitive Automation': The Encoding and Operationalization of Human-Generated Abstractions / Behaviors / Skills. The 'Intelligence' Label Is a Category Error." Twitter, January 6, 2020, 10:45 p.m. https://twitter.com/fchollet/status/1214392496375025664.
Duhaime, Douglas. n.d. "PixPlot." Yale DHLab. Accessed July 12, 2020. https://dhlab.yale.edu/projects/pixplot/.
Schmidt, Benjamin. n.d. "A Guided Tour of the Digital Library." In Creating Data: The Invention of Information in the American State, 1850-1950. http://creatingdata.us/datasets/hathi-features/.
Schmidt, Benjamin. 2018. "Stable Random Projection: Lightweight, General-Purpose Dimensionality Reduction for Digitized Libraries." Journal of Cultural Analytics, October. https://doi.org/10.22148/16.025.
Unsworth, John. 2000. "Scholarly Primitives: What Methods Do Humanities Researchers Have in Common, and How Might Our Tools Reflect This?" Paper presented at the Symposium on Humanities Computing: Formal Methods, Experimental Practice, King's College, London, May 2000. http://www.people.virginia.edu/~jmu2m/Kings.5-00/primitives.html.

jiang-cross-2021 ---- Chapter 6
Cross-Disciplinary ML Research is like Happy Marriages: Five Strengths and Two Examples
Meng Jiang, University of Notre Dame

Top Strengths in ML+X Collaboration

Cross-disciplinary research refers to research and creative practices that involve two or more academic disciplines (Jeffrey 2003; Karniouchina, Victorino, and Verma 2006).
These activities may range from those that simply place disciplinary insights side by side to much more integrative or transformative approaches (Aagaard-Hansen 2007; Muratovski 2011). Cross-disciplinary research matters because (1) it provides an understanding of complex problems that require a multifaceted approach to solve; (2) it combines disciplinary breadth with the ability to collaborate and synthesize varying expertise; (3) it enables researchers to reach a wider audience and communicate diverse viewpoints; (4) it encourages researchers to confront questions that traditional disciplines do not ask, while opening up new areas of research; and (5) it promotes disciplinary self-awareness about methods and creative practices (Urquhart et al. 2011; O'Rourke, Crowley, and Gonnerman 2016; Miller and Leffert 2018).

One of the most popular cross-disciplinary research topics is Machine Learning + X (or Data Science + X). Machine learning (ML) is a method of data analysis that automates analytical model building. It is a branch of artificial intelligence based on the idea that systems can learn from data, identify patterns, and make decisions with minimal human intervention. ML has been used in a variety of applications (Murthy 1998), such as email filtering and computer vision; however, most applications still fall within the domain of computer science and engineering. Recently, the power of ML+X, where X can be any other discipline (such as physics, chemistry, biology, sociology, or psychology), has become well recognized. ML tools can reveal profound insights hiding in ballooning datasets (Kohavi et al. 1994; Pedregosa et al. 2011; Kotsiantis 2012; Mullainathan and Spiess 2017).

However, cross-disciplinary research, of which ML+X is a part, is challenging. Collaborating with investigators outside one's own field requires more than just adding a co-author to a paper or proposal. True collaborations will not always be without conflict; lack of information leads to misunderstandings. For example, ML experts may have little domain knowledge in the field of X, and researchers in X might not understand ML either. This knowledge gap limits the progress of collaborative research. So how can we start and manage successful cross-disciplinary research? What can we do to facilitate collaborative behaviors? In this essay, I will compare cross-disciplinary ML research to "happy marriages," discussing some characteristics they share. Specifically, I will present the top strengths of conducting cross-disciplinary ML research and give two examples based on my experience of collaborating with historians and psychologists.

Marriage is one of the most common "collaborative" behaviors. Couples expect to have happy marriages, just as collaborators expect to have successful project outcomes (Robinson and Blanton 1993; Pettigrew 2000; Xu et al. 2007). Extensive studies have revealed the top strengths of happy marriages (DeFrain and Asay 2007; Gordon and Baucom 2009; Prepare/Enrich, n.d.), and these can be reflected in cross-disciplinary ML research. Here I focus on five of them:

1. Collaborators ("partners" in the language of marriage) are satisfied with communication.
2. Collaborators feel very close to each other.
3. Collaborators discuss their problems well.
4. Collaborators handle their differences creatively.
5. There is a good balance of time alone (i.e., individual research work) and together (meetings, discussions, etc.).
First of all, communication is the exchange of information to achieve a better understanding, and collaboration is the process of working together with another person to achieve an end goal. Effective collaboration is about sharing information, knowledge, and resources to work together through satisfactory communication. Ineffective or absent communication is one of the biggest challenges in ML+X collaboration.

Second, researchers in different disciplines meet different challenges through the process of collaboration. Making the challenges clear to understand and finding solutions together is the core of effective collaboration.

Third, researchers in different disciplines can collaborate only when they recognize mutual interest and feel that the research topics they have studied in depth are very close to each other. Collaborators must be interested in solving the same big problem.

Fourth, collaborators must embrace their differences in concepts and methods and take advantage of them. For example, one researcher can introduce a complementary method to the mix of methods that the collaborator has been using for a long time, or one can offer a new, impactful dataset and evaluation method to test the techniques proposed by the other.

Fifth, in strong collaboration there is a balance between separateness and togetherness. Meetings are an excellent use of time for integrating perspectives and holding productive discourse around difficult decisions. However, excessive collaboration happens when researchers are depleted by too many meetings and emails, which can lead to inefficient, unproductive meetings. So it is important to find a balance.

Next, I, as a computer scientist and ML expert, will discuss two ML+X collaborative projects. ML experts bring mathematical modeling and computational methods for mining knowledge from data. The solutions usually have good generalizability; however, they still need to be tailored for specialized domains or disciplines.

Example 1: ML + History

The history professor Liang Cai and I have collaborated on an international research project titled "Digital Empires: Structured Biographical and Social Network Analysis of Early Chinese Empires." Dr. Cai is well known for her contributions to the fields of early Chinese empires, classical Chinese thought (in particular, Confucianism and Daoism), digital humanities, and the material culture and archaeological texts of early China (Cai 2014). Our collaboration explores how the digital humanities expand the horizon of historical research and help visualize the research landscape of Chinese history. Historical research is often constrained by sources and the human cognitive capacity for processing them. ML techniques may enhance historians' abilities to organize and access sources as they like. ML techniques can even create new kinds of sources at scale for historians to interpret. "The historians pose the research questions and visualize the project," said Cai. "The computer scientists can help provide new tools to process primary sources and expand the research horizon."

We conducted a structured biographical analysis that leverages developments in machine learning techniques, such as neural sequence labeling and textual pattern mining, which allow classical sources of Chinese empires to be represented in an encoded way. The project aims to build a digital biographical database that sorts out different attributes of all recorded historical actors in available sources.
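To illustrate the textual-pattern side of such a pipeline, the sketch below applies one hand-written pattern, in the spirit of the templates shown in Table 6.1 below, to English glosses. The pattern, the names, and the `extract_taught_by` helper are hypothetical stand-ins: the actual project mines its patterns from classical Chinese text with learned models.

```python
import re

# Hypothetical template of the kind listed in Table 6.1:
# "$PER_X was taught by $PER_Y" -> fact tuple (X, taught_by, Y)
PATTERN = re.compile(
    r"(?P<x>[A-Z][a-z]+(?: [A-Z][a-z]+)*) was taught by "
    r"(?P<y>[A-Z][a-z]+(?: [A-Z][a-z]+)*)"
)

def extract_taught_by(sentences):
    """Return (student, 'taught_by', teacher) fact tuples from glossed text."""
    tuples = []
    for sentence in sentences:
        for match in PATTERN.finditer(sentence):
            tuples.append((match.group("x"), "taught_by", match.group("y")))
    return tuples

glosses = [
    "Scholar Wang was taught by Master Hu in the classics.",
    "The granary was rebuilt after the flood.",  # no relation; yields nothing
]
print(extract_taught_by(glosses))
# [('Scholar Wang', 'taught_by', 'Master Hu')]
```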
Breaking with traditional formats, ML+History creates new opportunities and augments our way of understanding history.

First, it helps scholars, especially historians, change their research paradigm, allowing them to generalize their arguments with sufficient examples. ML techniques can find all examples in the data, where manual investigation may miss some. Also, abnormal cases can indicate a new discovery. As far as early Chinese empires are concerned, ML promises to automate the mining and encoding of all available biographical data, which allows scholars to shift their perspective from one person to a group of persons with shared characteristics, and from analyzing examples to relating a comprehensive history. Therefore, scholars can identify general trends efficiently and present an information-rich picture of historical reality using ML techniques.

Second, the structured data produced by ML techniques revolutionize the questions researchers ask, thereby changing the research landscape. Because of the lack of efficient tools, there are numerous interesting questions scholars would like to ask but cannot. For example, the geographical mobility of historical actors is an intriguing question for early China, the answer to which would show how diverse regions were integrated into a unified empire. Nevertheless, an individual historian cannot efficiently process the massive amount of information preserved in the sources. With ML techniques, we can generate fact tuples that sort out the geographical origins of all available historical actors and provide comprehensive data for historians to analyze.

Figure 6.1: The graph presents a visual of the social network of officials who served in the government about 2,000 years ago in China. The network describes their relationships and personal attributes.

Table 6.1: Examples of Chinese Text Extraction Patterns. (The classical Chinese pattern strings and the extracted tuples do not survive reproduction here; the recoverable relation templates are: $PER_X was taught by $PER_Y on $KLG (knowledge); $PER_X was taught/mentored by $PER_Y; $PER_X taught $PER_Y; $PER place_of_birth $LOC; and $PER job_title $TIT.)

Third, the project revolutionizes our reading habits. Large datasets mined from primary sources will allow scholars to combine distant reading with the original texts. The macro picture generated from data will aid in-depth analysis of an event against its immediate context. Furthermore, graphics of social networks and common attributes of historical figures will change our reading habits, transforming linear storytelling to accommodate multiple narratives (see Figure 6.1).

Researchers from the two sides develop collaboration through the project step by step, just like developing a relationship toward marriage. Ours started at a faculty gathering, from some random chat about our research. As the historian is open-minded about ML technologies and the ML expert is willing to create broader impact, we brainstormed ideas that would not have developed without taking care of the five important points:

1. Communication: With our research groups, we started to meet frequently at the beginning.
We set up clear goals at the early stage, including expected outcomes, publication venues, and joint proposals for funding agencies, such as the National Endowment for the Humanities (NEH) and Notre Dame seed grant funding. Our research groups met almost twice a week for as long as three weeks.

2. Feel very close to each other: Besides holding meetings, we exchanged our instant messenger accounts so we could communicate faster than by email. We created a Google Drive space to share readings, documents, and presentation slides. We found many tools to create "tight relationships" between the groups at the beginning.

3. Discuss their problems well: Whenever we had misunderstandings, we discussed our problems. Historians learned what a machine does, what a machine can do, and generally how a machine works toward the task. ML people learned what is interesting to historians and what kind of information is valuable. We held to the principle that if a problem exists, it makes sense, and any problem either side encounters is worth a discussion. We needed to solve problems together from the moment they became our problems.

4. Handle their differences creatively: Historians are among the few who can read and write classical Chinese. Classical Chinese was used as the written language from over 3,000 years ago to the early 20th century. Since then, mainland China has used either Mandarin (simplified Chinese) or Cantonese, while Taiwan has used traditional Chinese. None is similar to classical Chinese at all. In other words, historians work on a language that no ML experts here, even those who speak modern Chinese, can understand. So we handled our language differences "creatively" by using a translated version as the intermediate medium. Historians have translated history books from classical Chinese into simplified Chinese so we can read the simplified version. Here, the idea is to let the machine learning algorithms read both versions. We find that information extraction (i.e., finding relations in text) and machine translation (i.e., from classical Chinese to modern Chinese) can mutually enhance each other, which turns out to be one of our novel technical contributions to the field of natural language processing.

5. Good balance of time alone and together: After the first month, since the project goal, datasets, background knowledge, and many other aspects were clear in both sides' minds, we held regular meetings in a less intensive manner. We met two or three times a month so that the computer science students could focus on developing machine learning algorithms, and only when significant progress was made or expert evaluation was needed would we schedule a quick appointment with Prof. Liang Cai.

So far, we have published peer-reviewed papers on the topic of information extraction and entity retrieval in classical Chinese history books using ML (Ma et al. 2019; Zeng et al. 2019). We have also submitted joint proposals to NEH with the above work as preliminary results.

Example 2: ML + Psychology

I am working with Drs. Ross Jacobucci and Brooke Ammerman in psychology to apply ML to understand mental health problems and suicidal intentions. Suicide is a serious public health problem; however, suicides are preventable with timely, evidence-based interventions. Social media platforms have been serving users who are experiencing real-time suicidal crises with hopes of receiving peer support.
To better understand the helpfulness of peer support occurring online, we characterize the content of both a user's post and the corresponding peer comments on a social media platform and present an empirical example for comparison. We have designed a new topic-model-based approach to finding the topics of user and peer posts in social media forum data (a schematic sketch of this paired-topic idea appears at the end of this example). The key advantages include: (i) modeling both the generative process of each type of corpus (i.e., user posts and peer comments) and the associations between them, and (ii) using phrases, which are more informative and less ambiguous than words alone, to represent social media posts and topics. We evaluated the method using data from Reddit's r/SuicideWatch community.

Figure 6.2: Screenshot of r/SuicideWatch on Reddit.

We examined how the topics of user and peer posts were associated and how this information influenced the perceived helpfulness of peer support. Then, we applied structural topic modeling to data collected from individuals with a history of suicidal crisis as a means to validate our findings. Our observations suggest that effective modeling of the association between the two lines of topics can uncover helpful peer responses to online suicidal crises, notably the suggestion of pursuing professional help. Our technology can be applied to "paired" corpora in many applications, such as tech support forums and question-answering sites.

This project started from a talk I gave at the psychology graduate seminar. The fun thing is that Dr. Jacobucci was not able to attend the talk. Another psychology professor who attended asked constructive questions and mentioned my research to Dr. Jacobucci when they met later. So Dr. Jacobucci dropped me an email, and we had coffee together. Cross-disciplinary research often starts from something that sounds like developing a relationship. Because, again, the psychologists are open-minded toward ML technologies and the ML expert is willing to create broader impact, we successfully brainstormed ideas over coffee, but this would not have developed into a long-term collaboration without the following efforts: (1) Communicate intensively between research groups at the early stage. We had multiple meetings a week to make the goals clear. (2) Get students involved in the process. As my graduate student received more and more advice from the psychology professors and students, the connections between the two groups became stronger. (3) Discuss the challenges in our fields very well. We analyzed together whether machine learning would be capable of addressing the challenges in mental health. We also analyzed whether domain experts could be involved in the loop of the machine learning algorithms. (4) Handle our differences. We separately presented our research and then found times to work together to put sets of slides together based on one common vision and goal. (5) After the first month, hold meetings only when discussion is needed or a deadline is approaching for either a paper or a proposal.

We have enjoyed our collaboration and the power of cross-disciplinary research. Our joint work is under review at Palgrave Communications (now Humanities and Social Sciences Communications). We have also submitted joint proposals to NIH with this work as preliminary results (Jiang et al. 2020).
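The sketch below is a schematic stand-in for the paired-topic idea, not the model from the paper: it fits a single LDA model over uni- and bigram "phrase" features, infers topic mixtures separately for user posts and their paired peer comments, and cross-tabulates the two. All data are invented.

```python
# Schematic stand-in for paired topic modeling, NOT the paper's model.
from sklearn.decomposition import LatentDirichletAllocation
from sklearn.feature_extraction.text import CountVectorizer
import numpy as np

# Invented paired corpora: comment i responds to post i.
user_posts = ["feeling hopeless and alone tonight",
              "lost my job and cannot cope anymore"]
peer_comments = ["please talk to a crisis counselor tonight",
                 "a professional therapist helped me cope"]

# Uni- and bigram counts roughly play the role of "phrases."
vec = CountVectorizer(ngram_range=(1, 2))
X_posts = vec.fit_transform(user_posts)
X_comments = vec.transform(peer_comments)

lda = LatentDirichletAllocation(n_components=2, random_state=0)
theta_posts = lda.fit_transform(X_posts)    # topic mixture of each post
theta_comments = lda.transform(X_comments)  # topic mixture of each comment

# Post-topic x comment-topic co-occurrence over the pairs: a crude proxy
# for the post/comment associations the authors model generatively.
association = theta_posts.T @ theta_comments
print(np.round(association, 2))
```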
Conclusions

In this essay, I used a metaphor comparing cross-disciplinary ML research to "happy marriages" and discussed five characteristics they share. Specifically, I presented the top strengths of producing successful cross-disciplinary ML research: (1) Partners are satisfied with communication. (2) Partners feel very close to each other. (3) Partners discuss their problems well. (4) Partners handle their differences creatively. (5) There is a good balance of time alone (i.e., individual research work) and together (meetings, discussions, etc.). While every project is different and will produce its own challenges, my experience of collaborating with historians and psychologists according to the happy-marriage metaphor suggests that it is a simple and strong paradigm that could help other interdisciplinary projects develop into successful, long-term collaborations.

References

Aagaard-Hansen, Jens. 2007. "The Challenges of Cross-Disciplinary Research." Social Epistemology 21, no. 4 (October–December): 425–38. https://doi.org/10.1080/02691720701746540.

Cai, Liang. 2014. Witchcraft and the Rise of the First Confucian Empire. Albany: SUNY Press.

DeFrain, John, and Sylvia M. Asay. 2007. "Strong Families Around the World: An Introduction to the Family Strengths Perspective." Marriage & Family Review 41, no. 1–2 (August): 1–10. https://doi.org/10.1300/J002v41n01_01.

Gordon, Cameron L., and Donald H. Baucom. 2009. "Examining the Individual Within Marriage: Personal Strengths and Relationship Satisfaction." Personal Relationships 16, no. 3 (September): 421–435. https://doi.org/10.1111/j.1475-6811.2009.01231.x.

Jeffrey, Paul. 2003. "Smoothing the Waters: Observations on the Process of Cross-Disciplinary Research Collaboration." Social Studies of Science 33, no. 4 (August): 539–62.

Jiang, Meng, Brooke A. Ammerman, Qingkai Zeng, Ross Jacobucci, and Alex Brodersen. 2020. "Phrase-Level Pairwise Topic Modeling to Uncover Helpful Peer Responses to Online Suicidal Crises." Humanities and Social Sciences Communications 7: 1–13.

Karniouchina, Ekaterina V., Liana Victorino, and Rohit Verma. 2006. "Product and Service Innovation: Ideas for Future Cross-Disciplinary Research." The Journal of Product Innovation Management 23, no. 3 (May): 274–80.

Kohavi, Ron, George John, Richard Long, David Manley, and Karl Pfleger. 1994. "MLC++: A Machine Learning Library in C++." In Proceedings of the Sixth International Conference on Tools with Artificial Intelligence, 740–3. N.p.: IEEE. https://doi.org/10.1109/TAI.1994.346412.

Kotsiantis, S. B. 2012. "Use of Machine Learning Techniques for Educational Proposes [sic]: A Decision Support System for Forecasting Students' Grades." Artificial Intelligence Review 37, no. 4 (May): 331–44. https://doi.org/10.1007/s10462-011-9234-x.

Ma, Yihong, Qingkai Zeng, Tianwen Jiang, Liang Cai, and Meng Jiang. 2019. "A Study of Person Entity Extraction and Profiling from Classical Chinese Historiography." In Proceedings of the 2nd International Workshop on EntitY REtrieval, edited by Gong Cheng, Kalpa Gunaratna, and Jun Wang, 8–15. N.p.: International Workshop on EntitY REtrieval. http://ceur-ws.org/Vol-2446/.

Miller, Eliza C.,
and Lisa Leffert. 2018. "Building Cross-Disciplinary Research Collaborations." Stroke 49, no. 3 (March): e43–e45. https://doi.org/10.1161/strokeaha.117.020437.

Mullainathan, Sendhil, and Jann Spiess. 2017. "Machine Learning: An Applied Econometric Approach." Journal of Economic Perspectives 31, no. 2 (Spring): 87–106. https://doi.org/10.1257/jep.31.2.87.

Muratovski, Gjoko. 2011. "Challenges and Opportunities of Cross-Disciplinary Design Education and Research." In Proceedings from the Australian Council of University Art and Design Schools (ACUADS) Conference: Creativity: Brain—Mind—Body, edited by Gordon Bull. Canberra, Australia: ACUADS Conference. https://acuads.com.au/conference/article/challenges-and-opportunities-of-cross-disciplinary-design-education-and-research/.

Murthy, Sreerama K. 1998. "Automatic Construction of Decision Trees from Data: A Multi-Disciplinary Survey." Data Mining and Knowledge Discovery 2, no. 4 (December): 345–89. https://doi.org/10.1023/A:1009744630224.

O'Rourke, Michael, Stephen Crowley, and Chad Gonnerman. 2016. "On the Nature of Cross-Disciplinary Integration: A Philosophical Framework." Studies in History and Philosophy of Science Part C: Studies in History and Philosophy of Biological and Biomedical Sciences 56 (April): 62–70. https://doi.org/10.1016/j.shpsc.2015.10.003.

Pedregosa, Fabian, et al. 2011. "Scikit-learn: Machine Learning in Python." The Journal of Machine Learning Research 12: 2825–30. http://www.jmlr.org/papers/v12/pedregosa11a.html.

Pettigrew, Simone F. 2000. "Ethnography and Grounded Theory: A Happy Marriage?" In Association for Consumer Research Conference Proceedings, edited by Stephen J. Hoch and Robert J. Meyer, 256–60. Provo, UT: Association for Consumer Research. https://www.acrwebsite.org/volumes/8400/volumes/v27/.

Prepare/Enrich. N.d. "National Survey of Marital Strengths." Prepare/Enrich (website). Accessed January 17, 2020. https://www.prepare-enrich.com/pe_main_site_content/pdf/research/national_survey.pdf.

Robinson, Linda C., and Priscilla W. Blanton. 1993. "Marital Strengths in Enduring Marriages." Family Relations: An Interdisciplinary Journal of Applied Family Studies 42, no. 1 (January): 38–45. https://doi.org/10.2307/584919.

Urquhart, R., E. Grunfeld, L. Jackson, J. Sargeant, and G. A. Porter. 2013. "Cross-Disciplinary Research in Cancer: An Opportunity to Narrow the Knowledge–Practice Gap." Current Oncology 20, no. 6 (December): e512–e521. https://doi.org/10.3747/co.20.1487.

Xu, Anqi, Xiaolin Xie, Wenli Liu, Yan Xia, and Dalin Liu. 2007. "Chinese Family Strengths and Resiliency." Marriage & Family Review 41, no. 1–2 (August): 143–64. https://doi.org/10.1300/J002v41n01_08.

Zeng, Qingkai, Mengxia Yu, Wenhao Yu, Jinjun Xiong, Yiyu Shi, and Meng Jiang. 2019.
"Faceted Hierarchy: A New Graph Type to Organize Scientific Concepts and a Construction Method." In Proceedings of the Thirteenth Workshop on Graph-Based Methods for Natural Language Processing (TextGraphs-13), edited by Dmitry Ustalov, Swapna Somasundaran, Peter Jansen, Goran Glavaš, Martin Riedl, Mihai Surdeanu, and Michalis Vazirgiannis, 140–50. Hong Kong: Association for Computational Linguistics. https://doi.org/10.18653/v1/D19-5317.

johnson-preface-2021 ---- Preface

This collection of essays is the unexpected culmination of a 2018–2020 grant from the Institute of Museum and Library Services to the Hesburgh Libraries at the University of Notre Dame.1 The plan called for a survey and a series of workshops hosted across the country to explore, originally, "the national need for library based topic modeling tools in support of cross-disciplinary discovery systems." As the project developed, however, it became apparent that the scope of research should expand beyond topic modeling and that the scope of output might expand beyond a white paper. The end of the 2010s, we found, was swelling with library-centered investigations of broader machine learning applications across the disciplines, and our workshops demonstrated such a compelling mixture of perspectives on this development that we felt an edited collection of essays from our participants would be an essential witness to the moment in history. With remaining grant funds, we hosted one last workshop at Notre Dame to kick-start writing.

The resulting essays cover a wide ground. Some present a practical, "how-to" approach to the machine learning process for those who wish to explore it at their own institutions. Others present individual projects, examining not just technical components or research findings, but also the social, financial, and political factors involved in working across departments (and in some cases, across the town/gown divide). Others still take a larger panoramic view of the ethics and opportunities of integrating machine learning with cross-disciplinary higher education, veering between optimistic and wary viewpoints.
The multi-disciplinarity of the essayists and the diversity of their research give each chapter a sui generis flavor, though several shared concerns thread through the collection. Most significantly, the authors suggest that while the technical aspects of machine learning are a challenge, especially when working with collaborators from different backgrounds, many of their key concerns are actually about the ethical and social dimensions of the work. In this sense, the collection is very much of the moment. Two large projects on machine learning, cross-disciplinarity, and libraries ran concurrently with our grant — Cordell 2020 and Padilla 2019, which were commissioned by major players in the field, the Library of Congress and OCLC, respectively — and both took pains to foreground the wider potential effects of machine learning. As Ryan Cordell puts it, "current cultural attention to ML may make it seem necessary for libraries to implement ML quickly. However, it is more important for libraries to implement ML through their existing commitments to responsibility and care" (1).

The voices represented here exhibit a thorough commitment to Cordell's call for responsibility and care, and they are only a subset of the larger chorus that sounded at the workshops. We editors therefore encourage readers interested in this bigger picture to examine the meta-themes and detailed information that emerged in the course of the workshops and the original survey through the grant's final report.2 All of these pieces together capture a fascinating snapshot of an interdisciplinary field in motion.

We should note that the working methods of the collection's editorial team were an attempt to extend the grant's spirit of collaboration. Through several stages of development, content editors Don Brower, Mark Dehmlow, Eric Morgan, Alex Papson, and John Wang reviewed assigned essays and provided commentary before notifying general editor Daniel Johnson for prose editing, who in turn shared the updated manuscripts with the authors so the cycle could begin again. The submissions, written variously in Microsoft Word or Google Docs format, were ushered through these stages of life in team Google Drive folders and tracked by spreadsheet before eventual conversion by Don Brower into a series of TeX files, provisioned in a version-controlled GitHub repository, for more fine-tuned final editing. Like working with diverse teams in the pursuit of machine learning, editing essays together in this fashion, for publication by the Hesburgh Libraries, was a novel way of collaborating, and we editors thought candor about this book-making process might prove insightful to readers.

Attending to the social dimensions of the work ourselves, we must note that this collection would not have been possible without the generous support of many people and organizations. We would like to thank the IMLS for providing essential funding support for the grant and the Hesburgh Libraries' Edward H. Arnold University Librarian, Diane Parr Walker, for her organizational support.

1 LG-72-18-0221-18: "Investigating the National Need for Library Based Topic Modeling Discovery Systems." See https://www.imls.gov/grants/awarded/lg-72-18-0221-18.
2 See https://doi.org/10.7274/r0-320z-kn58.
Thank you to the members of the Notre Dame IMLS grant team who, at its various stages, provided critical support in managing logistics, conducting research, facilitating workshops, and analyzing results. These individuals include John Wang (grant project director), Don Brower, Mark Dehmlow, Nastia Guimaraes, Melissa Harden, Helen Hockx-Yu, Daniel Johnson, Christina Leblang, Rebecca Leneway, Laurie McGowan, Eric Lease Morgan, and Alex Papson. The University of Notre Dame Office of General Counsel provided key publication advice, and the University of Notre Dame Office of Research provided critical support in administering the grant. Again, many thanks.

We would also like to thank the co-signatories of the IMLS Grant Application for supporting the project's goals: Mark Graves (then Visiting Research Assistant Professor, Center for Theology, Science, and Human Flourishing, University of Notre Dame), Pamela Graham (Director of Global Studies and Director of the Center for Human Rights Documentation and Research, Columbia University Libraries), and Ed Fox (Professor of Computer Science and Director of the Digital Library Research Laboratory, Virginia Polytechnic Institute and State University). And of course, thanks to the 95 participants in our 2019 IMLS Grant Workshops (too many to enumerate here) and to the essay authors for sharing their expertise and perspectives in growing our collective knowledge of machine learning and its use in research, scholarship, and cultural heritage organizations. Your active engagement continues to shape the field, and we look forward to your next achievements.

References

Cordell, Ryan. 2020. "Machine Learning + Libraries: A Report on the State of the Field." Commissioned by LC Labs, Library of Congress. https://labs.loc.gov/static/labs/work/reports/Cordell-LOC-ML-report.pdf.

Padilla, Thomas. 2019. "Responsible Operations: Data Science, Machine Learning, and AI in Libraries." Dublin, Ohio: OCLC Research. https://www.oclc.org/research/publications/2019/oclcresearch-responsible-operations-data-science-machine-learning-ai.html.

kim-ai-2021 ---- Chapter 7

AI and Its Moral Concerns

Bohyun Kim, University of Rhode Island

Automating Decisions and Actions

The goal of artificial intelligence (AI) as a discipline is to create an artificial system—whether it be a piece of software or a machine with a physical body—that is as intelligent as a human in its performance, either broadly in all areas of human activities or narrowly in a specific activity, such as playing chess or driving.1 The actual capability of most AI systems remained far below this ambitious goal for a long time. But with recent successes in machine learning and deep learning, the performance of some AI programs has started surpassing that of humans.
In 2016, an AI program developed with the deep learning method, AlphaGo, astonished even its creators by winning four out of five Go matches against the eighteen-time world champion, Sedol Lee.2 In 2020, Google's DeepMind unveiled Agent57, a deep reinforcement learning algorithm that reached superhuman levels of play in the 57 classic games of the Atari57 benchmark.3

Early symbolic AI systems determined their outputs based upon given rules and logical inference. AI algorithms in these rule-based systems, also known as good old-fashioned AI (GOFAI), are pre-determined, predictable, and transparent. On the other hand, machine learning, another approach in AI, enables an AI algorithm to evolve to identify a pattern through the so-called 'training' process, which relies on a large amount of data and statistics. Deep learning, one of the widely used techniques in machine learning, further refines this training process using a 'neural network.'4 Machine learning and deep learning have brought significant improvements to the performance of AI systems in areas such as translation, speech recognition, and detecting objects and predicting their movements. Some people assume that machine learning completely replaced GOFAI, but this is a misunderstanding. Symbolic reasoning and machine learning are two distinct but not mutually exclusive approaches in AI, and they can be used together (Knight 2019a). (The code sketch at the end of this section illustrates the contrast between the two approaches on a toy task.)

With their limited intelligence and fully deterministic nature, early rule-based symbolic AI systems raised few ethical concerns.5 AI systems that near or surpass human capability, on the other hand, are likely to be given the autonomy to make their own decisions without humans, even when their workings are not entirely transparent, and some of those decisions are distinctively moral in character. As humans, we are trained to recognize situations that demand moral decision-making. But how would an AI system be able to do so? Or should it be? With self-driving cars and autonomous weapons systems under active development and testing, these are no longer idle questions.

1 Note that by 'as intelligent as a human,' I only mean AI at human-level performance in achieving a particular goal, not general(/strong) AI. General AI—also known as 'artificial general intelligence (AGI)' and 'strong AI'—refers to AI with the ability to adapt to achieve any goals. By contrast, an AI system developed to perform only one or some activities in a specific domain is called a 'narrow(/weak) AI' system.
2 AlphaGo can be said to be "as intelligent as humans," but only in playing Go, where it exceeds human capability. So it does not qualify as general/strong AI in spite of its human-level intelligence in Go-playing. It is to be noted that general(/strong) AI and narrow(/weak) AI signify the difference in the scope of AI capability. General(/strong) AI is also a broader concept than human-like intelligence, either with its carbon-based substrate or with human-like understanding that relies on what we regard as uniquely human cognitive states such as consciousness, qualia, emotions, and so on. For more helpful descriptions of common terms in AI, see (Tegmark 2017, 39). For more on the match between AlphaGo and Sedol Lee, see (Koch 2016).
3 Deep reinforcement learning is a type of deep learning that is goal-oriented and reward-based. See (Heaven 2020).
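To make the rule-based/learned distinction concrete, here is a toy contrast on an invented flagging task with made-up data; it is not any system discussed in this chapter. The hand-written rule is fully transparent, while the decision tree induces equivalent behavior from labeled examples.

```python
# Toy contrast between GOFAI-style rules and a learned model.
# The task, data, and threshold values are all invented.
from sklearn.tree import DecisionTreeClassifier

# Rule-based: the decision logic is explicit and fully inspectable.
def rule_based_flag(duration_months, rating):
    return duration_months > 24 and rating < 3

# Machine learning: the logic is induced from labeled examples instead.
X = [[6, 5], [36, 2], [48, 1], [12, 4], [30, 2], [8, 3]]
y = [rule_based_flag(d, r) for d, r in X]  # labels produced by the rule
model = DecisionTreeClassifier().fit(X, y)

print(rule_based_flag(40, 2))       # True, and we can read off exactly why
print(model.predict([[40, 2]])[0])  # True, but the rule was never written down
```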
The Trolley Problem

Recent advances in AI, such as autonomous cars, have brought new interest to the trolley problem, a thought experiment introduced by the British philosopher Philippa Foot in 1967. In the standard version of this problem, a runaway trolley barrels down a track where five unsuspecting people are standing. You happen to be standing next to a lever that switches the trolley onto a different track, where there is only one person. Those who are on either track will be killed if the trolley heads their way. Should you pull the lever, so that the runaway trolley kills one person instead of five? Unlike a person, a machine does not panic or freeze; it simply follows and executes the given instruction. This means that an AI-powered trolley may act morally as long as it is programmed properly.6 The question itself remains, however. Should the AI-powered trolley be programmed to swerve or stay on course?

Different moral theories, such as virtue ethics, contractarianism, and moral relativism, take different positions. Here, I will consider utilitarianism and deontology. Since their tenets are relatively straightforward, most AI developers are likely to look toward these two moral theories for guidance and insight. Utilitarianism argues that the utility of an action is what makes an action moral. In this view, what generates the greatest amount of good is the most moral thing to do. If one regards five human lives as a greater good than one, then one acts morally by pulling the lever and diverting the trolley to the other track. By contrast, deontology claims that what determines whether an action is morally right or wrong is not its utility but moral rules. If an action is in accordance with those rules, then the action is morally right. Otherwise, it is morally wrong. If not to kill another human being is one of those moral rules, then killing someone is morally wrong even if it is to save more lives.

Note that these are highly simplified accounts of utilitarianism and deontology. The good in utilitarianism can be interpreted in many different ways, and the issue of conflicting moral rules is a perennial problem that deontological ethics grapples with.7 For our purpose, however, these simplified accounts are sufficient to highlight the ways in which the utilitarian and the deontological positions appeal to and go against our moral intuition at the same time. If a trolley cannot be stopped, saving five lives over one seems to be the right thing to do. Utilitarianism appears to get things right in this respect. However, it is hard to dispute that killing people is wrong. If killing is morally wrong no matter what, deontology seems to make more sense. With moral theories, things seem to get more confusing.

4 Machine learning and deep learning have gained momentum because the cost of high-performance computing has significantly decreased and large data sets have become more widely available. For example, the ImageNet data set contains more than 14 million hand-annotated images. The ImageNet data have been used for the well-known annual AI competition for object detection and image classification at large scale from 2010 to 2017. See http://www.image-net.org/challenges/LSVRC/.
5 For an excellent history of AI research, see chapter 1, "What is Artificial Intelligence," of Boden 2016, 1–20.
6 Programming here does not exclusively refer to a deep learning or machine learning approach.
Furthermore, consider the case in which one freezes and fails to pull the lever. According to utilitarianism, this would be morally wrong because it fails to maximize the greatest good, i.e., human lives. But how far should one go to maximize the good? Suppose there is a very large person on a footbridge over the trolley track, and one pushes that person off the footbridge onto the track, thus stopping the trolley and saving the five people. Would this count as a right thing to do? Utilitarianism may argue so. But in real life, many would consider throwing a person morally wrong but pulling the lever morally permissible.8

The problem with utilitarianism is that it treats the good as something inherently quantifiable, comparable, calculable, and additive. But not all considerations that we have to factor into moral decision-making are measurable in numbers. What if the five people on the track are helpless babies, or murderers who just escaped from prison? Would or should that affect our decision? Some of us would surely hesitate to save the lives of five murderers by sacrificing one innocent baby. But what if things were different and we were comparing five school children versus one baby, or five babies versus one school child? No one can say for sure what is the morally right action in those cases.9

While the utilitarian position appears less persuasive in light of these considerations, deontology doesn't fare too well, either. Deontology emphasizes one's duty to observe moral rules. But what if those moral rules conflict with one another? Between the two moral rules "do not kill a person" and "save lives," which one should trump the other? Conflict among values is common in life, and deontology faces difficulty in guiding how an intelligent agent is to act in a tricky situation such as the trolley problem.10

Understanding What Ethics Has to Offer

Now, let us consider AI-powered military robots and autonomous weapons systems, since they present the moral dilemma in the trolley problem more convincingly due to the high stakes involved. Suppose that some engineers, following utilitarianism and interpreting victory as the ultimate good/utility, wish to program an unmanned aerial vehicle (UAV) to autonomously drop bombs in order to maximize the chances of victory. That may result in sacrificing a greater number of civilians than necessary, and many will consider this to be morally wrong. Now imagine different engineers who, adopting deontology and following the moral principle of not killing people, program a UAV to autonomously act in a manner that minimizes casualties. This may lead to defeat on the battlefield, because minimizing casualties is not always advantageous to winning a war.

7 For an overview, see (Sinnott-Armstrong 2019) and (Alexander and Moore 2016).
8 For an empirical study on this, see (Cushman, Young, and Hauser 2006). For the results of a similar survey that involves an autonomous car instead of a trolley, see (Bonnefon, Shariff, and Rahwan 2016).
9 For an attempt to identify moral principles behind our moral intuition in different versions of the trolley problem and other similar cases, see (Thomson 1976).
10 Some moral philosophers doubt the value of our moral intuition in constructing a moral theory. See (Singer 2005), for example. But a moral theory that clashes with common moral intuition is unlikely to be sought out as a guide to making an ethical decision.
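Before drawing the lesson from these examples, a deliberately toy encoding of the two stances may help show how differently they resolve the same choice. This is an invented illustration, not a real controller; the functions, their inputs, and the scenario are all made up.

```python
# Toy encodings of the two moral stances for a trolley-style choice.
# Invented for illustration; not a real decision system.

def utilitarian_choice(lives_lost_if_stay, lives_lost_if_swerve):
    """Maximize the good: pick whichever action costs fewer lives."""
    return "swerve" if lives_lost_if_swerve < lives_lost_if_stay else "stay"

def deontological_choice(swerving_kills_someone):
    """Follow the rule 'do not kill': never take an action that itself
    kills, even when inaction lets more people die."""
    return "stay" if swerving_kills_someone else "swerve"

print(utilitarian_choice(5, 1))    # 'swerve': five lives outweigh one
print(deontological_choice(True))  # 'stay': swerving would kill someone
```

The two functions disagree on the standard case, which is precisely the disagreement the following discussion draws out.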
From these examples, we can see that philosophical insights from utilitarianism and deontology may provide little practical guidance on how to program autonomous AI systems to act morally. Ethicists seek abstract principles that can be generalized. For this reason, they are interested in borderline cases that reveal subtle differences in our moral intuition and varying moral theories. Their goal is to define what is moral and to investigate how moral reasoning works or should work. By contrast, engineers and programmers pursue practical solutions to real-life problems and look for guidelines that will help with implementing those solutions. Their focus is on creating a set of constraints and if-then statements that will allow a machine to identify and process morally relevant considerations, so that it can determine and execute an action that is not only rational but also ethical in the given situation.11

On the other hand, the goal of military commanders and soldiers is to end a conflict, bring peace, and facilitate restoring and establishing universally recognized human values such as freedom, equality, justice, and self-determination. In order to achieve this goal, they must make the best strategic decisions and take the most appropriate actions. In deciding on those actions, they are also responsible for abiding by the principles of jus in bello and for not abdicating their moral responsibility, protecting civilians and minimizing harm, violence, and destruction as much as possible.12 The goal of military commanders and soldiers, therefore, differs from those of moral philosophers or of the engineers who build autonomous weapons. They are obligated to make quick decisions in life-or-death situations while working with AI-powered military systems.

These different goals and interests explain why moral philosophers' discussion of the trolley problem may be disappointing to AI programmers or to military commanders and soldiers. Ethics does not provide an easy answer to the question of how one should program moral decision-making into intelligent machines. Nor does it prescribe the right moral decision on a battlefield. But taking this as a shortcoming of ethics is missing the point. The role of moral philosophy is not to make decision-making easier but to highlight and articulate the difficulty and complexity involved in it.

Ethical Challenges from Autonomous AI Systems

The complexity of ethical questions means that dealing with the morality of an action by an autonomous AI system will require more than a clever engineering or programming solution. The fact that ethics does not eliminate the inherent ambiguity in many moral decisions should not lead to the dismissal of ethical challenges from autonomous AI systems. By injecting the capacity for autonomous decision-making into machines, AI can fundamentally transform any given field. For example, AI-powered military robots are not just another kind of weapon. When widely deployed, they can change the nature of war itself. Described below are some of the significant ethical challenges that autonomous AI systems such as military robots present.

11 Note that this moral decision-making process can be modeled with a rule-based symbolic AI approach, a machine learning approach, or a combination of both. See Vincent Conitzer et al. 2017.
12 For the principles of jus in bello, see International Committee of the Red Cross 2015.
Note that in spite of these ethical concerns, autonomous AI systems are likely to continue to be developed and adopted in many areas as a way to increase efficiency and lower cost.

(a) Moral desensitization

AI-powered military robots are more capable than merely remotely operated weapons. They can identify a target and initiate an attack on their own. Due to their autonomy, military robots can significantly increase the distance between the party that kills and the party that gets killed (Sharkey 2012). This increase, however, may lead people to surrender their own moral responsibility to a machine, thereby resulting in the loss of humanity, which is a serious moral risk (Davis 2007). The more autonomous military robots become, the less responsibility humans will feel regarding their life-or-death decisions.

(b) Unintended outcome

The side that deploys AI-powered military robots is likely to suffer fewer casualties itself while inflicting more casualties on the enemy side. This may make the military more inclined to start a war. Ironically, when everyone thinks and acts this way, the number of wars and the overall amount of violence and destruction in the world will only increase.13

(c) Surrender of moral agency

AI-powered military robots may fail to distinguish innocents from combatants and kill the former. In such a case, can we be justified in letting robots take the lives of other human beings? Some may argue that only humans should decide to kill other humans, not machines (Davis 2007). Is it permissible for people to delegate such a decision to AI?

(d) Opacity in decision-making

Machine learning is used to build many AI systems today. Instead of prescribing a pre-determined algorithm, a machine learning system goes through a so-called 'training' process to produce the final algorithm from a large amount of data. For example, a machine learning system may generate an algorithm that successfully recognizes cats in a photo after going through millions of photos that show cats in many different postures from various angles.14 But the resulting algorithm is a complex mathematical formula, not something that humans can easily decipher. This means that the inner workings of a machine learning AI system and its decision-making process are opaque to human understanding, even to those who built the system itself (Knight 2017). In cases where the actions of an AI system can have grave consequences, as with a military robot, such opacity becomes a serious problem.15

13 (Kahn 2012) also argues that the resulting increase in the number of wars by the use of military robots will be morally bad.
14 Google's research team created an AI algorithm that learned how to recognize a cat in 2012. The neural network behind this algorithm had an array of 16,000 processors and more than one billion connections. Unlabeled random thumbnail images from 10 million YouTube videos allowed this algorithm to learn to identify cats by itself. See Markoff 2012 and Clark 2012.
15 This black-box nature of AI systems powered by machine learning has raised great concern among many AI researchers in recent years. This is problematic in all areas where these AI systems are used for decision-making, not just in military operations. The gravity of decisions made in a military operation makes this problem even more troublesome. Fortunately, some AI researchers, including those in the US Department of Defense, are actively working to make AI systems explainable.
But until such research bears fruit and AI systems become fully explainable, their military use means accepting many unknown variables and unforeseeable consequences. See Turek n.d.

AI Applications for Libraries

Do the ethical concerns outlined above apply to libraries? To answer that, let us first take a look at how AI, particularly machine learning, may apply to library services and operations. AI-powered digital assistants are likely to mediate a library user's information search, discovery, and retrieval activities in the near future.

In recent years, machine learning and deep learning have brought significant improvement to natural language processing (NLP), which deals with analyzing large amounts of natural language data to make interaction between people and machines in natural languages possible. For instance, Google Assistant's new feature 'Duplex' was shown to successfully make a phone reservation with restaurant staff in 2018 (Welch 2018). Google's real-time translation capability for 44 different languages was introduced to Google Assistant-enabled Android and iOS phones in 2019 (Rincon 2019). As digital assistants become capable of handling more sophisticated language tasks, their use as a flexible voice user interface will only increase. Such digital assistants will be able to directly interact with library systems and applications, automatically interpret a query, and return the results that they deem most relevant. These digital assistants can also be equipped to handle the library's traditional reference or readers' advisory service. Integrated into a humanoid robot body, they may even greet library patrons at the entrance and answer directional questions about the library building.

Cataloging, abstracting, and indexing are other areas where AI will be actively utilized. Currently, those tasks are performed by skilled professionals. But as AI applications become more sophisticated, we may see many of those tasks partially or fully automated and handed over to AI systems. Machine learning and deep learning can be used to extract key information from a large number of documents or from information-rich visual materials, such as maps and video recordings, and generate metadata or a summary (a toy sketch of this idea appears below).

Since machine learning is new to libraries, there are a relatively small number of machine learning applications developed for libraries' use. They are likely to grow in number. Yewno, Quartolio, and Iris.ai are examples of commercial products developed with machine learning and deep learning techniques.16 Yewno Discover displays the connections between different concepts or works in library materials. Quartolio targets researchers looking to discover untapped research opportunities based upon a large amount of data that includes articles, clinical trials, patents, and notes. Similarly, Iris.ai helps researchers identify and review a large amount of research papers and patents and extracts key information from them. Kira identifies, extracts, and analyzes text in contracts and other legal documents.17 None of these applications performs fully automated decision-making or incorporates the digital assistant feature. But this is an area on which information systems vendors are increasingly focusing their efforts. Libraries themselves are also experimenting with AI to test its potential for library services and operations.
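As a toy illustration of the metadata-generation idea above, and emphatically not how Yewno, Quartolio, Iris.ai, Kira, or AMP actually work, TF-IDF keyword extraction can propose candidate subject terms for items in a small collection. The documents here are invented.

```python
# Toy metadata drafting via TF-IDF keyword extraction. Illustration only;
# the collection is invented, and real systems are far more sophisticated.
from sklearn.feature_extraction.text import TfidfVectorizer

collection = [
    "railroad maps of the nineteenth century american west",
    "oral history recordings of steel workers in pittsburgh",
    "botanical illustrations from colonial era field notebooks",
]

vec = TfidfVectorizer(stop_words="english")
weights = vec.fit_transform(collection).toarray()
terms = vec.get_feature_names_out()

# Propose each document's three highest-weighted terms as candidate subjects.
for doc, row in zip(collection, weights):
    top_terms = [t for _, t in sorted(zip(row, terms), reverse=True)[:3]]
    print(f"{doc[:35]}... -> {top_terms}")
```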
Some are focusing on using AI, particularly the voice user interface aspect of the digital assistant, in order to improve existing services. The University of Oklahoma Libraries have been building an Alexa application to provide basic reference service to their students.18 At the University of Pretoria Library in South Africa, a robot named 'Libby' already interacts with patrons by providing guidance, answering questions, conducting surveys, and displaying marketing videos (Mahlangu 2019).

16 See https://www.yewno.com/education, https://quartolio.com/, and https://iris.ai/.
17 See https://kirasystems.com/. Law firms are adopting similar products to automate and expedite their legal work, and law librarians are discussing how the use of AI may change their work. See Marr 2018 and Talley 2016.
18 University of Oklahoma Libraries are building an Alexa application that will provide some basic reference service to their students. Also, their PAIR registry attempts to compile all AI-related projects at libraries. See https://pair.libraries.ou.edu.

Other libraries are applying AI to extract information from digital materials and automate metadata generation to enhance their discovery and use. The Library of Congress has worked on detecting features, such as railroads in maps, using a convolutional neural network model, and issued a solicitation for a machine learning and deep learning pilot program to maximize the use of its digital collections in 2019.19 Indiana University Libraries, AVP, the University of Texas Austin School of Information, and the New York Public Library are jointly developing the Audiovisual Metadata Platform (AMP), using many AI tools in order to automatically generate metadata for audiovisual materials, which collection managers can use to supplement their archival description and processing workflows.20

Some libraries are also testing out AI as a tool for evaluating services and operations. The University of Rochester Libraries applied deep learning to the library's space assessment to determine the optimal staffing level and building hours. The University of Illinois Urbana-Champaign Libraries used machine learning to conduct sentiment analysis on their reference chat log (Blewer, Kim, and Phetteplace 2018).

Ethical Challenges from the Personalized and Automated Information Environment

Do these current and future AI applications for libraries pose ethical challenges similar to those we discussed earlier? Since information query, discovery, and retrieval rarely involve life-or-death situations, the stakes certainly seem lower. But an AI-driven automated information environment does raise its own distinct ethical challenges.

(i) Intellectual isolation and bigotry hampering civic discourse

Many AI applications that assist with information-seeking activities promise a higher level of personalization. But a highly personalized information environment often traps people in their own so-called 'filter bubble,' as we have been increasingly seeing in today's social media channels, news websites, and commercial search engines, where such personalization is provided by machine learning and deep learning.21 Sophisticated AI algorithms are already curating and pushing information feeds based upon the person's past search and click behavior.
The result is that information seekers are provided with information that conforms to and reinforces their existing beliefs and interests. Views that are novel or that contrast with their existing beliefs are suppressed and become invisible without their even realizing it. Such lack of exposure to opposing views leads information users to intellectual isolation and even bigotry. Highly personalized information environments powered by AI can actively restrict the ways in which people develop balanced and informed opinions, thereby intensifying and perpetuating social discord and disrupting civic discourse. Under such conditions, prejudices, discrimination, and other unjust social practices are likely to increase, and this in turn will have more negative impact on those with fewer privileges. Intellectual isolation and bigotry have a distinctly moral impact on society.

(ii) Weakening of cognitive agency and autonomy

We have seen earlier that AI-powered digital assistants are likely to mediate people's information search, discovery, and retrieval activities in the near future. As those digital assistants become more capable, they will go beyond listing available information. They will further choose what they deem to be most relevant to users and proceed to recommend or autonomously execute the best course of action.22 Other AI-driven features, such as extracting key information or generating a summary of a large amount of information, are also likely to be included in future information systems, and they may deliver key information or summaries even before a request is made, based upon constant monitoring of the user's activities.

In such a scenario, an information seeker's cognitive agency is likely to be undermined. Crucial to cognitive agency is the mental capacity to critically review a variety of information, judge what is and is not relevant, and interpret how it relates to other existing beliefs and opinions. If AI assumes those tasks, the opportunities for information seekers to exercise their own cognitive agency will surely decrease. Cognitive deskilling and the subsequent weakening of people's agency in the AI-powered automated information environment present an ethical challenge because such agency is necessary for a person to be a fully functioning moral agent in society.23

(iii) Social impact of scholarship and research from flawed AI algorithms

Previously, we have seen that deep learning applications are opaque to human understanding. This lack of transparency and explainability raises the question of whether it is moral to rely on AI-powered military robots for life-or-death decisions. Does the AI-powered information environment have a similar problem?

Machine learning applications base their recommendations and predictions upon the patterns in past data. Their predictions and recommendations are in this sense inherently conservative. They also become outdated when they fail to reflect new social views and material conditions that no longer fit the past patterns.

19 See Blewer, Kim, and Phetteplace 2018 and Price 2019.
20 The AMP wiki is https://wiki.dlib.indiana.edu/pages/viewpage.action?pageId=531699941. The Audiovisual Metadata Platform Pilot Development (AMPPD) project was presented at Code4Lib 2020 (Averkamp and Hardesty 2020).
21 See Pariser 2012.
Furthermore, each data set is a social construct that reflects particular values and choices: who decided to collect the data and for what purpose; who labeled the data; what criteria or beliefs guided such labeling; what taxonomies were used and why (Davis 2020). No data set can capture all variables and elements of the phenomenon that it describes. Furthermore, data sets used for training machine learning and deep learning algorithms may not be representative samples of all relevant subgroups. In such a case, an algorithm trained on such a data set will produce skewed results. Creating a large data set is also costly. Consequently, developers often simply take the data sets available to them. Those data sets are likely to come with inherent limitations such as omissions, inaccuracies, errors, and hidden biases.

AI algorithms trained with these flawed data sets can fail unexpectedly, revealing those limitations. For example, it has been reported that the success rate of a facial recognition algorithm plunges from 99% to 35% when the group of subjects changes from white men to dark-skinned women, because it was trained mostly with photographs of white men (Lohr 2018). Adopting such a faulty algorithm for any real-life use at a large scale would be entirely unethical. In the context of libraries, imagine using such a face-recognition algorithm to generate metadata for digitized historical photographs, or a similarly flawed audio transcription algorithm to transcribe archival audio recordings.

Just like those faulty algorithms, an AI-powered automated information environment can produce information, recommendations, and predictions affected by similar limitations existing in many data sets. The more seamless such an information environment is, the more invisible those limitations become. Automated information systems from libraries may not be involved in decisions that have a direct and immediate impact on people's lives, such as setting a bail amount or determining the Medicaid payment to be paid.24 But automated information systems that are widely adopted and used for research and scholarship will impact real-life policies and regulations in areas such as healthcare and the economy. Undiscovered flaws will undermine the validity of the scholarly output that utilized those automated information systems and can further inflict serious harm on certain groups of people through those policies and regulations.

22 Needless to say, this is a highly simplified scenario. Those features can also be built into the information system itself rather than being delivered by a digital assistant.
23 Outside of the automated information environment, AI has a strong potential to engender moral deskilling. Vallor (2015) points out that automated weapons will lead to soldiers' moral deskilling in the use of military force; new media practices of multitasking may result in deskilling in moral attention; and social robots can cause moral deskilling in practices of human caregiving.
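The subgroup failure described above is exactly what a per-group audit is designed to surface. Below is a minimal sketch of such an audit using synthetic data and a simulated model; the group labels, sizes, and error rates are invented to mimic the disparity cited in the text.

```python
# Minimal subgroup-accuracy audit on synthetic data. Everything here
# (groups, sizes, error rates) is invented for illustration.
import numpy as np
from sklearn.metrics import accuracy_score

rng = np.random.default_rng(0)
groups = np.array(["majority"] * 800 + ["minority"] * 200)
y_true = rng.integers(0, 2, size=1000)

# Simulated model: near-perfect on the majority group, error-prone on
# the under-represented one, mimicking the disparity cited above.
y_pred = y_true.copy()
flip = (groups == "minority") & (rng.random(1000) < 0.4)
y_pred[flip] = 1 - y_pred[flip]

# The aggregate score hides the gap that the per-group audit reveals.
print("overall:", accuracy_score(y_true, y_pred))
for g in ("majority", "minority"):
    mask = groups == g
    print(g, accuracy_score(y_true[mask], y_pred[mask]))
```

Reporting only the overall number would mask the minority group's much lower accuracy.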
Moral Intelligence and Rethinking the Role of AI

In this chapter, I discussed four significant ethical challenges that automating decisions and actions with AI presents: (a) moral desensitization; (b) unintended outcomes; (c) surrender of moral agency; and (d) opacity in decision-making.25 I also examined somewhat different but equally significant ethical challenges in relation to the AI-powered automated information environment, which is likely to surround us in the future: (i) intellectual isolation and bigotry hampering civic discourse; (ii) weakening of cognitive agency and autonomy; and (iii) the social impact of scholarship and research based upon flawed AI algorithms.

In the near future, libraries will be acquiring, building, customizing, and implementing many personalized and automated information systems. Given this, the challenges related to the AI-powered automated information environment are highly relevant to them. At present, libraries are at an early stage in developing AI applications and applying machine learning and deep learning techniques to improve library services, systems, and operations. But the general issues of hidden biases and the lack of explainability in machine learning and deep learning are already gaining awareness in the library community.

As we have seen in the trolley problem, whether a certain action is moral is not a line that can be drawn with absolute clarity. It is entirely possible for fully functioning moral agents to make different judgements. In addition, there is the matter of the morality that our tools and systems display. This is called "machine morality" in relation to AI systems.

Wallach and Allen (2009) argue that there are three distinct levels of machine morality: operational morality, functional morality, and full moral agency (26). Operational morality is found in systems that are low in both autonomy and ethical sensitivity. At this level of machine morality, a machine or a tool is given a mechanism that prevents its immoral use, but the mechanism is within the full control of the user. Such operational morality exists in a gun with a childproof safety mechanism, for example. A gun with a safety mechanism is neither autonomous nor sensitive to ethical concerns related to its use. By contrast, machines with functional morality do possess a certain level of autonomy and ethical sensitivity. This category includes AI systems with significant autonomy and little ethical sensitivity, or those with little autonomy and high ethical sensitivity. An autonomous drone would fall under the former type, while MedEthEx, an ethical decision-support AI recommendation system for clinicians, would be of the latter. Lastly, Wallach and Allen regard systems with high autonomy and high ethical sensitivity as having full moral agency, as much as humans do. This means that those systems would have a mental representation of values and the capacity for moral reasoning. Such machines can be held morally responsible for their actions. We do not know whether AI will be able to produce such a machine with full moral agency.

24 See Tashea 2017 and Stanley 2017.
25 This is by no means an exhaustive list. User privacy and potential surveillance are examples of other important ethical challenges, which I do not discuss here.
If the current direction to automate more and more human tasks for cost savings and efficiency at scale continues, however, most of the more sophisticated AI applications to come will be of the kind with functional morality, particularly the kind that combines a relatively high level of autonomy with a lower level of ethical sensitivity. At the beginning of this chapter, I mentioned that the goal of AI is to create an artificial system—whether it be a piece of software or a machine with a physical body—that is as intelligent as a human in its performance, either broadly in all areas of human activities or narrowly in a specific activity. But what does "as intelligent as a human" exactly mean? If morality is an integral component of human-level intelligence, AI research needs to pay more attention to intelligence not only in accomplishing a goal but also in doing so ethically.26

In that light, it is meaningful to ask what level of autonomy and ethical sensitivity a given AI system is equipped with, and what level of machine morality is appropriate for its purpose. In designing an AI system, it would be helpful to consider what level of autonomy and ethical sensitivity would be best suited for its purpose and whether it is feasible to provide that level of machine morality for the system in question. In general, the narrower the function or the domain of an AI system, the easier it will be to equip it with an appropriate level of autonomy and ethical sensitivity. In evaluating and designing an AI system, it will be important to test the actual outcome against the anticipated outcome in different types of cases in order to identify potential problems. System-wide audits to detect well-known biases, such as gender discrimination or racism, can serve as an effective strategy.27 Other undetected problems may surface only after the AI system is deployed. Having a mechanism to continually test an AI algorithm to identify those unnoticed problems, and feeding the test results back into the algorithm for retraining, will be another way to deal with algorithmic biases. Those who build AI systems will also benefit from consulting existing principles and guidelines such as FAT/ML's "Principles for Accountable Algorithms and a Social Impact Statement for Algorithms."28

We may also want to rethink how and where we apply AI.

26. Here, I regard intelligence as the ability to accomplish complex goals, following Tegmark 2017. For more discussion of intelligence and goals, see Chapter 2 and Chapter 7.

27. These audits are far from foolproof, but the detection of hidden biases will be crucial in making AI algorithms more accountable and their decisions more ethical. A debiasing algorithm can also be used during the training stage of an AI algorithm to reduce hidden biases in training data. See Amini et al. 2019, Knight 2019b, and Courtland 2018.

28. See https://www.fatml.org/resources/principles-for-accountable-algorithms. Other principles and guidelines include "Ethics Guidelines for Trustworthy AI" (https://ec.europa.eu/digital-single-market/en/news/ethics-guidelines-trustworthy-ai) and "Algorithmic Impact Assessments: A Practical Framework for Public Agency Accountability" (https://ainowinstitute.org/aiareport2018.pdf).
We and our society do not have to use AI to equip all our systems and machines with human- or superhuman-level performance. This is particularly so if the pursuit of such human- or superhuman-level performance is likely to increase unethical decisions that negatively impact a significant number of people. We do not have to task AI with always automating away human work and decisions as much as possible. What if we reframed AI's role as helping people become more intelligent and more capable in areas where they struggle or experience disadvantages, such as critical thinking, civic participation, healthy living, and financial literacy, or with conditions such as dyslexia or hearing loss? What kind of AI-driven information systems and environments would be created if libraries approached AI with such intentions from the beginning?

References

Alexander, Larry, and Michael Moore. 2016. "Deontological Ethics." In The Stanford Encyclopedia of Philosophy, edited by Edward N. Zalta, Winter 2016. Metaphysics Research Lab, Stanford University. https://plato.stanford.edu/archives/win2016/entries/ethics-deontological/.

Amini, Alexander, Ava P. Soleimany, Wilko Schwarting, Sangeeta N. Bhatia, and Daniela Rus. 2019. "Uncovering and Mitigating Algorithmic Bias through Learned Latent Structure." In Proceedings of the 2019 AAAI/ACM Conference on AI, Ethics, and Society, 289–295. AIES '19. New York: Association for Computing Machinery. https://doi.org/10.1145/3306618.3314243.

Averkamp, Shawn, and Julie Hardesty. 2020. "AI Is Such a Tool: Keeping Your Machine Learning Outputs in Check." Presented at the Code4lib Conference, Pittsburgh, PA, March 11. https://2020.code4lib.org/talks/AI-is-such-a-tool-Keeping-your-machine-learning-outputs-in-check.

Blewer, Ashley, Bohyun Kim, and Eric Phetteplace. 2018. "Reflections on Code4Lib 2018." ACRL TechConnect (blog). March 12, 2018. https://acrl.ala.org/techconnect/post/reflections-on-code4lib-2018/.

Boden, Margaret A. 2016. AI: Its Nature and Future. Oxford: Oxford University Press.

Bonnefon, Jean-François, Azim Shariff, and Iyad Rahwan. 2016. "The Social Dilemma of Autonomous Vehicles." Science 352 (6293): 1573–76. https://doi.org/10.1126/science.aaf2654.

Clark, Liat. 2012. "Google's Artificial Brain Learns to Find Cat Videos." Wired, June 26, 2012. https://www.wired.com/2012/06/google-x-neural-network/.

Conitzer, Vincent, Walter Sinnott-Armstrong, Jana Schaich Borg, Yuan Deng, and Max Kramer. 2017. "Moral Decision Making Frameworks for Artificial Intelligence." In Proceedings of the Thirty-First AAAI Conference on Artificial Intelligence, 4831–4835. AAAI'17. San Francisco, California, USA: AAAI Press.

Courtland, Rachel. 2018. "Bias Detectives: The Researchers Striving to Make Algorithms Fair." Nature 558 (7710): 357–60. https://doi.org/10.1038/d41586-018-05469-3.

Cushman, Fiery, Liane Young, and Marc Hauser. 2006. "The Role of Conscious Reasoning and Intuition in Moral Judgment: Testing Three Principles of Harm." Psychological Science 17 (12): 1082–89.

Davis, Daniel L. 2007. "Who Decides: Man or Machine?" Armed Forces Journal, November. http://armedforcesjournal.com/who-decides-man-or-machine/.

Davis, Hannah. 2020. "A Dataset Is a Worldview." Towards Data Science, March 5, 2020. https://towardsdatascience.com/a-dataset-is-a-worldview-5328216dd44d.
Foot, Philippa. 1967. "The Problem of Abortion and the Doctrine of Double Effect." Oxford Review 5: 5–15.

Heaven, Will Douglas. 2020. "DeepMind's AI Can Now Play All 57 Atari Games—but It's Still Not Versatile Enough." MIT Technology Review, April 1, 2020. https://www.technologyreview.com/2020/04/01/974997.

International Committee of the Red Cross. 2015. "What Are Jus Ad Bellum and Jus in Bello?" January 22, 2015. https://www.icrc.org/en/document/what-are-jus-ad-bellum-and-jus-bello-0.

Kahn, Leonard. 2012. "Military Robots and The Likelihood of Armed Combat." In Robot Ethics: The Ethical and Social Implications of Robotics, edited by Patrick Lin, Keith Abney, and George A. Bekey, 274–92. Intelligent Robotics and Autonomous Agents. Cambridge, Mass.: MIT Press.

Knight, Will. 2017. "The Dark Secret at the Heart of AI." MIT Technology Review, April 11, 2017. https://www.technologyreview.com/2017/04/11/5113.

———. 2019a. "Two Rival AI Approaches Combine to Let Machines Learn about the World like a Child." MIT Technology Review, April 8, 2019. https://www.technologyreview.com/2019/04/08/103223.

———. 2019b. "AI Is Biased. Here's How Scientists Are Trying to Fix It." Wired, December 19, 2019. https://www.wired.com/story/ai-biased-how-scientists-trying-fix/.

Koch, Christof. 2016. "How the Computer Beat the Go Master." Scientific American, March 19, 2016. https://www.scientificamerican.com/article/how-the-computer-beat-the-go-master/.

Lohr, Steve. 2018. "Facial Recognition Is Accurate, If You're a White Guy." New York Times, February 9, 2018. https://www.nytimes.com/2018/02/09/technology/facial-recognition-race-artificial-intelligence.html.

Mahlangu, Isaac. 2019. "Meet Libby - the New Robot Library Assistant at the University of Pretoria's Hatfield Campus." SowetanLIVE, June 4, 2019. https://www.sowetanlive.co.za/news/south-africa/2019-06-04-meet-libby-the-new-robot-library-assistant-at-the-university-of-pretorias-hatfield-campus/.

Markoff, John. 2012. "How Many Computers to Identify a Cat? 16,000." New York Times, June 25, 2012.

Marr, Bernard. 2018. "How AI And Machine Learning Are Transforming Law Firms And The Legal Sector." Forbes, May 23, 2018. https://www.forbes.com/sites/bernardmarr/2018/05/23/how-ai-and-machine-learning-are-transforming-law-firms-and-the-legal-sector/.

Pariser, Eli. 2011. The Filter Bubble: How the New Personalized Web Is Changing What We Read and How We Think. New York: Penguin Press.
Price, Gary. 2019. "The Library of Congress Posts Solicitation For a Machine Learning/Deep Learning Pilot Program to 'Maximize the Use of Its Digital Collection.'" LJ InfoDOCKET, June 13, 2019. https://www.infodocket.com/2019/06/13/library-of-congress-posts-solicitation-for-a-machine-learning-deep-learning-pilot-program-to-maximize-the-use-of-its-digital-collection-library-is-looking-for-r/.

Rincon, Lilian. 2019. "Interpreter Mode Brings Real-Time Translation to Your Phone." Google Blog (blog). December 12, 2019. https://www.blog.google/products/assistant/interpreter-mode-brings-real-time-translation-your-phone/.

Sharkey, Noel. 2012. "Killing Made Easy: From Joysticks to Politics." In Robot Ethics: The Ethical and Social Implications of Robotics, edited by Patrick Lin, Keith Abney, and George A. Bekey, 111–28. Intelligent Robotics and Autonomous Agents. Cambridge, Mass.: MIT Press.
Singer, Peter. 2005. "Ethics and Intuitions." The Journal of Ethics 9 (3/4): 331–52.

Sinnott-Armstrong, Walter. 2019. "Consequentialism." In The Stanford Encyclopedia of Philosophy, edited by Edward N. Zalta, Summer 2019. Metaphysics Research Lab, Stanford University. https://plato.stanford.edu/archives/sum2019/entries/consequentialism/.

Stanley, Jay. 2017. "Pitfalls of Artificial Intelligence Decisionmaking Highlighted In Idaho ACLU Case." American Civil Liberties Union (blog). June 2, 2017. https://www.aclu.org/blog/privacy-technology/pitfalls-artificial-intelligence-decisionmaking-highlighted-idaho-aclu-case.

Talley, Nancy B. 2016. "Imagining the Use of Intelligent Agents and Artificial Intelligence in Academic Law Libraries." Law Library Journal 108 (3): 383–402.

Tashea, Jason. 2017. "Courts Are Using AI to Sentence Criminals. That Must Stop Now." Wired, April 17, 2017. https://www.wired.com/2017/04/courts-using-ai-sentence-criminals-must-stop-now/.

Tegmark, Max. 2017. Life 3.0: Being Human in the Age of Artificial Intelligence. New York: Alfred Knopf.

Thomson, Judith Jarvis. 1976. "Killing, Letting Die, and the Trolley Problem." The Monist 59 (2): 204–17.

Turek, Matt. n.d. "Explainable Artificial Intelligence." Defense Advanced Research Projects Agency. https://www.darpa.mil/program/explainable-artificial-intelligence.

Vallor, Shannon. 2015. "Moral Deskilling and Upskilling in a New Machine Age: Reflections on the Ambiguous Future of Character." Philosophy & Technology 28 (1): 107–24. https://doi.org/10.1007/s13347-014-0156-9.

Wallach, Wendell, and Colin Allen. 2009. Moral Machines: Teaching Robots Right from Wrong. Oxford: Oxford University Press.

Welch, Chris. 2018. "Google Just Gave a Stunning Demo of Assistant Making an Actual Phone Call." The Verge, May 8, 2018. https://www.theverge.com/2018/5/8/17332070/google-assistant-makes-phone-call-demo-duplex-io-2018.
lesk-fragility-2021 ----

Chapter 9

Fragility and Intelligibility of Deep Learning for Libraries

Michael Lesk, Rutgers University

Introduction

On February 7, 2018, Mounir Mahjoubi, then the "digital minister" of France (le secrétariat d'État chargé du Numérique), told the civil service to use only computer methods that could be understood (Mahjoubi 2018). To be precise, what he actually said to l'Assemblée Nationale was:

Aucun algorithme non explicable ne pourra être utilisé.

I gave this to Google Translate and asked for it in English. What I got (on October 13, 2019) was:

No algorithm that can not be explained can not be used.

That's a long way from fluent English. As I count the "not" words, it's actually reversed in meaning. But what if I leave off the final period when I enter it in Google Translate? Then I get:

No non-explainable algorithm can be used

Quite different, and although only barely fluent, now the meaning is right. The difference was only the final punctuation on the sentence.1 This is an example of the fragility of an AI algorithm. The point is not that both translations are of doubtful quality. The point is that a seemingly insignificant change in the input produced such a difference in the output. In this case, the fragility was detected by accident.

1. In the months between my original queries in October 2019 and the final preparations for publication in November 2020, the algorithm changed to produce the same translation with or without a period: "No non-explicable algorithm can be used."

Machine learning systems have a set of data for training. For example, if you are interested in translation, and you have a large collection of text in both French and English, you might notice that the word truck in English appears where the word camion appears in French. And the system might "learn" this translation. It would then apply this in other examples; this is called generalization. Of course, if you wish to translate French into British English, a preferred translation of camion is lorry. And if the context of your English truck is a US discussion of the wheels and axles underneath railway vehicles, the better French word is le bogie.
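One way to surface this kind of fragility deliberately, rather than by accident, is to rerun a system on trivially perturbed copies of the same input and compare the outputs. A minimal sketch in Python follows; the translation function here is a hypothetical stand-in for whatever system is under test, not Google Translate's actual API.

# Sketch: probe a translation system with trivial input perturbations.
def perturbations(sentence):
    """Yield trivially different versions of the same sentence."""
    yield sentence
    yield sentence.rstrip(".")                 # drop final punctuation
    yield sentence + " "                       # add trailing whitespace
    yield sentence[0].lower() + sentence[1:]   # lowercase the first letter

def fragility_report(translate, sentence):
    """Group the perturbed inputs by the output they produce."""
    outputs = {}
    for variant in perturbations(sentence):
        outputs.setdefault(translate(variant), []).append(repr(variant))
    return outputs  # more than one key means the outputs diverged

def toy_translate(s):
    # A deliberately fragile toy system: output depends on final punctuation.
    return "translation A" if s.rstrip().endswith(".") else "translation B"

source = "Aucun algorithme non explicable ne pourra être utilisé."
print(fragility_report(toy_translate, source))  # two keys: the toy system is fragile

A report with a single key for every reasonable perturbation is weak evidence of robustness; a report with several keys reproduces exactly the kind of instability described above.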
Deep learning enthusiasts believe that with enough examples, machine learning systems will be able to generalize correctly. There can be various kinds of failures: we can discuss both (a) problems in the scope of the training data and (b) problems in the kind of modeling done. If the system has sufficiently general input data so that it learns well enough to produce reliably correct results on examples it has not seen, we call it robust; robustness is the opposite of fragility. Fragility errors can arise from many sources—for example, the training data may not be representative of the real problem (if you train a machine translation program solely on engineering documents, do not expect it to do well on theater reviews). Or the data may not have the scope of the real problem: if you train for "boat" based on ocean liners, don't be surprised if the program fails on canoes.

In addition, there are also modeling issues. Suppose you use a very simple model, such as a linear model, for data that is actually perhaps quadratic or exponential. This is called "underfitting" and may often arise when there is not enough training data. The reverse is also possible: there may be a lot of training data, including many noisy points, and the program may decide on a very complex model to cover all the noise in the training data. This is called "overfitting" and gives you an answer too dependent on noise and outliers in your data. For example, 1998 was an unusually warm year, but the decline in world temperature for the next few years suggests it was noise in the data, not a change in the development of climate.
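The underfitting/overfitting contrast is easy to reproduce numerically. Here is a small sketch using NumPy polynomial fits on noisy quadratic data; the degrees are chosen only for illustration.

# Sketch: underfitting vs. overfitting on noisy quadratic data.
import numpy as np

rng = np.random.default_rng(0)
x = np.linspace(-3, 3, 20)
y = x**2 + rng.normal(0, 1.0, size=x.size)   # the true relationship is quadratic

for degree in (1, 2, 15):
    coeffs = np.polyfit(x, y, degree)        # degree 15 may trigger a conditioning warning
    fitted = np.polyval(coeffs, x)
    print(degree, round(float(np.mean((y - fitted) ** 2)), 3))

# Degree 1 underfits: its training error stays high because a line cannot bend.
# Degree 15 drives training error toward zero by chasing the noise (overfitting)
# and will extrapolate wildly outside [-3, 3].

The warm-year example works the same way: a model complex enough to "explain" 1998 exactly is fitting the noise, not the climate.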
Fragility is also a problem in image recognition ("AI Recognition" 2017). Currently the most common technique in image recognition research projects is the use of convolutional neural nets. Recently, several papers have looked at how trivial modifications to images may impact image classification. Figure 9.1 shows images taken from Su, Vargas, and Sakurai (2019).

[Figure 9.1: Examples of misclassification.]

The original image class is in black, and the classifier choice (and confidence) after adding a single unusual pixel is shown in blue, with the extraneous pixel in white. The images were deliberately processed at low resolution—hence the pixellation—to match the input requirement of a popular image classification program. The authors experimented with algorithms to find the quickest single-pixel change that would deceive an image classifier. They were routinely able to fool the recognition software. In this example, the deception was deliberate; the researchers searched for the best place to change the image.

Bias and mistakes

We have seen a major change in the way we do machine learning, and there are real dangers involved. The current enthusiasm for neural nets risks the use of processes which cannot be understood, as Mahjoubi warned, and which can thus conceal methods we would not approve of, such as discrimination in lending or hiring. Cathy O'Neil has described this in her book Weapons of Math Destruction (2016). There is much research today that seeks methods to explain what neural nets are doing; see Guidiotti et al. (2018) for a survey. There is also a 2018 DARPA program on "Explainable AI." Techniques used can include looking at the results over a range of input data and seeing if the neural net can be modeled by a decision tree, or modifying the input data to see which input elements have the greatest effect on the results, and then showing that to the user. For example, Mariusz Bojarski et al. describe a self-driving system that highlights what it thinks is important in what it is seeing (2017). However, this is generally research in progress, and it raises the question of whether we can trust the explanation generator.

Many popular magazines have discussed this problem; Forbes, for example, ran an explanation of how the choice of datasets can produce a biased result without any deliberate attempt to do so (Taulli 2019). Similarly, the New York Times discussed the way groups of primarily young white men will build systems that focus on their data and give wrong or discriminatory answers in more general situations (Tugend 2019). The MIT Media Lab hosts the Algorithmic Justice League, trying to stop organizations from building socially slanted systems. Similar thoughts come from groups like the Data and Society Research Institute or the AI Now Institute.

Again, the problems may be accidental or deliberate. The phrase "data poisoning" has been used to suggest malicious creation of training data or examples of data designed to deceive machine learning systems. There is now a DARPA research program, "Guaranteeing AI Robustness against Deception (GARD)," supporting research to learn how to stop trickery such as a demonstration of converting a traffic stop sign to a 45 mph speed limit with a few stickers (Eykholt et al. 2018). More generally, bias in systems deciding whether to grant loans may be discriminatory but nevertheless profitable.

Even if you want to detect AI mistakes, recognizing such problems is difficult. Often things will be wrong and we won't know why. And even hypothetical (but perhaps erroneous) explanations can be very convincing; people easily believe plausible stories. I routinely give my students a paper that concludes that prior ownership of a cat prevents fatal myocardial infarctions; its result implies that cats are more protective than statin drugs (Qureshi et al. 2009). The students are very quick to come up with possibilities like "petting a cat is relaxing, relaxation reduces your blood pressure, and lower blood pressure decreases the risk of heart attacks." Then I have to explain that the paper evaluates 32 possibilities (prior/current ownership × cats/dogs × 4 medical conditions × fatal/nonfatal) and you shouldn't be surprised if you evaluate 32 chances and one is significant at the 0.05 level, which is only 1 in 20. In this example, there is also the question of reverse causality: perhaps someone who is in ill health will decide he is too sick to take care of a pet, so that the poor health is not caused by the lack of a cat, but rather the poor health causes the absence of a cat.
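The arithmetic behind that caution is worth spelling out. If the 32 tests were independent (the study's overlapping outcomes do not strictly satisfy this, so treat the figure as a rough guide), the chance of at least one spurious "significant" result at the 0.05 level is large:

# Sketch: chance of at least one false positive among 32 tests at alpha = 0.05,
# under an independence assumption.
alpha, n_tests = 0.05, 32
print(round(1 - (1 - alpha) ** n_tests, 2))   # about 0.81

In other words, a lone significant result in a batch of 32 comparisons is closer to the expected outcome of chance than to evidence that cats outperform statins.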
Sometimes explanations can help, as in a machine learning program that was deliberately trained to distinguish images of wolves and dogs but was trained using pictures of wolves that always contained snow and pictures of dogs that never did (Ribeiro, Singh, and Guestrin 2016). Without explaining that, 10 of 27 subjects thought the classifier was trustworthy; after the snow was pointed out, only 3 of 27 subjects believed the system. Usually you don't get such a clear presentation of a mis-trained system.

Recognition of problems

Can we tell when something is wrong? Figure 9.2 shows the result of a Google Photo merge of three other photos: two landscapes and a picture of somebody's friend. The software was told to make a panorama and stitched the images together (Peng 2018).

[Figure 9.2: Panoramic landscape.]

It looks like a joke, and even made it into a list of top jokes on reddit. The author's point was that the panorama system didn't understand basic composition: people are not the same scale as mountains.

Often, machine learning results are overstated. Google Flu Trends was acclaimed for several years and then turned out to be undependable (Lazer et al. 2014). A study that attempted to compare the performance of machine learning systems for medical diagnosis with actual doctors found that of over 20,000 papers analyzed, only a few dozen had data suitable for an evaluation (Liu et al. 2019). The results claimed comparable accuracy, but virtually none of the papers presented adequate data to support that conclusion. Unusually promising results are sometimes the result of overfitting (Brownlee 2018); this is what was wrong with Google Flu Trends. A machine learning program can learn a large number of special cases and then find that the results do not generalize. In other cases, problems can result from using "clean" data for training and then encountering messier data in applications. Ideally, training and testing data should be from the same dataset and divided at random, but it can be tempting to start off with examples that are the result of initial and higher-quality data collection.

Sometimes in the past we had a choice between modeling and data for predictions. Consider, for example, the problem of guessing what the weather will be tomorrow. We now do this based on a model of the atmosphere that uses the Navier-Stokes equations; we use supercomputers and derive tomorrow's atmosphere from today's (Christensen 2015). What did we do before we had supercomputers? Solving those equations by hand is impractical. One of the methods was "prediction by analogy": find some day in the past whose weather was most similar to today. Suppose that day is Oct. 20, 1970. Then use October 21, 1970 as tomorrow's prediction. Prediction by analogy doesn't require you to have a model or use advanced mathematics. In this case, however, it doesn't work as well—partly because we don't have enough past days to choose from, and we only get new days at the rate of one per day.

In fact, Huug van den Dool estimated the amount of data needed to make accurate analog predictions as 10^30 years' worth, which is far more than the age of the universe (Wilks 2008). The underlying problem is that the weather is very random. If your state lottery is properly run, it should be completely pointless to look at past winning numbers and try to guess the next one. The weather is not that random, but it has too much variation to be solved easily by analogy.
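Stated as an algorithm, prediction by analogy is just a nearest-neighbor lookup over past days. A minimal sketch, assuming each day is summarized as a numeric feature vector; the toy readings are invented.

# Sketch: "prediction by analogy" as a nearest-neighbor lookup.
import numpy as np

def predict_by_analogy(history, today):
    """history: daily feature vectors in chronological order.
    Returns the day *after* the past day most similar to today."""
    past = history[:-1]                        # the last day has no known successor
    distances = np.linalg.norm(past - today, axis=1)
    best = int(np.argmin(distances))           # index of the most analogous day
    return history[best + 1]

# Toy example: five days of (temperature, pressure) readings.
history = np.array([[15.0, 1012.0], [17.0, 1009.0], [16.5, 1011.0],
                    [12.0, 1020.0], [14.0, 1016.0]])
today = np.array([16.8, 1010.5])
print(predict_by_analogy(history, today))      # successor of the closest match

The catch, as van den Dool's estimate makes plain, is sample size: with far too few past days, the nearest "analog" is rarely close enough to be predictive.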
If your problem is very simple (tic-tac-toe) you could indeed write down each position and what the best next move is; there are only about 255,000 games. To deal with more realistic problems, much of machine learning research is now focused on obtaining larger training sets. Instead of trying to learn more about the characteristics of a system that is being modeled, the effort is driven by the dictum "more data beats better algorithms." In a review of the history of speech recognition, Xuedong Huang, James Baker, and Raj Reddy write, "The power of these systems arises mainly from their ability to collect, process, and learn from very large datasets. The basic learning and decoding algorithms have not changed substantially in 40 years" (2014). Nevertheless, speech recognition has gone from frustration to useful products such as dictation software or home appliances.

Lacking a model, however, means that we won't know the limits of the calculations being done. For example, if you have some data that looks quadratic but you fit a linear model, any attempt at extrapolation is fraught with error. If you are using a "black box" system, you don't know when this is happening. And, regrettably, many AI software systems are sold as black boxes where the purchasers and users do not have access to the process, even if they are imagined to be able to understand it.

What's changing

Many AI researchers are sensitive to the risks, especially given the publicity over self-driving cars. As the hype over "deep learning" built up, writers discussed examples such as a Pittsburgh medical system that proposed to send patients with both pneumonia and asthma home, because the computer had not understood that patients with both problems were actually being sent to the ICU (Bornstein 2016; Caruana et al. 2015).

Many people work on ways of explaining or presenting neural net software (Harley 2015). Most important, perhaps, are new EU regulations that prohibit automated decision making that affects EU citizens and provide a "right of explanation" (Metz 2016). We recognize that systems which don't rely on a mathematical model may be cheaper to build than ones where the coders understand what is going on. More serious is that they may be more accurate. The image reproduced as figure 9.3 is from the same article on understandability (Bornstein 2016).

[Figure 9.3: Explainability.]

If there really is a tradeoff between what will solve the problem and what can be explained, we know that many system builders will choose to solve the problem. And yet even having explanations may not be an answer; a key paper on interpretability discusses the complexities of meaning related to explanation, causality, and modeling (Lipton 2018). Arend Hintze has noted that we do not always impose a demand for explanation on people. I can write that the New York Public Library main building is well proportioned and attractive without anyone expecting that I will recite its dimensions or the source of the marble used to construct it. And for some problems that's fine: I don't care how my camera decides on the focus distance to the subject. Where it matters, however, we often want explanations; the hard ethical problem, as noted before, is if better performance can be achieved in an inexplicable way.

Recommendations

2017 saw the publication of the "Asilomar AI principles" (2017). Two of these principles are:

• Safety: AI systems should be safe and secure throughout their operational lifetime, and verifiably so where applicable and feasible.

• Failure Transparency: If an AI system causes harm, it should be possible to ascertain why.

The problem is that the technology used to build many systems does not enable verifiability and explanation.
Similarly, the World Economic Forum calls for protection against discrimination but notes many ways in which technology can have unanticipated and undesirable effects as a result of machine learning ("How to Prevent" 2018).

Historically there has been, and continues to be, too much hype. An important image recognition task is distinguishing malignant and benign spots on mammograms. There have been promises for decades that computers would do this better than radiologists. Here are examples from 1995 ("computer-aided diagnosis can improve radiologists' observational performance") (Schmidt and Nishikawa) and 2009 ("The Bayesian network significantly exceeded the performance of interpreting radiologists") (Burnside et al.). A typical recent AI paper doing this with convolutional neural nets reports 90% accuracy (Singh et al. 2020). To put this in perspective, the problem is complex, but some examples are more straightforward, and even pigeons can reach 85% (Levenson et al. 2015). A serious recent review is "Diagnostic Accuracy of Digital Screening Mammography With and Without Computer-Aided Detection" (Lehman et al. 2015). Very recently there was another claim that computers have surpassed radiologists (Walsh 2020); we will have to await evaluation. As with many claims of medical progress, replicability and evaluation are needed before doctors will be willing to believe them.

What should we do?

Software testing generally is a decades-old discipline, and many basic principles of regression testing apply here also:

• Test data should cover the full range of expected input.

• Test data should also cover unexpected and even illegal input.

• Test data should include known past failures believed cleared up.

• Test data should exercise all parts of the program, and all important paths (coverage).

• Test data should include a set of data which is representative of the distribution of actual data, to be used for timing purposes.
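A minimal sketch of what such a suite might look like for a classifier, following the checklist above; the predict function and the cases are hypothetical stand-ins for a system under test, not any particular product.

# Sketch: a tiny regression-test suite for a model.
def run_suite(predict):
    cases = [
        ("expected input", "renew my library card", "circulation"),
        ("unexpected input", "", "unknown"),     # an empty string should not crash
        ("illegal input", None, "unknown"),      # nor should a missing value
        ("past failure, issue #42", "how do i renew a card", "circulation"),
    ]
    failures = []
    for name, x, want in cases:
        try:
            got = predict(x)
        except Exception as exc:                 # crashes are recorded, not fatal
            got = f"crash: {exc}"
        if got != want:
            failures.append((name, want, got))
    return failures

# A trivial keyword baseline; an empty list means no regressions.
print(run_suite(lambda x: "circulation" if x and "renew" in x else "unknown"))

Coverage and timing, the last two items on the list, need support from the system itself, which is exactly what the next paragraph argues is missing from black-box packages.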
The discussion indicated that 108 Machine Learning, Libraries, and Cross-Disciplinary ResearchǔChapter 9 you needed hundreds of training examples at a minimum, if not thousands, since the animals do not typically walk up to the camera and pose for a full-frame shot. It was important to have both the people who understood the learning systems and the people who knew what the pictures were realistically like. The most amusing contribution by a statistician happened when a computer scientist offered a program that tried to recognize individual giraffes, and a zoologist complained that it only worked if you had a view of the right-hand side of the giraffe. Somebody who knew statistics perked up and said “it’s a 50% chance of recognizing the animal? I can do the math for that.” And it is simpler to do “is there any animal in the picture?” before asking “which animal is it?” and create two easier problems. Technically: • Try to interpolate rather than extrapolate: use the algorithm on points “inside” the training set (thinking in multiple dimensions). • Lean towards feature detection and modeling rather than completely unsupervised learn- ing. • Emphasize continuous rather than discrete variables. I suggest using methods that involve feature detection, since that tells you what the algorithm is relying on. For example, consider the Google Flu Trends failure; the public was not told what terms were used. As David Lazer noted, some of them were just “winter” terms (like ‘basketball’). If you know that, you might be skeptical. More significant are decisions like jail sentences or college admissions; knowing that racial or religious discrimination are not relevant can be verified by knowing that the program did not use them. Knowing what features were used can sometimes help the user: if you know that your loan application was downrated because of your credit score, it may be possible for you to pay off some bill to raise the score. Sometimes you have to use categorical variables (what county do you live in?) but if you have a choice of how you phrase a variable, asking something like “how many minutes a day do you spend reading?” is likely to produce a better fit than asking people to choose “how much do you read: never, sometimes, a lot?” A machine learning algorithm may tell you how much of the variance each input variable explains; you can use that information to focus on the variables that are most important to your problem, and decide whether you think you are measuring them well enough. Why not extrapolate? Sadly, as I write this in early April 2020, we are seeing all sorts of ex- trapolations of the COVID-19 epidemic, with expected US deaths ranging from 30,000 to 2 million, as people try to fit various functions (Gaussians, logistic regression, or whatever) with inadequately precise data and uncertain models. A simpler example is Mark Twain’s: “In the space of one hundred and seventy-six years the Lower Mississippi has shortened itself two hun- dred and forty-two miles. That is an average of a trifle over one mile and a third per year. There- fore, any calm person, who is not blind or idiotic, can see that in the ‘Old Oolitic Silurian Period,’ just a million years ago next November, the Lower Mississippi River was upwards of one million three hundred thousand miles long, and stuck out over the Gulf of Mexico like a fishing-rod. 
And by the same token any person can see that seven hundred and forty-two years from now the Lower Mississippi will be only a mile and three-quarters long, and Cairo and New Orleans will have joined their streets together, and be plodding comfortably along under a single mayor and a mutual board of aldermen" (1883).

Finally, note the advice of Edgar Allan Poe: "Believe nothing you hear, and only one half that you see."

References

"AI Recognition Fooled by Single Pixel Change." BBC News, November 3, 2017. https://www.bbc.com/news/technology-41845878.

"Asilomar AI Principles." 2017. https://futureoflife.org/ai-principles/.

Bojarski, Mariusz, Larry Jackel, Ben Firner, and Urs Muller. 2017. "Explaining How End-to-End Deep Learning Steers a Self-Driving Car." NVIDIA Developer Blog. https://devblogs.nvidia.com/explaining-deep-learning-self-driving-car/.

Bornstein, Aaron. 2016. "Is Artificial Intelligence Permanently Inscrutable?" Nautilus 40 (1). http://nautil.us/issue/40/learning/is-artificial-intelligence-permanently-inscrutable.

Brownlee, Jason. 2018. "The Model Performance Mismatch Problem (and What to Do about It)." Machine Learning Mastery. https://machinelearningmastery.com/the-model-performance-mismatch-problem/.

Burnside, Elizabeth S., Jessie Davis, Jagpreet Chhatwal, Oguzhan Alagoz, Mary J. Lindstrom, Berta M. Geller, Benjamin Littenberg, Katherine A. Shaffer, Charles E. Kahn, and C. David Page. 2009. "Probabilistic Computer Model Developed from Clinical Data in National Mammography Database Format to Classify Mammographic Findings." Radiology 251 (3): 663–72.

Caruana, Rich, Yin Lou, Johannes Gehrke, Paul Koch, Marc Sturm, and Noemie Elhadad. 2015. "Intelligible Models for HealthCare: Predicting Pneumonia Risk and Hospital 30-day Readmission." In Proceedings of the 21th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (KDD '15), 1721–30. New York: ACM Press. https://doi.org/10.1145/2783258.2788613.

Christensen, Hannah. 2015. "Banking on Better Forecasts: The New Maths of Weather Prediction." The Guardian, January 8, 2015. https://www.theguardian.com/science/alexs-adventures-in-numberland/2015/jan/08/banking-forecasts-maths-weather-prediction-stochastic-processes.

Eykholt, Kevin, Ivan Evtimov, Earlence Fernandes, Bo Li, Amir Rahmati, Florian Tramèr, Atul Prakash, Tadayoshi Kohno, and Dawn Song. 2018. "Physical Adversarial Examples for Object Detectors." 12th USENIX Workshop on Offensive Technologies (WOOT 18).

Guidiotti, Riccardo, Anna Monreale, Salvatore Ruggieri, Franco Turini, Giannotti Fosca, and Dino Pedreschi. 2018. "A Survey of Methods for Explaining Black Box Models." ACM Computing Surveys 51 (5): 1–42.

Halevy, Alon, Peter Norvig, and Fernando Pereira. 2009. "The Unreasonable Effectiveness of Data." IEEE Intelligent Systems 24 (2).

Harley, Adam W. 2015. "An Interactive Node-Link Visualization of Convolutional Neural Networks." In Advances in Visual Computing, edited by George Bebis et al., 867–77. Lecture Notes in Computer Science. Cham: Springer International Publishing.

"How to Prevent Discriminatory Outcomes in Machine Learning." 2018. White Paper from the Global Future Council on Human Rights 2016–2018, World Economic Forum. https://www.weforum.org/whitepapers/how-to-prevent-discriminatory-outcomes-in-machine-learning.
Huang, Xuedong, James Baker, and Raj Reddy. 2014. "A Historical Perspective of Speech Recognition." Communications of the ACM 57 (1): 94–103.

Lazer, David, Ryan Kennedy, Gary King, and Alessandro Vespignani. 2014. "The Parable of Google Flu: Traps in Big Data Analysis." Science 343 (6176): 1203–1205.

Lehman, Constance, Robert Wellman, Diana Buist, Karl Kerlikowske, Anna Tosteson, and Diana Miglioretti. 2015. "Diagnostic Accuracy of Digital Screening Mammography with and without Computer-Aided Detection." JAMA Internal Medicine 175 (11): 1828–1837.

Levenson, Richard M., Elizabeth A. Krupinski, Victor M. Navarro, and Edward A. Wasserman. 2015. "Pigeons (Columba livia) as Trainable Observers of Pathology and Radiology Breast Cancer Images." PLoS One, November 18, 2015. https://doi.org/10.1371/journal.pone.0141357.

Lipton, Zachary. 2018. "The Mythos of Model Interpretability." ACM Queue 61 (10): 36–43.

Liu, Xiaoxuan, et al. 2019. "A Comparison of Deep Learning Performance against Health-Care Professionals in Detecting Diseases from Medical Imaging: A Systematic Review and Meta-Analysis." Lancet Digital Health 1 (6): e271–97. https://www.sciencedirect.com/science/article/pii/S2589750019301232.

Mahjoubi, Mounir. 2018. "Assemblée nationale, XVe législature. Session ordinaire de 2017–2018." Compte rendu intégral, Deuxième séance du mercredi 07 février 2018. http://www.assemblee-nationale.fr/15/cri/2017-2018/20180137.asp.

Metz, Cade. 2016. "Artificial Intelligence Is Setting Up the Internet for a Huge Clash with Europe." Wired, July 11, 2016. https://www.wired.com/2016/07/artificial-intelligence-setting-internet-huge-clash-europe/.

O'Neil, Cathy. 2016. Weapons of Math Destruction. New York: Crown.

Peng, Tony. 2018. "2018 in Review: 10 AI Failures." Medium, December 10, 2018. https://medium.com/syncedreview/2018-in-review-10-ai-failures-c18faadf5983.
Qureshi, A. I., M. Z. Memon, G. Vazquez, and M. F. Suri. 2009. "Cat Ownership and the Risk of Fatal Cardiovascular Diseases: Results from the Second National Health and Nutrition Examination Study Mortality Follow-up Study." Journal of Vascular and Interventional Neurology 2 (1): 132–5. https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3317329.

Ribeiro, Marco Tulio, Sameer Singh, and Carlos Guestrin. 2016. "'Why Should I Trust You?': Explaining the Predictions of Any Classifier." In Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (KDD '16), 1135–1144. New York: ACM Press.

Schmidt, R. A., and R. M. Nishikawa. 1995. "Clinical Use of Digital Mammography: The Present and the Prospects." Journal of Digital Imaging 8 (1 Suppl 1): 74–9.

Singh, Vivek Kumar, et al. 2020. "Breast Tumor Segmentation and Shape Classification in Mammograms Using Generative Adversarial and Convolutional Neural Network." Expert Systems with Applications 139.

Su, Jiawei, Danilo Vasconcellos Vargas, and Kouichi Sakurai. 2019. "One Pixel Attack for Fooling Deep Neural Networks." IEEE Transactions on Evolutionary Computation 23 (5): 828–841.

Taulli, Tom. 2019. "How Bias Distorts AI (Artificial Intelligence)." Forbes, August 4, 2019. https://www.forbes.com/sites/tomtaulli/2019/08/04/bias-the-silent-killer-of-ai-artificial-intelligence/#1cc6f35d7d87.

Tugend, Alina. 2019. "The Bias Embedded in Tech." The New York Times, June 17, 2019, section F, 10.

Twain, Mark. 1883. Life on the Mississippi. Boston: J. R. Osgood & Co.

Walsh, Fergus. 2020. "AI 'Outperforms' Doctors Diagnosing Breast Cancer." BBC News, January 2, 2020. https://www.bbc.com/news/health-50857759.

Wilks, Daniel S. 2008. Review of Empirical Methods in Short-Term Climate Prediction, by Huug van den Dool. Bulletin of the American Meteorological Society 89 (6): 887–88.

lucic-towards-2021 ----

Chapter 13

Towards a Chicago place name dataset: From back-of-the-book index to a labeled dataset

Ana Lucic, University of Illinois
John Shanahan, DePaul University

Introduction

Reading Chicago Reading1 is a grant-supported digital humanities project that takes as its object the "One Book One Chicago" (OBOC) program2 of the Chicago Public Library. Since fall 2001, One Book One Chicago has fostered community through reading and discussion.

1. The Reading Chicago Reading project (https://dh.depaul.press/reading-chicago/) gratefully acknowledges the support of the National Endowment for the Humanities Office of Digital Humanities, HathiTrust, and Lyrasis.

2. See https://www.chipublib.org/one-book-one-chicago/.
On its "Big Read" website, the Library of Congress includes information about One Book programs around the United States,3 and the American Library Association (ALA) also provides materials with which a library can build its own One Book program and, in this way, bring members of their communities together in a conversation.4 While community reading programs are not a new phenomenon and exist in various formats and sizes, the One Book One Chicago program is notable because of its size (the Chicago Public Library has 81 local branches) as well as its history (the program has been in continual existence for nearly 20 years). Although relatively common, book clubs and community-based reading programs are not regularly assessed as other library programming components are, nor are they often the subjects of long-term quantitative study.

3. See http://read.gov/resources/.

4. See http://www.ala.org/tools/programming/onebook.

The following research questions have been guiding the Reading Chicago Reading project so far: can we predict the future circulation of a book using a predictive model based on prior circulation, community demographics, and text characteristics? How did different neighborhoods in a diverse but also segregated city respond to particular book choices? Have certain books been more popular than others around the city as measured by branch-level circulation, and can these changes in checkout totals be correlated with CPL outreach work? A related question is the focus of this paper: by associating place names with sentiment scores in Chicago-themed OBOC books, what trends emerge from spatial analysis? Results are still in progress and will be forthcoming in future papers. In the meantime, exploration of these questions, and our attempt to find solutions for some of them, enables us to reflect on some innovative services that libraries can offer. We will discuss this possibility in the last section of this paper.

Chicago as a place name

Thus far, the Reading Chicago Reading project has focused the bulk of its analysis on seven recent OBOC book selections and their respective "seasons" of public outreach programming:

• Fall of 2011: Saul Bellow's The Adventures of Augie March

• Spring of 2012: Yiyun Li's Gold Boy, Emerald Girl

• Fall of 2012: Markus Zusak's The Book Thief

• 2013–2014: Isabel Wilkerson's The Warmth of Other Suns

• 2014–2015: Michael Chabon's The Amazing Adventures of Kavalier and Clay

• 2015–2016: Thomas Dyja's The Third Coast

• 2016–2017: Barbara Kingsolver's Animal, Vegetable, Miracle: A Year of Food Life

All of the listed works, spanning categories of fiction and non-fiction, are still in copyright. Of the seven works, three were categorized as Chicago-themed because they take place in the Chicago area in whole or in substantial part: Saul Bellow's The Adventures of Augie March, Isabel Wilkerson's The Warmth of Other Suns, and Thomas Dyja's The Third Coast.
As part of the ongoing work of the Reading Chicago Reading project, we used the secure data portal of the HathiTrust Research Consortium to access and pre-process the in-copyright novels in our set. The HathiTrust research portal permits the extraction of non-consumptive features of the works included in the digital library, even those that are still under copyright. Non-consumptive features do not violate copyright restrictions, as they do not allow the regular reading ("consumption") or digital reconstruction of the full work in question. An example of a non-consumptive feature is part-of-speech information extracted in aggregate, with or without connection to its source words. Location words (i.e., place names) in the text are another example of a non-consumptive feature as long as we do not aim to extract locations with the surrounding context: that is, while the extraction of a location word alone from a work under copyright will not violate copyright law, the extraction of the location word with its surrounding context (a fixed-size "window" of words around the location word) might do so. Similarly, the sentiment of a sentence also falls under the category of a non-consumptive feature as long as we do not extract both the entire sentence and its sentiment score. Using these methods, it was possible to utilize the HathiTrust research portal to access and extract the location words, as well as the sentiment of individual sentences, from copyrighted works. As later paragraphs will reveal, however, we also needed to verify the accuracy of these extractions, which was done manually by checking the extracted references against the actual text of the work.

This paper arises from the finding that the three OBOC books that are set largely in or are about Chicago circulated differently than the OBOC books that are not (i.e., Markus Zusak's The Book Thief, Yiyun Li's Gold Boy, Emerald Girl, Barbara Kingsolver's Animal, Vegetable, Miracle, and Michael Chabon's The Amazing Adventures of Kavalier and Clay). Since one of the findings was that some CPL branches had higher circulation for "Chicago" OBOC books than others in the program, we wanted (1) to determine which place names were featured in the three books and (2) to quantify and examine the sentiment associated with these places.

Although recognizing a well-defined place name in a text by automated means is no longer a difficult task, thanks to the development of named entity recognizers such as the Stanford Named Entity Recognizer,5 OpenNLP,6 spaCy,7 and NLTK,8 recognizing whether a place name is a reference to a Chicago location is a harder task. If Chicago is the setting or one of the main topics of the book, then we can assume that a number of the locations mentioned will also be Chicago place names. However, if information about the topicality or locality of the book is not known in advance, or if the plot of the book moves from location to location, then the task of verifying through automated methods whether a place name is a Chicago location is much harder. With the help of LinkedGeoData9 we were able to obtain all of the Chicago place names identified by volunteers through the OpenStreetMap project10 and then download a listing that included Chicago buildings, theaters, restaurants, streets, and other prominent places. While this is very useful, we also realized that we were missing historical Chicago place names with this approach.

5. See https://nlp.stanford.edu/software/CRF-NER.html.

6. See https://opennlp.apache.org/.

7. See https://spacy.io/.

8. See https://www.nltk.org/book/ch07.html.

9. See http://linkedgeodata.org/About.

10. See https://www.openstreetmap.org/.
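A minimal sketch of this two-step pipeline, generic place-name extraction followed by a gazetteer check, is shown below in Python. It uses spaCy's pretrained English model (assuming en_core_web_sm is installed) and a tiny toy stand-in for the LinkedGeoData-derived list.

# Sketch: extract candidate place names with spaCy, then keep only those
# found in a gazetteer of Chicago place names. The gazetteer here is a
# toy stand-in for the much larger LinkedGeoData/OpenStreetMap listing.
import spacy

nlp = spacy.load("en_core_web_sm")   # pretrained pipeline with an NER component
chicago_gazetteer = {"hyde park", "state street", "wrigley field", "lake michigan"}

def candidate_places(text):
    # GPE = geopolitical entities, LOC = other locations, FAC = buildings, roads, etc.
    return [ent.text for ent in nlp(text).ents
            if ent.label_ in {"GPE", "LOC", "FAC"}]

def chicago_places(text):
    return [p for p in candidate_places(text) if p.lower() in chicago_gazetteer]

sentence = "They walked from Hyde Park toward Lake Michigan, past State Street."
print(chicago_places(sentence))

As the next paragraphs explain, the hard part is not the extraction step but disambiguation: a gazetteer match on "State Street" alone cannot tell Chicago apart from the many other cities with a street of that name.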
At the same time, the way that place names are represented in a text will likely not always correspond to the way a place name is formally represented in a dictionary, database, or knowledge graph. For example, a sentence might simply use an anaphoric reference such as "that building" or "her home" instead of directly naming the entity known from other sentences. Moreover, there were many examples of generic place names: how many cities in the United States have a State Street, a Madison Street, or a 1st Avenue? A further hindrance was determining the type of place names we wanted to identify and collect from the text's total set of location word tokens: it soon became obvious that, for the purposes of visualizing a place name on the map, general references to Chicago went beyond the scope of the maps we wanted to create. We became more interested in tracking references to specific Chicago place names, including buildings (historical and present), named areas of the city, monuments, streets, theatres, restaurants, and the like. Given that our total dataset for this task comprised just three books, we were able to manually sift through the automatically identified place names and verify whether each was indeed a Chicago place name or not.

We also established the sentiment of each location-bearing sentence in the three books using the Stanford Sentiment Analyzer.11 Our guiding principle was that the specific place(s) mentioned in a sentence "inherit" the sentiment score of the entire sentence (a minimal sketch of this approach follows below). This principle may not always hold, but our manual inspection of the sentiment assigned to sentences, and therefore to the locations mentioned in them, established that this was a fairly accurate estimate: the sentiment score of the entire sentence is at the very least connected to, or "resonates" with, the individual components of the sentence, including place names. While we did examine some samples, we did not conduct a qualitative analysis of the accuracy of the sentiment scores assigned to the corpus.

11. See https://nlp.stanford.edu/sentiment/.
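Here is a minimal sketch of that sentence-level inheritance, using NLTK's VADER scorer as a convenient Python stand-in for the Stanford Sentiment Analyzer used in the project; the tiny gazetteer and the sample text are illustrative only.

# Sketch: place names "inherit" the sentiment score of their sentence.
# VADER is used as a stand-in sentiment scorer; its scores are not
# comparable to the Stanford analyzer's, but the inheritance logic is the same.
import nltk
from nltk.sentiment import SentimentIntensityAnalyzer

nltk.download("vader_lexicon", quiet=True)   # one-time lexicon download
nltk.download("punkt", quiet=True)           # sentence tokenizer models

gazetteer = {"Hyde Park", "State Street", "Lake Michigan"}
scorer = SentimentIntensityAnalyzer()

def place_sentiments(text):
    records = []
    for sentence in nltk.sent_tokenize(text):
        score = scorer.polarity_scores(sentence)["compound"]  # -1 negative .. +1 positive
        for place in gazetteer:
            if place in sentence:
                records.append((place, score))   # the place inherits the sentence score
    return records

text = ("Hyde Park was beautiful that morning. "
        "The fire on State Street was a terrible disaster.")
print(place_sentiments(text))   # one positive and one negative pair, roughly

Aggregating these (place, score) pairs is what produces maps like those in figures 13.1 and 13.2.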
The place names extracted from our three Chicago-setting OBOC books allowed us to focus on particular areas of the city such as Hyde Park on the South Side, which is mentioned in each of them.

Figure 13.2: Mapping of sentences that feature "Hyde Park," and their sentiment, from three OBOC program books.

Larger circles correspond to a greater number of sentences that mention Hyde Park and are associated with a negative sentiment in both The Adventures of Augie March and The Warmth of Other Suns. As the maps in figure 13.2 indicate, on the other hand, The Third Coast features sentences in which Hyde Park is mentioned in both positive and negative contexts.

These results prompt us to continue with this line of research and to procure a larger "control" set of texts with Chicago place names and sentiment scores. This would allow us to focus on specific places such as "Wrigley Field" or the once-famous but no longer existing "Mecca" apartment building (which stood at the intersection of 34th and State Street on the South Side and was immortalized in a 1968 poetry collection by Gwendolyn Brooks). With a robust place name dataset, we could analyze the context in which these place names were mentioned in other literature, in contemporary or historical newspapers (Chicago Tribune, Chicago Sun-Times, Chicago Defender), or in library and archival materials. Promising contextual elements would include the sentiment associated with the place name.

Our interest in creating a dataset of Chicago place names extracted from literature led us to The Chicago of Fiction, a vast annotated bibliography by James A. Kaser. Published in 2011, this work contains entries on more than 1,200 works published between 1852 and 1980 that feature Chicago. Kaser's book contains several indexes that can serve as sources of labeled data, or instances in which Chicago locations are mentioned. Although we are still determining how many of the titles included in the annotated bibliography already exist in digital format or are accessible through the HathiTrust digital library, it is likely that a subset of the total can be accessed electronically. Even if the books do not presently exist in electronic format, it is still possible to use the index as a source of already-labeled data for Chicago place names. We anticipate that such a dataset would be of interest to researchers in Urban Studies, Literature, History, and Geography. A sufficiently large number of sentences featuring Chicago place names would enable us to proceed in the direction of a Chicago place name recognizer that can "learn" Chicago context, or to examine how much context is sufficient to establish whether, for instance, a "Madison Street" place name in a text is located in Chicago or elsewhere.
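To make the index-as-labeled-data idea concrete, here is a minimal sketch of parsing flattened index entries into a machine-readable gazetteer. The "heading, pages" entry format is a hypothetical simplification for illustration, not the actual layout of Kaser's indexes.

    import re

    # Hypothetical flattened index lines in "heading, pages" form.
    index_lines = [
        "Mecca Building, 45, 112-114",
        "Hyde Park, 12, 88, 203-207",
    ]

    def parse_index_entry(line):
        """Split an index line into (heading, expanded list of page numbers)."""
        heading, *refs = [part.strip() for part in line.split(",")]
        pages = []
        for ref in refs:
            m = re.fullmatch(r"(\d+)(?:-(\d+))?", ref)
            if not m:
                continue  # skip malformed page references
            start, end = int(m.group(1)), int(m.group(2) or m.group(1))
            pages.extend(range(start, end + 1))
        return heading, pages

    gazetteer = dict(parse_index_entry(line) for line in index_lines)
    print(gazetteer["Hyde Park"])  # [12, 88, 203, 204, 205, 206, 207]

Real index headings are messier than this (subheadings, commas inside names, "see also" cross-references), which is part of why the prototype still needs troubleshooting.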
How do libraries innovate? From print index to labeled data

Over the last decade, libraries have pioneered services related to the development and preservation of digital scholarship projects. Librarians frequently assist faculty and students with the development of digital humanities and digital scholarship projects. They point patrons to resources and portals where they can find data, and they help with licensing. Librarians also procure datasets, and some perform data cleaning and pre-processing tasks. And yet it is still not common for librarians to participate in the creation of a dataset.

A relatively recent initiative, Collections as Data,12 directly tackles the issue of treating research, library, and cultural heritage collections as data and providing access to them. This ongoing initiative aims to create 12 projects that can serve as models for other libraries making their collections accessible as data.

The data that undergird the mechanisms of library workings—circulation records for physical and digital objects, metadata records, and the like—are not commonly available as datasets open to machine learning tasks. If they were, not only could libraries refer others to the already created and annotated physical and digital objects, but they could also participate in creating objects that are local to their settings. Creation and curation of such datasets could in turn help establish new relationships between area libraries and local communities. One can imagine a "data challenge," for instance, in which libraries assemble a community by building a dataset relevant to that community. Such an effort would need to be preceded by an assessment of the data needs and interests of that particular community. In the case of a Chicago place name dataset challenge, efforts could revolve around local communities adding sentences to the dataset from literary sources. A second step might involve organizing a crowdsourced data challenge to build a place name recognizer model (e.g., a Chicago place name recognizer) based on the sentences gathered. One can also imagine turning metadata records into curated datasets that are shared with local communities and with teachers and university lecturers for use in the classroom. Once a dataset is built, scenarios can be invented for using it. This kind of work invites conversations with faculty members about their needs and about potential datasets that would be of particular interest. Creating datasets based on the unique materials at their disposal will enrich the palette of services libraries already offer.

12 See https://collectionsasdata.github.io/part2whole/.

One of the main goals of the Reading Chicago Reading project was the creation of a model that can predict the circulation of a One Book One Chicago program book selection given parameters such as prior circulation for the book, its text characteristics, and the geographical locality of the work. We are not aware of other predictive models that integrate circulation records with text features extracted from the books in this way. Given that circulation records are not commonly integrated with other data sources when they are analyzed, linking different data sources with circulation records is another challenging opportunity that this paper envisions. Ultimately, libraries can play a dynamic role in both managing and creating data and datasets that can be shared with the members of local communities. Using back-of-the-book indexes as a source of labeled place name data is an approach we have begun to prototype, but it still requires further exploration and troubleshooting. While organizing a data challenge takes considerable effort, it can be an effective way of reaching out to one's local community and identifying its data needs. To this end, we aim to make freely available our curated list of sentences and associated sentiment scores for Chicago place names in the three OBOC selections centered on Chicago.
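As one illustration of what a crowdsourced sentence set could feed (this is not the project's implementation), spaCy's NER component can be trained from scratch on character-offset annotations. Everything below (the label name, the two training sentences, and the toy training loop) is hypothetical, and spaCy 3.x is assumed.

    import random

    import spacy
    from spacy.training import Example

    # Hypothetical labeled sentences: (text, character-offset entity spans).
    TRAIN_DATA = [
        ("She transferred at State Street downtown.",
         {"entities": [(19, 31, "CHI_PLACE")]}),
        ("The Mecca stood at Thirty-Fourth and State.",
         {"entities": [(4, 9, "CHI_PLACE")]}),
    ]

    nlp = spacy.blank("en")          # empty English pipeline
    ner = nlp.add_pipe("ner")
    ner.add_label("CHI_PLACE")       # custom Chicago place-name label

    examples = [Example.from_dict(nlp.make_doc(text), ann)
                for text, ann in TRAIN_DATA]
    optimizer = nlp.initialize(lambda: examples)

    for _ in range(20):              # toy loop; real training needs far more data
        random.shuffle(examples)
        nlp.update(examples, sgd=optimizer)

    doc = nlp("He caught the bus on State Street.")
    print([(ent.text, ent.label_) for ent in doc.ents])

A recognizer trained on two sentences will memorize rather than generalize; the point of the sketch is only the data shape a community-built corpus would need to supply.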
We will invite scholars and the general public to add more Chicago location sentences extracted from other literature. Our end goal is a labeled training dataset for the creation of a Chicago place name recognizer, which, we hope, will enable new avenues of research.

References

American Library Association. n.d. "One Book One Community." Programming & Exhibitions (website). Accessed May 31, 2020. http://www.ala.org/tools/programming/onebook.

Bird, Steven, Edward Loper, and Ewan Klein. 2009. Natural Language Processing with Python. Sebastopol, CA: O'Reilly Media.

Chicago Public Library. n.d. "One Book One Chicago." Accessed May 31, 2020. https://www.chipublib.org/one-book-one-chicago/.

"Collections as Data: Part to Whole." n.d. Accessed May 31, 2020. https://collectionsasdata.github.io/part2whole/.

Finkel, Jenny Rose, Trond Grenager, and Christopher Manning. 2005. "Incorporating Non-local Information into Information Extraction Systems by Gibbs Sampling." In Proceedings of the 43rd Annual Meeting of the Association for Computational Linguistics (ACL 2005), 363–370. https://www.aclweb.org/anthology/P05-1045/.

HathiTrust Digital Library. n.d. Accessed May 31, 2020. https://www.hathitrust.org/.

Kaser, James A. 2011. The Chicago of Fiction: A Resource Guide. Lanham, MD: Scarecrow Press.

Library of Congress. n.d. "Local/Community Resources." Read.gov. Accessed May 31, 2020. http://read.gov/resources/.

LinkedGeoData. n.d. "About / News." Accessed May 31, 2020. http://linkedgeodata.org/About.

Manning, Christopher D., Mihai Surdeanu, John Bauer, Jenny Finkel, Steven J. Bethard, and David McClosky. 2014. "The Stanford CoreNLP Natural Language Processing Toolkit." In Proceedings of the 52nd Annual Meeting of the Association for Computational Linguistics: System Demonstrations, 55–60. https://www.aclweb.org/anthology/P14-5010/.

OpenStreetMap. n.d. Accessed May 31, 2020. https://www.openstreetmap.org/.

Reading Chicago Reading. n.d. "About Reading Chicago Reading." Accessed May 31, 2020. https://dh.depaul.press/reading-chicago/about/.

maceli-what-2015 ----

What Technology Skills Do Developers Need? A Text Analysis of Job Listings in Library and Information Science (LIS) from Jobs.code4lib.org

Monica Maceli (mmaceli@pratt.edu), Assistant Professor, School of Information and Library Science, Pratt Institute, New York

ABSTRACT

Technology plays an indisputably vital role in library and information science (LIS) work; this rapidly moving landscape can create challenges for practitioners and educators seeking to keep pace with such change. In pursuit of building our understanding of currently sought technology competencies in developer-oriented positions within LIS, this paper reports the results of a text analysis of a large collection of job listings culled from the Code4lib jobs website.
Beginning more than a decade ago as a popular mailing list covering the intersection of technology and library work, the Code4lib organization now offers, among other things, a website that collects and organizes LIS-related technology job listings. The results of the text analysis of this dataset suggest the currently vital technology skills and concepts that existing and aspiring practitioners may target in their continuing education as developers.

INTRODUCTION

For those seeking employment in a technology-intensive position within library and information science (LIS), the number and variation of technology skills required can be daunting. The need to understand common technology job requirements is relevant to current students positioning themselves to begin a career within LIS, those currently in the field who wish to enhance their technology skills, and LIS educators. The aim of this short paper is to highlight the skills and combinations of skills currently sought by LIS employers in North America through textual analysis of job listings. Previous research in this area explored job listings through various perspectives, from categorizing titles to interviewing employers;1,2 the approach taken in this study contributes a new perspective to this ongoing and highly necessary work. This research report seeks a further understanding of the following research questions:

• What are the most common job titles and skills sought in technology-focused LIS positions?
• What technology skills are sought in combination?
• What implications do these findings have for aspiring and current LIS practitioners interested in developer positions?

As detailed in the following research method section, this study addresses these questions through textual analysis of relevant job listings from a novel dataset—the job listings from the Code4lib jobs website (http://jobs.code4lib.org/). Code4lib began more than a decade ago as an electronic discussion list for topics around the intersection of libraries and technology.3 Over time, the Code4lib organization expanded to an annual conference in the United States, the Code4Lib Journal, and, most relevant to this work, an associated jobs website that highlights jobs culled from both the discussion list and other job-related sources. Figure 1 illustrates the home page of the Code4lib jobs website; the page presents job listings and associated tags, with the tags facilitating navigation and viewing of other related positions. Users may also view positions geographically or by employer.

Figure 1. Homepage of the Code4lib Jobs Website, Displaying Most-Recently Posted Jobs and the Associated Tags.4

In addition to the visible user interface for job exploration, the website consists of software to gather the job listings from a variety of sources.
The website incorporates jobs posted to the Code4lib discussion list, American Library Association, Canadian Library Association, Australian Library and Information Association, HigherEd Jobs, Digital Koans, Idealist, and ArchivesGig. This broad incoming set of jobs provides a wide look into new technology-related postings.

New job listings are automatically added to a queue to be assessed and tagged by human curators before posting. This allows manual intervention in which a curator assesses whether the job is relevant to technology in the library domain and validates the job listing information and metadata (see figure 2). Curating is done on a volunteer basis; curators are asked to assess whether the position is relevant to the Code4lib community and whether it is unique, and to ensure that it has an associated employer, set of tags, and descriptive text. Combining software processes and human intervention in the job assessment makes it possible to gather a large number of jobs of high relevance to the Code4lib community. As mentioned earlier, Code4lib's origins are in the area of software development and design as applied in LIS contexts. These foci mean that most jobs identified as relevant for inclusion in the Code4lib jobs dataset are oriented toward developer activities. The Code4lib jobs website therefore provides a useful and novel dataset within which to understand current employment opportunities at the intersection between technology—particularly developer work—and the LIS field.

Figure 2. Code4lib Job Curators Interface Where Job Data is Validated and Tags Assigned.5

RESEARCH METHOD

To analyze the job listing data in greater depth, a textual analysis of job titles and descriptions was conducted using the R statistical package.6 First, the job listing data from the most recent complete year (2014) were dumped from the database backend of the Code4lib jobs website; this dataset contained 1,135 positions in total. The dataset included the job titles, descriptions, location and employer information, as well as tags associated with the various positions. The text was then cleaned to remove any markup tags or special characters that remained from the scraping of listings. Finally, the tm (text mining) package in R was used to calculate term frequencies and correlations, generate plots, and cluster terms across both job titles and descriptions.7

RESULTS

Job Title Analysis

Of the full set of 1,135 positions, 30 percent were titled as librarian positions; popular specialties included systems librarian and various digital collections and curation-oriented librarian titles. Figures 3 and 4 detail the most common terms used in position titles across librarian and nonlibrarian positions.
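As a rough Python analogue of the R/tm frequency counts just described (the paper's own analysis was done in R), the same tallies can be produced with scikit-learn's CountVectorizer; the titles below are invented stand-ins for the 1,135-listing dataset.

    from sklearn.feature_extraction.text import CountVectorizer

    # Invented job titles standing in for the 2014 Code4lib jobs data.
    titles = [
        "Digital Services Librarian",
        "Metadata Librarian",
        "Web Developer",
        "Systems Librarian",
        "Digital Collections Developer",
    ]

    # Tokenize, lowercase, and drop English stop words, then sum the
    # occurrences of each term across all titles.
    vectorizer = CountVectorizer(stop_words="english")
    matrix = vectorizer.fit_transform(titles)
    counts = matrix.sum(axis=0).A1  # flatten the 1 x n_terms count matrix

    freq = sorted(zip(vectorizer.get_feature_names_out(), counts),
                  key=lambda pair: -pair[1])
    print(freq[:5])  # e.g. [('librarian', 3), ('digital', 2), ...]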
Figure 3. Most Common Terms Used in Librarian Position Titles. [Bar chart; top terms: librarian (345), digital (89), systems (63), services (59), metadata (34), data (29), technologies (25), university (25), technology (23), web (21), followed by electronic, resources, assistant, information, emerging, scholarship, collections, library, management, initiatives, sciences, cataloging, projects, research, and professor.]

Figure 4. Most Common Terms Used in Nonlibrarian Position Titles. [Bar chart; top terms: digital (182), developer (141), library (116), manager (90), specialist (86), software (68), web (65), archivist (59), services (59), technology (59), followed by engineer, director, data, systems, analyst, coordinator, information, senior, metadata, administrator, lead, project, head, programmer, and research.]

The most popular job title terms were then clustered using Ward's agglomerative hierarchical method (dendrogram in figure 5). Agglomerative hierarchical clustering, of which Ward's method is a widely used variant, begins with single-item clusters, then identifies and joins similar clusters until the final stage, in which one larger cluster is formed. Commonly used in text analysis, this approach allows the investigator to explore datasets in which the number of clusters is not known before the analysis. The dendrograms generated (e.g., figure 5) allow for visual identification and interpretation of closely related terms representing various common positions, e.g., digital librarian, software engineer, collections management, etc. Given that job titles in listings may include extraneous or infrequent words, such as the organization name, the cluster analysis can provide an additional view into common job titles across the full dataset in a more generalized fashion.

Figure 5. Cluster Dendrogram of Terms Used in Job Titles Generated Using Ward's Agglomerative Hierarchical Method.
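Ward's method is available off the shelf in SciPy's hierarchical-clustering routines. The sketch below (toy count vectors and invented terms, not the study's data) mirrors the approach described above: start from single-term clusters, repeatedly merge the pair whose union increases within-cluster variance the least, and read related terms off the dendrogram.

    import numpy as np
    from scipy.cluster.hierarchy import dendrogram, linkage
    import matplotlib.pyplot as plt

    # Toy term profiles; in the study each row would describe how a title
    # term is distributed across the job listings.
    terms = ["librarian", "digital", "systems", "developer", "web"]
    vectors = np.array([
        [3, 1, 0, 2, 1],
        [2, 2, 1, 1, 0],
        [1, 0, 2, 0, 1],
        [0, 1, 1, 3, 2],
        [0, 1, 0, 2, 3],
    ])

    # Ward linkage: agglomerative merging that minimizes the increase in
    # within-cluster variance at every step.
    Z = linkage(vectors, method="ward")

    dendrogram(Z, labels=terms)  # visual identification of related terms
    plt.tight_layout()
    plt.show()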
Tag Analysis

As described earlier, the Code4lib jobs website allows curators to validate and tag jobs before listing. The word cloud in figure 6 displays the most common tags associated with positions, with XML being the most popular tag (178 occurrences). Figure 7 contains the raw frequency counts of common tags observed.

Figure 6. Word Cloud of Most Frequent Tags Associated with Job Listings by Curators.

Figure 7. Frequency of Commonly Occurring Tags (frequency of fifty occurrences or more) in the 2014 Job Listings. [Bar chart; top tags: XML (178), JavaScript (155), PHP (152), Metadata (142), HTML (125), Archive (119), Cascading Style Sheets (114), Python (106), Integrated library system (101), Java (99), followed by MySQL, Dublin Core, MARC standards, Encoded Archival Description, Ruby, Drupal, Project management, SQL, Metadata Object Description Standard, Data management, GNU/Linux, Digital preservation, Perl, Digital library, XSL Transformations, Resource Description and Access, Digital repository, World Wide Web, Management, DSpace, and METS.]

Job Description Analysis

The job description text was then analyzed to explore commonly co-occurring technology-related terms, focusing on frequent skills required by employers. Figures 8, 9, and 10 plot term correlations and interconnectedness. Terms with correlation coefficients of 0.3 or higher were chosen for plotting; this common threshold broadly included terms whose positive relationships ranged in strength from moderate to strong.

Plots were created to express correlations around the top five terms identified from the tags: XML, JavaScript, PHP, metadata, and HTML (frequencies in figure 7). Any number of terms and frequencies can be plotted from such a dataset; to orient the findings closely around the job listing text, a focus on the top terms was chosen. These plots illustrate the broader set of skills, represented in the job listings, that surround these vital competencies.

Figure 8. Job Listing Terms Correlated with "XML" (most popular tag).

Figure 9. Job Listing Terms Correlated with "JavaScript" (second most popular tag), including "PHP" and "HTML" (third and fifth most popular tags, respectively).

Figure 10. Job Listing Terms Correlated with "Metadata" (fourth most popular tag).

Finally, a series of general plots was created to visualize the broad set of skills necessary in fulfilling the positions of interest to the Code4lib community. As detailed in the title analysis (figures 3 and 4), apart from the generic term librarian, the two most common terms across all job titles were digital and developer. Correlation plots were created to detail the specific skills and requirements commonly sought in positions using such terms. Figure 11 illustrates the terms correlated with the general term developer, while figure 12 displays terms correlated with digital. The implications of these findings are discussed further in the following section.

Figure 11. Job Listing Terms Correlated with "Developer."

Figure 12. Job Listing Terms Correlated with "Digital."
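In the tm package, the term associations plotted in figures 8 through 12 are the kind of output its findAssocs() function reports; a rough pandas equivalent, with placeholder job descriptions rather than the 2014 listings, is sketched below.

    import pandas as pd
    from sklearn.feature_extraction.text import CountVectorizer

    # Placeholder job descriptions standing in for the 2014 listings.
    docs = [
        "XML metadata schemas and XSLT experience required",
        "JavaScript PHP and HTML for library web interfaces",
        "metadata standards XML Dublin Core cataloging",
        "web developer with JavaScript HTML CSS PHP",
    ]

    vectorizer = CountVectorizer()
    dtm = pd.DataFrame(vectorizer.fit_transform(docs).toarray(),
                       columns=vectorizer.get_feature_names_out())

    # Pearson correlation between the "xml" column and every other term
    # column of the document-term matrix; keep partners at or above the
    # paper's 0.3 threshold.
    correlations = dtm.corr()["xml"].drop("xml")
    print(correlations[correlations >= 0.3].sort_values(ascending=False))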
DISCUSSION

Taken as a whole, the job listing dataset covered a dramatic range of positions, from highly technical (e.g., senior-level software engineer or web developer) to managerial and leadership roles (e.g., director or department head roles centered on digital services or emerging technologies). These findings support the suggestions of earlier research,8 which advocated for LIS graduate programs to build their offerings not just in technology skills but also in technology management and decision-making. However, the Code4lib jobs dataset is a one-dimensional view into the employment process and is focused largely on the developer perspective. Additional contextual information, including whether suitable candidates were easily identified and whether the position was successfully filled, would provide a more complete view of the employment process. Prior research has indicated that many technology-related positions in LIS are in fact difficult to fill with LIS graduates.9 While LIS graduate programs have made great strides in increasing the number of courses and topics covered that address technology, these improvements may not benefit those already in the field or those wishing to shift toward a more technology-focused position.

In the common tags and terms analysis, experience with specific LIS applications was relatively infrequently required, with the Drupal content management system a notable exception. More generalizable programming languages and concepts, e.g., Python, relational databases, XML, etc., were favored. As with technology positions outside of the LIS domain, employers likely seek those with the ability to flexibly apply their skills across various tools and platforms. This may also relate to the above challenges in filling such positions with LIS graduates, with the goal of opening up the position to a larger technologist applicant base.

Common web technologies popular in the open-source software often favored by LIS organizations continued to dominate, with a clear preference for candidates well versed in HTML, CSS, JavaScript, and PHP. Relatedly, web development and design practices were often intertwined, with positions requesting both developer-oriented skillsets and interface design skills (e.g., figure 7). Technologies supporting modern web application development and workflow management were evident as well, e.g., common requirements for experience with versioning systems such as Git, popular JavaScript libraries, and development frameworks. Also striking was the richness of the terms correlated with metadata (figure 10), including mention of growing areas of expertise, such as linked data.

Interestingly, the general correlation plots expressing the common terms sought around "digital" and "developer" positions were quite varied. While the developer plot (figure 11 above) provided a richly technical view into common technologies broadly applied in web and software development, the terms correlated with digital were notably less technical (figure 12 above). While there was a clear focus on digital preservation activities and common standards in this area, mention of terms such as "grant" indicated that these positions likely have a broad role. The term digital was frequently observed in librarian job titles, so these roles may be tasked with both technical and administrative work.

Finally, there are inherent difficulties in capturing all jobs relating to technology use in the LIS domain, and these introduce limitations into this study.
While the incoming job feeds attempt to broadly capture recent job posts, it is possible that jobs are missed or overlooked by the job curators. Given the lack of a centralized job-posting source in any field, this is a common challenge for research attempting to assess every job posting. And as mentioned above, there is also a lack of corresponding data as to whether these jobs were successfully filled and which candidate backgrounds were ultimately chosen (i.e., from within or outside of LIS).

CONCLUSION

This assessment of in-demand technology skills provides students, educators, and information professionals with useful direction in pursuing technology education or strengthening their existing skills. There are myriad technology skills, tools, and concepts in today's information environments. Reorienting the pursuit of knowledge in this area around current employer requirements can be useful in professional development, new course creation, and course revision. The constellations of correlated skills presented above (figures 8–12) and popular job tags (figure 7) describe key areas of technology competency in the diverse areas of expertise presently needed, from web design and development to metadata and digital collection management. In addition to the results presented in this paper, the Code4lib jobs website provides a continuously current view into recent jobs and related tags; this data can help those in the LIS field orient professional and curricular development toward real employer needs.

ACKNOWLEDGEMENTS

The author would like to thank Ed Summers of the Maryland Institute for Technology in the Humanities for generously providing the jobs.code4lib.org dataset for analysis.

REFERENCES

1. Janie M. Mathews and Harold Pardue, "The Presence of IT Skill Sets in Librarian Position Announcements," College & Research Libraries 70, no. 3 (2009): 250–57, http://dx.doi.org/10.5860/crl.70.3.250.

2. Vandana Singh and Bharat Mehra, "Strengths and Weaknesses of the Information Technology Curriculum in Library and Information Science Graduate Programs," Journal of Librarianship & Information Science 45, no. 3 (2013): 219–31, http://dx.doi.org/10.1177/0961000612448206.

3. "About," Code4lib, accessed January 6, 2014, http://jobs.code4lib.org/about/.

4. "code4lib jobs: all jobs," Code4lib Jobs, accessed January 12, 2015, http://jobs.code4lib.org/.

5. "code4lib jobs: Curate," Code4lib Jobs, accessed January 17, 2015, http://jobs.code4lib.org/curate/.

6. R Core Team, R: The R Project for Statistical Computing, 2014, http://www.R-project.org/.

7. Ingo Feinerer and Kurt Hornik, "tm: Text Mining Package," 2014, http://CRAN.R-project.org/package=tm.

8. Meredith G. Farkas, "Training Librarians for the Future: Integrating Technology into LIS Education," in Information Tomorrow: Reflections on Technology and the Future of Public & Academic Libraries, ed. Rachel Singer Gordon (Medford, NJ: Information Today, 2007), 193–201.

9. Mathews and Pardue, "The Presence of IT Skill Sets in Librarian Position Announcements."
mathews-think-2012 ----

THINK LIKE A STARTUP: A white paper to inspire library entrepreneurialism

Brian Mathews, Associate Dean for Learning & Outreach at Virginia Tech, www.brianmathews.com

April 2012

We don't just need change, we need breakthrough, paradigm-shifting, transformative, disruptive ideas.

"Don't think about better vacuum cleaners, think about cleaner floors." That's what I frequently remind my staff during our brainstorming sessions. Get beyond what's familiar. It's easy to just focus on making small tweaks to existing services, rather than considering the bigger, bolder, broader possibilities. Vacuum-cleaner-thinking is about asking: "How do we make it better?" A stylish new design? Stronger suction? Larger capacity? Attachments? Quieter motors? It's all about building better features. And there's nothing wrong with that. In fact, we should definitely strive for incremental improvement; but we have to go beyond that. We have to exceed our imaginations. We can't just find new ways of doing the same old things. What we really need right now are breakthrough, paradigm-shifting, transformative, and disruptive ideas. When searching for "what's next" we can't focus on building a better vacuum cleaner; rather, we need to set our minds to maintaining cleaner floors. That's the real question at hand. It's not about adding features, but about new processes. It's not about modifying the reference desk model or purchasing ebooks. That's just more of the same, but a little different. Instead we ought to consider a more central question: how can libraries support 21st century learners? Follow that thread and you'll find transformative change.

We have to face the future boldly. We have to peer upwards and outwards through telescopes, not downwards into microscopes. Over the next decade we need to implement big new ideas; otherwise the role of the library will become marginalized in higher education. We'll become the keepers of the campus proxy, rather than information authorities. We'll become just another campus utility like parking, dining services, and IT, rather than the intellectual soul of the community. Now is the time to "zoom out" rather than "zoom in."1 Let's not pigeonhole ourselves into finite roles, such as print collections, computer labs, or information literacy. These self-imposed limitations will only ensure our vulnerability and gradual decline. We can't abide by the dictionary definition of "library." We can't stay basically the same and only make small changes. Not only will that constrain the library, but it will also hold back scholarship and learning. With or without us, the nature of information, knowledge creation, and content sharing is going to evolve. It's already happening. Which side of the revolution will we be on?

Dyson offers beautiful state-of-the-art vacuum machines. Their tools are top of the line. But ultimately, it's still a chore to push a vacuum cleaner around the floor. If we're talking about transformative ideas then iRobot is the place to focus your attention. Their machines are autonomous. Vacuuming isn't a chore; it's just something that happens while you sleep, work, or run errands. Their focus isn't on providing new hardware, but on providing an ingenious system that cleans surfaces for you. Carpets. Tiles. Hardwood. Pools. The Roomba is a revolution! It's a new way of thinking. It's solving a problem in a different way.
And that's what we need right now. We need to reinvent not just what we do, but how we think about it. This document is intended to inspire transformative thinking using insight into startup culture and innovation methodologies. It's a collection of talking points intended to stir the entrepreneurial spirit in library leaders at every level.

Is Higher Education Too Big to Fail?

Flip through the headlines and you'll see that there is much to be concerned about: bankruptcy,2 mergers,3 and closures.4 Even Harvard is reducing library hours and laying off staff.5 While state budgets swing between bad and worse, something else is happening-- something more than just financial hardship. Higher education is facing increasing public criticism, and it's possible (perhaps even inevitable) that the bubble is going to burst.6 Of course it won't vanish; it will just evolve, like everything does, but traditional educational delivery is about to be disrupted.7 New options are emerging such as StraighterLine, UnCollege, and Udacity.

There is no shortage of doom and gloom scenarios for the academic library.8 I hate adding more to the pile, but let's face it: we're vulnerable. While many of the services we provide are indeed essential to the academic mission, nothing says in stone that they must remain under our domain:

• What if Residence Halls and Student Centers managed learning commons spaces?
• What if the Office of Research managed campus-wide electronic database subscriptions and on-demand access to digital scholarly materials?
• What if Facilities managed the off-campus warehouses where books and other print artifacts are stored?
• What if the majority of scholarly information becomes open? Libraries would no longer need to acquire and control access to materials.
• What if all students were given eBook readers and an annual allotment to purchase the books, articles, and other media necessary for their academic pursuits and cultural interests?9 Collections would become personalized, on-demand, instantaneous, lifelong learning resources.
• What if local museums oversaw special collections and preservation?
• What if graduate assistants, teaching fellows, post-docs, and undergraduate peer leaders managed database training, research assistance, and information literacy instruction?
• What if the Office of Information Technology managed computer labs, proxy access, and lending technology and gadgets?

Some of these are real possibilities over the next twenty years. Colleges and universities are highly competitive environments; everyone wants to expand, but funding is limited. If financial resources continue to decrease (as we expect they will at public institutions) we're likely to see some large-scale reorganization and reallocation take place.10 In the future you may still work as a librarian, just not in a traditional physical library. Many of the things we currently do could be assimilated elsewhere. This is why we need to be open about the definition of what an academic library is and focus on what people need it to become. How do we help the individuals at our institutions become more successful? That's the goal. Our jobs are shifting from doing what we've always done very well, to always being on the lookout for new opportunities to advance teaching, learning, service, and research.

Change is going to be difficult, but the good news is that we know it's necessary. Glance through the academic library job postings and you'll see what I mean. Over and over again the word innovation pops up.
There is a huge demand for librarians who "think different." In fact, this theme of change has become a part of our landscape. Change is the new normal. Change is the only constant. Here is a sampling from some current ARL job listings:

• ever-changing environment
• an evolving program of research services
• changing user preferences
• receptive to and fostering new ideas
• nimble
• adaptive
• flexible
• self-starter

Of course, this leads to a lot of controversy. Take collections, for example. Several years ago it was impossible to imagine a research library without a significantly massive collection in print. Now I can't envision a future without the majority of scholarly content being digital. But this isn't just about books; it's about libraries redefining what a collection is. As information migrates to digital platforms, let's imagine what's next:

Innovators Wanted
• Mobile computing in everyone's hands.
• An iTunes-like interface for quickly acquiring and accessing content anytime, anywhere, on any device.
• Facebook-like communities for students and scholars to discover, build, publish, and share new knowledge.
• Google-like search capabilities across millions of books, articles, and multimedia.

This is what I'm hearing around campus. This is what students, researchers, and administrators expect us to offer. This is the future they want to see. And if we don't do it someone else will.

Perhaps our future isn't centered on access to content, but rather on the usage of it. Maybe there is a greater emphasis on community building, connecting people, engaging students, assisting researchers, and advancing knowledge production? Are academic libraries too important to fail? Maybe. If we remain steeped in nostalgia then I think we're in trouble. At some point we have to take a leap into the future. Our focus can't just be about adding features, but about redefining and realigning the role and identity of the academic library. We can't map our value to outdated needs and practices; instead, we must intertwine ourselves with what's needed next. It's time to innovate.

We're looking for people who are comfortable with change. We're looking for people who can innovate. But is that what we really want? Innovation is messy. It takes many wild ideas that flop in order to find transformative gold. Innovation demands leaders who are persistent and who can challenge the status quo.11 Innovation requires organizations to live in liminality. Is your library ready for disruption? We can't hire a few creative and improvisational individuals and expect them to deliver new service models if the work culture is not ready for them. We can't expect entrepreneurialism to flourish in a tradition-obsessed environment. We can't just talk about change; it must be embedded in the actions of employees. Innovation is a team sport that has to be practiced regularly. So how do we get there?

Think Like a Startup

To become innovative organizations we need to emulate innovative organizations. Startups are a perfect model for guiding this change. The media and pop culture provide us with romanticized visions of dorm room ideas becoming billion dollar IPOs. And indeed, that does happen sometimes, but startups are more than rags-to-riches stories. In concise terms: startups are organizations dedicated to creating something new under conditions of extreme uncertainty.12 This sounds exactly like an academic library to me.
Not only are we trying to survive, but we're also trying to transform our organizations into a viable service for 21st century scholars and learners. Here are a few considerations:

Startups condition us for constant change. It's not about what's-now but about what's-next. Startups probe for new possibilities. They examine what else needs to be done and then launch a path for that destination. Thinking like a startup positions us to think aspirationally about change. It requires and rewards innovation and creativity. It causes us to constantly reevaluate our organization, purpose, and drive: not against what it is or what it has been, but against what it needs to become.

Startups are about building a platform, not necessarily profit. Obviously for businesses, financial validation is necessary for survival, but the incubation stage is more about trying to develop good ideas into working models. The film The Social Network provides a dramatic representation of this situation. The co-founders of Facebook ponder its future. One of them wants to monetize right away, while the other insists, "We don't even know what it is yet." That's where we are with the future of academic libraries. We're still in the early stages of our next evolution. It's too early to know what libraries will become, but we know they'll never be the same. Rather than getting bogged down with a definition, the time is ideal for launching new products, programs, and partnerships. The library is not a building, a website, or a person; it is a platform for scholars, students, cultural enthusiasts, and others who want to absorb and advance knowledge.

Startups provide us a framework for action. They give us a way to analyze what we do, why we do it, and how we might implement change. The lean startup methodology accelerates discovering possibilities, addressing needs, and proposing solutions. Whether launching new initiatives or addressing existing ones, the startup mindset challenges us to test and validate our assumptions.

Lastly, startup is a culture. It bonds us together. It connects us with our users. It forces us beyond satisfaction metrics and into the difficult but rewarding position of needs-based librarianship. Our profession invests a lot of time measuring how well we did, and hardly any time leap-frogging into what is going to be important in the future. Embracing startup culture is embracing a forward-thinking, future-oriented perspective. What can we create today that will be essential tomorrow?

Most Startups Fail; Learn From the Ones That Didn't

If most startups fail, then why should we follow their lead? Indeed, studies suggest that as many as nine out of ten of these companies fall apart.13 But let's flip that question and ask: what can we learn from the 10% that succeed? What did they do right? How did they think and act differently? The Lean Startup methodology addresses this perspective.14 Here are a few key insights:

Fail Faster, Fail Smarter

Investing too much time on something that doesn't work is a common startup mistake. Their concepts are not viable, but they don't discover that until it is too late. Instead, build "failure" or adjustment into the process. Seek to validate your ideas early on and then expand, edit, and revise them along the way.

Good Enough is Good Enough to Start

New ideas are exciting. You want to launch them as quickly as possible, but often you might feel "it's just not ready yet." That's a surefire way to inhibit success. Instead, distill the concept into a raw form and then go with it. Get it into others' hands and see what happens.
If you are too hung up on creating policies and procedures, workflows and logistics, wordsmithing and committee debates, then your idea doesn't stand a chance. The project will stall out before you can even find out if it's worth all the effort. When it's good enough, go with it. Build upon success. That should be your initial objective. In the business literature they call this the minimum viable product. In Web 2.0 the motto is: everything is beta.

Feed the Feedback Loop

Real estate is driven by location, location, location. With innovation it's iteration, iteration, iteration. Your outlook should be to grow your idea by constantly building feedback into the developmental process. Let potential customers help nurture the concept to make it better. Don't just cook it up in your office or meeting rooms-- test it in the field.

Pivot Toward Success

You might begin traveling along one path but need to change the route in order to reach the destination. In fact, you might even need to change your destination. Successful startups are attuned to this. Facebook moved beyond just a college-oriented social network. Groupon shifted from social activism to social shopping. Realizing when you may need to pivot your idea in a new direction is critical toward cultivating innovation. Let the idea grow naturally. Don't force it to become something it doesn't want to be.

Don't Get Stuck Following Plan A; Instead Get to A Plan That Works15

Who doesn't love following a great plan? Crossing off completed tasks. Reaching milestones. Launching on deadline. The problem, though, is that while we can follow a plan perfectly, that doesn't mean it's a good plan. We can follow a good plan right off a cliff. We can miss out on new opportunities because we're too busy following the prescribed strategy. Instead, the goal should be to draft a good Plan A with the intention of it helping us get to plans B, C, and D.

Plant Many Seeds16

Instead of focusing on one perfect idea, try lots of decent ideas instead. See what works and what doesn't. See what gains interest or has a positive impact. Nurture the projects that show the most potential.

Seize the White Space17

What isn't being done? What opportunities exist to help people in new ways? Don't limit your innovation to traditional library boundaries, but consider the entire teaching, learning, and research enterprise. What are the areas of untapped potential? Translation services? 3D printing? Experimental classrooms? An important local collection? How might we fill a new role and not only expand the library's portfolio, but also empower people by addressing unmet needs?

Build, Measure, Learn: The Methodology

The lean startup method encourages a phased process right from the start.18 Building, measuring, and learning are integrated into the workflow. Changes to the idea, product, or service are expected and required. This is how it works: you take your initial concept and develop it into a shareable format. Test it and analyze the reaction. You then use this insight to build a better prototype. Repeat the process. Iterate forever. The aim isn't to develop a finished product, but to continuously evaluate and evolve the concept. This cycle of rapid development keeps you on track for constant improvement instead of clinging to services that are no longer needed. While this process is ideal for software development, it also works well in other areas.
For example, the Newman Library at Virginia Tech experimented by hosting writing center tutors at a table in a commons area. Based upon this successful trial, the writing center staff left their former location and set up shop in the library full-time. During the incubation period they tested the concept: location, staffing, hours of operation, publicity, perceived value, etc. The resulting insight enabled the library and writing center to flesh out a successful concept before committing money and floor space. Thinking like a startup means getting your idea out quickly. Test it, improve it, and then try it again. And then repeat the process, refining the concept along the way.

A variation of this model comes from the user experience domain and argues for shifting the order of steps to Learn, Build, Measure.19 This sequence places a greater emphasis on investing a small amount of time upfront engaging people. After learning about any potential problems, address those needs by either tweaking the idea or pivoting the concept. Next, measure behaviors or perceptions and gain insights from actual usage. This will then stimulate another round of learning, building, and measuring.

Perhaps you already employ a form of this model. The point is to make it explicit in your operations. Whether launching a new service, developing a new space, or reviewing current workflows, build this continuous feedback loop into your process. The cycles should be more frequent at first and then taper off, but the important thing is to stay focused on constant improvement: growing and pivoting, expanding and contracting. This practice of constant refinement will challenge us to think about what's next rather than just clinging to what's worked before.

The NCSU Libraries have long practiced this kind of entrepreneurial development.20 Let's look at two examples:

D. H. Hill Library Learning Commons

During the early stages of their Commons development the library ran into a funding delay and was consequently left with a large open space. To bridge the gap, the library provided hundreds of beanbags. This temporary solution was fortuitous because it opened their eyes to what the library needed to become. Students were drawn to the open space and started bringing their own accessories and furniture. Watching the way the area was used, the librarians realized their initial plan was flawed; the way that students used the space was completely different than originally anticipated. NCSU had greatly underestimated the desire for social learning and collaboration. The architect was able to adjust the design, and they eventually constructed an environment more attuned to user preferences. The Libraries have since incorporated user-driven insight to inform all subsequent renovations.

Web Initiatives21

NCSU uses a variation of the Build, Measure, Learn method with many of its online projects as well. New digital collections are often rolled out quickly and then enhanced over time, making extensive use of web analytics and tracking on individual interfaces to review how the systems are being used. The NCSU Libraries have increasingly taken the approach of developing their applications in such a way that they generate the kind of data necessary to evaluate how the tool, content, or service is being used, so staff can respond to emerging patterns of use. They can grow the initiative according to what their users need it to become.
"Entrepreneurship is similar to a science experiment; you're constantly creating and testing new theses and seeing what works." That's the advice from Bob Summer, founder of TechPad, a Blacksburg startup co-working office space.22 Bob has been involved with startups from many dimensions, as a founder as well as a venture investor. At TechPad he is more than a property manager, serving as a mentor to several early-stage companies. He believes that successful ideas can be boiled down to three essential qualities: usability, feasibility, and value.23 If your concept is lacking one of these attributes, it's less likely to succeed. Some examples:

Open Floor Plans

A library I worked in wanted to offer a flexible, customizable commons environment. High-end designer tables and chairs were installed that were lightweight, on casters, and very easy to move. From a cost and square footage standpoint this was feasible to make happen. In terms of value, many students enjoyed being able to create the type of space they needed on the fly. However, usability was questionable. While it was easy to move furniture around, the problem was excessive mobility. Students often left the tables and chairs in arrangements that were chaotic, confusing, and unnavigable.

Exam Cram

During finals week I often observed small groups cramming together for their last minute preparations before tests. I wanted to enhance this, especially for large general classes like biology and calculus. My concept: what if you could study with your friends, and a few others, and have the session facilitated by a teaching assistant? There was great value in this venture because many campus units partnered with us, and students turned out to take advantage of the program. It also had great usability because it worked well. Students discovered the program, found the locations, and commented that it helped them prepare for their tests. The issue was feasibility; it couldn't scale. Some sessions had over 75 students show up but only enough room for 25. We encountered some reliability issues, too. Some teaching assistants didn't show up, and this caused anger, disappointment, and anxiety among the students. While the concept was good, the library was limited in its ability to coordinate and scale to the demand.
The Exam Cram concept spun off from the library into the dining halls and dorms where it was more manageable and linked to the living-learning community. And the library that experimented with Skype gained insight about user preferences and were able to focus service toward anonymous and mobile platforms like instant messaging and texting. We have to look at our efforts beyond successes or failures, beyond black and white, and be comfortable with gray. We have to give our ideas enough time and room to grow. And we have to learn when to let them go. Building on the core elements of usability, feasibility, and value greatly increases the likelihood of developing ideas that people will adopt. Three Essential Qualities of Inspiring Products “Entrepreneurship is a lot like to a science experiment; you’re constantly creating and testing new theses and seeing what works.” Usability. Feasibility. Value. Iteration. Iteration. Iteration. Open Floor Plans Exam Cram Skype a Librarian24 7 Too Much Assessment, Not Enough Innovation We invest a lot of time, money, and effort into metrics. Entire journals and conferences are dedicated to library assessment. There are assessment librarian positions and even assessment departments. It’s obviously something we believe in. But does it work? Does it matter? Does it produce something useful? Does it encourage innovation? Does it nurture breakthrough, paradigm-shifting, transformative ideas? Or put another way: if we stopped all of our assessment programs today would our patrons notice anything different tomorrow? I’ll admit that I’ve grown skeptical of traditional library assessment. After spending time with startup founders and other entrepreneurs, as well as market researchers from Fortune 500 companies, I think it boils down one central difference: we’re asking the wrong questions. The problem with traditional library assessment is that it’s predominantly linked to satisfaction and performance. We’re focused on things like: how many articles are downloaded, how many pre- prints are in the repository, how many classes do we teach, or how our students feel about the library commons. This is all well and good. Obviously we want to measure and learn from how well our current services, processes, and products are performing. That’s just the tip of the iceberg. We stop short of discovering real transformative insights. We don’t ask big enough questions. We don’t follow the rabbit down the hole. We don’t break out of our comfort zones. We don’t seek out disruption. We’re too focused on trying to please our users rather than trying to anticipate their unarticulated needs. Assessment isn’t about developing breakthrough ideas. In short: we focus on service sustainability rather than revolutionary or evolutionary new services. As we think about the direction libraries are heading, the focus can’t remain on how well we’re doing right now, but on where we should be heading. It’s not about making our services incrementally better, but about developing completely new services and service models. Instead of assessment, we need to invest in R&D. We need to infuse the entrepreneurial spirit into our local efforts and into our professional conversations. R&D empowers us to move away from our niche and dabble in new arenas. Let’s take a look at instruction. 
Instead of continuing the library-centered perspective of infusing information literacy (something that we feel is critical) into the classroom, we could take a more empathic or user-sensitive approach of understanding the common barriers that students face with their assignments and then build instructional support to address these needs. We could take that even further by imagining the types of tools and services that would enable students to be more successful: project management, resource sharing, discovery tools and filters, processes for synthesizing information, and so forth. This more user-focused (as opposed to information-focused) approach moves us closer to addressing actual needs and further associates the library with user perceptions of scholarly achievement. The need for R&D isn’t new. Skunk works operations, or independent teams working on secret projects, have been proposed for libraries before.25 But we need more than just “the innovation department”: we need a culture of innovation. We need to encourage everyone at every level to be on the lookout for breakthrough, paradigm-shifting, transformative ideas. Innovation needs to happen out in the open. It needs to be in everyone’s job description.

A Strategic Culture (Instead of a Strategic Plan)

Many library strategic plans read more like to-do lists than entrepreneurial visions. With all the effort that goes into these documents, I’m not sure that we’re getting a good return. You can easily pick out who wrote which parts: there is a section for public services, a section for technical services, something about information literacy, something about open access, something about providing service excellence. These are highly predictable documents. They don’t say: we’re going to develop three big ideas that will shift the way we operate. They don’t say: we’re going to delight our patrons by anticipating their needs. They don’t say: we’re going to transform how scholarship happens. They don’t attempt to dent the universe. A common strategy for innovation is the “copy-and-paste” method: see what others are doing and then follow suit. Alter the name or modify the template, but largely our ideas come from other libraries. I observed this narrow-sightedness when I led a User Experience (UX) unit.29 Numerous librarians and administrators contacted me to inquire about my position. They remarked that they wanted to develop a similar position but didn’t know exactly what I did. UX was a sexy title back then, and many libraries felt the need to jump on the bandwagon without understanding what it was. Sadly, over the last few years the user experience librarian trend has evolved into a website design, usability, and analytics role rather than one focused on improving the patron’s total library experience. Another example is the information/learning commons model. Here is the formula: lots of computers with software + designer furniture + café + research & tech help = a commons. Similar to UX librarians, every academic library had to have a learning commons over the last decade. We’re a copy-and-paste profession. When I’ve asked librarians about their design principles, critical success factors, or cultural and pedagogical outcomes, they look at me strangely. We don’t typically link science and psychology to the spaces we develop. It’s easier to just select from the Steelcase or Herman Miller catalog without having a narrative behind what’s being developed.
Too often our renovations are about refreshing the space instead of revitalizing the way the organization operates. Being strategic should be about pushing the boundaries. Instead, you are more likely to see something like “embed information literacy into the curriculum” rather than “build a curriculum to prepare students for 21st century literacies.” Stretching, not sustaining. A strategic instructional venture isn’t about just training students how to search database interfaces, but about building their fluency with data, visual, spatial, media, information, and technology literacies. This is how we can advance the role of the library. This is how we transform scholarship. Here are some approaches to get you started:

Academic Librarianship by Design.26 Steven Bell and John Shank adapted the IDEO design-thinking method for the library environment. Innovation is a process: understand, observe, visualize, evaluate, refine, and implement. They argue for a more holistic approach to librarianship, with goals such as improving faculty collaboration, connecting with learners, and taking on leadership to integrate the library into the total learning process.

Studying Students.27 Nancy Foster and Susan Gibbons (and their staff) experimented with ethnographic techniques as a means of better understanding their student population. Anthropological methods of observation and community-study have blossomed in our field. This book reflects on involving library personnel in the process.

The Starbucks Experience.28 Joseph Michelli provides insight into the principles that propelled Starbucks by turning ordinary experiences into extraordinary ones. His vision is based on the process of making a personal connection with people through a framework based on connecting, discovering, and responding. This transforms patrons into people and makes library usage personal. By focusing on relationship building instead of service excellence, organizations can uncover new needs and be in position to make a stronger impact.

Xerox provides us with a great example of strategic thinking.30 After dominating the marketplace with photocopiers and printers, they realized they needed to change. The rise of digital communications was impacting their core business, and instead of just building better hardware, they expanded their identity. Xerox evolved from being a photocopy company to one that emphasizes business support services. They developed new areas such as document management, IT outsourcing, HR and accounting support, and data entry. They redefined themselves not by better document reproduction, but by becoming an integral partner in business operations infrastructure.31 We need to undergo a similar transformation. What’s the role for the library beyond providing access to information and a space to study? How can we make an impact on the teaching and learning process? How can we become an integral partner with faculty involved in the business of research? How can we stimulate knowledge production and sharing? These are the important questions that we need to ask. This is the important work that we need to figure out. This isn’t just about books migrating from print to digital platforms; rather, it’s about libraries staking a claim in other parts of the scholarly enterprise.
The most vital component to our success and survival is building a culture that inspires a strategic mindset -- a culture that embraces and rewards imagination, experimentation, teamwork, and initiative. The best way to do that is to fund it.32 Library administrators should serve as venture capitalists investing in creative concepts that show promise. They should invest in ideas that are usable, feasible, and valuable. And they should invest in projects that are iterative and adapt to changes along the way. This investment should extend beyond project funding to include recruiting and developing talent and skill sets too. Administrators who aspire to be forward-thinking, user-focused, and entrepreneurial should demonstrate to their organizations that they are willing to embrace bold ideas that might not work out as planned. Startup culture is an attitude. It’s the responsibility of the administration to foster and inspire the entrepreneurial spirit. It’s the role of librarians and staff to push the boundaries, to find what’s next, and to redefine our profession. Libraries need to be a cause, a purpose, and the reason you get out of bed and are excited to get to work.34 Libraries are about people, not books or technology. It’s about the outcome for patrons interacting with everything we do and offer. If we are seeking breakthrough ideas that change service paradigms, then we need to be ready for disruption. If we’re serious about innovation, then we need to go “all in” and can’t only bet on sure things. Entrepreneurialism is a cultural imperative, not something that should only happen in small pockets of your organization. Or as Steve Jobs preached, we need to strive to “dent the universe,” “build the impossible,” and offer “insanely great” services, products, and spaces.35 Until then we’re just building a better vacuum cleaner, rather than building breakthrough ideas.

Sidebar: How innovative is your library?33 Innovators: experimenting with 3D printing. Early Adopters: building visualization services. Early Majority: migrating to demand-driven acquisitions. Late Majority: offering text reference. Laggards: planning a Facebook fan page.

Microscopes & Telescopes

Famed venture capitalist and business writer Guy Kawasaki offers a great metaphor for looking at strategic outlooks: telescopes and microscopes.36 Here is a paraphrase of his description: Microscopes magnify every detail, line item, and expenditure, and demand full-blown forecasts. Microscopes are a cry for level-headed thinking, a return to fundamentals, and a “back to basics” approach. Telescopes bring the future closer. They dream up “the next big thing” and seek to change the world. Lots of ideas are tossed around. Some ideas stick, and those move forward. The reality is that you need both perspectives. We can’t focus exclusively on traveling to the future scholarly universe. And at the same time we can’t remain static and nostalgic about what libraries have been. How we manage to pass through this crucible moment will define us.37 This decade before us will shape the future of what academic libraries will become. Change is inevitable and vital. Accepting this reality empowers us. This is change that we have a say in. This is change that we can guide: telescopes and microscopes working to see, plan, and implement the transformation together.

“REAL ARTISTS SHIP!”38

Ideas are the easy part. Coming up with them doesn’t make you an innovator or a game-changer or a change-agent. True innovators get their hands dirty.
It means taking ownership of the concept, believing in it, advocating for it, fighting for it, shaping it, breathing life into it, and turning it into a reality. If you came up with the idea, then it’s your responsibility to see it through to the end.39 It’s your responsibility to stick it out. Real entrepreneurs are personally invested. Startup founders are not just in it for fame or fortune, but are driven to develop something new and to make their ideas tangible. The goal is to build something that doesn’t exist, to create something that wasn’t there before and is now absolutely essential. We in the library world need to feel that way too. That’s the heart and soul of startup culture. That’s what we need to tap into. It’s on our shoulders to find the future. It’s up to us to define what libraries will become. It won’t be easy, but how often do you get to redefine a profession? It’s not the time to do more of the same, arranging the same old blocks in different patterns. We need to change more than the packaging, add more than a shiny new wrapper. This transformation isn’t just about moving collections and services online; it’s about changing the DNA of our organizations. As Steve Jobs said, “real artists ship.” Real artists get their ideas out there. Real innovators deliver. Real entrepreneurs develop. Real startups launch. This is our time to face the future and redefine what libraries do. What will you invent next? Who will you partner with tomorrow? How will you plant the seeds of entrepreneurialism for the future? The direction academic libraries take is up to us. It’s ours to figure out. So let’s not be satisfied by adding small features; instead, let’s use our imaginations to dream big and create amazing experiences that transform our users.

Summary

We don’t just need change, we need breakthrough, paradigm-shifting, transformative, disruptive ideas. Startups are organizations dedicated to creating something new under conditions of extreme uncertainty. Now is not the time to find new ways of doing the same old thing. Launching a good idea is always better than not launching an awesome one. Don’t just expand services: solve problems. The library is a platform, not a place, website, or person. Libraries need less assessment and more R&D. Focus on relationship building instead of service excellence and satisfaction. Don’t just copy & paste from other libraries: invent! Grow your ideas: Build, Measure, Learn. Iterate & Prototype. Plant many seeds; nurture the ones that grow. Seize the whitespace. Good ideas are usable, feasible, and valuable. Give new ideas a place to incubate. Give new ideas enough time to blossom. Give new ideas a way to get funded. Give new ideas the talent they require. Give new ideas room to fail… and then evolve. Give up on a new idea if it just doesn’t work. Innovation happens out in the open—not behind closed doors. Innovation is a team sport. Practice it regularly. Innovation is messy. Innovation is disruptive. Real innovators get their hands dirty. Being strategic is about stretching, not sustaining. Stake a claim in other parts of the scholarly enterprise. Build a strategic culture, not a strategic plan. Entrepreneurialism is a cultural imperative, not something that should only happen in small pockets of your organization. Strive to change the profession. Aim for epiphanies.

Notes

1 Jim Collins, Great By Choice: Uncertainty, Chaos, and Luck--Why Some Thrive Despite Them All, 2011.
2 “UNLV faculty warned higher ed system may be forced to declare bankruptcy.” http://www.lvrj.com/news/unlv-faculty-warned-university-system-may-be-forced-to-declare-bankruptcy-116279269.html
3 “University System advances on campus mergers.” http://www.ajc.com/news/university-system-advances-on-1217183.html
4 “Four UCSD libraries to close, consolidate.” http://www.utsandiego.com/news/2011/mar/29/ucsd-libraries-close/
5 “Harvard Libraries Cuts Jobs, Hours.” http://www.thecrimson.com/article/2009/6/26/harvard-libraries-cuts-jobs-hours-harvard/
6 “Our Universities: Why Are They Failing?” http://www.nybooks.com/articles/archives/2011/nov/24/our-universities-why-are-they-failing/
7 Anya Kamenetz, DIY U: Edupunks, Edupreneurs, and the Coming Transformation of Higher Education, 2010.
8 A recent example: “Academic Library Autopsy Report, 2050.” http://chronicle.com/article/Academic-Library-Autopsy/125767/
9 Conversation with Steven Bell: http://stevenbell.info/
10 The University of California’s Next Generation Tech Services reports: http://libraries.universityofcalifornia.edu/about/uls/ngts/
11 Malcolm Gladwell, “Creation Myth.” http://www.newyorker.com/reporting/2011/05/16/110516fa_fact_gladwell?currentPage=all
12 Eric Ries, The Lean Startup: How Today’s Entrepreneurs Use Continuous Innovation to Create Radically Successful Businesses, 2011.
13 Lean Canvas, http://leancanvas.com/
14 Eric Ries, The Lean Startup, 2011.
15 John Mullins & Randy Komisar, Getting to Plan B: Breaking Through to a Better Business Model, 2009.
16 Guy Kawasaki, The Art of the Start: The Time-Tested, Battle-Hardened Guide for Anyone Starting Anything, 2004.
17 A. G. Lafley, Seizing the White Space: Business Model Innovation for Growth and Renewal, 2010.
18 Eric Ries, The Lean Startup, 2011.
19 “More, better, faster: UX design for startups.” http://www.cooper.com/journal/2011/03/more_better_faster_ux_design.html
20 “Building a competitive advantage.” http://americanlibrariesmagazine.org/columns/next-steps/building-competitive-advantage
21 Correspondence with Steve Morris: Head, Digital Library Initiatives and Digital Projects, NCSU.
22 Conversation with Bob Summer; see also http://www.collegiatetimes.com/stories/17767/techpad-opens-in-blacksburg
23 Bob Summer was influenced by Marty Cagan’s book Inspired: How To Create Products Customers Love, 2008.
24 Char Booth, “Hope, Hype and VoIP: Riding the Library Technology Cycle.” http://www.alastore.ala.org/detail.aspx?ID=3037
25 Brian Quinn, “The McDonaldization of Academic Libraries?” College & Research Libraries, May 2000.
26 Steven Bell & John Shank, Academic Librarianship by Design: A Blended Librarian’s Guide to the Tools and Techniques, 2007. See also The Art of Innovation (Kelley & Littman) and “Spark Innovation Through Empathic Design,” HBR (Leonard & Rayport).
27 Nancy Foster & Susan Gibbons, Studying Students: The Undergraduate Research Project at the University of Rochester, 2007. http://docushare.lib.rochester.edu/docushare/dsweb/View/Collection-4436
28 Joseph Michelli, The Starbucks Experience: 5 Principles for Turning Ordinary Into Extraordinary, 2006.
29 Erin Dorney, “The user experience librarian,” C&RL News, 2009. http://crln.acrl.org/content/70/6/346.full.pdf+html?sid=f29caba6-f126-42da-9bd5-4c595f67da3a
30 “Fresh Copy: How Ursula Burns Reinvented Xerox,” Fast Company, 2011. http://www.fastcompany.com/magazine/161/ursula-burns-xerox
31 A cautionary tale about railroads: Levitt, “Marketing Myopia,” HBR 38(4): 45-56.
32 Good example: Microgrants: http://info.lib.uh.edu/about/strategic-directions/microgrants
33 Everett Rogers, Diffusion of Innovations, 2003. (5th Edition)
34 Simon Sinek, Start with Why: How Great Leaders Inspire Everyone to Take Action, 2011.
35 Walter Isaacson, Steve Jobs, 2011.
36 Guy Kawasaki, The Art of the Start, 2004.
37 Robert Thomas, Crucibles of Leadership: How to Learn from Experience to Become a Great Leader, 2008.
38 Walter Isaacson, Steve Jobs, 2011.
39 Killer Innovations podcast: http://philmckinney.com/killer-innovations

Paper Layout & Design by Ashley Marlowe. Brian Mathews is an Associate Dean at Virginia Tech. He has previously worked as an Assistant University Librarian at UC Santa Barbara and as User Experience Librarian at Georgia Tech. Brian’s blog, The Ubiquitous Librarian, is hosted by the Chronicle of Higher Education: http://chronicle.com/blognetwork/theubiquitouslibrarian/

morgan-bringing-2021 ---- Chapter 10 Bringing Algorithms and Machine Learning Into Library Collections and Services Eric Lease Morgan University of Notre Dame

Seemingly revolutionary changes

At the time of their implementation, some changes in the practice of librarianship were deemed revolutionary, but nowadays some of these same changes are deemed matter of fact. Take, for example, the catalog. During much of the Middle Ages, a catalog was more akin to a simple acquisitions list. By 1548, the first author, title, and subject catalog had been created (LOC 2017, 18). These catalogs morphed into books, books which could be mass-produced and distributed. But the books were difficult to keep up to date, and they were expensive to print. As a consequence, in the early 1860s, the card catalog was invented by Ezra Abbot, and the catalog eventually became a massive set of drawers (82). Unfortunately, because of the way catalog cards are produced, it is not feasible to assign more than three or four subject headings to any given book. If one does, then the number of catalog cards quickly gets out of hand. In the 1870s, the idea of sharing catalog cards between libraries became common, and the Library of Congress facilitated much of the distribution (LOC 2017, 87). In 1965, with the advent of computers, the idea of sharing cataloging data as MARC (machine readable cataloging) became prevalent (Crawford 1989, 204). The data structure of a MARC record is indicative of the time. Intended to be distributed on reel-to-reel tape, the MARC record is a sequential data structure designed to be read from beginning to end, complete with checks and balances ensuring the record’s integrity. Despite the apparent flexibility of a digital data structure, the tradition of three or four subject headings per book still holds true. Nowadays, the data from MARC records is used to fill databases, the databases’ content is indexed, and items from the library collection are located by searching the index. The evolution of the venerable library catalog has spanned centuries, each evolutionary change solving some problems but creating new ones. With the advent of the Internet, a host of other changes are (still) happening in libraries. Some of them are seen as revolutionary, and only time will tell whether or not these changes will persevere.
Examples include but are not limited to:
• the advocacy of alt-metrics and open access publications
• the continuing dichotomy of the virtual library and library as place
• the creation and maintenance of institutional repositories
• the existence of digital scholarship centers
• the increasing tendency to license instead of own content

Many of the traditional roles of libraries are not as important as they used to be. That does not mean the roles are unimportant, just not as important. Like many other professions, librarianship is exploring new ways to remain relevant when many of its core functions are needed by fewer people.

Working smarter, not harder

Beyond automation, librarianship has not exploited computer technology. Despite the fact that libraries have the world of knowledge at their fingertips, libraries do not operate very intelligently, where “intelligently” is an allusion to artificial intelligence. Let’s enumerate the core functionalities of computers. First of all, computers…compute. They are given some sort of input, assign the input to a variable, apply any number of functions to the variable, and output the result. This process — computing — is akin to solving simple algebraic equations such as the area of a circle or a distance traveled. There are two factors of particular interest here. First, the input can be as simple as a number or a string (read: “a word”), or the input can be arbitrarily large combinations of both. Examples include:
• 42
• 1776
• xyzzy
• George Washington
• a MARC record
• the circulation history and academic characteristics of an individual
• the full text and bibliographic descriptions of all early American authors

Second, and really important, is the possible scale of a computer’s input. Libraries have not taken advantage of that scale. Imagine how librarianship would change if the profession actively used the full text of its collections to enhance bibliographic description and the resulting public service. Imagine how collection policies and patron needs could be better articulated if: 1) students, researchers, or scholars first opted in to have their records analyzed, and 2) the totality of circulation histories and journal usage histories were thoroughly investigated in combination with patron characteristics and data from other libraries. A second core functionality of computers is their ability to save, organize, and retrieve vast amounts of data. More specifically, computers save “data” — mere numbers and strings. But when the data is given context, such as a number denoted as a date or a string denoted as a name, then the data is transformed into information. An example might include the birth year 1972 and the name of my pet, Blake. Given additional information, which may be compared and contrasted with other information, knowledge can be created — information put to use and understood. For example, Mary, my sister, was born in 1951 and is therefore 21 years older than Blake. Computers excel at saving, organizing, and retrieving data, which leads to information and knowledge. The possibilities of computers dispensing wisdom — knowledge of a timeless nature — is left for another essay.
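As a minimal sketch of that data-information-knowledge progression, using the birth years mentioned above (the variable and value names are ours, invented for illustration):

# data: mere numbers and strings, without context
values = [ 1951, 1972, 'Blake', 'Mary' ]

# information: the same data, given context
pet    = { 'name': 'Blake', 'born': 1972 }
sister = { 'name': 'Mary', 'born': 1951 }

# knowledge: information compared, contrasted, and put to use
difference = pet[ 'born' ] - sister[ 'born' ]
print( '%s is %d years older than %s.' % ( sister[ 'name' ], difference, pet[ 'name' ] ) )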
Like the scale of computer input, the library profession has not really exploited computers’ ability to save, organize, and retrieve data; on the whole, the library profession does not understand the concept of a “data structure.” For example, tab-delimited files, CSV (comma-separated value) files, relational database schema, XML files, JSON files, and the content of email messages or HTTP server responses are all examples of different types of data structures. Each has its own set of inherent strengths and weaknesses; there is no such thing as “one size fits all.” Through the use of data structures, computers store and retrieve information. Librarianship is about these same kinds of things, yet few librarians would be able to outline the differences between different data structures. Again, data becomes information when it is given context. In the world of MARC, when a string (one or more “words”) is inserted into the 245 field of a MARC bibliographic record, then the string is denoted as a title. In this case, MARC is a “data structure” because different fields denote different contexts. There are fields for authors, subjects, notes, added entries, etc. This is all very well and good, especially considering that MARC was designed more than fifty years ago. But since then, many more scalable, flexible, and efficient data structures have been designed. Relational databases are a good example. Relational databases build on a classic data structure known as the “table” — a matrix of rows and columns where each row is a record and each column is a field. Think “spreadsheet.” For example, each row may represent a book, with columns for authors, titles, dates, publishers, etc. The problem comes when a column needs to be repeatable. For example, a book may have multiple authors or, more commonly, multiple subjects. In this case the idea of a table breaks down because it doesn’t make sense to have a column named subject-01, subject-02, and subject-03. As soon as you do that, you will want subject-04. Relational databases solve this problem. The solution is to first add a “key” — a unique value — to each row. Next, for fields with multiple values, create a new table where one of the columns is the key from the first table and the other column is a value, in this case, a subject heading. There are now two tables, and they can be “joined” through the use of the key. Given such a data structure it is possible to add as many subjects as desired to any bibliographic item.
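As a minimal sketch of the two-table design just described, the following uses Python’s built-in sqlite3 module; the schema and the sample bibliographic values are hypothetical, invented for illustration:

import sqlite3

# create an in-memory database with a table for items and a second
# table for their repeatable subject headings
connection = sqlite3.connect( ':memory:' )
connection.executescript( '''
    CREATE TABLE items    ( key INTEGER PRIMARY KEY, title TEXT );
    CREATE TABLE subjects ( key INTEGER, subject TEXT );
''' )

# add a single book along with as many subjects as desired
connection.execute( "INSERT INTO items VALUES ( 1, 'Moby Dick' )" )
for subject in ( 'whales', 'whaling', 'sea stories' ) :
    connection.execute( 'INSERT INTO subjects VALUES ( 1, ? )', ( subject, ) )

# "join" the two tables through the use of the shared key
query = '''
    SELECT items.title, subjects.subject
    FROM items JOIN subjects ON items.key = subjects.key
'''
for title, subject in connection.execute( query ) :
    print( title, '-', subject )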
But you say, “MARC can handle multiple subjects.” True, MARC can handle multiple subjects, but underneath, MARC is a data structure designed for when information was disseminated on tape. As such, it is a sequential data structure intended to be read from beginning to end. It is not a random access structure. What’s more, the MARC data structure is really divided into three substructures: 1) the leader, which is always twenty-four characters long, 2) the directory, which denotes where each bibliographic field exists, and 3) the bibliographic section where the bibliographic information is actually stored. It gets more complicated. The first five characters of the leader are expected to be a left-hand, zero-padded integer denoting the length of the record measured in bytes. A typical value may be 01999. Thus, the record is 1999 bytes long. Now, ask yourself, “What is the maximum size of a MARC record?” Despite the fact that librarianship embraces the idea of MARC, very few librarians really understand the structure of MARC data. MARC is a format for transmitting data from one place to another, not for organization.
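A minimal sketch of reading the leader, assuming a fabricated leader value; because only five characters are available for the length, the answer to the question just posed is 99,999 bytes:

# the leader value below is fabricated for the sake of illustration
leader = '01999nam a2200385 a 4500'

# the first five characters are a zero-padded integer denoting the
# length of the record, measured in bytes
length = int( leader[ 0:5 ] )
print( length )

# since no more than five characters are available, no MARC record
# can be larger than 99,999 bytes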
The content of institutional repositories is usually born digital, but libraries have not exploited their full text nature nor created services going beyond rudimentary catalogs. Computers can do so much more for libraries than mere automation. While I will never say computers are “smart,” their fundamental characteristics do appear intelligent, especially when used at scale. The scale of computing has significantly changed in the past ten years, and with this change the concept of “machine learning” has become more feasible. The following sections outline how libraries can go beyond automation, embrace machine learning, and truly evolve their ideas of collections and services. Machine learning: what it is, possibilities, and use cases Machine learning is a computing process used to make decisions and predictions. In the past, computer-aided decision-making and predictions were accomplished by articulating large sets of if-then statements and navigating down decision trees. The applications were extremely domain specific, and they weren’t very scalable. Machine learning turns this process on its head. Instead of navigating down a tree, machine learning takes sets of previously made observations (think “decisions”), identifies patterns and anomalies in the observations, and saves the result as a math- ematical model, which is really an n-dimensional array of vectors. Outside observations are then compared to the model and depending on the resulting similarities or differences, decisions or predictions are drawn. Using such a process, there are really only four different types of machine learning: classifi- cation, clustering, regression, and dimension reduction. Classification is a supervised machine learning process used to subdivide a set of observations into smaller sets which have been previ- ously articulated. For example, suppose you had a few categories of restaurants such as American, French, Italian, or Chinese. Given a set of previously classified menus, one could create a model defining each category and then classify new, unseen menus. The classic classification example is the filtering of email. “Is this message ‘spam’ or ‘ham’?” This chapter’s appendix walks a person through the creation of a simplified classification system. It classifies texts based on authorship. Clustering is almost always an unsupervised machine learning process which also creates smaller sets from a larger one, but clustering is not given a set of previously articulated categories. That is what makes it “unsupervised.” Instead, the categories are created as an end result. Topic modeling is a popular example of clustering. Regression predicts a numeric value based on sets of dependent variables. For example, given dependent variables like annual income, education level, size of family, age, gender, religion, and employment status, one might predict how much money a person may spend on an independent variable such as charity. Sometimes the number of characteristics of each observation is very large. Many times some of these characteristics do not play a significant role in decision-making or prediction. Dimension reduction is another machine learning process, and it is used to eliminate these less-than-useful characteristics from the observations. This process simplifies classification, clustering, or regres- sion. Some possible use cases There are many possible ways to enhance library collections and services through the use of ma- chine learning. 
Some possible use cases

There are many possible ways to enhance library collections and services through the use of machine learning. I’m not necessarily advocating the implementation of any of the following ideas, but they are possibilities. Each is grouped into the broadest of library functional departments:

• reference and public services
– given a set of grant proposals, suggest library resources be used in support of the grants
– given a set of licensed library resources and their usage, suggest other resources for use
– given a set of previously checked out materials, suggest other materials to be checked out
– given a set of reference interviews, create a chatbot to supplement reference services
– given the full text of a set of desirable journal articles, create a search strategy to be applied against any number of bibliographic indexes; answer the proverbial question, “Can you help me find more like this one?”
– given the full text of articles as well as their bibliographic descriptions, predict and describe the sorts of things a specific journal title accepts or whether a given draft is good enough for publication
– given the full text of reading materials assigned in a class, suggest library resources to support them

• technical services
– given a set of multimedia, enumerate characteristics of the media (number of faces, direction of angles, number and types of colors, etc.), and use the results to supplement bibliographic description
– given a set of previously cataloged items, determine whether or not the cataloging can be improved
– given full-text content harvested from just about anywhere, analyze the content in terms of natural language processing, and supplement bibliographic description

• collections
– given circulation histories, articulate more refined circulation patterns, and use the results to refine collection development policies
– given the full text of sets of theses and dissertations, predict where scholarship at your institution is growing, and use the results to more intelligently build your just-in-case collection; do the same thing with faculty publications

Implementing any of these possible use cases would necessarily be a collaborative effort. Implementation requires an array of expertise. Enumerated in no priority order, this expertise includes: subject/domain expertise (such as cataloging trends, circulation services, collection strategies, etc.), computer programming and data management skills (such as Python, R, relational databases, JSON, etc.), and statistical modeling (an understanding of the strengths and weaknesses of different machine learning algorithms). The team would then need to:

1. articulate and share a common goal for the work
2. amass the data to model
3. employ a feature extraction process (lower-case words, extract a value from a database, etc.)
4. vectorize the features
5. create and evaluate the resulting model
6. go to Step #2 until satisfied
7. put the model into practice
8. go to Step #1; this work is never done

For example, to bibliographically connect grant proposals to library resources, try this:

1. use classification to sub-divide each of your bibliographic index descriptions
2. apply the resulting model to the full text of the grants
3. return a percentage score denoting the strength of each resulting classification
4. recommend the use of zero or more bibliographic indexes

To predict scholarship, try this:

1. amass the full text and bibliographic descriptions of all theses and dissertations
2. topic model the full text
3. evaluate the resulting topics
4. go to Step #2 until satisfied
5. augment the model’s matrix of vectors with bibliographic description
6. pivot the matrix on any of the given bibliographics
7. plot the results to see possible trends over time, trends within disciplines, etc.
8. use the results to make decisions

The content of the GitHub repository reproduced in this chapter’s appendix describes how to do something very similar in method to the previous example.1

1 See https://github.com/ericleasemorgan/bringing-algorithms.

Some real-world use cases

Here at the University of Notre Dame’s Navari Center for Digital Scholarship, we use machine learning in a number of ways. We cut our teeth on a system called Convocate.2 In this case we obtained a set of literature on the theme of human rights. Half of the set was written by researchers in non-governmental organizations. The other half was written by theologians. While both sets were on the same theme, the language of each was different. An excellent example is the use of the word “child.” In the former set, children were included in documents about fathers and mothers. In the latter set, children often referred to the “Children of God.” Consequently, queries referring to children were often misleading. To rectify this problem, a set of broad themes was articulated, such as Actors, Harms and Violations, Rights and Freedoms, and Principles and Values. We then used topic modeling to subdivide all of the paragraphs of all of the documents into smaller and smaller sets of paragraphs. We compared the resulting topics to the broad themes, and when we found correlations between the two, we classified the paragraphs accordingly. Because the process required a great deal of human intervention, and thus impeded subsequent updates, this process was not ideal, but we were learning, and the resulting index is useful. On a regular basis we find ourselves using a program called Topic Modeling Tool, which is a GUI/desktop application heavily based on the venerable MALLET suite of software.3 Given a set of plain text files and an integer, Topic Modeling Tool will create a weighted list of latent themes found in a corpus. Each theme is really a list of words which tend to cluster around each other, and these clusters are generated through the use of an algorithm called LDA (Latent Dirichlet Allocation). When it comes to topic modeling, there is no such thing as the correct number of topics. Just as in the traditional process of denoting what a corpus is about, there can be many distinct topics or there can be a few. Moreover, some of the topics may be large and others may be small. When using a topic modeler, it is important to iteratively configure and re-configure the input until the results seem to make sense. Just like every other machine learning application, Topic Modeling Tool bases its “reasoning” on a matrix of vectors. Each row represents a document, and each column is a topic. At the intersection of a document row and a topic column is a score denoting how much the given document is “about” the calculated topic. It is then possible to sum each topic column and output a pie chart illustrating not only what the topics are, but how much of the corpus is about each topic. Such can be very insightful. By adding metadata to the matrix of vectors, even more insights can be garnered. Suppose you have a set of plain text files. Suppose also you know the names of the authors of each file. You can then do topic modeling against your corpus, and when the modeling is complete you can add a new column to the matrix and call it authors. Next, you update the values in the authors column with author names. Finally, you “pivot” the matrix on the authors column to calculate the degree each author’s works are “about” the calculated topics. This too can be quite insightful. Suppose you have works by authors A, B, C, and D. Suppose you have calculated topics I, II, III, and IV. By updating the matrix and pivoting the results, you might discover that author A discusses topic I almost exclusively, whereas author B discusses topics I, II, III, and IV in equal parts. This process works for just about any type of metadata: gender, genre, extent, dates, language, etc. What’s more, Topic Modeling Tool makes this process almost trivial. To learn how, see the GitHub repository accompanying this chapter.4

2 See https://convocate.nd.edu.
3 See https://github.com/senderle/topic-modeling-tool for the Topic Modeling Tool. See http://mallet.cs.umass.edu for MALLET.
4 https://github.com/ericleasemorgan/bringing-algorithms.
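A hedged sketch of the pivoting process just described, using scikit-learn’s LDA implementation and the pandas library rather than Topic Modeling Tool itself; the corpus and the author names are invented for illustration:

from sklearn.feature_extraction.text import CountVectorizer
from sklearn.decomposition import LatentDirichletAllocation
import pandas as pd

# a toy corpus and its author metadata
documents = [ 'whales and the sea', 'the sea and its ships',
              'gardens and flowers', 'flowers blooming in spring' ]
authors   = [ 'A', 'A', 'B', 'B' ]

# vectorize the texts and topic model them; the result is a matrix
# where each row is a document and each column is a topic
vectors = CountVectorizer().fit_transform( documents )
weights = LatentDirichletAllocation( n_components=2, random_state=0 ).fit_transform( vectors )

# add the metadata as a new column, and "pivot" on authors to calculate
# the degree each author's works are "about" the calculated topics
matrix = pd.DataFrame( weights, columns=[ 'topic I', 'topic II' ] )
matrix[ 'author' ] = authors
print( matrix.groupby( 'author' ).mean() )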
We have used classification techniques in at least a couple of ways. One project required the classification of press releases. Some press releases are deemed mandatory — declared necessary to publish. Other press releases are considered discretionary — published at the will of a company. The domain expert needed a set of 100,000 press releases classified into either mandatory or discretionary piles. We used a process very similar to the process outlined in this chapter’s appendix. In the end, the domain expert believes the classification process was 86% correct, and this was good enough for them. In another project, we tried to identify articles about a particular yeast (Cryptococcus neoformans), despite the fact that the articles never mentioned the given yeast. This project failed because we were unable to generate an accuracy score greater than 70%. This was deemed not good enough. We are developing a high performance computing system called the Distant Reader, which uses machine learning to do natural language processing against an arbitrarily large volume of text. Given one or more documents of just about any number or type, the Distant Reader will:

1. amass the documents
2. convert the documents into plain text
3. do rudimentary counts and tabulations against the plain text
4. calculate statistically significant keywords against the plain text
5. extract narrative summaries against the plain text
6. use spaCy (a natural language processing library) to classify each and every feature of each and every sentence into parts-of-speech and/or named entities5 (a small sketch of this step follows below)
7. save the results of Steps #1 through #6 as plain text and tab-delimited files
8. distill the tab-delimited files into an SQLite database
9. create both narrative as well as tabular reports against the database
10. create an archive (.zip file) of everything
11. return the archive to the student, researcher, or scholar

5 See https://spacy.io.

The student, researcher, or scholar can then analyze the contents of the .zip file to get a better understanding of its contents. This analysis (“reading”) ranges from perusing the narrative reports, to using desktop tools to visualize the data, to exploiting command-line tools to investigate the data, to writing software which uses the data as input.
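Step #6 names spaCy; as a minimal sketch of that kind of processing, the following assumes spaCy and its small English model (en_core_web_sm) have been installed, and the sentence is invented for illustration:

import spacy

# load a small English language model
nlp = spacy.load( 'en_core_web_sm' )

# process a sentence
doc = nlp( 'Abraham Lincoln delivered the Gettysburg Address in 1863.' )

# classify each and every feature into parts-of-speech...
for token in doc : print( token.text, token.pos_ )

# ...and enumerate the named entities
for entity in doc.ents : print( entity.text, entity.label_ )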
The Distant Reader scales to everything from a single scholarly report, to hundreds of book-length documents, to thousands of journal articles. Its purpose is to supplement the traditional reading process, and it uses machine learning techniques at its core.

Summary and Conclusion

Computers and libraries are a natural fit. They both excel at the collection, organization, and dissemination of data, information, and knowledge. Compared to most professions, the practice of librarianship has used computers for a very long time. But, for the most part, the functionality of computers in libraries has not been fully exploited. Advances in machine learning coupled with the data/information found in libraries present an opportunity for both librarianship and the people whom libraries serve. Machine learning can be used to enhance library collections and services, and with a modest investment of time as well as resources, the profession can make it a reality.

Appendix: Train and Classify

This appendix lists two Python programs. The first (train.py) creates a model for the classification of plain text files. The second (classify.py) uses the output of the first to classify other plain text files. For your convenience, the scripts and some sample data ought to be available in a GitHub repository.6 The purpose of including these two scripts is to help demystify the process of machine learning.

6 https://github.com/ericleasemorgan/bringing-algorithms.

Train

The following Python script is a simple classification training application. Given a file name and a list of directories containing .txt files, this script first reads all of the files’ contents and the names of their directories into sets of data and labels (think “categories”). It then divides the data and labels into training and testing sets. Such is a best practice for these types of programs so the models can be evaluated for accuracy. Next, the script counts and tabulates (“vectorizes”) the training data and creates a model using a variation of the Naive Bayes algorithm. The script then vectorizes the test data, uses the model to classify the test data, and compares the resulting classifications to the originally supplied labels. The result is an accuracy score, and generally speaking, a score greater than 75% is on the road to success. A score of 50% is no better than flipping a coin. Finally, the model is saved to a file for later use.

# train.py - given a file name and a list of directories
# containing .txt files, create a model for classifying
# similar items

# require the libraries / modules that will do the work
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import MultinomialNB
import glob, os, pickle, sys

# sanity check; make sure the program has been given input
if len( sys.argv ) < 4 :
    sys.stderr.write( 'Usage: ' + sys.argv[ 0 ] + " <model> <directory> <another directory>\n" )
    quit()

# get the name of the file where the model will be saved
model = sys.argv[ 1 ]

# get the rest of the input, the names of directories to process
directories = []
for i in range( 2, len( sys.argv ) ) :
    directories.append( sys.argv[ i ] )

# initialize the data to analyze and its associated labels
data   = []
labels = []

# loop through each given directory
for directory in directories :

    # find all the text files and get the directory's name
    files = glob.glob( directory + "/*.txt" )
    label = os.path.basename( directory )

    # process each file
    for file in files :

        # open the file
        with open( file, 'r' ) as handle :

            # add the contents of the file to the data
            data.append( handle.read() )

        # update the list of labels
        labels.append( label )

# divide the data / labels into training sets and testing sets; a best practice
data_train, data_test, labels_train, labels_test = train_test_split( data, labels )

# initialize a vectorizer, and then count / tabulate the training data
vectorizer = CountVectorizer( stop_words='english' )
data_train = vectorizer.fit_transform( data_train )

# initialize a classification model, and then use Naive Bayes to create a model
classifier = MultinomialNB()
classifier.fit( data_train, labels_train )

# count / tabulate the test data, and use the model to classify it
data_test       = vectorizer.transform( data_test )
classifications = classifier.predict( data_test )

# begin to test for accuracy
count = 0

# loop through each test classification
for i in range( len( classifications ) ) :

    # increment, conditionally
    if classifications[ i ] == labels_test[ i ] : count += 1

# calculate and output the accuracy score; above 75% begins to achieve success
print( "Accuracy: %s%%\n" % ( int( ( count * 1.0 ) / len( classifications ) * 100 ) ) )

# save the vectorizer and the classifier (the model) for future use, and done
with open( model, 'wb' ) as handle : pickle.dump( ( vectorizer, classifier ), handle )
exit()
Classify

The following Python script is a simple classification program. Given the model created by the previous script (train.py) and a directory containing a set of .txt files, this script will output a suggested label (“classification”) and a file name for each file in the given directory. This script automatically classifies a set of plain text files.

# classify.py - given a previously saved classification model and
# a directory of .txt files, classify a set of documents

# require the libraries / modules that will do the work
import glob, os, pickle, sys

# sanity check; make sure the program has been given input
if len( sys.argv ) != 3 :
    sys.stderr.write( 'Usage: ' + sys.argv[ 0 ] + " <model> <directory>\n" )
    quit()

# get input; get the model to read and the directory containing the .txt files
model     = sys.argv[ 1 ]
directory = sys.argv[ 2 ]

# read the model
with open( model, 'rb' ) as handle :
    ( vectorizer, classifier ) = pickle.load( handle )

# process each .txt file
for file in glob.glob( directory + "/*.txt" ) :

    # open, read, and classify the file
    with open( file, 'r' ) as handle :
        classification = classifier.predict( vectorizer.transform( [ handle.read() ] ) )

    # output the classification and the file's name
    print( "\t".join( ( classification[ 0 ], os.path.basename( file ) ) ) )

# done
exit()
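By way of a hypothetical example of putting the two scripts together: given directories of plain text files named ./poetry and ./prose, running python train.py model.bin ./poetry ./prose would output an accuracy score and save a model, and running python classify.py model.bin ./unclassified would then suggest a label for each file in a third directory. The directory and file names here are ours, invented for illustration.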
References

Crawford, Walt. 1989. MARC for Library Use: Understanding Integrated USMARC. 2nd ed. Boston: G.K. Hall.
LOC (Library of Congress). 2017. The Card Catalog: Books, Cards, and Literary Treasures. San Francisco: Chronicle Books.

narlock-digital-2021 ---- Digital preservation services at digital scholarship centers

The Journal of Academic Librarianship 47 (2021) 102334. https://doi.org/10.1016/j.acalib.2021.102334. Received 10 February 2021; Accepted 15 February 2021; Available online 24 February 2021.

Mikala Narlock, Digital Collection Strategy Librarian, Hesburgh Libraries, University of Notre Dame; Daniel Johnson, English, Film, Television, and Theatre; Digital Humanities Librarian, Hesburgh Libraries, University of Notre Dame; Julie Vecchio, Assistant Director, Navari Family Center for Digital Scholarship, Hesburgh Libraries, University of Notre Dame

Keywords: Digital scholarship centers; Digital preservation; Academic libraries; Digital scholarship

Abstract: As academic library support services for digital scholarship activities continue to expand and evolve, large volumes of digital outputs have been created by, and in collaboration with, library and information professionals who are affiliated with digital scholarship centers. Drawing on a literature review and a 2018 pilot study of digital preservation services in digital scholarship centers, we propose future directions for investigation of preservation services for digital scholarship and projects.

Introduction

The proliferation of digital infrastructure, tools, and data sources has facilitated new types of academic exploration and created opportunities for novel collaborations with academic library specialty research support services, such as digital scholarship centers (DSCs) (e.g., Bryson et al., 2011; Johnson & Dehmlow, 2019). DSCs are described as a “service model in academic libraries that bring faculty and student scholars, technologists, and librarians together to collaboratively develop digital projects supporting scholarship and research” (Tzoc, 2016), and for the purposes of this research, digital scholarship is construed broadly as the use of digital evidence and methods, digital publishing, digital curation and preservation, and digital use and reuse of scholarship, regardless of discipline (Rumsey, 2011). Academic library support for digital scholarship encompasses a broad range of services, including teaching, consultation, outreach, the provision of access to technologies and data sources for creating and sharing new knowledge, and the creation and management of technology-enhanced spaces (e.g., Lippincott, 2017; Locke, 2017). As digital scholarship activities and outputs increase over time, the need for careful planning for the curation and long-term preservation of digital objects and projects is of critical importance (Owens, 2018). We explore the intersection of academic library digital scholarship centers with digital curation and preservation activities through the lens of a literature review and a 2018 pilot survey, seeking to address the following topics: 1. How do digital scholarship centers provide digital preservation information to their users? 2. What digital preservation support is provided by digital scholarship centers to their users? 3.
What kinds of relationships and interactions can we observe between academic libraries, DSCs, and digital preservation activities?

Literature review

The expansive growth of digital scholarship work—along with a concomitant need for data—has resulted in strengthened connections between library and information professionals and digital scholars, especially digital humanists (Johnson & Dehmlow, 2019; Millson-Martula & Gunn, 2017; Sula, 2013). In particular, digital curation and preservation have been identified as ideal opportunities for collaboration between scholars, librarians, and information professionals, as library organizations tend to focus on lifecycle management with an emphasis on curation and preservation (Lippincott, 2017). While researchers may lack specific training for research data curation or experience with building and applying robust preservation policies, library and information professionals have been developing and utilizing these skills for decades (Poole & Garwood, 2018). Tenopir, Birch, and Allard (2012, 5) argue that there are “powerful reasons for librarians to explore how their academic libraries can better satisfy the needs of researchers in the new data-intensive research atmosphere,” including the curation of research data to facilitate discovery, and advocacy for effective preservation. As Walters and Skinner (2011) note, when “the library embeds the curation and preservation infrastructure and knowledge within its own staffing and digital framework and provides stable, trustworthy, and affordable services to its campus, the library as an institution becomes more secure and influential within its campus setting” (24). However, the results of Tenopir et al.’s 2012 survey suggest that, while academic libraries and librarians are capable of providing research data services and support, there are often serious limitations in funding, particularly for staffing and repository maintenance. Since then, academic libraries have directed more resources toward research data services (particularly at R1 institutions), staff development, and additional support positions (Tenopir et al., 2019). This increased support coincides with increased collaborative efforts between libraries and DSCs, which academic libraries have leveraged as an opportunity to advocate for their position, funding, and new roles (Cox, 2016).
This work has also resulted in increased development of tools to support data and digital project curation, including efforts such as the Preservation Quality Tool (PresQT) and Emulation as a Service Infrastructure (EaaSI), which help harvest and curate data and metadata, and ensure that, regardless of format, data will be accessible into the future, easing the burden on both information professionals and repository managers.

The importance of digital curation, and specifically lifecycle management, has been written about extensively within the context of specific types of disciplines, as well as writ large. In the humanities, digital curation has been supported by grant-funded projects such as the University of Pittsburgh’s “Sustaining DH” NEH Institute for Advanced Topics in the Digital Humanities (https://sites.haa.pitt.edu/sustainabilityinstitute/). The Institute educated librarians and departmental faculty alike on a new “Socio-Technical Sustainability Roadmap,” a framework to assist in “the seemingly daunting task of sustaining … web-based, user-facing, digital humanities project over time” (https://sites.haa.pitt.edu/sustainabilityroadmap/getting-started/). Similar efforts include “The Endings Project,” funded by the Social Sciences and Humanities Research Council of Canada (https://projectendings.github.io/), Katherina Fostano and Laura K. Morreale’s “Digital Documentation Process” for DH scholarship (https://digitalhumanitiesddp.com/), and the Mellon-supported “Digits Project,” which promises to “conduct an environmental scan of the use of software containers in research and publication, as well as a fact-finding mission on the infrastructural needs of scholars who are currently producing non-standard digital research” (https://digits.pub/about/).

Social science data are among the oldest digital media: beginning in the late 1800s, US census data were converted to a digital format for analysis by—what was at the time—brand-new tabulating machines (Gutmann et al., 2009). Text-mining and artificial intelligence technologies available today are further extending the variety of data available for exploration through social science methodologies, shifting “the evidence base of social science” (Walters & Skinner, 2011). Social science data pose complex and unique challenges for data curation and preservation: documentation may be lacking or inaccessible, data ownership may be in question, data may have rigorous privacy/confidentiality requirements, and data format persistence may be problematic (ICPSR, 2012; Lyle et al., 2014). Repositories—both institutional and disciplinary—are vital to the preservation of social science research assets and outputs, but are bound by their own unique missions and policies. Collaborative projects such as the Data Preservation Alliance for the Social Sciences (Data-PASS: http://www.data-pass.org/) leverage the resources of multiple institutions in support of the identification, acquisition, and curation of social science data that have been deemed “at risk,” whether from legacy research sources or from ongoing or future work (Gutmann et al., 2009). Academic library and information professionals—whether affiliated with DSCs or not—play a variety of critical roles in the preservation of social science data, ranging from acquisition, to educational and outreach services, to hands-on curation work, to name a few (Tammaro et al., 2019; Xia & Wang, 2014).
The ‘hard sciences’ tend to produce data at a larger scale than the social sciences and humanities, especially data derived from niche software, tools, and highly advanced equipment. Researchers and information professionals have been working actively to provide persistent and long-term access to research data and other scholarly outputs. Since the early 2000s, librarians and information professionals have been advocating for and documenting research data curation (e.g., Gray et al., 2002), articulating the lifecycle of research data (e.g., Higgins, 2008), and carving out space for information professionals to assist in the curation process. Data curators, discipline experts, and even private companies have developed numerous tools to help scholars and repository managers preserve content and provide consistent access to data and digital objects. The proliferation of disciplinary, institutional, and general repositories for researchers, as well as curatorial tools like wholeTALE, facilitates not only data reuse and reproducibility, but also curation and long-term accessibility of the data. In recent years, the rise of FAIR data (Findable, Accessible, Interoperable, and Reusable; Wilkinson et al., 2016), increasing funder mandates and required data management plans (DMPs), and hands-on data sharing workshops and hackathons (e.g., Hildreth & Meyers, 2020) have resulted in an increased awareness of the intricacies of preserving research data and the need to define domain-specific requirements.

Despite the prevalence of digital scholarship activities across academic disciplines, preservation remains a persistent challenge, bedeviled by uncertain expectations, uneven work distribution, and inadequate sustainability planning, among other issues. Atkins (2013) found that most organizations, when lacking a dedicated digital preservation program, often left the task of preservation to the library. Li et al. (2020) observed a similar desire for help with managing research data at Wuhan University Library, but found in a quantitative survey that researchers “do not entirely believe librarians can be of significant help in managing research projects, providing data curation and sharing support,” leading them to suggest that libraries should “promote and advertise their effort and abilities” (9). Libraries, however, may lack the funding or technical infrastructure needed to support digital projects adequately in the long term (Owens, 2018). Moreover, given that effective digital preservation and consistent, long-term access to content require intense curatorial support, librarians, specifically subject selectors and disciplinary curators, are in the best position to provide feedback on digital scholarship projects (Tallman & Work, 2018). Robert Montoya (2017, 221) even argues that a new category of “boundary staff specifically charged with maintaining … boundary infrastructures and negotiating mismatched practices between departments” is needed to break out of silos and integrate library strengths with cross-disciplinary projects.

Regardless of where a digital object or project originates or concludes, the stakes for digital preservation are high, and project partners benefit from sharing the responsibility and privilege of applying digital preservation considerations to their work.
Indeed, increasing the pool of stakeholders should increase preservation options, helping to alleviate the burden of hidden labor on a small group of individuals while also avoiding the temptation to force every project into a one-size-fits-all preservation solution. DSCs, in turn, stand to benefit by learning how their peers are engaging stakeholders in this important endeavor.

Pilot survey
For additional perspective on this landscape, we distributed a pilot survey via listserv to investigate how digital scholarship centers within higher education institutions in the United States currently engage with their stakeholders on digital preservation. In total, the survey received forty-seven (47) responses. Respondents who left all answers blank were eliminated. Duplicate responses were received from three institutions; where an institution’s duplicate responses were identical, one entry was kept for the institution, and where they differed, both entries were removed. Two entries were removed as non-US institutions. In total, twenty-five (25) survey responses were used for analysis; a minimal sketch of this filtering logic appears below. For more information, please visit https://doi.org/10.17605/OSF.IO/3YJ8A.

A key limitation of this survey is the small number of responses received relative to the number of invitations distributed through listservs; the survey nevertheless provides an instructive starting place for continued exploration of digital preservation patron engagement activities at US digital scholarship centers. Following the format of the survey, the themes that emerged from our data have been divided into two categories: characteristics of the responding DSCs, and patterns of digital preservation practices.

Responding digital scholarship center overview
All responding centers indicated that they provide consultations to patrons (n = 25). Most responding DSCs indicated that they provide instruction (n = 22), cultivate a web presence (n = 21), and provide access to hardware and software for patron use (n = 20). Responding DSCs tended to have a broad range of expertise: while the particulars varied between DSCs, many indicated that they offer expertise in digital publishing, project management, data analysis, and metadata (n = 22, 20, 19, and 19, respectively). Digital preservation was an area of expertise for over half of responding DSCs (n = 16), followed closely by institutional repository support (n = 15). Areas of DSC expertise may warrant additional exploration, specifically the emphasis on project management and data analysis and how they relate to preserving digital scholarship.
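The response-screening rules above amount to a small, order-dependent filtering pipeline. The following Python sketch is illustrative only, not the authors' code: it assumes hypothetical field names (institution, country, and an answers mapping), since the actual survey instrument is documented at the OSF link above.

    from collections import defaultdict

    def clean_responses(responses):
        """Apply the pilot survey's exclusion rules to a list of response
        records, each a dict with hypothetical keys 'institution',
        'country', and 'answers' (a mapping of question ids to values)."""
        # 1. Drop respondents who left every answer blank.
        nonblank = [r for r in responses
                    if any(v not in (None, "") for v in r["answers"].values())]

        # 2. Reconcile duplicate submissions per institution: keep one entry
        #    when the duplicates are identical, drop all when they conflict.
        by_institution = defaultdict(list)
        for r in nonblank:
            by_institution[r["institution"]].append(r)
        reconciled = []
        for entries in by_institution.values():
            if len(entries) == 1 or all(
                    e["answers"] == entries[0]["answers"] for e in entries):
                reconciled.append(entries[0])

        # 3. Remove entries from non-US institutions.
        return [r for r in reconciled if r["country"] == "US"]

Under the figures reported above, these rules reduce the forty-seven raw responses to the twenty-five retained for analysis.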
These expertise responses could be indicative of a number of things, including but not limited to: a primary focus on active project development by responding DSCs, which are often on the cutting edge of research and research methods; the possibility that responding DSCs were collaborating with patrons on sustainable projects that need less preservation support; a prevalence of projects that had not yet reached a stage where preservation concerns are imminent; or perhaps a lack of interest in preservation among responding DSC patrons. Additional investigation into these motivations for prioritizing project management and data analysis could help guide future developments in DSC support for curating and preserving digital scholarship outputs.

Physically and organizationally, responding DSCs were linked to libraries, echoing the prevalent themes in the literature about the relationship between the two (Lippincott & Goldenberg-Hart, 2014). Most respondents noted that their DSC is located organizationally within the institution’s library (n = 19/25, 76%), and, when asked about their roles and responsibilities within the DSC, approximately one third of respondents indicated that their primary role was that of “Librarian” (n = 9/25). A responding DSC’s connection with an academic library was not associated with provision of digital preservation support by the responding DSC. This is an area that may warrant additional exploration: given libraries’ and archives’ legacy of preservation and of providing long-term access to materials, the library is the heir apparent to preserving content created by and with the DSC, whether through curation, storage, metadata/descriptive practices, or other preservation activities. However, limited funding, overwhelmed staff, and DSCs’ charge to stay at the forefront of digital scholarship may inhibit this collaboration.

Digital preservation practices of responding digital scholarship centers
In terms of audience for digital preservation support, the majority of responding DSCs (n = 19) indicated that they provide support for digital preservation to patrons, with the primary demographic overwhelmingly faculty-oriented and humanities-centric. This could be due to the wide definition of “digital scholarship center” employed by the survey, which included digital humanities centers under the digital scholarship center umbrella. Additional exploration of the core demographics of communities who engage with DSC services could help guide the development of additional best practices for engaging users in digital preservation conversations. The digital preservation support provided by responding DSCs overwhelmingly took the form of consultations (n = 19), followed by instruction and outreach (n = 8). This suggests an opportunity to develop additional resources for integrating reusable assets and frameworks into consultative and instructional sessions.

Future explorations and conclusion
The literature review points to ample opportunities for libraries to engage across disciplines in digital preservation, and warns of peril if they don’t.
Our pilot survey responses, though limited, suggest specific avenues of research, including the expansion of primary audiences for digital preservation outreach; the development of new (or implementation of existing) resources for engaging faculty and students in digital preservation activities compatible with the time limitations of outreach and consultation; and consideration of the implications of the organizational placement of DSCs for the provision of digital preservation support to patrons.

As DSCs continue to evolve, academic library organizations should consider prioritizing digital preservation competencies in continuing education opportunities for their employees. According to King (2018), a number of skills are useful for DSC faculty and staff, including technical abilities but also more traditional librarian expertise such as preservation, institutional repository support, and metadata enhancement; however, “Librarians felt overwhelmingly that they needed more, better trained staff to meet this need and that they themselves were in need of skills, knowledge and credentials” (44). By providing these educational opportunities, funding, or other support to employees in addition to DSC patrons, libraries can continue to serve as active and collaborative partners in supporting the creation and preservation of digital objects and digital scholarship projects.

As a follow-up to this work, more detailed investigation into preservation, through activities such as semi-structured interviews with survey respondents, could provide even more specific information on how DSCs engage patrons. While the pilot survey provides a snapshot in time, its response categories were too broad to capture detailed information at the outset. Additional research could investigate how active subject selectors, curators, or other disciplinary liaisons are in supporting the curation and preservation of DSC projects. Relatedly, we would like to learn whether DSCs are providing rubrics or other tools to support curators in deciding what to preserve, and how many DSCs are embracing benign neglect toward their projects, allowing them to decline gracefully. Since the initial distribution of the survey, the landscape of higher education has changed drastically in the wake of the COVID-19 pandemic. Additional research could examine how remote work has affected consultations and remote digital preservation work. Similarly, during the myriad social protests of the summer of 2020, did DSCs engage in or support community archiving or preservation?

The results of this pilot survey and related research have uncovered more questions than answers. As libraries and DSCs contend with an ever-increasing proliferation of data and digital objects—especially when considering legacy digital projects from early adopters in the 2000s—and budgets that remain constant at best, effective digital preservation relies on active collaboration between partners. Knowing how best to support DSCs and library and information professionals in this endeavor ensures that time and resources are spent effectively in providing long-term access to digital projects for future scholars. This work can and must be a collaborative effort between institutional and organizational units, and requires more investigation to understand just where to start.

References
Atkins, W. (2013). Staffing for effective digital preservation: An NDSA report: Results of a survey of organizations preserving digital content. National Digital Stewardship Alliance.
Bryson, T., Posner, M., St. Pierre, A., & Varner, S. (2011). Digital humanities, SPEC Kit 326. https://publications.arl.org/Digital-Humanities-SPEC-Kit-326/
Cox, J. (2016). Communicating new library roles to enable digital scholarship: A review article. New Review of Academic Librarianship, 22(2–3), 132–147. https://doi.org/10.1080/13614533.2016.1181665
EaaSI GitLab | Software Preservation Network (SPN). (n.d.). https://www.softwarepreservationnetwork.org/eaasi-gitlab/
Gray, J., Szalay, A. S., Thakar, A. R., Stoughton, C., & van den Berg, J. (2002). In A. S. Szalay (Ed.), Online scientific data curation, publication, and archiving (pp. 103–107). https://doi.org/10.1117/12.461524
Gutmann, M. P., Abrahamson, M., Adams, M. O., Altman, M., Arms, C., Bollen, K., … King, G., et al. (2009). From preserving the past to preserving the future: The Data-PASS project and the challenges of preserving digital social science data. Library Trends, 57, 315–337.
Higgins, S. (2008). The DCC curation lifecycle model. International Journal of Digital Curation, 3(1), 134–140. https://doi.org/10.2218/ijdc.v3i1.48
Hildreth, M., & Meyers, N. (2020). Final report: FAIR Hackathon workshop for mathematical and physical sciences research communities. https://doi.org/10.7274/R0-RWPP-AS13
Inter-university Consortium for Political and Social Research (ICPSR). (2012). Guide to archiving social science data for institutional repositories (1st ed.). Ann Arbor, MI.
Johnson, D., & Dehmlow, M. (2019). Digital exhibits to digital humanities: Expanding the digital libraries portfolio. In New top technologies every librarian needs to know: A LITA guide (p. 123).
King, M. (2018). Digital scholarship librarian: What skills and competences are needed to be a collaborative librarian. International Information & Library Review, 50, 40–46. https://doi.org/10.1080/10572317.2017.1422898
Li, B., Song, Y., Lu, X., & Zhou, L. (2020). Making the digital turn: Identifying the user requirements of digital scholarship services in university libraries. The Journal of Academic Librarianship, 46(2), 102135. https://doi.org/10.1016/j.acalib.2020.102135
Lippincott, J. K. (2017). Opening keynote: Fulfilling our mission in the digital age. Digital Initiatives Symposium, 17. https://digital.sandiego.edu/cgi/viewcontent.cgi?article=1131&context=symposium
Lippincott, J. K., & Goldenberg-Hart, D. (2014). CNI workshop report. Digital scholarship centers: Trends and good practice. https://www.cni.org/wp-content/uploads/2014/11/CNI-Digitial-Schol.-Centers-report-2014.web_.pdf
Locke, B. T. (2017). Digital humanities pedagogy as essential liberal education: A framework for curriculum development. Digital Humanities Quarterly, 11(3).
Lyle, J., Alter, G., & Green, A. (2014). Partnering to curate and archive social science data. In Research data management: Practical strategies for information professionals (pp. 203–222).
Millson-Martula, C., & Gunn, K. (2017). The digital humanities: Implications for librarians, libraries, and librarianship. College & Undergraduate Libraries, 24(2–4), 135–139. https://doi.org/10.1080/10691316.2017.1387011
Montoya, R. D. (2017). Boundary objects/boundary staff: Supporting digital scholarship in academic libraries. The Journal of Academic Librarianship, 43(3), 216–223. https://doi.org/10.1016/j.acalib.2017.03.001
Owens, T. (2018). The theory and craft of digital preservation. Johns Hopkins University Press.
Poole, A. H., & Garwood, D. A. (2018). “Natural allies”: Librarians, archivists, and big data in international digital humanities project work. Journal of Documentation, 74(4), 804–826. https://doi.org/10.1108/JD-10-2017-0137
Rumsey, A. S. (2011). Scholarly Communication Institute 9: New-model scholarly communication: Road map for change. Charlottesville, VA: University of Virginia Library.
Sula, C. A. (2013). Digital humanities and libraries: A conceptual model. Journal of Library Administration, 53(1), 10–26. https://doi.org/10.1080/01930826.2013.756680
Sustaining DH – An NEH Institute for Advanced Topics in the Digital Humanities. (n.d.). https://sites.haa.pitt.edu/sustainabilityinstitute/
Tallman, N., & Work, L. (2018). Approaching appraisal. In International Conference on Digital Preservation (Vol. 2018).
Tammaro, A. M., Matusiak, K. K., Sposito, F. A., & Casarosa, V. (2019). Data curator’s roles and responsibilities: An international perspective. Libri, 69(2), 89–104. https://doi.org/10.1515/libri-2018-0090
The Endings Project Team. (n.d.). The Endings Project. https://endings.uvic.ca/
Tenopir, C., Allard, S., Baird, L., Sandusky, R., Lundeen, A., Hughes, D., & Pollock, D. (2019). Academic librarians and research data services: Attitudes and practices. IT Lib: Information Technology and Libraries Journal, (1). https://trace.tennessee.edu/utk_infosciepubs/99
Tenopir, C., Birch, B., & Allard, S. (2012). Academic libraries and research data services: Current practices and plans for the future. An ACRL white paper. https://trace.tennessee.edu/utk_dataone/20
The Digital Documentation Process. (n.d.). https://digitalhumanitiesddp.com/
Tzoc, E. (2016). Libraries and faculty collaboration: Four digital scholarship examples. Journal of Web Librarianship, 10(2), 124–136. https://doi.org/10.1080/19322909.2016.1150229
Walters, T., & Skinner, K. (2011). New roles for new times: Digital curation for preservation. Association of Research Libraries. https://vtechworks.lib.vt.edu/handle/10919/10183
Wilkinson, M. D., Dumontier, M., Aalbersberg, I. J. J., Appleton, G., Axton, M., Baak, A., … Mons, B. (2016). The FAIR guiding principles for scientific data management and stewardship. Scientific Data, 3, 160018. https://doi.org/10.1038/sdata.2016.18
Xia, J., & Wang, M. (2014). Competencies and responsibilities of social science data librarians: An analysis of job descriptions. College & Research Libraries. https://doi.org/10.5860/crl13-435
oclc-social-2020 ---- Social Interoperability in Research Support: Cross-Campus Partnerships and the University Research Enterprise
Social Interoperability in Research Support: Cross-Campus Partnerships and the University Research Enterprise

An OCLC Research Report

Rebecca Bryant, Senior Program Officer
Annette Dortmund, Senior Product Manager
Brian Lavoie, Senior Research Scientist

© 2020 OCLC. This work is licensed under a Creative Commons Attribution 4.0 International License: http://creativecommons.org/licenses/by/4.0/
August 2020. OCLC Research, Dublin, Ohio 43017 USA. www.oclc.org
ISBN: 978-1-55653-157-6
DOI: 10.25333/wyrd-n586
OCLC Control Number: 1184125043

ORCID iDs:
Rebecca Bryant: http://orcid.org/0000-0002-2753-3881
Annette Dortmund: https://orcid.org/0000-0003-1588-9749
Brian Lavoie: http://orcid.org/0000-0002-7173-8753

Please direct correspondence to: OCLC Research, oclcresearch@oclc.org

Suggested citation: Bryant, Rebecca, Annette Dortmund, and Brian Lavoie. 2020. Social Interoperability in Research Support: Cross-Campus Partnerships and the University Research Enterprise. Dublin, OH: OCLC Research. https://doi.org/10.25333/wyrd-n586.

Contents
Foreword
Building Intra-Campus Relationships Around Research Support Services
  Introduction
  Scope and Methods
  Limitations
The Campus Environment
  Universities are Complex Adaptive Systems
  Intense Competition for Prestige, Rankings, and Resources
  Leadership Challenges
  Frustration and Isolation in Emerging Roles
A Model for Conceptualizing University Research Support Stakeholders
  Academic Affairs
  Research Administration
  The Library
  Information and Communications Technology (ICT)
  Faculty Affairs and Governance
  Communications
Social Interoperability in Research Support Services
  Research Data Management (RDM)
  Research Information Management (RIM)
    Public researcher profiles
    Faculty Activity Reporting (FAR)
  Research Analytics
  ORCID Adoption
  Comments on the Library as Partner
Cross-Campus Relationship Building: Strategies and Tactics
  Strategies and Directions
    Secure buy-in
    Know your audience
    Speak their language
    Offer concrete solutions to others’ problems
    Timing is essential
  Relationship Building: Practical Advice
    Meeting opportunities
    Shared staff and embedded resources
  Troubleshooting in Relationship Building
    Making connections
    Personalities
    Know your value / be confident
  Challenges: Managing Resistance and Sustaining Energy
    Managing resistance
    Investing the energy
Conclusion
Acknowledgments
Appendix: Interview Protocol
Notes

Figures
Figure 1. A conceptual model of campus research support stakeholders
Figure 2. Stakeholder interest in research support areas
Figure 3. Key takeaways about successful intra-campus social interoperability
Foreword

To develop robust research support services across the entire research life cycle, individuals and units from across the university, including the library, must work across internal silos. Previous OCLC Research publications like The Realities of Research Data Management and Practices and Patterns in Research Information Management: Findings from a Global Survey (2017–18),1 prepared in partnership with euroCRIS, already describe this growing operational convergence. Libraries are increasingly partnering with other campus stakeholders in research support, such as the office of research, campus IT, faculty affairs, and academic affairs units.

This OCLC Research report, Social Interoperability in Research Support: Cross-Campus Partnerships and the University Research Enterprise, recognizes the growing imperative for libraries to work not only in support of the goals of their parent institution, as explored in the 2018 University Futures, Library Futures report,2 but also as a valued member of a cross-institutional team. Social Interoperability in Research Support explores the social and structural norms that can serve either as roadblocks or pathways to cross-institutional collaboration and offers a model for conceptualizing the key university stakeholders in research support. It examines the network of campus units involved in both the provision and consumption of research support services and concludes with recommendations for establishing and maintaining cross-campus relationships, synthesized from interviews conducted with practitioners from all corners of campus.

Social Interoperability in Research Support offers a road map for acquainting librarians with the other research support stakeholders on campus. It additionally offers a resource for acquainting others on campus with the skills and expertise that the library brings to research support activities. While the interviews informing this publication were conducted prior to the onset of the COVID-19 crisis, I believe the findings are no less relevant. In fact, the need for increasing cross-institutional research support collaboration is likely to be amplified due to the current pandemic and its longer-term effects.

Lorcan Dempsey, Vice President, Membership and Research, OCLC

Building Intra-Campus Relationships Around Research Support Services

Introduction
In early 2020, the University Libraries at the University of Rhode Island publicized a posting for a Library Chief Data Strategist, responsible for “enhancing library-based data services programs.” The job description noted that:

This position will work with the Office of Institutional Research and DataSpark (Library-based data analytics unit) to identify avenues to increase faculty and researcher success. Working with internal (e.g. MakerspaceURI, Launch Lab, Think Lab, and the AI Lab) and external (e.g. the Office of Advancement of Teaching and Learning, the Office of Community, Equity and Diversity, Division of Research and Economic Development and IT) partners, the incumbent will plan and implement experimental and innovative activities to cultivate and expand synergistic relationships.3

This description illustrates the deeply collaborative nature of providing research support services like data management, as well as the importance of developing and sustaining productive cross-campus relationships to make these collaborations work. The academic library is undoubtedly a key figure in the landscape of research support services, but it is not the only one.
Successful management of the library’s portfolio of research services requires interaction, coordination, and even direct partnerships with other campus units. Research support is an increasingly visible and expanding part of the network of services and infrastructure that enable the university’s research enterprise. Definitions of the term “research support service” range from the general to the precise. For example, North Carolina State University defines research support as “a service that allows a researcher to spend more time, more efficiently in his/her role as a researcher, and contributes positively to the quality of the research.”4 In contrast, Si, Zeng, Guo, and Zhuang suggest that research support services specifically include research data management, open access, scholarly publishing, research impact measurement, research guides, research consultation, and research tools recommendation.5

Because research support services extend over the entire research life cycle, as well as across the entire campus, we offer a relatively expansive definition in this report. Research support services are those that enhance researcher productivity, facilitate analysis of research activity, and/or make research outputs visible and accessible across the scholarly community and beyond.6 The provision of research support services is seldom the responsibility of a single campus unit; nor is the consumption of research support services limited to a single campus cohort. Instead, both provision and consumption are distributed across many stakeholders—from the library to the research office; from faculty to administrators. The wide network of campus stakeholders involved in providing or using research support services underscores the importance of building strong intra-campus relationships to maximize their effectiveness and impact.

In this report, we document the perspectives of individuals representing a wide range of campus stakeholders in research support, either as a provider or user, with the goal of making the stakeholder groups from which they are drawn more distinct, and their potential role as a partner in research support more apparent. Building robust relationships means moving beyond a “stick figure” view of campus partners to a fleshed-out, three-dimensional understanding of their responsibilities, capacities, goals, and needs that bear on the provision and/or consumption of research support services.

Sheila Corrall observes that “[o]perational convergence (i.e., separate services/departments collaborating to coordinate their activities to improve conference and effectiveness) . . . is arguably more prevalent than ever, with libraries extending and deepening their collaborations and partnerships beyond IT and educational development colleagues to other professional services, such as research offices.”7 Operational convergence in turn is facilitated by social interoperability, which we define as the creation and maintenance of working relationships across individuals and organizational units that promote collaboration, communication, and mutual understanding.
While “technical interoperability”—different technical systems working smoothly together—may be a more familiar concept, social interoperability is of growing importance in a landscape where cross-campus partnerships are becoming both more prevalent and more necessary.

While this report is written primarily for academic librarians, we expect and hope that it will prove useful to the many other campus professionals involved in research support activities. Our premise is that cross-campus partnerships are a necessary condition for building effective research support services, and the best chance for developing these relationships is to cultivate a deep understanding of potential campus partners: their responsibilities, pain points, and areas of common interest where engagement can take root and flourish. The goal of this report is not just to acquaint academic librarians with other campus stakeholders in research support, but to acquaint other campus stakeholders with the library.

The remainder of the report is as follows. This section concludes with a brief description of the scope of our study and our data-collecting methods. The next section, “The Campus Environment,” provides background on the organizational and decision-making environment at US universities. “A Model for Conceptualizing University Research Support Stakeholders” introduces a model defining campus functional areas relevant to research support, illustrated and contextualized by our informants’ perspectives on their own roles. “Social Interoperability in Research Support Services” describes major categories of research support services on campus, and documents—through the lens of our informants’ experiences—the importance of social interoperability in building effective and impactful research support services. The final section draws out some general insights or “lessons learned” from our informants on developing good social interoperability skills that lead to successful cross-campus partnerships.

Scope and Methods
Our study is focused on research support in US universities. In focusing on research support, we see an opportunity to address a gap in existing literature,8 which extensively documents educational support services but is less rich in addressing research support services and intra-institutional research support challenges. Focusing on the United States was a pragmatic choice. Extending the analysis internationally raises significant challenges for meaningful comparison across different higher education systems. Each national higher education context is different, and worthy of separate study.

Data was collected for this study through semi-structured interviews with individuals working in a wide range of research support-related roles across campus. We chose interviews as our strategy for data collection because we sought a more in-depth, personal perspective on cross-campus collaboration than other methods, such as a survey instrument, could afford. A key impetus for our research is that knowledge resides in people: therefore, there is great benefit in gathering and synthesizing what people know.
That is the aim of this study and the rationale behind our method.9

Our interviews explored the functions and responsibilities of each individual in the context of their respective campus unit; the importance of their work—and their unit—to the university and its research enterprise; and how mutual research support interests have been or could be advanced through intra-campus relationships. The interviews sought to draw out our informants’ on-the-ground experiences in establishing and sustaining productive, cross-campus relationships. Our interviewees include individuals involved in the provision of research support services, as well as those whose responsibilities require or would benefit from consuming research support services.

In examining research support services, we felt it very important to get the complete campus view. Research support services represent a dynamic service space, with new services emerging and existing services maturing, merging, or being re-defined. Services that are sourced in one campus unit (or units) today may be shifted to other providers (on campus or off) in the future. Given this, it is important to look at the overall campus landscape to better understand the scope and opportunities of the library’s role in this space. Our interviews therefore focused on collaborative experiences in research support regardless of whether the library was involved, rather than focusing strictly on collaborations involving the library.

To identify interview candidates, we used a variety of sources, including personal networks and recommendations from colleagues and contacts. All told, we spoke to 22 individuals from 17 research-intensive universities in the United States.
Relationship building is ultimately about people interacting with people; we tried to find out from our interviewees what worked for them—and what did not—as they reached out across the campus. Our interviews were recorded and transcribed prior to review and analysis. All our interviewees were guaranteed anonymity to remove obstacles to relating their experiences. To preserve their anonymity, therefore, we do not reveal the names of the interviewees, their job titles, nor their institutions. We also use the nongendered pronoun “they” when referring to our informants. Limitations Selecting a representative and informative cohort of interviewees required making choices, acknowledging trade-offs, and recognizing the distinct challenges presented by this domain: • Complexity: many campus units could potentially be stakeholders in the provision or consumption of research support services; moreover, within each unit, there are potentially many different roles relevant to research support. The result is a vast array of individuals with different informative perspectives to offer, far beyond the threshold of our resources to address them all. • Comparison: the delineation of campus units, or the titles and roles designated within those units, varies from university to university. This makes it difficult to choose a sample from an enumerated set of campus units and associated roles within those units. • Context: every university is different, so the experiences of an individual at a given campus in building intra-campus relationships in research support will be influenced by local circumstances. With these challenges in mind, we opted to assemble a collection of interesting and informative perspectives from individuals serving in a variety of roles across the campus, rather than attempting a comprehensive view of campus stakeholders in research support,10 with the goal of comparing and contrasting their experiences in cross-campus collaboration and drawing out general lessons and insights. Social Interoperability in Research Support: Cross-Campus Partnerships and the University Research Enterprise 5 The Campus Environment Being in a decentralized institution, I have to persuade people that it’s in their best interest to do [something]. But if I can do that successfully, it’s much more likely to lead to climate change than mandating. —Academic Dean It all takes longer and has more dependencies than you think. —RIM System Administrator Social interoperability takes place within the unique environment of the modern university. One key feature of this environment is the diffusion of authority and decision-making responsibility. For example, Deane and Clarke note that “it is rare for [presidents and provosts] to give anything like an order to deans, who enjoy considerable autonomy in leading their schools. This softness of command cascades down the ranks, as department heads have wide latitude in how they lead their departments and individual faculty have considerable discretion in how they conduct their teaching and research.”11 In this section, we discuss some of the organizational attributes of US universities and how they reinforce the importance of social interoperability as a key ingredient for getting things done. Universities are Complex Adaptive Systems There is no single model that can illustrate a “typical” research university structure—every institution is a bit unique, with a dizzying variety of hierarchies, positions, titles, units, and budget models. 
However, we find useful the description of universities as “complex adaptive systems” offered by systems engineering expert and former university leader William B. Rouse.12 Similar in complexity to urban systems, universities, he argues, share these six main characteristics of complex adaptive systems:
1. Nonlinear, dynamic behavior. Behaviors in the university can appear random and chaotic. Individuals in the system may ignore stimuli, remaining oblivious to activities outside of their immediate purview, reacting infrequently, inconsistently, and perhaps overzealously when they do take notice.
2. Independent agents. Individuals, and especially faculty, have a lot of freedom to be self-directed: in research, teaching and course development, and behaviors. Their behaviors are not dictated by the university, and in fact, the independent agents may feel free to openly resist institutional initiatives.
3. Goals and behaviors that differ or conflict. The interests and needs of the independent agents acting within the university are highly heterogeneous, leading to internal conflicts, professional discourtesy, and sometimes outright competition.
4. Intelligent and learning agents. Not only are people independent agents, they are also smart independent agents, who can learn how the complex university works and adapt their behaviors to achieve their personal goals. With such heterogeneous goals across the enterprise, individuals can end up working at odds with each other.
5. Self-organization. While universities have established hierarchies (like colleges, schools, and departments), there can also be self-organized interest groups that arise to meet evolving needs. This can also lead to duplication of effort and services, as a group working to address a problem may be unaware of similar efforts and act independently instead.
6. No single point(s) of control. Universities are characterized by a significant degree of decentralization where units, as well as individuals, operate in a federated manner with a high degree of autonomy.

Our interview informants described this ecosystem as a major pain point. Universities are not sites where mandates usually work; they aren’t characterized by a command and control system. Instead, they work through incentives and inhibitions. Or, as one of our informants told us: “Mandatory is your first and fastest way to fail . . . [because] you aren’t going to dictate anything to anybody.” This can also mean that centralized efforts are more difficult.13 It’s also easy to make mistakes because “units don’t want to give up their autonomy . . . making it easy to step on toes.” Developing and stewarding trusted relationships in a decentralized organization is essential.

William Rouse’s model offers context for understanding why cross-institutional collaboration can be so difficult. Universities respond poorly to the methods of traditional organizational systems, which rely upon command and control management, a hierarchical network, contractual relationships, and a focus on efficiency. Instead, the more heterarchical and self-organized network is “better led than managed,” relying upon personal relationships, persuasion, and consideration of the interests, incentives, and inhibitions of others.
There are also a few other, interrelated themes that emerged in the course of our interviews that are important for understanding both the imperative of cross-institutional collaboration and the challenges of achieving good social interoperability within the system.

Intense Competition for Prestige, Rankings, and Resources
Research universities today are participating in a high-stakes reputation race, seeking higher rankings on national and international league tables. The quest for prestige and rankings—and the promise of greater resources with greater prestige—is driving incentives and activities throughout institutions, particularly as revenue streams decline or become less certain.14 A variety of research support-related activities relevant to institutional reputation management and research competitiveness are emerging, such as the implementation of RIM systems; support for research data management planning, storage, sharing, and preservation; and the desire for improved research analytics and benchmarking tools. These efforts require the buy-in, knowledge, and engagement of numerous campus units; they are also challenging, time-consuming efforts on decentralized campuses.

Within this highly competitive environment, strategic alignment across campus units is more important than ever. Several of our interview informants emphasized this imperative, as well as the importance of senior leadership to signal the most important issues and activities. For instance, one library leader said,

I don’t think that [research data management support] or the [RIM system] would have been successful as library-only initiatives. . . . It’s been absolutely critical that they were backed by the [office of research] because I think that’s also helped keep it to be more of a campus-wide perspective. I do think it’s pretty easy for the library to get sucked into that library world, so it could happen.

This is true not only for research support activities, but also for supporting student learning and success,15 and there is a significant literature addressing the importance of close alignment between the library and the parent institution.16

Leadership Challenges
A major challenge mentioned by several of our informants was the significant amount of leadership instability, or “churn,” as senior leaders enter and exit with regularity. This leadership discontinuity can particularly hamper progress on enterprise-wide efforts, as executive sponsorship for campus-level projects is essential for forward progress. One informant from campus IT shared,

The change in leadership up and down the chain is so frequent, that we get a strategic direction in place and then no one is in place long enough to actually see it through. Then you spend another year or two kind of rudderless, with everyone kind of doing what they . . . think is best but unless you have the leadership at that level actually focusing resources on a particular effort, you’re not going to get very far on campus with these campus wide efforts. We can do lots of smaller things that you can garner the resources and backing to do, but you can’t do really big things without [senior leadership] aligned.
The lack of sustained leadership and vision can inhibit social interoperability as well, as individuals and units may have no encouragement or leadership to create and maintain cross-institutional relationships in order to work toward a common goal. One of our informants, a senior academic affairs leader, used a tug-of-war metaphor to describe the role of a good leader in focusing attention on shared goals: "You need to make it clear that it's a rope. That it's this rope. And this is what pulling on it means."

Frustration and Isolation in Emerging Roles

Several of the informants we interviewed were professional staff members, without faculty status. In recent years there has been a proliferation of nonfaculty professionals working at US universities, providing student and research support in a variety of areas, such as IT, career advising, counseling, research administration, and more. In fact, many of the people we spoke with were in positions that are relatively new roles within the university, particularly those serving in positions leading campus-wide research development efforts or RIM implementations.

Celia Whitchurch describes these individuals as "Third Space professionals" working in emerging areas, within traditional organizational structures that simultaneously offer security and constraints, and working within and across these hierarchies in ways that are both appreciated and can sow friction.17 Many of our informants reported feeling isolated in their emergent roles, without (yet) a supportive community of practice within and beyond the university. In order to be successful, these professionals must develop trust relationships across campus, which will in turn also develop a socially interoperable community of practice. But this isn't easy, especially in the university environment where decentralization, administrative churn, and local autonomy are standard.

Some of our informants expressed frustration with their inability to lead change on campus, sometimes explicitly stating that they thought they were unable to move things forward because they weren't faculty, that they felt implicit bias, and that they were seen as less respected members of an implicit caste system, or mere "administrators."18 For example, one informant shared, "One of the reasons it may not have . . . gone anywhere was that it was coming from this staff perspective and that it may have to come through faculty members."

Leveraging relationships with faculty can be essential in this landscape, including with librarians with faculty status:

We work really well with our library colleagues, because most of them are faculty librarians. They are tenured, or on the track. It's a lot easier for us at times to hand some things over to them to let them carry it forward, especially around policy.

However, one of our librarian informants cautioned that "even though we are members of the general faculty . . .
we're not always seen at the same level."

In sum, social interoperability is an essential skill in developing successful, high-impact research support services in the kind of complex adaptive system described by Rouse, one complicated further by intense international competition, local leadership discontinuity, and the disconnect that often attends emerging roles such as those associated with many aspects of research support. A staff member (not one of our interviewees) leading the implementation of a campus-wide RIM system half-jokingly referred to this effort as "herding flaming cats" to express the significant challenges of trying to coordinate highly independent individuals with different goals and interests, spread across a large, decentralized organization. Social interoperability is a means of cutting through these complexities and obstacles, promoting mutual understanding, highlighting coincidence of interest, and cultivating buy-in and consensus.

A Model for Conceptualizing University Research Support Stakeholders

Nobody knows what the %*@# a provost does. —Provost

This section describes a conceptual model of campus stakeholders in research support identified in the course of our interviews with 22 individuals from 17 research-intensive institutions in the United States. The model helps visualize the broad functional areas on campus from which stakeholders in research support services often emerge and places the specific roles represented by our informants in a broader, campus-wide context. Campus stakeholders are not identical across institutions: the functions, responsibilities, and even nomenclature of both individual positions and campus units will differ. Therefore, the descriptions we offer below are stylized and intended to express the broad sweep of stakeholder interests in research support. These interests will be organized in different ways on different campuses.

FIGURE 1. A conceptual model of campus research support stakeholders. The model groups stakeholders into six broad functional areas of the university: Academic Affairs, Research Administration, The Library, Information & Communications Technology (ICT), Faculty Affairs & Governance, and Communications.

Our informants were associated with a diverse array of campus functional units. We have grouped them into six broad functional areas (figure 1). Note that these are not mutually exclusive; the distinctions across areas are those of focus, rather than clear administrative boundaries. Moreover, this is not a complete model of all the functional units found within a university, but instead is focused on those most relevant to research support services. Finally, we note that this model does not take into account any hierarchical relationships that may exist within and across these areas. The remainder of this section provides brief descriptions—often in the words of our informants—of each functional area represented in the model. In talking with our informants about their roles, we were impressed by the variation and nuance in responsibilities, interests, and institutional circumstances evident across seemingly similar functions or positions located at different universities.
While this makes generalization difficult, we did identify a "takeaway message" in each campus area that seemed to resonate across our discussions.

Academic Affairs

Academic Affairs in our model includes individuals responsible for overseeing teaching, learning, and research activities at the university. Examples include the provost—the university's chief academic officer—as well as deans and directors of colleges, schools, and institutes; department heads; directors of graduate study; and faculty and staff. It is important to emphasize that while Academic Affairs personnel are perhaps most commonly understood in relation to their oversight of academic programs (e.g., course offerings, teaching assignments, degree requirements), they also have responsibilities concerning research activities at the university. This underscores the need to understand the research interests of those in Academic Affairs positions and, by extension, their potential role as campus stakeholders in research support services. In some cases, academic and research interests may be intertwined, such as in graduate education, where the Graduate School takes a leading role in supporting the interests of early career researchers, including both graduate students and postdoctoral researchers.

The functions falling within this area are vast and varied, but a common theme that emerged from our interviews is that individuals working in Academic Affairs often expressed their responsibilities in the language of campus-wide strategic imperatives. We spoke to a provost who described their responsibilities as "operationalizing the institution's imperatives"—in other words, implementing the university's strategy and vision. They went on to note the importance of the provost's voice as a source of leadership in signaling and encouraging engagement with institutional priorities. Advocacy was a central responsibility of a graduate dean we interviewed, motivated by a concern that the interests of graduate students and postdoctoral researchers might be overlooked amidst an institutional focus on undergraduate education. And a dean of arts and sciences remarked on the need to demonstrate research impact and link it to institutional reputation and prestige.

Moreover, emphasis on strategic imperatives—whether communicating the university vision, advocating for the interests of a student cohort, or enhancing the institutional brand and reputation—is not confined to senior leadership, but filters down, in one form or another, through the various layers of staff underneath. For example, one of our informants stressed the importance of all faculty and staff understanding their unit's philosophy, its values, and its stance vis-à-vis other units. In working with individuals in Academic Affairs, whether executive or "front-line," it may be especially important to understand the strategic interests motivating both their needs and the capacities they have developed or are developing. Although this observation was evident from our interviews with Academic Affairs personnel, it can be usefully applied to the other functional areas defined in the conceptual model (figure 1) as well.
Research Administration

Research Administration covers a vast array of services and activities, supporting one of the three great missions of most universities (education, research, and service).19 Broadly speaking, campus units associated with research administration provide services that help advance the university's research activities, such as securing external funding, developing institutional strategy and policy, and providing oversight of issues having to do with responsible research conduct, ethics, and grant administration. Often, campus units aimed at supporting research administration are collected under a university Office of Research (or similar name) led by a vice president or vice chancellor, with responsibilities that extend over the entire research life cycle. For example, The Ohio State University Office of Research defines its mission as supporting "the development, submission, management and integrity of Ohio State research."20 Similarly, the Office of Research Administration at Stanford University provides "an array of high-quality services and expertise to support the research mission and sponsored projects administration at Stanford University."21 One of our informants in this area remarked that their primary responsibility was to

help our researchers advance the research. . . . So it also means helping them make their lives easier. I often tell them, "You guys don't . . . realize the disasters I've prevented you from seeing." . . . So really it's important because I am passionate about the research mission and we do whatever we can to keep our researchers focused on doing their research so that they're not doing other things that they shouldn't have to do.

One theme that we heard from several informants, occupying different roles and responsibilities, was the importance of managing the competitiveness and growth of the university's overall research administration. One informant described their responsibilities as "increasing the competitiveness of our faculty when they are seeking extramural support." Another informant explained their unit's role as "related to strategic planning, strategic investment opportunity for the institution to grow and expand . . . as an institution, where do we invest our dollars in order to expand our research enterprise." Yet another of our interviewees described their focus as "enterprise-level strategy" for the university's Research Office. A key message from these responses is that the university research administration, while fragmented among many different disciplinary cohorts with different priorities and objectives, is nevertheless also viewed and managed as an enterprise-wide activity. Understanding campus-wide priorities and objectives regarding research administration is an important aspect of working with this area, as well as a helpful perspective in campus partnerships aimed at providing research support services across a diverse university research community.

The Library

The library is a familiar campus presence, and its traditional mission—broadly speaking, to connect students and faculty with the information resources they need for education and research—is likely familiar to most as well.
We spoke to a number of individuals working in the library, or in library-adjacent services, and the diversity of their roles and responsibilities was indicative of the many points of contact between the library and the university research enterprise. For example, one informant manages a university press, while another directs a digital humanities institute. Other informants were involved in activities such as scholarly communication and disciplinary liaison work. As these roles suggest, today's academic library is deeply embedded in all phases of the research life cycle. Moreover, the library is often seen, as one informant put it, "as a trusted, agnostic partner on campus." Speaking of an effort to develop academic and research analytics, the informant went on to observe:

If the provost had implemented these programs, everybody would have assumed it was for some kind of evaluation process, and they wouldn't have trusted it. . . . Because we're not doing the evaluation, we can go in and just, "Hey, we're here to help you. Tell us what your story is. We'll help you find some way to tell that story better." So that worked quite well and was really empowering.

Although the library often deploys a wide range of research support services, it can be burdened by its historical role as a physical repository of print collections. One informant remarked on this challenge, observing:

Because so often, librarians are forgotten. Our expertise is completely forgotten, and we're the last people [to be considered]. So faculty are shocked when they realize, "oh, you can help me with my data? Oh, you can help me think through this . . . publishing considerations, whatever it might be."

Another informant alluded to similar issues, while at the same time noting the importance of the university librarian's role in communicating the value of the library to other campus stakeholders, "to make that case to university administrators who previously have had a limited understanding of what things the libraries do." Effective partnership with library staff involves relinquishing preconceived notions of what the library is and where its expertise lies to understand its role as a key campus player in supporting research activities throughout the research life cycle. The library in turn must communicate clearly to campus partners its full value proposition and expertise, making clear that this value and expertise extends to a broad range of services beyond books.

Information and Communications Technology (ICT)

Information and communications technology (ICT) corresponds to units responsible for supporting a wide array of technology needs on campus, including those related to education (e.g., learning management systems, distance learning), research (e.g., storage and high-performance computing resources, digital collaboration tools, and research software), and general campus technology (e.g., email services, telecommunications, networking, personal computer access and support). ICT also provides technical consultation and support.
A key feature of ICT units is their provision of centralized services in a decentralized campus administrative environment. One of our interviewees in this area observed that the "campus IT unit provides a lot of value in that they can offer a lot of centralized services to campus and make them available to everyone, make the experience more uniform across different audiences across the campus." A similar sentiment was expressed by an IT professional responsible for managing a campus research information management system, who noted that the system was a central hub for a variety of campus-wide needs, such as facilitating cross-campus collaboration, serving as a central registry for research outputs, and providing a consolidated source of metrics and other information for campus administration. And it is important to emphasize that, like Academic Affairs, ICT staff are often deeply connected to broader institutional strategic priorities: an IT director, for example, noted their unit's prominent role in enhancing the university's grant proposal success rate.

Although centralization of key services is an important function of ICT, we learned that it is challenging to draw the line between services that are best scaled to a campus-wide level and those that are best provided at a college or department level. As one interviewee pointed out, "what we hope for is the things that make sense to be run from a central point kind of gravitate and migrate towards the central unit," while discipline-specific services are managed by the relevant institutional units themselves. Our interviewees also noted that many units on campus such as colleges, research institutes, and departments have their own dedicated ICT capacity and staff; one of our informants emphatically remarked: "We stay out of that. There're local division level and department level system administrators that have some systems that they spin up and we might guide people to them but it's those folks who have the role of supporting them." Given this, an important consideration for research support services is determining at what scale a service should be deployed, which in turn influences who the appropriate campus partners may be.

Faculty Affairs and Governance

Faculty Affairs and Governance in our model encompasses a wide range of services and functions aimed at supporting faculty members in their careers and scholarly activities, including those usually associated with a faculty affairs unit in the provost's office, as well as those related to faculty governance, such as the faculty senate or the local American Association of University Professors (AAUP) chapter. A recent Chronicle of Higher Education article catalogs the many areas addressed by specialists in faculty affairs: "pay parity, leaves of absence, merit increases, annual reviews . . . tenure and promotion, contract renewals, sabbaticals, research grants, start-up funds, and faculty searches . . . counting faculty members for annual IPEDS and other national surveys."22 Faculty affairs is an emerging functional area on many campuses, and an important stakeholder in research support, conducting work critical "to facilitate a lot of the research work on campus," as one informant expressed it.
Another informant remarked that their "record-keeping" activities meant that they were "one of the sources of good data about the amazing accomplishments our faculty take part in every year." However, challenges abound: one informant mentioned that their unit was still in the process of raising its profile across the university and establishing itself as a trusted service provider, and another noted that understaffing often led to long and demanding work weeks.

The informants we spoke to represent a range of different functions within faculty affairs, but recurrent themes of both concentration and coordination emerged despite the differences across their specific responsibilities. For example, one informant responsible for research analytics observed that their unit was the sole data source for many of the metrics and analytics consumed by other campus units. Another informant highlighted the importance of "the human touch and coordination behind the scenes to make sure that all the units are working together in the way that they should, that all the efforts are strategically aligned."

Faculty governance involves pathways for faculty participation in institutional decision-making. As one former university president (not an interviewee in our study) put it, "While faculty are, by nature, independent actors who are rarely motivated en masse, there are faculty organizations that can play an important and constructive role. I worked hard to develop close, cooperative relationships with each of these groups, and the effort paid off with the faculty as a whole in gaining their support for what I was trying to accomplish."23 One of our informants, speaking of their participation in a faculty senate and its role as a forum for raising and discussing issues, noted that the "Senate is very central to campus . . . the Senate has the standing to be able to call those people to actually speak to those things. So I think that's probably the most important function that it has is that it can bring these things to the surface and make people come and publicly answer questions and speak to us." A key benefit of working with Faculty Affairs and Governance may be that they often occupy roles that cut across the campus stakeholder network, such as providing centralized data resources, coordinating cross-unit activities, and convening and/or participating in venues for discussion and problem solving.

Communications

Communications staff are responsible for promoting, marketing, or otherwise raising awareness about university programs, accomplishments, initiatives, and other activities. Communications professionals appear at various levels of the university organizational structure, whether concentrated in a university communications or public affairs office or embedded in a wide range of campus functional units, including academic units, corporate relations, the research office, alumni relations, and many more. Communications specialists are also involved in efforts to manage and promote the university's brand and reputation. The information disseminated by communications staff may be directed at an internal audience (for example, a campus newsletter highlighting news and events associated with the university's research activities) or an external audience (for example, communications targeted to local and state media, legislators, or potential donors).
One of our informants summarized their communications work as "telling the story of safe, ethical, productive . . . research . . . and then on the flip side, helping to sell the ideas and the creativity of our researchers to our funding agencies." An important insight that emerged from our interviews with communications specialists was the importance in communications work of building networks and community. One of our informants remarked on their efforts to promote interdisciplinary communication and, in doing so, to cultivate a sense of community across the diverse cohort of researchers at the university. This individual went on to observe that "that kind of connecting, communicating, developing of networks . . . is probably the most vital thing that I do." Another informant noted the importance of collaboration in their work:

So we have to be really collaborative to get our work done and just to rely on each other . . . It's part of our DNA. . . . So I work very closely with all everyone in strategic communications, from marketing and brands to the media team, to the internal communications folks on a variety of different things.

Networking is a key ingredient for successful communications work—whether building networks with colleagues in other parts of the campus to carry out communication initiatives, or building networks on campus through communication initiatives. Building cross-campus partnerships in research support services would therefore benefit from tapping into the networking and community-building skills of communications specialists, who may also be consumers of research support services.

In sum, our interviews helped uncover the wide diversity of roles and functions across the campus that touch on the university's research activity and, by extension, may potentially be stakeholders in research support services. This diversity is evident not only across the six broad functional areas highlighted in the model above, but also within these areas. It is important to look beyond traditional and/or superficial perceptions of what campus units do to understand how the responsibilities of these units evolve, expand, and re-prioritize over time. One library informant told us that, as part of a strategic planning process, they conducted a ten-question interview with various stakeholders around campus:

So it started very meta. And it wasn't until question eight that we talked about libraries. So it narrowed in, went down to their school in that department … and then into the libraries. And actually we got some of the richest information out of those first seven questions when they didn't know that we're talking about libraries because they didn't know that we could do things in areas that they were talking about.24

The essential first step in building successful campus partnerships is to know your partners—what they do, what they prioritize, and how they see themselves contributing to the university mission.

Social Interoperability in Research Support Services

Well up front, I would say I can't get anything done without partnerships.
I mean it's just absolutely essential to partner, whether it's with centers, institutes, department chairs, academic deans, research deans, all the above. —Research development professional

You have to recognize that you're part of an organization and you want to advance your collective interests. Because advancing your collective interests will almost always roll down to your own benefit. —Senior university leader

As discussed earlier in this report, there is increased operational convergence, as units and individuals across the campus must work together to provide support across all phases of the research life cycle: from project ideation, to grant development, to research, to publication and reuse. Increased interoperability across silos is necessary.25 This interoperability must exist in a technical sense, of course, but it is also the social interoperability within the complex adaptive system of the university that is needed to make efforts successful.26 In this section, we examine four research support topical areas in order to see how this interoperability between campus stakeholder groups plays out (figure 2).

FIGURE 2. Stakeholder interest in research support areas. The figure maps each functional area to the research support areas in which it holds a stake: Academic Affairs (RDM, RIM, research analytics, ORCID adoption); Research Administration (RDM, RIM, research analytics, ORCID adoption); The Library (RDM, RIM, research analytics, ORCID adoption); Information & Communications Technology (RDM, RIM, research analytics, ORCID adoption); Faculty Affairs & Governance (RIM, ORCID adoption); and Communications (RIM, research analytics, ORCID adoption).

These areas were frequently discussed in our interviews as the locus of intra-campus research support collaborations and provide rich examples of social interoperability between stakeholder groups on campus:

1. Research Data Management (RDM)
2. Research Information Management (RIM)
3. Research analytics
4. ORCID adoption

Research Data Management (RDM)

Research data management has quickly grown in interest in higher education, with significant investment in services, resources, and infrastructure to support researchers' data management needs. External funding agencies like the US National Science Foundation (NSF) require the inclusion of supplemental data management plans (DMPs) in grant proposals, noting that "[i]nvestigators are expected to share with other researchers . . . the primary data, samples, physical collections and other supporting materials created or gathered in the course of work under NSF grants."27 Institutional support for this type of mandate dovetails with activities related to proposal development, grants administration, active data management, and data curation, sharing, and preservation.28 As a result, resources and support related to research data management are distributed broadly across campus. Research administration, the library, and campus ICT are leading stakeholders in this area, and our informants reported highly synergistic relationships.
On one campus, the data librarian is embedded in the research development office, a subunit of the office of research, providing guidance on DMPs, data requirements, and library data curation resources. On another, research development staff offer training for researchers on funding opportunities, proposal writing, and industry collaboration through the library's research commons, in conjunction with research data management programming. In a third institution, research data management resources are primarily housed in the library, with significant financial support from the office of research. In this case, our informant said,

I don't think that either the [research data management services or campus RIM system] would have been successful as library only. It's been absolutely critical that they were backed by the [office of research] because I think that's also helped keep it to be more of a campus-wide perspective.

One of our informants from ICT described how their unit provides direct consulting to researchers, developing long-term relationships and deep knowledge of user needs in order to provide expert support. This includes identifying workflow and data management solutions and even advising faculty on proposal development, particularly on the technology sections of proposals. They avoid answering quick questions via email, instead seeking to deepen relationships and understand the larger context of the researchers' needs through attendance at laboratory meetings and quiet observation. Our informant remarked that "this is not trying to be an efficient operation," and emphasized that local provision to researchers is necessary to understand and address researcher needs. Their unit is "joined at the hip with the library" and always looking for new ways to collaborate.

While many stakeholders are working synergistically to provide data management support to campus, it can still be difficult for researchers to know which resources are available, as there is rarely a central resource that indexes these services. One of our informants said that if they could wave a magic wand to solve any problem, they would "cultivate a network of . . . research consultants and have a portal or something to point to" to direct researchers to an array of services such as high performance computing resources, DMP development tools, and publishing concerns.

Several key stakeholders have a keen interest in RDM service provision:

• Research administration units such as research development are eager to support RDM services. Research administrators in sponsored programs work pre-award to ensure that grant proposals include all required sections, including data management plans, while post-award administrators work to ensure that required data management policies are documented and followed. Research development professionals are eager to connect researchers with any and all services that will help ensure their productivity and success, making the research development office a natural partner with the library. The VP for Research may provide significant executive and monetary support.
• The library has a significant role to play in the education, expertise, and curation areas of research data management, and libraries may offer individual guidance, monitor agency data curation requirements, and support local deposit and curation of datasets.
• ICT professionals also play a major role in RDM support, providing access to technology and potentially offering expert guidance on workflow solutions.
• Academic affairs units are keen to support research data best practices among their scholars, and the graduate school may also be interested in promoting education and training about RDM practices among graduate students and postdocs.

Research Information Management (RIM)

Research information management (RIM) is the aggregation, curation, and utilization of metadata about research activities. In other words, it is a registry of information about the research produced rather than the research data generated by researchers, and it includes information about locally produced scholarly journal articles, monographs, datasets, presentations, and more.29 While national and regional reporting requirements are strong drivers of RIM practices in Europe and Australia, US practices are driven more by competition and reputation management needs, resulting in the emergence of two primary use cases—public profiles and faculty activity reporting (FAR) workflows—both involving an array of stakeholders from across the institution.30 Other RIM use cases in the US, including internal decision support, data reuse, and institutional repository integrations, are currently of secondary relevance. Readers wanting to learn more about these uses are encouraged to consult previous OCLC Research reports.31

PUBLIC RESEARCHER PROFILES

The first primary US use case is the implementation of public profiles of institutional researchers, with the hopes of facilitating the discovery of experts and collaborators and of catalyzing business and university relationships. One of our informants emphasized that at research universities, "we build reputation like businesses build profit," and their institution, with library leadership, has implemented a researcher profiling system that harvests publications metadata on the work of every faculty member at the institution, with search engine optimization to support expertise discovery and boost the reputation of the parent institution. A variety of terms describe these types of platforms, including Research Networking System (RNS) and Research Profiling System (RPS), and in our interviews we found the campus profile system housed in the library, the office of research, or campus ICT. In all cases, there was significant cooperation between units. One informant from research development described working "hand in glove with the library" on their campus profiles, and another informant emphasized the importance of library expertise with publications metadata as well as vendor negotiation. At another institution the profile system was administered by the library, with funding from the office of research.

Many campus units are strongly interested in campus profile systems:

• Research administration units are keen to connect researchers, develop strong interdisciplinary scientific research teams, and yield successful grant applications.
Sponsored programs and medical center staff within the office of research may also use public profile systems to comply with US National Institutes of Health (NIH) Clinical and Translational Science Awards (CTSA) recommendations, which call for participating institutions to support collaboration among clinical and translational investigators through the provision of tools, training, and technology.32
• The library values these systems for registering the scholarly record of the institution, a manifestation of the "inside out" library, and offers bibliographic expertise.
• ICT professionals may be called upon to provide technical support as well as to support system-to-system interoperability, for instance, through the facilitation of automated data feeds or support for APIs (a minimal harvesting sketch follows this list). In one case, we found campus ICT serving as the home for the institutional profile system.
• Campus communicators value resources that can help support discovery of experts for press requests and public interest stories within academic affairs units as well as research units.
• Other stakeholders in academic affairs and other units in the office of research are interested in how the aggregated content might inform institutional decision support. They also share the goal of connecting researchers with other potential collaborators within the institution.
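The automated feeds mentioned in the ICT bullet above often draw on open scholarly infrastructure such as ORCID's public, read-only API. The Python sketch below is illustrative only, not any informant institution's implementation, and it omits the paging, rate limiting, and error handling a production profile-system feed would need. The iD used at the bottom is ORCID's published sample record.

```python
"""Minimal sketch: harvesting publication metadata for a researcher
profile system from ORCID's public API. Illustrative only."""
import requests

ORCID_API = "https://pub.orcid.org/v3.0"  # public, read-only endpoint


def fetch_work_summaries(orcid_id: str) -> list[dict]:
    """Return simplified work records for one researcher's ORCID iD."""
    resp = requests.get(
        f"{ORCID_API}/{orcid_id}/works",
        headers={"Accept": "application/json"},
        timeout=30,
    )
    resp.raise_for_status()
    works = []
    # ORCID groups duplicate records (the same paper reported by several
    # sources); we take the first summary in each group.
    for group in resp.json().get("group", []):
        summary = group["work-summary"][0]
        title = summary.get("title") or {}
        works.append({
            "title": (title.get("title") or {}).get("value"),
            "type": summary.get("type"),
            "put-code": summary.get("put-code"),
        })
    return works


if __name__ == "__main__":
    # 0000-0002-1825-0097 is ORCID's public test record.
    for work in fetch_work_summaries("0000-0002-1825-0097"):
        print(work)
```

A fuller integration might use ORCID's authenticated member API and reconcile harvested records against campus HR data before loading them into the profile system.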
FACULTY ACTIVITY REPORTING (FAR)

A second important RIM use case in the United States is annual academic progress reviews of faculty, frequently called Faculty Activity Reporting (FAR).33 Because of the disciplinary expertise required for reviews, these processes have long been administered at the departmental level, with a variety of workflow solutions ranging from Dropbox folders to dedicated FAR platforms. Like the public profile use case, the FAR workflow also captures information about scholarly products like publications, plus additional information about the teaching and service responsibilities of faculty.

With so many research information management stakeholders, duplication of systems and services is possible, even likely, because of a lack of social interoperability. In particular, independent academic affairs units like colleges, departments, or research institutes may develop their own systems, instead of working with others across the institution. This was commented upon by several of our informants, including one who remarked that on their campus "we have six or seven research profiling systems. That is duplication of service, for sure." In addition to being a duplication of effort, the failure of multiple stakeholders to work together on a unified system can unintentionally dilute the hoped-for impact, as the institution delivers multiple profile discovery platforms instead of a single source of expertise. These are also silos of data that may not be easily combined to provide a broader snapshot of institutional expertise and research activity.

For institutions that are centralizing faculty activity workflows, these are often managed by a faculty affairs office. The annual review of faculty may be mandated by the campus board of trustees, the university system, or even the state. One institution ties FAR participation to eligibility for merit pay increases, but even so, there are still a few noncompliant faculty. One of our informants reported how the faculty affairs unit at their institution is valued by many stakeholders on campus, including the provost and other senior campus leaders, for the business intelligence and benchmarking their unit provides. FAR workflows, which by definition are annual reviews of faculty activities, are still usually separate from the less frequent promotion and tenure (P&T) review processes, although one of our informants reported that FAR data can be extracted for reuse for P&T. For this use case we observed campus leadership from faculty affairs as well as from academic affairs units.

In particular, FAR is of interest to several campus units:

• Academic affairs units, including departments, colleges, and the provost's office, are interested in faculty activity reporting practices. Colleges and campus-level units are particularly interested in both improved workflows and the improved aggregated data that can be used for decision support. However, there is often a great deal of unit autonomy, leading to heterogeneous practices and duplication of effort and systems. The data aggregated in FAR workflows can also be reused for academic program reviews and program accreditation.
• Faculty affairs units at some institutions, usually housed in the office of the provost, may take a leading role in implementing and managing a single FAR system for the institution.
• ICT professionals play a role in supporting FAR workflows at all levels of operation, whether departmental or institutional. They may also work to provide data from other campus systems to populate the system, such as HR appointment data.
• The library is also a stakeholder, providing expertise related to publications metadata, metadata harvesting workflows, and research impact metrics. The library may also play a role in vendor negotiations.
• There are other stakeholders whose roles are important because their unit or system provides data for the FAR system, such as human resources (for appointment information), the registrar and/or the data warehouse (for course information), and the graduate school (for doctoral mentoring and committee service).

Our informants also emphasized that public profiles and FAR are currently separate workflows managed in separate systems, even though these systems collect a lot of the same information, such as the publications and other scholarly outputs of institutional researchers. Because of a lack of both technical and social interoperability, these systems may exist in duplicate across campus, even requiring repeated manual data entry by faculty into multiple systems. As one of our interviewees emphasized, "There just needs to be the human touch and coordination behind the scenes to make sure that all the units are working together in the way that they should, that all the efforts are strategically aligned."
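Part of the duplication problem just described is mechanical: the same publication is re-keyed into a profile system and a FAR system with slightly different metadata, so the records never line up. The sketch below shows one crude reconciliation step, matching on normalized DOIs with a title fallback. The field names are invented for illustration and do not correspond to any particular vendor's schema.

```python
"""Minimal sketch: merging publication records that live in separate
campus systems (e.g., a profile system and a FAR system). Field names
are hypothetical; real systems each have their own schemas."""


def normalize_doi(doi: str | None) -> str | None:
    """Lowercase and strip common URL prefixes so DOIs compare equal."""
    if not doi:
        return None
    doi = doi.strip().lower()
    for prefix in ("https://doi.org/", "http://dx.doi.org/", "doi:"):
        doi = doi.removeprefix(prefix)
    return doi or None


def merge_records(*systems: list[dict]) -> list[dict]:
    """Combine records from several systems, keeping one per DOI/title."""
    merged: dict[str, dict] = {}
    for records in systems:
        for rec in records:
            # Prefer the DOI as a match key; fall back to a case-folded title.
            key = normalize_doi(rec.get("doi")) or rec.get("title", "").casefold()
            if key:
                merged.setdefault(key, rec)  # first system seen wins
    return list(merged.values())


if __name__ == "__main__":
    profiles = [{"doi": "https://doi.org/10.1234/ABC", "title": "A Study"}]
    far = [{"doi": "10.1234/abc", "title": "A study"},
           {"doi": None, "title": "Another"}]
    print(merge_records(profiles, far))  # two records, not three
```

Even a toy like this makes the social point visible: the matching logic is easy; agreeing on who runs it, and for which system of record, is the hard part.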
Research Analytics

While university offices of institutional research have long collected and reported on educational outcomes, providing information to campus on student enrollment, retention, and career outcomes, US institutions have been slower to aggregate content on research activities. There are good reasons for this difference: institutions have collected their own measures of student progress, while indicators of research productivity—things like journal articles and monographs—have been harder to capture, as they were processed and distributed outside of the organization. Institutions instead relied upon proxies of research productivity—measures like the number of research doctorates awarded or extramural funding received—to provide information on research productivity and prestige. With radical changes in digital publishing, persistent identifiers, and big data in the past two decades, as well as the growing influence of international rankings and league tables, there is growing interest in looking beyond these proxies for a more nuanced view of an institution's research strengths, weaknesses, and networks of opportunity.

Today administrators across the institution want improved research analytics and decision support tools.34 This was a recurring theme in our interviews. One informant in research administration, when asked what problem they would solve with a magic wand, responded: "Data. Data! I'd have data at my fingertips that I could search!" Instead, they described much of the analysis of research activity on their campus as "ad hoc" and insufficient. Another informant described how the office of research at their institution wants improved "push-button reporting" about grants submitted and received, as well as support for the identification of prospective collaborators. A third informant described how their institution's new president is "appalled" at the difficulty of understanding institutional research strengths in a data-rich way.35

We observed institutions responding to this need in a variety of ways. One institution is investing resources into a single, centralized data analytics office under the Chief Financial Officer (CFO), which will incorporate traditional institutional research professionals as well as a reporting and analytics group that can provide expanded expertise on research metrics. In another institution, the office of research has hired staff to support dedicated decision support and research analytics. This unit maintains its own local data warehouse, pulling data from external sources as well as numerous internal campus systems. Internal data sources include sponsored projects and extramural projects administrative databases, institutional financial data, and Enterprise Data Warehouse (EDW) data on HR appointments, space/room usage, and much more. External tools like SciVal and Pivot are also essential data sources. We also heard informants share how institutions are increasingly investing resources in managing institutional data through the development of campus data lakes and institutional data governance committees.

Libraries are also often supporting the institution with data analysis. For instance, Virginia Tech Libraries (not part of our interview cohort) presented to OCLC Research Library Partnership institutions in April 2020 on their use of data analysis to identify synergies and partnerships between Virginia Tech researchers and their counterparts in industry and government.36 In our interviews, a data analyst in the office of research emphasized that impact librarians have a lot of the knowledge needed by data analysts—to understand bibliographic metadata as well as the strengths and limitations of bibliometrics.
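To illustrate the kind of exploratory tally an impact librarian or research analyst might run, the sketch below queries the public Crossref REST API for works matching an institutional affiliation string and counts them by year. This is a stand-in for the warehouse- and SciVal-based analytics described above, not a substitute: affiliation metadata in Crossref is sparse and noisy, so counts of this kind undercount substantially. The affiliation string is a placeholder.

```python
"""Minimal sketch: a rough publications-by-year tally from the public
Crossref REST API. Exploratory only; coverage is far from complete."""
from collections import Counter

import requests


def works_by_year(affiliation: str, rows: int = 200) -> Counter:
    """Count Crossref works whose author affiliation matches a string."""
    resp = requests.get(
        "https://api.crossref.org/works",
        params={"query.affiliation": affiliation, "rows": rows},
        timeout=30,
    )
    resp.raise_for_status()
    counts: Counter = Counter()
    for item in resp.json()["message"]["items"]:
        # "issued" holds the earliest known publication date as date-parts.
        date_parts = item.get("issued", {}).get("date-parts", [[None]])
        year = date_parts[0][0]
        if year:
            counts[year] += 1
    return counts


if __name__ == "__main__":
    # Placeholder affiliation; substitute a real institution name.
    for year, n in sorted(works_by_year("University of Example").items()):
        print(year, n)
```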
There is keen interest in improved research analytics from across campus:

• Research administration units want improved intelligence about research productivity, campus strengths, trends, and opportunities for private research partnerships. Research development officers can also use research intelligence to inform the development of large "grand challenge" grants. Improved research analytics is seen as increasingly important for securing the prestige and competitive advantage of the institution.37
• Academic affairs units likewise want quality data to inform understanding and decision support. These leaders also see the value in aggregating institutional data to streamline existing processes such as academic program review, and quality data can bolster budgetary requests.
• The library is an important stakeholder because of its expertise with bibliographic metadata. In particular, research impact librarians understand the indexes, tools, and limitations of bibliographic analysis and play leadership roles in advising on the responsible use of metrics.
• Campus communicators also want improved information at their fingertips, data that can offer an improved understanding of institutional strengths, to help them identify stories to tell that will boost institutional reputation.
• ICT professionals are crucial stakeholders in this landscape, playing a role as data stewards, supporting interoperability, and maintaining data warehouses. They are also key players as institutions move toward new data governance structures and develop data lakes for improved and shared analysis.

Sidebar: Research Development Community at the University of Illinois at Urbana-Champaign35

The Office of the Vice Chancellor for Research and Innovation at the University of Illinois at Urbana-Champaign is working to support research on a highly decentralized campus through an inclusive Research Development Community. This community is open to all members of the campus community interested in advancing research at Illinois, and is intended to:

• Share information about policies, events, and opportunities
• Develop and maintain templates, processes, and best practices
• Build and support member literacy in a range of topics related to research development
• Collaborate across the campus research community to identify research development challenges and support changes that enhance research at Illinois

In particular, the research development council encourages participation from anyone with connections to research—including individuals and units that might not fully realize they have a relationship with the research enterprise, such as facilities management and corporate relations. The group hosts a campus-wide "research development day" as an opportunity to celebrate research at Illinois and to bring in all the disparate stakeholders and service providers, including the library, campus ICT, the supercomputing center, corporate relations, and institutional research.

In the course of our interviews, we thought we would find significant interest and engagement in research analytics from institutional research professionals, who collect, analyze, interpret, and report educational outcomes data, but we did not.
One informant offered an opinion on this gap, saying that institutional research units are largely unfamiliar with the research domain and are instead focused on Department of Education reporting on student outcomes. The informant expects institutional research offices to remain focused on educational assessment. As a result, a variety of stakeholders from across campus must work in increasingly socially interoperable ways to contribute the knowledge and skills needed to develop improved data and analysis about the research enterprise.38

ORCID Adoption

ORCID (Open Researcher and Contributor ID) is an open, nonprofit organization that works to create and maintain a global registry of unique identifiers for individual researchers. ORCID provides a framework for trustworthy identity management by linking research contributions and related activities with their contributors across the scholarly communication ecosystem. The ORCID identifier can be integrated into a number of campus workflows and systems such as institutional repositories, grant administration workflows, RIM systems, HR systems, and institutional identity management systems. Consequently, cross-campus social interoperability is important for optimizing the technical interoperability that ORCID can help support.39

However, our informants reported that ORCID implementation efforts at their institutions were slow. For instance, one institution reported how securing buy-in and making any meaningful progress on campus ORCID adoption had taken years, finally resulting in ORCID integration with the institutional identity management system and campus directory. Another described the need for significant campus collaboration: "we had an ORCID integration committee that was looking for recommendations and an implementation plan and that was fairly formal because there were folks from the information systems side of the house, HR, graduate school, and the office of research. We had to come up with a plan and kind of make a recommendation of leadership in the libraries."

Sidebar: Identifying synergies at the University of California, San Diego37

Campus ICT and the library at the University of California, San Diego, have long partnered to support researchers. In an effort to enrich cross-unit relationships, the two units arranged a working meeting for relevant staff members to identify a possible joint project or collaboration. Through an icebreaking post-it note exercise they started out collecting all the services and resources offered by both units, audiences served, areas of expertise, and service gaps. Participants suddenly realized how little they knew about the offerings of the other unit. "You have that? We didn't know you have that!" was a common refrain, and spontaneous peer consulting and planning erupted. While the originally planned project never happened, it didn't matter. Instead, the greater knowledge and social interoperability gained through this exercise facilitated trusted relationships, collaborations, and ultimately, better support services for researchers at UCSD.

There are a multitude of campus stakeholders who must be engaged in ORCID adoption:

• The library is frequently the institutional leader on ORCID planning, as it has the greatest familiarity with scholarly communication practices across disciplines. Libraries frequently assume a role as advocates for ORCID adoption and take institutional responsibility for training and outreach to scholars.
• Research administration and faculty affairs are particularly interested in ORCID integration into RIM systems, as ORCID can help disambiguate researchers, improve metadata harvesting workflows and data quality, and reduce the need for manual entry.
• Academic affairs units share this interest in improving workflows and reducing administrative burden on faculty.
• Campus ICT is a key stakeholder because integration of the ORCID identifier into the central campus identity management system is an approach being used at many US institutions40 and can facilitate the more seamless integration of ORCID identifiers into other systems across the institution (see the sketch after this list).
• Campus communicators are eager for information and storytelling opportunities about campus, which improved, disambiguated scholarly communications data can offer.41
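As a concrete illustration of the identity-management integration mentioned in the ICT bullet, the sketch below walks through ORCID's documented OAuth 2.0 "/authenticate" flow, which lets a campus system store a verified iD rather than a self-reported string. The client credentials and redirect URI are placeholders for values an institution would register with ORCID; session handling and error cases are omitted.

```python
"""Minimal sketch: collecting an authenticated ORCID iD via ORCID's
OAuth 2.0 /authenticate flow, so a verified iD can be stored in a
campus identity system. Credentials below are placeholders."""
from urllib.parse import urlencode

import requests

CLIENT_ID = "APP-XXXXXXXXXXXXXXXX"  # placeholder: institution's ORCID client
CLIENT_SECRET = "client-secret-goes-here"  # placeholder
REDIRECT_URI = "https://idm.example.edu/orcid/callback"  # placeholder


def authorization_url() -> str:
    """Step 1: send the researcher here to sign in and grant access."""
    return "https://orcid.org/oauth/authorize?" + urlencode({
        "client_id": CLIENT_ID,
        "response_type": "code",
        "scope": "/authenticate",
        "redirect_uri": REDIRECT_URI,
    })


def exchange_code(code: str) -> dict:
    """Step 2: trade the one-time code for the user's verified iD."""
    resp = requests.post(
        "https://orcid.org/oauth/token",
        headers={"Accept": "application/json"},
        data={
            "client_id": CLIENT_ID,
            "client_secret": CLIENT_SECRET,
            "grant_type": "authorization_code",
            "code": code,
            "redirect_uri": REDIRECT_URI,
        },
        timeout=30,
    )
    resp.raise_for_status()
    token = resp.json()
    # token["orcid"] is the authenticated iD; store it against the campus
    # identity record rather than trusting self-reported values.
    return {"orcid": token["orcid"], "name": token.get("name")}
```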
Comments on the Library as Partner

Throughout the course of interviews, we heard several accounts from nonlibrary stakeholders on how the library is a valued partner in research support activities. In particular, our informants commented on the library's expertise in licensing, vendor support and negotiations, and research impact and bibliometrics. We heard numerous cases of library staff serving on search committees for the hiring of research development staff members, and vice versa, with research development staff serving on search committees for library positions in data management, data visualization, and research impact. One informant saw the library as capable of making progress on things like RIM systems in part because "the library was seen as a trusted, agnostic partner on campus," while another emphasized how the library has an important role to play as a central campus unit that serves as a trusted partner for sustainable services, not just short-term projects.

However, we also heard that there are sometimes senior leaders in research administration or campus ICT who do not always understand how or why the library should be a partner in research support activities, often because these leaders were "coming from the outside [academia] and really have no concept." In these cases, libraries and their advocates on campus must effectively and regularly communicate their value and offerings. Our informants described how the library was sometimes seen as less effective than it might be. Communication and scope were big issues, as its services and value proposition could be diluted by a desire to "be everything for everyone" as well as by an overemphasis on values, without appealing to the needs and interests of others.

We also heard several comments about how a lack of confidence among librarians hindered their effectiveness. One of our library informants noted that "even though we are members of the general faculty . . . we are not always seen at the same level." Another interviewee commented that librarians "don't feel very comfortable. They don't feel like they're equals with the rest of campus . . .
[even though] there’s no reason why they shouldn’t feel like equals because they [provide] an amazingly valuable expertise.” A feeling of implicit bias, in the sense of not being perceived as being on an equal footing with faculty, was also reported by nonlibrary administrative professionals, including by one interviewee who recommended confidence in one’s own abilities: I think the number one ingredient is the understanding that I bring a certain expertise to the table that [a faculty member] might not have. You are a faculty member in your areas, and you’re a leading world expert on it. Great! It doesn’t mean you know how to do data analytics related to your publication citation count. The library has an important role to play as a central campus unit that serves as a trusted partner for sustainable services, not just short-term projects. Another informant thought it imperative that librarians see themselves as “equal partners to make teams of diverse expertise to accomplish significant, important objectives quickly.” In our interviews, the library was sometimes also seen as “slow,” moving less quickly and with less urgency than other parts of campus: “they absolutely do not move at the same pace that research faculty move.” A couple of informants also commented on the library’s discomfort with financial realities or cost recovery, describing an “unrealistic” desire for everything to be “free” resulting in the criticism in that libraries “don’t focus on the freaking bottom line.” In sum, our interviews highlighted the importance of cross-campus social interoperability in the successful provision and use of major categories of research support services. In the next section, we will focus on strategies for increasing social interoperability and the success of cross- institutional research support efforts. 26 Social Interoperability in Research Support: Cross-Campus Partnerships and the University Research Enterprise Cross-Campus Relationship Building: Strategies and Tactics When things work well, it’s about people and relationships. When things don’t work well, it’s often also about people and relationships. —Academic Dean You can make more friends in two months by becoming interested in other people than you can in two years by trying to get other people interested in you. — Dale Carnegie, How to Win Friends and Influence People42 Strategies and Directions Considerable energy is invested in relationships and trusted partnerships in the provision and use of research support services. The amount of time and stewardship required is necessary given the complexities of the campus environment, which is characterized by highly heterogeneous interests and needs of smart independent agents, no single point of control, and a high level of self-organization. Strategies and challenges of cross-campus relationship building were discussed repeatedly in our interviews, and some recurrent themes emerged. SECURE BUY-IN One of the strongest common themes in our interviews was the need to get people “bought in to what you want to do.” Collaborations work best when everyone “thinks they are getting something they want.” This is especially important when working with independent agents in a decentralized campus environment. Persuading someone that something is in their own best interest to act upon is a powerful tactic in an environment where mandates do not exist or do not work. More than one of our interviewees called this “selling”—selling the idea and the role of the unit in it. 
“Of course, you've got to sell all the time!” Another interviewee explained: “Really, it's building your services so that they're meeting the needs that you think need to be met as well as possible so they're attractive to people to use them. Kind of just like competing in regular free market.” Self-interest is a powerful motivator and can be leveraged in mutually beneficial ways. Our interviewees described directly appealing to the other party's needs and goals as far more powerful than highlighting shared values or noble principles. “So being in a decentralized institution, I have to persuade people that it's in their best interest to do it. But if I can do that successfully, it's much more likely to lead to [institutional] climate change than mandating.” Appealing to people's self-interest requires the ability to offer something that speaks to those needs, in a language that is clearly understood by the other party. This will help the unit to be more successful and to better align with campus goals and perspectives.

People who promoted only their own agenda, or their unit's, rather than the entire university's, were seen as counterproductive by our interviewees. One senior university leader said, “that agenda thing is something that really, especially in academia, is the thing that really turns people off. . . . And I don't even think it has to be a mutual benefit. . . . We can dissolve it if it's better to be in another unit, and I can go do something else with my life. That's all okay as long as it's not for some stupid, frivolous territorial thing that somebody needs to own everything. If it's truly in the best interest of the campus and our researchers, then it can be okay.”43

KNOW YOUR AUDIENCE

Deeply understanding other stakeholders on campus becomes crucial when appealing to their self-interest is the best way to succeed. Throughout all our interviews we heard a variation of: do your research on them first, and then be “meaningfully relevant” to them. Relationship building at this level goes beyond knowing job titles. It requires real engagement and a deep understanding of other people's responsibilities, priorities, and activities. It means being curious and courteous: taking the time to learn about what others do, developing trust, and stewarding the relationship over time.

One of our library interviewees shared how they used a fixed set of ten questions for conversations with stakeholders (see sidebar). These questions were not immediately focused on what the library can bring to the table. Instead, the first seven questions explored larger issues about how the other stakeholder perceived campus priorities and how their unit might be affected by changing priorities. In the course of working through the ten questions, the focus narrowed, until the final questions touched only on library services. The informant noted that they “got some of the richest information out of those first seven questions when they didn't know that we're talking about the library because they didn't know that we could do things in areas that they were talking about.” This strategy increased library awareness of the priorities and challenges of other units and provided the context the library needed to align its work successfully with stakeholders.
It also raised awareness among the other stakeholders about the offerings of the library and even provided the spark for some new programs and collaborations.

A script for learning about other units used at Rutgers University–New Brunswick43

1. In what major ways do you see the University's work and focus changing during the next 2-3 years?
2. How are these changes affecting the work and focus of your school/department/program (unit)?
3. What are your unit's goals for the next 2-3 years?
4. What about your responsibilities within the unit? What are your top responsibilities now, and how do you see these changing over the next 2-3 years?
5. What challenges must your unit overcome in order to meet its goals?
6. If you were a new hire, what tools and services would you need to be successful?
7. The next several years will not only be all about challenges. What are the opportunities that your unit will be pursuing? What do you see as exciting during the next few years?
8. How do the librarians and libraries contribute to your work now?
9. Considering the University's goals and your unit's goals, how could the libraries best contribute to the work of your unit—and to you—during the next few years?
10. What do you want the libraries to give careful consideration to as we craft our strategic plan?

Paying attention to what is happening on campus more broadly is part of this effort to understand your existing or potential audiences. This can mean something as simple as reading emails coming from other units instead of filtering them out as spam, attending events to demonstrate interest in others, and serving on campus committees. Interviewees repeatedly warned against underestimating the importance of “just knowing what other people are up to or having other people know what you're up to.” Even a small-scale project undertaken by a single campus unit and aimed at a limited audience may be an indicator of an unfulfilled large-scale need that can be identified and addressed through cross-unit cooperation. Understanding the landscape of one's institution, as well as the national landscape, was listed as a high priority by many of our interviewees. “If you think you don't understand, you have even more of an obligation to kind of immerse yourself and understand more.” Conversely, no unit should assume others know what it does; it should actively reach out, make it easy for others to learn about its services and needs, and routinely make the case for what it needs.

SPEAK THEIR LANGUAGE

Different units on campus use different terms for the same things, for historical or cultural reasons. What some call financial viability, others call sustainability. Some talk about profit; others are more comfortable with calling it surplus. Some prefer the term support over subvention. Some units, such as the library, tend to avoid terminology with a corporate or business inflection; other units use that language and can best be served by adopting it, too. Metadata is a particularly fraught example: it means something specific to librarians but something very different to IT and/or data warehouse administrators. Interviewees across the board emphasized the importance of the ability to speak the audience's language. Obviously, if a service or project is not understood to be addressing a problem because of the language used, it may not capture the attention of whoever has that problem.
Being prepared to deliver an elevator speech, when necessary, is one aspect of this. As one informant put it, “Do you know how long you [have] to make that case? Two minutes. Two minutes. And if you are not successful, the meeting is over.” Some of our informants shared how they help other units package their information in more suitable ways (e.g., by producing one-page information sheets on topics, or key talking points for outreach and engagement with others on campus).

OFFER CONCRETE SOLUTIONS TO OTHERS' PROBLEMS

Another important theme in our interviews was the importance of understanding others' pain points and of demonstrating how your offerings can help alleviate them. Interviewees shared how useful it is not to go into meetings empty-handed, but to anticipate needs, be proactive about building skills, and offer solutions in advance of demand—not just vaguely ask how they could help. One of our interviewees was particularly outspoken on this, recalling a situation in the past when they worked as a faculty member and were asked that question by a library representative:

“What can I do for you?” . . . That's like the most freaking passive-aggressive crap-ass thing you could ever do to a faculty member because how the hell should I know what he could do for me? I don't know what he knows. I don't know what resources he has. I don't know how much time he has. So it's not my job to educate him on how he can help me. It's his job to figure out what my needs are, and to come in with, “Hey, I'll bet you're trying to.”

More than once, taking the first step was mentioned as a tactic on campus: offering concrete assistance or cooperation without expectation of immediate payoff, or advocating for others to be invited to a meeting potentially relevant to them, in the hope that one day they will remind others to include you where you should be included.

TIMING IS ESSENTIAL

No pushing will help when the timing is not right, when needs are diffuse and urgency low, or when current priorities differ entirely. One informant said, “Until they need to hear it, they're not going to hear it.” Creating awareness, informing repeatedly, and patiently waiting for the right moment—or even until the right partner comes into a role—can be the best strategy in an environment of nonlinear dynamic behavior and differing goals. All of this takes considerable time and effort, our interviewees agreed, as well as patience and perseverance:

I think certainly at a big university like [ours], remembering the information lag factor, it takes people a while to realize you exist and then it takes people a while to remember what you do, and then they'll remember what you do and then you've gone on to do several other things, but they still only remember the first thing that they learned about you. And then if you screw up then that's the last story they remember, and they might not update their data on you for a while.

Relationship Building: Practical Advice

Building new relationships and maintaining existing ones on campus requires considerable commitment and investment.
We asked our interviewees to share how they made this happen, what opportunities there were to learn about stakeholders on campus, and which ones they found to be more useful than others.

MEETING OPPORTUNITIES

Our interviewees emphasized the importance of making regular contact with other stakeholders to build trust and steward relationships. These contacts existed on a continuum from formal to informal and from scheduled to spontaneous, and there is value in every type of interaction. Committee work—serving on research committees, the faculty senate, or other bodies—was mentioned repeatedly as an invaluable opportunity for relationship building: to present oneself as a potential partner and to demonstrate good citizenship and support of larger university goals. It is excellent for temperature taking and trust building, and it helps sharpen skills in many ways. Faculty governance, in particular, was mentioned as something important for library staff to be engaged in, to find out what other people were talking about and how it might impact the library, as well as to increase library staff's visibility and confidence as faculty members on an equal footing with other faculty: “I think that my work in Senate gave me so much more opportunity and ability to build these relationships with faculty that I really wish my staff had more of. . . . I do feel like participating in governance can really help you to grow those skills that you need to be an effective liaison as a collaborator rather than as a servant.”

Scheduling standing meetings with stakeholders was strongly recommended by several interviewees, both for general knowledge sharing and as a welcome option to raise or discuss topics of relevance without unnecessarily ringing alarm bells. Especially when new staff come on board in other units, creating opportunities to meet them early and regularly was mentioned as good practice. Executive-level support can be particularly helpful in creating the right sort of relationships at the right point in time. Some interviewees saw an opportunity to create communities that cut across campus silos to unite people with shared interests on campus. We learned about examples of open and inclusive groups on campus that regularly convene with the express purpose of facilitating communication and networking—such as the Research Development Community at the University of Illinois (see sidebar, page 22). Such initiatives are good examples of the self-organized interest groups, arising to meet evolving needs, that are so typical of complex adaptive systems. Finally, informal or “hallway” conversations before or after more formal meetings were highlighted as important ways of engaging. In these conversations, free of pressure or expectations, real progress can be made. People are less suspicious and “frankly less guarded,” one interviewee remarked.

SHARED STAFF AND EMBEDDED RESOURCES

Another recurring theme was the benefit that staff movements can bring to the relationships between units, be it shared staff, embedded staff, or staff who move around when changing roles. A network of former colleagues spread out across campus can be immensely beneficial. Members of staff familiar with one unit and closely working with another can function as trusted “ambassadors,” “allies,” and “champions,” and can effectively “translate” goals, processes, or values between units, as well as connect people.
They can help with “cross-pollination,” the cross-unit flow of information and expertise, or simply with “getting a feel for their day-to-day struggles and activities.” And while most staff moves occur organically in the course of natural career progressions, encouraging them can even become a strategy: one of our interviewees told us they purposefully nurture talent in their unit to help them move elsewhere on campus. Based on what we heard in our interviews, the units with which the library most often shared staff, or to which library staff most often moved, were campus IT and the research office, sometimes as a result of previous project cooperation.44

Troubleshooting in Relationship Building

MAKING CONNECTIONS

A common issue our interviewees reported dealing with was that of making connections with the right people. Referrals and recommendations were often described as being immensely helpful, much more so than any cold calling. One informant shared how, through their investment in long-term relationships with faculty members, “sometimes we get faculty who then introduce us to the next faculty member because they say, ‘These folks have helped us.'”

In particular, the importance of a “connector” or “hub” person was recognized by several of our interviewees. The value of someone well connected on campus, someone who can help identify partners or recommend connections, people to meet with, and workshops to attend—a “hub of hubs”—cannot be overestimated. The best of these people “can see both the details and the whole and bring them together on a campus to talk through the research enterprise. How do we make it better, faster, stronger, easier? How do we identify ways that the system can help support that better?” Some of our interviewees identified with the role of a connector on campus themselves and said their job was “to be a facilitator.” This is also an area where senior leadership support can be very helpful. Several of our informants emphasized that it is important for relationships and conversations to take place at multiple levels of the parent units, up and down the organizational hierarchy. And while top-down directed collaborations tend to fail, having executive support behind collaborations can help move people along, as we heard some of our informants say.

PERSONALITIES

One of the common issues our informants reported dealing with was that of having to get along with the personalities on campus. Relationship building is all about people. Interviewees often mentioned how their relationships and partnerships depended on the personalities involved, and in some cases failed because of this. Even when the difficulties seemed to lie in the unit or program, interviewees felt that, ultimately, they originated from differences in personalities rather than disciplinary perspectives. In such cases, it can be helpful to deeply understand not only professional priorities but also personal sensitivities, so you can “sell to” those more personal needs, too. Still, sometimes an individual can prove impossible to work with. In these cases, walking away for a time and waiting for someone else to fill a role can be the most productive way to deal with the situation. One interviewee explicitly recommended not to “spend time trying to work with areas that are less receptive” and instead to work with whom you can.
“The good news of being at a big university is there's plenty who are happy to make progress. . . . So in the meantime [while a certain unit is not amenable], we'll work with those who want to make changes and do these things.” Good relationships cannot be forced but must be stewarded over time. However, more than one interviewee also recommended not assuming malicious intent: following up to ask whether something potentially offensive really happened may be all it takes to see it fixed—and the relationship maintained. In any case, our informants warned against ever burning bridges.

KNOW YOUR VALUE / BE CONFIDENT

It is important to adapt to one's audience, but it is equally important to be very clear and confident regarding one's own role and value, including the scope of one's work. Being everything to everyone will not work. Stay focused on what you want to achieve: saying no or limiting scope can strengthen your value as a reliable partner.

Challenges: Managing Resistance and Sustaining Energy

In complex adaptive systems, it is not uncommon to see differing goals and behaviors result in internal conflicts and outright or perceived competition. Interviewees talked about how they are constantly trying to anticipate negative responses from different corners of campus and, at the same time, avoid losing control over their communications and efforts. This type of risk management is an important component in developing research support activities in the complex university environment of diffuse interests and conflicting perspectives.

MANAGING RESISTANCE

Independent agents may feel free to openly resist institutional initiatives in a system lacking single points of control. One successful tactic for dealing with the risk of upsetting others is consulting early and often with other stakeholders. For example, one informant recommended sharing ideas or drafts early in the process in order to take the temperature and collect preliminary feedback from stakeholders, top to bottom, so that they all feel consulted, concerns are addressed, and buy-in develops up front. That way, one interviewee said, stakeholders will not feel blindsided by the launch of something new. Another informant emphasized the need to anticipate whether and how new collaborations, or collaborative projects, impact business or administrative processes. Process changes often create resistance, and it is important to deal with them early and wisely.

Resistance can also result when units feel their work or autonomy is at risk, or when initiatives are perceived as competing. One interviewee shared an example where they “ended up stepping on toes across the organization” because their unit offered services in a research support area (impact analysis) that others felt they owned. Departmental units felt their local autonomy was threatened. In such cases, the informant recommended, it is wiser not to try to replace existing services, but rather to find ways to complement and support them—with data, for example—while acknowledging the units' independence. Earlier consultation with these audiences might also have reduced this friction.
INVESTING THE ENERGY

With risks to manage, relationships to steward, and plenty of work to do, it is not surprising that we heard that people could feel “overwhelmed.” But our informants also emphasized the effort they invest in relationships: “It's going to take quite a bit of effort to learn and listen about the other person's perspectives and where they're coming from.” And the work of relationship building never ends, “because if somebody changes, you've got a new person in a position, then you've lost all that historical agenda in that relationship.” This can be frustrating over time, even grueling. Collaboration can also slow down progress—collaboration and speed can end up being trade-offs that must be balanced, potentially resulting in the duplication of systems and services on campus mentioned earlier. But despite this, our informants overwhelmingly agreed that taking the time to build strong cross-institutional relationships was essential for attaining individual and collective goals.

People in emerging roles especially report feeling isolated. They often lack a team to support them—or simply to free them up for their mission-critical work. Interviewees mentioned several examples where a lack of resources or support—for example, help with marketing tools or assistance with event planning—made it harder for them to do impactful work. Having to attend to work outside their immediate expertise is an additional stress point for staff in emerging roles. We also heard of 80-hour work weeks and talked to informants who felt overworked and tired. In this situation, making an effort at relationship building can seem overly burdensome. But getting out of the office to learn more about what others are doing can also reduce the feeling of isolation and provide opportunities for building community and getting support. Relationship building is a significant but valuable investment. It is not cost-free, but as our informants made clear, the rate of return is usually quite high.

FIGURE 3. Key takeaways about successful intra-campus social interoperability: secure buy-in; know your audience; speak their language; offer solutions to problems; timing is essential; find opportunities to connect; leverage shared staff; find “connectors”; manage personalities; be confident in your value; manage resistance; invest the energy.

Conclusion

We undertook this project to explore the role of social interoperability in research support, following previous OCLC Research efforts in which we observed the need for libraries to work closely with other campus stakeholders to advance resources and services.45 Our goal was to focus entirely on the topic of cross-campus, cross-domain institutional collaboration and, using the human intelligence offered by our interview subjects, to offer guidance for successful social interoperability in the complex adaptive system of the university.
Effective social interoperability across campus units is an important, and increasingly necessary, feature of successful research support services, and it requires a thorough knowledge of campus partners. In this report, we have gathered information from stakeholders in research support around the university, describing their goals, interests, expertise, and, crucially, the importance of cross-campus relationships in their work. Based on our informants' experiences, we drew out lessons and good practices for fostering social interoperability in the provision and use of research support services (figure 3). Our key findings include:

• US research universities are highly decentralized, dynamic institutions, filled with heterogeneous, independent agents that sometimes work at cross purposes. This environment creates specific challenges and calls for the creation and maintenance of working relationships across individuals and organizational units that promote collaboration, communication, and mutual understanding—in short, social interoperability. This is of special significance for stakeholders in research support, where roles are often new, responsibilities emerging, and staff often report feeling isolated in the absence of an established community of practice within and beyond the university.

• The essential first step in building successful campus partnerships is to know who the other stakeholders are: what they do, what they prioritize, and how they see themselves contributing to the university mission. In “A model for conceptualizing university research support stakeholders,” we present a conceptual model of key stakeholders in the provision and consumption of research support services: Academic Affairs, Research Administration, the Library, Information and Communications Technology, Faculty Affairs and Governance, and Communications.

• In “Social interoperability in research support services,” we document our informants' experiences in building and maintaining cross-campus relationships in key research support service areas: research data management (RDM), research information management (RIM), research analytics, and ORCID adoption. Our interviews highlight the importance of social interoperability in the successful provision and use of research support services. But challenges remain; even when stakeholders are working synergistically, it can still be difficult for researchers to know which resources are available if there is no central resource that indexes the services provided by different stakeholders. Duplication of systems and services is common. And progress can be slowed by the necessity of first securing buy-in across stakeholders on campus.

• “Cross-campus relationship building” suggests lessons and best practices from our informants on how to optimize social interoperability in research support. For instance, persuading someone that something is in their own best interest to act upon is a powerful tactic in an environment where mandates do not exist or do not work. In addition, knowing your audience, speaking their language, offering concrete solutions to their problems, and getting the timing right are important strategies. Considerable investment of energy and time is necessary for building and maintaining cross-campus relationships, but as our informants made clear, the rate of return is usually quite high.
ACKNOWLEDGMENTS

The authors extend special thanks to our interview informants, who generously shared their expertise and time with us for this investigation. We also thank members of the OCLC Research Library Partnership who tapped into their own campus networks to recommend possible interview informants for this study. Several OCLC colleagues provided guidance and support in the preparation of this report. Ixchel Faniel provided input on strengthening the interview protocol; Erin Hood and Nick Spence assisted with note-taking and interview project management activities; and Lynn Silipigni Connaway extended resources for interview transcription as well as offered sage guidance throughout. The report could not have been published without the significant efforts of the OCLC Research publishing team, including Erica Melko, Jeanette McNicol, and JD Shipengrover. Finally, our work was made possible by the senior leadership of OCLC; we wish to particularly thank Lorcan Dempsey, Vice President, Membership and Research, for OCLC, for his continued support of this effort.

APPENDIX: INTERVIEW PROTOCOL

Institutional Stakeholders in Research Support Project, oc.lc/stakeholders

Date of interview
Informant's name
Informant's title
Informant's unit
Informant's institution

0. Introductions (5 minutes/xx:00-xx:05)

Thanks for talking with us today. We want to spend 75 minutes with you today, talking about your role at your institution, in order to learn more about your unit's goals, tasks, challenges, and collaborations. This discussion is part of our information gathering for a project entitled “Institutional Stakeholders in Research Support,” in which we are examining and documenting the numerous campus stakeholders that—as we observe—are increasingly called to work together to support one or more research activities on the university campus today. The three of us here are the core research team working on this project being conducted by OCLC Research, a leading research institute or think tank investigating issues relevant to the world's libraries. At the conclusion of our project, we will publish a synthesis of our findings as an OCLC Research Report. I will be leading the discussion while my colleagues take notes.

Introductions [ask each participant to quickly share their name and role]

Your interview today is confidential, and your comments will be useful to us as we attempt to synthesize the variety of goals and roles taking place at research universities today. We would like to record our conversation today—but only for our own personal use; we will not share the recordings with others.

Did informant agree to allow recording? (Y/N)

1. Why is the work that you do important? (15 minutes/xx:05-xx:20)

Question purpose: to understand their main goals and how these align with institutional goals. This question should also help us understand the drivers, although the follow-up questions may be necessary to get there.

Follow-up questions:
a. Redirect to focus on research support services. Do you feel that part of what you do is providing research support? [relevant to only some informants]
b. Why is this work valuable to your institution? Your campus unit? Researchers?
c. Who are the main stakeholders who care about the work that you do?
These may be people or organizations inside or outside your university. [this is an incentives question: can we maybe use the RDM incentives model?]

2. HOW do you do it? (15 minutes/xx:20-xx:35)

Question purpose: to get them to describe what their unit does—the tasks.

Follow-up questions:
a. What is your unit really good at?
b. What's most important?
c. Is your unit typical of practices at similar institutions?
d. [for campus IT—you work at systems of scale. Are there differences in how this works for research services vs. educational services?]
e. Research support services have become a much more visible part of the service portfolio on campus. Are you familiar with that term, and if so, what kinds of services come to mind? [if not familiar, here are some examples: RDM, RIM, bibliometrics support—services that support researchers and also services that support the institutional research enterprise, reporting, and reputation management.]

3. What are the most beneficial relationships for helping you achieve your goals? What are the relationships that are important for achieving your unit's goals? (20 minutes/xx:35-xx:55)

Question purpose: to understand who they are partnering with.

Follow-up questions:
a. What units are your most common collaborators/partners?
b. Are you trying to build new relationships across campus? Why?
c. Have you tried to collaborate with some units and failed?
d. Have you partnered with the library?
e. What about off-campus collaborations? Professional conferences?

4. If you could wave a magic wand, what would you change or fix? (10 minutes/xx:55-xx:05)

Question purpose: to understand their pain points.

Follow-up questions:
a. What are some new things on your road map that you'd like to accomplish?
b. Can you give us a specific example of something you are trying to do?
c. What are the primary barriers?

5. Is there anything else we should have asked? (5 minutes/xx:05-xx:10)

Comments/Perceptions

NOTES

1. Bryant, Rebecca, Anna Clements, Pablo de Castro, Joanne Cantrell, Annette Dortmund, Jan Fransen, Peggy Gallagher, and Michele Mennielli. 2018. Practices and Patterns in Research Information Management: Findings from a Global Survey. Dublin, OH: OCLC Research. https://doi.org/10.25333/BGFG-D241; Bryant, Rebecca, Brian Lavoie, and Constance Malpas. 2018. Sourcing and Scaling University RDM Services. The Realities of Research Data Management, Part 4. Dublin, OH: OCLC Research. https://doi.org/10.25333/C3QW7M; Bryant, Rebecca, Brian Lavoie, and Constance Malpas. 2018. Incentives for Building University RDM Services. The Realities of Research Data Management, Part 3. Dublin, OH: OCLC Research. https://doi.org/10.25333/C3S62F; Bryant, Rebecca, Brian Lavoie, and Constance Malpas. 2017. Scoping the University RDM Service Bundle. The Realities of Research Data Management, Part 2. Dublin, OH: OCLC Research. https://doi.org/10.25333/C3Z039; Bryant, Rebecca, Brian Lavoie, and Constance Malpas. 2017. A Tour of the Research Data Management (RDM) Service Space. The Realities of Research Data Management, Part 1. Dublin, OH: OCLC Research. https://doi.org/10.25333/C3PG8J.

2. Malpas, Constance, Roger Schonfeld, Rona Stein, Lorcan Dempsey, and Deanna Marcum. 2018.
University Futures, Library Futures: Aligning Library Strategies with Institutional Directions. Dublin, OH: OCLC Research. https://doi.org/10.25333/WS5K-DD86.

3. The University of Rhode Island. “Assistant Professor, Library Chief Data Strategist.” Human Resource Administration: Posting Details. (Archived 28 February 2020). https://web.archive.org/web/20200228000245/https://jobs.uri.edu/postings/7102/print_preview.

4. NC State University. “Researcher Support.” North Carolina Training Consortium. Accessed 3 August 2020. https://research.ncsu.edu/nctc/study-guide/project-administration/project-management/researcher-support/.

5. Si, Li, Yueliang Zeng, Sicheng Guo, and Xiaozhe Zhuang. 2019. “Investigation and Analysis of Research Support Services in Academic Libraries.” The Electronic Library 37, no. 2: 281-301. https://doi.org/10.1108/EL-06-2018-0125.

6. We first used the term “social interoperability” in this way in early 2019. See Lavoie, Brian. 2019. “RLP Research Data Management Interest Group: Acquiring RDM Services for Your Institution,” Hanging Together: the OCLC Research blog, 6 February 2019. https://hangingtogether.org/?p=6997.

7. Corrall, Sheila. 2014. “Designing Libraries for Research Collaboration in the Network World: An Exploratory Study,” 37. LIBER Quarterly 24, no. 1: 17-48. https://doi.org/10.18352/lq.9525.

8. See Bradley, Cara. 2018. “Research Support Priorities of and Relationships among Librarians and Research Administrators: A Content Analysis of the Professional Literature.” Evidence Based Library & Information Practice 13 (4): 15–30. https://doi.org/10.18438/eblip29478; Bradley, for example, notes that “the importance of collaborating with others on campus (units, students, and faculty) in developing and delivering support for student learning has been well-documented . . . There has been less evidence collected about how academic libraries can best support campus research.” (p. 16). Bradley goes on to observe that collaboration in research support documented in the literature tends to focus on research data management. (p. 17-18)

9. A copy of the interview protocol is provided in the report appendix.

10. Some specific gaps include in-depth discussions about the roles of Technology Transfer, Institutional Research, or Corporate Relations units, which may be stakeholders in research support services on some campuses.

11. Dean, Jr., James W., and Deborah Y. Clarke. 2019. The Insider's Guide to Working with Universities: Practical Insights for Board Members, Businesspeople, Entrepreneurs, Philanthropists, Alumni, Parents, and Administrators, 17. Chapel Hill: University of North Carolina Press.

12. Rouse, William B. 2016.
Universities as Complex Enterprises: How Academia Works, Why It Works These Ways, and Where the University Enterprise Is Headed, 5-9. New York: Routledge.

13. Ibid.

14. Hazelkorn, Ellen. 2011. Rankings and the Reshaping of Higher Education: The Battle for World-Class Excellence, 5-10. Houndmills, Basingstoke, Hampshire: Palgrave Macmillan. https://doi.org/10.1057/9781137446671. In the past two decades, state support for public higher education has declined by billions of dollars, and undergraduate enrollment is also in decline, with larger declines on the horizon in the 2020s; Mitchell, Michael, Michael Leachman, and Kathleen Masterson. 2017. A Lost Decade in Higher Education Funding: State Cuts Have Driven Up Tuition and Reduced Quality. Washington, DC: Center on Budget and Policy Priorities. https://www.cbpp.org/research/state-budget-and-tax/a-lost-decade-in-higher-education-funding; Nadworny, Elissa, and Max Larkin. 2019. “Fewer Students Are Going To College. Here's Why That Matters.” NPR KQED audio (Education), 16 December 2019, 5:00 AM ET, Morning Edition (6 minutes). https://www.npr.org/2019/12/16/787909495/fewer-students-are-going-to-college-heres-why-that-matters.

15. Connaway, Lynn Silipigni, William Harvey, Vanessa Kitzie, and Stephanie Mikitish. 2017. Academic Library Impact: Improving Practice and Essential Areas to Research. Chicago, IL: Association of College & Research Libraries, 31, 40. http://www.ala.org/acrl/sites/ala.org.acrl/files/content/publications/whitepapers/academiclib.pdf.

16. Cox, John. 2018. “Positioning the Academic Library within the Institution: A Literature Review.” New Review of Academic Librarianship 24, no. 3-4: 217–41. https://doi.org/10.1080/13614533.2018.1466342.

17. Whitchurch, Celia. 2015. “The Rise of Third Space Professionals: Paradoxes and Dilemmas.” In Forming, Recruiting and Managing the Academic Profession, edited by U. Teichler and W. Cummings, vol. 14, The Changing Academy—The Changing Academic Profession in International Comparative Perspective. Cham, Switzerland: Springer. https://doi.org/10.1007/978-3-319-16080-1_5.

18. There is a significant literature blaming a bloated number of unnecessary, highly paid administrators for rising college costs. However, there is much more evidence that drastically reduced state support for public education is the primary factor. Many new positions have been added—and seen as necessary—as institutions have added IT infrastructure, compliance officers, and more student and research support services. As faculty member and author Robert Kelchen says, “Faculty do complain about all the assistant and associate deans out there, but this workload would otherwise fall on faculty.
And given the research, teaching, and service expectations that we face, we can't take on those roles.” See Kelchen, Robert. 2018. “Is Administrative Bloat Really a Big Problem?” Blog (Kelchen on Education), 10 May 2018. https://robertkelchen.com/2018/05/10/is-administrative-bloat-a-problem/; for a good discussion of these misconceptions, see Dean, Jr., James W., and Deborah Y. Clarke. 2019. The Insider's Guide to Working with Universities: Practical Insights for Board Members, Businesspeople, Entrepreneurs, Philanthropists, Alumni, Parents, and Administrators, 131-133. Chapel Hill: University of North Carolina Press.

19. Dean, Jr., and Clarke, The Insider's Guide, 32 (see note 11).

20. The Ohio State University. “Office of Research.” https://research.osu.edu/.

21. Stanford University. “Office of Research Administration.” https://ora.stanford.edu/.

22. Pesce, Jessica R. “Student Affairs Has an Association; Faculty Affairs Needs One, Too.” The Chronicle of Higher Education, 21 August 2018. https://www.chronicle.com/article/Student-Affairs-Has-an/244313.

23. Dean, Jr., and Clarke, The Insider's Guide, 32 (see note 11).

24. This ten-question interview script is included in the section “Cross-campus relationship building.” (See “A script for learning about other units used at Rutgers University–New Brunswick,” sidebar, p. 27.)

25. Sheila Corrall described the need for greater operational convergence in the provision of research support services, as libraries increasingly partner with other institutional stakeholders, such as the office of research; see Corrall, Sheila. 2014. “Designing Libraries for Research Collaboration in the Network World: An Exploratory Study,” 37. LIBER Quarterly 24 (1): 17-48. https://doi.org/10.18352/lq.9525; Cara Bradley, in her review of the library and research administration literature, found that even in cases where these two professions engaged in the same topics, they focused largely on different aspects. And, more significantly, the literature of each profession demonstrated little awareness of the activities and interests of the other. See Bradley, Cara. 2018. “Research Support Priorities of and Relationships among Librarians and Research Administrators: A Content Analysis of the Professional Literature.” Evidence Based Library & Information Practice 13 (4): 15–30, at 26-28. https://doi.org/10.18438/eblip29478.

26. Lavoie, Brian. “RLP Research Data Management Interest Group: Acquiring RDM Services for Your Institution,” Hanging Together: the OCLC Research blog, 6 February 2019. https://hangingtogether.org/?p=6997.

27. National Science Foundation. “Dissemination and Sharing of Research Results.” https://www.nsf.gov/bfa/dias/policy/dmp.jsp.
28. Research Data Management has been an area of significant interest to OCLC Research, as in the Realities of Research Data Management series published in 2017-2018, as well as many other publications made publicly available on the OCLC website, https://www.oclc.org/research/areas/research-collections/rdm.html; Bryant, Rebecca, Brian Lavoie, and Constance Malpas. 2017. A Tour of the Research Data Management (RDM) Service Space. The Realities of Research Data Management, Part 1. Dublin, OH: OCLC Research. https://doi.org/10.25333/C3PG8J.

29. RIM is an emerging area of library interest and a subject of much previous OCLC research, such as Bryant, Rebecca, Anna Clements, Carol Feltes, David Groenewegen, Simon Huggard, Holly Mercer, Roxanne Missingham, Maliaca Oxnam, Anne Rauh, and John Wright. 2017. Research Information Management: Defining RIM and the Library's Role. Dublin, OH: OCLC Research. https://doi.org/10.25333/C3NK88.

30. European systems for collecting research information are typically called Current Research Information Systems (CRIS) and are used for collecting and reporting on institutional research productivity. Usage of the term CRIS is uncommon in the United States. See Wikipedia. “Current Research Information System.” Updated 2 August 2020, at 14:51 (UTC). https://en.wikipedia.org/wiki/Current_research_information_system.

31. Bryant et al., Research Information Management: Defining RIM (see note 29); Bryant, Rebecca, Anna Clements, Pablo de Castro, Joanne Cantrell, Annette Dortmund, Jan Fransen, Peggy Gallagher, and Michele Mennielli. 2018. Practices and Patterns in Research Information Management: Findings from a Global Survey. Dublin, OH: OCLC Research. https://doi.org/10.25333/BGFG-D241.

32. Traditionally, libraries purchased and licensed materials from external sources, to be made available locally—an “outside-in” collection. In more recent years, there has been movement among research libraries to an “inside-out” model, where institutional outputs (digitized special collections, researcher profiles, etc.) are shared with an external audience. Explained in greater depth in Dempsey, Lorcan, Constance Malpas, and Brian Lavoie. 2014. “Collection Directions: Some Reflections on the Future of Library Collections and Collecting.” Libraries and the Academy 14 (3): 393–423. https://doi.org/10.1353/pla.2014.0013.

33. Rouse, William B. 2016. Universities as Complex Enterprises: How Academia Works, Why It Works These Ways, and Where the University Enterprise Is Headed, 61. New York: Routledge.

34. Through conversations with OCLC Research Library Partnership institutions, we know that research analytics is a growing area of activity and investment for research libraries. These conversations, and institutional responses, were documented in the OCLC Research Hanging Together blog: Lavoie, Brian. “Making Connections: Research Analytics at Virginia Tech,” Hanging Together: the OCLC Research blog, 13 April 2020.
https://hangingtogether.org/?p=7854; and Lavoie, Brian. “Research Analytics: Where Do Libraries Fit In?” Hanging Together: the OCLC Research blog, 2 December 2019. https://hangingtogether.org/?p=7623.

35. See University of Illinois. “Research Development Center.” https://rdc.research.illinois.edu. Institutional permission was given to publicly recognize the institutions highlighted in the sidebar case studies.

36. Lavoie, Brian. “Making Connections: Research Analytics at Virginia Tech,” Hanging Together: the OCLC Research blog, 13 April 2020. https://hangingtogether.org/?p=7854.

37. Permission to publicly recognize this institutional activity was provided by a university representative.

38. The Association for Institutional Research (AIR) is the primary professional organization in the United States for institutional research professionals. It provides an overview of the “Duties and Responsibilities of Institutional Research” professionals on its website at https://www.airweb.org/ir-data-professional-overview/duties-and-functions-of-institutional-research.

39. The ORCID US Community, supported and led by LYRASIS in partnership with the Big Ten Academic Alliance, the Greater Western Library Alliance (GWLA), and the NorthEast Research Libraries (NERL), provides resources, training, and community support for ORCID adoption in the United States. https://www.lyrasis.org/Leadership/Pages/orcid-us.aspx.

40. Lyrasis. “ORCID US Exemplars.” https://www.lyrasis.org/Leadership/Pages/ORCID-US-Exemplars.aspx.

41. The ORCID US Community offers guidance to institutions on securing stakeholder support at Lyrasis. “ORCID US Community Planning Guide for Research Institutions.” https://www.lyrasis.org/Leadership/Pages/orcid-us-planning-guide.aspx.

42. Carnegie, Dale. 2009. How to Win Friends and Influence People. New York: Simon and Schuster.

43. Permission to publicly recognize this institutional activity was provided by a university representative.

44. Moving staff between the library and the research office was also encouraged at a recent symposium held in Washington, DC: “Critical Roles for Libraries in Today's Research Enterprise: Symposium Proceedings,” 11 December 2019. https://library.ucalgary.ca/ld.php?content_id=35088958.

45. The need for library cooperation with multiple stakeholders was particularly documented in the Realities of Research Data Management series as well as the Practices and Patterns report on global RIM practices: Bryant, Rebecca, Brian Lavoie, and Constance Malpas. 2017. A Tour of the Research Data Management (RDM) Service Space. The Realities of Research Data Management, Part 1. Dublin, OH: OCLC Research. https://doi.org/10.25333/C3PG8J; Bryant, Rebecca, Anna Clements, Pablo de Castro, Joanne Cantrell, Annette Dortmund, Jan Fransen, Peggy Gallagher, and Michele Mennielli. 2018.
Practices and Patterns in Research Information Management: Findings from a Global Survey. Dublin, OH: OCLC Research. https://doi.org/10.25333/BGFG-D241.

defoe-plague-1722 ---- HISTORY OF THE PLAGUE IN LONDON. It was about the beginning of September, 1664, that I, among the rest of my neighbors, heard in ordinary discourse that the plague was returned again in Holland; for it had been very violent there, and particularly at Amsterdam and Rotterdam, in the year 1663, whither, they say, it was brought (some said from Italy, others from the Levant) among some goods which were brought home by their Turkey fleet; others said it was brought from Candia; others, from Cyprus. It mattered not from whence it came; but all agreed it was come into Holland again.[4] We had no such thing as printed newspapers in those days, to spread rumors and reports of things, and to improve them by the invention of men, as I have lived to see practiced since. But such things as those were gathered from the letters of merchants and others who corresponded abroad, and from them was handed about by word of mouth only; so that things did not spread instantly over the whole nation, as they do now. But it seems that the government had a true account of it, and several counsels[5] were held about ways to prevent its coming over; but all was kept very private.
Hence it was that this rumor died off again; and people began to forget it, as a thing we were very little concerned in and that we hoped was not true, till the latter end of November or the beginning of December, 1664, when two men, said to be Frenchmen, died of the plague in Longacre, or rather at the upper end of Drury Lane.[6] The family they were in endeavored to conceal it as much as possible; but, as it had gotten some vent in the discourse of the neighborhood, the secretaries of state[7] got knowledge of it. And concerning themselves to inquire about it, in order to be certain of the truth, two physicians and a surgeon were ordered to go to the house, and make inspection. This they did, and finding evident tokens[8] of the sickness upon both the bodies that were dead, they gave their opinions publicly that they died of the plague. Whereupon it was given in to the parish clerk,[9] and he also returned them[10] to the hall; and it was printed in the weekly bill of mortality in the usual manner, thus:--

PLAGUE, 2. PARISHES INFECTED, 1.

The people showed a great concern at this, and began to be alarmed all over the town, and the more because in the last week in December, 1664, another man died in the same house and of the same distemper. And then we were easy again for about six weeks, when, none having died with any marks of infection, it was said the distemper was gone; but after that, I think it was about the 12th of February, another died in another house, but in the same parish and in the same manner. This turned the people's eyes pretty much towards that end of the town; and, the weekly bills showing an increase of burials in St. Giles's Parish more than usual, it began to be suspected that the plague was among the people at that end of the town, and that many had died of it, though they had taken care to keep it as much from the knowledge of the public as possible. This possessed the heads of the people very much; and few cared to go through Drury Lane, or the other streets suspected, unless they had extraordinary business that obliged them to it. This increase of the bills stood thus: the usual number of burials in a week, in the parishes of St. Giles-in-the-Fields and St. Andrew's, Holborn,[11] were[12] from twelve to seventeen or nineteen each, few more or less; but, from the time that the plague first began in St. Giles's Parish, it was observed that the ordinary burials increased in number considerably. For example:--

Dec. 27 to Jan. 3     St. Giles's 16    St. Andrew's 17
Jan. 3 to Jan. 10     St. Giles's 12    St. Andrew's 25
Jan. 10 to Jan. 17    St. Giles's 18    St. Andrew's 18
Jan. 17 to Jan. 24    St. Giles's 23    St. Andrew's 16
Jan. 24 to Jan. 31    St. Giles's 24    St. Andrew's 15
Jan. 31 to Feb. 7     St. Giles's 21    St. Andrew's 23
Feb. 7 to Feb. 14     St. Giles's 24    Whereof one of the plague.

The like increase of the bills was observed in the parishes of St. Bride's, adjoining on one side of Holborn Parish, and in the parish of St. James's, Clerkenwell, adjoining on the other side of Holborn; in both which parishes the usual numbers that died weekly were from four to six or eight, whereas at that time they were increased as follows:--

Dec. 20 to Dec. 27    St. Bride's 0     St. James's 8
Dec. 27 to Jan. 3     St. Bride's 6     St. James's 9
Jan. 3 to Jan. 10     St. Bride's 11    St. James's 7
Jan. 10 to Jan. 17    St. Bride's 12    St. James's 9
Jan. 17 to Jan. 24    St. Bride's 9     St. James's 15
Jan. 24 to Jan. 31    St. Bride's 8     St. James's 12
Jan. 31 to Feb. 7     St. Bride's 13    St. James's 5
Feb. 7 to Feb. 14     St. Bride's 12    St. James's 6

Besides this, it was observed, with great uneasiness by the people, that the weekly bills in general increased very much during these weeks, although it was at a time of the year when usually the bills are very moderate. The usual number of burials within the bills of mortality for a week was from about two hundred and forty, or thereabouts, to three hundred. The last was esteemed a pretty high bill; but after this we found the bills successively increasing, as follows:--

                      Buried.   Increased.
Dec. 20 to Dec. 27      291          0
Dec. 27 to Jan. 3       349         58
Jan. 3 to Jan. 10       394         45
Jan. 10 to Jan. 17      415         21
Jan. 17 to Jan. 24      474         59

This last bill was really frightful, being a higher number than had been known to have been buried in one week since the preceding visitation of 1656. However, all this went off again; and the weather proving cold, and the frost, which began in December, still continuing very severe, even till near the end of February, attended with sharp though moderate winds, the bills decreased again, and the city grew healthy; and everybody began to look upon the danger as good as over, only that still the burials in St. Giles's continued high. From the beginning of April, especially, they stood at twenty-five each week, till the week from the 18th to the 25th, when there was[13] buried in St. Giles's Parish thirty, whereof two of the plague, and eight of the spotted fever (which was looked upon as the same thing); likewise the number that died of the spotted fever in the whole increased, being eight the week before, and twelve the week above named. This alarmed us all again; and terrible apprehensions were among the people, especially the weather being now changed and growing warm, and the summer being at hand. However, the next week there seemed to be some hopes again: the bills were low; the number of the dead in all was but 388; there was none of the plague, and but four of the spotted fever. But the following week it returned again, and the distemper was spread into two or three other parishes, viz., St. Andrew's, Holborn, St. Clement's-Danes; and, to the great affliction of the city, one died within the walls, in the parish of St. Mary-Wool-Church, that is to say, in Bearbinder Lane, near Stocks Market: in all, there were nine of the plague, and six of the spotted fever. It was, however, upon inquiry, found that this Frenchman who died in Bearbinder Lane was one who, having lived in Longacre, near the infected houses, had removed for fear of the distemper, not knowing that he was already infected. This was the beginning of May, yet the weather was temperate, variable, and cool enough, and people had still some hopes. That which encouraged them was, that the city was healthy. The whole ninety-seven parishes buried but fifty-four, and we began to hope, that, as it was chiefly among the people at that end of the town, it might go no farther; and the rather, because the next week, which was from the 9th of May to the 16th, there died but three, of which not one within the whole city or liberties;[14] and St. Andrew's buried but fifteen, which was very low. It is true, St. Giles's buried two and thirty; but still, as there was but one of the plague, people began to be easy. The whole bill also was very low: for the week before, the bill was but three hundred and forty-seven; and the week above mentioned, but three hundred and forty-three. We continued in these hopes for a few days; but it was but for a few, for the people were no more to be deceived thus.
They searched the houses, and found that the plague was really spread every way, and that many died of it every day; so that now all our extenuations[15] abated, and it was no more to be concealed. Nay, it quickly appeared that the infection had spread itself beyond all hopes of abatement; that in the parish of St. Giles's it was gotten into several streets, and several families lay all sick together; and accordingly, in the weekly bill for the next week, the thing began to show itself. There was indeed but fourteen set down of the plague, but this was all knavery and collusion; for St. Giles's Parish, they buried forty in all, whereof it was certain most of them died of the plague, though they were set down of other distempers. And though the number of all the burials were[16] not increased above thirty-two, and the whole bill being but three hundred and eighty-five, yet there was[17] fourteen of the spotted fever, as well as fourteen of the plague; and we took it for granted, upon the whole, that there were fifty died that week of the plague. The next bill was from the 23d of May to the 30th, when the number of the plague was seventeen; but the burials in St. Giles's were fifty-three, a frightful number, of whom they set down but nine of the plague. But on an examination more strictly by the justices of the peace, and at the lord mayor's[18] request, it was found there were twenty more who were really dead of the plague in that parish, but had been set down of the spotted fever, or other distempers, besides others concealed. But those were trifling things to what followed immediately after. For now the weather set in hot; and from the first week in June, the infection spread in a dreadful manner, and the bills rise[19] high; the articles of the fever, spotted fever, and teeth, began to swell: for all that could conceal their distempers did it to prevent their neighbors shunning and refusing to converse with them, and also to prevent authority shutting up their houses, which, though it was not yet practiced, yet was threatened; and people were extremely terrified at the thoughts of it. The second week in June, the parish of St. Giles's, where still the weight of the infection lay, buried one hundred and twenty, whereof, though the bills said but sixty-eight of the plague, everybody said there had been a hundred at least, calculating it from the usual number of funerals in that parish as above. Till this week the city continued free, there having never any died except that one Frenchman, who[20] I mentioned before, within the whole ninety-seven parishes. Now, there died four within the city,--one in Wood Street, one in Fenchurch Street, and two in Crooked Lane. Southwark was entirely free, having not one yet died on that side of the water. I lived without Aldgate, about midway between Aldgate Church and Whitechapel Bars, on the left hand, or north side, of the street; and as the distemper had not reached to that side of the city, our neighborhood continued very easy. But at the other end of the town their consternation was very great; and the richer sort of people, especially the nobility and gentry from the west part of the city, thronged out of town, with their families and servants, in an unusual manner. And this was more particularly seen in Whitechapel; that is to say, the Broad Street where I lived. 
Indeed, nothing was to be seen but wagons and carts, with goods, women, servants, children, etc.; coaches filled with people of the better sort, and horsemen attending them, and all hurrying away; then empty wagons and carts appeared, and spare horses with servants, who it was apparent were returning, or sent from the country to fetch more people; besides innumerable numbers of men on horseback, some alone, others with servants, and, generally speaking, all loaded with baggage, and fitted out for traveling, as any one might perceive by their appearance. This was a very terrible and melancholy thing to see, and as it was a sight which I could not but look on from morning to night (for indeed there was nothing else of moment to be seen), it filled me with very serious thoughts of the misery that was coming upon the city, and the unhappy condition of those that would be left in it. This hurry of the people was such for some weeks, that there was no getting at the lord mayor's door without exceeding difficulty; there was such pressing and crowding there to get passes and certificates of health for such as traveled abroad; for, without these, there was no being admitted to pass through the towns upon the road, or to lodge in any inn. Now, as there had none died in the city for all this time, my lord mayor gave certificates of health without any difficulty to all those who lived in the ninety-seven parishes, and to those within the liberties too, for a while. This hurry, I say, continued some weeks, that is to say, all the months of May and June; and the more because it was rumored that an order of the government was to be issued out, to place turnpikes[21] and barriers on the road to prevent people's traveling; and that the towns on the road would not suffer people from London to pass, for fear of bringing the infection along with them, though neither of these rumors had any foundation but in the imagination, especially at first. I now began to consider seriously with myself concerning my own case, and how I should dispose of myself; that is to say, whether I should resolve to stay in London, or shut up my house and flee, as many of my neighbors did. I have set this particular down so fully, because I know not but it may be of moment to those who come after me, if they come to be brought to the same distress and to the same manner of making their choice; and therefore I desire this account may pass with them rather for a direction to themselves to act by than a history of my actings, seeing it may not be of one farthing value to them to note what became of me. I had two important things before me: the one was the carrying on my business and shop, which was considerable, and in which was embarked all my effects in the world; and the other was the preservation of my life in so dismal a calamity as I saw apparently was coming upon the whole city, and which, however great it was, my fears perhaps, as well as other people's, represented to be much greater than it could be. The first consideration was of great moment to me. My trade was a saddler, and as my dealings were chiefly not by a shop or chance trade, but among the merchants trading to the English colonies in America, so my effects lay very much in the hands of such. 
I was a single man, it is true; but I had a family of servants, who[22] I kept at my business; had a house, shop, and warehouses filled with goods; and in short to leave them all as things in such a case must be left, that is to say, without any overseer or person fit to be trusted with them, had been to hazard the loss, not only of my trade, but of my goods, and indeed of all I had in the world. I had an elder brother at the same time in London, and not many years before come over from Portugal; and, advising with him, his answer was in the three words, the same that was given in another case[23] quite different, viz., "Master, save thyself." In a word, he was for my retiring into the country, as he resolved to do himself, with his family; telling me, what he had, it seems, heard abroad, that the best preparation for the plague was to run away from it. As to my argument of losing my trade, my goods, or debts, he quite confuted me: he told me the same thing which I argued for my staying, viz., that I would trust God with my safety and health was the strongest repulse[24] to my pretensions of losing my trade and my goods. "For," says he, "is it not as reasonable that you should trust God with the chance or risk of losing your trade, as that you should stay in so eminent a point of danger, and trust him with your life?" I could not argue that I was in any strait as to a place where to go, having several friends and relations in Northamptonshire, whence our family first came from; and particularly, I had an only sister in Lincolnshire, very willing to receive and entertain me. My brother, who had already sent his wife and two children into Bedfordshire, and resolved to follow them, pressed my going very earnestly; and I had once resolved to comply with his desires, but at that time could get no horse: for though it is true all the people did not go out of the city of London, yet I may venture to say, that in a manner all the horses did; for there was hardly a horse to be bought or hired in the whole city for some weeks. Once I resolved to travel on foot with one servant, and, as many did, lie at no inn, but carry a soldier's tent with us, and so lie in the fields, the weather being very warm, and no danger from taking cold. I say, as many did, because several did so at last, especially those who had been in the armies, in the war[25] which had not been many years past: and I must needs say, that, speaking of second causes, had most of the people that traveled done so, the plague had not been carried into so many country towns and houses as it was, to the great damage, and indeed to the ruin, of abundance of people. But then my servant who[26] I had intended to take down with me, deceived me, and being frighted at the increase of the distemper, and not knowing when I should go, he took other measures, and left me: so I was put off for that time. And, one way or other, I always found that to appoint to go away was always crossed by some accident or other, so as to disappoint and put it off again. And this brings in a story which otherwise might be thought a needless digression, viz., about these disappointments being from Heaven. It came very warmly into my mind one morning, as I was musing on this particular thing, that as nothing attended us without the direction or permission of Divine Power, so these disappointments must have something in them extraordinary, and I ought to consider whether it did not evidently point out, or intimate to me, that it was the will of Heaven I should not go. 
It immediately followed in my thoughts, that, if it really was from God that I should stay, he was able effectually to preserve me in the midst of all the death and danger that would surround me; and that if I attempted to secure myself by fleeing from my habitation, and acted contrary to these intimations, which I believed to be divine, it was a kind of flying from God, and that he could cause his justice to overtake me when and where he thought fit.[27] These thoughts quite turned my resolutions again; and when I came to discourse with my brother again, I told him that I inclined to stay and take my lot in that station in which God had placed me; and that it seemed to be made more especially my duty, on the account of what I have said. My brother, though a very religious man himself, laughed at all I had suggested about its being an intimation from Heaven, and told me several stories of such foolhardy people, as he called them, as I was; that I ought indeed to submit to it as a work of Heaven if I had been any way disabled by distempers or diseases, and that then, not being able to go, I ought to acquiesce in the direction of Him, who, having been my Maker, had an undisputed right of sovereignty in disposing of me; and that then there had been no difficulty to determine which was the call of his providence, and which was not; but that I should take it as an intimation from Heaven that I should not go out of town, only because I could not hire a horse to go, or my fellow was run away that was to attend me, was ridiculous, since at the same time I had my health and limbs, and other servants, and might with ease travel a day or two on foot, and, having a good certificate of being in perfect health, might either hire a horse, or take post on the road, as I thought fit. Then he proceeded to tell me of the mischievous consequences which attend the presumption of the Turks and Mohammedans in Asia, and in other places where he had been (for my brother, being a merchant, was a few years before, as I have already observed, returned from abroad, coming last from Lisbon); and how, presuming upon their professed predestinating[28] notions, and of every man's end being predetermined, and unalterably beforehand decreed, they would go unconcerned into infected places, and converse with infected persons, by which means they died at the rate of ten or fifteen thousand a week, whereas the Europeans, or Christian merchants, who kept themselves retired and reserved, generally escaped the contagion. Upon these arguments my brother changed my resolutions again, and I began to resolve to go, and accordingly made all things ready; for, in short, the infection increased round me, and the bills were risen to almost seven hundred a week, and my brother told me he would venture to stay no longer. I desired him to let me consider of it but till the next day, and I would resolve; and as I had already prepared everything as well as I could, as to my business and who[29] to intrust my affairs with, I had little to do but to resolve. I went home that evening greatly oppressed in my mind, irresolute, and not knowing what to do. I had set the evening wholly apart to consider seriously about it, and was all alone; for already people had, as it were by a general consent, taken up the custom of not going out of doors after sunset: the reasons I shall have occasion to say more of by and by. 
In the retirement of this evening I endeavored to resolve first what was my duty to do, and I stated the arguments with which my brother had pressed me to go into the country, and I set against them the strong impressions which I had on my mind for staying,--the visible call I seemed to have from the particular circumstance of my calling, and the care due from me for the preservation of my effects, which were, as I might say, my estate; also the intimations which I thought I had from Heaven, that to me signified a kind of direction to venture; and it occurred to me, that, if I had what I call a direction to stay, I ought to suppose it contained a promise of being preserved, if I obeyed. This lay close to me;[30] and my mind seemed more and more encouraged to stay than ever, and supported with a secret satisfaction that I should be kept.[31] Add to this, that turning over the Bible which lay before me, and while my thoughts were more than ordinary serious upon the question, I cried out, "Well, I know not what to do, Lord direct me!" and the like. And at that juncture I happened to stop turning over the book at the Ninety-first Psalm, and, casting my eye on the second verse, I read to the seventh verse exclusive, and after that included the tenth, as follows: "I will say of the Lord, He is my refuge and my fortress: my God; in him will I trust. Surely he shall deliver thee from the snare of the fowler, and from the noisome pestilence. He shall cover thee with his feathers, and under his wings shalt thou trust: his truth shall be thy shield and buckler. Thou shalt not be afraid for the terror by night; nor for the arrow that flieth by day; nor for the pestilence that walketh in darkness; nor for the destruction that wasteth at noonday. A thousand shall fall at thy side, and ten thousand at thy right hand; but it shall not come nigh thee. Only with thine eyes shalt thou behold and see the reward of the wicked. Because thou hast made the Lord, which is my refuge, even the Most High, thy habitation; there shall no evil befall thee, neither shall any plague come nigh thy dwelling," etc. I scarce need tell the reader that from that moment I resolved that I would stay in the town, and, casting myself entirely upon the goodness and protection of the Almighty, would not seek any other shelter whatever; and that as my times were in his hands,[32] he was as able to keep me in a time of the infection as in a time of health; and if he did not think fit to deliver me, still I was in his hands, and it was meet he should do with me as should seem good to him. With this resolution I went to bed; and I was further confirmed in it the next day by the woman being taken ill with whom I had intended to intrust my house and all my affairs. But I had a further obligation laid on me on the same side: for the next day I found myself very much out of order also; so that, if I would have gone away, I could not. And I continued ill three or four days, and this entirely determined my stay: so I took my leave of my brother, who went away to Dorking in Surrey,[33] and afterwards fetched around farther into Buckinghamshire or Bedfordshire, to a retreat he had found out there for his family. It was a very ill time to be sick in; for if any one complained, it was immediately said he had the plague; and though I had, indeed, no symptoms of that distemper, yet, being very ill both in my head and in my stomach, I was not without apprehension that I really was infected. But in about three days I grew better. 
The third night I rested well, sweated a little, and was much refreshed. The apprehensions of its being the infection went also quite away with my illness, and I went about my business as usual. These things, however, put off all my thoughts of going into the country; and my brother also being gone, I had no more debate either with him or with myself on that subject. It was now mid-July; and the plague, which had chiefly raged at the other end of the town, and, as I said before, in the parishes of St. Giles's, St. Andrew's, Holborn, and towards Westminster, began now to come eastward, towards the part where I lived. It was to be observed, indeed, that it did not come straight on towards us; for the city, that is to say within the walls, was indifferent healthy still. Nor was it got then very much over the water into Southwark; for though there died that week twelve hundred and sixty-eight of all distempers, whereof it might be supposed above nine hundred died of the plague, yet there was but twenty-eight in the whole city, within the walls, and but nineteen in Southwark, Lambeth Parish included; whereas in the parishes of St. Giles and St. Martin's-in-the-Fields alone, there died four hundred and twenty-one. But we perceived the infection kept chiefly in the outparishes, which being very populous and fuller also of poor, the distemper found more to prey upon than in the city, as I shall observe afterwards. We perceived, I say, the distemper to draw our way, viz., by the parishes of Clerkenwell, Cripplegate, Shoreditch, and Bishopsgate; which last two parishes joining to Aldgate, Whitechapel, and Stepney, the infection came at length to spread its utmost rage and violence in those parts, even when it abated at the western parishes where it began. It was very strange to observe that in this particular week (from the 4th to the 11th of July), when, as I have observed, there died near four hundred of the plague in the two parishes of St. Martin's and St. Giles-in-the-Fields[34] only, there died in the parish of Aldgate but four, in the parish of Whitechapel three, in the parish of Stepney but one. Likewise in the next week (from the 11th of July to the 18th), when the week's bill was seventeen hundred and sixty-one, yet there died no more of the plague, on the whole Southwark side of the water, than sixteen. But this face of things soon changed, and it began to thicken in Cripplegate Parish especially, and in Clerkenwell; so that by the second week in August, Cripplegate Parish alone buried eight hundred and eighty-six, and Clerkenwell one hundred and fifty-five. Of the first, eight hundred and fifty might well be reckoned to die of the plague; and of the last, the bill itself said one hundred and forty-five were of the plague. During the month of July, and while, as I have observed, our part of the town seemed to be spared in comparison of the west part, I went ordinarily about the streets as my business required, and particularly went generally once in a day, or in two days, into the city, to my brother's house, which he had given me charge of, and to see it was safe; and having the key in my pocket, I used to go into the house, and over most of the rooms, to see that all was well. 
For though it be something wonderful to tell that any should have hearts so hardened, in the midst of such a calamity, as to rob and steal, yet certain it is that all sorts of villainies, and even levities and debaucheries, were then practiced in the town as openly as ever: I will not say quite as frequently, because the number of people were[35] many ways lessened. But the city itself began now to be visited too, I mean within the walls. But the number of people there were[35] indeed extremely lessened by so great a multitude having been gone into the country; and even all this month of July they continued to flee, though not in such multitudes as formerly. In August, indeed, they fled in such a manner, that I began to think there would be really none but magistrates and servants left in the city. As they fled now out of the city, so I should observe that the court[36] removed early, viz., in the month of June, and went to Oxford, where it pleased God to preserve them; and the distemper did not, as I heard of, so much as touch them; for which I cannot say that I ever saw they showed any great token of thankfulness, and hardly anything of reformation, though they did not want being told that their crying vices might, without breach of charity, be said to have gone far in bringing that terrible judgment upon the whole nation. The face of London was now, indeed, strangely altered: I mean the whole mass of buildings, city, liberties, suburbs, Westminster, Southwark, and altogether; for as to the particular part called the city, or within the walls, that was not yet much infected. But in the whole, the face of things, I say, was much altered. Sorrow and sadness sat upon every face, and though some part were not yet overwhelmed, yet all looked deeply concerned; and as we saw it apparently coming on, so every one looked on himself and his family as in the utmost danger. Were it possible to represent those times exactly to those that did not see them, and give the reader due ideas of the horror that everywhere presented itself, it must make just impressions upon their minds, and fill them with surprise. London might well be said to be all in tears. The mourners did not go about the streets,[37] indeed; for nobody put on black, or made a formal dress of mourning for their nearest friends: but the voice of mourning was truly heard in the streets. The shrieks of women and children at the windows and doors of their houses, where their nearest relations were perhaps dying, or just dead, were so frequent to be heard as we passed the streets, that it was enough to pierce the stoutest heart in the world to hear them. Tears and lamentations were seen almost in every house, especially in the first part of the visitation; for towards the latter end, men's hearts were hardened, and death was so always before their eyes that they did not so much concern themselves for the loss of their friends, expecting that themselves should be summoned the next hour. Business led me out sometimes to the other end of the town, even when the sickness was chiefly there. 
And as the thing was new to me, as well as to everybody else, it was a most surprising thing to see those streets, which were usually so thronged, now grown desolate, and so few people to be seen in them, that if I had been a stranger, and at a loss for my way, I might sometimes have gone the length of a whole street, I mean of the by-streets, and see[38] nobody to direct me, except watchmen set at the doors of such houses as were shut up; of which I shall speak presently. One day, being at that part of the town on some special business, curiosity led me to observe things more than usually; and indeed I walked a great way where I had no business. I went up Holborn, and there the street was full of people; but they walked in the middle of the great street, neither on one side or[39] other, because, as I suppose, they would not mingle with anybody that came out of houses, or meet with smells and scents from houses, that might be infected. The inns of court were all shut up, nor were very many of the lawyers in the Temple,[40] or Lincoln's Inn, or Gray's Inn, to be seen there. Everybody was at peace, there was no occasion for lawyers; besides, it being in the time of the vacation too, they were generally gone into the country. Whole rows of houses in some places were shut close up, the inhabitants all fled, and only a watchman or two left. When I speak of rows of houses being shut up, I do not mean shut up by the magistrates, but that great numbers of persons followed the court, by the necessity of their employments, and other dependencies; and as others retired, really frighted with the distemper, it was a mere desolating of some of the streets. But the fright was not yet near so great in the city, abstractedly so called,[41] and particularly because, though they were at first in a most inexpressible consternation, yet, as I have observed that the distemper intermitted often at first, so they were, as it were, alarmed and unalarmed again, and this several times, till it began to be familiar to them; and that even when it appeared violent, yet seeing it did not presently spread into the city, or the east or south parts, the people began to take courage, and to be, as I may say, a little hardened. It is true, a vast many people fled, as I have observed; yet they were chiefly from the west end of the town, and from that we call the heart of the city, that is to say, among the wealthiest of the people, and such persons as were unincumbered with trades and business. But of the rest, the generality staid, and seemed to abide the worst; so that in the place we call the liberties, and in the suburbs, in Southwark, and in the east part, such as Wapping, Ratcliff, Stepney, Rotherhithe, and the like, the people generally staid, except here and there a few wealthy families, who, as above, did not depend upon their business. It must not be forgot here that the city and suburbs were prodigiously full of people at the time of this visitation, I mean at the time that it began. For though I have lived to see a further increase, and mighty throngs of people settling in London, more than ever; yet we had always a notion that numbers of people which--the wars being over, the armies disbanded, and the royal family and the monarchy being restored--had flocked to London to settle in business, or to depend upon and attend the court for rewards of services, preferments, and the like, was[42] such that the town was computed to have in it above a hundred thousand people more than ever it held before. 
Nay, some took upon them to say it had twice as many, because all the ruined families of the royal party flocked hither, all the soldiers set up trades here, and abundance of families settled here. Again: the court brought with it a great flux of pride and new fashions; all people were gay and luxurious, and the joy of the restoration had brought a vast many families to London.[43] But I must go back again to the beginning of this surprising time. While the fears of the people were young, they were increased strangely by several odd accidents, which put altogether, it was really a wonder the whole body of the people did not rise as one man, and abandon their dwellings, leaving the place as a space of ground designed by Heaven for an Aceldama,[44] doomed to be destroyed from the face of the earth, and that all that would be found in it would perish with it. I shall name but a few of these things; but sure they were so many, and so many wizards and cunning people propagating them, that I have often wondered there was any (women especially) left behind. In the first place, a blazing star or comet appeared for several months before the plague, as there did, the year after, another a little before the fire. The old women, and the phlegmatic hypochondriac[45] part of the other sex (whom I could almost call old women too), remarked, especially afterward, though not till both those judgments were over, that those two comets passed directly over the city, and that so very near the houses that it was plain they imported something peculiar to the city alone; that the comet before the pestilence was of a faint, dull, languid color, and its motion very heavy, solemn, and slow, but that the comet before the fire was bright and sparkling, or, as others said, flaming, and its motion swift and furious; and that, accordingly, one foretold a heavy judgment, slow but severe, terrible, and frightful, as was the plague, but the other foretold a stroke, sudden, swift, and fiery, as was the conflagration. Nay, so particular some people were, that, as they looked upon that comet preceding the fire, they fancied that they not only saw it pass swiftly and fiercely, and could perceive the motion with their eye, but even they heard it; that it made a rushing, mighty noise, fierce and terrible, though at a distance, and but just perceivable. I saw both these stars, and, I must confess, had had so much of the common notion of such things in my head, that I was apt to look upon them as the forerunners and warnings of God's judgments, and, especially when the plague had followed the first, I yet saw another of the like kind, I could not but say, God had not yet sufficiently scourged the city. The apprehensions of the people were likewise strangely increased by the error of the times, in which I think the people, from what principle I cannot imagine, were more addicted to prophecies, and astrological conjurations, dreams, and old wives' tales, than ever they were before or since.[46] Whether this unhappy temper was originally raised by the follies of some people who got money by it, that is to say, by printing predictions and prognostications, I know not. 
But certain it is, books frighted them terribly, such as "Lilly's Almanack,"[47] "Gadbury's Astrological Predictions," "Poor Robin's Almanack,"[48] and the like; also several pretended religious books,--one entitled "Come out of Her, my People, lest ye be Partaker of her Plagues;"[49] another called "Fair Warning;" another, "Britain's Remembrancer;" and many such,--all, or most part of which, foretold directly or covertly the ruin of the city. Nay, some were so enthusiastically bold as to run about the streets with their oral predictions, pretending they were sent to preach to the city; and one in particular, who, like Jonah[50] to Nineveh, cried in the streets, "Yet forty days, and London shall be destroyed." I will not be positive whether he said "yet forty days," or "yet a few days." Another ran about naked, except a pair of drawers about his waist, crying day and night, like a man that Josephus[51] mentions, who cried, "Woe to Jerusalem!" a little before the destruction of that city: so this poor naked creature cried, "Oh, the great and the dreadful God!" and said no more, but repeated those words continually, with a voice and countenance full of horror, a swift pace, and nobody could ever find him to stop, or rest, or take any sustenance, at least that ever I could hear of. I met this poor creature several times in the streets, and would have spoke to him, but he would not enter into speech with me, or any one else, but kept on his dismal cries continually. These things terrified the people to the last degree, and especially when two or three times, as I have mentioned already, they found one or two in the bills dead of the plague at St. Giles's. Next to these public things were the dreams of old women; or, I should say, the interpretation of old women upon other people's dreams; and these put abundance of people even out of their wits. Some heard voices warning them to be gone, for that there would be such a plague in London so that the living would not be able to bury the dead; others saw apparitions in the air: and I must be allowed to say of both, I hope without breach of charity, that they heard voices that never spake, and saw sights that never appeared. But the imagination of the people was really turned wayward and possessed; and no wonder if they who were poring continually at the clouds saw shapes and figures, representations and appearances, which had nothing in them but air and vapor. Here they told us they saw a flaming sword held in a hand, coming out of a cloud, with a point hanging directly over the city. There they saw hearses and coffins in the air carrying to be buried. And there again, heaps of dead bodies lying unburied and the like, just as the imagination of the poor terrified people furnished them with matter to work upon.

So hypochondriac fancies represent
Ships, armies, battles in the firmament;
Till steady eyes the exhalations solve,
And all to its first matter, cloud, resolve.

I could fill this account with the strange relations such people give every day of what they have seen; and every one was so positive of their having seen what they pretended to see, that there was no contradicting them, without breach of friendship, or being accounted rude and unmannerly on the one hand, and profane and impenetrable on the other. One time before the plague was begun, otherwise than as I have said in St.
Giles's (I think it was in March), seeing a crowd of people in the street, I joined with them to satisfy my curiosity, and found them all staring up into the air to see what a woman told them appeared plain to her, which was an angel clothed in white, with a fiery sword in his hand, waving it or brandishing it over his head. She described every part of the figure to the life, showed them the motion and the form, and the poor people came into it so eagerly and with so much readiness. "Yes, I see it all plainly," says one: "there's the sword as plain as can be." Another saw the angel; one saw his very face, and cried out what a glorious creature he was. One saw one thing, and one another. I looked as earnestly as the rest, but perhaps not with so much willingness to be imposed upon; and I said, indeed, that I could see nothing but a white cloud, bright on one side, by the shining of the sun upon the other part. The woman endeavored to show it me, but could not make me confess that I saw it; which, indeed, if I had, I must have lied. But the woman, turning to me, looked me in the face, and fancied I laughed, in which her imagination deceived her too, for I really did not laugh, but was seriously reflecting how the poor people were terrified by the force of their own imagination. However, she turned to me, called me profane fellow and a scoffer, told me that it was a time of God's anger, and dreadful judgments were approaching, and that despisers such as I should wander and perish. The people about her seemed disgusted as well as she, and I found there was no persuading them that I did not laugh at them, and that I should be rather mobbed by them than be able to undeceive them. So I left them, and this appearance passed for as real as the blazing star itself. Another encounter I had in the open day also; and this was in going through a narrow passage from Petty France[52] into Bishopsgate churchyard, by a row of almshouses. There are two churchyards to Bishopsgate Church or Parish. One we go over to pass from the place called Petty France into Bishopsgate Street, coming out just by the church door; the other is on the side of the narrow passage where the almshouses are on the left, and a dwarf wall with a palisade on it on the right hand, and the city wall on the other side more to the right. In this narrow passage stands a man looking through the palisades into the burying place, and as many people as the narrowness of the place would admit to stop without hindering the passage of others; and he was talking mighty eagerly to them, and pointing, now to one place, then to another, and affirming that he saw a ghost walking upon such a gravestone there. He described the shape, the posture, and the movement of it so exactly, that it was the greatest amazement to him in the world that everybody did not see it as well as he. On a sudden he would cry, "There it is! Now it comes this way!" then, "'Tis turned back!" till at length he persuaded the people into so firm a belief of it, that one fancied he saw it; and thus he came every day, making a strange hubbub, considering it was so narrow a passage, till Bishopsgate clock struck eleven; and then the ghost would seem to start, and, as if he were called away, disappeared on a sudden. I looked earnestly every way, and at the very moment that this man directed, but could not see the least appearance of anything. 
But so positive was this poor man that he gave them vapors[53] in abundance, and sent them away trembling and frightened, till at length few people that knew of it cared to go through that passage, and hardly anybody by night on any account whatever. This ghost, as the poor man affirmed, made signs to the houses and to the ground and to the people, plainly intimating (or else they so understanding it) that abundance of people should come to be buried in that churchyard, as indeed happened. But then he saw such aspects I must acknowledge I never believed, nor could I see anything of it myself, though I looked most earnestly to see it if possible. Some endeavors were used to suppress the printing of such books as terrified the people, and to frighten the dispersers of them, some of whom were taken up, but nothing done in it, as I am informed; the government being unwilling to exasperate the people, who were, as I may say, all out of their wits already. Neither can I acquit those ministers that in their sermons rather sunk than lifted up the hearts of their hearers. Many of them, I doubt not, did it for the strengthening the resolution of the people, and especially for quickening them to repentance; but it certainly answered not their end, at least not in proportion to the injury it did another way. One mischief always introduces another. These terrors and apprehensions of the people led them to a thousand weak, foolish, and wicked things, which they wanted not a sort of people really wicked to encourage them to; and this was running about to fortune tellers, cunning men,[54] and astrologers, to know their fortunes, or, as it is vulgarly expressed, to have their fortunes told them, their nativities[55] calculated, and the like. And this folly presently made the town swarm with a wicked generation of pretenders to magic, to the "black art," as they called it, and I know not what, nay, to a thousand worse dealings with the devil than they were really guilty of. And this trade grew so open and so generally practiced, that it became common to have signs and inscriptions set up at doors, "Here lives a fortune teller," "Here lives an astrologer," "Here you may have your nativity calculated," and the like; and Friar Bacon's brazen head,[56] which was the usual sign of these people's dwellings, was to be seen almost in every street, or else the sign of Mother Shipton,[57] or of Merlin's[58] head, and the like. With what blind, absurd, and ridiculous stuff these oracles of the devil pleased and satisfied the people, I really know not; but certain it is, that innumerable attendants crowded about their doors every day: and if but a grave fellow in a velvet jacket, a band,[59] and a black cloak, which was the habit those quack conjurers generally went in, was but seen in the streets, the people would follow them[60] in crowds, and ask them[60] questions as they went along. The case of poor servants was very dismal, as I shall have occasion to mention again by and by; for it was apparent a prodigious number of them would be turned away. And it was so, and of them abundance perished, and particularly those whom these false prophets flattered with hopes that they should be kept in their services, and carried with their masters and mistresses into the country; and had not public charity provided for these poor creatures, whose number was exceeding great (and in all cases of this nature must be so), they would have been in the worst condition of any people in the city. 
These things agitated the minds of the common people for many months while the first apprehensions were upon them, and while the plague was not, as I may say, yet broken out. But I must also not forget that the more serious part of the inhabitants behaved after another manner. The government encouraged their devotion, and appointed public prayers, and days of fasting and humiliation, to make public confession of sin, and implore the mercy of God to avert the dreadful judgment which hangs over their heads; and it is not to be expressed with what alacrity the people of all persuasions embraced the occasion, how they flocked to the churches and meetings, and they were all so thronged that there was often no coming near, even to the very doors of the largest churches. Also there were daily prayers appointed morning and evening at several churches, and days of private praying at other places, at all which the people attended, I say, with an uncommon devotion. Several private families, also, as well of one opinion as another, kept family fasts, to which they admitted their near relations only; so that, in a word, those people who were really serious and religious applied themselves in a truly Christian manner to the proper work of repentance and humiliation, as a Christian people ought to do. Again, the public showed that they would bear their share in these things. The very court, which was then gay and luxurious, put on a face of just concern for the public danger. All the plays and interludes[61] which, after the manner of the French court,[62] had been set up and began to increase among us, were forbid to act;[63] the gaming tables, public dancing rooms, and music houses, which multiplied and began to debauch the manners of the people, were shut up and suppressed; and the jack puddings,[64] merry-andrews,[64] puppet shows, ropedancers, and such like doings, which had bewitched the common people, shut their shops, finding indeed no trade, for the minds of the people were agitated with other things, and a kind of sadness and horror at these things sat upon the countenances even of the common people. Death was before their eyes, and everybody began to think of their graves, not of mirth and diversions. But even these wholesome reflections, which, rightly managed, would have most happily led the people to fall upon their knees, make confession of their sins, and look up to their merciful Savior for pardon, imploring his compassion on them in such a time of their distress, by which we might have been as a second Nineveh, had a quite contrary extreme in the common people, who, ignorant and stupid in their reflections as they were brutishly wicked and thoughtless before, were now led by their fright to extremes of folly, and, as I said before, that they ran to conjurers and witches and all sorts of deceivers, to know what should become of them, who fed their fears and kept them always alarmed and awake, on purpose to delude them and pick their pockets: so they were as mad upon their running after quacks and mountebanks, and every practicing old woman for medicines and remedies, storing themselves with such multitudes of pills, potions, and preservatives, as they were called, that they not only spent their money, but poisoned themselves beforehand, for fear of the poison of the infection, and prepared their bodies for the plague, instead of preserving them against it. 
On the other hand, it was incredible, and scarce to be imagined, how the posts of houses and corners of streets were plastered over with doctors' bills, and papers of ignorant fellows quacking and tampering in physic, and inviting people to come to them for remedies, which was generally set off with such flourishes as these; viz.,--

"INFALLIBLE preventive pills against the plague;"
"NEVER-FAILING preservatives against the infection;"
"SOVEREIGN cordials against the corruption of air;"
"EXACT regulations for the conduct of the body in case of infection;"
"Antipestilential pills;"
"INCOMPARABLE drink against the plague, never found out before;"
"An UNIVERSAL remedy for the plague;"
"The ONLY TRUE plague water;"
"The ROYAL ANTIDOTE against all kinds of infection;"

and such a number more that I cannot reckon up, and, if I could, would fill a book of themselves to set them down. Others set up bills to summon people to their lodgings for direction and advice in the case of infection. These had specious titles also, such as these:--

An eminent High-Dutch physician, newly come over from Holland, where he resided during all the time of the great plague, last year, in Amsterdam, and cured multitudes of people that actually had the plague upon them.

An Italian gentlewoman just arrived from Naples, having a choice secret to prevent infection, which she found out by her great experience, and did wonderful cures with it in the late plague there, wherein there died 20,000 in one day.

An ancient gentlewoman having practiced with great success in the late plague in this city, anno 1636, gives her advice only to the female sex. To be spoken with, etc.

An experienced physician, who has long studied the doctrine of antidotes against all sorts of poison and infection, has, after forty years' practice, arrived at such skill as may, with God's blessing, direct persons how to prevent being touched by any contagious distemper whatsoever. He directs the poor gratis.

I take notice of these by way of specimen. I could give you two or three dozen of the like, and yet have abundance left behind. It is sufficient from these to apprise any one of the humor of those times, and how a set of thieves and pickpockets not only robbed and cheated the poor people of their money, but poisoned their bodies with odious and fatal preparations; some with mercury, and some with other things as bad, perfectly remote from the thing pretended to, and rather hurtful than serviceable to the body in case an infection followed. I cannot omit a subtlety of one of those quack operators with which he gulled the poor people to crowd about him, but did nothing for them without money. He had, it seems, added to his bills, which he gave out in the streets, this advertisement in capital letters; viz., "He gives advice to the poor for nothing." Abundance of people came to him accordingly, to whom he made a great many fine speeches, examined them of the state of their health and of the constitution of their bodies, and told them many good things to do, which were of no great moment. But the issue and conclusion of all was, that he had a preparation which, if they took such a quantity of every morning, he would pawn his life that they should never have the plague, no, though they lived in the house with people that were infected. This made the people all resolve to have it, but then the price of that was so much (I think it was half a crown[65]).
"But, sir," says one poor woman, "I am a poor almswoman, and am kept by the parish; and your bills say you give the poor your help for nothing."--"Ay, good woman," says the doctor, "so I do, as I published there. I give my advice, but not my physic!"--"Alas, sir," says she, "that is a snare laid for the poor then, for you give them your advice for nothing; that is to say, you advise them gratis to buy your physic for their money: so does every shopkeeper with his wares." Here the woman began to give him ill words, and stood at his door all that day, telling her tale to all the people that came, till the doctor, finding she turned away his customers, was obliged to call her upstairs again and give her his box of physic for nothing, which perhaps, too, was good for nothing when she had it. But to return to the people, whose confusions fitted them to be imposed upon by all sorts of pretenders and by every mountebank. There is no doubt but these quacking sort of fellows raised great gains out of the miserable people; for we daily found the crowds that ran after them were infinitely greater, and their doors were more thronged, than those of Dr. Brooks, Dr. Upton, Dr. Hodges, Dr. Berwick, or any, though the most famous men of the time; and I was told that some of them got five pounds[66] a day by their physic. But there was still another madness beyond all this, which may serve to give an idea of the distracted humor of the poor people at that time, and this was their following a worse sort of deceivers than any of these; for these petty thieves only deluded them to pick their pockets and get their money (in which their wickedness, whatever it was, lay chiefly on the side of the deceiver's deceiving, not upon the deceived); but, in this part I am going to mention, it lay chiefly in the people deceived, or equally in both. And this was in wearing charms, philters,[67] exorcisms,[68] amulets,[69] and I know not what preparations to fortify the body against the plague, as if the plague was not the hand of God, but a kind of a possession of an evil spirit, and it was to be kept off with crossings,[70] signs of the zodiac,[71] papers tied up with so many knots, and certain words or figures written on them, as particularly the word "Abracadabra,"[72] formed in triangle or pyramid; thus,-- A B R A C A D A B R A A B R A C A D A B R A B R A C A D A B A B R A C A D A A B R A C A D A B R A C A A B R A C A B R A A B R A B A Others had the Jesuits' mark in a cross:-- I H S[73] Others had nothing but this mark; thus,-- + I might spend a great deal of my time in exclamations against the follies, and indeed the wickednesses of those things, in a time of such danger, in a matter of such consequence as this of a national infection; but my memorandums of these things relate rather to take notice of the fact, and mention only that it was so. How the poor people found the insufficiency of those things, and how many of them were afterwards carried away in the dead carts, and thrown into the common graves of every parish with these hellish charms and trumpery hanging about their necks, remains to be spoken of as we go along. All this was the effect of the hurry the people were in, after the first notion of the plague being at hand was among them, and which may be said to be from about Michaelmas,[74] 1664, but more particularly after the two men died in St. 
Giles's, in the beginning of December; and again after another alarm in February, for when the plague evidently spread itself, they soon began to see the folly of trusting to these unperforming creatures who had gulled them of their money; and then their fears worked another way, namely, to amazement and stupidity, not knowing what course to take or what to do, either to help or to relieve themselves; but they ran about from one neighbor's house to another, and even in the streets, from one door to another, with repeated cries of, "Lord, have mercy upon us! What shall we do?" I am supposing, now, the plague to have begun, as I have said, and that the magistrates began to take the condition of the people into their serious consideration. What they did as to the regulation of the inhabitants, and of infected families, I shall speak to[75] by itself; but as to the affair of health, it is proper to mention here my having seen the foolish humor of the people in running after quacks, mountebanks, wizards, and fortune tellers, which they did, as above, even to madness. The lord mayor, a very sober and religious gentleman, appointed physicians and surgeons for the relief of the poor, I mean the diseased poor, and in particular ordered the College of Physicians[76] to publish directions for cheap remedies for the poor in all the circumstances of the distemper. This, indeed, was one of the most charitable and judicious things that could be done at that time; for this drove the people from haunting the doors of every disperser of bills, and from taking down blindly and without consideration, poison for physic, and death instead of life. This direction of the physicians was done by a consultation of the whole college; and as it was particularly calculated for the use of the poor, and for cheap medicines, it was made public, so that everybody might see it, and copies were given gratis to all that desired it. But as it is public and to be seen on all occasions, I need not give the reader of this the trouble of it. It remains to be mentioned now what public measures were taken by the magistrates for the general safety and to prevent the spreading of the distemper when it broke out. I shall have frequent occasion to speak of the prudence of the magistrates, their charity, their vigilance for the poor and for preserving good order, furnishing provisions, and the like, when the plague was increased as it afterwards was. But I am now upon the order and regulations which they published for the government of infected families. I mentioned above shutting of houses up, and it is needful to say something particularly to that; for this part of the history of the plague is very melancholy. But the most grievous story must be told. About June, the lord mayor of London, and the court of aldermen, as I have said, began more particularly to concern themselves for the regulation of the city. The justices of the peace for Middlesex,[77] by direction of the secretary of state, had begun to shut up houses in the parishes of St. Giles-in-the-Fields, St. Martin's, St. Clement's-Danes, etc., and it was with good success; for in several streets where the plague broke out, upon strict guarding the houses that were infected, and taking care to bury those that died as soon as they were known to be dead, the plague ceased in those streets. 
It was also observed that the plague decreased sooner in those parishes after they had been visited to the full than it did in the parishes of Bishopsgate, Shoreditch, Aldgate, Whitechapel, Stepney, and others; the early care taken in that manner being a great means to the putting a check to it. This shutting up of the houses was a method first taken, as I understand, in the plague which happened in 1603, at the coming of King James I. to the crown; and the power of shutting people up in their own houses was granted by act of Parliament, entitled "An Act for the Charitable Relief and Ordering of Persons Infected with Plague." On which act of Parliament the lord mayor and aldermen of the city of London founded the order they made at this time, and which took place the 1st of July, 1665, when the numbers of infected within the city were but few; the last bill for the ninety-two parishes being but four, and some houses having been shut up in the city, and some people being removed to the pesthouse beyond Bunhill Fields, in the way to Islington. I say by these means, when there died near one thousand a week in the whole, the number in the city was but twenty-eight; and the city was preserved more healthy, in proportion, than any other place all the time of the infection. These orders of my lord mayor's were published, as I have said, the latter end of June, and took place from the 1st of July, and were as follow: viz.,-- ORDERS CONCEIVED AND PUBLISHED BY THE LORD MAYOR AND ALDERMEN OF THE CITY OF LONDON, CONCERNING THE INFECTION OF THE PLAGUE; 1665. Whereas in the reign of our late sovereign King James, of happy memory, an act was made for the charitable relief and ordering of persons infected with the plague; whereby authority was given to justices of the peace, mayors, bailiffs, and other head officers, to appoint within their several limits examiners, searchers, watchmen, keepers, and buriers, for the persons and places infected, and to minister unto them oaths for the performance of their offices; and the same statute did also authorize the giving of their directions as unto them for other present necessity should seem good in their discretions: it is now, upon special consideration, thought very expedient, for preventing and avoiding of infection of sickness (if it shall please Almighty God), that these officers following be appointed, and these orders hereafter duly observed. Examiners to be appointed to every Parish. First, it is thought requisite, and so ordered, that in every parish there be one, two, or more persons of good sort and credit chosen by the alderman, his deputy, and common council of every ward, by the name of examiners, to continue in that office for the space of two months at least: and if any fit person so appointed shall refuse to undertake the same, the said parties so refusing to be committed to prison until they shall conform themselves accordingly. The Examiner's Office. That these examiners be sworn by the aldermen to inquire and learn from time to time what houses in every parish be visited, and what persons be sick, and of what diseases, as near as they can inform themselves, and, upon doubt in that case, to command restraint of access until it appear what the disease shall prove; and if they find any person sick of the infection, to give order to the constable that the house be shut up; and, if the constable shall be found remiss and negligent, to give notice thereof to the alderman of the ward. Watchmen. 
That to every infected house there be appointed two watchmen,--one for every day, and the other for the night; and that these watchmen have a special care that no person go in or out of such infected houses whereof they have the charge, upon pain of severe punishment. And the said watchmen to do such further offices as the sick house shall need and require; and if the watchman be sent upon any business, to lock up the house and take the key with him; and the watchman by day to attend until ten o'clock at night, and the watchman by night until six in the morning. Searchers. That there be a special care to appoint women searchers in every parish, such as are of honest reputation and of the best sort as can be got in this kind; and these to be sworn to make due search and true report, to the utmost of their knowledge, whether the persons whose bodies they are appointed to search do die of the infection, or of what other diseases, as near as they can. And that the physicians who shall be appointed for the cure and prevention of the infection do call before them the said searchers, who are or shall be appointed for the several parishes under their respective cares, to the end they may consider whether they be fitly qualified for that employment, and charge them from time to time, as they shall see cause, if they appear defective in their duties. That no searcher during this time of visitation be permitted to use any public work or employment, or keep a shop or stall, or be employed as a laundress, or in any other common employment whatsoever. Chirurgeons.[78] For better assistance of the searchers, forasmuch as there has been heretofore great abuse in misreporting the disease, to the further spreading of the infection, it is therefore ordered that there be chosen and appointed able and discreet chirurgeons besides those that do already belong to the pesthouse, amongst whom the city and liberties to be quartered as they lie most apt and convenient; and every of these to have one quarter for his limit. And the said chirurgeons in every of their limits to join with the searchers for the view of the body, to the end there may be a true report made of the disease. And further: that the said chirurgeons shall visit and search such like persons as shall either send for them, or be named and directed unto them by the examiners of every parish, and inform themselves of the disease of the said parties. And forasmuch as the said chirurgeons are to be sequestered from all other cures,[79] and kept only to this disease of the infection, it is ordered that every of the said chirurgeons shall have twelvepence a body searched by them, to be paid out of the goods of the party searched, if he be able, or otherwise by the parish. Nurse Keepers. If any nurse keeper shall remove herself out of any infected house before twenty-eight days after the decease of any person dying of the infection, the house to which the said nurse keeper doth so remove herself shall be shut up until the said twenty-eight days shall be expired. ORDERS CONCERNING INFECTED HOUSES, AND PERSONS SICK OF THE PLAGUE. Notice to be given of the Sickness. The master of every house, as soon as any one in his house complaineth either of botch, or purple, or swelling in any part of his body, or falleth otherwise dangerously sick without apparent cause of some other disease, shall give notice thereof to the examiner of health, within two hours after the said sign shall appear. Sequestration of the Sick. 
As soon as any man shall be found by this examiner, chirurgeon, or searcher, to be sick of the plague, he shall the same night be sequestered in the same house; and in case he be so sequestered, then, though he die not, the house wherein he sickened shall be shut up for a month after the use of the due preservatives taken by the rest. Airing the Stuff. For sequestration of the goods and stuff of the infection, their bedding and apparel, and hangings of chambers, must be well aired with fire, and such perfumes as are requisite, within the infected house, before they be taken again to use. This to be done by the appointment of the examiner. Shutting up of the House. If any person shall visit any man known to be infected of the plague, or entereth willingly into any known infected house, being not allowed, the house wherein he inhabiteth shall be shut up for certain days by the examiner's direction. None to be removed out of Infected Houses, but, etc. Item, That none be removed out of the house where he falleth sick of the infection into any other house in the city (except it be to the pesthouse or a tent, or unto some such house which the owner of the said house holdeth in his own hands, and occupieth by his own servants), and so as security be given to the said parish whither such remove is made, that the attendance and charge about the said visited persons shall be observed and charged in all the particularities before expressed, without any cost of that parish to which any such remove shall happen to be made, and this remove to be done by night. And it shall be lawful to any person that hath two houses to remove either his sound or his infected people to his spare house at his choice, so as, if he send away first his sound, he do not after send thither the sick; nor again unto the sick, the sound; and that the same which he sendeth be for one week at the least shut up, and secluded from company, for the fear of some infection at first not appearing. Burial of the Dead. That the burial of the dead by this visitation be at most convenient hours, always before sunrising, or after sunsetting, with the privity[80] of the churchwardens, or constable, and not otherwise; and that no neighbors nor friends be suffered to accompany the corpse to church, or to enter the house visited, upon pain of having his house shut up, or be imprisoned. And that no corpse dying of the infection shall be buried, or remain in any church, in time of common prayer, sermon, or lecture. And that no children be suffered, at time of burial of any corpse, in any church, churchyard, or burying place, to come near the corpse, coffin, or grave; and that all graves shall be at least six feet deep. And further, all public assemblies at other burials are to be forborne during the continuance of this visitation. No Infected Stuff to be uttered.[81] That no clothes, stuff, bedding, or garments, be suffered to be carried or conveyed out of any infected houses, and that the criers and carriers abroad of bedding or old apparel to be sold or pawned be utterly prohibited and restrained, and no brokers of bedding or old apparel be permitted to make any public show, or hang forth on their stalls, shop boards, or windows towards any street, lane, common way, or passage, any old bedding or apparel to be sold, upon pain of imprisonment. 
And if any broker or other person shall buy any bedding, apparel, or other stuff out of any infected house, within two months after the infection hath been there, his house shall be shut up as infected, and so shall continue shut up twenty days at the least. No Person to be conveyed out of any Infected House. If any person visited[82] do fortune,[83] by negligent looking unto, or by any other means, to come or be conveyed from a place infected to any other place, the parish from whence such party hath come, or been conveyed, upon notice thereof given, shall, at their charge, cause the said party so visited and escaped to be carried and brought back again by night; and the parties in this case offending to be punished at the direction of the alderman of the ward, and the house of the receiver of such visited person to be shut up for twenty days. Every Visited House to be marked. That every house visited be marked with a red cross of a foot long, in the middle of the door, evident to be seen, and with these usual printed words, that is to say, "Lord have mercy upon us," to be set close over the same cross, there to continue until lawful opening of the same house. Every Visited House to be watched. That the constables see every house shut up, and to be attended with watchmen, which may keep in, and minister necessaries to them at their own charges, if they be able, or at the common charge if they be unable. The shutting up to be for the space of four weeks after all be whole. That precise order be taken that the searchers, chirurgeons, keepers, and buriers, are not to pass the streets without holding a red rod or wand of three foot in length in their hands, open and evident to be seen; and are not to go into any other house than into their own, or into that whereunto they are directed or sent for, but to forbear and abstain from company, especially when they have been lately used[84] in any such business or attendance. Inmates. That where several inmates are in one and the same house, and any person in that house happens to be infected, no other person or family of such house shall be suffered to remove him or themselves without a certificate from the examiners of the health of that parish; or, in default thereof, the house whither she or they remove shall be shut up as is in case of visitation. Hackney Coaches. That care be taken of hackney coachmen, that they may not, as some of them have been observed to do after carrying of infected persons to the pesthouse and other places, be admitted to common use till their coaches be well aired, and have stood unemployed by the space of five or six days after such service. ORDERS FOR CLEANSING AND KEEPING OF THE STREETS SWEPT. The Streets to be kept Clean. First, it is thought necessary, and so ordered, that every householder do cause the street to be daily prepared before his door, and so to keep it clean swept all the week long. That Rakers take it from out the Houses. That the sweeping and filth of houses be daily carried away by the rakers, and that the raker shall give notice of his coming by the blowing of a horn, as hitherto hath been done. Laystalls[85] to be made far off from the City. That the laystalls be removed as far as may be out of the city and common passages, and that no nightman or other be suffered to empty a vault into any vault or garden near about the city. Care to be had of Unwholesome Fish or Flesh, and of Musty Corn. 
That special care be taken that no stinking fish, or unwholesome flesh, or musty corn, or other corrupt fruits, of what sort soever, be suffered to be sold about the city or any part of the same. That the brewers and tippling-houses be looked unto for musty and unwholesome casks. That no hogs, dogs, or cats, or tame pigeons, or conies, be suffered to be kept within any part of the city, or any swine to be or stray in the streets or lanes, but that such swine be impounded by the beadle[86] or any other officer, and the owner punished according to the act of common council; and that the dogs be killed by the dog killers appointed for that purpose. ORDERS CONCERNING LOOSE PERSONS AND IDLE ASSEMBLIES. Beggars. Forasmuch as nothing is more complained of than the multitude of rogues and wandering beggars that swarm about in every place about the city, being a great cause of the spreading of the infection, and will not be avoided[87] notwithstanding any orders that have been given to the contrary: it is therefore now ordered that such constables, and others whom this matter may any way concern, take special care that no wandering beggars be suffered in the streets of this city, in any fashion or manner whatsoever, upon the penalty provided by law to be duly and severely executed upon them. Plays. That all plays, bear baitings,[88] games, singing of ballads, buckler play,[89] or such like causes of assemblies of people, be utterly prohibited, and the parties offending severely punished by every alderman in his ward. Feasting prohibited. That all public feasting, and particularly by the companies[90] of this city, and dinners in taverns, alehouses, and other places of public entertainment, be forborne till further order and allowance, and that the money thereby spared be preserved, and employed for the benefit and relief of the poor visited with the infection. Tippling-Houses. That disorderly tippling in taverns, alehouses, coffeehouses, and cellars, be severely looked unto as the common sin of the time, and greatest occasion of dispersing the plague. And that no company or person be suffered to remain or come into any tavern, alehouse, or coffeehouse, to drink, after nine of the clock in the evening, according to the ancient law and custom of this city, upon the penalties ordained by law. And for the better execution of these orders, and such other rules and directions as upon further consideration shall be found needful, it is ordered and enjoined that the aldermen, deputies, and common councilmen shall meet together weekly, once, twice, thrice, or oftener, as cause shall require, at some one general place accustomed in their respective wards, being clear from infection of the plague, to consult how the said orders may be put in execution, not intending that any dwelling in or near places infected shall come to the said meeting while their coming may be doubtful. And the said aldermen, deputies, and common councilmen, in their several wards, may put in execution any other orders that by them, at their said meetings, shall be conceived and devised for the preservation of his Majesty's subjects from the infection.

Sir JOHN LAWRENCE, Lord Mayor.
Sir GEORGE WATERMAN, }
Sir CHARLES DOE,     } Sheriffs.

I need not say that these orders extended only to such places as were within the lord mayor's jurisdiction: so it is requisite to observe that the justices of peace within those parishes and places as were called the "hamlets" and "outparts" took the same method.
As I remember, the orders for shutting up of houses did not take place so soon on our side, because, as I said before, the plague did not reach to this eastern part of the town at least, nor begin to be violent till the beginning of August. For example, the whole bill from the 11th to the 18th of July was 1,761, yet there died but 71 of the plague in all those parishes we call the Tower Hamlets; and they were as follows:--

                              The next week    To Aug. 1
                              was thus:        thus:
Aldgate,                 14        34              65
Stepney,                 33        58              76
Whitechapel,             21        48              79
St. Kath. Tower,[91]      2         4               4
Trin. Minories,[92]       1         1               4
                         --       ---             ---
                         71       145             228

It was indeed coming on amain, for the burials that same week were, in the next adjoining parishes, thus:--

                              The next week        To Aug. 1
                              prodigiously         thus:
                              increased, as thus:
St. L.[93] Shoreditch    64        84                 110
St. Bot.[94] Bishopsg.   65       105                 116
St. Giles's Crippl.[95] 213       431                 554
                        ---       ---                 ---
                        342       620                 780

This shutting up of houses was at first counted a very cruel and unchristian method, and the poor people so confined made bitter lamentations. Complaints of the severity of it were also daily brought to my lord mayor, of houses causelessly, and some maliciously, shut up. I cannot say but upon inquiry many that complained so loudly were found in a condition to be continued; and others again, inspection being made upon the sick person, and the sickness not appearing infectious, or, if uncertain, yet, on his being content to be carried to the pesthouse, was[96] released. As I went along Houndsditch one morning, about eight o'clock, there was a great noise. It is true, indeed, there was not much crowd, because the people were not very free to gather together, or to stay long together when they were there, nor did I stay long there; but the outcry was loud enough to prompt my curiosity, and I called to one, who looked out of a window, and asked what was the matter. A watchman, it seems, had been employed to keep his post at the door of a house which was infected, or said to be infected, and was shut up. He had been there all night, for two nights together, as he told his story, and the day watchman had been there one day, and was now come to relieve him. All this while no noise had been heard in the house, no light had been seen, they called for nothing, sent him of no errands (which used to be the chief business of the watchmen), neither had they given him any disturbance, as he said, from Monday afternoon, when he heard a great crying and screaming in the house, which, as he supposed, was occasioned by some of the family dying just at that time. It seems the night before, the "dead cart," as it was called, had been stopped there, and a servant maid had been brought down to the door dead; and the "buriers" or "bearers," as they were called, put her into the cart, wrapped only in a green rug, and carried her away. The watchman had knocked at the door, it seems, when he heard that noise and crying, as above, and nobody answered a great while; but at last one looked out and said with an angry, quick tone, and yet a kind of crying voice, or a voice of one that was crying, "What d'ye want, that you make such a knocking?" He answered, "I am the watchman. How do you do? What is the matter?" The person answered, "What is that to you? Stop the dead cart." This, it seems, was about one o'clock.
Soon after, as the fellow said, he stopped the dead cart, and then knocked again, but nobody answered; he continued knocking, and the bellman called out several times, "Bring out your dead;" but nobody answered, till the man that drove the cart, being called to other houses, would stay no longer, and drove away. The watchman knew not what to make of all this, so he let them alone till the morning man, or "day watchman," as they called him, came to relieve him. Giving him an account of the particulars, they knocked at the door a great while, but nobody answered; and they observed that the window or casement at which the person looked out who had answered before, continued open, being up two pair of stairs. Upon this, the two men, to satisfy their curiosity, got a long ladder, and one of them went up to the window and looked into the room, where he saw a woman lying dead upon the floor, in a dismal manner, having no clothes on her but her shift.[97] But though he called aloud, and, putting in his long staff, knocked hard on the floor, yet nobody stirred or answered, neither could he hear any noise in the house. He came down again upon this, and acquainted his fellow, who went up also; and finding it just so, they resolved to acquaint either the lord mayor or some other magistrate of it, but did not offer to go in at the window. The magistrate, it seems, upon the information of the two men, ordered the house to be broke open, a constable and other persons being appointed to be present, that nothing might be plundered; and accordingly it was so done, when nobody was found in the house but that young woman, who having been infected, and past recovery, the rest had left her to die by herself, and every one gone, having found some way to delude the watchman, and to get open the door, or get out at some back door, or over the tops of the houses, so that he knew nothing of it. And as to those cries and shrieks which he heard, it was supposed they were the passionate cries of the family at this bitter parting, which, to be sure, it was to them all, this being the sister to the mistress of the family; the man of the house, his wife, several children and servants, being all gone and fled: whether sick or sound, that I could never learn, nor, indeed, did I make much inquiry after it. At another house, as I was informed, in the street next within Aldgate, a whole family was shut up and locked in because the maidservant was taken sick. The master of the house had complained by his friends to the next alderman, and to the lord mayor, and had consented to have the maid carried to the pesthouse, but was refused: so the door was marked with a red cross, a padlock on the outside, as above, and a watchman set to keep the door, according to public order. After the master of the house found there was no remedy, but that he, his wife, and his children, were locked up with this poor distempered servant, he called to the watchman, and told him he must go then and fetch a nurse for them to attend this poor girl, for that it would be certain death to them all to oblige them to nurse her, and told him plainly that if he would not do this the maid would perish either[98] of the distemper, or be starved for want of food, for he was resolved none of his family should go near her; and she lay in the garret, four story high, where she could not cry out or call to anybody for help. The watchman consented to that, and went and fetched a nurse as he was appointed, and brought her to them the same evening. 
During this interval, the master of the house took his opportunity to break a large hole through his shop into a bulk or stall, where formerly a cobbler had sat before or under his shop window; but the tenant, as may be supposed, at such a dismal time as that, was dead or removed, and so he had the key in his own keeping. Having[99] made his way into this stall, which he could not have done if the man had been at the door, the noise he was obliged to make being such as would have alarmed the watchman,--I say, having made his way into this stall, he sat still till the watchman returned with the nurse, and all the next day also; but the night following, having contrived to send the watchman of another trifling errand (which, as I take it, was to an apothecary's for a plaster for the maid, which he was to stay for the making up, or some other such errand that might secure his staying some time), in that time he conveyed himself and all his family out of the house, and left the nurse and the watchman to bury the poor wench, that is, throw her into the cart, and take care of the house. Not far from the same place they blowed up a watchman with gunpowder, and burned the poor fellow dreadfully; and while he made hideous cries, and nobody would venture to come near to help him, the whole family that were able to stir got out at the windows (one story high), two that were left sick calling out for help. Care was taken to give them nurses to look after them; but the persons fled were never found till, after the plague was abated, they returned. But as nothing could be proved, so nothing could be done to them. In other cases, some had gardens and walls, or pales,[100] between them and their neighbors, or yards and backhouses; and these, by friendship and entreaties, would get leave to get over those walls or pales, and so go out at their neighbors' doors, or, by giving money to their servants, get them to let them through in the night. So that, in short, the shutting up of houses was in no wise to be depended upon; neither did it answer the end at all, serving more to make the people desperate, and drive them to such extremities as that they would break out at all adventures. And that which was still worse, those that did thus break out spread the infection farther, by their wandering about with the distemper upon them in their desperate circumstances, than they would otherwise have done; for whoever considers all the particulars in such cases must acknowledge, and cannot doubt, but the severity of those confinements made many people desperate, and made them run out of their houses at all hazards, and with the plague visibly upon them, not knowing either whither to go, or what to do, or indeed what they did. And many that did so were driven to dreadful exigencies and extremities, and perished in the streets or fields for mere want, or dropped down by[101] the raging violence of the fever upon them. Others wandered into the country, and went forward any way, as their desperation guided them, not knowing whither they went or would go, till, faint and tired, and not getting any relief, the houses and villages on the road refusing to admit them to lodge, whether infected or no, they have perished by the roadside, or gotten into barns, and died there, none daring to come to them or relieve them, though perhaps not infected, for nobody would believe them. 
On the other hand, when the plague at first seized a family, that is to say, when any one body of the family had gone out, and unwarily or otherwise catched[102] the distemper and brought it home, it was certainly known by the family before it was known to the officers, who, as you will see by the order, were appointed to examine into the circumstances of all sick persons, when they heard of their being sick. In this interval, between their being taken sick and the examiners coming, the master of the house had leisure and liberty to remove himself, or all his family, if he knew whither to go; and many did so. But the great disaster was, that many did thus after they were really infected themselves, and so carried the disease into the houses of those who were so hospitable as to receive them; which, it must be confessed, was very cruel and ungrateful. I am speaking now of people made desperate by the apprehensions of their being shut up, and their breaking out by stratagem or force, either before or after they were shut up, whose misery was not lessened when they were out, but sadly increased. On the other hand, many who thus got away had retreats to go to, and other houses, where they locked themselves up, and kept hid till the plague was over; and many families, foreseeing the approach of the distemper, laid up stores of provisions sufficient for their whole families, and shut themselves up, and that so entirely, that they were neither seen or heard of till the infection was quite ceased, and then came abroad sound and well. I might recollect several such as these, and give you the particulars of their management; for doubtless it was the most effectual secure step that could be taken for such whose circumstances would not admit them to remove, or who had not retreats abroad proper for the case; for, in being thus shut up, they were as if they had been a hundred miles off. Nor do I remember that any one of those families miscarried.[103] Among these, several Dutch merchants were particularly remarkable, who kept their houses like little garrisons besieged, suffering none to go in or out, or come near them; particularly one in a court in Throckmorton Street, whose house looked into Drapers' Garden. But I come back to the case of families infected, and shut up by the magistrates. The misery of those families is not to be expressed; and it was generally in such houses that we heard the most dismal shrieks and outcries of the poor people, terrified, and even frightened to death, by the sight of the condition of their dearest relations, and by the terror of being imprisoned as they were. I remember, and while I am writing this story I think I hear the very sound of it: a certain lady had an only daughter, a young maiden about nineteen years old, and who was possessed of a very considerable fortune. They were only lodgers in the house where they were. The young woman, her mother, and the maid had been abroad on some occasion, I do not remember what, for the house was not shut up; but about two hours after they came home, the young lady complained she was not well; in a quarter of an hour more she vomited, and had a violent pain in her head. "Pray God," says her mother, in a terrible fright, "my child has not the distemper!" The pain in her head increasing, her mother ordered the bed to be warmed, and resolved to put her to bed, and prepared to give her things to sweat, which was the ordinary remedy to be taken when the first apprehensions of the distemper began. 
While the bed was airing, the mother undressed the young woman, and just as she was laid down in the bed, she, looking upon her body with a candle, immediately discovered the fatal tokens on the inside of her thighs. Her mother, not being able to contain herself, threw down her candle, and screeched out in such a frightful manner, that it was enough to place horror upon the stoutest heart in the world. Nor was it one scream, or one cry, but, the fright having seized her spirits, she fainted first, then recovered, then ran all over the house (up the stairs and down the stairs) like one distracted, and indeed really was distracted, and continued screeching and crying out for several hours, void of all sense, or at least government of her senses, and, as I was told, never came thoroughly to herself again. As to the young maiden, she was a dead corpse from that moment: for the gangrene, which occasions the spots, had spread over her whole body, and she died in less than two hours. But still the mother continued crying out, not knowing anything more of her child, several hours after she was dead. It is so long ago that I am not certain, but I think the mother never recovered, but died in two or three weeks after. I have by me a story of two brothers and their kinsman, who, being single men, but that had staid[104] in the city too long to get away, and, indeed, not knowing where to go to have any retreat, nor having wherewith to travel far, took a course for their own preservation, which, though in itself at first desperate, yet was so natural that it may be wondered that no more did so at that time. They were but of mean condition, and yet not so very poor as that they could not furnish themselves with some little conveniences, such as might serve to keep life and soul together; and finding the distemper increasing in a terrible manner, they resolved to shift as well as they could, and to be gone. One of them had been a soldier in the late wars,[105] and before that in the Low Countries;[106] and having been bred to no particular employment but his arms, and besides, being wounded, and not able to work very hard, had for some time been employed at a baker's of sea biscuit, in Wapping. The brother of this man was a seaman too, but somehow or other had been hurt of[107] one leg, that he could not go to sea, but had worked for his living at a sailmaker's in Wapping or thereabouts, and, being a good husband,[108] had laid up some money, and was the richest of the three. The third man was a joiner or carpenter by trade, a handy fellow, and he had no wealth but his box or basket of tools, with the help of which he could at any time get his living (such a time as this excepted) wherever he went; and he lived near Shadwell. They all lived in Stepney Parish, which, as I have said, being the last that was infected, or at least violently, they staid there till they evidently saw the plague was abating at the west part of the town, and coming towards the east, where they lived. The story of those three men, if the reader will be content to have me give it in their own persons, without taking upon me to either vouch the particulars or answer for any mistakes, I shall give as distinctly as I can, believing the history will be a very good pattern for any poor man to follow in case the like public desolation should happen here. And if there may be no such occasion, (which God of his infinite mercy grant us!) 
still the story may have its uses so many ways as that it will, I hope, never be said that the relating has been unprofitable. I say all this previous to the history, having yet, for the present, much more to say before I quit my own part. I went all the first part of the time freely about the streets, though not so freely as to run myself into apparent danger, except when they dug the great pit in the churchyard of our parish of Aldgate. A terrible pit it was, and I could not resist my curiosity to go and see it. As near as I may judge, it was about forty feet in length, and about fifteen or sixteen feet broad, and at the time I first looked at it about nine feet deep. But it was said they dug it near twenty feet deep afterwards, in one part of it, till they could go no deeper for the water; for they had, it seems, dug several large pits before this; for, though the plague was long a-coming[109] to our parish, yet, when it did come, there was no parish in or about London where it raged with such violence as in the two parishes of Aldgate and Whitechapel. I say they had dug several pits in another ground when the distemper began to spread in our parish, and especially when the dead carts began to go about, which was not in our parish till the beginning of August. Into these pits they had put perhaps fifty or sixty bodies each; then they made larger holes, wherein they buried all that the cart brought in a week, which, by the middle to the end of August, came to from two hundred to four hundred a week. And they could not well dig them larger, because of the order of the magistrates, confining them to leave no bodies within six feet of the surface; and the water coming on at about seventeen or eighteen feet, they could not well, I say, put more in one pit. But now, at the beginning of September, the plague raging in a dreadful manner, and the number of burials in our parish increasing to more than was[110] ever buried in any parish about London of no larger extent, they ordered this dreadful gulf to be dug, for such it was rather than a pit. They had supposed this pit would have supplied them for a month or more when they dug it; and some blamed the churchwardens for suffering such a frightful thing, telling them they were making preparations to bury the whole parish, and the like. But time made it appear, the churchwardens knew the condition of the parish better than they did: for, the pit being finished the 4th of September, I think they began to bury in it the 6th, and by the 20th, which was just two weeks, they had thrown into it eleven hundred and fourteen bodies, when they were obliged to fill it up, the bodies being then come to lie within six feet of the surface. I doubt not but there may be some ancient persons alive in the parish who can justify the fact of this, and are able to show even in what place of the churchyard the pit lay, better than I can: the mark of it also was many years to be seen in the churchyard on the surface, lying in length, parallel with the passage which goes by the west wall of the churchyard out of Houndsditch, and turns east again into Whitechapel, coming out near the Three Nuns Inn. It was about the 10th of September that my curiosity led, or rather drove, me to go and see this pit again, when there had been near four hundred people buried in it. 
And I was not content to see it in the daytime, as I had done before,--for then there would have been nothing to have been seen but the loose earth, for all the bodies that were thrown in were immediately covered with earth by those they called the "buriers," which at other times were called "bearers,"--but I resolved to go in the night, and see some of them thrown in. There was a strict order to prevent people coming to those pits, and that was only to prevent infection. But after some time that order was more necessary; for people that were infected and near their end, and delirious also, would run to those pits wrapped in blankets, or rugs, and throw themselves in, and, as they said, "bury themselves." I cannot say that the officers suffered any willingly to lie there; but I have heard that in a great pit in Finsbury, in the parish of Cripplegate (it lying open then to the fields, for it was not then walled about), many came and threw themselves in, and expired there, before they threw any earth upon them; and that when they came to bury others, and found them there, they were quite dead, though not cold. This may serve a little to describe the dreadful condition of that day, though it is impossible to say anything that is able to give a true idea of it to those who did not see it, other than this: that it was indeed very, very, very dreadful, and such as no tongue can express. I got admittance into the churchyard by being acquainted with the sexton who attended, who, though he did not refuse me at all, yet earnestly persuaded me not to go, telling me very seriously (for he was a good, religious, and sensible man) that it was indeed their business and duty to venture, and to run all hazards, and that in it they might hope to be preserved; but that I had no apparent call to it but my own curiosity, which, he said, he believed I would not pretend was sufficient to justify my running that hazard. I told him I had been pressed in my mind to go, and that perhaps it might be an instructing sight that might not be without its uses. "Nay," says the good man, "if you will venture upon that score, 'name of God,[111] go in; for, depend upon it, it will be a sermon to you, it may be, the best that ever you heard in your life. It is a speaking sight," says he, "and has a voice with it, and a loud one, to call us all to repentance;" and with that he opened the door, and said, "Go, if you will." His discourse had shocked my resolution a little, and I stood wavering for a good while; but just at that interval I saw two links[112] come over from the end of the Minories, and heard the bellman, and then appeared a "dead cart," as they called it, coming over the streets: so I could no longer resist my desire of seeing it, and went in. There was nobody, as I could perceive at first, in the churchyard, or going into it, but the buriers, and the fellow that drove the cart, or rather led the horse and cart; but when they came up to the pit, they saw a man go to and again,[113] muffled up in a brown cloak, and making motions with his hands, under his cloak, as if he was[114] in great agony. And the buriers immediately gathered about him, supposing he was one of those poor delirious or desperate creatures that used to pretend, as I have said, to bury themselves. He said nothing as he walked about, but two or three times groaned very deeply and loud, and sighed as[115] he would break his heart. 
When the buriers came up to him, they soon found he was neither a person infected and desperate, as I have observed above, or a person distempered in mind, but one oppressed with a dreadful weight of grief indeed, having his wife and several of his children all in the cart that was just come in with him; and he followed in an agony and excess of sorrow. He mourned heartily, as it was easy to see, but with a kind of masculine grief, that could not give itself vent by tears, and, calmly desiring the buriers to let him alone, said he would only see the bodies thrown in, and go away. So they left importuning him; but no sooner was the cart turned round, and the bodies shot into the pit promiscuously,--which was a surprise to him, for he at least expected they would have been decently laid in, though, indeed, he was afterwards convinced that was impracticable,--I say, no sooner did he see the sight, but he cried out aloud, unable to contain himself. I could not hear what he said, but he went backward two or three steps, and fell down in a swoon. The buriers ran to him and took him up, and in a little while he came to himself, and they led him away to the Pye[116] Tavern, over against the end of Houndsditch, where, it seems, the man was known, and where they took care of him. He looked into the pit again as he went away; but the buriers had covered the bodies so immediately with throwing in earth, that, though there was light enough (for there were lanterns,[117] and candles in them, placed all night round the sides of the pit upon the heaps of earth, seven or eight, or perhaps more), yet nothing could be seen. This was a mournful scene indeed, and affected me almost as much as the rest. But the other was awful, and full of terror: the cart had in it sixteen or seventeen bodies; some were wrapped up in linen sheets, some in rugs, some little other than naked, or so loose that what covering they had fell from them in the shooting out of the cart, and they fell quite naked among the rest; but the matter was not much to them, or the indecency much to any one else, seeing they were all dead, and were to be huddled together into the common grave of mankind, as we may call it; for here was no difference made, but poor and rich went together. There was no other way of burials, neither was it possible there should,[118] for coffins were not to be had for the prodigious numbers that fell in such a calamity as this. It was reported, by way of scandal upon the buriers, that if any corpse was delivered to them decently wound up, as we called it then, in a winding sheet tied over the head and feet (which some did, and which was generally of good linen),--I say, it was reported that the buriers were so wicked as to strip them in the cart, and carry them quite naked to the ground; but as I cannot credit anything so vile among Christians, and at a time so filled with terrors as that was, I can only relate it, and leave it undetermined. Innumerable stories also went about of the cruel behavior and practice of nurses who attended the sick, and of their hastening on the fate of those they attended in their sickness. But I shall say more of this in its place. I was indeed shocked with this sight, it almost overwhelmed me; and I went away with my heart most afflicted, and full of afflicting thoughts such as I cannot describe. 
Just at my going out of the church, and turning up the street towards my own house, I saw another cart, with links, and a bellman going before, coming out of Harrow Alley, in the Butcher Row, on the other side of the way; and being, as I perceived, very full of dead bodies, it went directly over the street, also, towards the church. I stood a while, but I had no stomach[119] to go back again to see the same dismal scene over again: so I went directly home, where I could not but consider with thankfulness the risk I had run, believing I had gotten no injury, as indeed I had not. Here the poor unhappy gentleman's grief came into my head again, and indeed I could not but shed tears in the reflection upon it, perhaps more than he did himself; but his case lay so heavy upon my mind, that I could not prevail with myself but that I must go out again into the street, and go to the Pye Tavern, resolving to inquire what became of him. It was by this time one o'clock in the morning, and yet the poor gentleman was there. The truth was, the people of the house, knowing him, had entertained him, and kept him there all the night, notwithstanding the danger of being infected by him, though it appeared the man was perfectly sound himself. It is with regret that I take notice of this tavern. The people were civil, mannerly, and an obliging sort of folks enough, and had till this time kept their house open, and their trade going on, though not so very publicly as formerly. But there was a dreadful set of fellows that used their house, and who, in the middle of all this horror, met there every night, behaving with all the reveling and roaring extravagances as is usual for such people to do at other times, and indeed to such an offensive degree that the very master and mistress of the house grew first ashamed, and then terrified, at them. They sat generally in a room next the street; and as they always kept late hours, so when the dead cart came across the street end to go into Houndsditch, which was in view of the tavern windows, they would frequently open the windows as soon as they heard the bell, and look out at them; and as they might often hear sad lamentations of people in the streets, or at their windows, as the carts went along, they would make their impudent mocks and jeers at them, especially if they heard the poor people call upon God to have mercy upon them, as many would do at those times, in their ordinary passing along the streets. These gentlemen, being something disturbed with the clutter of bringing the poor gentleman into the house, as above, were first angry and very high with the master of the house for suffering such a fellow, as they called him, to be brought out of the grave into their house; but being answered that the man was a neighbor, and that he was sound, but overwhelmed with the calamity of his family, and the like, they turned their anger into ridiculing the man and his sorrow for his wife and children, taunting him with want of courage to leap into the great pit, and go to heaven, as they jeeringly expressed it, along with them; adding some very profane and even blasphemous expressions. They were at this vile work when I came back to the house; and as far as I could see, though the man sat still, mute and disconsolate, and their affronts could not divert his sorrow, yet he was both grieved and offended at their discourse. Upon this, I gently reproved them, being well enough acquainted with their characters, and not unknown in person to two of them. 
They immediately fell upon me with ill language and oaths, asked me what I did out of my grave at such a time, when so many honester men were carried into the churchyard, and why I was not at home saying my prayers, against[120] the dead cart came for me, and the like. I was indeed astonished at the impudence of the men, though not at all discomposed at their treatment of me: however, I kept my temper. I told them that though I defied them, or any man in the world, to tax me with any dishonesty, yet I acknowledged, that, in this terrible judgment of God, many better than I were swept away, and carried to their grave; but, to answer their question directly, the case was, that I was mercifully preserved by that great God whose name they had blasphemed and taken in vain by cursing and swearing in a dreadful manner; and that I believed I was preserved in particular, among other ends of his goodness, that I might reprove them for their audacious boldness in behaving in such a manner, and in such an awful time as this was, especially for their jeering and mocking at an honest gentleman and a neighbor, for some of them knew him, who they saw was overwhelmed with sorrow for the breaches which it had pleased God to make upon his family. I cannot call exactly to mind the hellish, abominable raillery which was the return they made to that talk of mine, being provoked, it seems, that I was not at all afraid to be free with them; nor, if I could remember, would I fill my account with any of the words, the horrid oaths, curses, and vile expressions such as, at that time of the day, even the worst and ordinariest people in the street would not use: for, except such hardened creatures as these, the most wicked wretches that could be found had at that time some terror upon their mind of the hand of that Power which could thus in a moment destroy them. But that which was the worst in all their devilish language was, that they were not afraid to blaspheme God and talk atheistically, making a jest at my calling the plague the hand of God, mocking, and even laughing at the word "judgment," as if the providence of God had no concern in the inflicting such a desolating stroke; and that the people calling upon God, as they saw the carts carrying away the dead bodies, was all enthusiastic, absurd, and impertinent. I made them some reply, such as I thought proper, but which I found was so far from putting a check to their horrid way of speaking, that it made them rail the more: so that I confess it filled me with horror and a kind of rage; and I came away, as I told them, lest the hand of that Judgment which had visited the whole city should glorify his vengeance upon them and all that were near them. They received all reproof with the utmost contempt, and made the greatest mockery that was possible for them to do at me, giving me all the opprobrious insolent scoffs that they could think of for preaching to them, as they called it, which, indeed, grieved me rather than angered me; and I went away, blessing God, however, in my mind, that I had not spared them, though they had insulted me so much. They continued this wretched course three or four days after this, continually mocking and jeering at all that showed themselves religious or serious, or that were any way touched with the sense of the terrible judgment of God upon us; and I was informed they flouted in the same manner at the good people, who, notwithstanding the contagion, met at the church, fasted, and prayed to God to remove his hand from them. 
I say they continued this dreadful course three or four days (I think it was no more), when one of them, particularly he who asked the poor gentleman what he did out of his grave, was struck from Heaven with the plague, and died in a most deplorable manner; and, in a word, they were every one of them carried into the great pit, which I have mentioned above, before it was quite filled up, which was not above a fortnight or thereabout. These men were guilty of many extravagances, such as one would think human nature should have trembled at the thoughts of, at such a time of general terror as was then upon us, and particularly scoffing and mocking at everything which they happened to see that was religious among the people, especially at their thronging zealously to the place of public worship, to implore mercy from Heaven in such a time of distress; and this tavern where they held their club, being within view of the church door, they had the more particular occasion for their atheistical, profane mirth. But this began to abate a little with them before the accident, which I have related, happened; for the infection increased so violently at this part of the town now, that people began to be afraid to come to the church: at least such numbers did not resort thither as was usual. Many of the clergymen, likewise, were dead, and others gone into the country; for it really required a steady courage and a strong faith, for a man not only to venture being in town at such a time as this, but likewise to venture to come to church, and perform the office of a minister to a congregation of whom he had reason to believe many of them were actually infected with the plague, and to do this every day, or twice a day, as in some places was done. It seems they had been checked, for their open insulting religion in this manner, by several good people of every persuasion; and that[121] and the violent raging of the infection, I suppose, was the occasion that they had abated much of their rudeness for some time before, and were only roused by the spirit of ribaldry and atheism at the clamor which was made when the gentleman was first brought in there, and perhaps were agitated by the same devil when I took upon me to reprove them; though I did it at first with all the calmness, temper, and good manners that I could, which, for a while, they insulted me the more for, thinking it had been in fear of their resentment, though afterwards they found the contrary.[122] These things lay upon my mind, and I went home very much grieved and oppressed with the horror of these men's wickedness, and to think that anything could be so vile, so hardened, and so notoriously wicked, as to insult God, and his servants and his worship, in such a manner, and at such a time as this was, when he had, as it were, his sword drawn in his hand, on purpose to take vengeance, not on them only, but on the whole nation. I had indeed been in some passion at first with them, though it was really raised, not by any affront they had offered me personally, but by the horror their blaspheming tongues filled me with. 
However, I was doubtful in my thoughts whether the resentment I retained was not all upon my own private account; for they had given me a great deal of ill language too, I mean personally: but after some pause, and having a weight of grief upon my mind, I retired myself as soon as I came home (for I slept not that night), and, giving God most humble thanks for my preservation in the imminent danger I had been in, I set my mind seriously and with the utmost earnestness to pray for those desperate wretches, that God would pardon them, open their eyes, and effectually humble them. By this I not only did my duty, namely, to pray for those who despitefully used me, but I fully tried my own heart, to my full satisfaction that it was not filled with any spirit of resentment as they had offended me in particular; and I humbly recommend the method to all those that would know, or be certain, how to distinguish between their zeal for the honor of God and the effects of their private passions and resentment. I remember a citizen, who, having broken out of his house in Aldersgate Street or thereabout, went along the road to Islington. He attempted to have gone[123] in at the Angel Inn, and after that at the White Horse, two inns known still by the same signs, but was refused, after which he came to the Pyed[124] Bull, an inn also still continuing the same sign. He asked them for lodging for one night only, pretending to be going into Lincolnshire, and assuring them of his being very sound, and free from the infection, which also at that time had not reached much that way. They told him they had no lodging that they could spare but one bed up in the garret, and that they could spare that bed but for one night, some drovers being expected the next day with cattle: so, if he would accept of that lodging, he might have it, which he did. So a servant was sent up with a candle with him to show him the room. He was very well dressed, and looked like a person not used to lie in a garret; and when he came to the room, he fetched a deep sigh, and said to the servant, "I have seldom lain in such a lodging as this." However, the servant assured him again that they had no better. "Well," says he, "I must make shift.[125] This is a dreadful time, but it is but for one night." So he sat down upon the bedside, and bade the maid, I think it was, fetch him a pint of warm ale. Accordingly the servant went for the ale; but some hurry in the house, which perhaps employed her other ways, put it out of her head, and she went up no more to him. The next morning, seeing no appearance of the gentleman, somebody in the house asked the servant that had showed him upstairs what was become of him. She started. "Alas!" says she, "I never thought more of him. He bade me carry him some warm ale, but I forgot." Upon which, not the maid, but some other person, was sent up to see after him, who, coming into the room, found him stark dead, and almost cold, stretched out across the bed. His clothes were pulled off, his jaw fallen, his eyes open in a most frightful posture, the rug of the bed being grasped hard in one of his hands, so that it was plain he died soon after the maid left him; and it is probable, had she gone up with the ale, she had found him dead in a few minutes after he had sat down upon the bed. The alarm was great in the house, as any one may suppose, they having been free from the distemper till that disaster, which, bringing the infection to the house, spread it immediately to other houses round about it. 
I do not remember how many died in the house itself; but I think the maidservant who went up first with him fell presently ill by the fright, and several others; for, whereas there died but two in Islington of the plague the week before, there died nineteen the week after, whereof fourteen were of the plague. This was in the week from the 11th of July to the 18th. There was one shift[126] that some families had, and that not a few, when their houses happened to be infected, and that was this: the families who in the first breaking out of the distemper fled away into the country, and had retreats among their friends, generally found some or other of their neighbors or relations to commit the charge of those houses to, for the safety of the goods and the like. Some houses were indeed entirely locked up, the doors padlocked, the windows and doors having deal boards nailed over them, and only the inspection of them committed to the ordinary watchmen and parish officers; but these were but few. It was thought that there were not less than a thousand houses forsaken of the inhabitants in the city and suburbs, including what was in the outparishes and in Surrey, or the side of the water they called Southwark. This was besides the numbers of lodgers and of particular persons who were fled out of other families; so that in all it was computed that about two hundred thousand people were fled and gone in all.[127] But of this I shall speak again. But I mention it here on this account: namely, that it was a rule with those who had thus two houses in their keeping or care, that, if anybody was taken sick in a family, before the master of the family let the examiners or any other officer know of it, he immediately would send all the rest of his family, whether children or servants as it fell out to be, to such other house which he had not in charge, and then, giving notice of the sick person to the examiner, have a nurse or nurses appointed, and having another person to be shut up in the house with them (which many for money would do), so to take charge of the house in case the person should die. This was in many cases the saving a whole family, who, if they had been shut up with the sick person, would inevitably have perished. But, on the other hand, this was another of the inconveniences of shutting up houses; for the apprehensions and terror of being shut up made many run away with the rest of the family, who, though it was not publicly known, and they were not quite sick, had yet the distemper upon them; and who, by having an uninterrupted liberty to go about, but being obliged still to conceal their circumstances, or perhaps not knowing it themselves, gave the distemper to others, and spread the infection in a dreadful manner, as I shall explain further hereafter. I had in my family only an ancient woman that managed the house, a maidservant, two apprentices, and myself; and, the plague beginning to increase about us, I had many sad thoughts about what course I should take and how I should act. The many dismal objects[128] which happened everywhere as I went about the streets had filled my mind with a great deal of horror, for fear of the distemper itself, which was indeed very horrible in itself, and in some more than others. 
The swellings, which were generally in the neck or groin, when they grew hard, and would not break, grew so painful that it was equal to the most exquisite torture; and some, not able to bear the torment, threw themselves out at windows, or shot themselves, or otherwise made themselves away, and I saw several dismal objects of that kind. Others, unable to contain themselves, vented their pain by incessant roarings; and such loud and lamentable cries were to be heard, as we walked along the streets, that[129] would pierce the very heart to think of, especially when it was to be considered that the same dreadful scourge might be expected every moment to seize upon ourselves. I cannot say but that now I began to faint in my resolutions. My heart failed me very much, and sorely I repented of my rashness, when I had been out, and met with such terrible things as these I have talked of. I say I repented my rashness in venturing to abide in town, and I wished often that I had not taken upon me to stay, but had gone away with my brother and his family. Terrified by those frightful objects, I would retire home sometimes, and resolve to go out no more; and perhaps I would keep those resolutions for three or four days, which time I spent in the most serious thankfulness for my preservation and the preservation of my family, and the constant confession of my sins, giving myself up to God every day, and applying to him with fasting and humiliation and meditation. Such intervals as I had, I employed in reading books and in writing down my memorandums of what occurred to me every day, and out of which, afterwards, I took most of this work, as it relates to my observations without doors. What I wrote of my private meditations I reserve for private use, and desire it may not be made public on any account whatever. I also wrote other meditations upon divine subjects, such as occurred to me at that time, and were profitable to myself, but not fit for any other view, and therefore I say no more of that. I had a very good friend, a physician, whose name was Heath, whom I frequently visited during this dismal time, and to whose advice I was very much obliged for many things which he directed me to take by way of preventing the infection when I went out, as he found I frequently did, and to hold in my mouth when I was in the streets. He also came very often to see me; and as he was a good Christian, as well as a good physician, his agreeable conversation was a very great support to me in the worst of this terrible time. It was now the beginning of August, and the plague grew very violent and terrible in the place where I lived; and Dr. Heath coming to visit me, and finding that I ventured so often out in the streets, earnestly persuaded me to lock myself up, and my family, and not to suffer any of us to go out of doors; to keep all our windows fast, shutters and curtains close, and never to open them, but first to make a very strong smoke in the room, where the window or door was to be opened, with rosin[130] and pitch, brimstone and gunpowder, and the like; and we did this for some time. But, as I had not laid in a store of provision for such a retreat, it was impossible that we could keep within doors entirely. 
However, I attempted, though it was so very late, to do something towards it; and first, as I had convenience both for brewing and baking, I went and bought two sacks of meal, and for several weeks, having an oven, we baked all our own bread; also I bought malt, and brewed as much beer as all the casks I had would hold, and which seemed enough to serve my house for five or six weeks; also I laid in a quantity of salt butter and Cheshire cheese; but I had no flesh meat,[131] and the plague raged so violently among the butchers and slaughterhouses on the other side of our street, where they are known to dwell in great numbers, that it was not advisable so much as to go over the street among them. And here I must observe again, that this necessity of going out of our houses to buy provisions was in a great measure the ruin of the whole city; for the people catched the distemper, on these occasions, one of another; and even the provisions themselves were often tainted (at least I have great reason to believe so), and therefore I cannot say with satisfaction, what I know is repeated with great assurance, that the market people, and such as brought provisions to town, were never infected. I am certain the butchers of Whitechapel, where the greatest part of the flesh meat was killed, were dreadfully visited, and that at last to such a degree that few of their shops were kept open; and those that remained of them killed their meat at Mile End, and that way, and brought it to market upon horses. However, the poor people could not lay up provisions, and there was a necessity that they must go to market to buy, and others to send servants or their children; and, as this was a necessity which renewed itself daily, it brought abundance of unsound people to the markets; and a great many that went thither sound brought death home with them. It is true, people used all possible precaution. When any one bought a joint of meat in the market, they[132] would not take it out of the butcher's hand, but took it off the hooks themselves. On the other hand, the butcher would not touch the money, but have it put into a pot full of vinegar, which he kept for that purpose. The buyer carried always small money to make up any odd sum, that they might take no change. They carried bottles for scents and perfumes in their hands, and all the means that could be used were employed; but then the poor could not do even these things, and they went at all hazards. Innumerable dismal stories we heard every day on this very account. Sometimes a man or woman dropped down dead in the very markets; for many people that had the plague upon them knew nothing of it till the inward gangrene had affected their vitals, and they died in a few moments. This caused that many died frequently in that manner in the street suddenly, without any warning: others, perhaps, had time to go to the next bulk[133] or stall, or to any door or porch, and just sit down and die, as I have said before. These objects were so frequent in the streets, that when the plague came to be very raging on one side, there was scarce any passing by the streets but that several dead bodies would be lying here and there upon the ground.
On the other hand, it is observable, that though at first the people would stop as they went along, and call to the neighbors to come out on such an occasion, yet afterward no notice was taken of them; but that, if at any time we found a corpse lying, go across the way and not come near it; or, if in a narrow lane or passage, go back again, and seek some other way to go on the business we were upon. And in those cases the corpse was always left till the officers had notice to come and take them away, or till night, when the bearers attending the dead cart would take them up and carry them away. Nor did those undaunted creatures who performed these offices fail to search their pockets, and sometimes strip off their clothes, if they were well dressed, as sometimes they were, and carry off what they could get. But to return to the markets. The butchers took that care, that, if any person died in the market, they had the officers always at hand to take them up upon handbarrows, and carry them to the next churchyard; and this was so frequent that such were not entered in the weekly bill, found dead in the streets or fields, as is the case now, but they went into the general articles of the great distemper. But now the fury of the distemper increased to such a degree, that even the markets were but very thinly furnished with provisions, or frequented with buyers, compared to what they were before; and the lord mayor caused the country people who brought provisions to be stopped in the streets leading into the town, and to sit down there with their goods, where they sold what they brought, and went immediately away. And this encouraged the country people greatly to do so; for they sold their provisions at the very entrances into the town, and even in the fields, as particularly in the fields beyond Whitechapel, in Spittlefields. Note, those streets now called Spittlefields were then indeed open fields; also in St. George's Fields in Southwark, in Bunhill Fields, and in a great field called Wood's Close, near Islington. Thither the lord mayor, aldermen, and magistrates sent their officers and servants to buy for their families, themselves keeping within doors as much as possible; and the like did many other people. And after this method was taken, the country people came with great cheerfulness, and brought provisions of all sorts, and very seldom got any harm, which, I suppose, added also to that report of their being miraculously preserved.[134] As for my little family, having thus, as I have said, laid in a store of bread, butter, cheese, and beer, I took my friend and physician's advice, and locked myself up, and my family, and resolved to suffer the hardship of living a few months without flesh meat rather than to purchase it at the hazard of our lives. But, though I confined my family, I could not prevail upon my unsatisfied curiosity to stay within entirely myself, and, though I generally came frighted and terrified home, yet I could not restrain, only that, indeed, I did not do it so frequently as at first. I had some little obligations, indeed, upon me to go to my brother's house, which was in Coleman Street Parish, and which he had left to my care; and I went at first every day, but afterwards only once or twice a week. In these walks I had many dismal scenes before my eyes, as, particularly, of persons falling dead in the streets, terrible shrieks and screechings of women, who in their agonies would throw open their chamber windows, and cry out in a dismal surprising manner. 
It is impossible to describe the variety of postures in which the passions of the poor people would express themselves. Passing through Token-House Yard in Lothbury, of a sudden a casement violently opened just over my head, and a woman gave three frightful screeches, and then cried, "O death, death, death!" in a most inimitable tone, and which[135] struck me with horror, and[136] a chillness in my very blood. There was nobody to be seen in the whole street, neither did any other window open, for people had no curiosity now in any case, nor could anybody help one another: so I went on to pass into Bell Alley. Just in Bell Alley, on the right hand of the passage, there was a more terrible cry than that, though it was not so directed out at the window. But the whole family was in a terrible fright, and I could hear women and children run screaming about the rooms like distracted, when a garret window opened, and somebody from a window on the other side the alley called, and asked, "What is the matter?" Upon which from the first window it was answered, "O Lord, my old master has hanged himself!" The other asked again, "Is he quite dead?" and the first answered, "Ay, ay, quite dead; quite dead and cold!" This person was a merchant and a deputy alderman, and very rich. I care not to mention his name, though I knew his name too; but that would be a hardship to the family, which is now flourishing again.[137] But this is but one. It is scarce credible what dreadful cases happened in particular families every day,--people, in the rage of the distemper, or in the torment of their swellings, which was indeed intolerable, running out of their own government,[138] raving and distracted, and oftentimes laying violent hands upon themselves, throwing themselves out at their windows, shooting themselves, etc.; mothers murdering their own children in their lunacy; some dying of mere grief as a passion, some of mere fright and surprise without any infection at all; others frighted into idiotism[139] and foolish distractions, some into despair and lunacy, others into melancholy madness. The pain of the swelling was in particular very violent, and to some intolerable. The physicians and surgeons may be said to have tortured many poor creatures even to death. The swellings in some grew hard, and they applied violent drawing plasters, or poultices, to break them; and, if these did not do, they cut and scarified them in a terrible manner. In some, those swellings were made hard, partly by the force of the distemper, and partly by their being too violently drawn, and were so hard that no instrument could cut them; and then they burned them with caustics, so that many died raving mad with the torment, and some in the very operation. In these distresses, some, for want of help to hold them down in their beds or to look to them, laid hands upon themselves as above; some broke out into the streets, perhaps naked, and would run directly down to the river, if they were not stopped by the watchmen or other officers, and plunge themselves into the water wherever they found it. It often pierced my very soul to hear the groans and cries of those who were thus tormented. 
But of the two, this was counted the most promising particular in the whole infection: for if these swellings could be brought to a head, and to break and run, or, as the surgeons call it, to "digest," the patient generally recovered; whereas those who, like the gentlewoman's daughter, were struck with death at the beginning, and had the tokens come out upon them, often went about indifferently easy till a little before they died, and some till the moment they dropped down, as in apoplexies and epilepsies is often the case. Such would be taken suddenly very sick, and would run to a bench or bulk, or any convenient place that offered itself, or to their own houses, if possible, as I mentioned before, and there sit down, grow faint, and die. This kind of dying was much the same as it was with those who die of common mortifications,[140] who die swooning, and, as it were, go away in a dream. Such as died thus had very little notice of their being infected at all till the gangrene was spread through their whole body; nor could physicians themselves know certainly how it was with them till they opened their breasts, or other parts of their body, and saw the tokens. We had at this time a great many frightful stories told us of nurses and watchmen who looked after the dying people (that is to say, hired nurses, who attended infected people), using them barbarously, starving them, smothering them, or by other wicked means hastening their end, that is to say, murdering of them. And watchmen being set to guard houses that were shut up, when there has been but one person left, and perhaps that one lying sick, that[141] they have broke in and murdered that body, and immediately thrown them out into the dead cart; and so they have gone scarce cold to the grave. I cannot say but that some such murders were committed, and I think two were sent to prison for it, but died before they could be tried; and I have heard that three others, at several times, were executed for murders of that kind. But I must say I believe nothing of its being so common a crime as some have since been pleased to say; nor did it seem to be so rational, where the people were brought so low as not to be able to help themselves; for such seldom recovered, and there was no temptation to commit a murder, at least not equal to the fact, where they were sure persons would die in so short a time, and could not live. That there were a great many robberies and wicked practices committed even in this dreadful time, I do not deny. The power of avarice was so strong in some, that they would run any hazard to steal and to plunder; and, particularly in houses where all the families or inhabitants have been dead and carried out, they would break in at all hazards, and, without regard to the danger of infection, take even the clothes off the dead bodies, and the bedclothes from others where they lay dead. This, I suppose, must be the case of a family in Houndsditch, where a man and his daughter (the rest of the family being, as I suppose, carried away before by the dead cart) were found stark naked, one in one chamber and one in another, lying dead on the floor, and the clothes of the beds (from whence it is supposed they were rolled off by thieves) stolen, and carried quite away. It is indeed to be observed that the women were, in all this calamity, the most rash, fearless, and desperate creatures. 
And, as there were vast numbers that went about as nurses to tend those that were sick, they committed a great many petty thieveries in the houses where they were employed; and some of them were publicly whipped for it, when perhaps they ought rather to have been hanged for examples,[142] for numbers of houses were robbed on these occasions; till at length the parish officers were sent to recommend nurses to the sick, and always took an account who it was they sent, so as that they might call them to account if the house had been abused where they were placed. But these robberies extended chiefly to wearing-clothes, linen, and what rings or money they could come at, when the person died who was under their care, but not to a general plunder of the houses. And I could give you an account of one of these nurses, who several years after, being on her deathbed, confessed with the utmost horror the robberies she had committed at the time of her being a nurse, and by which she had enriched herself to a great degree. But as for murders, I do not find that there was ever any proofs of the fact in the manner as it has been reported, except as above. They did tell me, indeed, of a nurse in one place that laid a wet cloth upon the face of a dying patient whom she tended, and so put an end to his life, who was just expiring before; and another that smothered a young woman she was looking to, when she was in a fainting fit, and would have come to herself; some that killed them by giving them one thing, some another, and some starved them by giving them nothing at all. But these stories had two marks of suspicion that always attended them, which caused me always to slight them, and to look on them as mere stories that people continually frighted one another with: (1) That wherever it was that we heard it, they always placed the scene at the farther end of the town, opposite or most remote from where you were to hear it. If you heard it in Whitechapel, it had happened at St. Giles's, or at Westminster, or Holborn, or that end of the town; if you heard it at that end of the town, then it was done in Whitechapel, or the Minories, or about Cripplegate Parish; if you heard of it in the city, why, then, it happened in Southwark; and, if you heard of it in Southwark, then it was done in the city; and the like. In the next place, of whatsoever part you heard the story, the particulars were always the same, especially that of laying a wet double clout[143] on a dying man's face, and that of smothering a young gentlewoman: so that it was apparent, at least to my judgment, that there was more of tale than of truth in those things. A neighbor and acquaintance of mine, having some money owing to him from a shopkeeper in Whitecross Street or thereabouts, sent his apprentice, a youth about eighteen years of age, to endeavor to get the money. He came to the door, and, finding it shut, knocked pretty hard, and, as he thought, heard somebody answer within, but was not sure: so he waited, and after some stay knocked again, and then a third time, when he heard somebody coming downstairs. At length the man of the house came to the door. He had on his breeches, or drawers, and a yellow flannel waistcoat, no stockings, a pair of slip shoes, a white cap on his head, and, as the young man said, death in his face. When he opened the door, says he, "What do you disturb me thus for?" 
The boy, though a little surprised, replied, "I come from such a one; and my master sent me for the money, which he says you know of."--"Very well, child," returns the living ghost; "call, as you go by, at Cripplegate Church, and bid them ring the bell," and with these words shut the door again, and went up again, and died the same day, nay, perhaps the same hour. This the young man told me himself, and I have reason to believe it. This was while the plague was not come to a height. I think it was in June, towards the latter end of the month. It must have been before the dead carts came about, and while they used the ceremony of ringing the bell for the dead, which was over for certain, in that parish at least, before the month of July; for by the 25th of July there died five hundred and fifty and upwards in a week, and then they could no more bury in form[144] rich or poor. I have mentioned above, that, notwithstanding this dreadful calamity, yet that[145] numbers of thieves were abroad upon all occasions where they had found any prey, and that these were generally women. It was one morning about eleven o'clock, I had walked out to my brother's house in Coleman Street Parish, as I often did, to see that all was safe. My brother's house had a little court before it, and a brick wall and a gate in it, and within that several warehouses, where his goods of several sorts lay. It happened that in one of these warehouses were several packs of women's high-crowned hats, which came out of the country, and were, as I suppose, for exportation, whither I know not. I was surprised that when I came near my brother's door, which was in a place they called Swan Alley, I met three or four women with high-crowned hats on their heads; and, as I remembered afterwards, one, if not more, had some hats likewise in their hands. But as I did not see them come out at my brother's door, and not knowing that my brother had any such goods in his warehouse, I did not offer to say anything to them, but went across the way to shun meeting them, as was usual to do at that time, for fear of the plague. But when I came nearer to the gate, I met another woman, with more hats, come out of the gate. "What business, mistress," said I, "have you had there?"--"There are more people there," said she. "I have had no more business there than they." I was hasty to get to the gate then, and said no more to her; by which means she got away. But just as I came to the gate, I saw two more coming across the yard, to come out, with hats also on their heads and under their arms; at which I threw the gate to behind me, which, having a spring lock, fastened itself. And turning to the women, "Forsooth," said I, "what are you doing here?" and seized upon the hats, and took them from them. One of them, who, I confess, did not look like a thief, "Indeed," says she, "we are wrong; but we were told they were goods that had no owner: be pleased to take them again. And look yonder: there are more such customers as we." She cried, and looked pitifully: so I took the hats from her, and opened the gate, and bade them begone, for I pitied the women indeed. But when I looked towards the warehouse, as she directed, there were six or seven more, all women, fitting themselves with hats, as unconcerned and quiet as if they had been at a hatter's shop buying for their money. 
I was surprised, not at the sight of so many thieves only, but at the circumstances I was in; being now to thrust myself in among so many people, who for some weeks I had been so shy of myself, that, if I met anybody in the street, I would cross the way from them. They were equally surprised, though on another account. They all told me they were neighbors; that they had heard any one might take them; that they were nobody's goods; and the like. I talked big to them at first; went back to the gate and took out the key, so that they were all my prisoners; threatened to lock them all into the warehouse, and go and fetch my lord mayor's officers for them. They begged heartily, protested they found the gate open, and the warehouse door open, and that it had no doubt been broken open by some who expected to find goods of greater value; which indeed was reasonable to believe, because the lock was broke, and a padlock that hung to the door on the outside also loose, and not abundance of the hats carried away. At length I considered that this was not a time to be cruel and rigorous; and besides that, it would necessarily oblige me to go much about, to have several people come to me, and I go to several, whose circumstances of health I knew nothing of; and that, even at this time, the plague was so high as that there died four thousand a week; so that, in showing my resentment, or even in seeking justice for my brother's goods, I might lose my own life. So I contented myself with taking the names and places where some of them lived, who were really inhabitants in the neighborhood, and threatening that my brother should call them to an account for it when he returned to his habitation. Then I talked a little upon another footing with them, and asked them how they could do such things as these in a time of such general calamity, and, as it were, in the face of God's most dreadful judgments, when the plague was at their very doors, and, it may be, in their very houses, and they did not know but that the dead cart might stop at their doors in a few hours, to carry them to their graves. I could not perceive that my discourse made much impression upon them all that while, till it happened that there came two men of the neighborhood, hearing of the disturbance, and knowing my brother (for they had been both dependents upon his family), and they came to my assistance. These being, as I said, neighbors, presently knew three of the women, and told me who they were, and where they lived, and it seems they had given me a true account of themselves before. This brings these two men to a further remembrance. The name of one was John Hayward, who was at that time under-sexton of the parish of St. Stephen, Coleman Street (by under-sexton was understood at that time gravedigger and bearer of the dead). 
This man carried, or assisted to carry, all the dead to their graves, which were buried in that large parish, and who were carried in form, and, after that form of burying was stopped, went with the dead cart and the bell to fetch the dead bodies from the houses where they lay, and fetched many of them out of the chambers and houses; for the parish was, and is still, remarkable, particularly above all the parishes in London, for a great number of alleys and thoroughfares, very long, into which no carts could come, and where they were obliged to go and fetch the bodies a very long way, which alleys now remain to witness it; such as White's Alley, Cross Keys Court, Swan Alley, Bell Alley, White Horse Alley, and many more. Here they went with a kind of handbarrow, and laid the dead bodies on, and carried them out to the carts; which work he performed, and never had the distemper at all, but lived about twenty years after it, and was sexton of the parish to the time of his death. His wife at the same time was a nurse to infected people, and tended many that died in the parish, being for her honesty recommended by the parish officers; yet she never was infected, neither.[146] He never used any preservative against the infection other than holding garlic and rue[147] in his mouth, and smoking tobacco. This I also had from his own mouth. And his wife's remedy was washing her head in vinegar, and sprinkling her head-clothes so with vinegar as to keep them always moist; and, if the smell of any of those she waited on was more than ordinary offensive, she snuffed vinegar up her nose, and sprinkled vinegar upon her head-clothes, and held a handkerchief wetted with vinegar to her mouth. It must be confessed, that, though the plague was chiefly among the poor, yet were the poor the most venturous and fearless of it, and went about their employment with a sort of brutal courage: I must call it so, for it was founded neither on religion or prudence. Scarce did they use any caution, but ran into any business which they could get any employment in, though it was the most hazardous; such was that of tending the sick, watching houses shut up, carrying infected persons to the pesthouse, and, which was still worse, carrying the dead away to their graves. It was under this John Hayward's care, and within his bounds, that the story of the piper, with which people have made themselves so merry, happened; and he assured me that it was true. It is said that it was a blind piper; but, as John told me, the fellow was not blind, but an ignorant, weak, poor man, and usually went his rounds about ten o'clock at night, and went piping along from door to door. And the people usually took him in at public houses where they knew him, and would give him drink and victuals, and sometimes farthings; and he in return would pipe and sing, and talk simply, which diverted the people; and thus he lived. It was but a very bad time for this diversion while things were as I have told; yet the poor fellow went about as usual, but was almost starved: and when anybody asked how he did, he would answer, the dead cart had not taken him yet, but that they had promised to call for him next week. 
It happened one night that this poor fellow, whether somebody had given him too much drink or no (John Hayward said he had not drink in his house, but that they had given him a little more victuals than ordinary at a public house in Coleman Street), and the poor fellow having not usually had a bellyful, or perhaps not a good while, was laid all along upon the top of a bulk or stall, and fast asleep at a door in the street near London Wall, towards Cripplegate; and that, upon the same bulk or stall, the people of some house in the alley of which the house was a corner, hearing a bell (which they always rung before the cart came), had laid a body really dead of the plague just by him, thinking too that this poor fellow had been a dead body as the other was, and laid there by some of the neighbors. Accordingly, when John Hayward with his bell and the cart came along, finding two dead bodies lie upon the stall, they took them up with the instrument they used, and threw them into the cart; and all this while the piper slept soundly. From hence they passed along, and took in other dead bodies, till, as honest John Hayward told me, they almost buried him alive in the cart; yet all this while he slept soundly. At length the cart came to the place where the bodies were to be thrown into the ground, which, as I do remember, was at Mountmill; and, as the cart usually stopped some time before they were ready to shoot out the melancholy load they had in it, as soon as the cart stopped, the fellow awaked, and struggled a little to get his head out from among the dead bodies; when, raising himself up in the cart, he called out, "Hey, where am I?" This frighted the fellow that attended about the work; but, after some pause, John Hayward, recovering himself, said, "Lord bless us! There's somebody in the cart not quite dead!" So another called to him, and said, "Who are you?" The fellow answered, "I am the poor piper. Where am I?"--"Where are you?" says Hayward. "Why, you are in the dead cart, and we are going to bury you."--"But I ain't dead, though, am I?" says the piper; which made them laugh a little, though, as John said, they were heartily frightened at first. So they helped the poor fellow down, and he went about his business. I know the story goes, he set up[148] his pipes in the cart, and frighted the bearers and others, so that they ran away; but John Hayward did not tell the story so, nor say anything of his piping at all. But that he was a poor piper, and that he was carried away as above, I am fully satisfied of the truth of. It is to be noted here that the dead carts in the city were not confined to particular parishes; but one cart went through several parishes, according as the number of dead presented. Nor were they tied[149] to carry the dead to their respective parishes; but many of the dead taken up in the city were carried to the burying ground in the outparts for want of room. At the beginning of the plague, when there was now no more hope but that the whole city would be visited; when, as I have said, all that had friends or estates in the country retired with their families; and when, indeed, one would have thought the very city itself was running out of the gates, and that there would be nobody left behind,--you may be sure from that hour all trade, except such as related to immediate subsistence, was, as it were, at a full stop. 
This is so lively a case, and contains in it so much of the real condition of the people, that I think I cannot be too particular in it, and therefore I descend to the several arrangements or classes of people who fell into immediate distress upon this occasion. For example:--

1. All master workmen in manufactures, especially such as belonged to ornament and the less necessary parts of the people's dress, clothes, and furniture for houses; such as ribbon-weavers and other weavers, gold and silver lacemakers, and gold and silver wire-drawers, seamstresses, milliners, shoemakers, hatmakers, and glovemakers, also upholsterers, joiners, cabinet-makers, looking-glass-makers, and innumerable trades which depend upon such as these,--I say, the master workmen in such stopped their work, dismissed their journeymen and workmen and all their dependents.

2. As merchandising was at a full stop (for very few ships ventured to come up the river, and none at all went out[150]), so all the extraordinary officers of the customs, likewise the watermen, carmen, porters, and all the poor whose labor depended upon the merchants, were at once dismissed, and put out of business.

3. All the tradesmen usually employed in building or repairing of houses were at a full stop; for the people were far from wanting to build houses when so many thousand houses were at once stripped of their inhabitants; so that this one article[151] turned out all the ordinary workmen of that kind of business, such as bricklayers, masons, carpenters, joiners, plasterers, painters, glaziers, smiths, plumbers, and all the laborers depending on such.

4. As navigation was at a stop, our ships neither coming in or going out as before, so the seamen were all out of employment, and many of them in the last and lowest degree of distress. And with the seamen were all the several tradesmen and workmen belonging to and depending upon the building and fitting out of ships; such as ship-carpenters, calkers, ropemakers, dry coopers, sailmakers, anchor-smiths, and other smiths, blockmakers, carvers, gunsmiths, ship-chandlers, ship-carvers, and the like. The masters of those, perhaps, might live upon their substance; but the traders were universally at a stop, and consequently all their workmen discharged. Add to these, that the river was in a manner without boats, and all or most part of the watermen, lighter-men, boat-builders, and lighter-builders, in like manner idle and laid by.

5. All families retrenched their living as much as possible, as well those that fled as those that staid; so that an innumerable multitude of footmen, serving men, shopkeepers, journeymen, merchants' bookkeepers, and such sort of people, and especially poor maidservants, were turned off, and left friendless and helpless, without employment and without habitation; and this was really a dismal article.

I might be more particular as to this part; but it may suffice to mention, in general, all trades being stopped, employment ceased, the labor, and by that the bread of the poor, were cut off; and at first, indeed, the cries of the poor were most lamentable to hear, though, by the distribution of charity, their misery that way was gently[152] abated.
Many, indeed, fled into the country; but, thousands of them having staid in London till nothing but desperation sent them away, death overtook them on the road, and they served for no better than the messengers of death: indeed, others carrying the infection along with them, spread it very unhappily into the remotest parts of the kingdom. The women and servants that were turned off from their places were employed as nurses to tend the sick in all places, and this took off a very great number of them. And which,[153] though a melancholy article in itself, yet was a deliverance in its kind, namely, the plague, which raged in a dreadful manner from the middle of August to the middle of October, carried off in that time thirty or forty thousand of these very people, which, had they been left, would certainly have been an insufferable burden by their poverty; that is to say, the whole city could not have supported the expense of them, or have provided food for them, and they would in time have been even driven to the necessity of plundering either the city itself, or the country adjacent, to have subsisted themselves, which would, first or last, have put the whole nation, as well as the city, into the utmost terror and confusion. It was observable, then, that this calamity of the people made them very humble; for now, for about nine weeks together, there died near a thousand a day, one day with another, even by the account of the weekly bills, which yet, I have reason to be assured, never gave a full account by many thousands; the confusion being such, and the carts working in the dark when they carried the dead, that in some places no account at all was kept, but they worked on; the clerks and sextons not attending for weeks together, and not knowing what number they carried. This account is verified by the following bills of mortality:--

                        Of All Diseases.    Of the Plague.
Aug. 8 to Aug. 15            5,319              3,880
Aug. 15 to Aug. 22           5,668              4,237
Aug. 22 to Aug. 29           7,496              6,102
Aug. 29 to Sept. 5           8,252              6,988
Sept. 5 to Sept. 12          7,690              6,544
Sept. 12 to Sept. 19         8,297              7,165
Sept. 19 to Sept. 26         6,400              5,533
Sept. 26 to Oct. 3           5,728              4,929
Oct. 3 to Oct. 10            5,068              4,227
                            ------             ------
                            59,918             49,605

So that the gross of the people were carried off in these two months; for, as the whole number which was brought in to die of the plague was but 68,590, here is[154] 50,000 of them, within a trifle, in two months: I say 50,000, because as there wants 395 in the number above, so there wants two days of two months in the account of time.[155] Now, when I say that the parish officers did not give in a full account, or were not to be depended upon for their account, let any one but consider how men could be exact in such a time of dreadful distress, and when many of them were taken sick themselves, and perhaps died in the very time when their accounts were to be given in (I mean the parish clerks, besides inferior officers): for though these poor men ventured at all hazards, yet they were far from being exempt from the common calamity, especially if it be true that the parish of Stepney had within the year one hundred and sixteen sextons, gravediggers, and their assistants; that is to say, bearers, bellmen, and drivers of carts for carrying off the dead bodies. Indeed, the work was not of such a nature as to allow them leisure to take an exact tale[156] of the dead bodies, which were all huddled together in the dark into a pit; which pit, or trench, no man could come nigh but at the utmost peril.
I have observed often that in the parishes of Aldgate, Cripplegate, Whitechapel, and Stepney, there were five, six, seven, and eight hundred in a week in the bills; whereas, if we may believe the opinion of those that lived in the city all the time, as well as I, there died sometimes two thousand a week in those parishes. And I saw it under the hand of one that made as strict an examination as he could, that there really died a hundred thousand people of the plague in it that one year; whereas, in the bills, the article of the plague was but 68,590. If I may be allowed to give my opinion, by what I saw with my eyes, and heard from other people that were eyewitnesses, I do verily believe the same; viz., that there died at least a hundred thousand of the plague only, besides other distempers, and besides those which died in the fields and highways and secret places, out of the compass[157] of the communication, as it was called, and who were not put down in the bills, though they really belonged to the body of the inhabitants. It was known to us all that abundance of poor despairing creatures who had the distemper upon them, and were grown stupid or melancholy by their misery (as many were), wandered away into the fields and woods, and into secret uncouth[158] places, almost anywhere, to creep into a bush or hedge, and die. The inhabitants of the villages adjacent would in pity carry them food, and set it at a distance, that they might fetch it if they were able; and sometimes they were not able. And the next time they went they would find the poor wretches lie[159] dead, and the food untouched. The number of these miserable objects were[160] many; and I know so many that perished thus, and so exactly where, that I believe I could go to the very place, and dig their bones up still;[161] for the country people would go and dig a hole at a distance from them, and then, with long poles and hooks at the end of them, drag the bodies into these pits, and then throw the earth in form, as far as they could cast it, to cover them, taking notice how the wind blew, and so come on that side which the seamen call "to windward," that the scent of the bodies might blow from them. And thus great numbers went out of the world who were never known, or any account of them taken, as well within the bills of mortality as without. This indeed I had, in the main, only from the relation of others; for I seldom walked into the fields,[162] except towards Bethnal Green and Hackney, or as hereafter. But when I did walk, I always saw a great many poor wanderers at a distance, but I could know little of their cases; for, whether it were in the street or in the fields, if we had seen anybody coming, it was a general method to walk away. Yet I believe the account is exactly true. As this puts me upon mentioning my walking the streets and fields, I cannot omit taking notice what a desolate place the city was at that time. The great street I lived in, which is known to be one of the broadest of all the streets of London (I mean of the suburbs as well as the liberties, all the side where the butchers lived, especially without the bars[163]), was more like a green field than a paved street; and the people generally went in the middle with the horses and carts. It is true that the farthest end, towards Whitechapel Church, was not all paved, but even the part that was paved was full of grass also. 
But this need not seem strange, since the great streets within the city, such as Leadenhall Street, Bishopsgate Street, Cornhill, and even the Exchange itself, had grass growing in them in several places. Neither cart nor coach was seen in the streets from morning to evening, except some country carts to bring roots and beans, or pease, hay, and straw, to the market, and those but very few compared to what was usual. As for coaches, they were scarce used, but to carry sick people to the pesthouse and to other hospitals, and some few to carry physicians to such places as they thought fit to venture to visit; for really coaches were dangerous things, and people did not care to venture into them, because they did not know who might have been carried in them last; and sick infected people were, as I have said, ordinarily carried in them to the pesthouses; and sometimes people expired in them as they went along. It is true, when the infection came to such a height as I have now mentioned, there were very few physicians who cared to stir abroad to sick houses, and very many of the most eminent of the faculty[164] were dead, as well as the surgeons also; for now it was indeed a dismal time, and for about a month together, not taking any notice of the bills of mortality, I believe there did not die less than fifteen or seventeen hundred a day, one day with another. One of the worst days we had in the whole time, as I thought, was in the beginning of September, when, indeed, good people were beginning to think that God was resolved to make a full end of the people in this miserable city. This was at that time when the plague was fully come into the eastern parishes. The parish of Aldgate, if I may give my opinion, buried above one thousand a week for two weeks, though the bills did not say so many; but it[165] surrounded me at so dismal a rate, that there was not a house in twenty uninfected. In the Minories, in Houndsditch, and in those parts of Aldgate Parish about the Butcher Row, and the alleys over against me,--I say, in those places death reigned in every corner. Whitechapel Parish was in the same condition, and though much less than the parish I lived in, yet buried near six hundred a week, by the bills, and in my opinion near twice as many. Whole families, and indeed whole streets of families, were swept away together, insomuch that it was frequent for neighbors to call to the bellman to go to such and such houses and fetch out the people, for that they were all dead. And indeed the work of removing the dead bodies by carts was now grown so very odious and dangerous, that it was complained of that the bearers did not take care to clear such houses where all the inhabitants were dead, but that some of the bodies lay unburied till the neighboring families were offended by the stench, and consequently infected. And this neglect of the officers was such, that the churchwardens and constables were summoned to look after it; and even the justices of the hamlets[166] were obliged to venture their lives among them to quicken and encourage them; for innumerable of the bearers died of the distemper, infected by the bodies they were obliged to come so near. 
And had it not been that the number of people who wanted employment, and wanted bread, as I have said before, was so great that necessity drove them to undertake anything, and venture anything, they would never have found people to be employed; and then the bodies of the dead would have lain above ground, and have perished and rotted in a dreadful manner. But the magistrates cannot be enough commended in this, that they kept such good order for the burying of the dead, that as fast as any of those they employed to carry off and bury the dead fell sick or died (as was many times the case), they immediately supplied the places with others; which, by reason of the great number of poor that was left out of business, as above, was not hard to do. This occasioned, that, notwithstanding the infinite number of people which died and were sick, almost all together, yet they were always cleared away, and carried off every night; so that it was never to be said of London that the living were not able to bury the dead. As the desolation was greater during those terrible times, so the amazement of the people increased; and a thousand unaccountable things they would do in the violence of their fright, as others did the same in the agonies of their distemper: and this part was very affecting. Some went roaring, and crying, and wringing their hands, along the street; some would go praying, and lifting up their hands to heaven, calling upon God for mercy. I cannot say, indeed, whether this was not in their distraction; but, be it so, it was still an indication of a more serious mind when they had the use of their senses, and was much better, even as it was, than the frightful yellings and cryings that every day, and especially in the evenings, were heard in some streets. I suppose the world has heard of the famous Solomon Eagle, an enthusiast. He, though not infected at all, but in his head, went about denouncing of judgment upon the city in a frightful manner; sometimes quite naked, and with a pan of burning charcoal on his head. What he said or pretended, indeed, I could not learn. I will not say whether that clergyman was distracted or not, or whether he did it out of pure zeal for the poor people, who went every evening through the streets of Whitechapel, and, with his hands lifted up, repeated that part of the liturgy of the church continually, "Spare us, good Lord; spare thy people whom thou hast redeemed with thy most precious blood." I say I cannot speak positively of these things, because these were only the dismal objects which represented themselves to me as I looked through my chamber windows; for I seldom opened the casements while I confined myself within doors during that most violent raging of the pestilence, when indeed many began to think, and even to say, that there would none escape. And indeed I began to think so too, and therefore kept within doors for about a fortnight, and never stirred out. But I could not hold it. Besides, there were some people, who, notwithstanding the danger, did not omit publicly to attend the worship of God, even in the most dangerous times. And though it is true that a great many of the clergy did shut up their churches and fled, as other people did, for the safety of their lives, yet all did not do so. Some ventured to officiate, and to keep up the assemblies of the people by constant prayers, and sometimes sermons, or brief exhortations to repentance and reformation; and this as long as they would hear them. 
And dissenters[167] did the like also, and even in the very churches where the parish ministers were either dead or fled; nor was there any room for making any difference at such a time as this was. It pleased God that I was still spared, and very hearty and sound in health, but very impatient of being pent up within doors without air, as I had been for fourteen days or thereabouts. And I could not restrain myself, but I would go and carry a letter for my brother to the posthouse; then it was, indeed, that I observed a profound silence in the streets. When I came to the posthouse, as I went to put in my letter, I saw a man stand in one corner of the yard, and talking to another at a window; and a third had opened a door belonging to the office. In the middle of the yard lay a small leather purse, with two keys hanging at it, with money in it; but nobody would meddle with it. I asked how long it had lain there. The man at the window said it had lain almost an hour, but they had not meddled with it, because they did not know but the person who dropped it might come back to look for it. I had no such need of money, nor was the sum so big that I had any inclination to meddle with it or to get the money at the hazard it might be attended with: so I seemed to go away, when the man who had opened the door said he would take it up, but so that, if the right owner came for it, he should be sure to have it. So he went in and fetched a pail of water, and set it down hard by the purse, then went again and fetched some gunpowder, and cast a good deal of powder upon the purse, and then made a train from that which he had thrown loose upon the purse (the train reached about two yards); after this he goes in a third time, and fetches out a pair of tongs red hot, and which he had prepared, I suppose, on purpose; and first setting fire to the train of powder, that singed the purse, and also smoked the air sufficiently. But he was not content with that, but he then takes up the purse with the tongs, holding it so long till the tongs burnt through the purse, and then he shook the money out into the pail of water: so he carried it in. The money, as I remember, was about thirteen shillings, and some smooth groats[168] and brass farthings.[169] Much about the same time, I walked out into the fields towards Bow; for I had a great mind to see how things were managed in the river and among the ships; and, as I had some concern in shipping, I had a notion that it had been one of the best ways of securing one's self from the infection to have retired into a ship. And, musing how to satisfy my curiosity in that point, I turned away over the fields, from Bow to Bromley, and down to Blackwall, to the stairs that are there for landing, or taking water. Here I saw a poor man walking on the bank, or "sea wall" as they call it, by himself. I walked awhile also about, seeing the houses all shut up. At last I fell into some talk, at a distance, with this poor man. First I asked how people did thereabouts. "Alas, sir!" says he, "almost desolate, all dead or sick; here are very few families in this part, or in that village," pointing at Poplar, "where half of them are not dead already, and the rest sick." Then he, pointing to one house, "They are all dead," said he, "and the house stands open: nobody dares go into it. A poor thief," says he, "ventured in to steal something; but he paid dear for his theft, for he was carried to the churchyard too, last night." Then he pointed to several other houses. 
"There," says he, "they are all dead, the man and his wife and five children. There," says he, "they are shut up; you see a watchman at the door:" and so of other houses. "Why," says I, "what do you here all alone?"--"Why," says he, "I am a poor desolate man: it hath pleased God I am not yet visited, though my family is, and one of my children dead."--"How do you mean, then," said I, "that you are not visited?"--"Why," says he, "that is my house," pointing to a very little low boarded house, "and there my poor wife and two children live," said he, "if they may be said to live; for my wife and one of the children are visited; but I do not come at them." And with that word I saw the tears run very plentifully down his face; and so they did down mine too, I assure you. "But," said I, "why do you not come at them? How can you abandon your own flesh and blood?"--"O sir!" says he, "the Lord forbid! I do not abandon them, I work for them as much as I am able; and, blessed be the Lord! I keep them from want." And with that I observed he lifted up his eyes to heaven with a countenance that presently told me I had happened on a man that was no hypocrite, but a serious, religious, good man; and his ejaculation was an expression of thankfulness, that, in such a condition as he was in, he should be able to say his family did not want. "Well," says I, "honest man, that is a great mercy, as things go now with the poor. But how do you live, then, and how are you kept from the dreadful calamity that is now upon us all?"--"Why, sir," says he, "I am a waterman, and there is my boat," says he, "and the boat serves me for a house; I work in it in the day, and I sleep in it in the night: and what I get I lay it down upon that stone," says he, showing me a broad stone on the other side of the street, a good way from his house; "and then," says he, "I halloo and call to them till I make them hear, and they come and fetch it." "Well, friend," says I, "but how can you get money as a waterman? Does anybody go by water these times?"--"Yes, sir," says he, "in the way I am employed there does. Do you see there," says he, "five ships lie at anchor?" pointing down the river a good way below the town; "and do you see," says he, "eight or ten ships lie at the chain there, and at anchor yonder?" pointing above the town. "All those ships have families on board, of their merchants and owners, and such like, who have locked themselves up and live on board, close shut in, for fear of the infection; and I tend on them to fetch things for them, carry letters, and do what is absolutely necessary, that they may not be obliged to come on shore. And every night I fasten my boat on board one of the ship's boats, and there I sleep by myself, and, blessed be God! I am preserved hitherto." "Well," said I, "friend, but will they let you come on board after you have been on shore here, when this has been such a terrible place, and so infected as it is?" "Why, as to that," said he, "I very seldom go up the ship side, but deliver what I bring to their boat, or lie by the side, and they hoist it on board: if I did, I think they are in no danger from me, for I never go into any house on shore, or touch anybody, no, not of my own family; but I fetch provisions for them." 
"Nay," says I, "but that may be worse; for you must have those provisions of somebody or other; and since all this part of the town is so infected, it is dangerous so much as to speak with anybody; for the village," said I, "is, as it were, the beginning of London, though it be at some distance from it." "That is true," added he; "but you do not understand me right. I do not buy provisions for them here. I row up to Greenwich, and buy fresh meat there, and sometimes I row down the river to Woolwich,[170] and buy there; then I go to single farmhouses on the Kentish side, where I am known, and buy fowls and eggs and butter, and bring to the ships as they direct me, sometimes one, sometimes the other. I seldom come on shore here, and I came only now to call my wife, and hear how my little family do, and give them a little money which I received last night." "Poor man!" said I. "And how much hast thou gotten for them?" "I have gotten four shillings," said he, "which is a great sum, as things go now with poor men; but they have given me a bag of bread too, and a salt fish, and some flesh: so all helps out." "Well," said I, "and have you given it them yet?" "No," said he, "but I have called; and my wife has answered that she cannot come out yet, but in half an hour she hopes to come, and I am waiting for her. Poor woman!" says he, "she is brought sadly down; she has had a swelling, and it is broke, and I hope she will recover, but I fear the child will die. But it is the Lord!"--Here he stopped, and wept very much. "Well, honest friend," said I, "thou hast a sure comforter, if thou hast brought thyself to be resigned to the will of God: he is dealing with us all in judgment." "O sir!" says he, "it is infinite mercy if any of us are spared; and who am I to repine!" "Say'st thou so?" said I; "and how much less is my faith than thine!" And here my heart smote me, suggesting how much better this poor man's foundation was, on which he stayed in the danger, than mine: that he had nowhere to fly; that he had a family to bind him to attendance, which I had not; and mine was mere presumption, his a true dependence and a courage resting on God; and yet that he used all possible caution for his safety. I turned a little away from the man while these thoughts engaged me; for, indeed, I could no more refrain from tears than he. At length, after some further talk, the poor woman opened the door, and called, "Robert, Robert!" He answered, and bid her stay a few moments and he would come: so he ran down the common stairs to his boat, and fetched up a sack in which was the provisions he had brought from the ships; and when he returned he hallooed again; then he went to the great stone which he showed me, and emptied the sack, and laid all out, everything by themselves, and then retired; and his wife came with a little boy to fetch them away; and he called, and said, such a captain had sent such a thing, and such a captain such a thing, and at the end adds, "God has sent it all: give thanks to him." When the poor woman had taken up all, she was so weak she could not carry it at once in, though the weight was not much, neither: so she left the biscuit, which was in a little bag, and left a little boy to watch it till she came again. "Well, but," says I to him, "did you leave her the four shillings too, which you said was your week's pay?" "Yes, yes," says he; "you shall hear her own it." So he called again, "Rachel, Rachel!" which it seems was her name, "did you take up the money?"--"Yes," said she. 
"How much was it?" said he. "Four shillings and a groat," said she. "Well, well," says he, "the Lord keep you all;" and so he turned to go away. As I could not refrain from contributing tears to this man's story, so neither could I refrain my charity for his assistance; so I called him. "Hark thee, friend," said I, "come hither, for I believe thou art in health, that I may venture thee:" so I pulled out my hand, which was in my pocket before. "Here," says I, "go and call thy Rachel once more, and give her a little more comfort from me. God will never forsake a family that trusts in him as thou dost." So I gave him four other shillings, and bid him go lay them on the stone, and call his wife. I have not words to express the poor man's thankfulness; neither could he express it himself but by tears running down his face. He called his wife, and told her God had moved the heart of a stranger, upon hearing their condition, to give them all that money; and a great deal more such as that he said to her. The woman, too, made signs of the like thankfulness, as well to Heaven as to me, and joyfully picked it up; and I parted with no money all that year that I thought better bestowed. I then asked the poor man if the distemper had not reached to Greenwich. He said it had not till about a fortnight before; but that then he feared it had, but that it was only at that end of the town which lay south towards Deptford[171] Bridge; that he went only to a butcher's shop and a grocer's, where he generally bought such things as they sent him for, but was very careful. I asked him then how it came to pass that those people who had so shut themselves up in the ships had not laid in sufficient stores of all things necessary. He said some of them had; but, on the other hand, some did not come on board till they were frightened into it, and till it was too dangerous for them to go to the proper people to lay in quantities of things; and that he waited on two ships, which he showed me, that had laid in little or nothing but biscuit bread[172] and ship beer, and that he had bought everything else almost for them. I asked him if there were any more ships that had separated themselves as those had done. He told me yes; all the way up from the point, right against Greenwich, to within the shores of Limehouse and Redriff, all the ships that could have room rid[173] two and two in the middle of the stream, and that some of them had several families on board. I asked him if the distemper had not reached them. He said he believed it had not, except two or three ships, whose people had not been so watchful as to keep the seamen from going on shore as others had been; and he said it was a very fine sight to see how the ships lay up the Pool.[174] When he said he was going over to Greenwich as soon as the tide began to come in, I asked if he would let me go with him, and bring me back, for that I had a great mind to see how the ships were ranged, as he had told me. He told me if I would assure him, on the word of a Christian and of an honest man, that I had not the distemper, he would. I assured him that I had not; that it had pleased God to preserve me; that I lived in Whitechapel, but was too impatient of being so long within doors, and that I had ventured out so far for the refreshment of a little air, but that none in my house had so much as been touched with it. 
"Well, sir," says he, "as your charity has been moved to pity me and my poor family, sure you cannot have so little pity left as to put yourself into my boat if you were not sound in health, which would be nothing less than killing me, and ruining my whole family." The poor man troubled me so much when he spoke of his family with such a sensible concern and in such an affectionate manner, that I could not satisfy myself at first to go at all. I told him I would lay aside my curiosity rather than make him uneasy, though I was sure, and very thankful for it, that I had no more distemper upon me than the freshest man in the world. Well, he would not have me put it off neither, but, to let me see how confident he was that I was just to him, he now importuned me to go: so, when the tide came up to his boat, I went in, and he carried me to Greenwich. While he bought the things which he had in charge to buy, I walked up to the top of the hill, under which the town stands, and on the east side of the town, to get a prospect of the river; but it was a surprising sight to see the number of ships which lay in rows, two and two, and in some places two or three such lines in the breadth of the river, and this not only up to the town, between the houses which we call Ratcliff and Redriff, which they name the Pool, but even down the whole river, as far as the head of Long Reach, which is as far as the hills give us leave to see it. I cannot guess at the number of ships, but I think there must have been several hundreds of sail; and I could not but applaud the contrivance, for ten thousand people and more who attended ship affairs were certainly sheltered here from the violence of the contagion, and lived very safe and very easy. I returned to my own dwelling very well satisfied with my day's journey, and particularly with the poor man; also I rejoiced to see that such little sanctuaries were provided for so many families on board in a time of such desolation. I observed, also, that, as the violence of the plague had increased, so the ships which had families on board removed and went farther off, till, as I was told, some went quite away to sea, and put into such harbors and safe roads[175] on the north coast as they could best come at. But it was also true, that all the people who thus left the land, and lived on board the ships, were not entirely safe from the infection; for many died, and were thrown overboard into the river, some in coffins, and some, as I heard, without coffins, whose bodies were seen sometimes to drive up and down with the tide in the river. But I believe I may venture to say, that, in those ships which were thus infected, it either happened where the people had recourse to them too late, and did not fly to the ship till they had staid too long on shore, and had the distemper upon them, though perhaps they might not perceive it (and so the distemper did not come to them on board the ships, but they really carried it with them), or it was in these ships where the poor waterman said they had not had time to furnish themselves with provisions, but were obliged to send often on shore to buy what they had occasion for, or suffered boats to come to them from the shore; and so the distemper was brought insensibly among them. And here I cannot but take notice that the strange temper of the people of London at that time contributed extremely to their own destruction. 
The plague began, as I have observed, at the other end of the town (namely, in Longacre, Drury Lane, etc.), and came on towards the city very gradually and slowly. It was felt at first in December, then again in February, then again in April (and always but a very little at a time), then it stopped till May; and even the last week in May there were but seventeen in all that end of the town. And all this while, even so long as till there died about three thousand a week, yet had the people in Redriff and in Wapping and Ratcliff, on both sides the river, and almost all Southwark side, a mighty fancy that they should not be visited, or at least that it would not be so violent among them. Some people fancied the smell of the pitch and tar, and such other things, as oil and resin and brimstone (which is much used by all trades relating to shipping), would preserve them. Others argued it,[176] because it[177] was in its extremest violence in Westminster and the parish of St. Giles's and St. Andrew's, etc., and began to abate again before it came among them, which was true, indeed, in part. For example:--

Aug. 8 to Aug. 15.
    St. Giles-in-the-Fields . . . . . . .   242
    Cripplegate . . . . . . . . . . . . .   886
    Stepney . . . . . . . . . . . . . . .   197
    St. Mag.[178] Bermondsey  . . . . . .    24
    Rotherhithe . . . . . . . . . . . . .     3
    Total this week . . . . . . . . . . . 4,030

Aug. 15 to Aug. 22.
    St. Giles-in-the-Fields . . . . . . .   175
    Cripplegate . . . . . . . . . . . . .   847
    Stepney . . . . . . . . . . . . . . .   273
    St. Mag. Bermondsey . . . . . . . . .    36
    Rotherhithe . . . . . . . . . . . . .     2
    Total this week . . . . . . . . . . . 5,319

N.B.[179]--That it was observed that the numbers mentioned in Stepney Parish at that time were generally all on that side where Stepney Parish joined to Shoreditch, which we now call Spittlefields, where the parish of Stepney comes up to the very wall of Shoreditch churchyard. And the plague at this time was abated at St. Giles-in-the-Fields, and raged most violently in Cripplegate, Bishopsgate, and Shoreditch Parishes, but there were not ten people a week that died of it in all that part of Stepney Parish which takes in Limehouse, Ratcliff Highway, and which are now the parishes of Shadwell and Wapping, even to St. Katherine's-by-the-Tower, till after the whole month of August was expired; but they paid for it afterwards, as I shall observe by and by.

This, I say, made the people of Redriff and Wapping, Ratcliff and Limehouse, so secure, and flatter themselves so much with the plague's going off without reaching them, that they took no care either to fly into the country or shut themselves up: nay, so far were they from stirring, that they rather received their friends and relations from the city into their houses; and several from other places really took sanctuary in that part of the town as a place of safety, and as a place which they thought God would pass over, and not visit as the rest was visited. And this was the reason, that, when it came upon them, they were more surprised, more unprovided, and more at a loss what to do, than they were in other places; for when it came among them really and with violence, as it did indeed in September and October, there was then no stirring out into the country. Nobody would suffer a stranger to come near them, no, nor near the towns where they dwelt; and, as I have been told, several that wandered into the country on the Surrey side were found starved to death in the woods and commons; that country being more open and more woody than any other part so near London, especially about Norwood and the parishes of Camberwell, Dulwich,[180] and Lusum, where it seems nobody durst[181] relieve the poor distressed people for fear of the infection.
This notion having, as I said, prevailed with the people in that part of the town, was in part the occasion, as I said before, that they had recourse to ships for their retreat; and where they did this early and with prudence, furnishing themselves so with provisions that they had no need to go on shore for supplies, or suffer boats to come on board to bring them,--I say, where they did so, they had certainly the safest retreat of any people whatsoever. But the distress was such, that people ran on board in their fright without bread to eat, and some into ships that had no men on board to remove them farther off, or to take the boat and go down the river to buy provisions, where it might be done safely; and these often suffered, and were infected on board as much as on shore.

As the richer sort got into ships, so the lower rank got into hoys,[182] smacks, lighters, and fishing boats; and many, especially watermen, lay in their boats: but those made sad work of it, especially the latter; for going about for provision, and perhaps to get their subsistence, the infection got in among them, and made a fearful havoc. Many of the watermen died alone in their wherries as they rid at their roads, as well above bridge[183] as below, and were not found sometimes till they were not in condition for anybody to touch or come near them.

Indeed, the distress of the people at this seafaring end of the town was very deplorable, and deserved the greatest commiseration. But, alas! this was a time when every one's private safety lay so near them that they had no room to pity the distresses of others; for every one had death, as it were, at his door, and many even in their families, and knew not what to do, or whither to fly. This, I say, took away all compassion. Self-preservation, indeed, appeared here to be the first law: for the children ran away from their parents as they languished in the utmost distress; and in some places, though not so frequent as the other, parents did the like to their children. Nay, some dreadful examples there were, and particularly two in one week, of distressed mothers, raving and distracted, killing their own children; one whereof was not far off from where I dwelt, the poor lunatic creature not living herself long enough to be sensible of the sin of what she had done, much less to be punished for it. It is not, indeed, to be wondered at; for the danger of immediate death to ourselves took away all bowels of love, all concern for one another. I speak in general: for there were many instances of immovable affection, pity, and duty in many, and some that came to my knowledge, that is to say, by hearsay; for I shall not take upon me to vouch the truth of the particulars.

I could tell here dismal stories of living infants being found sucking the breasts of their mothers or nurses after they have been dead of the plague; of a mother in the parish where I lived, who, having a child that was not well, sent for an apothecary to view the child, and when he came, as the relation goes, was giving the child suck at her breast, and to all appearance was herself very well; but, when the apothecary came close to her, he saw the tokens upon that breast with which she was suckling the child.
He was surprised enough, to be sure; but, not willing to fright the poor woman too much, he desired she would give the child into his hand: so he takes the child, and, going to a cradle in the room, lays it in, and, opening its clothes, found the tokens upon the child too; and both died before he could get home to send a preventive medicine to the father of the child, to whom he had told their condition. Whether the child infected the nurse mother, or the mother the child, was not certain, but the last most likely. Likewise of a child brought home to the parents from a nurse that had died of the plague; yet the tender mother would not refuse to take in her child, and laid it in her bosom, by which she was infected and died, with the child in her arms dead also. It would make the hardest heart move at the instances that were frequently found of tender mothers tending and watching with their dear children, and even dying before them, and sometimes taking the distemper from them, and dying, when the child for whom the affectionate heart had been sacrificed has got over it and escaped. I have heard also of some who, on the death of their relations, have grown stupid with the insupportable sorrow; and of one in particular, who was so absolutely overcome with the pressure upon his spirits, that by degrees his head sunk into his body so between his shoulders, that the crown of his head was very little seen above the bone of his shoulders; and by degrees, losing both voice and sense, his face, looking forward, lay against his collar bone, and could not be kept up any otherwise, unless held up by the hands of other people. And the poor man never came to himself again, but languished near a year in that condition, and died. Nor was he ever once seen to lift up his eyes, or to look upon any particular object.[184] I cannot undertake to give any other than a summary of such passages as these, because it was not possible to come at the particulars where sometimes the whole families where such things happened were carried off by the distemper; but there were innumerable cases of this kind which presented[185] to the eye and the ear, even in passing along the streets, as I have hinted above. Nor is it easy to give any story of this or that family, which there was not divers parallel stories to be met with of the same kind. But as I am now talking of the time when the plague raged at the easternmost parts of the town; how for a long time the people of those parts had flattered themselves that they should escape, and how they were surprised when it came upon them as it did (for indeed it came upon them like an armed man when it did come),--I say this brings me back to the three poor men who wandered from Wapping, not knowing whither to go or what to do, and whom I mentioned before,--one a biscuit baker, one a sailmaker, and the other a joiner, all of Wapping or thereabouts. The sleepiness and security of that part, as I have observed, was such, that they not only did not shift for themselves as others did, but they boasted of being safe, and of safety being with them. And many people fled out of the city, and out of the infected suburbs, to Wapping, Ratcliff, Limehouse, Poplar, and such places, as to places of security. 
And it is not at all unlikely that their doing this helped to bring the plague that way faster than it might otherwise have come: for though I am much for people's flying away, and emptying such a town as this upon the first appearance of a like visitation, and that all people who have any possible retreat should make use of it in time, and begone, yet I must say, when all that will fly are gone, those that are left, and must stand it, should stand stock-still where they are, and not shift from one end of the town or one part of the town to the other; for that is the bane and mischief of the whole, and they carry the plague from house to house in their very clothes.

Wherefore were we ordered to kill all the dogs and cats, but because, as they were domestic animals, and are apt to run from house to house and from street to street, so they are capable of carrying the effluvia or infectious steams of bodies infected, even in their furs and hair? And therefore it was, that, in the beginning of the infection, an order was published by the lord mayor and by the magistrates, according to the advice of the physicians, that all the dogs and cats should be immediately killed; and an officer was appointed for the execution. It is incredible, if their account is to be depended upon, what a prodigious number of those creatures were destroyed. I think they talked of forty thousand dogs and five times as many cats; few houses being without a cat, some having several, sometimes five or six in a house. All possible endeavors were used also to destroy the mice and rats, especially the latter, by laying rats-bane and other poisons for them; and a prodigious multitude of them were also destroyed.

I often reflected upon the unprovided condition that the whole body of the people were in at the first coming of this calamity upon them; and how it was for want of timely entering into measures and managements, as well public as private, that all the confusions that followed were brought upon us, and that such a prodigious number of people sunk in that disaster which, if proper steps had been taken, might, Providence concurring, have been avoided, and which, if posterity think fit, they may take a caution and warning from. But I shall come to this part again.

I come back to my three men. Their story has a moral in every part of it; and their whole conduct, and that of some whom they joined with, is a pattern for all poor men to follow, or women either, if ever such a time comes again: and if there was no other end in recording it, I think this a very just one, whether my account be exactly according to fact or no. Two of them were said to be brothers, the one an old soldier, but now a biscuit baker; the other a lame sailor, but now a sailmaker; the third a joiner.

Says John the biscuit baker, one day, to Thomas, his brother, the sailmaker, "Brother Tom, what will become of us? The plague grows hot in the city, and increases this way. What shall we do?"

"Truly," says Thomas, "I am at a great loss what to do; for I find if it comes down into Wapping I shall be turned out of my lodging." And thus they began to talk of it beforehand.

John. Turned out of your lodging, Tom? If you are, I don't know who will take you in; for people are so afraid of one another now, there is no getting a lodging anywhere.
Tho. Why, the people where I lodge are good civil people, and have kindness for me too; but they say I go abroad every day to my work, and it will be dangerous; and they talk of locking themselves up, and letting nobody come near them.

John. Why, they are in the right, to be sure, if they resolve to venture staying in town.

Tho. Nay, I might even resolve to stay within doors too; for, except a suit of sails that my master has in hand, and which I am just finishing, I am like to get no more work a great while. There's no trade stirs now, workmen and servants are turned off everywhere; so that I might be glad to be locked up too. But I do not see that they will be willing to consent to that any more than to the other.

John. Why, what will you do then, brother? And what shall I do? for I am almost as bad as you. The people where I lodge are all gone into the country but a maid, and she is to go next week, and to shut the house quite up; so that I shall be turned adrift to the wide world before you: and I am resolved to go away too, if I knew but where to go.

Tho. We were both distracted we did not go away at first, when we might ha' traveled anywhere: there is no stirring now. We shall be starved if we pretend to go out of town. They won't let us have victuals, no, not for our money, nor let us come into the towns, much less into their houses.

John. And, that which is almost as bad, I have but little money to help myself with, neither.

Tho. As to that, we might make shift. I have a little, though not much; but I tell you there is no stirring on the road. I know a couple of poor honest men in our street have attempted to travel; and at Barnet,[186] or Whetstone, or thereabout, the people offered to fire at them if they pretended to go forward: so they are come back again quite discouraged.

John. I would have ventured their fire, if I had been there. If I had been denied food for my money, they should have seen me take it before their faces; and, if I had tendered money for it, they could not have taken any course with me by the law.

Tho. You talk your old soldier's language, as if you were in the Low Countries[187] now; but this is a serious thing. The people have good reason to keep anybody off that they are not satisfied are sound at such a time as this, and we must not plunder them.

John. No, brother, you mistake the case, and mistake me too: I would plunder nobody. But for any town upon the road to deny me leave to pass through the town in the open highway, and deny me provisions for my money, is to say the town has a right to starve me to death; which cannot be true.

Tho. But they do not deny you liberty to go back again from whence you came, and therefore they do not starve you.

John. But the next town behind me will, by the same rule, deny me leave to go back; and so they do starve me between them. Besides, there is no law to prohibit my traveling wherever I will on the road.

Tho. But there will be so much difficulty in disputing with them at every town on the road, that it is not for poor men to do it, or undertake it, at such a time as this is especially.

John. Why, brother, our condition, at this rate, is worse than anybody's else; for we can neither go away nor stay here. I am of the same mind with the lepers of Samaria.[188] If we stay here, we are sure to die. I mean especially as you and I are situated, without a dwelling house of our own, and without lodging in anybody's else. There is no lying in the street at such a time as this; we had as good[189] go into the dead cart at once. Therefore, I say, if we stay here, we are sure to die; and if we go away, we can but die. I am resolved to be gone.

Tho. You will go away. Whither will you go, and what can you do? I would as willingly go away as you, if I knew whither; but we have no acquaintance, no friends. Here we were born, and here we must die.

John. Look you, Tom, the whole kingdom is my native country as well as this town. You may as well say I must not go out of my house if it is on fire, as that I must not go out of the town I was born in when it is infected with the plague. I was born in England, and have a right to live in it if I can.

Tho. But you know every vagrant person may, by the laws of England, be taken up, and passed back to their last legal settlement.

John. But how shall they make me vagrant? I desire only to travel on upon my lawful occasions.

Tho. What lawful occasions can we pretend to travel, or rather wander, upon? They will not be put off with words.

John. Is not flying to save our lives a lawful occasion? And do they not all know that the fact is true? We cannot be said to dissemble.

Tho. But, suppose they let us pass, whither shall we go?

John. Anywhere to save our lives: it is time enough to consider that when we are got out of this town. If I am once out of this dreadful place, I care not where I go.

Tho. We shall be driven to great extremities. I know not what to think of it.

John. Well, Tom, consider of it a little.

This was about the beginning of July; and though the plague was come forward in the west and north parts of the town, yet all Wapping, as I have observed before, and Redriff and Ratcliff, and Limehouse and Poplar, in short, Deptford and Greenwich, both sides of the river from the Hermitage, and from over against it, quite down to Blackwall, was entirely free. There had not one person died of the plague in all Stepney Parish, and not one on the south side of Whitechapel Road, no, not in any parish; and yet the weekly bill was that very week risen up to 1,006.

It was a fortnight after this before the two brothers met again, and then the case was a little altered, and the plague was exceedingly advanced, and the number greatly increased. The bill was up at 2,785, and prodigiously increasing; though still both sides of the river, as below, kept pretty well. But some began to die in Redriff, and about five or six in Ratcliff Highway, when the sailmaker came to his brother John, express,[190] and in some fright; for he was absolutely warned out of his lodging, and had only a week to provide himself. His brother John was in as bad a case, for he was quite out, and had only[191] begged leave of his master, the biscuit baker, to lodge in an outhouse belonging to his workhouse, where he only lay upon straw, with some biscuit sacks, or "bread sacks," as they called them, laid upon it, and some of the same sacks to cover him.

Here they resolved, seeing all employment was at an end, and no work or wages to be had, they would make the best of their way to get out of the reach of the dreadful infection, and, being as good husbands as they could, would endeavor to live upon what they had as long as it would last, and then work for more, if they could get work anywhere of any kind, let it be what it would.

While they were considering to put this resolution in practice in the best manner they could, the third man, who was acquainted very well with the sailmaker, came to know of the design, and got leave to be one of the number; and thus they prepared to set out.
It happened that they had not an equal share of money; but as the sailmaker, who had the best stock, was, besides his being lame, the most unfit to expect to get anything by working in the country, so he was content that what money they had should all go into one public stock, on condition that whatever any one of them could gain more than another, it should, without any grudging, be all added to the public stock. They resolved to load themselves with as little baggage as possible, because they resolved at first to travel on foot, and to go a great way, that they might, if possible, be effectually safe. And a great many consultations they had with themselves before they could agree about what way they should travel; which they were so far from adjusting, that, even to the morning they set out, they were not resolved on it. At last the seaman put in a hint that determined it. "First," says he, "the weather is very hot; and therefore I am for traveling north, that we may not have the sun upon our faces, and beating upon our breasts, which will heat and suffocate us; and I have been told," says he, "that it is not good to overheat our blood at a time when, for aught we know, the infection may be in the very air. In the next place," says he, "I am for going the way that may be contrary to the wind as it may blow when we set out, that we may not have the wind blow the air of the city on our backs as we go." These two cautions were approved of, if it could be brought so to hit that the wind might not be in the south when they set out to go north. John the baker, who had been a soldier, then put in his opinion. "First," says he, "we none of us expect to get any lodging on the road, and it will be a little too hard to lie just in the open air. Though it may be warm weather, yet it may be wet and damp, and we have a double reason to take care of our healths at such a time as this; and therefore," says he, "you, brother Tom, that are a sailmaker, might easily make us a little tent; and I will undertake to set it up every night and take it down, and a fig for all the inns in England. If we have a good tent over our heads, we shall do well enough." The joiner opposed this, and told them, let them leave that to him: he would undertake to build them a house every night with his hatchet and mallet, though he had no other tools, which should be fully to their satisfaction, and as good as a tent. The soldier and the joiner disputed that point some time; but at last the soldier carried it for a tent: the only objection against it was, that it must be carried with them, and that would increase their baggage too much, the weather being hot. But the sailmaker had a piece of good hap[192] fall in, which made that easy; for his master who[193] he worked for, having a ropewalk, as well as sailmaking trade, had a little poor horse that he made no use of then, and, being willing to assist the three honest men, he gave them the horse for the carrying their baggage; also, for a small matter of three days' work that his man did for him before he went, he let him have an old topgallant sail[194] that was worn out, but was sufficient, and more than enough, to make a very good tent. The soldier showed how to shape it, and they soon, by his direction, made their tent, and fitted it with poles or staves for the purpose: and thus they were furnished for their journey; viz., three men, one tent, one horse, one gun for the soldier (who would not go without arms, for now he said he was no more a biscuit baker, but a trooper). 
The joiner had a small bag of tools, such as might be useful if he should get any work abroad, as well for their subsistence as his own. What money they had they brought all into one public stock, and thus they began their journey. It seems that in the morning when they set out, the wind blew, as the sailor said, by his pocket compass, at N.W. by W., so they directed, or rather resolved to direct, their course N.W. But then a difficulty came in their way, that as they set out from the hither end of Wapping, near the Hermitage, and that the plague was now very violent, especially on the north side of the city, as in Shoreditch and Cripplegate Parish, they did not think it safe for them to go near those parts: so they went away east, through Ratcliff Highway, as far as Ratcliff Cross, and leaving Stepney church still on their left hand, being afraid to come up from Ratcliff Cross to Mile End, because they must come just by the churchyard, and because the wind, that seemed to blow more from the west, blowed directly from the side of the city where the plague was hottest. So, I say, leaving Stepney, they fetched a long compass,[195] and, going to Poplar and Bromley, came into the great road just at Bow. Here the watch placed upon Bow Bridge would have questioned them; but they, crossing the road into a narrow way that turns out of the higher end of the town of Bow to Oldford, avoided any inquiry there, and traveled on to Oldford.

The constables everywhere were upon their guard, not so much, it seems, to stop people passing by, as to stop them from taking up their abode in their towns; and, withal, because of a report that was newly raised at that time, and that indeed was not very improbable, viz., that the poor people in London, being distressed and starved for want of work, and by that means for want of bread, were up in arms, and had raised a tumult, and that they would come out to all the towns round to plunder for bread. This, I say, was only a rumor, and it was very well it was no more; but it was not so far off from being a reality as it has been thought, for in a few weeks more the poor people became so desperate by the calamity they suffered, that they were with great difficulty kept from running out into the fields and towns, and tearing all in pieces wherever they came. And, as I have observed before, nothing hindered them but that the plague raged so violently, and fell in upon them so furiously, that they rather went to the grave by thousands than into the fields in mobs by thousands; for in the parts about the parishes of St. Sepulchre's, Clerkenwell, Cripplegate, Bishopsgate, and Shoreditch, which were the places where the mob began to threaten, the distemper came on so furiously, that there died in those few parishes, even then, before the plague was come to its height, no less than 5,361 people in the first three weeks in August, when at the same time the parts about Wapping, Ratcliff, and Rotherhithe were, as before described, hardly touched, or but very lightly; so that in a word, though, as I said before, the good management of the lord mayor and justices did much to prevent the rage and desperation of the people from breaking out in rabbles and tumults, and, in short, from the poor plundering the rich,--I say, though they did much, the dead cart did more: for as I have said, that, in five parishes only, there died above five thousand in twenty days, so there might be probably three times that number sick all that time; for some recovered, and great numbers fell sick every day, and died afterwards. Besides, I must still be allowed to say, that, if the bills of mortality said five thousand, I always believed it was twice as many in reality, there being no room to believe that the account they gave was right, or that indeed they[196] were, among such confusions as I saw them in, in any condition to keep an exact account.

But to return to my travelers. Here they were only examined, and, as they seemed rather coming from the country than from the city, they found the people easier with them; that they talked to them, let them come into a public house where the constable and his warders were, and gave them drink and some victuals, which greatly refreshed and encouraged them. And here it came into their heads to say, when they should be inquired of afterwards, not that they came from London, but that they came out of Essex. To forward this little fraud, they obtained so much favor of the constable at Oldford as to give them a certificate of their passing from Essex through that village, and that they had not been at London; which, though false in the common acceptation of London in the country, yet was literally true, Wapping or Ratcliff being no part either of the city or liberty.

This certificate, directed to the next constable, that was at Homerton, one of the hamlets of the parish of Hackney, was so serviceable to them, that it procured them, not a free passage there only, but a full certificate of health from a justice of the peace, who, upon the constable's application, granted it without much difficulty. And thus they passed through the long divided town of Hackney (for it lay then in several separated hamlets), and traveled on till they came into the great north road, on the top of Stamford Hill.

By this time they began to weary; and so, in the back road from Hackney, a little before it opened into the said great road, they resolved to set up their tent, and encamp for the first night; which they did accordingly, with this addition: that, finding a barn, or a building like a barn, and first searching as well as they could to be sure there was nobody in it, they set up their tent with the head of it against the barn. This they did also because the wind blew that night very high, and they were but young at such a way of lodging, as well as at the managing their tent.
Here they went to sleep; but the joiner, a grave and sober man, and not pleased with their lying at this loose rate the first night, could not sleep, and resolved, after trying it to no purpose, that he would get out, and, taking the gun in his hand, stand sentinel, and guard his companions. So, with the gun in his hand, he walked to and again before the barn; for that stood in the field near the road, but within the hedge. He had not been long upon the scout, but he heard a noise of people coming on as if it had been a great number; and they came on, as he thought, directly towards the barn. He did not presently awake his companions, but in a few minutes more, their noise growing louder and louder, the biscuit baker called to him and asked him what was the matter, and quickly started out too. The other being the lame sailmaker, and most weary, lay still in the tent.

As they expected, so the people whom they had heard came on directly to the barn, when one of our travelers challenged, like soldiers upon the guard, with, "Who comes there?" The people did not answer immediately; but one of them speaking to another that was behind them, "Alas, alas! we are all disappointed," says he; "here are some people before us; the barn is taken up." They all stopped upon that, as under some surprise; and it seems there were about thirteen of them in all, and some women among them. They consulted together what they should do; and by their discourse, our travelers soon found they were poor distressed people too, like themselves, seeking shelter and safety; and besides, our travelers had no need to be afraid of their coming up to disturb them, for as soon as they heard the words, "Who comes there?" they could hear the women say, as if frighted, "Do not go near them; how do you know but they may have the plague?" And when one of the men said, "Let us but speak to them," the women said, "No, don't, by any means; we have escaped thus far by the goodness of God; do not let us run into danger now, we beseech you."

Our travelers found by this that they were a good sober sort of people, and flying for their lives as they were; and as they were encouraged by it, so John said to the joiner, his comrade, "Let us encourage them too, as much as we can." So he called to them. "Hark ye, good people," says the joiner; "we find by your talk that you are flying from the same dreadful enemy as we are. Do not be afraid of us; we are only three poor men of us. If you are free from the distemper, you shall not be hurt by us. We are not in the barn, but in a little tent here on the outside, and we will remove for you; we can set up our tent again immediately anywhere else." And upon this a parley began between the joiner, whose name was Richard, and one of their men, who said his name was Ford.

Ford. And do you assure us that you are all sound men?

Rich. Nay, we are concerned to tell you of it, that you may not be uneasy, or think yourselves in danger; but you see we do not desire you should put yourselves into any danger, and therefore I tell you that we have not made use of the barn; so we will remove from it, that you may be safe and we also.

Ford. That is very kind and charitable; but if we have reason to be satisfied that you are sound, and free from the visitation, why should we make you remove, now you are settled in your lodging, and, it may be, are laid down to rest? We will go into the barn, if you please, to rest ourselves awhile, and we need not disturb you.

Rich. Well, but you are more than we are. I hope you will assure us that you are all of you sound too, for the danger is as great from you to us as from us to you.

Ford. Blessed be God that some do escape, though it be but few! What may be our portion still, we know not, but hitherto we are preserved.

Rich. What part of the town do you come from? Was the plague come to the places where you lived?

Ford. Ay, ay, in a most frightful and terrible manner, or else we had not fled away as we do; but we believe there will be very few left alive behind us.

Rich. What part do you come from?

Ford. We are most of us from Cripplegate Parish; only two or three of Clerkenwell Parish, but on the hither side.

Rich. How, then, was it that you came away no sooner?

Ford. We have been away some time, and kept together as well as we could at the hither end of Islington, where we got leave to lie in an old uninhabited house, and had some bedding and conveniences of our own, that we brought with us; but the plague is come up into Islington too, and a house next door to our poor dwelling was infected and shut up, and we are come away in a fright.

Rich. And what way are you going?

Ford. As our lot shall cast us, we know not whither; but God will guide those that look up to him.

They parleyed no further at that time, but came all up to the barn, and with some difficulty got into it. There was nothing but hay in the barn, but it was almost full of that, and they accommodated themselves as well as they could, and went to rest; but our travelers observed that before they went to sleep, an ancient man, who, it seems, was the father of one of the women, went to prayer with all the company, recommending themselves to the blessing and protection of Providence before they went to sleep.

It was soon day at that time of the year; and as Richard the joiner had kept guard the first part of the night, so John the soldier relieved him, and he had the post in the morning. And they began to be acquainted with one another. It seems, when they left Islington, they intended to have gone north away to Highgate, but were stopped at Holloway, and there they would not let them pass; so they crossed over the fields and hills to the eastward, and came out at the Boarded River, and so, avoiding the towns, they left Hornsey on the left hand, and Newington on the right hand, and came into the great road about Stamford Hill on that side, as the three travelers had done on the other side. And now they had thoughts of going over the river in the marshes, and make forwards to Epping Forest, where they hoped they should get leave to rest. It seems they were not poor, at least not so poor as to be in want: at least, they had enough to subsist them moderately for two or three months, when, as they said, they were in hopes the cold weather would check the infection, or at least the violence of it would have spent itself, and would abate, if it were only for want of people left alive to be infected.

This was much the fate of our three travelers, only that they seemed to be the better furnished for traveling, and had it in their view to go farther off; for, as to the first, they did not propose to go farther than one day's journey, that so they might have intelligence every two or three days how things were at London.
But here our travelers found themselves under an unexpected inconvenience, namely, that of their horse; for, by means of the horse to carry their baggage, they were obliged to keep in the road, whereas the people of this other band went over the fields or roads, path or no path, way or no way, as they pleased. Neither had they any occasion to pass through any town, or come near any town, other than to buy such things as they wanted for their necessary subsistence; and in that, indeed, they were put to much difficulty, of which in its place. But our three travelers were obliged to keep the road, or else they must commit spoil, and do the country a great deal of damage in breaking down fences and gates to go over inclosed fields, which they were loath to do if they could help it. Our three travelers, however, had a great mind to join themselves to this company, and take their lot with them; and, after some discourse, they laid aside their first design, which looked northward, and resolved to follow the other into Essex. So in the morning they took up their tent and loaded their horse, and away they traveled all together. They had some difficulty in passing the ferry at the riverside, the ferryman being afraid of them; but, after some parley at a distance, the ferryman was content to bring his boat to a place distant from the usual ferry, and leave it there for them to take it. So, putting themselves over, he directed them to leave the boat, and he, having another boat, said he would fetch it again; which it seems, however, he did not do for above eight days. Here, giving the ferryman money beforehand, they had a supply of victuals and drink, which he brought and left in the boat for them, but not without, as I said, having received the money beforehand. But now our travelers were at a great loss and difficulty how to get the horse over, the boat being small, and not fit for it, and at last could not do it without unloading the baggage and making him swim over. From the river they traveled towards the forest; but when they came to Walthamstow, the people of that town denied[197] to admit them, as was the case everywhere; the constables and their watchmen kept them off at a distance, and parleyed with them. They gave the same account of themselves as before; but these gave no credit to what they said, giving it for a reason, that two or three companies had already come that way and made the like pretenses, but that they had given several people the distemper in the towns where they had passed, and had been afterwards so hardly used by the country, though with justice too, as they had deserved, that about Brentwood[198] or that way, several of them perished in the fields, whether of the plague, or of mere want and distress, they could not tell. This was a good reason, indeed, why the people of Walthamstow should be very cautious, and why they should resolve not to entertain anybody that they were not well satisfied of; but as Richard the joiner, and one of the other men who parleyed with them, told them, it was no reason why they should block up the roads and refuse to let the people pass through the town, and who asked nothing of them but to go through the street; that, if their people were afraid of them, they might go into their houses and shut their doors: they would neither show them civility nor incivility, but go on about their business. 
The constables and attendants, not to be persuaded by reason, continued obstinate, and would hearken to nothing: so the two men that talked with them went back to their fellows to consult what was to be done. It was very discouraging in the whole, and they knew not what to do for a good while; but at last John, the soldier and biscuit baker, considering awhile, "Come," says he, "leave the rest of the parley to me." He had not appeared yet: so he sets the joiner, Richard, to work to cut some poles out of the trees, and shape them as like guns as he could; and in a little time he had five or six fair muskets, which at a distance would not be known; and about the part where the lock of a gun is, he caused them to wrap cloth and rags, such as they had, as soldiers do in wet weather to preserve the locks of their pieces from rust; the rest was discolored with clay or mud, such as they could get; and all this while the rest of them sat under the trees by his direction, in two or three bodies, where they made fires at a good distance from one another.

While this was doing, he advanced himself, and two or three with him, and set up their tent in the lane, within sight of the barrier which the townsmen had made, and set a sentinel just by it with the real gun, the only one they had, and who[199] walked to and fro with the gun on his shoulder, so as that the people of the town might see them; also he tied the horse to a gate in the hedge just by, and got some dry sticks together and kindled a fire on the other side of the tent, so that the people of the town could see the fire and the smoke, but could not see what they were doing at it.

After the country people had looked upon them very earnestly a great while, and by all that they could see could not but suppose that they were a great many in company, they began to be uneasy, not for their going away, but for staying where they were; and above all, perceiving they had horses and arms (for they had seen one horse and one gun at the tent, and they had seen others of them walk about the field on the inside of the hedge by the side of the lane with their muskets, as they took them to be, shouldered),--I say, upon such a sight as this, you may be assured they were alarmed and terribly frightened; and it seems they went to a justice of the peace to know what they should do. What the justice advised them to, I know not; but towards the evening they called from the barrier, as above, to the sentinel at the tent.

"What do you want?" says John.

"Why, what do you intend to do?" says the constable.

"To do?" says John; "what would you have us to do?"

Const. Why don't you begone? What do you stay there for?

John. Why do you stop us on the King's highway, and pretend to refuse us leave to go on our way?

Const. We are not bound to tell you the reason, though we did let you know it was because of the plague.

John. We told you we were all sound, and free from the plague, which we were not bound to have satisfied you of, and yet you pretend to stop us on the highway.

Const. We have a right to stop it up, and our own safety obliges us to it; besides, this is not the King's highway, it is a way upon sufferance. You see here is a gate, and if we do let people pass here, we make them pay toll.

John. We have a right to seek our own safety as well as you; and you may see we are flying for our lives, and it is very unchristian and unjust in you to stop us.

Const. You may go back from whence you came, we do not hinder you from that.

John. No, it is a stronger enemy than you that keeps us from doing that, or else we should not have come hither.

Const. Well, you may go any other way, then.

John. No, no. I suppose you see we are able to send you going, and all the people of your parish, and come through your town when we will; but, since you have stopped us here, we are content. You see we have encamped here, and here we will live. We hope you will furnish us with victuals.

Const. We furnish you! What mean you by that?

John. Why, you would not have us starve, would you? If you stop us here, you must keep us.

Const. You will be ill kept at our maintenance.

John. If you stint us, we shall make ourselves the better allowance.

Const. Why, you will not pretend to quarter upon us by force, will you?

John. We have offered no violence to you yet, why do you seem to oblige us to it? I am an old soldier, and cannot starve; and, if you think that we shall be obliged to go back for want of provisions, you are mistaken.

Const. Since you threaten us, we shall take care to be strong enough for you. I have orders to raise the county upon you.

John. It is you that threaten, not we; and, since you are for mischief, you cannot blame us if we do not give you time for it. We shall begin our march in a few minutes.

Const. What is it you demand of us?

John. At first we desired nothing of you but leave to go through the town. We should have offered no injury to any of you, neither would you have had any injury or loss by us. We are not thieves, but poor people in distress, and flying from the dreadful plague in London, which devours thousands every week. We wonder how you can be so unmerciful.

Const. Self-preservation obliges us.

John. What! To shut up your compassion, in a case of such distress as this?

Const. Well, if you will pass over the fields on your left hand, and behind that part of the town, I will endeavor to have gates opened for you.

John. Our horsemen cannot pass with our baggage that way. It does not lead into the road that we want to go, and why should you force us out of the road? Besides, you have kept us here all day without any provisions but such as we brought with us. I think you ought to send us some provisions for our relief.

Const. If you will go another way, we will send you some provisions.

John. That is the way to have all the towns in the county stop up the ways against us.

Const. If they all furnish you with food, what will you be the worse? I see you have tents: you want no lodging.

John. Well, what quantity of provisions will you send us?

Const. How many are you?

John. Nay, we do not ask enough for all our company. We are in three companies. If you will send us bread for twenty men and about six or seven women for three days, and show us the way over the field you speak of, we desire not to put your people into any fear for us. We will go out of our way to oblige you, though we are as free from infection as you are.

Const. And will you assure us that your other people shall offer us no new disturbance?

John. No, no; you may depend on it.

Const. You must oblige yourself, too, that none of your people shall come a step nearer than where the provisions we send you shall be set down.

John. I answer for it, we will not.

Here he called to one of his men, and bade him order Captain Richard and his people to march the lower way on the side of the marshes, and meet them in the forest; which was all a sham, for they had no Captain Richard or any such company.
Accordingly, they sent to the place twenty loaves of bread and three or four large pieces of good beef, and opened some gates, through which they passed; but none of them had courage so much as to look out to see them go, and as it was evening, if they had looked, they could not have seen them so as to know how few they were. This was John the soldier's management; but this gave such an alarm to the county, that, had they really been two or three hundred, the whole county would have been raised upon them, and they would have been sent to prison, or perhaps knocked on the head. They were soon made sensible of this, for two days afterwards they found several parties of horsemen and footmen also about, in pursuit of three companies of men armed, as they said, with muskets, who were broke out from London and had the plague upon them, and that were not only spreading the distemper among the people, but plundering the country. As they saw now the consequence of their case, they soon saw the danger they were in: so they resolved, by the advice also of the old soldier, to divide themselves again. John and his two comrades, with the horse, went away as if towards Waltham,[200]--the other in two companies, but all a little asunder,--and went towards Epping.[200] The first night they encamped all in the forest, and not far off from one another, but not setting up the tent for fear that should discover them. On the other hand, Richard went to work with his ax and his hatchet, and, cutting down branches of trees, he built three tents or hovels, in which they all encamped with as much convenience as they could expect. The provisions they had at Walthamstow served them very plentifully this night; and as for the next, they left it to Providence. They had fared so well with the old soldier's conduct, that they now willingly made him their leader, and the first of his conduct appeared to be very good. He told them that they were now at a proper distance enough from London; that, as they need not be immediately beholden to the country for relief, they ought to be as careful the country did not infect them as that they did not infect the country; that what little money they had they must be as frugal of as they could; that as he would not have them think of offering the country any violence, so they must endeavor to make the sense of their condition go as far with the country as it could. They all referred themselves to his direction: so they left their three houses standing, and the next day went away towards Epping; the captain also (for so they now called him), and his two fellow travelers, laid aside their design of going to Waltham, and all went together. When they came near Epping, they halted, choosing out a proper place in the open forest, not very near the highway, but not far out of it, on the north side, under a little cluster of low pollard trees.[201] Here they pitched their little camp, which consisted of three large tents or huts made of poles, which their carpenter, and such as were his assistants, cut down, and fixed in the ground in a circle, binding all the small ends together at the top, and thickening the sides with boughs of trees and bushes, so that they were completely close and warm. They had besides this a little tent where the women lay by themselves, and a hut to put the horse in. 
It happened that the next day, or the next but one, was market day at Epping, when Captain John and one of the other men went to market and bought some provisions, that is to say, bread, and some mutton and beef; and two of the women went separately, as if they had not belonged to the rest, and bought more. John took the horse to bring it home, and the sack which the carpenter carried his tools in, to put it in. The carpenter went to work and made them benches and stools to sit on, such as the wood he could get would afford, and a kind of a table to dine on. They were taken no notice of for two or three days; but after that, abundance of people ran out of the town to look at them, and all the country was alarmed about them. The people at first seemed afraid to come near them; and, on the other hand, they desired the people to keep off, for there was a rumor that the plague was at Waltham, and that it had been in Epping two or three days. So John called out to them not to come to them. "For," says he, "we are all whole and sound people here, and we would not have you bring the plague among us, nor pretend we brought it among you." After this, the parish officers came up to them, and parleyed with them at a distance, and desired to know who they were, and by what authority they pretended to fix their stand at that place. John answered very frankly, they were poor distressed people from London, who, foreseeing the misery they should be reduced to if the plague spread into the city, had fled out in time for their lives, and, having no acquaintance or relations to fly to, had first taken up at Islington, but, the plague being come into that town, were fled farther; and, as they supposed that the people of Epping might have refused them coming into their town, they had pitched their tents thus in the open field and in the forest, being willing to bear all the hardships of such a disconsolate lodging rather than have any one think, or be afraid, that they should receive injury by them. At first the Epping people talked roughly to them, and told them they must remove; that this was no place for them; and that they pretended to be sound and well, but that they might be infected with the plague, for aught they knew, and might infect the whole country, and they could not suffer them there. John argued very calmly with them a great while, and told them that London was the place by which they, that is, the townsmen of Epping, and all the country round them, subsisted; to whom they sold the produce of their lands, and out of whom they made the rents of their farms; and to be so cruel to the inhabitants of London, or to any of those by whom they gained so much, was very hard; and they would be loath to have it remembered hereafter, and have it told, how barbarous, how inhospitable, and how unkind they were to the people of London when they fled from the face of the most terrible enemy in the world; that it would be enough to make the name of an Epping man hateful throughout all the city, and to have the rabble stone them in the very streets whenever they came so much as to market; that they were not yet secure from being visited themselves, and that, as he heard, Waltham was already; that they would think it very hard, that, when any of them fled for fear before they were touched, they should be denied the liberty of lying so much as in the open fields. 
The Epping men told them again that they, indeed, said they were sound, and free from the infection, but that they had no assurance of it; and that it was reported that there had been a great rabble of people at Walthamstow, who made such pretenses of being sound as they did, but that they threatened to plunder the town, and force their way, whether the parish officers would or no; that there were near two hundred of them, and had arms and tents like Low Country soldiers; that they extorted provisions from the town by threatening them with living upon them at free quarter,[202] showing their arms, and talking in the language of soldiers; and that several of them having gone away towards Rumford and Brentwood, the country had been infected by them, and the plague spread into both those large towns, so that the people durst not go to market there, as usual; that it was very likely they were some of that party, and, if so, they deserved to be sent to the county jail, and be secured till they had made satisfaction for the damage they had done, and for the terror and fright they had put the country into.

John answered, that what other people had done was nothing to them; that they assured them they were all of one company; that they had never been more in number than they saw them at that time (which, by the way, was very true); that they came out in two separate companies, but joined by the way, their cases being the same; that they were ready to give what account of themselves anybody desired of them, and to give in their names and places of abode, that so they might be called to an account for any disorder that they might be guilty of; that the townsmen might see they were content to live hardly, and only desired a little room to breathe in on the forest, where it was wholesome (for where it was not, they could not stay, and would decamp if they found it otherwise there).

"But," said the townsmen, "we have a great charge of poor upon our hands already, and we must take care not to increase it. We suppose you can give us no security against your being chargeable to our parish and to the inhabitants, any more than you can of being dangerous to us as to the infection."

"Why, look you," says John, "as to being chargeable to you, we hope we shall not. If you will relieve us with provisions for our present necessity, we will be very thankful. As we all lived without charity when we were at home, so we will oblige ourselves fully to repay you, if God please to bring us back to our own families and houses in safety, and to restore health to the people of London.

"As to our dying here, we assure you, if any of us die, we that survive will bury them, and put you to no expense, except it should be that we should all die, and then, indeed, the last man, not being able to bury himself, would put you to that single expense; which I am persuaded," says John, "he would leave enough behind him to pay you for the expense of.

"On the other hand," says John, "if you will shut up all bowels of compassion, and not relieve us at all, we shall not extort anything by violence, or steal from any one; but when that little we have is spent, if we perish for want, God's will be done!"

John wrought so upon the townsmen by talking thus rationally and smoothly to them, that they went away; and though they did not give any consent to their staying there, yet they did not molest them, and the poor people continued there three or four days longer without any disturbance.
In this time they had got some remote acquaintance with a victualing house on the outskirts of the town, to whom they called at a distance to bring some little things that they wanted, and which they caused to be set down at some distance, and always paid for very honestly. During this time the younger people of the town came frequently pretty near them, and would stand and look at them, and would sometimes talk with them at some space between; and particularly it was observed that the first sabbath day the poor people kept retired, worshiped God together, and were heard to sing psalms. These things, and a quiet, inoffensive behavior, began to get them the good opinion of the country, and the people began to pity them and speak very well of them; the consequence of which was, that upon the occasion of a very wet, rainy night, a certain gentleman who lived in the neighborhood sent them a little cart with twelve trusses or bundles of straw, as well for them to lodge upon as to cover and thatch their huts, and to keep them dry. The minister of a parish not far off, not knowing of the other, sent them also about two bushels of wheat and half a bushel of white pease. They were very thankful, to be sure, for this relief, and particularly the straw was a very great comfort to them; for though the ingenious carpenter had made them frames to lie in, like troughs, and filled them with leaves of trees and such things as they could get, and had cut all their tent cloth out to make coverlids, yet they lay damp and hard and unwholesome till this straw came, which was to them like feather beds, and, as John said, more welcome than feather beds would have been at another time. This gentleman and the minister having thus begun, and given an example of charity to these wanderers, others quickly followed; and they received every day some benevolence or other from the people, but chiefly from the gentlemen who dwelt in the country round about. Some sent them chairs, stools, tables, and such household things as they gave notice they wanted. Some sent them blankets, rugs, and coverlids; some, earthenware; and some, kitchen ware for ordering[203] their food. Encouraged by this good usage, their carpenter, in a few days, built them a large shed or house with rafters, and a roof in form, and an upper floor, in which they lodged warm, for the weather began to be damp and cold in the beginning of September; but this house being very well thatched, and the sides and roof very thick, kept out the cold well enough. He made also an earthen wall at one end, with a chimney in it; and another of the company, with a vast deal of trouble and pains, made a funnel to the chimney to carry out the smoke. Here they lived comfortably, though coarsely, till the beginning of September, when they had the bad news to hear, whether true or not, that the plague, which was very hot at Waltham Abbey on the one side, and Rumford and Brentwood on the other side, was also come to Epping, to Woodford, and to most of the towns upon the forest; and which, as they said, was brought down among them chiefly by the higglers,[204] and such people as went to and from London with provisions. If this was true, it was an evident contradiction to the report which was afterwards spread all over England, but which, as I have said, I cannot confirm of my own knowledge, namely, that the market people carrying provisions to the city never got the infection or carried it back into the country; both which, I have been assured, has been[205] false. 
It might be that they were preserved even beyond expectation, though not to a miracle;[206] that abundance went and came and were not touched; and that was much encouragement for the poor people of London, who had been completely miserable if the people that brought provisions to the markets had not been many times wonderfully preserved, or at least more preserved than could be reasonably expected. But these new inmates began to be disturbed more effectually, for the towns about them were really infected. And they began to be afraid to trust one another so much as to go abroad for such things as they wanted; and this pinched them very hard, for now they had little or nothing but what the charitable gentlemen of the country supplied them with. But, for their encouragement, it happened that other gentlemen of the country, who had not sent them anything before, began to hear of them and supply them. And one sent them a large pig, that is to say, a porker; another, two sheep; and another sent them a calf: in short, they had meat enough, and sometimes had cheese and milk, and such things. They were chiefly put to it[207] for bread; for when the gentlemen sent them corn, they had nowhere to bake it or to grind it. This made them eat the first two bushels of wheat that was sent them, in parched corn, as the Israelites of old did, without grinding or making bread of it.[208] At last they found means to carry their corn to a windmill near Woodford, where they had it ground; and afterwards the biscuit baker made a hearth so hollow and dry, that he could bake biscuit cakes tolerably well, and thus they came into a condition to live without any assistance or supplies from the towns. And it was well they did; for the country was soon after fully infected, and about a hundred and twenty were said to have died of the distemper in the villages near them, which was a terrible thing to them. On this they called a new council, and now the towns had no need to be afraid they should settle near them; but, on the contrary, several families of the poorer sort of the inhabitants quitted their houses, and built huts in the forest, after the same manner as they had done. But it was observed that several of these poor people that had so removed had the sickness even in their huts or booths, the reason of which was plain: namely, not because they removed into the air, but[209] (1) because they did not remove time[210] enough, that is to say, not till, by openly conversing with other people, their neighbors, they had the distemper upon them (or, as may be said, among them), and so carried it about with them whither they went; or (2) because they were not careful enough, after they were safely removed out of the towns, not to come in again and mingle with the diseased people. But be it which of these it will, when our travelers began to perceive that the plague was not only in the towns, but even in the tents and huts on the forest near them, they began then not only to be afraid, but to think of decamping and removing; for, had they staid, they would have been in manifest danger of their lives. It is not to be wondered that they were greatly afflicted at being obliged to quit the place where they had been so kindly received, and where they had been treated with so much humanity and charity; but necessity, and the hazard of life which they came out so far to preserve, prevailed with them, and they saw no remedy.
John, however, thought of a remedy for their present misfortune; namely, that he would first acquaint that gentleman who was their principal benefactor with the distress they were in, and to[211] crave his assistance and advice. This good charitable gentleman encouraged them to quit the place, for fear they should be cut off from any retreat at all by the violence of the distemper; but whither they should go, that he found very hard to direct them to. At last John asked of him, whether he, being a justice of the peace, would give them certificates of health to other justices who[212] they might come before, that so, whatever might be their lot, they might not be repulsed, now they had been also so long from London. This his worship immediately granted, and gave them proper letters of health; and from thence they were at liberty to travel whither they pleased. Accordingly they had a full certificate of health, intimating that they had resided in a village in the county of Essex so long; that, being examined and scrutinized sufficiently, and having been retired from all conversation[213] for above forty days, without any appearance of sickness, they were therefore certainly concluded to be sound men, and might be safely entertained anywhere, having at last removed rather for fear of the plague, which was come into such a town, rather[214] than for having any signal of infection upon them, or upon any belonging to them. With this certificate they removed, though with great reluctance; and, John inclining not to go far from home, they removed towards the marshes on the side of Waltham. But here they found a man who, it seems, kept a weir or stop upon the river, made to raise water for the barges which go up and down the river; and he terrified them with dismal stories of the sickness having been spread into all the towns on the river and near the river, on the side of Middlesex and Hertfordshire (that is to say, into Waltham, Waltham Cross, Enfield, and Ware, and all the towns on the road), that they were afraid to go that way; though it seems the man imposed upon them, for that[215] the thing was not really true. However, it terrified them, and they resolved to move across the forest towards Rumford and Brentwood; but they heard that there were numbers of people fled out of London that way, who lay up and down in the forest, reaching near Rumford, and who, having no subsistence or habitation, not only lived oddly,[216] and suffered great extremities in the woods and fields for want of relief, but were said to be made so desperate by those extremities, as that they offered many violences to the country, robbed and plundered, and killed cattle, and the like; and others, building huts and hovels by the roadside, begged, and that with an importunity next door to demanding relief: so that the country was very uneasy, and had been obliged to take some of them up. This, in the first place, intimated to them that they would be sure to find the charity and kindness of the county, which they had found here where they were before, hardened and shut up against them; and that, on the other hand, they would be questioned wherever they came, and would be in danger of violence from others in like cases with themselves. 
Upon all these considerations, John, their captain, in all their names, went back to their good friend and benefactor who had relieved them before, and, laying their case truly before him, humbly asked his advice; and he as kindly advised them to take up their old quarters again, or, if not, to remove but a little farther out of the road, and directed them to a proper place for them. And as they really wanted some house, rather than huts, to shelter them at that time of the year, it growing on towards Michaelmas, they found an old decayed house, which had been formerly some cottage or little habitation, but was so out of repair as[217] scarce habitable; and by consent of a farmer, to whose farm it belonged, they got leave to make what use of it they could. The ingenious joiner, and all the rest by his directions, went to work with it, and in a very few days made it capable to shelter them all in case of bad weather; and in which there was an old chimney and an old oven, though both lying in ruins, yet they made them both fit for use; and, raising additions, sheds, and lean-to's[218] on every side, they soon made the house capable to hold them all. They chiefly wanted boards to make window shutters, floors, doors, and several other things; but as the gentleman above favored them, and the country was by that means made easy with them, and, above all, that they were known to be all sound and in good health, everybody helped them with what they could spare. Here they encamped for good and all, and resolved to remove no more. They saw plainly how terribly alarmed that country was everywhere at anybody that came from London, and that they should have no admittance anywhere but with the utmost difficulty; at least no friendly reception and assistance, as they had received here. Now, although they received great assistance and encouragement from the country gentlemen, and from the people round about them, yet they were put to great straits; for the weather grew cold and wet in October and November, and they had not been used to so much hardship, so that they got cold in their limbs, and distempers, but never had the infection. And thus about December they came home to the city again. I give this story thus at large, principally to give an account[219] what became of the great numbers of people which immediately appeared in the city as soon as the sickness abated; for, as I have said, great numbers of those that were able, and had retreats in the country, fled to those retreats. So when it[220] was increased to such a frightful extremity as I have related, the middling people[221] who had not friends fled to all parts of the country where they could get shelter, as well those that had money to relieve themselves as those that had not. Those that had money always fled farthest, because they were able to subsist themselves; but those who were empty suffered, as I have said, great hardships, and were often driven by necessity to relieve their wants at the expense of the country. By that means the country was made very uneasy at them, and sometimes took them up, though even then they scarce knew what to do with them, and were always very backward to punish them; but often, too, they forced them from place to place, till they were obliged to come back again to London. 
I have, since my knowing this story of John and his brother, inquired, and found that there were a great many of the poor disconsolate people, as above, fled into the country every way; and some of them got little sheds and barns and outhouses to live in, where they could obtain so much kindness of the country, and especially where they had any, the least satisfactory account to give of themselves, and particularly that they did not come out of London too late. But others, and that in great numbers, built themselves little huts and retreats in the fields and woods, and lived like hermits in holes and caves, or any place they could find, and where, we may be sure, they suffered great extremities, such that many of them were obliged to come back again, whatever the danger was. And so those little huts were often found empty, and the country people supposed the inhabitants lay dead in them of the plague, and would not go near them for fear, no, not in a great while; nor is it unlikely but that some of the unhappy wanderers might die so all alone, even sometimes for want of help, as particularly in one tent or hut was found a man dead, and on the gate of a field just by was cut with his knife, in uneven letters, the following words, by which it may be supposed the other man escaped, or that, one dying first, the other buried him as well as he could:--

O mIsErY!
We BoTH ShaLL DyE,
WoE, WoE.

I have given an account already of what I found to have been the case down the river among the seafaring men, how the ships lay in the "offing," as it is called, in rows or lines, astern of one another, quite down from the Pool as far as I could see. I have been told that they lay in the same manner quite down the river as low as Gravesend,[222] and some far beyond, even everywhere, or in every place where they could ride with safety as to wind and weather. Nor did I ever hear that the plague reached to any of the people on board those ships, except such as lay up in the Pool, or as high as Deptford Reach, although the people went frequently on shore to the country towns and villages, and farmers' houses, to buy fresh provisions (fowls, pigs, calves, and the like) for their supply.

Likewise I found that the watermen on the river above the bridge found means to convey themselves away up the river as far as they could go; and that they had, many of them, their whole families in their boats, covered with tilts[223] and bales, as they call them, and furnished with straw within for their lodging; and that they lay thus all along by the shore in the marshes, some of them setting up little tents with their sails, and so lying under them on shore in the day, and going into their boats at night. And in this manner, as I have heard, the riversides were lined with boats and people as long as they had anything to subsist on, or could get anything of the country; and indeed the country people, as well gentlemen as others, on these and all other occasions, were very forward to relieve them, but they were by no means willing to receive them into their towns and houses, and for that we cannot blame them.

There was one unhappy citizen, within my knowledge, who had been visited in a dreadful manner, so that his wife and all his children were dead, and himself and two servants only left, with an elderly woman, a near relation, who had nursed those that were dead as well as she could.
This disconsolate man goes to a village near the town, though not within the bills of mortality, and, finding an empty house there, inquires out the owner, and took the house. After a few days he got a cart, and loaded it with goods, and carries them down to the house. The people of the village opposed his driving the cart along, but, with some arguings and some force, the men that drove the cart along got through the street up to the door of the house. There the constable resisted them again, and would not let them be brought in. The man caused the goods to be unloaded and laid at the door, and sent the cart away, upon which they carried the man before a justice of peace; that is to say, they commanded him to go, which he did. The justice ordered him to cause the cart to fetch away the goods again, which he refused to do; upon which the justice ordered the constable to pursue the carters and fetch them back, and make them reload the goods and carry them away, or to set them in the stocks[224] till they[225] came for further orders; and if they could not find them,[226] and the man would not consent to take them[227] away, they[225] should cause them[227] to be drawn with hooks from the house door, and burned in the street. The poor distressed man, upon this, fetched the goods again, but with grievous cries and lamentations at the hardship of his case. But there was no remedy: self-preservation obliged the people to those severities which they would not otherwise have been concerned in. Whether this poor man lived or died, I cannot tell, but it was reported that he had the plague upon him at that time, and perhaps the people might report that to justify their usage of him; but it was not unlikely that either he or his goods, or both, were dangerous, when his whole family had been dead of the distemper so little a while before.

I know that the inhabitants of the towns adjacent to London were much blamed for cruelty to the poor people that ran from the contagion in their distress, and many very severe things were done, as may be seen from what has been said; but I cannot but say also, that where there was room for charity and assistance to the people, without apparent danger to themselves, they were willing enough to help and relieve them. But as every town were indeed judges in their own case, so the poor people who ran abroad in their extremities were often ill used, and driven back again into the town; and this caused infinite exclamations and outcries against the country towns, and made the clamor very popular.

And yet more or less, maugre[228] all the caution, there was not a town of any note within ten (or, I believe, twenty) miles of the city, but what was more or less infected, and had some[229] died among them. I have heard the accounts of several, such as they were reckoned up, as follows:--

Enfield 32
Hornsey 58
Newington 17
Tottenham 42
Edmonton 19
Barnet and Hadley 43
St. Albans 121
Watford 45
Uxbridge 117
Hertford 90
Ware 160
Hodsdon 30
Waltham Abbey 23
Epping 26
Deptford 623
Greenwich 631
Eltham and Lusum 85
Croydon 61
Brentwood 70
Rumford 109
Barking about 200
Brandford 432
Kingston 122
Staines 82
Chertsey 18
Windsor 103

cum aliis.[230]

Another thing might render the country more strict with respect to the citizens, and especially with respect to the poor, and this was what I hinted at before; namely, that there was a seeming propensity, or a wicked inclination, in those that were infected, to infect others.
There have been great debates among our physicians as to the reason of this. Some will have it to be in the nature of the disease, and that it impresses every one that is seized upon by it with a kind of rage and a hatred against their own kind, as if there were a malignity, not only in the distemper to communicate itself, but in the very nature of man, prompting him with evil will, or an evil eye, that as they say in the case of a mad dog, who, though the gentlest creature before of any of his kind, yet then will fly upon and bite any one that comes next him, and those as soon as any, who have been most observed[231] by him before. Others placed it to the account of the corruption of human nature, who[232] cannot bear to see itself more miserable than others of its own species, and has a kind of involuntary wish that all men were as unhappy or in as bad a condition as itself. Others say it was only a kind of desperation, not knowing or regarding what they did, and consequently unconcerned at the danger or safety, not only of anybody near them, but even of themselves also. And indeed, when men are once come to a condition to abandon themselves, and be unconcerned for the safety or at the danger of themselves, it cannot be so much wondered that they should be careless of the safety of other people. But I choose to give this grave debate quite a different turn, and answer it or resolve it all by saying that I do not grant the fact. On the contrary, I say that the thing is not really so, but that it was a general complaint raised by the people inhabiting the outlying villages against the citizens, to justify, or at least excuse, those hardships and severities so much talked of, and in which complaints both sides may be said to have injured one another; that is to say, the citizens pressing to be received and harbored in time of distress, and with the plague upon them, complain of the cruelty and injustice of the country people in being refused entrance, and forced back again with their goods and families; and the inhabitants, finding themselves so imposed upon, and the citizens breaking in, as it were, upon them, whether they would or no, complain that when they[233] were infected, they were not only regardless of others, but even willing to infect them: neither of which was really true, that is to say, in the colors they[234] were described in. It is true there is something to be said for the frequent alarms which were given to the country, of the resolution of the people of London to come out by force, not only for relief, but to plunder and rob; that they ran about the streets with the distemper upon them without any control; and that no care was taken to shut up houses, and confine the sick people from infecting others; whereas, to do the Londoners justice, they never practiced such things, except in such particular cases as I have mentioned above, and such like. On the other hand, everything was managed with so much care, and such excellent order was observed in the whole city and suburbs, by the care of the lord mayor and aldermen, and by the justices of the peace, churchwardens, etc., in the outparts, that London may be a pattern to all the cities in the world for the good government and the excellent order that was everywhere kept, even in the time of the most violent infection, and when the people were in the utmost consternation and distress. But of this I shall speak by itself. 
One thing, it is to be observed, was owing principally to the prudence of the magistrates, and ought to be mentioned to their honor; viz., the moderation which they used in the great and difficult work of shutting up houses. It is true, as I have mentioned, that the shutting up of houses was a great subject of discontent, and I may say, indeed, the only subject of discontent among the people at that time; for the confining the sound in the same house with the sick was counted very terrible, and the complaints of people so confined were very grievous: they were heard in the very streets, and they were sometimes such that called for resentment, though oftener for compassion. They had no way to converse with any of their friends but out of their windows, where they would make such piteous lamentations as often moved the hearts of those they talked with, and of others who, passing by, heard their story; and as those complaints oftentimes reproached the severity, and sometimes the insolence, of the watchmen placed at their doors, those watchmen would answer saucily enough, and perhaps be apt to affront the people who were in the street talking to the said families; for which, or for their ill treatment of the families, I think seven or eight of them in several places were killed. I know not whether I should say murdered or not, because I cannot enter into the particular cases. It is true, the watchmen were on their duty, and acting in the post where they were placed by a lawful authority; and killing any public legal officer in the execution of his office is always, in the language of the law, called "murder." But as they were not authorized by the magistrate's instructions, or by the power they acted under, to be injurious or abusive, either to the people who were under their observation or to any that concerned themselves for them, so that,[235] when they did so, they might be said to act themselves, not their office; to act as private persons, not as persons employed; and consequently, if they brought mischief upon themselves by such an undue behavior, that mischief was upon their own heads. And indeed they had so much the hearty curses of the people, whether they deserved it or not, that, whatever befell them, nobody pitied them; and everybody was apt to say they deserved it, whatever it was. Nor do I remember that anybody was ever punished, at least to any considerable degree, for whatever was done to the watchmen that guarded their houses. What variety of stratagems were used to escape, and get out of houses thus shut up, by which the watchmen were deceived or overpowered, and that[236] the people got away, I have taken notice of already, and shall say no more to that; but I say the magistrates did moderate and ease families upon many occasions in this case, and particularly in that of taking away or suffering to be removed the sick persons out of such houses, when they were willing to be removed, either to a pesthouse or other places, and sometimes giving the well persons in the family so shut up leave to remove, upon information given that they were well, and that they would confine themselves in such houses where they went, so long as should be required of them. 
The concern, also, of the magistrates for the supplying such poor families as were infected,--I say, supplying them with necessaries, as well physic as food,--was very great: and in which they did not content themselves with giving the necessary orders to the officers appointed; but the aldermen, in person and on horseback, frequently rode to such houses, and caused the people to be asked at their windows whether they were duly attended or not, also whether they wanted anything that was necessary, and if the officers had constantly carried their messages, and fetched them such things as they wanted, or not. And if they answered in the affirmative, all was well; but if they complained that they were ill supplied, and that the officer did not do his duty, or did not treat them civilly, they (the officers) were generally removed, and others placed in their stead. It is true, such complaint might be unjust; and if the officer had such arguments to use as would convince the magistrate that he was right, and that the people had injured him, he was continued, and they reproved. But this part could not well bear a particular inquiry, for the parties could very ill be well heard and answered in the street from the windows, as was the case then. The magistrates, therefore, generally chose to favor the people, and remove the man, as what seemed to be the least wrong and of the least ill consequence; seeing, if the watchman was injured, yet they could easily make him amends by giving him another post of a like nature; but, if the family was injured, there was no satisfaction could be made to them, the damage, perhaps, being irreparable, as it concerned their lives. A great variety of these cases frequently happened between the watchmen and the poor people shut up, besides those I formerly mentioned about escaping. Sometimes the watchmen were absent, sometimes drunk, sometimes asleep, when the people wanted them; and such never failed to be punished severely, as indeed they deserved. But, after all that was or could be done in these cases, the shutting up of houses, so as to confine those that were well with those that were sick, had very great inconveniences in it, and some that were very tragical, and which merited to have been considered, if there had been room for it: but it was authorized by a law, it had the public good in view as the end chiefly aimed at; and all the private injuries that were done by the putting it in execution must be put to the account of the public benefit. It is doubtful whether, in the whole, it contributed anything to the stop of the infection; and indeed I cannot say it did, for nothing could run with greater fury and rage than the infection did when it was in its chief violence, though the houses infected were shut up as exactly and effectually as it was possible. Certain it is, that, if all the infected persons were effectually shut in, no sound person could have been infected by them, because they could not have come near them.[237] But the case was this (and I shall only touch it here); namely, that the infection was propagated insensibly, and by such persons as were not visibly infected, who neither knew whom they infected, nor whom they were infected by. A house in Whitechapel was shut up for the sake of one infected maid, who had only spots, not the tokens, come out upon her, and recovered; yet these people obtained no liberty to stir, neither for air or exercise, forty days. 
Want of breath, fear, anger, vexation, and all the other griefs attending such an injurious treatment, cast the mistress of the family into a fever; and visitors came into the house and said it was the plague, though the physicians declared it was not. However, the family were obliged to begin their quarantine anew, on the report of the visitor or examiner, though their former quarantine wanted but a few days of being finished. This oppressed them so with anger and grief, and, as before, straitened them also so much as to room, and for want of breathing and free air, that most of the family fell sick, one of one distemper, one of another, chiefly scorbutic[238] ailments, only one a violent cholic; until, after several prolongings of their confinement, some or other of those that came in with the visitors to inspect the persons that were ill, in hopes of releasing them, brought the distemper with them, and infected the whole house; and all or most of them died, not of the plague as really upon them before, but of the plague that those people brought them, who should have been careful to have protected them from it. And this was a thing which frequently happened, and was indeed one of the worst consequences of shutting houses up. I had about this time a little hardship put upon me, which I was at first greatly afflicted at, and very much disturbed about, though, as it proved, it did not expose me to any disaster; and this was, being appointed, by the alderman of Portsoken Ward, one of the examiners of the houses in the precinct where I lived. We had a large parish, and had no less than eighteen examiners, as the order called us: the people called us visitors. I endeavored with all my might to be excused from such an employment, and used many arguments with the alderman's deputy to be excused; particularly, I alleged that I was against shutting up houses at all, and that it would be very hard to oblige me to be an instrument in that which was against my judgment, and which I did verily believe would not answer the end it was intended for. But all the abatement I could get was only, that whereas the officer was appointed by my lord mayor to continue two months, I should be obliged to hold it but three weeks, on condition, nevertheless, that I could then get some other sufficient housekeeper to serve the rest of the time for me; which was, in short, but a very small favor, it being very difficult to get any man to accept of such an employment that was fit to be intrusted with it. It is true that shutting up of houses had one effect which I am sensible was of moment; namely, it confined the distempered people, who would otherwise have been both very troublesome and very dangerous in their running about streets with the distemper upon them, which, when they were delirious, they would have done in a most frightful manner, as, indeed, they began to do at first very much until they were restrained; nay, so very open they were, that the poor would go about and beg at people's doors, and say they had the plague upon them, and beg rags for their sores, or both, or anything that delirious nature happened to think of. A poor unhappy gentlewoman, a substantial citizen's wife, was, if the story be true, murdered by one of these creatures in Aldersgate Street, or that way. He was going along the street, raving mad, to be sure, and singing. The people only said he was drunk; but he himself said he had the plague upon him, which, it seems, was true; and, meeting this gentlewoman, he would kiss her. 
She was terribly frightened, as he was a rude fellow, and she run from him; but, the street being very thin of people, there was nobody near enough to help her. When she saw he would overtake her, she turned and gave him a thrust so forcibly, he being but weak, as pushed him down backward; but very unhappily, she being so near, he caught hold of her and pulled her down also, and, getting up first, mastered her and kissed her, and, which was worst of all, when he had done, told her he had the plague, and why should not she have it as well as he. She was frightened enough before; but when she heard him say he had the plague, she screamed out, and fell down into a swoon, or in a fit, which, though she recovered a little, yet killed her in a very few days; and I never heard whether she had the plague or no. Another infected person came and knocked at the door of a citizen's house where they knew him very well. The servant let him in, and, being told the master of the house was above, he ran up, and came into the room to them as the whole family were at supper. They began to rise up a little surprised, not knowing what the matter was; but he bid them sit still, he only come to take his leave of them. They asked him, "Why, Mr. ----, where are you going?"--"Going?" says he; "I have got the sickness, and shall die to-morrow night." It is easy to believe, though not to describe, the consternation they were all in. The women and the man's daughters, which[239] were but little girls, were frightened almost to death, and got up, one running out at one door and one at another, some downstairs and some upstairs, and, getting together as well as they could, locked themselves into their chambers, and screamed out at the windows for help, as if they had been frightened out of their wits. The master, more composed than they, though both frightened and provoked, was going to lay hands on him and throw him downstairs, being in a passion; but then, considering a little the condition of the man and the danger of touching him, horror seized his mind, and he stood still like one astonished. The poor distempered man, all this while, being as well diseased in his brain as in his body, stood still like one amazed. At length he turns round. "Ay!" says he with all the seeming calmness imaginable, "is it so with you all? Are you all disturbed at me? Why, then, I'll e'en go home and die there." And so he goes immediately downstairs. The servant that had let him in goes down after him with a candle, but was afraid to go past him and open the door; so he stood on the stairs to see what he would do. The man went and opened the door, and went out and flung[240] the door after him. It was some while before the family recovered the fright; but, as no ill consequence attended, they have had occasion since to speak of it, you may be sure, with great satisfaction. Though the man was gone, it was some time, nay, as I heard, some days, before they recovered themselves of the hurry they were in; nor did they go up and down the house with any assurance till they had burned a great variety of fumes and perfumes in all the rooms, and made a great many smokes of pitch, of gunpowder, and of sulphur. All separately shifted,[241] and washed their clothes, and the like. As to the poor man, whether he lived or died, I do not remember. 
It is most certain, that if, by the shutting up of houses, the sick had not been confined, multitudes, who in the height of their fever were delirious and distracted, would have been continually running up and down the streets; and even as it was, a very great number did so, and offered all sorts of violence to those they met, even just as a mad dog runs on and bites at every one he meets. Nor can I doubt but that, should one of those infected diseased creatures have bitten any man or woman while the frenzy of the distemper was upon them, they (I mean the person so wounded) would as certainly have been incurably infected as one that was sick before and had the tokens upon him. I heard of one infected creature, who, running out of his bed in his shirt, in the anguish and agony of his swellings (of which he had three upon him), got his shoes on, and went to put on his coat; but the nurse resisting, and snatching the coat from him, he threw her down, run over her, ran downstairs and into the street directly to the Thames, in his shirt, the nurse running after him, and calling to the watch to stop him. But the watchman, frightened at the man, and afraid to touch him, let him go on; upon which he ran down to the Still-Yard Stairs, threw away his shirt, and plunged into the Thames, and, being a good swimmer, swam quite over the river; and the tide being "coming in," as they call it (that is, running westward), he reached the land not till he came about the Falcon Stairs, where, landing and finding no people there, it being in the night, he ran about the streets there, naked as he was, for a good while, when, it being by that time high water, he takes the river again, and swam back to the Still Yard, landed, ran up the streets to his own house, knocking at the door, went up the stairs, and into his bed again; and[242] that this terrible experiment cured him of the plague, that is to say, that the violent motion of his arms and legs stretched the parts where the swellings he had upon him were (that is to say, under his arms and in his groin), and caused them to ripen and break; and that the cold of the water abated the fever in his blood. I have only to add, that I do not relate this, any more than some of the other, as a fact within my own knowledge, so as that I can vouch the truth of them; and especially that of the man being cured by the extravagant adventure, which I confess I do not think very possible, but it may serve to confirm the many desperate things which the distressed people, falling into deliriums and what we call light-headedness, were frequently run upon at that time, and how infinitely more such there would have been if such people had not been confined by the shutting up of houses; and this I take to be the best, if not the only good thing, which was performed by that severe method. On the other hand, the complaints and the murmurings were very bitter against the thing itself. It would pierce the hearts of all that came by, to hear the piteous cries of those infected people, who, being thus out of their understandings by the violence of their pain or the heat of their blood, were either shut in, or perhaps tied in their beds and chairs, to prevent their doing themselves hurt, and who would make a dreadful outcry at their being confined, and at their being not permitted to "die at large," as they called it, and as they would have done before. 
This running of distempered people about the streets was very dismal, and the magistrates did their utmost to prevent it; but as it was generally in the night, and always sudden, when such attempts were made, the officers could not be at hand to prevent it; and even when they got out in the day, the officers appointed did not care to meddle with them, because, as they were all grievously infected, to be sure, when they were come to that height, so they were more than ordinarily infectious, and it was one of the most dangerous things that could be to touch them. On the other hand, they generally ran on, not knowing what they did, till they dropped down stark dead, or till they had exhausted their spirits so as that they would fall and then die in perhaps half an hour or an hour; and, which was most piteous to hear, they were sure to come to themselves entirely in that half hour or hour, and then to make most grievous and piercing cries and lamentations, in the deep afflicting sense of the condition they were in. There was much of it before the order for shutting up of houses was strictly put into execution; for at first the watchmen were not so rigorous and severe as they were afterwards in the keeping the people in; that is to say, before they were (I mean some of them) severely punished for their neglect, failing in their duty, and letting people who were under their care slip away, or conniving at their going abroad, whether sick or well. But after they saw the officers appointed to examine into their conduct were resolved to have them do their duty, or be punished for the omission, they were more exact, and the people were strictly restrained; which was a thing they took so ill, and bore so impatiently, that their discontents can hardly be described; but there was an absolute necessity for it, that must be confessed, unless some other measures had been timely entered upon, and it was too late for that. Had not this particular of the sick being restrained as above been our case at that time, London would have been the most dreadful place that ever was in the world. There would, for aught I know, have as many people died in the streets as died in their houses: for when the distemper was at its height, it generally made them raving and delirious; and when they were so, they would never be persuaded to keep in their beds but by force; and many who were not tied threw themselves out of windows when they found they could not get leave to go out of their doors. It was for want of people conversing one with another in this time of calamity, that it was impossible any particular person could come at the knowledge of all the extraordinary cases that occurred in different families; and particularly, I believe it was never known to this day how many people in their deliriums drowned themselves in the Thames, and in the river which runs from the marshes by Hackney, which we generally called Ware River or Hackney River. As to those which were set down in the weekly bill, they were indeed few. Nor could it be known of any of those, whether they drowned themselves by accident or not; but I believe I might reckon up more who, within the compass of my knowledge or observation, really drowned themselves in that year than are put down in the bill of all put together, for many of the bodies were never found who yet were known to be lost; and the like in other methods of self-destruction. There was also one man in or about Whitecross Street burnt himself to death in his bed. 
Some said it was done by himself, others that it was by the treachery of the nurse that attended him; but that he had the plague upon him, was agreed by all. It was a merciful disposition of Providence, also, and which I have many times thought of at that time, that no fires, or no considerable ones at least, happened in the city during that year, which, if it had been otherwise, would have been very dreadful; and either the people must have let them alone unquenched, or have come together in great crowds and throngs, unconcerned at the danger of the infection, not concerned at the houses they went into, at the goods they handled, or at the persons or the people they came among. But so it was, that excepting that in Cripplegate Parish, and two or three little eruptions of fires, which were presently extinguished, there was no disaster of that kind happened in the whole year. They told us a story of a house in a place called Swan Alley, passing from Goswell Street near the end of Old Street into St. John Street, that a family was infected there in so terrible a manner that every one of the house died. The last person lay dead on the floor, and, as it is supposed, had laid herself all along to die just before the fire. The fire, it seems, had fallen from its place, being of wood, and had taken hold of the boards and the joists they lay on, and burned as far as just to the body, but had not taken hold of the dead body, though she had little more than her shift on, and had gone out of itself, not hurting the rest of the house, though it was a slight timber house. How true this might be, I do not determine; but the city being to suffer severely the next year by fire, this year it felt very little of that calamity. Indeed, considering the deliriums which the agony threw people into, and how I have mentioned in their madness, when they were alone, they did many desperate things, it was very strange there were no more disasters of that kind. It has been frequently asked me, and I cannot say that I ever knew how to give a direct answer to it, how it came to pass that so many infected people appeared abroad in the streets at the same time that the houses which were infected were so vigilantly searched, and all of them shut up and guarded as they were. I confess I know not what answer to give to this, unless it be this: that, in so great and populous a city as this is, it was impossible to discover every house that was infected as soon as it was so, or to shut up all the houses that were infected; so that people had the liberty of going about the streets, even where they pleased, unless they were known to belong to such and such infected houses. It is true, that, as the several physicians told my lord mayor, the fury of the contagion was such at some particular times, and people sickened so fast and died so soon, that it was impossible, and indeed to no purpose, to go about to inquire who was sick and who was well, or to shut them up with such exactness as the thing required, almost every house in a whole street being infected, and in many places every person in some of the houses. And, that which was still worse, by the time that the houses were known to be infected, most of the persons infected would be stone dead, and the rest run away for fear of being shut up; so that it was to very small purpose to call them infected houses and shut them up, the infection having ravaged and taken its leave of the house before it was really known that the family was any way touched. 
This might be sufficient to convince any reasonable person, that as it was not in the power of the magistrates, or of any human methods or policy, to prevent the spreading the infection, so that this way of shutting up of houses was perfectly insufficient for that end. Indeed, it seemed to have no manner of public good in it equal or proportionable to the grievous burthen that it was to the particular families that were so shut up; and, as far as I was employed by the public in directing that severity, I frequently found occasion to see that it was incapable of answering the end. For example, as I was desired as a visitor or examiner to inquire into the particulars of several families which were infected, we scarce came to any house where the plague had visibly appeared in the family but that some of the family were fled and gone. The magistrates would resent this, and charge the examiners with being remiss in their examination or inspection; but by that means houses were long infected before it was known. Now, as I was in this dangerous office but half the appointed time, which was two months, it was long enough to inform myself that we were no way capable of coming at the knowledge of the true state of any family but by inquiring at the door or of the neighbors. As for going into every house to search, that was a part no authority would offer to impose on the inhabitants, or any citizen would undertake; for it would have been exposing us to certain infection and death, and to the ruin of our own families as well as of ourselves. Nor would any citizen of probity, and that could be depended upon, have staid in the town if they had been made liable to such a severity. Seeing, then, that we could come at the certainty of things by no method but that of inquiry of the neighbors or of the family (and on that we could not justly depend), it was not possible but that the uncertainty of this matter would remain as above. It is true, masters of families were bound by the order to give notice to the examiner of the place wherein he lived, within two hours after he should discover it, of any person being sick in his house, that is to say, having signs of the infection; but they found so many ways to evade this, and excuse their negligence, that they seldom gave that notice till they had taken measures to have every one escape out of the house who had a mind to escape, whether they were sick or sound. And while this was so, it was easy to see that the shutting up of houses was no way to be depended upon as a sufficient method for putting a stop to the infection, because, as I have said elsewhere, many of those that so went out of those infected houses had the plague really upon them, though they might really think themselves sound; and some of these were the people that walked the streets till they fell down dead: not that they were suddenly struck with the distemper, as with a bullet that killed with the stroke, but that they really had the infection in their blood long before, only that, as it preyed secretly on their vitals, it appeared not till it seized the heart with a mortal power, and the patient died in a moment, as with a sudden fainting or an apoplectic fit. 
I know that some, even of our physicians, thought for a time that those people that so died in the streets were seized but that moment they fell, as if they had been touched by a stroke from heaven, as men are killed by a flash of lightning; but they found reason to alter their opinion afterward, for, upon examining the bodies of such after they were dead, they always either had tokens upon them, or other evident proofs of the distemper having been longer upon them than they had otherwise expected. This often was the reason that, as I have said, we that were examiners were not able to come at the knowledge of the infection being entered into a house till it was too late to shut it up, and sometimes not till the people that were left were all dead. In Petticoat Lane two houses together were infected, and several people sick; but the distemper was so well concealed, the examiner, who was my neighbor, got no knowledge of it till notice was sent him that the people were all dead, and that the carts should call there to fetch them away. The two heads of the families concerted their measures, and so ordered their matters as that, when the examiner was in the neighborhood, they appeared generally at a time, and answered, that is, lied for one another, or got some of the neighborhood to say they were all in health, and perhaps knew no better; till, death making it impossible to keep it any longer as a secret, the dead carts were called in the night to both the houses, and so it became public. But when the examiner ordered the constable to shut up the houses, there was nobody left in them but three people (two in one house, and one in the other), just dying, and a nurse in each house, who acknowledged that they had buried five before, that the houses had been infected nine or ten days, and that for all the rest of the two families, which were many, they were gone, some sick, some well, or, whether sick or well, could not be known. In like manner, at another house in the same lane, a man, having his family infected, but very unwilling to be shut up, when he could conceal it no longer, shut up himself; that is to say, he set the great red cross upon the door, with the words, "LORD, HAVE MERCY UPON US!" and so deluded the examiner, who supposed it had been done by the constable, by order of the other examiner (for there were two examiners to every district or precinct). By this means he had free egress and regress into his house again and out of it, as he pleased, notwithstanding it was infected, till at length his stratagem was found out, and then he, with the sound part of his family and servants, made off and escaped; so they were not shut up at all. These things made it very hard, if not impossible, as I have said, to prevent the spreading of an infection by the shutting up of houses, unless the people would think the shutting up of their houses no grievance, and be so willing to have it done as that they would give notice duly and faithfully to the magistrates of their being infected, as soon as it was known by themselves; but as that cannot be expected from them, and the examiners cannot be supposed, as above, to go into their houses to visit and search, all the good of shutting up houses will be defeated, and few houses will be shut up in time, except those of the poor, who cannot conceal it, and of some people who will be discovered by the terror and consternation which the thing put them into. 
I got myself discharged of the dangerous office I was in as soon as I could get another admitted, whom I had obtained for a little money to accept of it; and so, instead of serving the two months, which was directed, I was not above three weeks in it; and a great while too, considering it was in the month of August, at which time the distemper began to rage with great violence at our end of the town. In the execution of this office, I could not refrain speaking my opinion among my neighbors as to the shutting up the people in their houses, in which we saw most evidently the severities that were used, though grievous in themselves, had also this particular objection against them; namely, that they did not answer the end, as I have said, but that the distempered people went day by day about the streets. And it was our united opinion that a method to have removed the sound from the sick, in case of a particular house being visited, would have been much more reasonable on many accounts, leaving nobody with the sick persons but such as should, on such occasions, request to stay, and declare themselves content to be shut up with them. Our scheme for removing those that were sound from those that were sick was only in such houses as were infected; and confining the sick was no confinement: those that could not stir would not complain while they were in their senses, and while they had the power of judging. Indeed, when they came to be delirious and light-headed, then they would cry out of[243] the cruelty of being confined; but, for the removal of those that were well, we thought it highly reasonable and just, for their own sakes, they should be removed from the sick, and that, for other people's safety, they should keep retired for a while, to see that they were sound, and might not infect others; and we thought twenty or thirty days enough for this. Now, certainly, if houses had been provided on purpose for those that were sound, to perform this demiquarantine in, they would have much less reason to think themselves injured in such a restraint than in being confined with infected people in the houses where they lived. It is here, however, to be observed, that after the funerals became so many that people could not toll the bell, mourn or weep, or wear black for one another, as they did before, no, nor so much as make coffins for those that died, so, after a while, the fury of the infection appeared to be so increased, that, in short, they shut up no houses at all. It seemed enough that all the remedies of that kind had been used till they were found fruitless, and that the plague spread itself with an irresistible fury; so that, as the fire the succeeding year spread itself and burnt with such violence that the citizens in despair gave over their endeavors to extinguish it, so in the plague it came at last to such violence, that the people sat still looking at one another, and seemed quite abandoned to despair. Whole streets seemed to be desolated, and not to be shut up only, but to be emptied of their inhabitants: doors were left open, windows stood shattering with the wind in empty houses, for want of people to shut them. In a word, people began to give up themselves to their fears, and to think that all regulations and methods were in vain, and that there was nothing to be hoped for but an universal desolation. 
And it was even in the height of this general despair that it pleased God to stay his hand, and to slacken the fury of the contagion in such a manner as was even surprising, like its beginning, and demonstrated it to be his own particular hand; and that above, if not without the agency of means, as I shall take notice of in its proper place. But I must still speak of the plague as in its height, raging even to desolation, and the people under the most dreadful consternation, even, as I have said, to despair. It is hardly credible to what excesses the passions of men carried them in this extremity of the distemper; and this part, I think, was as moving as the rest. What could affect a man in his full power of reflection, and what could make deeper impressions on the soul, than to see a man almost naked, and got out of his house or perhaps out of his bed into the street, come out of Harrow Alley, a populous conjunction or collection of alleys, courts, and passages, in the Butcher Row in Whitechapel,--I say, what could be more affecting than to see this poor man come out into the open street, run, dancing and singing, and making a thousand antic gestures, with five or six women and children running after him, crying and calling upon him for the Lord's sake to come back, and entreating the help of others to bring him back, but all in vain, nobody daring to lay a hand upon him, or to come near him? This was a most grievous and afflicting thing to me, who saw it all from my own windows; for all this while the poor afflicted man was, as I observed it, even then in the utmost agony of pain, having, as they said, two swellings upon him, which could not be brought to break or to suppurate; but by laying strong caustics on them the surgeons had, it seems, hopes to break them, which caustics were then upon him, burning his flesh as with a hot iron. I cannot say what became of this poor man, but I think he continued roving about in that manner till he fell down and died. No wonder the aspect of the city itself was frightful. The usual concourse of the people in the streets, and which used to be supplied from our end of the town, was abated. The Exchange was not kept shut, indeed, but it was no more frequented. The fires were lost: they had been almost extinguished for some days by a very smart and hasty rain. But that was not all. Some of the physicians insisted that they were not only no benefit, but injurious to the health of the people. This they made a loud clamor about, and complained to the lord mayor about it. On the other hand, others of the same faculty, and eminent too, opposed them, and gave their reasons why the fires were and must be useful to assuage the violence of the distemper. I cannot give a full account of their arguments on both sides; only this I remember, that they caviled very much with one another. Some were for fires, but that they must be made of wood and not coal, and of particular sorts of wood too, such as fir, in particular, or cedar, because of the strong effluvia of turpentine; others were for coal and not wood, because of the sulphur and bitumen; and others were neither for one or other. 
Upon the whole, the lord mayor ordered no more fires, and especially on this account, namely, that the plague was so fierce that they saw evidently it defied all means, and rather seemed to increase than decrease upon any application to check and abate it; and yet this amazement of the magistrates proceeded rather from want of being able to apply any means successfully than from any unwillingness either to expose themselves or undertake the care and weight of business; for, to do them justice, they neither spared their pains nor their persons. But nothing answered. The infection raged, and the people were now terrified to the last degree, so that, as I may say, they gave themselves up, and, as I mentioned above, abandoned themselves to their despair. But let me observe here, that when I say the people abandoned themselves to despair, I do not mean to what men call a religious despair, or a despair of their eternal state; but I mean a despair of their being able to escape the infection, or to outlive the plague, which they saw was so raging, and so irresistible in its force, that indeed few people that were touched with it in its height, about August and September, escaped; and, which is very particular, contrary to its ordinary operation in June and July and the beginning of August, when, as I have observed, many were infected, and continued so many days, and then went off, after having had the poison in their blood a long time. But now, on the contrary, most of the people who were taken during the last two weeks in August, and in the first three weeks in September, generally died in two or three days at the farthest, and many the very same day they were taken. Whether the dog days[244] (as our astrologers pretended to express themselves, the influence of the Dog Star) had that malignant effect, or all those who had the seeds of infection before in them brought it up to a maturity at that time altogether, I know not; but this was the time when it was reported that above three thousand people died in one night; and they that would have us believe they more critically observed it pretend to say that they all died within the space of two hours, viz., between the hours of one and three in the morning. As to the suddenness of people dying at this time, more than before, there were innumerable instances of it, and I could name several in my neighborhood. One family without the bars, and not far from me, were all seemingly well on the Monday, being ten in family. That evening one maid and one apprentice were taken ill, and died the next morning, when the other apprentice and two children were touched, whereof one died the same evening and the other two on Wednesday. In a word, by Saturday at noon the master, mistress, four children, and four servants were all gone, and the house left entirely empty, except an ancient woman, who came to take charge of the goods for the master of the family's brother, who lived not far off, and who had not been sick. 
Many houses were then left desolate, all the people being carried away dead; and especially in an alley farther on the same side beyond the bars, going in at the sign of Moses and Aaron.[245] There were several houses together, which they said had not one person left alive in them; and some that died last in several of those houses were left a little too long before they were fetched out to be buried, the reason of which was not, as some have written very untruly, that the living were not sufficient to bury the dead, but that the mortality was so great in the yard or alley that there was nobody left to give notice to the buriers or sextons that there were any dead bodies there to be buried. It was said, how true I know not, that some of those bodies were so corrupted and so rotten, that it was with difficulty they were carried; and, as the carts could not come any nearer than to the alley gate in the High Street, it was so much the more difficult to bring them along. But I am not certain how many bodies were then left: I am sure that ordinarily it was not so. As I have mentioned how the people were brought into a condition to despair of life, and abandoned themselves, so this very thing had a strange effect among us for three or four weeks; that is, it made them bold and venturous. They were no more shy of one another, or restrained within doors, but went anywhere and everywhere, and began to converse. One would say to another, "I do not ask you how you are, or say how I am. It is certain we shall all go: so 'tis no matter who is sick or who is sound." And so they ran desperately into any place or company. As it brought the people into public company, so it was surprising how it brought them to crowd into the churches. They inquired no more into who[246] they sat near to or far from, what offensive smells they met with, or what condition the people seemed to be in; but, looking upon themselves all as so many dead corpses, they came to the churches without the least caution, and crowded together as if their lives were of no consequence compared to the work which they came about there. Indeed, the zeal which they showed in coming, and the earnestness and affection they showed in their attention to what they heard, made it manifest what a value people would all put upon the worship of God if they thought every day they attended at the church that it would be their last. Nor was it without other strange effects, for it took away all manner of prejudice at, or scruple about, the person whom they found in the pulpit when they came to the churches. It cannot be doubted but that many of the ministers of the parish churches were cut off among others in so common and dreadful a calamity; and others had not courage enough to stand it, but removed into the country as they found means for escape. As then some parish churches were quite vacant and forsaken, the people made no scruple of desiring such dissenters as had been a few years before deprived of their livings, by virtue of an act of Parliament called the "Act of Uniformity,"[247] to preach in the churches, nor did the church ministers in that case make any difficulty in accepting their assistance; so that many of those whom they called silent ministers had their mouths opened on this occasion, and preached publicly to the people. 
Here we may observe, and I hope it will not be amiss to take notice of it, that a near view of death would soon reconcile men of good principles one to another, and that it is chiefly owing to our easy situation in life, and our putting these things far from us, that our breaches are fomented, ill blood continued, prejudices, breach of charity and of Christian union so much kept and so far carried on among us as it is. Another plague year would reconcile all these differences; a close conversing with death, or with diseases that threaten death, would scum off the gall from our tempers, remove the animosities among us, and bring us to see with differing eyes than those which we looked on things with before. As the people who had been used to join with the church were reconciled at this time with the admitting the dissenters to preach to them, so the dissenters, who, with an uncommon prejudice, had broken off from the communion of the Church of England, were now content to come to their parish churches, and to conform to the worship which they did not approve of before. But, as the terror of the infection abated, those things all returned again to their less desirable channel, and to the course they were in before. I mention this but historically: I have no mind to enter into arguments to move either or both sides to a more charitable compliance one with another. I do not see that it is probable such a discourse would be either suitable or successful; the breaches seem rather to widen, and tend to a widening farther, than to closing: and who am I, that I should think myself able to influence either one side or other? But this I may repeat again, that it is evident death will reconcile us all: on the other side the grave we shall be all brethren again. In heaven, whither I hope we may come from all parties and persuasions, we shall find neither prejudice nor scruple: there we shall be of one principle and of one opinion. Why we cannot be content to go hand in hand to the place where we shall join heart and hand without the least hesitation, and with the most complete harmony and affection,--I say, why we cannot do so here, I can say nothing to; neither shall I say anything more of it, but that it remains to be lamented. I could dwell a great while upon the calamities of this dreadful time, and go on to describe the objects that appeared among us every day,--the dreadful extravagances which the distraction of sick people drove them into; how the streets began now to be fuller of frightful objects, and families to be made even a terror to themselves. But after I have told you, as I have above, that one man being tied in his bed, and finding no other way to deliver himself, set the bed on fire with his candle (which unhappily stood within his reach), and burned himself in bed; and how another, by the insufferable torment he bore, danced and sung naked in the streets, not knowing one ecstasy[248] from another,--I say, after I have mentioned these things, what can be added more? What can be said to represent the misery of these times more lively to the reader, or to give him a perfect idea of a more complicated distress? I must acknowledge that this time was so terrible that I was sometimes at the end of all my resolutions, and that I had not the courage that I had at the beginning. 
As the extremity brought other people abroad, it drove me home; and, except having made my voyage down to Blackwall and Greenwich, as I have related, which was an excursion, I kept afterwards very much within doors, as I had for about a fortnight before. I have said already that I repented several times that I had ventured to stay in town, and had not gone away with my brother and his family; but it was too late for that now. And after I had retreated and staid within doors a good while before my impatience led me abroad, then they called me, as I have said, to an ugly and dangerous office, which brought me out again; but as that was expired, while the height of the distemper lasted I retired again, and continued close ten or twelve days more, during which many dismal spectacles represented themselves in my view,[249] out of my own windows, and in our own street, as that particularly, from Harrow Alley, of the poor outrageous creature who danced and sung in his agony; and many others there were. Scarce a day or a night passed over but some dismal thing or other happened at the end of that Harrow Alley, which was a place full of poor people, most of them belonging to the butchers, or to employments depending upon the butchery. Sometimes heaps and throngs of people would burst out of the alley, most of them women, making a dreadful clamor, mixed or compounded of screeches, cryings, and calling one another, that we could not conceive what to make of it. Almost all the dead part of the night,[250] the dead cart stood at the end of that alley; for if it went in, it could not well turn again, and could go in but a little way. There, I say, it stood to receive dead bodies; and, as the churchyard was but a little way off, if it went away full, it would soon be back again. It is impossible to describe the most horrible cries and noise the poor people would make at their bringing the dead bodies of their children and friends out to the cart; and, by the number, one would have thought there had been none left behind, or that there were people enough for a small city living in those places. Several times they cried murder, sometimes fire; but it was easy to perceive that it was all distraction and the complaints of distressed and distempered people. I believe it was everywhere thus at that time, for the plague raged for six or seven weeks beyond all that I have expressed, and came even to such a height, that, in the extremity, they began to break into that excellent order of which I have spoken so much in behalf of the magistrates, namely, that no dead bodies were seen in the streets, or burials in the daytime; for there was a necessity in this extremity to bear with its being otherwise for a little while. One thing I cannot omit here, and indeed I thought it was extraordinary, at least it seemed a remarkable hand of divine justice; viz., that all the predictors, astrologers, fortune tellers, and what they called cunning men, conjurers, and the like, calculators of nativities, and dreamers of dreams, and such people, were gone and vanished; not one of them was to be found. I am verily persuaded that a great number of them fell in the heat of the calamity, having ventured to stay upon the prospect of getting great estates; and indeed their gain was but too great for a time, through the madness and folly of the people: but now they were silent; many of them went to their long home, not able to foretell their own fate, or to calculate their own nativities. 
Some have been critical enough to say[251] that every one of them died. I dare not affirm that; but this I must own, that I never heard of one of them that ever appeared after the calamity was over. But to return to my particular observations during this dreadful part of the visitation. I am now come, as I have said, to the month of September, which was the most dreadful of its kind, I believe, that ever London saw; for, by all the accounts which I have seen of the preceding visitations which have been in London, nothing has been like it, the number in the weekly bill amounting to almost forty thousands from the 22d of August to the 26th of September, being but five weeks. The particulars of the bills are as follows: viz.,--

   Aug. 22  to Aug. 29     7,496
   Aug. 29  to Sept. 5     8,252
   Sept. 5  to Sept. 12    7,690
   Sept. 12 to Sept. 19    8,297
   Sept. 19 to Sept. 26    6,460
                          ------
                          38,195

This was a prodigious number of itself; but if I should add the reasons which I have to believe that this account was deficient, and how deficient it was, you would with me make no scruple to believe that there died above ten thousand a week for all those weeks, one week with another, and a proportion for several weeks, both before and after. The confusion among the people, especially within the city, at that time was inexpressible. The terror was so great at last, that the courage of the people appointed to carry away the dead began to fail them; nay, several of them died, although they had the distemper before, and were recovered; and some of them dropped down when they have been carrying the bodies even at the pitside, and just ready to throw them in. And this confusion was greater in the city, because they had flattered themselves with hopes of escaping, and thought the bitterness of death was past. One cart, they told us, going up Shoreditch, was forsaken by the drivers, or, being left to one man to drive, he died in the street; and the horses, going on, overthrew the cart, and left the bodies, some thrown here, some there, in a dismal manner. Another cart was, it seems, found in the great pit in Finsbury Fields, the driver being dead, or having been gone and abandoned it; and the horses running too near it, the cart fell in, and drew the horses in also. It was suggested that the driver was thrown in with it, and that the cart fell upon him, by reason his whip was seen to be in the pit among the bodies; but that, I suppose, could not be certain. In our parish of Aldgate the dead carts were several times, as I have heard, found standing at the churchyard gate full of dead bodies, but neither bellman, or driver, or any one else, with it. Neither in these or many other cases did they know what bodies they had in their cart, for sometimes they were let down with ropes out of balconies and out of windows, and sometimes the bearers brought them to the cart, sometimes other people; nor, as the men themselves said, did they trouble themselves to keep any account of the numbers. The vigilance of the magistrate was now put to the utmost trial, and, it must be confessed, can never be enough acknowledged on this occasion; also, whatever expense or trouble they were at, two things were never neglected in the city or suburbs either:-- 1. Provisions were always to be had in full plenty, and the price not much raised neither, hardly worth speaking. 2.
No dead bodies lay unburied or uncovered; and if any one walked from one end of the city to another, no funeral, or sign of it, was to be seen in the daytime, except a little, as I have said, in the first three weeks in September. This last article, perhaps, will hardly be believed when some accounts which others have published since that shall be seen, wherein they say that the dead lay unburied, which I am sure was utterly false; at least, if it had been anywhere so, it must have been in houses where the living were gone from the dead, having found means, as I have observed, to escape, and where no notice was given to the officers. All which amounts to nothing at all in the case in hand; for this I am positive in, having myself been employed a little in the direction of that part of the parish in which I lived, and where as great a desolation was made, in proportion to the number of the inhabitants, as was anywhere. I say, I am sure that there were no dead bodies remained unburied; that is to say, none that the proper officers knew of, none for want of people to carry them off, and buriers to put them into the ground and cover them. And this is sufficient to the argument; for what might lie in houses and holes, as in Moses and Aaron Alley, is nothing, for it is most certain they were buried as soon as they were found. As to the first article, namely, of provisions, the scarcity or dearness, though I have mentioned it before, and shall speak of it again, yet I must observe here. 1. The price of bread in particular was not much raised; for in the beginning of the year, viz., in the first week in March, the penny wheaten loaf was ten ounces and a half, and in the height of the contagion it was to be had at nine ounces and a half, and never dearer, no, not all that season; and about the beginning of November it was sold at ten ounces and a half again, the like of which, I believe, was never heard of, in any city under so dreadful a visitation, before. 2. Neither was there, which I wondered much at, any want of bakers or ovens kept open to supply the people with bread; but this was indeed alleged by some families, viz., that their maidservants, going to the bakehouses with their dough to be baked, which was then the custom, sometimes came home with the sickness, that is to say, the plague, upon them. In all this dreadful visitation there were, as I have said before, but two pesthouses made use of; viz., one in the fields beyond Old Street, and one in Westminster. Neither was there any compulsion used in carrying people thither. Indeed, there was no need of compulsion in the case, for there were thousands of poor distressed people, who having no help, or conveniences, or supplies, but of charity, would have been very glad to have been carried thither and been taken care of; which, indeed, was the only thing that, I think, was wanting in the whole public management of the city, seeing nobody was here allowed to be brought to the pesthouse but where money was given, or security for money, either at their introducing,[252] or upon their being cured and sent out; for very many were sent out again whole, and very good physicians were appointed to those places; so that many people did very well there, of which I shall make mention again. 
The principal sort of people sent thither were, as I have said, servants, who got the distemper by going of errands to fetch necessaries for the families where they lived, and who, in that case, if they came home sick, were removed to preserve the rest of the house; and they were so well looked after there, in all the time of the visitation, that there was but one hundred and fifty-six buried in all at the London pesthouse, and one hundred and fifty-nine at that of Westminster. By having more pesthouses, I am far from meaning a forcing all people into such places. Had the shutting up of houses been omitted, and the sick hurried out of their dwellings to pesthouses, as some proposed it seems at that time as well as since, it[253] would certainly have been much worse than it was. The very removing the sick would have been a spreading of the infection, and the rather because that removing could not effectually clear the house where the sick person was of the distemper; and the rest of the family, being then left at liberty, would certainly spread it among others. The methods, also, in private families which would have been universally used to have concealed the distemper, and to have concealed the persons being sick, would have been such that the distemper would sometimes have seized a whole family before any visitors or examiners could have known of it. On the other hand, the prodigious numbers which would have been sick at a time would have exceeded all the capacity of public pesthouses to receive them, or of public officers to discover and remove them. This was well considered in those days, and I have heard them talk of it often. The magistrates had enough to do to bring people to submit to having their houses shut up; and many ways they deceived the watchmen, and got out, as I observed. But that difficulty made it apparent that they would have found it impracticable to have gone the other way to work; for they could never have forced the sick people out of their beds and out of their dwellings: it must not have been my lord mayor's officers, but an army of officers, that must have attempted it. And the people, on the other hand, would have been enraged and desperate, and would have killed those that should have offered to have meddled with them or with their children and relations, whatever had befallen them for it; so that they would have made the people (who, as it was, were in the most terrible distraction imaginable), I say, they would have made them stark mad: whereas the magistrates found it proper on several occasions to treat them with lenity and compassion, and not with violence and terror, such as dragging the sick out of their houses, or obliging them to remove themselves, would have been. This leads me again to mention the time when the plague first began,[254] that is to say, when it became certain that it would spread over the whole town, when, as I have said, the better sort of people first took the alarm, and began to hurry themselves out of town. It was true, as I observed in its place, that the throng was so great, and the coaches, horses, wagons, and carts were so many, driving and dragging the people away, that it looked as if all the city was running away; and had any regulations been published that had been terrifying at that time, especially such as would pretend to dispose of the people otherwise than they would dispose of themselves, it would have put both the city and suburbs into the utmost confusion. 
The magistrates wisely caused the people to be encouraged, made very good by-laws[255] for the regulating the citizens, keeping good order in the streets, and making everything as eligible as possible to all sorts of people. In the first place, the lord mayor and the sheriffs,[256] the court of aldermen, and a certain number of the common councilmen, or their deputies, came to a resolution, and published it; viz., that they would not quit the city themselves, but that they would be always at hand for the preserving good order in every place, and for doing justice on all occasions, as also for the distributing the public charity to the poor, and, in a word, for the doing the duty and discharging the trust reposed in them by the citizens, to the utmost of their power. In pursuance of these orders, the lord mayor, sheriffs, etc., held councils every day, more or less, for making such dispositions as they found needful for preserving the civil peace; and though they used the people with all possible gentleness and clemency, yet all manner of presumptuous rogues, such as thieves, housebreakers, plunderers of the dead or of the sick, were duly punished; and several declarations were continually published by the lord mayor and court of aldermen against such. Also all constables and churchwardens were enjoined to stay in the city upon severe penalties, or to depute such able and sufficient housekeepers as the deputy aldermen or common councilmen of the precinct should approve, and for whom they should give security, and also security, in case of mortality, that they would forthwith constitute other constables in their stead. These things reëstablished the minds of the people very much, especially in the first of their fright, when they talked of making so universal a flight that the city would have been in danger of being entirely deserted of its inhabitants, except the poor, and the country of being plundered and laid waste by the multitude. Nor were the magistrates deficient in performing their part as boldly as they promised it; for my lord mayor and the sheriffs were continually in the streets and at places of the greatest danger; and though they did not care for having too great a resort of people crowding about them, yet in emergent cases they never denied the people access to them, and heard with patience all their grievances and complaints. My lord mayor had a low gallery built on purpose in his hall, where he stood, a little removed from the crowd, when any complaint came to be heard, that he might appear with as much safety as possible. Likewise the proper officers, called my lord mayor's officers, constantly attended in their turns, as they were in waiting; and if any of them were sick or infected, as some of them were, others were instantly employed to fill up, and officiate in their places till it was known whether the other should live or die. In like manner the sheriffs and aldermen did,[257] in their several stations and wards, where they were placed by office; and the sheriff's officers or sergeants were appointed to receive orders from the respective aldermen in their turn; so that justice was executed in all cases without interruption. 
In the next place, it was one of their particular cares to see the orders for the freedom of the markets observed; and in this part either the lord mayor, or one or both of the sheriffs, were every market day on horseback to see their orders executed, and to see that the country people had all possible encouragement and freedom in their coming to the markets and going back again, and that no nuisance or frightful object should be seen in the streets to terrify them, or make them unwilling to come. Also the bakers were taken under particular order, and the master of the Bakers' Company was, with his court of assistants, directed to see the order of my lord mayor for their regulation put in execution, and the due assize[258] of bread, which was weekly appointed by my lord mayor, observed; and all the bakers were obliged to keep their ovens going constantly, on pain of losing the privileges of a freeman of the city of London. By this means, bread was always to be had in plenty, and as cheap as usual, as I said above; and provisions were never wanting in the markets, even to such a degree that I often wondered at it, and reproached myself with being so timorous and cautious in stirring abroad, when the country people came freely and boldly to market, as if there had been no manner of infection in the city, or danger of catching it. It was indeed one admirable piece of conduct in the said magistrates, that the streets were kept constantly clear and free from all manner of frightful objects, dead bodies, or any such things as were indecent or unpleasant; unless where anybody fell down suddenly, or died in the streets, as I have said above, and these were generally covered with some cloth or blanket, or removed into the next churchyard till night. All the needful works that carried terror with them, that were both dismal and dangerous, were done in the night. If any diseased bodies were removed, or dead bodies buried, or infected clothes burned, it was done in the night; and all the bodies which were thrown into the great pits in the several churchyards or burying grounds, as has been observed, were so removed in the night, and everything was covered and closed before day. So that in the daytime there was not the least signal of the calamity to be seen or heard of, except what was to be observed from the emptiness of the streets, and sometimes from the passionate outcries and lamentations of the people, out at their windows, and from the numbers of houses and shops shut up. Nor was the silence and emptiness of the streets so much in the city as in the outparts, except just at one particular time, when, as I have mentioned, the plague came east, and spread over all the city. It was indeed a merciful disposition of God, that as the plague began at one end of the town first, as has been observed at large, so it proceeded progressively to other parts, and did not come on this way, or eastward, till it had spent its fury in the west part of the town; and so as it came on one way it abated another. For example:-- It began at St. Giles's and the Westminster end of the town, and it was in its height in all that part by about the middle of July, viz., in St. Giles-in-the-Fields, St. Andrew's, Holborn, St. Clement's-Danes, St. Martin's-in-the-Fields, and in Westminster. The latter end of July it decreased in those parishes, and, coming east, it increased prodigiously in Cripplegate, St. Sepulchre's, St. James's, Clerkenwell, and St. Bride's and Aldersgate. 
While it was in all these parishes, the city and all the parishes of the Southwark side of the water, and all Stepney, Whitechapel, Aldgate, Wapping, and Ratcliff, were very little touched; so that people went about their business unconcerned, carried on their trades, kept open their shops, and conversed freely with one another in all the city, the east and northeast suburbs, and in Southwark, almost as if the plague had not been among us. Even when the north and northwest suburbs were fully infected, viz., Cripplegate, Clerkenwell, Bishopsgate, and Shoreditch, yet still all the rest were tolerably well. For example:--

From the 25th of July to the 1st of August the bill stood thus of all diseases:--

   St. Giles's, Cripplegate                 554
   St. Sepulchre's                          250
   Clerkenwell                              103
   Bishopsgate                              116
   Shoreditch                               110
   Stepney Parish                           127
   Aldgate                                   92
   Whitechapel                              104
   All the 97 parishes within the walls     228
   All the parishes in Southwark            205
                                          -----
                                          1,889

So that, in short, there died more that week in the two parishes of Cripplegate and St. Sepulchre's by forty-eight than all the city, all the east suburbs, and all the Southwark parishes put together. This caused the reputation of the city's health to continue all over England, and especially in the counties and markets adjacent, from whence our supply of provisions chiefly came, even much longer than that health itself continued; for when the people came into the streets from the country by Shoreditch and Bishopsgate, or by Old Street and Smithfield, they would see the outstreets empty, and the houses and shops shut, and the few people that were stirring there walk in the middle of the streets; but when they came within the city, there things looked better, and the markets and shops were open, and the people walking about the streets as usual, though not quite so many; and this continued till the latter end of August and the beginning of September.

But then the case altered quite; the distemper abated in the west and northwest parishes, and the weight of the infection lay on the city and the eastern suburbs, and the Southwark side, and this in a frightful manner. Then indeed the city began to look dismal, shops to be shut, and the streets desolate. In the High Street, indeed, necessity made people stir abroad on many occasions; and there would be in the middle of the day a pretty many[259] people, but in the mornings and evenings scarce any to be seen even there, no, not in Cornhill and Cheapside.

These observations of mine were abundantly confirmed by the weekly bills of mortality for those weeks, an abstract of which, as they respect the parishes which I have mentioned, and as they make the calculations I speak of very evident, take as follows. The weekly bill which makes out this decrease of the burials in the west and north side of the city stands thus:--

   St. Giles's, Cripplegate                 456
   St. Giles-in-the-Fields                  140
   Clerkenwell                               77
   St. Sepulchre's                          214
   St. Leonard, Shoreditch                  183
   Stepney Parish                           716
   Aldgate                                  629
   Whitechapel                              532
   In the 97 parishes within the walls    1,493
   In the 8 parishes on Southwark side    1,636
                                          -----
                                          6,076

Here is a strange change of things indeed, and a sad change it was; and, had it held for two months more than it did, very few people would have been left alive; but then such, I say, was the merciful disposition of God, that when it was thus, the west and north part, which had been so dreadfully visited at first, grew, as you see, much better; and, as the people disappeared here, they began to look abroad again there; and the next week or two altered it still more, that is, more to the encouragement of the other part of the town. For example:--

Sept. 19-26.

   St. Giles's, Cripplegate                 277
   St. Giles-in-the-Fields                  119
   Clerkenwell                               76
   St. Sepulchre's                          193
   St. Leonard, Shoreditch                  146
   Stepney Parish                           616
   Aldgate                                  496
   Whitechapel                              346
   In the 97 parishes within the walls    1,268
   In the 8 parishes on Southwark side    1,390
                                          -----
                                          4,927

Sept. 26-Oct. 3.

   St. Giles's, Cripplegate                 196
   St. Giles-in-the-Fields                   95
   Clerkenwell                               48
   St. Sepulchre's                          137
   St. Leonard, Shoreditch                  128
   Stepney Parish                           674
   Aldgate                                  372
   Whitechapel                              328
   In the 97 parishes within the walls    1,149
   In the 8 parishes on Southwark side    1,201
                                          -----
                                          4,328

And now the misery of the city, and of the said east and south parts, was complete indeed; for, as you see, the weight of the distemper lay upon those parts, that is to say, the city, the eight parishes over the river, with the parishes of Aldgate, Whitechapel, and Stepney, and this was the time that the bills came up to such a monstrous height as that I mentioned before, and that eight or nine, and, as I believe, ten or twelve thousand a week died; for it is my settled opinion that they[260] never could come at any just account of the numbers, for the reasons which I have given already. Nay, one of the most eminent physicians, who has since published in Latin an account of those times and of his observations, says that in one week there died twelve thousand people, and that particularly there died four thousand in one night; though I do not remember that there ever was any such particular night so remarkably fatal as that such a number died in it. However, all this confirms what I have said above of the uncertainty of the bills of mortality, etc., of which I shall say more hereafter.

And here let me take leave to enter again, though it may seem a repetition of circumstances, into a description of the miserable condition of the city itself, and of those parts where I lived, at this particular time. The city, and those other parts, notwithstanding the great numbers of people that were gone into the country, was[261] vastly full of people; and perhaps the fuller because people had for a long time a strong belief that the plague would not come into the city, nor into Southwark, no, nor into Wapping or Ratcliff at all; nay, such was the assurance of the people on that head, that many removed from the suburbs on the west and north sides into those eastern and south sides as for safety, and, as I verily believe, carried the plague amongst them there, perhaps sooner than they would otherwise have had it. Here, also, I ought to leave a further remark for the use of posterity, concerning the manner of people's infecting one another; namely, that it was not the sick people only from whom the plague was immediately received by others that were sound, but the well.
To explain myself: by the sick people, I mean those who were known to be sick, had taken their beds, had been under cure, or had swellings or tumors upon them, and the like. These everybody could beware of: they were either in their beds, or in such condition as could not be concealed. By the well, I mean such as had received the contagion, and had it really upon them and in their blood, yet did not show the consequences of it in their countenances; nay, even were not sensible of it themselves, as many were not for several days. These breathed death in every place, and upon everybody who came near them; nay, their very clothes retained the infection; their hands would infect the things they touched, especially if they were warm and sweaty, and they were generally apt to sweat, too. Now, it was impossible to know these people, nor did they sometimes, as I have said, know themselves to be infected. These were the people that so often dropped down and fainted in the streets; for oftentimes they would go about the streets to the last, till on a sudden they would sweat, grow faint, sit down at a door, and die. It is true, finding themselves thus, they would struggle hard to get home to their own doors, or at other times would be just able to go into their houses, and die instantly. Other times they would go about till they had the very tokens come out upon them, and yet not know it, and would die in an hour or two after they came home, but be well as long as they were abroad. These were the dangerous people; these were the people of whom the well people ought to have been afraid: but then, on the other side, it was impossible to know them. And this is the reason why it is impossible in a visitation to prevent the spreading of the plague by the utmost human vigilance; viz., that it is impossible to know the infected people from the sound, or that the infected people should perfectly know themselves. I knew a man who conversed freely in London all the season of the plague in 1665, and kept about him an antidote or cordial, on purpose to take when he thought himself in any danger; and he had such a rule to know, or have warning of the danger by, as indeed I never met with before or since: how far it may be depended on, I know not. He had a wound in his leg; and whenever he came among any people that were not sound, and the infection began to affect him, he said he could know it by that signal, viz., that the wound in his leg would smart, and look pale and white: so as soon as ever he felt it smart it was time for him to withdraw, or to take care of himself, taking his drink, which he always carried about him for that purpose. Now, it seems he found his wound would smart many times when he was in company with such who thought themselves to be sound, and who appeared so to one another; but he would presently rise up, and say publicly, "Friends, here is somebody in the room that has the plague," and so would immediately break up the company.
This was, indeed, a faithful monitor to all people, that the plague is not to be avoided by those that converse promiscuously in a town infected, and people have it when they know it not, and that they likewise give it to others when they know not that they have it themselves; and in this case, shutting up the well or removing the sick will not do it, unless they can go back and shut up all those that the sick had conversed with, even before they knew themselves to be sick; and none knows how far to carry that back, or where to stop, for none knows when, or where, or how, they may have received the infection, or from whom. This I take to be the reason which makes so many people talk of the air being corrupted and infected, and that they need not be cautious of whom they converse with, for that the contagion was in the air. I have seen them in strange agitations and surprises on this account. "I have never come near any infected body," says the disturbed person; "I have conversed with none but sound healthy people, and yet I have gotten the distemper." "I am sure I am struck from Heaven," says another, and he falls to the serious part.[262] Again the first goes on exclaiming, "I have come near no infection, or any infected person; I am sure it is in the air; we draw in death when we breathe, and therefore it is the hand of God: there is no withstanding it." And this at last made many people, being hardened to the danger, grow less concerned at it, and less cautious towards the latter end of the time, and when it was come to its height, than they were at first. Then, with a kind of a Turkish predestinarianism,[263] they would say, if it pleased God to strike them, it was all one whether they went abroad, or staid at home: they could not escape it. And therefore they went boldly about, even into infected houses and infected company, visited sick people, and, in short, lay in the beds with their wives or relations when they were infected. And what was the consequence but the same that is the consequence in Turkey, and in those countries where they do those things, namely, that they were infected too, and died by hundreds and thousands? I would be far from lessening the awe of the judgments of God, and the reverence to his providence, which ought always to be on our minds on such occasions as these. Doubtless the visitation itself is a stroke from Heaven upon a city, or country, or nation, where it falls; a messenger of his vengeance, and a loud call to that nation, or country, or city, to humiliation and repentance, according to that of the prophet Jeremiah (xviii. 7, 8): "At what instant I shall speak concerning a nation, and concerning a kingdom, to pluck up, and to pull down, and to destroy it; if that nation, against whom I have pronounced, turn from their evil, I will repent of the evil that I thought to do unto them." Now, to prompt due impressions of the awe of God on the minds of men on such occasions, and not to lessen them, it is that I have left those minutes upon record. I say, therefore, I reflect upon no man for putting the reason of those things upon the immediate hand of God and the appointment and direction of his providence; nay, on the contrary, there were many wonderful deliverances of persons from infection, and deliverances of persons when infected, which intimate singular and remarkable providence in the particular instances to which they refer; and I esteem my own deliverance to be one next to miraculous, and do record it with thankfulness. 
But when I am speaking of the plague as a distemper arising from natural causes, we must consider it as it was really propagated by natural means. Nor is it at all the less a judgment for its being under the conduct of human causes and effects; for as the Divine Power has formed the whole scheme of nature, and maintains nature in its course, so the same Power thinks fit to let his own actings with men, whether of mercy or judgment, to go on in the ordinary course of natural causes, and he is pleased to act by those natural causes as the ordinary means, excepting and reserving to himself, nevertheless, a power to act in a supernatural way when he sees occasion. Now it is evident, that, in the case of an infection, there is no apparent extraordinary occasion for supernatural operation; but the ordinary course of things appears sufficiently armed, and made capable of all the effects that Heaven usually directs by a contagion. Among these causes and effects, this of the secret conveyance of infection, imperceptible and unavoidable, is more than sufficient to execute the fierceness of divine vengeance, without putting it upon supernaturals and miracles. The acute, penetrating nature of the disease itself was such, and the infection was received so imperceptibly, that the most exact caution could not secure us while in the place; but I must be allowed to believe--and I have so many examples fresh in my memory to convince me of it, that I think none can resist their evidence,--I say, I must be allowed to believe that no one in this whole nation ever received the sickness or infection, but who received it in the ordinary way of infection from somebody, or the clothes, or touch, or stench of somebody, that was infected before. The manner of its first coming to London proves this also, viz., by goods brought over from Holland, and brought thither from the Levant; the first breaking of it out in a house in Longacre where those goods were carried and first opened; its spreading from that house to other houses by the visible unwary conversing with those who were sick, and the infecting the parish officers who were employed about persons dead; and the like. These are known authorities for this great foundation point, that it went on and proceeded from person to person, and from house to house, and no otherwise. In the first house that was infected, there died four persons. A neighbor, hearing the mistress of the first house was sick, went to visit her, and went home and gave the distemper to her family, and died, and all her household. A minister called to pray with the first sick person in the second house was said to sicken immediately, and die, with several more in his house. Then the physicians began to consider, for they did not at first dream of a general contagion; but the physicians being sent to inspect the bodies, they assured the people that it was neither more or less than the plague, with all its terrifying particulars, and that it threatened an universal infection; so many people having already conversed with the sick or distempered, and having, as might be supposed, received infection from them, that it would be impossible to put a stop to it. 
Here the opinion of the physicians agreed with my observation afterwards, namely, that the danger was spreading insensibly: for the sick could infect none but those that came within reach of the sick person; but that one man, who may have really received the infection, and knows it not, but goes abroad and about as a sound person, may give the plague to a thousand people, and they to greater numbers in proportion, and neither the person giving the infection, nor the persons receiving it, know anything of it, and perhaps not feel the effects of it for several days after. For example:-- Many persons, in the time of this visitation, never perceived that they were infected till they found, to their unspeakable surprise, the tokens come out upon them, after which they seldom lived six hours; for those spots they called the tokens were really gangrene spots, or mortified flesh, in small knobs as broad as a little silver penny, and hard as a piece of callus[264] or horn; so that when the disease was come up to that length, there was nothing could follow but certain death. And yet, as I said, they knew nothing of their being infected, nor found themselves so much as out of order, till those mortal marks were upon them. But everybody must allow that they were infected in a high degree before, and must have been so some time; and consequently their breath, their sweat, their very clothes, were contagious for many days before. This occasioned a vast variety of cases, which physicians would have much more opportunity to remember than I; but some came within the compass of my observation or hearing, of which I shall name a few. A certain citizen who had lived safe and untouched till the month of September, when the weight of the distemper lay more in the city than it had done before, was mighty cheerful, and something too bold, as I think it was, in his talk of how secure he was, how cautious he had been, and how he had never come near any sick body. Says another citizen, a neighbor of his, to him one day, "Do not be too confident, Mr. ----: it is hard to say who is sick and who is well; for we see men alive and well to outward appearance one hour, and dead the next."--"That is true," says the first man (for he was not a man presumptuously secure, but had escaped a long while; and men, as I have said above, especially in the city, began to be overeasy on that score),--"that is true," says he. "I do not think myself secure; but I hope I have not been in company with any person that there has been any danger in."--"No!" says his neighbor. "Was not you at the Bull Head Tavern in Gracechurch Street, with Mr. ----, the night before last?"--"Yes," says the first, "I was; but there was nobody there that we had any reason to think dangerous." Upon which his neighbor said no more, being unwilling to surprise him. But this made him more inquisitive, and, as his neighbor appeared backward, he was the more impatient; and in a kind of warmth says he aloud, "Why, he is not dead, is he?" Upon which his neighbor still was silent, but cast up his eyes, and said something to himself; at which the first citizen turned pale, and said no more but this, "Then I am a dead man too!" and went home immediately, and sent for a neighboring apothecary to give him something preventive, for he had not yet found himself ill. But the apothecary, opening his breast, fetched a sigh, and said no more but this, "Look up to God." And the man died in a few hours. 
Now, let any man judge from a case like this if it is possible for the regulations of magistrates, either by shutting up the sick or removing them, to stop an infection which spreads itself from man to man even while they are perfectly well, and insensible of its approach, and may be so for many days. It may be proper to ask here how long it may be supposed men might have the seeds of the contagion in them before it discovered[265] itself in this fatal manner, and how long they might go about seemingly whole, and yet be contagious to all those that came near them. I believe the most experienced physicians cannot answer this question directly any more than I can; and something an ordinary observer may take notice of which may pass their observation. The opinion of physicians abroad seems to be, that it may lie dormant in the spirits, or in the blood vessels, a very considerable time: why else do they exact a quarantine of those who come into their harbors and ports from suspected places? Forty days is, one would think, too long for nature to struggle with such an enemy as this, and not conquer it or yield to it; but I could not think by my own observation that they can be infected, so as to be contagious to others, above fifteen or sixteen days at farthest; and on that score it was, that when a house was shut up in the city, and any one had died of the plague, but nobody appeared to be ill in the family for sixteen or eighteen days after, they were not so strict but that they[266] would connive at their going privately abroad; nor would people be much afraid of them afterwards, but rather think they were fortified the better, having not been vulnerable when the enemy was in their house: but we sometimes found it had lain much longer concealed. Upon the foot of all these observations I must say, that, though Providence seemed to direct my conduct to be otherwise, it is my opinion, and I must leave it as a prescription, viz., that the best physic against the plague is to run away from it. I know people encourage themselves by saying, "God is able to keep us in the midst of danger, and able to overtake us when we think ourselves out of danger;" and this kept thousands in the town whose carcasses went into the great pits by cartloads, and who, if they had fled from the danger, had, I believe, been safe from the disaster: at least, 'tis probable they had been safe. And were this very fundamental[267] only duly considered by the people on any future occasion of this or the like nature, I am persuaded it would put them upon quite different measures for managing the people from those that they took in 1665, or than any that have been taken abroad that I have heard of: in a word, they would consider of separating the people into smaller bodies, and removing them in time farther from one another, and not let such a contagion as this, which is indeed chiefly dangerous to collected bodies of people, find a million of people in a body together, as was very near the case before, and would certainly be the case if it should ever appear again. The plague, like a great fire, if a few houses only are contiguous where it happens, can only[268] burn a few houses; or if it begins in a single, or, as we call it, a lone house, can only burn that lone house where it begins; but if it begins in a close-built town or city, and gets ahead, there its fury increases, it rages over the whole place, and consumes all it can reach. 
I could propose many schemes on the foot of which the government of this city, if ever they should be under the apprehension of such another enemy, (God forbid they should!) might ease themselves of the greatest part of the dangerous people that belong to them: I mean such as the begging, starving, laboring poor, and among them chiefly those who, in a case of siege, are called the useless mouths; who, being then prudently, and to their own advantage, disposed of, and the wealthy inhabitants disposing of themselves, and of their servants and children, the city and its adjacent parts would be so effectually evacuated that there would not be above a tenth part of its people left together for the disease to take hold upon. But suppose them to be a fifth part, and that two hundred and fifty thousand people were left; and if it did seize upon them, they would, by their living so much at large, be much better prepared to defend themselves against the infection, and be less liable to the effects of it, than if the same number of people lived close together in one smaller city, such as Dublin, or Amsterdam, or the like. It is true, hundreds, yea thousands, of families fled away at this last plague; but then of them many fled too late, and not only died in their flight, but carried the distemper with them into the countries where they went, and infected those whom they went among for safety; which confounded[269] the thing, and made that be a propagation of the distemper which was the best means to prevent it. And this, too, is evident of it, and brings me back to what I only hinted at before, but must speak more fully to here, namely, that men went about apparently well many days after they had the taint of the disease in their vitals, and after their spirits were so seized as that they could never escape it; and that, all the while they did so, they were dangerous to others. I say, this proves that so it was; for such people infected the very towns they went through, as well as the families they went among; and it was by that means that almost all the great towns in England had the distemper among them more or less, and always they would tell you such a Londoner or such a Londoner brought it down. It must not be omitted,[270] that when I speak of those people who were really thus dangerous, I suppose them to be utterly ignorant of their own condition; for if they really knew their circumstances to be such as indeed they were, they must have been a kind of willful murderers if they would have gone abroad among healthy people, and it would have verified indeed the suggestion which I mentioned above, and which I thought seemed untrue, viz., that the infected people were utterly careless as to giving the infection to others, and rather forward to do it than not; and I believe it was partly from this very thing that they raised that suggestion, which I hope was not really true in fact. I confess no particular case is sufficient to prove a general; but I could name several people, within the knowledge of some of their neighbors and families yet living, who showed the contrary to an extreme. 
One man, the master of a family in my neighborhood, having had the distemper, he thought he had it given him by a poor workman whom he employed, and whom he went to his house to see, or went for some work that he wanted to have finished; and he had some apprehensions even while he was at the poor workman's door, but did not discover it[271] fully; but the next day it discovered itself, and he was taken very ill, upon which he immediately caused himself to be carried into an outbuilding which he had in his yard, and where there was a chamber over a workhouse, the man being a brazier. Here he lay, and here he died, and would be tended by none of his neighbors but by a nurse from abroad, and would not suffer his wife, nor children, nor servants, to come up into the room, lest they should be infected, but sent them his blessing and prayers for them by the nurse, who spoke it to them at a distance; and all this for fear of giving them the distemper, and without which, he knew, as they were kept up, they could not have it. And here I must observe also that the plague, as I suppose all distempers do, operated in a different manner on differing constitutions. Some were immediately overwhelmed with it, and it came to violent fevers, vomitings, insufferable headaches, pains in the back, and so up to ravings and ragings with those pains; others with swellings and tumors in the neck or groin, or armpits, which, till they could be broke, put them into insufferable agonies and torment; while others, as I have observed, were silently infected, the fever preying upon their spirits insensibly, and they seeing little of it till they fell into swooning and faintings, and death without pain. I am not physician enough to enter into the particular reasons and manner of these differing effects of one and the same distemper, and of its differing operation in several bodies; nor is it my business here to record the observations which I really made, because the doctors themselves have done that part much more effectually than I can do, and because my opinion may in some things differ from theirs. I am only relating what I know, or have heard, or believe, of the particular cases, and what fell within the compass of my view, and the different nature of the infection as it appeared in the particular cases which I have related; but this may be added too, that though the former sort of those cases, namely, those openly visited, were the worst for themselves as to pain (I mean those that had such fevers, vomitings, headaches, pains, and swellings), because they died in such a dreadful manner, yet the latter had the worst state of the disease; for in the former they frequently recovered, especially if the swellings broke; but the latter was inevitable death. No cure, no help, could be possible; nothing could follow but death. And it was worse, also, to others; because, as above, it secretly and unperceived by others or by themselves, communicated death to those they conversed with, the penetrating poison insinuating itself into their blood in a manner which it was impossible to describe, or indeed conceive. This infecting and being infected without so much as its being known to either person is evident from two sorts of cases which frequently happened at that time; and there is hardly anybody living, who was in London during the infection, but must have known several of the cases of both sorts. 1. 
Fathers and mothers have gone about as if they had been well, and have believed themselves to be so, till they have insensibly infected and been the destruction of their whole families; which they would have been far from doing if they had had the least apprehensions of their being unsound and dangerous themselves. A family, whose story I have heard, was thus infected by the father, and the distemper began to appear upon some of them even before he found it upon himself; but, searching more narrowly, it appeared he had been infected some time, and, as soon as he found that his family had been poisoned by himself, he went distracted, and would have laid violent hands upon himself, but was kept from that by those who looked to him; and in a few days he died. 2. The other particular is, that many people, having been well to the best of their own judgment, or by the best observation which they could make of themselves for several days, and only finding a decay of appetite, or a light sickness upon their stomachs,--nay, some whose appetite has been strong, and even craving, and only a light pain in their heads,--have sent for physicians to know what ailed them, and have been found, to their great surprise, at the brink of death, the tokens upon them, or the plague grown up to an incurable height. It was very sad to reflect how such a person as this last mentioned above had been a walking destroyer, perhaps for a week or fortnight before that; how he had ruined those that he would have hazarded his life to save, and had been breathing death upon them, even perhaps in his tender kissing and embracings of his own children. Yet thus certainly it was, and often has been, and I could give many particular cases where it has been so. If, then, the blow is thus insensibly striking; if the arrow flies thus unseen, and cannot be discovered,--to what purpose are all the schemes for shutting up or removing the sick people? Those schemes cannot take place but upon those that appear to be sick or to be infected; whereas there are among them at the same time thousands of people who seem to be well, but are all that while carrying death with them into all companies which they come into. This frequently puzzled our physicians, and especially the apothecaries and surgeons, who knew not how to discover the sick from the sound. They all allowed that it was really so; that many people had the plague in their very blood, and preying upon their spirits, and were in themselves but walking putrefied carcasses, whose breath was infectious, and their sweat poison, and yet were as well to look on as other people, and even knew it not themselves,--I say they all allowed that it was really true in fact, but they knew not how to propose a discovery.[272] My friend Dr. Heath was of opinion that it might be known by the smell of their breath; but then, as he said, who durst smell to that breath for his information, since to know it he must draw the stench of the plague up into his own brain in order to distinguish the smell? I have heard it was the opinion of others that it might be distinguished by the party's breathing upon a piece of glass, where, the breath condensing, there might living creatures be seen by a microscope, of strange, monstrous, and frightful shapes, such as dragons, snakes, serpents, and devils, horrible to behold. 
But this I very much question the truth of, and we had no microscopes at that time, as I remember, to make the experiment with.[273] It was the opinion, also, of another learned man that the breath of such a person would poison and instantly kill a bird, not only a small bird, but even a cock or hen; and that, if it did not immediately kill the latter, it would cause them to be roupy,[274] as they call it; particularly that, if they had laid any eggs at that time, they would be all rotten. But those are opinions which I never found supported by any experiments, or heard of others that had seen it,[275] so I leave them as I find them, only with this remark, namely, that I think the probabilities are very strong for them. Some have proposed that such persons should breathe hard upon warm water, and that they would leave an unusual scum upon it, or upon several other things, especially such as are of a glutinous substance, and are apt to receive a scum, and support it. But, from the whole, I found that the nature of this contagion was such that it was impossible to discover it at all, or to prevent it spreading from one to another by any human skill. Here was indeed one difficulty, which I could never thoroughly get over to this time, and which there is but one way of answering that I know of, and it is this; viz., the first person that died of the plague was on December 20th, or thereabouts, 1664, and in or about Longacre: whence the first person had the infection was generally said to be from a parcel of silks imported from Holland, and first opened in that house. But after this we heard no more of any person dying of the plague, or of the distemper being in that place, till the 9th of February, which was about seven weeks after, and then one more was buried out of the same house. Then it was hushed, and we were perfectly easy as to the public for a great while; for there were no more entered in the weekly bill to be dead of the plague till the 22d of April, when there were two more buried, not out of the same house, but out of the same street; and, as near as I can remember, it was out of the next house to the first. This was nine weeks asunder; and after this we had no more till a fortnight, and then it broke out in several streets, and spread every way. Now, the question seems to lie thus: Where lay the seeds of the infection all this while? how came it to stop so long, and not stop any longer? Either the distemper did not come immediately by contagion from body to body, or, if it did, then a body may be capable to continue infected, without the disease discovering itself, many days, nay, weeks together; even not a quarantine[276] of days only, but a soixantine,[277]--not only forty days, but sixty days, or longer. It is true there was, as I observed at first, and is well known to many yet living, a very cold winter and a long frost, which continued three months; and this, the doctors say, might check the infection. But then the learned must allow me to say, that if, according to their notion, the disease was, as I may say, only frozen up, it would, like a frozen river, have returned to its usual force and current when it thawed; whereas the principal recess of this infection, which was from February to April, was after the frost was broken and the weather mild and warm. 
But there is another way of solving all this difficulty, which I think my own remembrance of the thing will supply; and that is, the fact is not granted, namely, that there died none in those long intervals, viz., from the 20th of December to the 9th of February, and from thence to the 22d of April. The weekly bills are the only evidence on the other side, and those bills were not of credit enough, at least with me, to support an hypothesis, or determine a question of such importance as this; for it was our received opinion at that time, and I believe upon very good grounds, that the fraud lay in the parish officers, searchers, and persons appointed to give account of the dead, and what diseases they died of; and as people were very loath at first to have the neighbors believe their houses were infected, so they gave money to procure, or otherwise procured, the dead persons to be returned as dying of other distempers; and this I know was practiced afterwards in many places, I believe I might say in all places where the distemper came, as will be seen by the vast increase of the numbers placed in the weekly bills under other articles[278] of diseases during the time of the infection. For example, in the months of July and August, when the plague was coming on to its highest pitch, it was very ordinary to have from a thousand to twelve hundred, nay, to almost fifteen hundred, a week, of other distempers. Not that the numbers of those distempers were really increased to such a degree; but the great number of families and houses where really the infection was, obtained the favor to have their dead be returned of other distempers, to prevent the shutting up their houses. For example:--

Dead of other Diseases besides the Plague.

  From the 18th to the 25th of July ........   942
  To the 1st of August .....................  1,004
  To the 8th ...............................  1,213
  To the 15th ..............................  1,439
  To the 22d ...............................  1,331
  To the 29th ..............................  1,394
  To the 5th of September ..................  1,264
  To the 12th ..............................  1,056
  To the 19th ..............................  1,132
  To the 26th ..............................    927

Now, it was not doubted but the greatest part of these, or a great part of them, were dead of the plague; but the officers were prevailed with to return them as above, and the numbers of some particular articles of distempers discovered is as follows:--

                  Aug. 1-8.  Aug. 8-15.  Aug. 15-22.  Aug. 22-29.
  Fever               314        353         348          383
  Spotted fever       174        190         166          165
  Surfeit              85         87          74           99
  Teeth                90        113         111          133
                      ---        ---         ---          ---
                      663        743         699          780

                  Aug. 29-Sept. 5.  Sept. 5-12.  Sept. 12-19.  Sept. 19-26.
  Fever                  364            332          309           268
  Spotted fever          157             97          101            65
  Surfeit                 68             45           49            36
  Teeth                  138            128          121           112
                         ---            ---          ---           ---
                         727            602          580           481

There were several other articles which bore a proportion to these, and which it is easy to perceive were increased on the same account; as aged,[279] consumptions, vomitings, imposthumes,[280] gripes, and the like, many of which were not doubted to be infected people; but as it was of the utmost consequence to families not to be known to be infected, if it was possible to avoid it, so they took all the measures they could to have it not believed, and if any died in their houses, to get them returned to the examiners, and by the searchers, as having died of other distempers. This, I say, will account for the long interval which, as I have said, was between the dying of the first persons that were returned in the bills to be dead of the plague, and the time when the distemper spread openly, and could not be concealed.
Besides, the weekly bills themselves at that time evidently discover this truth; for while there was no mention of the plague, and no increase after it had been mentioned, yet it was apparent that there was an increase of those distempers which bordered nearest upon it. For example, there were eight, twelve, seventeen, of the spotted fever in a week when there were none or but very few of the plague; whereas before, one, three, or four were the ordinary weekly numbers of that distemper. Likewise, as I observed before, the burials increased weekly in that particular parish and the parishes adjacent, more than in any other parish, although there were none set down of the plague; all which tell us that the infection was handed on, and the succession of the distemper really preserved, though it seemed to us at that time to be ceased, and to come again in a manner surprising. It might be, also, that the infection might remain in other parts of the same parcel of goods which at first it came in, and which might not be, perhaps, opened, or at least not fully, or in the clothes of the first infected person; for I cannot think that anybody could be seized with the contagion in a fatal and mortal degree for nine weeks together, and support his state of health so well as even not to discover it to themselves:[281] yet, if it were so, the argument is the stronger in favor of what I am saying, namely, that the infection is retained in bodies apparently well, and conveyed from them to those they converse with, while it is known to neither the one nor the other. Great were the confusions at that time upon this very account; and when people began to be convinced that the infection was received in this surprising manner from persons apparently well, they began to be exceeding shy and jealous of every one that came near them. Once, on a public day, whether a sabbath day or not I do not remember, in Aldgate Church, in a pew full of people, on a sudden one fancied she smelt an ill smell. Immediately she fancies the plague was in the pew, whispers her notion or suspicion to the next, then rises and goes out of the pew. It immediately took with the next, and so with them all; and every one of them, and of the two or three adjoining pews, got up and went out of the church, nobody knowing what it was offended them, or from whom. This immediately filled everybody's mouths with one preparation or other, such as the old women directed, and some, perhaps, as physicians directed, in order to prevent infection by the breath of others; insomuch, that if we came to go into a church when it was anything full of people, there would be such a mixture of smells at the entrance, that it was much more strong, though perhaps not so wholesome, than if you were going into an apothecary's or druggist's shop: in a word, the whole church was like a smelling bottle. In one corner it was all perfumes; in another, aromatics,[282] balsamics,[283] and a variety of drugs and herbs; in another, salts and spirits, as every one was furnished for their own preservation. 
Yet I observed that after people were possessed, as I have said, with the belief, or rather assurance, of the infection being thus carried on by persons apparently in health, the churches and meetinghouses were much thinner of people than at other times, before that, they used to be; for this is to be said of the people of London, that, during the whole time of the pestilence, the churches or meetings were never wholly shut up, nor did the people decline coming out to the public worship of God, except only in some parishes, when the violence of the distemper was more particularly in that parish at that time, and even then[284] no longer than it[285] continued to be so. Indeed, nothing was more strange than to see with what courage the people went to the public service of God, even at that time when they were afraid to stir out of their own houses upon any other occasion (this I mean before the time of desperation which I have mentioned already). This was a proof of the exceeding populousness of the city at the time of the infection, notwithstanding the great numbers that were gone into the country at the first alarm, and that fled out into the forests and woods when they were further terrified with the extraordinary increase of it. For when we came to see the crowds and throngs of people which appeared on the sabbath days at the churches, and especially in those parts of the town where the plague was abated, or where it was not yet come to its height, it was amazing. But of this I shall speak again presently. I return, in the mean time, to the article of infecting one another at first. Before people came to right notions of the infection and of infecting one another, people were only shy of those that were really sick. A man with a cap upon his head, or with cloths round his neck (which was the case of those that had swellings there),--such was indeed frightful; but when we saw a gentleman dressed, with his band[286] on, and his gloves in his hand, his hat upon his head, and his hair combed,--of such we had not the least apprehensions; and people conversed a great while freely, especially with their neighbors and such as they knew. But when the physicians assured us that the danger was as well from the sound (that is, the seemingly sound) as the sick, and that those people that thought themselves entirely free were oftentimes the most fatal; and that it came to be generally understood that people were sensible of it, and of the reason of it,--then, I say, they began to be jealous of everybody; and a vast number of people locked themselves up, so as not to come abroad into any company at all, nor suffer any that had been abroad in promiscuous company to come into their houses, or near them (at least not so near them as to be within the reach of their breath, or of any smell from them); and when they were obliged to converse at a distance with strangers, they would always have preservatives in their mouths and about their clothes, to repel and keep off the infection. It must be acknowledged that when people began to use these cautions they were less exposed to danger, and the infection did not break into such houses so furiously as it did into others before; and thousands of families were preserved, speaking with due reserve to the direction of Divine Providence, by that means. But it was impossible to beat anything into the heads of the poor. 
They went on with the usual impetuosity of their tempers, full of outcries and lamentations when taken, but madly careless of themselves, foolhardy, and obstinate, while they were well. Where they could get employment, they pushed into any kind of business, the most dangerous and the most liable to infection; and if they were spoken to, their answer would be, "I must trust to God for that. If I am taken, then I am provided for, and there is an end of me;" and the like. Or thus, "Why, what must I do? I cannot starve. I had as good have the plague as perish for want. I have no work: what could I do? I must do this, or beg." Suppose it was burying the dead, or attending the sick, or watching infected houses, which were all terrible hazards; but their tale was generally the same. It is true, necessity was a justifiable, warrantable plea, and nothing could be better; but their way of talk was much the same where the necessities were not the same. This adventurous conduct of the poor was that which brought the plague among them in a most furious manner; and this, joined to the distress of their circumstances when taken, was the reason why they died so by heaps; for I cannot say I could observe one jot of better husbandry[287] among them (I mean the laboring poor) while they were all well and getting money than there was before; but[288] as lavish, as extravagant, and as thoughtless for to-morrow as ever; so that when they came to be taken sick, they were immediately in the utmost distress, as well for want as for sickness, as well for lack of food as lack of health. The misery of the poor I had many occasions to be an eyewitness of, and sometimes, also, of the charitable assistance that some pious people daily gave to such, sending them relief and supplies, both of food, physic, and other help, as they found they wanted. And indeed it is a debt of justice due to the temper of the people of that day, to take notice here, that not only great sums, very great sums of money, were charitably sent to the lord mayor and aldermen for the assistance and support of the poor distempered people, but abundance of private people daily distributed large sums of money for their relief, and sent people about to inquire into the condition of particular distressed and visited families, and relieved them. Nay, some pious ladies were transported with zeal in so good a work, and so confident in the protection of Providence in discharge of the great duty of charity, that they went about in person distributing alms to the poor, and even visiting poor families, though sick and infected, in their very houses, appointing nurses to attend those that wanted attending, and ordering apothecaries and surgeons, the first to supply them with drugs or plasters, and such things as they wanted, and the last to lance and dress the swellings and tumors, where such were wanting; giving their blessing to the poor in substantial relief to them, as well as hearty prayers for them. I will not undertake to say, as some do, that none of those charitable people were suffered to fall under the calamity itself; but this I may say, that I never knew any one of them that miscarried, which I mention for the encouragement of others in case of the like distress; and doubtless if they that give to the poor lend to the Lord, and he will repay them, those that hazard their lives to give to the poor, and to comfort and assist the poor in such misery as this, may hope to be protected in the work. 
Nor was this charity so extraordinary eminent only in a few; but (for I cannot lightly quit this point) the charity of the rich, as well in the city and suburbs as from the country, was so great, that in a word a prodigious number of people, who must otherwise have perished for want as well as sickness, were supported and subsisted by it; and though I could never, nor I believe any one else, come to a full knowledge of what was so contributed, yet I do believe, that, as I heard one say that was a critical observer of that part,[289] there was not only many thousand pounds contributed, but many hundred thousand pounds, to the relief of the poor of this distressed, afflicted city. Nay, one man affirmed to me that he could reckon up above one hundred thousand pounds a week which was distributed by the churchwardens at the several parish vestries, by the lord mayor and the aldermen in the several wards and precincts, and by the particular direction of the court and of the justices respectively in the parts where they resided, over and above the private charity distributed by pious hands in the manner I speak of; and this continued for many weeks together. I confess this is a very great sum; but if it be true that there was distributed, in the parish of Cripplegate only, seventeen thousand eight hundred pounds in one week to the relief of the poor, as I heard reported, and which I really believe was true, the other may not be improbable. It was doubtless to be reckoned among the many signal good providences which attended this great city, and of which there were many other worth recording. I say, this was a very remarkable one, that it pleased God thus to move the hearts of the people in all parts of the kingdom so cheerfully to contribute to the relief and support of the poor at London; the good consequences of which were felt many ways, and particularly in preserving the lives and recovering the health of so many thousands, and keeping so many thousands of families from perishing and starving. And now I am talking of the merciful disposition of Providence in this time of calamity, I cannot but mention again, though I have spoken several times of it already on other accounts (I mean that of the progression of the distemper), how it began at one end of the town, and proceeded gradually and slowly from one part to another, and like a dark cloud that passes over our heads, which, as it thickens and overcasts the air at one end, clears up at the other end: so, while the plague went on raging from west to east, as it went forwards east, it abated in the west; by which means those parts of the town which were not seized, or who[290] were left, and where it had spent its fury, were (as it were) spared to help and assist the other: whereas, had the distemper spread itself over the whole city and suburbs at once, raging in all places alike, as it has done since in some places abroad, the whole body of the people must have been overwhelmed, and there would have died twenty thousand a day, as they say there did at Naples, nor would the people have been able to have helped or assisted one another. 
For it must be observed that where the plague was in its full force, there indeed the people were very miserable, and the consternation was inexpressible; but a little before it reached even to that place, or presently after it was gone, they were quite another sort of people; and I cannot but acknowledge that there was too much of that common temper of mankind to be found among us all at that time, namely, to forget the deliverance when the danger is past. But I shall come to speak of that part again. It must not be forgot here to take some notice of the state of trade during the time of this common calamity; and this with respect to foreign trade, as also to our home trade. As to foreign trade, there needs little to be said. The trading nations of Europe were all afraid of us. No port of France, or Holland, or Spain, or Italy, would admit our ships, or correspond with us. Indeed, we stood on ill terms with the Dutch, and were in a furious war with them, though in a bad condition to fight abroad, who had such dreadful enemies to struggle with at home. Our merchants were accordingly at a full stop. Their ships could go nowhere; that is to say, to no place abroad. Their manufactures and merchandise, that is to say, of our growth, would not be touched abroad. They were as much afraid of our goods as they were of our people; and indeed they had reason, for our woolen manufactures are as retentive of infection as human bodies, and, if packed up by persons infected, would receive the infection, and be as dangerous to the touch as a man would be that was infected; and therefore when any English vessel arrived in foreign countries, if they did take the goods on shore, they always caused the bales to be opened and aired in places appointed for that purpose. But from London they would not suffer them to come into port, much less to unload their goods, upon any terms whatever; and this strictness was especially used with them in Spain and Italy. In Turkey and the islands of the Arches,[291] indeed, as they are called, as well those belonging to the Turks as to the Venetians, they were not so very rigid. In the first there was no obstruction at all, and four ships which were then in the river loading for Italy (that is, for Leghorn and Naples) being denied product, as they call it, went on to Turkey, and were freely admitted to unlade their cargo without any difficulty, only that when they arrived there, some of their cargo was not fit for sale in that country, and other parts of it being consigned to merchants at Leghorn, the captains of the ships had no right nor any orders to dispose of the goods; so that great inconveniences followed to the merchants. But this was nothing but what the necessity of affairs required; and the merchants at Leghorn and Naples, having notice given them, sent again from thence to take care of the effects which were particularly consigned to those ports, and to bring back in other ships such as were improper for the markets at Smyrna[292] and Scanderoon.[293] The inconveniences in Spain and Portugal were still greater; for they would by no means suffer our ships, especially those from London, to come into any of their ports, much less to unlade. There was a report that one of our ships having by stealth delivered her cargo, among which were some bales of English cloth, cotton, kerseys, and such like goods, the Spaniards caused all the goods to be burned, and punished the men with death who were concerned in carrying them on shore. 
This I believe was in part true, though I do not affirm it; but it is not at all unlikely, seeing the danger was really very great, the infection being so violent in London. I heard likewise that the plague was carried into those countries by some of our ships, and particularly to the port of Faro, in the kingdom of Algarve,[294] belonging to the King of Portugal, and that several persons died of it there; but it was not confirmed. On the other hand, though the Spaniards and Portuguese were so shy of us, it is most certain that the plague, as has been said, keeping at first much at that end of the town next Westminster, the merchandising part of the town, such as the city and the waterside, was perfectly sound till at least the beginning of July, and the ships in the river till the beginning of August; for to the 1st of July there had died but seven within the whole city, and but sixty within the liberties; but one in all the parishes of Stepney, Aldgate, and Whitechapel, and but two in all the eight parishes of Southwark. But it was the same thing abroad, for the bad news was gone over the whole world, that the city of London was infected with the plague; and there was no inquiring there how the infection proceeded, or at which part of the town it was begun or was reached to. Besides, after it began to spread, it increased so fast, and the bills grew so high all on a sudden, that it was to no purpose to lessen the report of it, or endeavor to make the people abroad think it better than it was. The account which the weekly bills gave in was sufficient; and that there died two thousand to three or four thousand a week was sufficient to alarm the whole trading part of the world: and the following time being so dreadful also in the very city itself, put the whole world, I say, upon their guard against it. You may be sure also that the report of these things lost nothing in the carriage. The plague was itself very terrible, and the distress of the people very great, as you may observe of what I have said, but the rumor was infinitely greater; and it must not be wondered that our friends abroad, as my brother's correspondents in particular, were told there (namely, in Portugal and Italy, where he chiefly traded), that in London there died twenty thousand in a week; that the dead bodies lay unburied by heaps; that the living were not sufficient to bury the dead, or the sound to look after the sick; that all the kingdom was infected likewise, so that it was an universal malady such as was never heard of in those parts of the world. And they could hardly believe us when we gave them an account how things really were; and how there was not above one tenth part of the people dead; that there were five hundred thousand left that lived all the time in the town; that now the people began to walk the streets again, and those who were fled to return; there was no miss of the usual throng of people in the streets, except as every family might miss their relations and neighbors; and the like. I say, they could not believe these things; and if inquiry were now to be made in Naples, or in other cities on the coast of Italy, they would tell you there was a dreadful infection in London so many years ago, in which, as above, there died twenty thousand in a week, etc., just as we have had it reported in London that there was a plague in the city of Naples in the year 1656, in which there died twenty thousand people in a day, of which I have had very good satisfaction that it was utterly false. 
But these extravagant reports were very prejudicial to our trade, as well as unjust and injurious in themselves; for it was a long time after the plague was quite over before our trade could recover itself in those parts of the world; and the Flemings[295] and Dutch, but especially the last, made very great advantages of it, having all the market to themselves, and even buying our manufactures in the several parts of England where the plague was not, and carrying them to Holland and Flanders, and from thence transporting them to Spain and to Italy, as if they had been of their own making. But they were detected sometimes, and punished, that is to say, their goods confiscated, and ships also; for if it was true that our manufactures as well as our people were infected, and that it was dangerous to touch or to open and receive the smell of them, then those people ran the hazard, by that clandestine trade, not only of carrying the contagion into their own country, but also of infecting the nations to whom they traded with those goods; which, considering how many lives might be lost in consequence of such an action, must be a trade that no men of conscience could suffer themselves to be concerned in. I do not take upon me to say that any harm was done, I mean of that kind, by those people; but I doubt I need not make any such proviso in the case of our own country; for either by our people of London, or by the commerce, which made their conversing with all sorts of people in every county, and of every considerable town, necessary,--I say, by this means the plague was first or last spread all over the kingdom, as well in London as in all the cities and great towns, especially in the trading manufacturing towns and seaports: so that first or last all the considerable places in England were visited more or less, and the kingdom of Ireland in some places, but not so universally. How it fared with the people in Scotland, I had no opportunity to inquire. It is to be observed, that, while the plague continued so violent in London, the outports, as they are called, enjoyed a very great trade, especially to the adjacent countries and to our own plantations.[296] For example, the towns of Colchester, Yarmouth, and Hull, on that side[297] of England, exported to Holland and Hamburg the manufactures of the adjacent counties for several months after the trade with London was, as it were, entirely shut up. Likewise the cities of Bristol[298] and Exeter, with the port of Plymouth, had the like advantage to Spain, to the Canaries, to Guinea, and to the West Indies, and particularly to Ireland. But as the plague spread itself every way after it had been in London to such a degree as it was in August and September, so all or most of those cities and towns were infected first or last, and then trade was, as it were, under a general embargo, or at a full stop, as I shall observe further when I speak of our home trade. One thing, however, must be observed, that as to ships coming in from abroad (as many, you may be sure, did), some who were out in all parts of the world a considerable while before, and some who, when they went out, knew nothing of an infection, or at least of one so terrible,--these came up the river boldly, and delivered their cargoes as they were obliged to do, except just in the two months of August and September, when, the weight of the infection lying, as I may say, all below bridge, nobody durst appear in business for a while. 
But as this continued but for a few weeks, the homeward-bound ships, especially such whose cargoes were not liable to spoil, came to an anchor, for a time, short of the Pool, or freshwater part of the river, even as low as the river Medway, where several of them ran in; and others lay at the Nore, and in the Hope below Gravesend: so that by the latter end of October there was a very great fleet of homeward-bound ships to come up, such as the like had not been known for many years. Two particular trades were carried on by water carriage all the while of the infection, and that with little or no interruption, very much to the advantage and comfort of the poor distressed people of the city; and those were the coasting trade for corn, and the Newcastle trade for coals. The first of these was particularly carried on by small vessels from the port of Hull, and other places in the Humber, by which great quantities of corn were brought in from Yorkshire and Lincolnshire; the other part of this corn trade was from Lynn in Norfolk, from Wells, and Burnham, and from Yarmouth, all in the same county; and the third branch was from the river Medway, and from Milton, Feversham, Margate, and Sandwich, and all the other little places and ports round the coast of Kent and Essex.[299] There was also a very good trade from the coast of Suffolk, with corn, butter, and cheese. These vessels kept a constant course of trade, and without interruption came up to that market known still by the name of Bear Key, where they supplied the city plentifully with corn when land carriage began to fail, and when the people began to be sick of coming from many places in the country. This also was much of it owing to the prudence and conduct of the lord mayor, who took such care to keep the masters and seamen from danger when they came up, causing their corn to be bought off at any time they wanted a market (which, however, was very seldom), and causing the cornfactors[300] immediately to unlade and deliver the vessels laden with corn, that they had very little occasion to come out of their ships or vessels, the money being always carried on board to them, and put into a pail of vinegar before it was carried.

The second trade was that of coals from Newcastle-upon-Tyne, without which the city would have been greatly distressed; for not in the streets only, but in private houses and families, great quantities of coal were then burnt, even all the summer long, and when the weather was hottest, which was done by the advice of the physicians. Some, indeed, opposed it, and insisted that to keep the houses and rooms hot was a means to propagate the distemper, which was a fermentation and heat already in the blood; that it was known to spread and increase in hot weather, and abate in cold; and therefore they alleged that all contagious distempers are the worst for heat, because the contagion was nourished, and gained strength, in hot weather, and was, as it were, propagated in heat.
Others said they granted that heat in the climate might propagate infection, as sultry hot weather fills the air with vermin, and nourishes innumerable numbers and kinds of venomous creatures, which breed in our food, in the plants, and even in our bodies, by the very stench of which infection may be propagated; also that heat in the air, or heat of weather, as we ordinarily call it, makes bodies relax and faint, exhausts the spirits, opens the pores, and makes us more apt to receive infection or any evil influence, be it from noxious, pestilential vapors, or any other thing in the air; but that the heat of fire, and especially of coal fires, kept in our houses or near us, had quite a different operation, the heat being not of the same kind, but quick and fierce, tending not to nourish, but to consume and dissipate, all those noxious fumes which the other kind of heat rather exhaled, and stagnated than separated, and burnt up. Besides, it was alleged that the sulphureous and nitrous particles that are often found to be in the coal, with that bituminous substance which burns, are all assisting to clear and purge the air, and render it wholesome and safe to breathe in, after the noxious particles (as above) are dispersed and burnt up. The latter opinion prevailed at that time, and, as I must confess, I think with good reason; and the experience of the citizens confirmed it, many houses which had constant fires kept in the rooms having never been infected at all; and I must join my experience to it, for I found the keeping of good fires kept our rooms sweet and wholesome, and I do verily believe made our whole family so, more than would otherwise have been. But I return to the coals as a trade. It was with no little difficulty that this trade was kept open, and particularly because, as we were in an open war with the Dutch at that time, the Dutch capers[301] at first took a great many of our collier ships, which made the rest cautious, and made them to stay to come in fleets together. But after some time the capers were either afraid to take them, or their masters, the States, were afraid they should, and forbade them, lest the plague should be among them, which made them fare the better. For the security of those northern traders, the coal ships were ordered by my lord mayor not to come up into the Pool above a certain number at a time; and[302] ordered lighters and other vessels, such as the woodmongers (that is, the wharf keepers) or coal sellers furnished, to go down and take out the coals as low as Deptford and Greenwich, and some farther down. Others delivered great quantities of coals in particular places where the ships could come to the shore, as at Greenwich, Blackwall, and other places, in vast heaps, as if to be kept for sale; but[303] were then fetched away after the ships which brought them were gone; so that the seamen had no communication with the river men, nor so much as came near one another.[304] Yet all this caution could not effectually prevent the distemper getting among the colliery, that is to say, among the ships, by which a great many seamen died of it; and that which was still worse was, that they carried it down to Ipswich and Yarmouth, to Newcastle-upon-Tyne, and other places on the coast, where, especially at Newcastle and at Sunderland, it carried off a great number of people. 
The making so many fires as above did indeed consume an unusual quantity of coals; and that upon one or two stops of the ships coming up (whether by contrary weather or by the interruption of enemies, I do not remember); but the price of coals was exceedingly dear, even as high as four pounds a chaldron;[305] but it soon abated when the ships came in, and, as afterwards they had a freer passage, the price was very reasonable all the rest of that year. The public fires which were made on these occasions, as I have calculated it, must necessarily have cost the city about two hundred chaldron of coals a week, if they had continued, which was indeed a very great quantity; but as it was thought necessary, nothing was spared. However, as some of the physicians cried them down, they were not kept alight above four or five days. The fires were ordered thus:--

One at the Custom House; one at Billingsgate; one at Queenhithe, and one at the Three Cranes; one in Blackfriars, and one at the gate of Bridewell; one at the corner of Leadenhall Street and Gracechurch; one at the north and one at the south gate of the Royal Exchange; one at Guildhall, and one at Blackwell Hall gate; one at the lord mayor's door in St. Helen's; one at the west entrance into St. Paul's; and one at the entrance into Bow Church. I do not remember whether there was any at the city gates, but one at the bridge foot there was, just by St. Magnus Church.

I know some have quarreled since that at the experiment, and said that there died the more people because of those fires; but I am persuaded those that say so offer no evidence to prove it, neither can I believe it on any account whatever.

It remains to give some account of the state of trade at home in England during this dreadful time, and particularly as it relates to the manufactures and the trade in the city. At the first breaking out of the infection there was, as it is easy to suppose, a very great fright among the people, and consequently a general stop of trade, except in provisions and necessaries of life; and even in those things, as there was a vast number of people fled and a very great number always sick, besides the number which died, so there could not be above two thirds, if above one half, of the consumption of provisions in the city as used to be. It pleased God to send a very plentiful year of corn and fruit, and not of hay or grass, by which means bread was cheap by reason of the plenty of corn, flesh was cheap by reason of the scarcity of grass, but butter and cheese were dear for the same reason; and hay in the market, just beyond Whitechapel Bars, was sold at four pounds per load; but that affected not the poor. There was a most excessive plenty of all sorts of fruit, such as apples, pears, plums, cherries, grapes; and they were the cheaper because of the wants of the people; but this made the poor eat them to excess, and this brought them into surfeits and the like, which often precipitated them into the plague. But to come to matters of trade. First, foreign exportation being stopped, or at least very much interrupted and rendered difficult, a general stop of all those manufactures followed of course, which were usually brought for exportation; and, though sometimes merchants abroad were importunate for goods, yet little was sent, the passages being so generally stopped that the English ships would not be admitted, as is said already, into their port.
This put a stop to the manufactures that were for exportation in most parts of England, except in some outports; and even that was soon stopped, for they all had the plague in their turn. But though this was felt all over England, yet, what was still worse, all intercourse of trade for home consumption of manufactures, especially those which usually circulated through the Londoners' hands, was stopped at once, the trade of the city being stopped. All kinds of handicrafts in the city, etc., tradesmen and mechanics, were, as I have said before, out of employ; and this occasioned the putting off and dismissing an innumerable number of journeymen and workmen of all sorts, seeing nothing was done relating to such trades but what might be said to be absolutely necessary. This caused the multitude of single people in London to be unprovided for, as also of families whose living depended upon the labor of the heads of those families. I say, this reduced them to extreme misery; and I must confess it is for the honor of the city of London, and will be for many ages, as long as this is to be spoken of, that they were able to supply with charitable provision the wants of so many thousands of those as afterwards fell sick and were distressed; so that it may be safely averred that nobody perished for want, at least that the magistrates had any notice given them of. This stagnation of our manufacturing trade in the country would have put the people there to much greater difficulties, but that the master workmen, clothiers, and others, to the uttermost of their stocks and strength, kept on making their goods to keep the poor at work, believing that, as soon as the sickness should abate, they would have a quick demand in proportion to the decay of their trade at that time; but as none but those masters that were rich could do thus, and that many were poor and not able, the manufacturing trade in England suffered greatly, and the poor were pinched all over England by the calamity of the city of London only. It is true that the next year made them full amends by another terrible calamity upon the city; so that the city by one calamity impoverished and weakened the country, and by another calamity (even terrible, too, of its kind) enriched the country, and made them again amends: for an infinite quantity of household stuff, wearing apparel, and other things, besides whole warehouses filled with merchandise and manufactures, such as come from all parts of England, were consumed in the fire of London the next year after this terrible visitation. It is incredible what a trade this made all over the whole kingdom, to make good the want, and to supply that loss; so that, in short, all the manufacturing hands in the nation were set on work, and were little enough for several years to supply the market, and answer the demands. All foreign markets also were empty of our goods, by the stop which had been occasioned by the plague, and before an open trade was allowed again; and the prodigious demand at home falling in, joined to make a quick vent[306] for all sorts of goods; so that there never was known such a trade all over England, for the time, as was in the first seven years after the plague, and after the fire of London. It remains now that I should say something of the merciful part of this terrible judgment. The last week in September, the plague being come to its crisis, its fury began to assuage. I remember my friend Dr. 
Heath, coming to see me the week before, told me he was sure the violence of it would assuage in a few days; but when I saw the weekly bill of that week, which was the highest of the whole year, being 8,297 of all diseases, I upbraided him with it, and asked him what he had made his judgment from. His answer, however, was not so much to seek[307] as I thought it would have been. "Look you," says he: "by the number which are at this time sick and infected, there should have been twenty thousand dead the last week, instead of eight thousand, if the inveterate mortal contagion had been as it was two weeks ago; for then it ordinarily killed in two or three days, now not under eight or ten; and then not above one in five recovered, whereas I have observed that now not above two in five miscarry. And observe it from me, the next bill will decrease, and you will see many more people recover than used to do; for though a vast multitude are now everywhere infected, and as many every day fall sick, yet there will not so many die as there did, for the malignity of the distemper is abated;" adding that he began now to hope, nay, more than hope, that the infection had passed its crisis, and was going off. And accordingly so it was; for the next week being, as I said, the last in September, the bill decreased almost two thousand. It is true, the plague was still at a frightful height, and the next bill was no less than 6,460, and the next to that 5,720; but still my friend's observation was just, and it did appear the people did recover faster, and more in number, than they used to do; and indeed if it had not been so, what had been the condition of the city of London? For, according to my friend, there were not fewer than 60,000 people at that time infected, whereof, as above, 20,477 died, and near 40,000 recovered; whereas, had it been as it was before, 50,000 of that number would very probably have died, if not more, and 50,000 more would have sickened; for in a word the whole mass of people began to sicken, and it looked as if none would escape. But this remark of my friend's appeared more evident in a few weeks more; for the decrease went on, and another week in October it decreased 1,843, so that the number dead of the plague was but 2,665; and the next week it decreased 1,413 more, and yet it was seen plainly that there was abundance of people sick, nay, abundance more than ordinary, and abundance fell sick every day; but, as above, the malignity of the disease abated. Such is the precipitant disposition of our people (whether it is so or not all over the world, that is none of my particular business to inquire; but I saw it apparently here), that, as upon the first sight of the infection they shunned one another, and fled from one another's houses and from the city with an unaccountable, and, as I thought, unnecessary fright, so now, upon this notion spreading, viz., that the distemper was not so catching as formerly, and that if it was catched it was not so mortal, and seeing abundance of people who really fell sick recover again daily, they took to such a precipitant courage, and grew so entirely regardless of themselves and of the infection, that they made no more of the plague than of an ordinary fever, nor indeed so much. 
They not only went boldly into company with those who had tumors and carbuncles upon them that were running, and consequently contagious, but eat and drank with them, nay, into their houses to visit them, and even, as I was told, into their very chambers where they lay sick. This I could not see rational. My friend Dr. Heath allowed, and it was plain to experience, that the distemper was as catching as ever, and as many fell sick, but only he alleged that so many of those that fell sick did not die; but I think that while many did die, and that at best the distemper itself was very terrible, the sores and swellings very tormenting, and the danger of death not left out of the circumstance of sickness, though not so frequent as before,--all those things, together with the exceeding tediousness of the cure, the loathsomeness of the disease, and many other articles, were enough to deter any man living from a dangerous mixture[308] with the sick people, and make them[309] as anxious almost to avoid the infection as before. Nay, there was another thing which made the mere catching of the distemper frightful, and that was the terrible burning of the caustics which the surgeons laid on the swellings to bring them to break and to run; without which the danger of death was very great, even to the last; also the insufferable torment of the swellings, which, though it might not make people raving and distracted, as they were before, and as I have given several instances of already, yet they put the patient to inexpressible torment; and those that fell into it, though they did escape with life, yet they made bitter complaints of those that had told them there was no danger, and sadly repented their rashness and folly in venturing to run into the reach of it. Nor did this unwary conduct of the people end here; for a great many that thus cast off their cautions suffered more deeply still, and though many escaped, yet many died; and at least it[310] had this public mischief attending it, that it made the decrease of burials slower than it would otherwise have been; for, as this notion ran like lightning through the city, and the people's heads were possessed with it, even as soon as the first great decrease in the bills appeared, we found that the two next bills did not decrease in proportion: the reason I take to be the people's running so rashly into danger, giving up all their former cautions and care, and all shyness which they used to practice, depending that the sickness would not reach them, or that, if it did, they should not die. The physicians opposed this thoughtless humor of the people with all their might, and gave out printed directions, spreading them all over the city and suburbs, advising the people to continue reserved, and to use still the utmost caution in their ordinary conduct, notwithstanding the decrease of the distemper; terrifying them with the danger of bringing a relapse upon the whole city, and telling them how such a relapse might be more fatal and dangerous than the whole visitation that had been already; with many arguments and reasons to explain and prove that part to them, and which are too long to repeat here. But it was all to no purpose. 
The audacious creatures were so possessed with the first joy, and so surprised with the satisfaction of seeing a vast decrease in the weekly bills, that they were impenetrable by any new terrors, and would not be persuaded but that the bitterness of death was passed; and it was to no more purpose to talk to them than to an east wind; but they opened shops, went about streets, did business, and conversed with anybody that came in their way to converse with, whether with business or without, neither inquiring of their health, or so much as being apprehensive of any danger from them, though they knew them not to be sound. This imprudent, rash conduct cost a great many their lives who had with great care and caution shut themselves up, and kept retired, as it were, from all mankind, and had by that means, under God's providence, been preserved through all the heat of that infection. This rash and foolish conduct of the people went so far, that the ministers took notice to them of it, and laid before them both the folly and danger of it; and this checked it a little, so that they grew more cautious. But it had another effect, which they could not check: for as the first rumor had spread, not over the city only, but into the country, it had the like effect; and the people were so tired with being so long from London, and so eager to come back, that they flocked to town without fear or forecast, and began to show themselves in the streets as if all the danger was over. It was indeed surprising to see it; for though there died still from a thousand to eighteen hundred a week, yet the people flocked to town as if all had been well. The consequence of this was, that the bills increased again four hundred the very first week in November; and, if I might believe the physicians, there were above three thousand fell sick that week, most of them newcomers too. One John Cock, a barber in St. Martin's-le-Grand, was an eminent example of this (I mean of the hasty return of the people when the plague was abated). This John Cock had left the town with his whole family, and locked up his house, and was gone into the country, as many others did; and, finding the plague so decreased in November that there died but 905 per week of all diseases, he ventured home again. He had in his family ten persons; that is to say, himself and wife, five children, two apprentices, and a maidservant. He had not been returned to his house above a week, and began to open his shop and carry on his trade, but the distemper broke out in his family, and within about five days they all died except one: that is to say, himself, his wife, all his five children, and his two apprentices; and only the maid remained alive. But the mercy of God was greater to the rest than we had reason to expect; for the malignity, as I have said, of the distemper was spent, the contagion was exhausted, and also the wintry weather came on apace, and the air was clear and cold, with some sharp frosts; and this increasing still, most of those that had fallen sick recovered, and the health of the city began to return. There were indeed some returns of the distemper, even in the month of December, and the bills increased near a hundred; but it went off again, and so in a short while things began to return to their own channel. And wonderful it was to see how populous the city was again all on a sudden; so that a stranger could not miss the numbers that were lost, neither was there any miss of the inhabitants as to their dwellings. 
Few or no empty houses were to be seen, or, if there were some, there was no want of tenants for them. I wish I could say, that, as the city had a new face, so the manners of the people had a new appearance. I doubt not but there were many that retained a sincere sense of their deliverance, and that were heartily thankful to that Sovereign Hand that had protected them in so dangerous a time. It would be very uncharitable to judge otherwise in a city so populous, and where the people were so devout as they were here in the time of the visitation itself; but, except what of this was to be found in particular families and faces, it must be acknowledged that the general practice of the people was just as it was before, and very little difference was to be seen. Some, indeed, said things were worse; that the morals of the people declined from this very time; that the people, hardened by the danger they had been in, like seamen after a storm is over, were more wicked and more stupid, more bold and hardened in their vices and immoralities, than they were before; but I will not carry it so far, neither. It would take up a history of no small length to give a particular of all the gradations by which the course of things in this city came to be restored again, and to run in their own channel as they did before. Some parts of England were now infected as violently as London had been. The cities of Norwich, Peterborough, Lincoln, Colchester, and other places, were now visited, and the magistrates of London began to set rules for our conduct as to corresponding with those cities. It is true, we could not pretend to forbid their people coming to London, because it was impossible to know them asunder; so, after many consultations, the lord mayor and court of aldermen were obliged to drop it. All they could do was to warn and caution the people not to entertain in their houses, or converse with, any people who they knew came from such infected places. But they might as well have talked to the air; for the people of London thought themselves so plague-free now, that they were past all admonitions. They seemed to depend upon it that the air was restored, and that the air was like a man that had had the smallpox,--not capable of being infected again. This revived that notion that the infection was all in the air; that there was no such thing as contagion from the sick people to the sound; and so strongly did this whimsey prevail among people, that they run altogether promiscuously, sick and well. Not the Mohammedans, who, prepossessed with the principle of predestination, value[311] nothing of contagion, let it be in what it will, could be more obstinate than the people of London. They that were perfectly sound, and came out of the wholesome air, as we call it, into the city, made nothing of going into the same houses and chambers, nay, even into the same beds, with those that had the distemper upon them, and were not recovered. Some, indeed, paid for their audacious boldness with the price of their lives. 
An infinite number fell sick, and the physicians had more work than ever, only with this difference, that more of their patients recovered, that is to say, they generally recovered; but certainly there were more people infected and fell sick now, when there did not die above a thousand or twelve hundred a week, than there was[312] when there died five or six thousand a week, so entirely negligent were the people at that time in the great and dangerous case of health and infection, and so ill were they able to take or except[313] of the advice of those who cautioned them for their good.

The people being thus returned, as it were, in general, it was very strange to find, that, in their inquiring after their friends, some whole families were so entirely swept away that there was no remembrance of them left. Neither was anybody to be found to possess or show any title to that little they had left; for in such cases what was to be found was generally embezzled and purloined, some gone one way, some another. It was said such abandoned effects came to the King as the universal heir; upon which we are told, and I suppose it was in part true, that the King granted all such as deodands[314] to the lord mayor and court of aldermen of London, to be applied to the use of the poor, of whom there were very many.

For it is to be observed, that though the occasions of relief and the objects of distress were very many more in the time of the violence of the plague than now, after all was over, yet the distress of the poor was more now a great deal than it was then, because all the sluices of general charity were shut. People supposed the main occasion to be over, and so stopped their hands; whereas particular objects were still very moving, and the distress of those that were poor was very great indeed.

Though the health of the city was now very much restored, yet foreign trade did not begin to stir; neither would foreigners admit our ships into their ports for a great while. As for the Dutch, the misunderstandings between our court and them had broken out into a war the year before, so that our trade that way was wholly interrupted; but Spain and Portugal, Italy and Barbary,[315] as also Hamburg, and all the ports in the Baltic,--these were all shy of us a great while, and would not restore trade with us for many months.

The distemper sweeping away such multitudes, as I have observed, many if not all of the outparishes were obliged to make new burying grounds, besides that I have mentioned in Bunhill Fields, some of which were continued, and remain in use to this day; but others were left off, and, which I confess I mention with some reflection,[316] being converted into other uses, or built upon afterwards, the dead bodies were disturbed, abused, dug up again, some even before the flesh of them was perished from the bones, and removed like dung or rubbish to other places. Some of those which came within the reach of my observations are as follows:--

First, A piece of ground beyond Goswell Street, near Mountmill, being some of the remains of the old lines or fortifications of the city, where abundance were buried promiscuously from the parishes of Aldersgate, Clerkenwell, and even out of the city. This ground, as I take it, was since[317] made a physic garden,[318] and, after[319] that, has been built upon.

Second, A piece of ground just over the Black Ditch, as it was then called, at the end of Holloway Lane, in Shoreditch Parish. It has been since made a yard for keeping hogs and for other ordinary uses, but is quite out of use as a burying ground.

Third, The upper end of Hand Alley, in Bishopsgate Street, which was then a green field, and was taken in particularly for Bishopsgate Parish, though many of the carts out of the city brought their dead thither also, particularly out of the parish of St. Allhallows-on-the-Wall.

This place I cannot mention without much regret. It was, as I remember, about two or three years after the plague was ceased, that Sir Robert Clayton[320] came to be possessed of the ground. It was reported, how true I know not, that it fell to the King for want of heirs (all those who had any right to it being carried off by the pestilence), and that Sir Robert Clayton obtained a grant of it from King Charles II. But however he came by it, certain it is the ground was let out to build on, or built upon by his order. The first house built upon it was a large fair house, still standing, which faces the street or way now called Hand Alley, which, though called an alley, is as wide as a street. The houses in the same row with that house northward are built on the very same ground where the poor people were buried; and the bodies, on opening the ground for the foundations, were dug up, some of them remaining so plain to be seen, that the women's skulls were distinguished by their long hair, and of others the flesh was not quite perished; so that the people began to exclaim loudly against it, and some suggested that it might endanger a return of the contagion; after which the bones and bodies, as fast as they[321] came at them, were carried to another part of the same ground, and thrown altogether into a deep pit, dug on purpose, which now is to be known[322] in that it is not built on, but is a passage to another house at the upper end of Rose Alley, just against the door of a meetinghouse, which has been built there many years since; and the ground is palisadoed[323] off from the rest of the passage in a little square. There lie the bones and remains of near two thousand bodies, carried by the dead carts to their grave in that one year.

Fourth, Besides this, there was a piece of ground in Moorfields, by the going into the street which is now called Old Bethlem, which was enlarged much, though not wholly taken in, on the same occasion.

N.B. The author of this journal lies buried in that very ground, being at his own desire, his sister having been buried there a few years before.

Fifth, Stepney Parish, extending itself from the east part of London to the north, even to the very edge of Shoreditch churchyard, had a piece of ground taken in to bury their dead, close to the said churchyard; and which, for that very reason, was left open, and is since, I suppose, taken into the same churchyard. And they had also two other burying places in Spittlefields,--one where since a chapel or tabernacle has been built for ease to this great parish, and another in Petticoat Lane.

There were no less than five other grounds made use of for the parish of Stepney at that time; one where now stands the parish church of St. Paul, Shadwell, and the other where now stands the parish church of St. John, at Wapping, both which had not the names of parishes at that time, but were belonging to Stepney Parish.

I could name many more; but these coming within my particular knowledge, the circumstance, I thought, made it of use to record them.
From the whole, it may be observed that they were obliged in this time of distress to take in new burying grounds in most of the outparishes for laying the prodigious numbers of people which died in so short a space of time; but why care was not taken to keep those places separate from ordinary uses, that so the bodies might rest undisturbed, that I cannot answer for, and must confess I think it was wrong. Who were to blame, I know not. I should have mentioned that the Quakers[324] had at that time also a burying ground set apart to their use, and which they still make use of; and they had also a particular dead cart to fetch their dead from their houses. And the famous Solomon Eagle, who, as I mentioned before,[325] had predicted the plague as a judgment, and run naked through the streets, telling the people that it was come upon them to punish them for their sins, had his own wife died[326] the very next day of the plague, and was carried, one of the first, in the Quakers' dead cart to their new burying ground. I might have thronged this account with many more remarkable things which occurred in the time of the infection, and particularly what passed between the lord mayor and the court, which was then at Oxford, and what directions were from time to time received from the government for their conduct on this critical occasion; but really the court concerned themselves so little, and that little they did was of so small import, that I do not see it of much moment to mention any part of it here, except that of appointing a monthly fast in the city, and the sending the royal charity to the relief of the poor, both which I have mentioned before. Great was the reproach thrown upon those physicians who left their patients during the sickness; and, now they came to town again, nobody cared to employ them. They were called deserters, and frequently bills were set up on their doors, and written, "Here is a doctor to be let!" So that several of those physicians were fain for a while to sit still and look about them, or at least remove their dwellings and set up in new places and among new acquaintance. The like was the case with the clergy, whom the people were indeed very abusive to, writing verses and scandalous reflections upon them; setting upon the church door, "Here is a pulpit to be let," or sometimes "to be sold," which was worse. It was not the least of our misfortunes, that with our infection, when it ceased, there did not cease the spirit of strife and contention, slander and reproach, which was really the great troubler of the nation's peace before. It was said to be the remains of the old animosities which had so lately involved us all in blood and disorder;[327] but as the late act of indemnity[328] had lain asleep the quarrel itself, so the government had recommended family and personal peace, upon all occasions, to the whole nation. But it[329] could not be obtained; and particularly after the ceasing of the plague in London, when any one had seen the condition which the people had been in, and how they caressed one another at that time, promised to have more charity for the future, and to raise no more reproaches,--I say, any one that had seen them then would have thought they would have come together with another spirit at last. But, I say, it could not be obtained. The quarrel remained, the Church[330] and the Presbyterians were incompatible. 
As soon as the plague was removed, the dissenting ousted ministers who had supplied the pulpits which were deserted by the incumbents, retired. They[331] could expect no other but that they[332] should immediately fall upon them[331] and harass them with their penal laws; accept their[331] preaching while they[332] were sick, and persecute them[331] as soon as they[332] were recovered again. This even we that were of the Church thought was hard, and could by no means approve of it. But it was the government, and we could say nothing to hinder it. We could only say it was not our doing, and we could not answer for it. On the other hand, the dissenters reproaching those ministers of the Church with going away, and deserting their charge, abandoning the people in their danger, and when they had most need of comfort, and the like,--this we could by no means approve; for all men have not the same faith and the same courage, and the Scripture commands us to judge the most favorably, and according to charity. A plague is a formidable enemy, and is armed with terrors that every man is not sufficiently fortified to resist, or prepared to stand the shock against.[333] It is very certain that a great many of the clergy who were in circumstances to do it withdrew, and fled for the safety of their lives; but it is true, also, that a great many of them staid, and many of them fell in the calamity, and in the discharge of their duty. It is true, some of the dissenting turned-out ministers staid, and their courage is to be commended and highly valued; but these were not abundance. It cannot be said that they all staid, and that none retired into the country, any more than it can be said of the Church clergy that they all went away. Neither did all those that went away go without substituting curates[334] and others in their places, to do the offices needful, and to visit the sick as far as it was practicable. So that, upon the whole, an allowance of charity might have been made on both sides, and we should have considered that such a time as this of 1665 is not to be paralleled in history, and that it is not the stoutest courage that will always support men in such cases. I had not said this, but had rather chosen[335] to record the courage and religious zeal of those of both sides who did hazard themselves for the service of the poor people in their distress, without remembering that any failed in their duty on either side; but the want of temper among us has made the contrary to this necessary: some that staid, not only boasting too much of themselves, but reviling those that fled, branding them with cowardice, deserting their flocks, and acting the part of the hireling, and the like. I recommend it to the charity of all good people to look back and reflect duly upon the terrors of the time; and whoever does so will see that it is not an ordinary strength that could support it. 
It was not like appearing in the head of an army, or charging a body of horse in the field; but it was charging death itself on his pale horse.[336] To stay was indeed to die; and it could be esteemed nothing less, especially as things appeared at the latter end of August and the beginning of September, and as there was reason to expect them at that time; for no man expected, and I dare say believed, that the distemper would take so sudden a turn as it did, and fall immediately two thousand in a week, when there was such a prodigious number of people sick at that time as it was known there was; and then it was that many shifted[337] away that had staid most of the time before. Besides, if God gave strength to some more than to others, was it to boast of their ability to abide the stroke, and upbraid those that had not the same gift and support, or ought they not rather to have been humble and thankful if they were rendered more useful than their brethren? I think it ought to be recorded to the honor of such men, as well clergy as physicians, surgeons, apothecaries, magistrates, and officers of every kind, as also all useful people, who ventured their lives in discharge of their duty, as most certainly all such as staid did to the last degree; and several of these kinds did not only venture, but lost their lives on that sad occasion. I was once making a list of all such (I mean of all those professions and employments who thus died, as I call it, in the way of their duty), but it was impossible for a private man to come at a certainty in the particulars. I only remember that there died sixteen clergymen, two aldermen, five physicians, thirteen surgeons, within the city and liberties, before the beginning of September. But this being, as I said before, the crisis and extremity of the infection, it can be no complete list. As to inferior people, I think there died six and forty constables and headboroughs[338] in the two parishes of Stepney and Whitechapel; but I could not carry my list on, for when the violent rage of the distemper, in September, came upon us, it drove us out of all measure. Men did then no more die by tale[339] and by number: they might put out a weekly bill, and call them seven or eight thousand, or what they pleased. It is certain they died by heaps, and were buried by heaps; that is to say, without account. And if I might believe some people who were more abroad and more conversant with those things than I (though I was public enough for one that had no more business to do than I had),--I say, if we may believe them, there was not many less buried those first three weeks in September than twenty thousand per week. However the others aver the truth of it, yet I rather choose to keep to the public account. Seven or eight thousand per week is enough to make good all that I have said of the terror of those times; and it is much to the satisfaction of me that write, as well as those that read, to be able to say that everything is set down with moderation, and rather within compass than beyond it. Upon all these accounts, I say, I could wish, when we were recovered, our conduct had been more distinguished for charity and kindness, in remembrance of the past calamity, and not so much in valuing ourselves upon our boldness in staying; as if all men were cowards that fly from the hand of God, or that those who stay do not sometimes owe their courage to their ignorance, and despising the hand of their Maker, which is a criminal kind of desperation, and not a true courage. 
I cannot but leave it upon record, that the civil officers, such as constables, headboroughs, lord mayor's and sheriff's men, also parish officers, whose business it was to take charge of the poor, did their duties, in general, with as much courage as any, and perhaps with more; because their work was attended with more hazards, and lay more among the poor, who were more subject to be infected, and in the most pitiful plight when they were taken with the infection. But then it must be added, too, that a great number of them died; indeed, it was scarcely possible it should be otherwise. I have not said one word here about the physic or preparations that were ordinarily made use of on this terrible occasion (I mean we that frequently went abroad up and down the streets, as I did). Much of this was talked of in the books and bills of our quack doctors, of whom I have said enough already. It may, however, be added, that the College of Physicians were daily publishing several preparations, which they had considered of in the process of their practice; and which, being to be had in print, I avoid repeating them for that reason. One thing I could not help observing,--what befell one of the quacks, who published that he had a most excellent preservative against the plague, which whoever kept about them should never be infected, or liable to infection. This man, who, we may reasonably suppose, did not go abroad without some of this excellent preservative in his pocket, yet was taken by the distemper, and carried off in two or three days. I am not of the number of the physic haters or physic despisers (on the contrary, I have often mentioned the regard I had to the dictates of my particular friend Dr. Heath); but yet I must acknowledge I made use of little or nothing, except, as I have observed, to keep a preparation of strong scent, to have ready in case I met with anything of offensive smells, or went too near any burying place or dead body. Neither did I do, what I know some did, keep the spirits high and hot with cordials and wine, and such things, and which, as I observed, one learned physician used himself so much to, as that he could not leave them off when the infection was quite gone, and so became a sot for all his life after. I remember my friend the doctor used to say that there was a certain set of drugs and preparations which were all certainly good and useful in the case of an infection, out of which or with which physicians might make an infinite variety of medicines, as the ringers of bells make several hundred different rounds of music by the changing and order of sound but in six bells; and that all these preparations shall[340] be really very good. "Therefore," said he, "I do not wonder that so vast a throng of medicines is offered in the present calamity, and almost every physician prescribes or prepares a different thing, as his judgment or experience guides him; but," says my friend, "let all the prescriptions of all the physicians in London be examined, and it will be found that they are all compounded of the same things, with such variations only as the particular fancy of the doctor leads him to; so that," says he, "every man, judging a little of his own constitution and manner of his living, and circumstances of his being infected, may direct his own medicines out of the ordinary drugs and preparations. Only that," says he, "some recommend one thing as most sovereign, and some another. Some," says he, "think that Pill. 
Ruff., which is called itself the antipestilential pill, is the best preparation that can be made; others think that Venice treacle[341] is sufficient of itself to resist the contagion; and I," says he, "think as both these think, viz., that the first is good to take beforehand to prevent it, and the last, if touched, to expel it." According to this opinion, I several times took Venice treacle, and a sound sweat upon it, and thought myself as well fortified against the infection as any one could be fortified by the power of physic. As for quackery and mountebank, of which the town was so full, I listened to none of them, and observed often since, with some wonder, that for two years after the plague I scarcely ever heard one of them about the town. Some fancied they were all swept away in the infection to a man, and were for calling it a particular mark of God's vengeance upon them for leading the poor people into the pit of destruction merely for the lucre of a little money they got by them; but I cannot go that length, neither. That abundance of them died is certain (many of them came within the reach of my own knowledge); but that all of them were swept off, I much question. I believe, rather, they fled into the country, and tried their practices upon the people there, who were in apprehension of the infection before it came among them. This, however, is certain, not a man of them appeared for a great while in or about London. There were indeed several doctors who published bills recommending their several physical preparations for cleansing the body, as they call it, after the plague, and needful, as they said, for such people to take who had been visited and had been cured; whereas, I must own, I believe that it was the opinion of the most eminent physicians of that time, that the plague was itself a sufficient purge, and that those who escaped the infection needed no physic to cleanse their bodies of any other things (the running sores, the tumors, etc., which were broken and kept open by the direction of the physicians, having sufficiently cleansed them); and that all other distempers, and causes of distempers, were effectually carried off that way. And as the physicians gave this as their opinion wherever they came, the quacks got little business. There were indeed several little hurries which happened after the decrease of the plague, and which, whether they were contrived to fright and disorder the people, as some imagined, I cannot say; but sometimes we were told the plague would return by such a time; and the famous Solomon Eagle, the naked Quaker I have mentioned, prophesied evil tidings every day, and several others, telling us that London had not been sufficiently scourged, and the sorer and severer strokes were yet behind. Had they stopped there, or had they descended to particulars, and told us that the city should be the next year destroyed by fire, then, indeed, when we had seen it come to pass, we should not have been to blame to have paid more than common respect to their prophetic spirits (at least, we should have wondered at them, and have been more serious in our inquiries after the meaning of it, and whence they had the foreknowledge); but as they generally told us of a relapse into the plague, we have had no concern since that about them. 
Yet by those frequent clamors we were all kept with some kind of apprehensions constantly upon us; and if any died suddenly, or if the spotted fevers at any time increased, we were presently alarmed; much more if the number of the plague increased, for to the end of the year there were always between two and three hundred[342] of the plague. On any of these occasions, I say, we were alarmed anew. Those who remember the city of London before the fire must remember that there was then no such place as that we now call Newgate Market; but in the middle of the street, which is now called Blow Bladder Street, and which had its name from the butchers, who used to kill and dress their sheep there (and who, it seems, had a custom to blow up their meat with pipes, to make it look thicker and fatter than it was, and were punished there for it by the lord mayor),--I say, from the end of the street towards Newgate there stood two long rows of shambles for the selling[343] meat. It was in those shambles that two persons falling down dead as they were buying meat, gave rise to a rumor that the meat was all infected; which though it might affright the people, and spoiled the market for two or three days, yet it appeared plainly afterwards that there was nothing of truth in the suggestion: but nobody can account for the possession of fear when it takes hold of the mind. However, it pleased God, by the continuing of the winter weather, so to restore the health of the city, that by February following we reckoned the distemper quite ceased, and then we were not easily frighted again. There was still a question among the learned, and[344] at first perplexed the people a little; and that was, in what manner to purge the houses and goods where the plague had been, and how to render them[345] habitable again which had been left empty during the time of the plague. Abundance of perfumes and preparations were prescribed by physicians, some of one kind, some of another, in which the people who listened to them put themselves to a great, and indeed in my opinion to an unnecessary, expense; and the poorer people, who only set open their windows night and day, burnt brimstone, pitch, and gunpowder, and such things, in their rooms, did as well as the best; nay, the eager people who, as I said above, came home in haste and at all hazards, found little or no inconvenience in their houses, nor in their goods, and did little or nothing to them. However, in general, prudent, cautious people did enter into some measures for airing and sweetening their houses, and burnt perfumes, incense, benjamin,[346] resin, and sulphur in their rooms, close shut up, and then let the air carry it all out with a blast of gunpowder; others caused large fires to be made all day and all night for several days and nights. By the same token that[347] two or three were pleased to set their houses on fire, and so effectually sweetened them by burning them down to the ground (as particularly one at Ratcliff, one in Holborn, and one at Westminster, besides two or three that were set on fire; but the fire was happily got out again before it went far enough to burn down the houses); and one citizen's servant, I think it was in Thames Street, carried so much gunpowder into his master's house, for clearing it of the infection, and managed it so foolishly, that he blew up part of the roof of the house. 
But the time was not fully come that the city was to be purged with fire, nor was it far off; for within nine months more I saw it all lying in ashes, when, as some of our quaking philosophers pretend, the seeds of the plague were entirely destroyed, and not before,--a notion too ridiculous to speak of here, since, had the seeds of the plague remained in the houses, not to be destroyed but by fire, how has it been that they have not since broken out, seeing all those buildings in the suburbs and liberties, all in the great parishes of Stepney, Whitechapel, Aldgate, Bishopsgate, Shoreditch, Cripplegate, and St. Giles's, where the fire never came, and where the plague raged with the greatest violence, remain still in the same condition they were in before?

But to leave these things just as I found them, it was certain that those people who were more than ordinarily cautious of their health did take particular directions for what they called seasoning of their houses; and abundance of costly things were consumed on that account, which I cannot but say not only seasoned those houses as they desired, but filled the air with very grateful and wholesome smells, which others had the share of the benefit of, as well as those who were at the expenses of them.

Though the poor came to town very precipitantly, as I have said, yet, I must say, the rich made no such haste. The men of business, indeed, came up, but many of them did not bring their families to town till the spring came on, and that they saw reason to depend upon it that the plague would not return. The court, indeed, came up soon after Christmas; but the nobility and gentry, except such as depended upon and had employment under the administration, did not come so soon.

I should have taken notice here, that notwithstanding the violence of the plague in London and other places, yet it was very observable that it was never on board the fleet; and yet for some time there was a strange press[348] in the river, and even in the streets, for seamen to man the fleet. But it was in the beginning of the year, when the plague was scarce begun, and not at all come down to that part of the city where they usually press for seamen; and though a war with the Dutch was not at all grateful to the people at that time, and the seamen went with a kind of reluctancy into the service, and many complained of being dragged into it by force, yet it proved, in the event, a happy violence to several of them, who had probably perished in the general calamity, and who, after the summer service was over, though they had cause to lament the desolation of their families (who, when they came back, were many of them in their graves), yet they had room to be thankful that they were carried out of the reach of it, though so much against their wills.

We, indeed, had a hot war with the Dutch that year, and one very great engagement[349] at sea, in which the Dutch were worsted; but we lost a great many men and some ships. But, as I observed, the plague was not in the fleet; and when they came to lay up the ships in the river, the violent part of it began to abate.

I would be glad if I could close the account of this melancholy year with some particular examples historically, I mean of the thankfulness to God, our Preserver, for our being delivered from this dreadful calamity. Certainly the circumstances of the deliverance, as well as the terrible enemy we were delivered from, called upon the whole nation for it.
The circumstances of the deliverance were indeed very remarkable, as I have in part mentioned already; and particularly the dreadful condition which we were all in, when we were, to the surprise of the whole town, made joyful with the hope of a stop to the infection. Nothing but the immediate finger of God, nothing but omnipotent power, could have done it. The contagion despised all medicine, death raged in every corner; and, had it gone on as it did then, a few weeks more would have cleared the town of all and everything that had a soul. Men everywhere began to despair; every heart failed them for fear; people were made desperate through the anguish of their souls; and the terrors of death sat in the very faces and countenances of the people. In that very moment, when we might very well say, "Vain was the help of man,"[350]--I say, in that very moment it pleased God, with a most agreeable surprise, to cause the fury of it to abate, even of itself; and the malignity declining, as I have said, though infinite numbers were sick, yet fewer died; and the very first week's bill decreased 1,843, a vast number indeed. It is impossible to express the change that appeared in the very countenances of the people that Thursday morning when the weekly bill came out. It might have been perceived in their countenances that a secret surprise and smile of joy sat on everybody's face. They shook one another by the hands in the streets, who would hardly go on the same side of the way with one another before. Where the streets were not too broad, they would open their windows and call from one house to another, and asked how they did, and if they had heard the good news that the plague was abated. Some would return, when they said good news, and ask, "What good news?" And when they answered that the plague was abated, and the bills decreased almost two thousand, they would cry out, "God be praised!" and would weep aloud for joy, telling them they had heard nothing of it; and such was the joy of the people, that it was, as it were, life to them from the grave. I could almost set down as many extravagant things done in the excess of their joy as of their grief; but that would be to lessen the value of it. I must confess myself to have been very much dejected just before this happened; for the prodigious numbers that were taken sick the week or two before, besides those that died, was[351] such, and the lamentations were so great everywhere, that a man must have seemed to have acted even against his reason if he had so much as expected to escape; and as there was hardly a house but mine in all my neighborhood but what was infected, so, had it gone on, it would not have been long that there would have been any more neighbors to be infected. Indeed, it is hardly credible what dreadful havoc the last three weeks had made: for, if I might believe the person whose calculations I always found very well grounded, there were not less than thirty thousand people dead, and near one hundred thousand fallen sick, in the three weeks I speak of; for the number that sickened was surprising, indeed it was astonishing, and those whose courage upheld them all the time before, sunk under it now. In the middle of their distress, when the condition of the city of London was so truly calamitous, just then it pleased God, as it were, by his immediate hand, to disarm this enemy: the poison was taken out of the sting. It was wonderful. Even the physicians themselves were surprised at it. 
Wherever they visited, they found their patients better,--either they had sweated kindly, or the tumors were broke, or the carbuncles went down and the inflammations round them changed color, or the fever was gone, or the violent headache was assuaged, or some good symptom was in the case,--so that in a few days everybody was recovering. Whole families that were infected and down, that had ministers praying with them, and expected death every hour, were revived and healed, and none died at all out of them. Nor was this by any new medicine found out, or new method of cure discovered, or by any experience in the operation which the physicians or surgeons attained to; but it was evidently from the secret invisible hand of Him that had at first sent this disease as a judgment upon us. And let the atheistic part of mankind call my saying what they please, it is no enthusiasm: it was acknowledged at that time by all mankind. The disease was enervated, and its malignity spent; and let it proceed from whencesoever it will, let the philosophers search for reasons in nature to account for it by, and labor as much as they will to lessen the debt they owe to their Maker, those physicians who had the least share of religion in them were obliged to acknowledge that it was all supernatural, that it was extraordinary, and that no account could be given of it. If I should say that this is a visible summons to us all to thankfulness, especially we that were under the terror of its increase, perhaps it may be thought by some, after the sense of the thing was over, an officious canting of religious things, preaching a sermon instead of writing a history, making myself a teacher instead of giving my observations of things (and this restrains me very much from going on here, as I might otherwise do); but if ten lepers were healed, and but one returned to give thanks, I desire to be as that one, and to be thankful for myself. Nor will I deny but there were abundance of people who, to all appearance, were very thankful at that time: for their mouths were stopped, even the mouths of those whose hearts were not extraordinarily long affected with it; but the impression was so strong at that time, that it could not be resisted, no, not by the worst of the people. It was a common thing to meet people in the street that were strangers, and that we knew nothing at all of, expressing their surprise. Going one day through Aldgate, and a pretty many people being passing and repassing, there comes a man out of the end of the Minories; and, looking a little up the street and down, he throws his hands abroad: "Lord, what an alteration is here! Why, last week I came along here, and hardly anybody was to be seen." Another man (I heard him) adds to his words, "'Tis all wonderful; 'tis all a dream."--"Blessed be God!" says a third man; "and let us give thanks to him, for 'tis all his own doing." Human help and human skill were at an end. These were all strangers to one another, but such salutations as these were frequent in the street every day; and, in spite of a loose behavior, the very common people went along the streets, giving God thanks for their deliverance. It was now, as I said before, the people had cast off all apprehensions, and that too fast. Indeed, we were no more afraid now to pass by a man with a white cap upon his head, or with a cloth wrapped round his neck, or with his leg limping, occasioned by the sores in his groin,--all which were frightful to the last degree but the week before. 
But now the street was full of them, and these poor recovering creatures, give them their due, appeared very sensible of their unexpected deliverance, and I should wrong them very much if I should not acknowledge that I believe many of them were really thankful; but I must own that for the generality of the people it might too justly be said of them, as was said of the children of Israel after their being delivered from the host of Pharaoh, when they passed the Red Sea, and looked back and saw the Egyptians overwhelmed in the water, viz., "that they sang his praise, but they soon forgot his works."[352]

I can go no further here. I should be counted censorious, and perhaps unjust, if I should enter into the unpleasing work of reflecting, whatever cause there was for it, upon the unthankfulness and return of all manner of wickedness among us, which I was so much an eyewitness of myself. I shall conclude the account of this calamitous year, therefore, with a coarse but a sincere stanza of my own, which I placed at the end of my ordinary memorandums the same year they were written:--

    A dreadful plague in London was,
      In the year sixty-five,
    Which swept an hundred thousand souls
      Away, yet I alive.

H.F.[353]
and his courtiers. The immunity of Oxford was doubtless due to good drainage and general cleanliness. [37] Eccl. xii. 5. [38] Have seen. [39] Nor. This misuse of "or" for "nor" is frequent with Defoe. [40] The four inns of court in London which have the exclusive right of calling to the bar, are the Inner Temple, the Middle Temple, Lincoln's Inn, and Gray's Inn. The Temple is so called because it was once the home of the Knights Templars. [41] The city proper, i.e., the part within the walls, as distinguished from that without. [42] Were. [43] The population of London at this time was probably about half a million. It is now about six millions. (See Macaulay's History, chap. iii.) [44] Acel´dama, the field of blood (see Matt. xxvii. 8). [45] Phlegmatic hypochondriac is a contradiction in terms; for "phlegmatic" means "impassive, self-restrained," while "hypochondriac" means "morbidly anxious" (about one's health). Defoe's lack of scholarship was a common jest among his more learned adversaries, such as Swift, and Pope. [46] It was in this very plague year that Newton formulated his theory of gravitation. Incredible as it may seem, at this same date even such men as Dryden held to a belief in astrology. [47] William Lilly was the most famous astrologer and almanac maker of the time. In Butler's Hudibras he is satirized under the name of Sidrophel. [48] Poor Robin's Almanack was first published in 1661 or 1662, and was ascribed to Robert Herrick, the poet. [49] See Rev. xviii. 4. [50] Jonah iii. 4. [51] Flavius Josephus, the author of the History of the Jewish Wars. He is supposed to have died in the last decade of the first century A.D. [52] So called because many Frenchmen lived there. In Westminster there was another district with this same name. [53] "Gave them vapors," i.e., put them into a state of nervous excitement. [54] Soothsayers. [55] In astrology, the scheme or figure of the heavens at the moment of a person's birth. From this the astrologers pretended to foretell a man's destiny. [56] Roger Bacon, a Franciscan friar of the thirteenth century, had a knowledge of mechanics and optics far in advance of his age: hence he was commonly regarded as a wizard. The brazen head which he manufactured was supposed to assist him in his necromantic feats; it is so introduced by Greene in his play of Friar Bacon and Friar Bungay (1594). [57] A fortune teller who lived in the reign of Henry VIII., and was famous for her prophecies. [58] The most celebrated magician of mediæval times (see Spenser's Faërie Queene and Tennyson's Merlin and Vivien). [59] Linen collar or ruff. [60] Him. [61] The interlude was originally a short, humorous play acted in the midst of a morality play to relieve the tedium of that very tedious performance. From the interlude was developed farce; and from farce, comedy. [62] Charles II. and his courtiers, from their long exile in France, brought back to England with them French fashions in literature and in art. [63] To be acted. [64] Buffoons, clowns. [65] About 62½ cents. [66] About twenty-five dollars; but the purchasing power of money was then seven or eight times what it is now. [67] Strictly speaking, this word means "love potions." [68] Exorcism is the act of expelling evil spirits, or the formula used in the act. Defoe's use of the word here is careless and inaccurate. [69] Bits of metal, parchment, etc., worn as charms. [70] Making the sign of the cross. [71] Paper on which were marked the signs of the zodiac,--a superstition from astrology. 
[72] A meaningless word used in incantations. Originally the name of a Syrian deity.
[73] Iesus Hominum Salvator ("Jesus, Savior of Men"). The order of the Jesuits was founded by Ignatius de Loyola in 1534.
[74] The Feast of St. Michael, Sept. 29.
[75] This use of "to" for "of" is frequent with Defoe.
[76] The Royal College of Physicians was founded by Thomas Linacre, physician to Henry VIII. Nearly every London physician of prominence is a member.
[77] The city of London proper lies entirely in the county of Middlesex.
[78] Literally, "hand workers;" now contracted into "surgeons."
[79] Cares, duties.
[80] Consenting knowledge.
[81] Disposed of to the public, put in circulation.
[82] That is, by the disease.
[83] Happen.
[84] Engaged.
[85] Heaps of rubbish.
[86] A kind of parish constable.
[87] The writer seems to mean that the beggars are so importunate, there is no avoiding them.
[88] Fights between dogs and bears. This was not declared a criminal offense in England until 1835.
[89] Contests with sword and shield.
[90] The guilds or organizations of tradesmen, such as the goldsmiths, the fishmongers, the merchant tailors.
[91] St. Katherine's by the Tower.
[92] Trinity (east of the) Minories. The Minories (a street running north from the Tower) was so designated from an abbey of St. Clare nuns called Minoresses. They took their name from that of the Franciscan Order, Fratres Minores, or Lesser Brethren.
[93] St. Luke's.
[94] St. Botolph's, Bishopsgate.
[95] St. Giles's, Cripplegate.
[96] Were.
[97] Chemise.
[98] This word is misplaced; it should go before "perish."
[99] Before "having," supply "the master."
[100] Fences.
[101] From.
[102] This old form for "caught" is used frequently by Defoe.
[103] Came to grief.
[104] "Who, being," etc., i.e., who, although single men, had yet staid.
[105] The wars of the Commonwealth or of the Puritan Revolution, 1640-52.
[106] Holland and Belgium.
[107] "Hurt of," a common form of expression used in Defoe's time.
[108] Manager, economist. This meaning of "husband" is obsolete.
[109] A participial form of expression very common in Old English, the "a" being a corruption of "in" or "on."
[110] Were.
[111] "'Name of God," i.e., in the name of God.
[112] Torches.
[113] "To and again," i.e., to and fro.
[114] Were.
[115] As if.
[116] Magpie.
[117] This word is from the same root as "lamp." The old form "lanthorn" crept in from the custom of making the sides of a lantern of horn.
[118] Supply "be."
[119] Inclination.
[120] In expectation of the time when.
[121] Their being checked.
[122] This paragraph could hardly have been more clumsily expressed. It will be found a useful exercise to rewrite it.
[123] "To have gone," i.e., to go.
[124] Spotted.
[125] "Make shift," i.e., endure it.
[126] Device, expedient.
[127] "In all" is evidently a repetition.
[128] Objects cannot very well happen. Defoe must mean, "the many dismal sights I saw as I went about the streets."
[129] As.
[130] "Rosin" is a long-established misspelling for "resin." Resin exudes from pine trees, and from it the oil of turpentine is separated by distillation.
[131] As distinguished from fish meat.
[132] Defoe uses these pronouns in the wrong number, as in numerous other instances.
[133] The projecting part of a building.
[134] Their miraculous preservation was wrought by their keeping in the fresh air of the open fields. It seems curious that after this object lesson the physicians persisted in their absurd policy of shutting up infected houses, thus practically condemning to death their inmates.
[135] Used here for "this," as also in many other places.
[136] Supply "with."
[137] Such touches as this created a widespread and long-enduring belief that Defoe's fictitious diary was an authentic history.
[138] "Running out," etc., i.e., losing their self-control.
[139] Idiocy. In modern English, "idiotism" is the same as "idiom."
[140] Gangrene, death of the soft tissues.
[141] Before "that" supply "we have been told."
[142] Hanging was at this time a common punishment for theft. In his novel Moll Flanders, Defoe has a vivid picture of the mental and physical sufferings of a woman who was sent to Newgate, and condemned to death, for stealing two pieces of silk.
[143] Cloth, rag.
[144] They could no longer give them regular funerals, but had to bury them promiscuously in pits.
[145] Evidently a repetition.
[146] In old and middle English two negatives did not make an affirmative, as they do in modern English.
[147] It is now well known that rue has no qualities that are useful for warding off contagion.
[148] "Set up," i.e., began to play upon.
[149] Constrained.
[150] Because they would have been refused admission to other ports.
[151] Matter. So used by Sheridan in The Rivals, act iii. sc. 2.
[152] Probably a misprint for "greatly."
[153] This.
[154] Are.
[155] He has really given two days more than two months.
[156] A count.
[157] Range, limits.
[158] Unknown.
[159] Lying.
[160] Was.
[161] Notice this skillful touch to give verisimilitude to the narrative.
[162] Country.
[163] "Without the bars," i.e., outside the old city limits.
[164] Profession.
[165] The plague.
[166] The legal meaning of "hamlet" in England is a village without a church of its own: ecclesiastically, therefore, it belongs to the parish of some other village.
[167] All Protestant sects other than the Established Church of England.
[168] A groat equals fourpence, about eight cents. It is not coined now.
[169] A farthing equals one quarter of a penny.
[170] About ten miles down the Thames.
[171] The t is silent in this word.
[172] Hard-tack, pilot bread.
[173] Old form for "rode."
[174] See the last sentence of the next paragraph but one.
[175] Roadstead, an anchoring ground less sheltered than a harbor.
[176] Substitute "that they would not be visited."
[177] The plague.
[178] St. Margaret's.
[179] Nota bene, note well.
[180] Dul´ich. All these places are southward from London. Norwood is six miles distant.
[181] Old form of "dared."
[182] Small vessels, generally schooner-rigged, used for carrying heavy freight on rivers and harbors.
[183] London Bridge.
[184] This incident is so overdone, that it fails to be pathetic, and rather excites our laughter.
[185] Supply "themselves."
[186] Barnet was about eleven miles north-northwest of London.
[187] Holland and Belgium.
[188] See Luke xvii. 11-19.
[189] Well.
[190] With speed, in haste.
[191] This word is misplaced. It should go immediately before "to lodge."
[192] Luck.
[193] Whom.
[194] A small sail set high upon the mast.
[195] "Fetched a long compass," i.e., went by a circuitous route.
[196] The officers.
[197] Refused.
[198] Nearly twenty miles northeast of London.
[199] He. This pleonastic use of a conjunction with the relative is common among illiterate writers and speakers to-day.
[200] Waltham and Epping, towns two or three miles apart, at a distance of ten or twelve miles almost directly north of London.
[201] Pollard trees are trees cut back nearly to the trunk, and so caused to grow into a thick head (poll) of branches.
[202] Entertainment. In this sense, the plural, "quarters," is the commoner form.
[203] Preparing.
[204] Peddlers.
[205] "Has been," an atrocious solecism for "were."
[206] To a miraculous extent.
[207] "Put to it," i.e., hard pressed.
[208] There are numerous references in the Hebrew Scriptures to parched corn as an article of food (see, among others, Lev. xxiii. 14, Ruth ii. 14, 2 Sam. xvii. 28).
[209] Supply "(1)."
[210] Soon.
[211] Substitute "would."
[212] Whom.
[213] Familiar intercourse.
[214] Evidently a repetition.
[215] "For that," i.e., because.
[216] Singly.
[217] Supply "to be."
[218] Buildings the rafters of which lean against or rest upon the outer wall of another building.
[219] Supply "of."
[220] The plague.
[221] "Middling people," i.e., people of the middle class.
[222] At the mouth of the Thames.
[223] Awnings.
[224] Two heavy timbers placed horizontally, the upper one of which can be raised. When lowered, it is held in place by a padlock. Notches in the timbers form holes, through which the prisoner's legs are thrust, and held securely.
[225] The constables.
[226] The carters.
[227] The goods.
[228] In spite of, notwithstanding.
[229] Supply "who."
[230] "Cum aliis," i.e., with others. Most of the places mentioned in this list are several miles distant from London: for example, Enfield is ten miles northeast; Hadley, over fifty miles northeast; Hertford, twenty miles north; Kingston, ten miles southwest; St. Albans, twenty miles northwest; Uxbridge, fifteen miles west; Windsor, twenty miles west; etc.
[231] Kindly regarded.
[232] Which.
[233] The citizens.
[234] Such statements.
[235] For "so that," substitute "so."
[236] How.
[237] It was not known in Defoe's time that minute disease germs may be carried along by a current of air.
[238] Affected with scurvy.
[239] "Which," as applied to persons, is a good Old English idiom, and was in common use as late as 1711 (see Spectator No. 78; and Matt. vi. 9, version of 1611).
[240] Flung to.
[241] Changed their garments.
[242] Supply "I heard."
[243] At.
[244] Various periods are assigned for the duration of the dog days: perhaps July 3 to Aug. 11 is that most commonly accepted. The dog days were so called because they coincided with the heliacal rising of Sirius or Canicula (the little dog).
[245] An inn with this title (and probably a picture of the brothers) painted on its signboard.
[246] Whom.
[247] The Act of Uniformity was passed in 1661. It required all municipal officers and all ministers to take the communion according to the ritual of the Church of England, and to sign a document declaring that arms must never be borne against the King. For refusing obedience to this tyrannical measure, some two thousand Presbyterian ministers were deprived of their livings.
[248] Madness, as in Hamlet, act iii. sc. 1.
[249] "Represented themselves," etc., i.e., presented themselves to my sight.
[250] "Dead part of the night," i.e., from midnight to dawn. Compare, "In the dead waste and middle of the night." Hamlet, act i. sc. 2.
[251] "Have been critical," etc., i.e., have claimed to have knowledge enough to say.
[252] Being introduced.
[253] The plague.
[254] "First began" is a solecism common in the newspaper writing of to-day.
[255] Literally, laws of the by (town). In modern usage, "by-law" is used to designate a rule less general and less easily amended than a constitutional provision.
[256] "Sheriff" is equivalent to shire-reeve (magistrate of the county or shire). London had, and still has, two sheriffs.
[257] Acted.
[258] The inspection, according to ordinance, of weights, measures, and prices.
[259] "Pretty many," i.e., a fair number of.
[260] The officers.
[261] Were.
[262] "Falls to the serious part," i.e., begins to discourse on serious matters.
[263] See note, p. 28. The Mohammedans are fatalists. {Transcriber's note: The reference is to footnote 28.}
[264] A growth of osseous tissue uniting the extremities of fractured bones.
[265] Disclosed.
[266] The officers.
[267] Leading principle.
[268] Defoe means, "can burn only a few houses." In the next line he again misplaces "only."
[269] Put to confusion.
[270] Left out of consideration.
[271] The distemper.
[272] A means for discovering whether the person were infected or not.
[273] Defoe's ignorance of microscopes was not shared by Robert Hooke, whose Micrographia (published in 1664) records numerous discoveries made with that instrument.
[274] Roup is a kind of chicken's catarrh.
[275] Them, i.e., such experiments.
[276] From the Latin quadraginta ("forty").
[277] From the Latin sexaginta ("sixty").
[278] Kinds, species.
[279] Old age.
[280] Abscesses.
[281] Himself.
[282] The essential oils of lavender, cloves, and camphor, added to acetic acid.
[283] In chemistry, balsams are vegetable juices consisting of resins mixed with gums or volatile oils.
[284] Supply "they declined coming to public worship."
[285] This condition of affairs.
[286] Collar.
[287] Economy.
[288] Supply "they were."
[289] Action (obsolete in this sense). See this word as used in 2 Henry IV., act iv. sc. 4.
[290] Which.
[291] Sailors' slang for "Archipelagoes."
[292] An important city in Asia Minor.
[293] A city in northern Syria, better known as Iskanderoon or Alexandretta. The town was named in honor of Alexander the Great, the Turkish form of Alexander being Iskander.
[294] Though called a kingdom, Algarve was nothing but a province of Portugal. It is known now as Faro.
[295] The natives of Flanders, a mediæval countship now divided among Holland, Belgium, and France.
[296] Colonies. In the reign of Charles II., the English colonies were governed by a committee (of the Privy Council) known as the "Council of Plantations."
[297] The east side.
[298] On the west side.
[299] See map of England for all these places. Feversham is in Kent, forty-five miles southeast of London; Margate is on the Isle of Thanet, eighty miles southeast.
[300] Commission merchants.
[301] Privateers. Capers is a Dutch word.
[302] Supply "he."
[303] Supply "the coals."
[304] "One another," by a confusion of constructions, has been used here for "them."
[305] By a statute of Charles II. a chaldron was fixed at 36 coal bushels. In the United States, it is generally 26¼ hundredweight.
[306] Opening.
[307] "To seek," i.e., without judgment or knowledge.
[308] Mixing.
[309] Him.
[310] This unwary conduct.
[311] Think.
[312] Were.
[313] Accept.
[314] Personal chattels that had occasioned the death of a human being, and were therefore given to God (Deo, "to God"; dandum, "a thing given"); i.e., forfeited to the King, and by him distributed in alms. This curious law of deodands was not abolished in England until 1846.
[315] The southern coast of the Mediterranean, from Egypt to the Atlantic.
[316] Censure.
[317] Afterward.
[318] "Physic garden," i.e., a garden for growing medicinal herbs. [319] Since. [320] Lord mayor of London, 1679-80, and for many years member of Parliament for the city. [321] The workmen. [322] Recognized. [323] Fenced. [324] Members of the Society of Friends, a religious organization founded by George Fox about 1650. William Penn was one of the early members. The society condemns a paid ministry, the taking of oaths, and the making of war. [325] See p. 105, next to the last paragraph. [326] Die. "Of the plague" should immediately follow "died." [327] See Note 3, p. 26. {Transcriber's note: the reference is to footnote 26.} [328] The act of indemnity passed at the restoration of Charles II. (1660). In spite of the King's promise of justice, the Parliamentarians were largely despoiled of their property, and ten of those concerned in the execution of Charles I. were put to death. [329] Family and personal peace. [330] The Established Church of England, nearly all of whose ministers were Royalists. The Presbyterians were nearly all Republicans. [331] The dissenting ministers. [332] The Churchmen. [333] Of. [334] What we should call an assistant minister is still called a curate in the Church of England. [335] "I had not said this," etc., i.e., I would not have said this, but would rather have chosen, etc. [336] See Rev. vi. 8. [337] Moved away (into the country). [338] The duties of headboroughs differed little from those of the constables. The title is now obsolete. [339] Count. [340] "Must." In this sense common in Chaucer. The past tense, "should," retains something of this force. Compare the German sollen. [341] Otherwise known as theriac (from the Greek [Greek: thêriakos], "pertaining to a wild beast," since it was supposed to be an antidote for poisonous bites). This medicine was compounded of sixty or seventy drugs, and was mixed with honey. [342] Supply "died." [343] Supply "of." [344] Substitute "which." [345] Those. [346] A corruption of "benzoin," a resinous juice obtained from a tree that flourishes in Siam and the Malay Archipelago. When heated, it gives off a pleasant odor. It is one of the ingredients used in court-plaster. [347] This word should be omitted. [348] The "press gang" was a naval detachment under the command of an officer, empowered to seize men and carry them off for service on men-of-war. [349] Off Lowestoft, in 1665. Though the Dutch were beaten, they made good their retreat, and heavily defeated the English the next year in the battle of The Downs. [350] See Ps. lx. 11; cviii. 12. [351] Were. [352] See Exod. xiv., xv., and xvi. 1-3. [353] "H.F." is of course fictitious. oclc-transitioning-2020 ---- Transitioning to the Next Generation of Metadata Transitioning to the Next Generation of Metadata Karen Smith-Yoshimura O C L C R E S E A R C H R E P O R T Transitioning to the Next Generation of Metadata Karen Smith-Yoshimura Senior Program Officer © 2020 OCLC. This work is licensed under a Creative Commons Attribution 4.0 International License. http://creativecommons.org/licenses/by/4.0/ September 2020 OCLC Research Dublin, Ohio 43017 USA www.oclc.org ISBN: 978-1-55653-167-5 DOI: 110.25333/rqgd-b343 OCLC Control Number: 1197990500 ORCID iDs Karen Smith-Yoshimura https://orcid.org/0000-0002-8757-2962 Please direct correspondence to: OCLC Research oclcresearch@oclc.org Suggested citation: Smith-Yoshimura, Karen. 2020. Transitioning to the Next Generation of Metadata. Dublin, OH: OCLC Research. https://doi.org/10.25333/rqgd-b343. 
CONTENTS

Executive Summary
Introduction
The Transition to Linked Data and Identifiers
  Expanding the use of persistent identifiers
  Moving from "authority control" to "identity management"
  Addressing the need for multiple vocabularies and equity, diversity, and inclusion
  Linked data challenges
Describing "Inside-Out" and "Facilitated" Collections
  Archival collections
  Archived websites
  Audio and video collections
  Image collections
  Research data
Evolution of "Metadata as a Service"
  Metrics
  Consultancy
  New applications
  Bibliometrics
  Semantic indexing
Preparing for Future Staffing Requirements
  The culture shift
  Learning opportunities
  New tools and skills
  Self-education
  Addressing staff turnover
Impact
Acknowledgments
Appendix
Notes
FIGURES

FIGURE 1 "Changing Resource Description Workflows" by OCLC Research
FIGURE 2 Some 300 abbreviated author names for a five-page article in Physical Review Letters
FIGURE 3 Examples of some DOI and ARK identifiers
FIGURE 4 One Wikidata identifier links to other identifiers and labels in different languages
FIGURE 5 Excerpt from the survey results from the 2017 EDI survey of the Research Library Partnership
FIGURE 6 Responses to 2019 survey on challenges related to managing A/V collections
FIGURE 7 The OCLC ResearchWorks IIIF Explorer retrieves images about "Paris Maps" across CONTENTdm collections
FIGURE 8 Distribution of 465 Indigenous language codes in the Australian National Bibliographic Database
FIGURE 9 Hachette UK's "River of Authors" generated from the British Library's catalog metadata

EXECUTIVE SUMMARY

The OCLC Research Library Partners Metadata Managers Focus Group, first established in 1993, is one of the longest-standing groups within the OCLC Research Library Partnership (RLP), a transnational network of research libraries. The Focus Group provides a forum for administrators responsible for creating and managing metadata in their institutions to share information about topics of common concern and to identify metadata management issues. The issues raised by the Focus Group are pursued by OCLC Research in support of the RLP and inform OCLC products and services. This report, Transitioning to the Next Generation of Metadata, synthesizes six years (2015-2020) of OCLC Research Library Partners Metadata Managers Focus Group discussions and what they may foretell for the "next generation of metadata." The firm belief that metadata underlies all discovery regardless of format, now and in the future, permeates all Focus Group discussions. Yet metadata is changing. Format-specific metadata management based on curated text strings in bibliographic records understood only by library systems is nearing obsolescence, both conceptually and technically. Innovations in librarianship are exerting pressure on metadata management practices to evolve as librarians are required to provide metadata for far more resources of various types and to collaborate on institutional or multi-institutional projects with fewer staff. This report traces how metadata is evolving and considers the impact this transition may have on library services, posing such questions as:

• Why is metadata changing?
• How is the creation process changing?
• How is the metadata itself changing?
• What impact will these changes have on future staffing requirements, and how can libraries prepare?
The future of linked data is tied to the future of metadata: the metadata that libraries, archives, and other cultural heritage institutions have created and will create will provide the context for future linked data innovations as "statements" associated with those links. The impact will be global, affecting how librarians and archivists will describe the inside-out and facilitated collections, inspiring new offerings of "metadata as a service," and influencing future staffing requirements. Transitioning to the next generation of metadata is an evolving process, intertwined with changing standards, infrastructures, and tools. Together, Focus Group members came to a common understanding of the challenges, shared possible approaches to address them, and inoculated these ideas into other communities that they interact with.

INTRODUCTION

The OCLC Research Library Partners Metadata Managers Focus Group (hereafter referenced as the Focus Group),1 first established in 1993, is one of the longest-standing groups within the OCLC Research Library Partnership (RLP),2 a transnational network of research libraries. The Focus Group provides a forum for administrators responsible for creating and managing metadata in their institutions to share information about topics of common concern and to identify metadata management issues. The issues raised by the Focus Group are pursued by OCLC Research in support of the RLP and inform OCLC products and services. The firm belief that metadata underlies all discovery regardless of format, now and in the future, permeates all Focus Group discussions. Metadata provides the research infrastructure necessary for all libraries' "value delivery systems," fulfilling their community's requests for information and resources. Metadata is crucial for transitioning to next generations of library and discovery systems. Good metadata created today can easily be reused in a linked data environment in the future.3 As noted in the British Library's Foundations for the Future: "Our vision is that by 2023 the Library's collection metadata assets will be unified on a single, sustainable, standards-based infrastructure offering improved options for access, collaboration and open reuse."4 Format-specific metadata management based on curated text strings in bibliographic records understood only by library systems is nearing obsolescence, both conceptually and technically. Innovations in librarianship are exerting pressure on metadata management practices to evolve as librarians are required to provide metadata for far more resources of various types and to collaborate on institutional or multi-institutional projects with fewer staff. "Traditional methods of metadata generation, management and dissemination," suggests the British Library's Collection Management Strategy, "are not scalable or appropriate to an era of rapid digital change, rising audience expectations and diminishing resources."5 Focus Group members are eager to unleash the power of metadata in legacy records for different interactions and uses by both machines and end-users in the future. Consistent metadata created according to past rules or standards needs to be transformed into new structures.

Why is metadata changing?
Traditional library metadata was and is made by librarians conforming to rules that are mainly used and understood by librarians. It is record-centered, expensive to produce, and has historic size limitations. Metadata is limited in its coverage, notably not including articles within scholarly journals or other scholarly outputs. The infrastructure has been inadequate for managing corrections and enhancements, inducing an emphasis on perfection that has exacerbated the slowness of metadata creation. In short, the metadata could be better, there is not enough of it, and the metadata that does exist is not used widely outside the library domain.

How is the creation process changing?

Metadata is no longer created by library staff alone. Today, publishers, authors, and other interested parties are equally involved in metadata creation. Metadata creation has also been pushed forward in the scholarly life cycle, with publishers creating metadata records earlier than in the traditional cataloging process. Metadata can now be enhanced or corrected by machines or by crowdsourcing.

How is the metadata itself changing?

Machine-readable cataloging (MARC) was created to replicate the metadata traditionally found on library catalog cards. We are transitioning from MARC records to assemblages of well-coded and shareable, linkable components, with an emphasis on references, and we are eliminating anachronistic abbreviations not understood by machines. Instead of relying only on library vocabularies such as subject headings and coded lists, the developing assemblages can accommodate vocabularies created for specific domains, expanding the metadata's potential audiences.

The Focus Group's composition has fluctuated over time, and currently comprises representatives from 63 RLP Partners in 11 countries spanning four continents.6 The group includes both past and incoming chairs of the Program for Cooperative Cataloging (PCC),7 providing cross-fertilization between the two. Topics for group discussions can be proposed by any Focus Group member and are selected by an eight-member Planning Group (see appendix), who then write "context statements" explaining why the topic is considered timely and important and then develop question sets that delve into the topic. Context statements and question sets are then distributed to all Focus Group members who are given three to five weeks to submit their responses. Compilations of the Focus Group's responses inform face-to-face discussions held in conjunction with the American Library Association conferences8 and in subsequent virtual meetings. As the Focus Group facilitator, I have summarized and synthesized these discussions in a series of OCLC Research Hanging Together Blog publications.9 Nearly 40 blog posts on a wide range of metadata-related topics have been published on this forum over the past six years. The Metadata Managers Focus Group is just one activity within the broader OCLC Research Library Partnership, which is devoted to extensive professional development opportunities for library staff.
Focus Group members value their affiliation with the Research Library Partnership as a channel to becoming the "change agents" of future metadata management.10 Focus Group members' responses to question sets have facilitated intra-institutional discussions and helped metadata managers understand how their institutions' situation compares with peers within the Partnership. These Focus Group discussions identified a broad range of metadata-related issues, documented in this report. Transitioning to the next generation of metadata is an evolving process, intertwined with changing standards, infrastructures, and tools. Together, Focus Group members came to a common understanding of the challenges, shared possible approaches to address them, and inoculated these ideas into other communities that they interact with. Collectively, Focus Group members command a wide range of experiences with linked data. The Focus Group's keen interest in linked data implementations sparked the series of OCLC Research's International Linked Data Surveys for Implementers.11 A subset of Focus Group members have participated in various linked data projects, including the OCLC Research Project Passage and CONTENTdm Linked Data pilot, OCLC's Shared Entity Management Infrastructure, Library of Congress' Bibliographic Framework Initiative (BIBFRAME), the Mellon-funded Linked Data for Production (LD4P) project, the Share-VDE initiative, and the IMLS planning grant Shareable Local Name Authorities, which exposed issues raised by identifier hubs in the linked data environment.12 In addition, Focus Group members contribute to the PCC task groups addressing aspects of linked data work, including the PCC Task Group on Linked Data Best Practices, Task Group on Identity Management, Task Group on URIs in MARC, and the PCC Linked Data Advisory Committee.13 This cross-fertilization has prompted the Focus Group to examine issues around the entities represented in institutional resources.

This report synthesizes six years (2015-2020) of OCLC Research Library Partners Metadata Managers Focus Group discussions and what they may foretell for the "next generation of metadata." The document is organized in the following sections, each representing an emerging trend identified in the Focus Group's discussions:

• The transition to linked data and identifiers: expanding the use of persistent identifiers as part of the shift from "authority control" to "identity management"
• Describing the "inside-out" and "facilitated" collections: challenges in creating and managing metadata for unique resources created or curated by institutions in various formats and shared with consortia
• Evolution of "metadata as a service": increased involvement with metadata creation beyond the traditional library catalog
• Preparing for future staffing requirements: the changing landscape calls for new skill sets needed by both new professionals entering the field and seasoned catalogers

The document concludes with some observations on the forecasted impact of the next generation of metadata on the wider library community.

The Transition to Linked Data and Identifiers

Linked data offers the ability to take advantage of structured data with an emphasis on context. It relies on language-neutral identifiers pointing to objects, with a focus on "things" replacing the "strings" inherent in current authority and catalog records.
These identifiers can then be connected to related data, vocabularies, and terms in other languages, disciplines, and domains, including nonlibrary domains. Linked data applications can consume others' contributions and thus free metadata specialists from having to re-describe things already described elsewhere, allowing them instead to focus on providing access to their institutions' unique and distinctive collections. This promises a richer user experience and increased discoverability with more contextual relationships than is possible with our current systems. Furthermore, linked data offers an opportunity to go beyond the library domain by drawing on information about entities from diverse sources.14

FIGURE 1. "Changing Resource Description Workflows" by OCLC Research15 (https://www.oclc.org/research/areas/data-science/linkeddata/linked-data-overview.html)

The hope is that linked data will allow libraries to offer new, value-added services that current models cannot support, that outside parties will be able to make better use of library resource descriptions, and that the data will be richer because more parties share in its creation. Moving to a linked data environment portends changes to resource description workflows, as shown in figure 1. The drive to move metadata operations to linked data depends on the availability of tools, access to linked data sources for reuse, documented best practices on identifiers and the metadata descriptions associated with them ("statements"), and a critical mass of implementations on a network level.

EXPANDING THE USE OF PERSISTENT IDENTIFIERS

The Focus Group discussed the "future-proofing" of cataloging, which refers to the opportunities to unleash the power of metadata in legacy records for different interactions and uses in the future. Persistent identifiers were viewed as crucial to transitioning from current metadata to future applications.16 Identifiers, in the form of language-neutral alphanumeric strings, serve as a shorthand for assembling the elements required to uniquely describe an object or resource. They can be resolved over networks with specific protocols for finding, identifying, and using that object or resource. In the nonlibrary domain, Social Security and employee numbers are examples of such identifiers. In the library and academic domains, Focus Group members pointed to ORCID (Open Researcher and Contributor ID)17 as a "glue" that holds together the four arms of scholarly work: publishing, repository, library catalog, and researchers—but ORCID is limited to only living researchers. ORCID is increasingly used in STEM (science, technology, engineering, mathematics) journals for all authors and contributors18 and included in institutions' Research Information Management systems. ISNI (International Standard Name Identifier)19 uniquely identifies persons and organizations involved in creative activities; it is used by libraries, publishers, databases, and rights management organizations, and it covers nonliving creators. Persistent identifiers are used by parties such as Google and HathiTrust for service integration.20 More institutions are using geospatial coordinates in metadata or URIs (Uniform Resource Identifiers) pointing to geospatial coordinates that support API (Application Programming Interface) calls to GeoNames,21 enabling map visualizations.
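To make the GeoNames integration concrete, the sketch below shows one way a GeoNames URI stored in a record might be resolved to coordinates for a map visualization, using GeoNames' publicly documented getJSON endpoint. The record structure and the username are assumptions for illustration, not a description of any Focus Group member's system:

    # Minimal sketch (not a production workflow): resolve a GeoNames ID stored
    # in a metadata record to coordinates for a map visualization. The record
    # layout and GEONAMES_USER are assumptions; GeoNames requires a registered
    # username for its free web services.
    import json
    import urllib.parse
    import urllib.request

    GEONAMES_USER = "demo"  # placeholder; substitute a registered account name

    def coordinates_for(geoname_id):
        """Return (latitude, longitude) for a GeoNames identifier."""
        params = urllib.parse.urlencode({"geonameId": geoname_id, "username": GEONAMES_USER})
        with urllib.request.urlopen("http://api.geonames.org/getJSON?" + params) as response:
            place = json.load(response)
        return float(place["lat"]), float(place["lng"])

    # A record holding a URI that points to GeoNames (hypothetical structure):
    item = {"title": "Plan de Paris", "place_uri": "https://sws.geonames.org/2988507/"}
    geoname_id = item["place_uri"].rstrip("/").rsplit("/", 1)[-1]
    print(coordinates_for(geoname_id))  # approximately (48.85, 2.35) for Paris

Because the record stores only the language-neutral identifier, the coordinates (and any labels) are fetched from the source at display time rather than curated locally.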
Research institutions are also adopting person identifiers such as ORCID to streamline the collection of the institutional research record, usually through a Research Information Management system, as documented in the 2017 OCLC Research Report Convenience and Compliance: Case Studies on Persistent Identifiers in European Research Information Management.22 While publishers are key players in the metadata workstream, publisher data does not always meet library requirements. For example, publisher data for monographs usually does not include identifiers. The British Library is working with five UK publishers to add ISNIs23 to their metadata as a promising proof-of-concept for publishers and libraries working together earlier in the supply chain. The ability to batch load or algorithmically add identifiers in the future is on Focus Group members' wish list.

No single person identifier covers all use cases. Researchers' names have been only partially represented in national name authority files that identify persons both living and dead. A sizable quantity of legacy names is represented only by text strings in bibliographic records. Authority records are created only by institutions involved in the PCC's Name Authority Cooperative Program (NACO)24 or in national library programs. Even then, authority records are created selectively for certain headings or sometimes only when references are involved. The LC/NACO name authority file contained only 30% of the total names reflected in WorldCat's bibliographic record access points (9 million LC/NACO records compared to the 30 million total names reported on the WorldCat Identities project page as of 2012).25 By 2020, this percentage decreased to 18%: 11 million LC/NACO authority records compared to 62 million in WorldCat Identities. These statistics illustrate that the number of names represented in bibliographic records is increasing more quickly than the number under authority control.

Authority files focus on the "preferred form" of a name, which can vary depending on language, discipline, context, and time period. Scholars have objected to the very concept of a "preferred form," as the name may be referred to differently depending on the context.26 When a name has multiple forms, historians need to know the provenance of each name following the citation practices commonly used in their field. An identifier linked to different forms of names, each associated with the provenance and context, could resolve this conundrum. Researcher names are just one example of a need unmet by current identifier systems. Institutions have been minting their own "local identifiers" to meet this need.
Use cases for local identifiers include registering all researchers on campus; representing entities that are underrepresented in national authority files such as authors of electronic dissertations and theses, performers, events, local place names, and campus buildings; identifying entities in digital library projects and institutional repositories; reflecting multilingual needs of the community; and supporting "housekeeping" tasks such as recording archival collection titles.27 Focus Group members' consistent need to disambiguate names across disciplines and formats spurred creating the OCLC Research working group on Registering Researchers in Authority Files.28 The need to accurately record researchers' institutional affiliations to reflect the institution's scholarly output, to promote cross-institutional collaborations, and to lead to more successful recruitment and funding led to another working group on Addressing the Challenges with Organizational Identifiers and ISNI,29 which presented new data modeling of organizations that others could adapt for their own uses. Since then, the Research Organization Registry (ROR) was launched to develop an open, sustainable, usable, and unique identifier for every research organization in the world.30

Disambiguating names is the most labor-intensive part of authority work and will still be a prerequisite for assigning unique identifiers. Given the different name identifier systems already in use, libraries need a name reconciliation service. Authority work and algorithms based on text string matching have limits; the results will still need manual expert review. Tapping the expertise in user communities to verify if two identifiers represent the same person may help. Disambiguation is particularly difficult for authors or contributors listed in journal articles, where names are often abbreviated and there may be dozens or even hundreds of contributors. For example, an article in Physical Review Letters—Precision Measurement of the Top Quark Mass in Lepton + Jets Final State—has approximately 300 abbreviated author names for a five-page article (figure 2).31 This exemplifies the different practices among disciplines. By contrast, other objects with many contributors such as feature films and orchestral recordings are usually represented by only a relative handful of the associated names in library legacy metadata.32 Such differences make creating metadata that is uniform, understandable, and widely reusable a challenge.

FIGURE 2. Some 300 abbreviated author names for a five-page article in Physical Review Letters. [Figure: the author list of V.M. Abazov et al., "Precision measurement of the top-quark mass in lepton+jets final states" (FERMILAB-PUB-14-123-E), https://arxiv.org/pdf/1405.1756.pdf]

Abbreviated forms of author names on journal articles make it difficult—and often impossible—to match them to the correct authority form or an identifier, if it exists. Associating ORCIDs with article authors makes it easier to differentiate authors with the same abbreviated forms. Research Information Management (RIM) systems apply identity management for local researchers so that they are correctly associated with the articles they have written. Their articles are displayed as part of their profiles. (See for example, Experts@Minnesota or University of Illinois at Urbana-Champaign's Experts research profiles.)33 For researcher identity management to work, individuals must create and maintain their own ORCIDs. Institutions have been encouraging their researchers to include an ORCID in their profiles. Researchers have greater incentives to adopt ORCID to meet national and funder requirements such as those of the National Science Foundation and the National Institutes of Health in the United States.34 Research Information Management Systems harvest metadata from abstract and indexing databases such as Scopus, Web of Science, and PubMed, each of which has its own person identifiers that help with disambiguation; they may also be linked to an author's ORCID. Linked data could access information across many environments, including those in Research Information Systems, but would require accurately linking multiple identifiers for the same person to each other.

Some Focus Group members are performing metadata reconciliation work, such as searching matching terms from linked data sources and adding their URIs in metadata records as a necessary first step toward a linked data environment or as part of metadata enhancement work.35 Improving the quality of the data improves users' experiences in the short term and will help with the transition to linked data later. Most metadata reconciliation is done on personal names, subjects, and geographic names. Sources used for such reconciliation include OCLC's Virtual International Authority File (VIAF), the Library of Congress's linked data service (id.loc.gov), ISNI, the Getty's Union List of Artists Names (ULAN), Art and Architecture Thesaurus (AAT), and Thesaurus of Geographic Names (TGN), OCLC's Faceted Application of Subject Terminology (FAST), and various national authority files. Selection of a source depends on the trustworthiness of the organization responsible, the subject matter, and the richness of the information. Such metadata reconciliation work is labor intensive and does not scale well.
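As an illustration of the kind of string-matching reconciliation described above, and of why its results still need expert review, the following sketch queries VIAF's public AutoSuggest service for a personal name and surfaces candidate identifiers for a human to evaluate. It is a simplified example under the assumption that the AutoSuggest response carries "term" and "viafid" fields, not a production reconciliation workflow:

    # Simplified illustration of string-matching reconciliation against VIAF's
    # public AutoSuggest service; not any institution's production workflow.
    # Candidates are surfaced for expert review, not auto-accepted, because
    # string matching alone is unreliable.
    import json
    import urllib.parse
    import urllib.request

    def viaf_candidates(name, limit=5):
        """Return up to `limit` candidate (label, VIAF URI) pairs for a name string."""
        url = "https://viaf.org/viaf/AutoSuggest?query=" + urllib.parse.quote(name)
        with urllib.request.urlopen(url) as response:
            payload = json.load(response)
        hits = payload.get("result") or []  # "result" is empty when nothing matches
        return [(hit["term"], "https://viaf.org/viaf/" + hit["viafid"]) for hit in hits[:limit]]

    for label, uri in viaf_candidates("Austen, Jane"):
        print(label, uri)

Keeping a human in the loop mirrors the Focus Group's caveat that algorithmic matches require manual expert review.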
Some members of the Focus Group have experimented with obtaining identifiers (persistent URIs from linked data sources) to eventually replace their current reliance on text strings. Institutions concluded that it is more efficient to create URIs in authority records at the outset rather than reconcile them later. The University of Michigan has developed a LCNAF Named Entity Reconciliation program36 using Google's Open Refine that searches the VIAF file with the VIAF API for matches, looks for Library of Congress source records within a VIAF cluster, and extracts the authorized heading. This results in a dataset pairing the authorized LC Name Authority File heading with the original heading and a link to the URI of the LCNAF linked data service. This service could be modified to bring in the VIAF identifier instead; it gets fair results even though it uses string matching. A long list of nonlibrary sources that could enhance current authority data or could be valuable to link to in certain contexts has been identified. Wikidata and Wikipedia led the list. Other sources include: AllMusic, author and fan sites, Discogs, EAC-CPF (Encoded Archival Context for Corporate Bodies, Persons, and Families), EAD (Encoded Archival Description), family trees, GeoNames, GoodReads, IMDb (Internet Movie Database), Internet Archive, Library Thing, LinkedIn, MusicBrainz, ONIX (ONline Information eXchange), Open Library, ORCID, and Scopus ID. The PCC's Task Group on URIs in MARC's document, Formulating and Obtaining URIs: A Guide to Commonly Used Vocabularies and Reference Sources,37 provides valuable guidance for collecting data from these other sources. Wikidata is viewed as an important source for expanding the language range and providing multilingual metadata more easily than with current library systems.38

Identifiers for "works" represent a particular challenge, as there is no consensus on what represents a "distinctive work."39 Local work identifiers cannot be shared or reused. Focus Group members voiced concern that differing interpretations of what a "work" is could hamper the ability to reuse data created elsewhere and look to a central trusted repository like OCLC to publish persistent Work Identifiers that could be used throughout the community. Identifiers need to be both unchanging over time and independent of where the digital object is or will be stored. For instance, identifiers for data sets such as digital resources and collections in institutional repositories include system-generated IDs, locally minted identifiers, PURL handles, DOIs (Digital Object Identifiers), URIs, URNs, and ARKs (Archival Resource Keys). A few examples of DOI and ARK Identifiers are shown in figure 3. Resources can have both multiple copies and versions that change over time. Institutional repositories used as collaborative spaces can lead to multiple publications from the same data sets, a problem compounded by self-deposits from coauthors at different institutions into different repositories. Furthermore, libraries (as well as funders and national assessment efforts) want to be able to link related pieces (such as preprints, supplementary data, and images) with the publication. Multiple DOIs pointing to the same object pose a problem.
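As a concrete aside, one reason DOIs and similar persistent identifiers are attractive is that they are machine-actionable: the doi.org resolver supports content negotiation, returning citation metadata rather than a landing page when a client asks for it. A minimal sketch, using this report's own DOI as the example:

    # Minimal sketch: DOIs are actionable identifiers. The doi.org resolver
    # supports content negotiation, so a client can request citation metadata
    # (CSL JSON) instead of the landing page. The DOI below is this report's own.
    import json
    import urllib.request

    def doi_metadata(doi):
        """Return citation metadata (CSL JSON) for a DOI via content negotiation."""
        request = urllib.request.Request(
            "https://doi.org/" + doi,
            headers={"Accept": "application/vnd.citationstyles.csl+json"},
        )
        with urllib.request.urlopen(request) as response:
            return json.load(response)

    record = doi_metadata("10.25333/rqgd-b343")
    print(record.get("title"), "|", record.get("publisher"))

The same identifier thus serves both human readers (via the landing page) and machine workflows (via negotiated metadata), independent of where the object is stored.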
Some libraries use DataCite or Crossref to mint and publish unique, long-term identifiers and thus minimize the potential for broken citation links.40 Ideally, libraries would contribute to a hub for the metadata describing their researchers' data sets regardless of where the data sets are stored.

FIGURE 3. Examples of some DOI (left) and ARK (right) identifiers41

MOVING FROM "AUTHORITY CONTROL" TO "IDENTITY MANAGEMENT"

The emphasis in authority work is shifting from construction of text strings to identity management—differentiating entities, creating identifiers, and establishing relationships among entities.42 The intellectual work required to differentiate names is the same for both current authority work and identity management. Focus Group members agree that the future is in identity management and getting away from "managing text strings" as the basis of controlling headings in bibliographic records.43 But identity management poses a change in focus, from providing access points in resource descriptions to describing the entities in the resource (works, persons, corporate bodies, places, events) and establishing the relationships and links among them.

The transition from "authority control" and "authorized access points" in our legacy systems to identity management requires us to separate identifiers from their associated labels. A unique identifier could be associated with an aggregate of attributes that would enable users to distinguish one entity from another.44 Ideally, libraries could take advantage of the identifiers and attributes from other, nonlibrary sources. Wikidata, for example, aggregates a variety of identifiers as well as labels in different languages, as shown in figure 4.

FIGURE 4. One Wikidata identifier (Q19526, https://www.wikidata.org/wiki/Q19526) links to other identifiers and labels in different languages

Providing contextual information is more important than providing one unique label. Labels could differ depending on communities—such as various spellings of names and terms, different languages and writing systems, and different disciplines—without requiring that one form be preferred over another. Label preference becomes localized rather than homogenized for global use. A key barrier to moving from text strings to identity management is the lack of technology and infrastructure to support it. New tools are needed to index and display information about the entities described, with links to the sources of the identifiers. Since multiple identifiers may point to the same entity, tools to reconcile them will also be needed. Some systems index only the controlled access points, which is a problem when dealing with names represented in different languages.
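To make figure 4 concrete, the sketch below retrieves a few of the labels and identifier claims that Wikidata aggregates for Q19526, using the public wbgetentities API. The property numbers are Wikidata's (P214 = VIAF ID, P244 = Library of Congress authority ID); whether a given entity carries them varies.

import requests

def wikidata_entity(qid: str) -> dict:
    """Fetch labels and claims for one Wikidata entity."""
    resp = requests.get(
        "https://www.wikidata.org/w/api.php",
        params={"action": "wbgetentities", "ids": qid,
                "props": "labels|claims", "format": "json"},
        timeout=10,
    )
    resp.raise_for_status()
    return resp.json()["entities"][qid]

entity = wikidata_entity("Q19526")
for lang in ("en", "de", "ja"):
    print(lang, entity["labels"].get(lang, {}).get("value"))
for prop in ("P214", "P244"):  # VIAF ID, LC authority ID
    for claim in entity.get("claims", {}).get(prop, []):
        print(prop, claim["mainsnak"].get("datavalue", {}).get("value"))

One identifier, many labels: the localized label preference described above falls out of the data model rather than being bolted on.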
Can library systems be reconfigured to deal with identifiers as the match point, collocation point, and the key to whatever associated labels are displayed and indexed?45 Some Focus Group members are experimenting with Wikidata as another option for assigning identifiers to names not represented in authority files, which would broaden the potential pool of contributors.46 Many libraries are looking toward Wikidata and Wikibase—the software platform underlying Wikidata—to solve some of the long-standing issues faced by technical services departments, archival units, and others.47 Wikidata and Wikibase are viewed as a possible alternative to traditional authority control, with other potential benefits such as embedded multilingual support and bridging the silos describing an institution's resources. Focus Group members' experimentation with Wikidata, and OCLC projects using the Wikibase platform, indicate that Wikibase is a plausible framework for realizing linked data implementations. This infrastructure could enable the Focus Group and the wider bibliographic and archival communities to focus on the entities that need to be created, their relationships with each other, and how best they can increase discoverability by end-users. Identity management could also bridge the variations of names found in journal articles, scholarly profile services, and library catalogs, transcending these now siloed domains. This bridge is a requirement to fulfill the promises of linked data.

Because Wikidata was originally seeded by drawing data from Wikipedia, representation of books in Wikidata focuses on "works" and their authors. This focus on works and authors could be viewed as an alternate version of the traditional author/title entries in authority files. Books that are "notable" are more likely to be represented in Wikidata. Recently, an effort to support citations in Wikipedia articles, WikiCite,48 has demonstrated a need to register and support identifiers that make up those citations, including information about a specific edition or document.

One of the most practical—and powerful—aspects of identity management is to reduce the amount of copying and pasting in library metadata workflows when an identifier is stewarded in an external location. Identifiers could provide a bridge between MARC and non-MARC environments and to nonlibrary resources. Librarians would not have to be the experts in all domains.49 Many resources curated or managed by libraries are not under authority control, such as digital and archival collections, institutional repositories, and research data. Identifiers could provide links to these resources.

ADDRESSING THE NEED FOR MULTIPLE VOCABULARIES AND EQUITY, DIVERSITY, AND INCLUSION

Concepts or subject headings are particularly thorny, as terminology can differ depending on the time period and discipline. In some cases, terms may be considered pejorative, harmful, or even racist by some communities. Addressing language issues is important as libraries seek to develop relationships and build trust with marginalized communities.
The issues around equity, diversity, and inclusion are complex; the vocabulary used in subject headings is just one aspect, and language-neutral identifiers represent one approach. The issue of supporting "alternate" subject headings came to the fore when the Library of Congress's initial solution to change the LC subject heading "Illegal aliens" to "Undocumented immigrants" failed to be implemented. This prompted one Focus Group member to comment, "Being held hostage to a national system slow to change in the face of changing semantics is damaging to libraries, as generally we pride ourselves on being welcoming and inclusive." End-users hold their libraries accountable for what appears in their catalogs. Although LCSH is the Library of Congress Subject Headings, it is used worldwide, sometimes losing its context.50

Some see Faceted Application of Subject Terminology (FAST)51 as a means to engage the community and mitigate the issues that have driven attempts to develop alternate subject headings for LCSH. A subset of the Focus Group has been applying FAST to records that would otherwise lack any subjects. FAST was originally developed by OCLC as a medium between totally uncontrolled keywords at one end of the spectrum and difficult-to-learn-and-apply precoordinated subject strings at the other.52 FAST headings provide an easy transition to a linked data environment, since each FAST heading has a unique identifier. Because FAST headings are generated from Library of Congress precoordinated subject headings, they can also include the same terminology that some consider inappropriate or disrespectful. The recently launched FAST Policy and Outreach Committee53 represents FAST users, overseeing community engagement, term contributions, and procedures, and recommending improvements. Its vision statement reads: "FAST will be a fully supported, widely adopted and community developed general subject vocabulary derived from LCSH with tools and services that serve the needs of diverse communities and contexts."54

Multiple overlapping and sometimes conflicting vocabularies already exist in legacy library data.55 For example, Focus Group members in New Zealand add terms from the Māori Subject Headings thesaurus (Ngā Upoko Tukutuku) to the same records as LC subject headings; Focus Group members in Australia add terms authorized in the Australian Institute of Aboriginal and Torres Strait Islander Studies (AIATSIS) Thesauri.56 There may be no satisfactory equivalences across languages. Different concepts in national library vocabularies cannot always be mapped unequivocally to English concepts. The multiyear MACS (Multilingual Access to Subjects) project57 built relationships across three subject vocabularies: Library of Congress Subject Headings, the German GND integrated authority file, and the French RAMEAU (Répertoire d'autorité-matière encyclopédique et alphabétique unifié). It has been a labor-intensive process and is not known to be widely implemented.58

A growing percentage of data in institutions' discovery layers comes from non-MARC, nonlibrary sources. Metadata describing universities' research data and materials in institutional repositories is usually treated differently—and separately.
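The multiple-vocabulary approach described above is visible at the level of an individual MARC record, where each heading carries its source in $2 and, increasingly, its identifier in $0. A minimal sketch, assuming the pymarc 5.x API; the FAST URI shown is a made-up placeholder, not a real FAST number:

from pymarc import Field, Indicators, Record, Subfield

record = Record()
# LCSH form of the heading (650, second indicator 0 = LCSH).
record.add_field(Field(
    tag="650",
    indicators=Indicators(" ", "0"),
    subfields=[Subfield("a", "Cataloging")],
))
# The same concept as a FAST heading (second indicator 7, source in $2),
# carrying its identifier in $0 -- the linked-data-ready piece.
record.add_field(Field(
    tag="650",
    indicators=Indicators(" ", "7"),
    subfields=[
        Subfield("a", "Cataloging"),
        Subfield("2", "fast"),
        Subfield("0", "http://id.worldcat.org/fast/0000000"),  # placeholder
    ],
))

for field in record.get_fields("650"):
    print(field)

The same pattern extends to any vocabulary with a registered MARC source code, which is how the New Zealand and Australian records mentioned above carry several thesauri side by side.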
How should institutions provide normalization of, and access to, the entities described so users do not experience the "collision of name spaces" and ambiguous terms (or terms meaning different things depending on the source)? Synaptica Knowledge Solutions' ontology management tool Graphite,59 which creates and manages various types of controlled vocabularies, seems promising in this context.

Focus Group members cited examples of established vocabularies or datasets that have become outdated or do not provide for local needs or sensibilities. Slow or unresponsive maintenance models for established vocabularies have tempted some to consider distributed models. High training thresholds to participate in current models have contributed to a desire for alternatives.60 Linked data could provide the means for local communities to prefer a different label for an established vocabulary's preferred term for a concept or entity. One might reference a local description of a concept or entity not represented—or not represented satisfactorily—in established vocabularies or linked data sources. If these kinds of amendments and additions are made possible in a linked data environment, others could agree (or disagree) with the point of view by linking to the new resource. Such a distributed model for managing both terminology and entity description raises issues around metadata stability expectations, metadata interoperability, and metadata maintenance. How could a distributed model avoid people duplicating work on the same entity or concept? How would a distributed model record the trustworthiness of the contributors, or determine who would be allowed to contribute?

Stability and permanence issues have been highlighted by the numerous vocabularies created for specific projects that, once funding ended, remain frozen in time. As one Focus Group member noted, "Nothing is sadder than a vocabulary that someone invented that was left to go stale." Such examples provide a major reason for librarians wanting to rely on international authority files rather than on local solutions. They also exemplify the value of the Library of Congress taking on the entire cost of creating and maintaining LCSH.

The OCLC Research report on the findings from a 2017 survey of the Research Library Partnership on equity, diversity, and inclusion (EDI)61 spurred discussions on the complexity of embedding equity, diversity, and inclusion in controlled vocabularies in library catalogs.62 Educational institutions and libraries have undertaken EDI initiatives, and metadata departments have been struggling to support them. The excerpt from the EDI survey in figure 5 shows that metadata in library catalogs lags behind other areas in support of institutions' EDI goals and principles.

FIGURE 5. Excerpt from the survey results from the 2017 EDI survey of the Research Library Partnership: "What areas have you changed or plan to change due to your institution's EDI goals and principles?"63

Focus Group members are eager to provide more detailed subject access than is currently offered by national subject heading systems such as LCSH, which has more granularity for Western European places than for Southeast Asia and Africa. They see the need to offer more accurate and current terms and to replace terms that reflect bias or are considered offensive with more neutral terms.
Focus Group members identified several challenges in offering more respectful terminology in subject access for users:

• Discovery: Using other, less-offensive vocabularies locally can split collections in the library catalog, thus hampering discovery of all relevant materials.

• Lack of consensus: Focus Group members doubt that there can ever be complete consensus about any given text string. Terms acceptable to one community may not always be clear to others (for example, "Dissident art" rather than "Non-conformist art").

• Speed: The process of changing standard subject headings can be very slow.

• Capacity: Changing headings in existing records can require a massive undertaking. Targeted access point maintenance occurs in the context of access point maintenance generally. For example, the Library of Congress recently changed the heading "Mentally handicapped" to "People with mental disabilities." Implementing such changes in the catalog can involve a mix of automated, vended, and manual remediation methods, as well as decisions about resource allocation (a sketch of one such automated pass appears below).64 Some noted it would be less labor-intensive to present a "cultural sensitivity" message as part of the search interface to alert users that terms and annotations they find in a catalog may reflect the creator's attitude or the period in which the item was created and may be considered inappropriate today in some contexts.

• Sharing: Local vocabularies cannot be shared with other systems.

• Maintenance: Some who have tried to use local vocabularies more suitable for their context and communities found them too burdensome to maintain and abandoned them.

• Language barriers: The language of our controlled vocabularies may exclude audiences who do not read that language. The Ohio State University Libraries have tried to address this by developing some non-Latin-script equivalents of English subject terms.

• Classification: Current classification systems are apt to segregate ethnic groups. Rather than including them as part of an overall concept like history, education, or literature, they tend to be grouped together as one lump. As institutions store more publications off-site, the need to shelve materials together and have just one classification in a record has subsided, but few apply multiple classifications in one record.

Requirements for a distributed system that accommodates multiple vocabularies and could also support EDI converged around the need to support semantic relationships among different vocabularies. Communities of practice need a hub to aggregate and reconcile terms within their own domains. It was noted that different communities of practice might use terms that conflict with others' terminologies or mean different things. The PCC Linked Data Advisory Committee's Linked Data Infrastructure Models: Areas of Focus for PCC Strategies65 describes high-level functional requirements and a spectrum of models anticipated as cultural heritage institutions adopt linked data as a strategy for data sharing.
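The automated portion of the remediation mentioned in the "Capacity" bullet above can be as simple as a batch pass over a file of records. A sketch, again assuming the pymarc 5.x API and a hypothetical input file; everything it misses is left for vended or manual follow-up:

from pymarc import MARCReader, MARCWriter, Subfield

OLD, NEW = "Mentally handicapped", "People with mental disabilities"

with open("catalog.mrc", "rb") as src, open("remediated.mrc", "wb") as dst:
    writer = MARCWriter(dst)
    for record in MARCReader(src):
        if record is None:  # unreadable record; real jobs would log these
            continue
        for field in record.get_fields("650"):
            # Exact-match swap only: subdivided or variant headings fall
            # through to vended or manual review.
            field.subfields = [
                Subfield("a", NEW) if s.code == "a" and s.value == OLD else s
                for s in field.subfields
            ]
        writer.write(record)
    writer.close()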
The model must be both scalable and extensible, with the ability to accommodate the proliferation of new topics and terms symptomatic of the humanities and sciences and to facilitate contributions by the researchers themselves. It needs to be flexible enough to coexist with other vocabularies. Replacing text strings with stable, persistent identifiers would facilitate using different labels depending on context. This would accommodate different languages and scripts (and different spellings within a language, such as American vs. British English), as well as terms that are more respectful to marginalized communities. The 19 October 2017 OCLC Research Works in Progress webinar "Decolonizing Descriptions: Finding, Naming and Changing the Relationship between Indigenous People, Libraries, and Archives"66 described the process launched by the Association for Manitoba Archives and the University of Alberta Libraries to examine subject headings and classification schemes and consider how they might be more respectful and inclusive of the experiences of Indigenous peoples.

Expanding vocabularies to include those used in other communities requires building trust relationships. A model of "community contribution" for new terms and community voting could be more inclusive. Libraries' current "consensus environment" excludes a lot of people. Much metadata is currently created according to Western knowledge constructs, and systems have been designed around them. Communicating the history of changes and the provenance of each new or modified term would provide transparency that could contribute to the trustworthiness of the source. The edit history and discussion pages that are part of each Wikidata entity description are a possible model to follow. Requiring provenance as part of a distributed vocabulary model may help in creating an alternative environment that is more equitable, diverse, and inclusive.

LINKED DATA CHALLENGES

Identifiers and vocabularies are just two components required in the transition to linked data. A vital part of describing entities is the set of statements made about them. How will libraries resolve or reconcile conflicts between statements?67 Different types of inconsistencies may appear than we see now—for example, different birth dates for the same person. The provenance of each statement becomes more critical. Even in the current environment, certain sources are more trusted and give catalogers confidence in their accuracy. Libraries often have a list of "preferred sources."68 OCLC Research explored how libraries might apply Google's "Knowledge Vault" to identify statements that may be more "truthful" than others in the 2015 "Works in Progress Webinar: Looking Inside the Library Knowledge Vault."69 Focus Group members posited that aggregations such as WorldCat, the Virtual International Authority File (VIAF), and Wikidata may allow the library community to view statements from these sources with more confidence than others. Librarians could share their expertise by establishing the relationships between and among statements from different sources.

But good linked data requires good metadata. Administrators are well aware of the tension between delivering access to library collections in a timely manner and providing good quality description. The metadata descriptions must be full enough to allow libraries to manage their collections and to support accessibility and discoverability for the end-user.
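The statement-level provenance discussed above can be pictured with a toy model: conflicting statements are all retained with their sources, and a trust ranking (itself an editorial judgment of exactly the kind the Focus Group flagged) decides which value is surfaced. All names and rankings below are invented:

from collections import defaultdict

# Hypothetical trust ranking -- precisely the sort of judgment that can
# embed one point of view at the expense of others.
TRUST = {"viaf.org": 3, "wikidata.org": 2, "fan-site.example": 1}

statements = defaultdict(list)  # (entity, property) -> [(value, source)]

def assert_statement(entity, prop, value, source):
    statements[(entity, prop)].append((value, source))

assert_statement("ex:Person1", "birthDate", "1775-12-16", "viaf.org")
assert_statement("ex:Person1", "birthDate", "1775", "fan-site.example")

def preferred(entity, prop):
    """Surface the value from the most trusted source; keep the rest."""
    return max(statements[(entity, prop)], key=lambda vs: TRUST.get(vs[1], 0))

value, source = preferred("ex:Person1", "birthDate")
print(f"surface: {value} (per {source})")
print("retained:", statements[("ex:Person1", "birthDate")])

Nothing is discarded: the "conflicting" statement may simply be correct in another context, so the ranking governs display, not existence.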
Many libraries need to compromise: speed over accuracy, speed over depth, or brevity over nothing at all. These compromises are reflected in using inadequate vendor records, creating minimal or less-than-full descriptions for certain types of resources, and limiting authority work. Minimal-level cataloging is commonly used as an alternative to leaving materials uncatalogued, often because of the large volume of materials and insufficient staff resources.70 These less-than-full descriptions will result in fewer and less accurate linked data statements. Good linked data requires good metadata.

The transition period from legacy cataloging systems reliant on MARC to a new linked data environment with entities and statements has many challenges, since both standards and practices are moving targets. It is unclear how libraries will share statements rather than records in a linked data environment. Focus Group members were divided on whether a centralized linked data store would be needed to provide "trustworthy provenance" or whether data should be distributed with peer-to-peer sharing.71 Different statements might be correct in their own contexts. "Conflicting statements" might represent different world views. Selecting statements based on provenance could challenge our principles of equity, diversity, and inclusion.

The Focus Group members wondered how to involve the many vendors that supply or process MARC records in the transition to linked data. In the United Kingdom, the Jisc initiative "Plan M" (where "M" stands for "metadata") seeks to streamline the metadata supply chain among libraries, publishers, data suppliers, and infrastructure providers.72 Among the implications cited by stakeholders in the UK's National Bibliographic Knowledgebase (NBK) in Plan M's 10-year vision: "Linked data instances of the NBK will need to be created and maintained requiring convincing business-cases around the impact this could have on research."73 Working with others in the linked data environment involves people unfamiliar with the library environment, requiring metadata specialists to explain their needs in terms nonlibrarians can understand.

Describing "Inside-Out" and "Facilitated" Collections

OCLC Vice President and Chief Strategist Lorcan Dempsey refers to the shifting emphasis of libraries toward supporting the creation, curation, and discoverability of institutional resources as the "inside-out collection" (in contrast to the "outside-in collection," in which the library buys or licenses materials from external providers to make them accessible to a local audience). Providing access to a broader range of local, external, and collaborative resources around user needs is the "facilitated collection."74 Focus Group members' activities have increasingly focused on metadata that will provide access to the resources unique to their institutions as well as those in their consortia or national networks.

All resources collected, created, and curated by libraries require metadata to make them discoverable. However, Focus Group members concentrated on the challenges and issues related to specific formats:

• Archival collections
• Archived websites
• Audio and video collections
• Image collections
• Research data

All these content types can be categorized as belonging to "inside-out" collections, and each presents different challenges.
For example, Focus Group members described efforts to retrieve metadata from completely different systems as "super challenging." In addition, many of these resources are not under any authority control. Reconciling access points from various thesauri and metadata mapping work requires technical services expertise and skills.75 This reconciliation will also be needed in the previously discussed linked data environment. This section summarizes the discussions on these format types.

ARCHIVAL COLLECTIONS

Archival collections are in many ways the crown jewels of collections: they are unique research resources providing insights into the world across many centuries and places, and they provide the primary sources for new knowledge creation. Increasing the visibility of these collections reaps significant benefits for scholars as well as for libraries and archives. Archives are, however, complex and present different metadata issues compared with traditional library collections. As institutions turn to ArchivesSpace and other content management systems to provide infrastructures for structured archival metadata, various issues are emerging.76

Archives have had more autonomy than libraries within their institutions because they have unique collections with their own population of users, their own metadata standards, and their own systems. While some institutions have integrated archival processing within technical services, most maintain a separate unit. Archivists do not have the tradition of creating authority records and sharing identifiers for the same entity as is common among librarians. They also tend to use the fullest form of a name based on the information found in collections, while librarians focus on the "preferred" form found in publications. Even so, a significant shift from artisanal archival approaches to metadata standardization has been occurring.

So how can archivists and librarians better integrate their metadata and name authority practices? The number of personal names in archival collections can be so large that most are uncontrolled and lack identifiers. However, the contextual information that archivists provide for person and organization entities could enrich the information provided in authority files—a use case that was explored in the 2017-2018 Project Passage pilot77 and examined in more detail in 2019-2020 by the OCLC Research Library Partners Archives and Special Collections Linked Data Review Group.78 The increased reliance on electronic and digital resources during the COVID-19 pandemic will likely accelerate institutions' digitization of archival and distinctive collections that have been available only in physical form.79 More metadata may be created from digitized versions of these resources.

ARCHIVED WEBSITES

For some years, archives and libraries have been archiving web resources of scholarly or institutional interest to ensure their continuing access and long-term survival. Some websites are ephemeral or intentionally temporary, such as those created for a specific event. Institutions would like to archive and preserve the content of their websites as part of their historical record.
A large majority of web content is harvested by web crawlers, but the metadata generated by harvesting alone is considered insufficient to support discovery.80 Some archived websites are institutional, theme-based collections supporting a specific research area, such as Columbia University's Human Rights, Historic Preservation and Urban Planning, and New York City Religions.81 National libraries archive websites within their national domain. For example, the National Library of Australia's Archived Websites (1996-now)82 collects websites in partnership with cultural institutions around Australia, government websites formerly accessible through the Australian Government Web Archive, and websites from the .au domain collected annually through large-scale crawl harvests. These curated collections by subject provide snapshots of Australian cultural and social history. Examples of consortia-based archived websites include the Ivy Plus Libraries Confederation's Collaborative Architecture, Urbanism, and Sustainability Web Archive (CAUSEWAY) and Contemporary Composers Web Archive (CCWA), and the New York Art Resources Consortium (NYARC), which captures dynamic web-based versions of auction catalogs and artist, gallery, and museum websites.83

The Focus Group discussed the challenges of creating and managing the metadata needed to enhance machine-harvested metadata from websites (a sketch of that machine-harvested baseline follows this list). Some of the challenges identified:

• Type of website matters. Descriptive metadata requirements may depend on the type of website archived (e.g., transient sites, research data, social media, or organizational sites). Sometimes only the content of a site is archived, when the user experience of the site (its "look-and-feel") is not considered significant.

• Practices vary. Some characteristics of websites are not addressed by existing descriptive rules such as RDA (Resource Description and Access) and DACS (Describing Archives: A Content Standard). Metadata tends to follow bibliographic description traditions or archival practice, depending on who creates the metadata.

• Consider scale and projected use. Metadata requirements may differ depending on the scale of material being archived and its projected use. For example, digital humanists look at web content as data and analyze it for purposes such as identifying trends, while other users merely need individual pages. The level of metadata granularity (collection, seed/URL, document) may also vary based on anticipated user needs, the scale of material being crawled, and available staffing.

• Update frequency. Many websites are updated repeatedly, requiring re-crawling when the content has changed. Some types of change can result in capture failures.

• Multi-institutional websites. Some websites are archived by multiple institutions. Each may have captured the same site on different dates and with varying crawl specifications. How can they be searched and used in conjunction with one another?
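As a point of reference for the list above, the machine-harvestable baseline is thin: little more than a capture URL, a timestamp, and a media type per record. A sketch using the warcio library against a hypothetical WARC file:

from warcio.archiveiterator import ArchiveIterator

with open("example.warc.gz", "rb") as stream:  # path is illustrative
    for record in ArchiveIterator(stream):
        if record.rec_type != "response":
            continue
        # These headers are essentially all the descriptive metadata a
        # crawl produces on its own; everything else is human-added.
        print(
            record.rec_headers.get_header("WARC-Target-URI"),
            record.rec_headers.get_header("WARC-Date"),
            record.http_headers.get_header("Content-Type"),
        )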
A 2015 survey of the OCLC Research Library Partnership revealed the "lack of descriptive metadata guidelines" as the biggest challenge related to website archiving, leading to the formation of the OCLC Research Library Partnership Web Archiving Metadata Working Group.84 The challenges that the Focus Group identified were explored in depth by this working group, which issued a report of its recommendations in 2018, Descriptive Metadata for Web Archiving.85

AUDIO AND VIDEO COLLECTIONS

Focus Group members reported that their institutions had repositories filled with large amounts of audiovisual (A/V) materials, which often represent unique, local collections.86 However, as Chela Scott Weber states in Research and Learning Agenda for Archives, Special, and Distinctive Collections in Research Libraries, "For decades, A/V materials in our collections were largely either separated from related manuscript material (often shunted away to be dealt with at a later date) or treated at the item level. Both have served to create sizeable backlogs of un-quantified and un-described A/V materials."87 Much of this audiovisual material urgently requires preservation, digitization, clarification of conditions of use, and description. In addition, the needed skill sets and stakeholders across institutions are complex. The nature of the management of A/V resources requires knowledge of the use context as well as technical metadata issues, providing a complex environment in which to think through requirements for description and access. Further, libraries must deal with current time-based media that is either produced locally as part of research and learning or commercially licensed as streaming media.

Focus Group discussions centered on the A/V resources within archival collections—often in deteriorating formats, in large backlogs, and sometimes requiring rare and expensive equipment to access and assess the files. For locally generated content, institutions prefer that the creators describe their own resources. Metadata describing the same A/V materials may differ across library, archival, and digital asset management systems. The overarching challenge was how much effort should be invested in describing these A/V materials, given that they are unique. Institutions have used hierarchical structures to aggregate similar materials, with finding aids marked up in the Encoded Archival Description standard,88 which provides useful contextual information for individual items within a specific collection. But an aggregated approach to description can often lack important details about individual items needed for discovery, such as transcribed title and broadcast date. This is a particularly acute issue for legacy data describing recordings from years past.

Some hope that better discovery layers will alleviate the need to repeat the same information across databases, but presenting the information to users would require using consistent access points across systems. The same will be true in a linked data environment. But the challenge of linking between items and the finding aid, and of maintaining those links over time despite changes in systems, will remain.
Metadata for A/V materials needs to include important technical information, such as details about the A/V capture and digitization process: compression, year digitized, the technology used, and file compatibility. This data is critical to ensure perpetual access to such enormous files and mercurial playback formats. Some Focus Group members have implemented PREMIS (Preservation Metadata: Implementation Strategies),89 the international standard for metadata to support the preservation of digital objects and ensure their long-term usability, for some of their A/V materials.

OCLC Senior Program Officer Chela Scott Weber continues working with the Research Library Partnership on the needs and challenges of managing A/V collections, summarized in the OCLC Research Hanging Together blog posts "Assessing Needs of AV in Special Collections" and "Scale & Risk: Discussing Challenges to Managing A/V Collections in the RLP."90 A subset of the Focus Group members responded to Weber's 2019 survey assessing the needs of audiovisual materials in special collections within the Research Library Partnership; incorporating A/V collections into archival and digital collection workflows were two of the challenges that most interested respondents, as shown in figure 6.

FIGURE 6. Responses to 2019 survey on challenges related to managing A/V collections: "What challenges related to managing A/V collections would you be interested in the RLP addressing?" (n=137)

IMAGE COLLECTIONS

Focus Group members manage a wide variety of image collections presenting challenges for metadata management. In some cases, image collections that developed outside the library and its data models need to be integrated with other collections or into new search environments. Depending on the nature of the collection and its users, questions arise concerning identification of works, depiction of entities, chronology, geography, provenance, genre, and subjects ("of-ness" and "about-ness"). Image collections also offer opportunities for crowdsourcing and interdisciplinary research.91

Many libraries describe their digital image resources at the collection level while selectively describing items. As much as possible, enhancements are done in batch. Some do authority work, depending on the quality of the accompanying metadata. Some libraries have disseminated metadata guidelines to help bring more consistency to the data. Among the challenges discussed by the Focus Group:

• Variety of systems and schemas: Image collections created in different parts of the institution, such as art or anthropology departments, serve different purposes and use different systems and schemas than those used by the library. The metadata often comes in spreadsheets or unstructured accompanying data. Often, the metadata created by other departments requires much editing, massaging, and manual review. The situation is simpler when all digitization is handled through one centralized location and the library does all the metadata creation.
Some libraries are using Dublin Core for their image collections' metadata, and others are using MODS (Metadata Object Description Schema).92 Some wrap the metadata records in METS (Metadata Encoding and Transmission Standard),93 a schema maintained by the Library of Congress designed to express the hierarchical nature of digital library objects, the names and locations of the files that comprise those objects, and the associated metadata. Some suggested that MODS be used in conjunction with MADS (Metadata Authority Description Schema).94

• Duplicate metadata for different objects: Metadata for a scanned set of drawings may be identical, even though there are slight differences in those drawings. Duplicating the metadata across similar objects is likely due to limited staff. Possibly the faculty or the photographers could add more details.

• Lack of provenance: A common challenge is receiving image collections with scanty metadata and with no information regarding their provenance. For example, metadata staff at one institution were given OCR'ed text retrieved by a researcher from HathiTrust. Millions of images lacked the location of the original source material, and that therefore limited—if not discredited—any further use.

• Maintaining links between metadata and images: How should libraries store images and keep them in sync with the metadata? There may be rights issues from relying on a specific platform to maintain links between metadata and images. Where should thumbnails live?

• Relating multiple views and versions of the same object: Multiple versions of the same object taken over time can be very useful for disciplines like forensics. For example, Brown University decided to describe a "blob" of various images of the same thing in different formats and then describe the specific versions included. This work was done even though there is no system yet that displays relationships among images, such as components of a piece, even when the metadata records are wrapped and stored in METS.

• Managing relationships with faculty and curators: It is important to ensure that faculty feel their needs are met. Collaboration is necessary among holders of the materials, metadata specialists, and developers, as all come from different perspectives. The challenge is to support both a specific purpose and groups of people as well as large-scale discovery.

• Aggregating digital collections: Institutions have been sharing the metadata for their digital collections with both national and international discovery services. Within individual organizations, librarians create and recreate metadata for digital and digitized resources in a plethora of systems—the library catalog, archive management, digital asset and preservation systems, the institutional repository, research management systems, and external subscription-based repositories. Targets for sharing this metadata range from tailored topic-based digital discovery services, to national and international aggregations such as Google Scholar, HathiTrust, Digital Public Library of America (DPLA), Internet Archive, Trove, and WorldCat, to online exhibitions such as Google Arts and Culture, or image banks such as Flickr or Unsplash. Such aggregations can help inform an institution's own collection development, as librarians can see their contributions in the context of others' content and identify gaps that they may wish to fill locally.95 Aggregators often have different guidelines and input formats.
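For a sense of what contributors actually package for aggregators, the following sketch builds a minimal Dublin Core (oai_dc) description for a single digital image using only Python's standard library; the element values are invented:

import xml.etree.ElementTree as ET

OAI_DC = "http://www.openarchives.org/OAI/2.0/oai_dc/"
DC = "http://purl.org/dc/elements/1.1/"
ET.register_namespace("oai_dc", OAI_DC)
ET.register_namespace("dc", DC)

dc_record = ET.Element(f"{{{OAI_DC}}}dc")
for element, value in [
    ("title", "Campus aerial photograph, 1968"),
    ("creator", "University Photographer's Office"),
    ("date", "1968"),
    ("type", "StillImage"),
    ("identifier", "https://example.edu/digital/id/1234"),
    ("rights", "No known copyright restrictions"),
]:
    ET.SubElement(dc_record, f"{{{DC}}}{element}").text = value

print(ET.tostring(dc_record, encoding="unicode"))

The flatness of oai_dc is exactly why the richer schemas above (MODS, METS, MADS) persist locally even when only Dublin Core travels to the aggregator.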
Aggregators’ very reasonable contention that they cannot support many variations in submitted metadata conflict with contributors’ very reasonable contention that they cannot support the different needs of a wide range of aggregators. Disseminating corrections or updates between the source and the aggregation can be problematic. Information that may have been corrected in the chain leading to incorporation in the aggregation may not be pushed back to the source, so that the same errors must be corrected repeatedly. It is often not clear what data elements have been updated, when, or by whom. Aggregating images and bringing together different images or versions of the same object was the goal of the 2012-2013 OCLC Research Europeana Innovation Pilots,96 which developed a method for hierarchically structuring cultural objects at different similarity levels to find “semantic clusters”— those that include terms with a similar meaning. In 2017, OCLC implemented the International Interoperability Image Framework (IIIF)97 Presentation Manifest protocol in its CONTENTdm digital content management system, an aggregation containing more than 70 million digital records contributed by over 2,500 libraries worldwide. In 2019 OCLC Research developed an IIIF Explorer experimental prototype for testing and evaluation that searches across all the CONTENTdm images using the IIIF Presentation Manifest protocol,98 as shown in figure 7. Aggregating content across IIIF-compliant systems may facilitate discovery across the plethora of platforms containing digital content mentioned above. In 2020, OCLC Research launched the CONTENTdm Linked Data Pilot,99 focused on developing scalable methods and approaches to produce machine-readable representations of entities and relationships and make visible the connections formerly invisible. Existing record-based metadata is being converted to linked data by replacing strings of characters with identifiers from known authority files and local library-defined vocabularies; the resulting graphs of entities and relationships can retrieve contextual information from sources such as GeoNames and Wikidata. This pilot (to be completed by August 2020) is addressing many of the above challenges identified by the Focus Group. 22 Transitioning to the Next Generation of Metadata FIGURE 7. The OCLC ResearchWorks IIIF Explorer retrieves images about “Paris Maps” across CONTENTdm collections RESEARCH DATA Research funders expect that the research data resulting from research they support will be archived and made available to others. Institutions have allotted more resources to collecting and curating this scholarly resource for reuse within the scholarly record. OCLC Research Scientist Ixchel Faniel’s two-part blog entry “Data Management and Curation in 21st Century Archives” (Sept 2015)100 prompted the discussion among Focus Group members on the metadata needed for research data management.101 To maximize the chances that metadata for research data are shareable (that is, sufficiently comparable) and helpful to those considering reusing the data, our communities would benefit from sharing ideas and discussing plans to meet emerging discovery needs. Metadata is important for both discovery and reuse of datasets. The 2016 OCLC Research report Building Blocks: Laying the Foundation for a Research Data Management Program noted: Datasets are useful only when they can be understood. 
Encourage researchers to provide structured information about their data, providing context and meaning and allowing others to find, use and properly cite the data. At minimum, advise researchers to clearly tell the story of how they gathered and used the data and for what purpose. This information is best placed in a readme.txt file that includes project information and project-level metadata, as well as metadata about the data itself (e.g., file names, file formats and software used, title, author, date, funder, copyright holder, description, keywords, observation unit, kind of data, type of data and language).102

All four of the 2017-2018 The Realities of Research Data Management series webinars103 led by OCLC Senior Program Officer Rebecca Bryant mention the importance of metadata:

Research information infrastructure calls on many of the key strengths of the library profession. Metadata is fundamental to our complex research environment—beginning with the planning our researchers do before and during the creation of data; to managing the data; then to disseminating the knowledge gained; finally through to understanding the impact, engagement, and the resulting reputation of our home institutions.104

Libraries' expertise in metadata standards, identifiers, linked data, and data sharing systems as well as technical systems can be invaluable to the research life cycle. Faniel highlighted this value in the November 2019 Next blog post "Let's Cook Up Some Metadata Consistency":

[C]ataloging for discovery using terms and definitions that are consistent across repositories is critical, if we want the data and their associated metadata to be discoverable for reuse in any way imaginable. . . . Librarians and archivists can help create consistencies in metadata that build bridges between researchers and repositories, thus greatly increasing the discovery, reuse, and value of their institutions' research investments.105

National contexts differ. For example, our Australian colleagues can take advantage of Australia's National Computational Infrastructure for big data and the Australian Data Archive for the social sciences.106 Canada has launched a national network called Portage for the "shared stewardship of research data."107

Some institutions have developed templates to capture metadata in a structured form. Some Focus Group members noted the need to keep such forms as simple as possible, as it can be difficult to get researchers to fill them in. All agreed that data creators need to be the main source of metadata. But what will inspire data creators to produce quality metadata? New ways of training and outreach are needed, an area of exploration within Metadata 2020's Research Communications project.108 Focus Group members generally agreed on the data elements required to support reuse: licenses, processing steps, tools, data documentation, data definitions, data steward, grant numbers, and geospatial and temporal data (where relevant). Metadata schemas used include Dublin Core, MODS (Metadata Object Description Schema), and DDI (the Data Documentation Initiative's metadata standard).
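The readme-style guidance quoted above translates naturally into a small structured file that travels with the deposit. A sketch using the fields named in the excerpt, with invented values:

import json

dataset_metadata = {
    "title": "Creek turbidity observations, 2019-2020",
    "author": "J. Researcher",
    "date": "2020-06-30",
    "funder": "Example Science Foundation",
    "copyright_holder": "Example University",
    "description": "Weekly turbidity readings from three monitoring sites.",
    "keywords": ["water quality", "turbidity"],
    "observation_unit": "site-week",
    "kind_of_data": "observational",
    "language": "en",
    "files": [
        {"name": "turbidity.csv", "format": "CSV",
         "software": "R 3.6", "description": "one row per reading"},
    ],
}

# Written alongside the data files so the description travels with them.
with open("README_metadata.json", "w") as fh:
    json.dump(dataset_metadata, fh, indent=2)

Whether this lives in a readme.txt, a JSON file, or a repository deposit form matters less than that the data creator records it while the project context is still fresh.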
The Digital Curation Centre in the UK provides a linked catalog of metadata standards.109 The Research Data Alliance's Metadata Standards Directory Working Group has set up a community-maintained directory of metadata standards for different disciplines.110 The disparity of metadata schemas across disciplines represents a hurdle for institutions' discovery layers.

The importance of identifiers for both the research data and the data creator(s) has become more widely acknowledged. DOIs, Handles, and ARKs (Archival Resource Keys) have been used to provide persistent access to datasets. Identifiers are available at the full dataset level and for component parts, and they can be used to track downloads and potentially help measure impact. Both ORCID and ISNI are in use to identify data creators uniquely, and work is continuing on the Research Organization Registry to address institutional affiliations.

Among the most critical issues identified by Focus Group members is that metadata specialists need to be more involved in the early stages of the research life cycle. Researchers need to understand the importance of metadata in their data management plans. The lack of "metadata governance" across an institution makes integrating workflows between repositories and discovery layers problematic. Some Focus Group members have started to analyze the metadata requirements for the research data life cycle, not just the final product, asking questions like: Who are the collaborators?111 How do various projects use different data files? What kinds of analysis tools do they use? What are the relationships of data files across a project, between related projects, and to other scholarly output such as related journal articles? Research support services such as those offered at the University of Michigan112 are being developed to assist researchers during all phases of the research data life cycle, often through collaboration with other campus units.

Some libraries have started to provide research data management support in a variety of ways. For example, metadata specialists work with their institutions' Scholarly Communications and Publishing Division, which also manages the institutional repository. These institutional repositories may have only "citation" or "metadata-only" records with a link to the full text or data set deposited in a disciplinary repository. "Metadata consultation services" may be provided to advise on the data management plan, which includes appropriate metadata standards and controlled vocabularies, a strategy to effectively organize the data, and an approach that will facilitate reuse of the data years after the research is completed.
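On the creator side of the identifier story, the ORCID public API returns a machine-readable record for any iD. A minimal sketch; the iD used is ORCID's well-known sample record (Josiah Carberry):

import requests

def orcid_record(orcid_id: str) -> dict:
    """Fetch the public ORCID record for one researcher iD."""
    resp = requests.get(
        f"https://pub.orcid.org/v3.0/{orcid_id}/record",
        headers={"Accept": "application/json"},
        timeout=10,
    )
    resp.raise_for_status()
    return resp.json()

record = orcid_record("0000-0002-1825-0097")  # ORCID's sample iD
name = record["person"]["name"]
print(name["given-names"]["value"], name["family-name"]["value"])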
The OCLC Research report series The Realities of Research Data Management classifies metadata support as part of the "expertise" function and flags some variations in its case studies.113 At the University of Illinois at Urbana-Champaign, metadata consultants help researchers with metadata regardless of where the research data is deposited; Monash University supports metadata curation only for local deposits.114

Communication is key for researchers to understand the importance of metadata throughout the research life cycle. Some universities offer "research sprints," where researchers partner with a team of expert librarians whose services may include metadata creation, management, analysis, and preservation. The "Shared BigData Gateway for Research Libraries," hosted by Indiana University and partially funded by the Institute of Museum and Library Services, is developing a cloud-based platform to share data and expertise across institutions, including datasets such as records from the US Patent and Trademark Office and the Microsoft Academic Graph.115

Curation of research data as part of the evolving scholarly record requires new skill sets, including deeper domain knowledge and experience with data modeling and ontology development. Libraries are investing more effort in becoming part of their faculty's research process and are offering services that help ensure that research data will be accessible, if not also preserved. Good metadata will help guide other researchers to the research data they need for their own projects, and the data creators will have the satisfaction of knowing that their data has benefitted others.116

Evolution of "Metadata as a Service"

Metadata underlies the ability to discover all resources in the inside-out and facilitated collections. Focus Group members anticipate more involvement with metadata creation beyond the traditional library catalog and new services that leverage both legacy and future metadata.

METRICS

Library strategic goals often include key phrases such as "foster discovery and use," "enrich the user experience," and "explore new ways to support the whole life cycle of scholarship," all of which are predicated on quality metadata. Usage metrics—such as how frequently items have been borrowed, cited, downloaded, or requested—could be used to build a wide range of library services and activities. Focus Group members identified some possible services: informing collection management decisions about weeding projects and identifying materials for offsite storage; evaluating subscriptions; comparing citations in researchers' publications with what the library is not purchasing; and improving relevancy ranking, personalizing search results, offering recommendation services in the discovery layer, and measuring the impact of library usage on research, student success, or learning analytics.117

The University of Minnesota conducted a study to investigate the relationships between first-year undergraduate students' use of the academic library, academic achievement, and retention.118 The results suggest a strong correlation between using academic library services and resources—particularly database logins, book loans, electronic journal logins, and library workstation logins—and higher grade point averages.
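The statistical mechanics behind such usage studies are straightforward, even though the hard part (controlling for confounders) is not. A toy illustration with synthetic numbers, not the Minnesota data:

from scipy.stats import pearsonr

# Hypothetical per-student counts of library logins and end-of-year GPAs.
logins = [0, 2, 3, 5, 8, 9, 12, 15, 20, 24]
gpas = [2.1, 2.4, 2.8, 2.6, 3.0, 3.1, 3.2, 3.4, 3.5, 3.8]

r, p_value = pearsonr(logins, gpas)
print(f"Pearson r = {r:.2f}, p = {p_value:.4f}")

A positive r here says nothing about causation; the published studies pair such figures with controls for student demographics and prior achievement.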
In the United Kingdom, the Jisc Library Impact Data Project found a similar correlation.119

CONSULTANCY

Metadata's value is demonstrated by integrating it into the fabric of both the library and other units across the campus. For example, metadata specialists can provide "metadata as a service"—consultancy in the earliest stages of both library and research projects.120 An emerging trend is for digital humanities departments to request advice from metadata specialists on metadata standards and how to use controlled vocabularies. The metadata consultant role is increasingly visible in recent library job postings. One Metadata Librarian posting at Cornell121 allocates 20% of the position's duties to "metadata outreach and consultation": "Maintains strong working relationships and communicates regularly with staff across Cornell, fostering collaborative efforts between Metadata Services and the greater Cornell community." Georgia Tech is recruiting a metadata librarian who will "serve as a metadata consultant to larger library projects/initiatives. Work closely with other Library departments, Emory University Libraries, GALILEO, University System of Georgia Libraries, and other partners involved in joint projects."122

NEW APPLICATIONS

The shared and consistent use of MARC fields supports new applications. Libraries currently use identifiers in bibliographic records to fetch tables of contents, abstracts, reviews, and cover images and to generate floor maps of where to locate resources in a specific classification range (such as in OCLC's integration with StackMap).123 Bibliographic metadata is used to populate digital asset management systems and institutional repositories and, with tools such as Tableau and OpenRefine, can enable a richer analysis and view of collections. MARC metadata connects scholars with the bibliographic data for their projects and can generate relationships to related resources with applications such as Yewno.124 MARC metadata is also being used to inform institutional output measures and affiliation tracking, and it serves as a source for building organization histories. The provenance implicit in an institution's bibliographic metadata has proven helpful in documenting theft cases. Analyzing catalog data through data mining can also enrich the metadata, such as generating language codes missing in related records or identifying the original titles of translated works. MARC data has also supported generating subject maps to discover relationships otherwise not explicit in the cataloging metadata.125

Visualizations represent another type of metadata service. A striking example comes from the Austlang national codeathon held in 2019, a collaboration among the National Library of Australia, the Australian Institute of Aboriginal and Torres Strait Islander Studies, Trove, Libraries Australia, and the State and Territory libraries, to identify items in Indigenous Australian languages.126 Figure 8 shows the results: a map indicating the 465 Indigenous languages in the Australian National Bibliographic Database tagged as a result of the codeathon, and an example of involving the community to enhance bibliographic metadata.
Visualizations represent another type of metadata service. A striking example comes from the Austlang National Codeathon held in 2019, a collaboration among the National Library of Australia, the Australian Institute of Aboriginal and Torres Strait Islander Studies, Trove, Libraries Australia, and the State and Territory libraries to identify items in Indigenous Australian languages.126 Figure 8 shows the results: a map of the 465 Indigenous languages tagged in the Australian National Bibliographic Database as a result of the code-a-thon, and an example of involving the community to enhance bibliographic metadata.

FIGURE 8. Distribution of 465 Indigenous language codes in the Australian National Bibliographic Database (https://www.nla.gov.au/our-collections/processing-and-describing-the-collections/Austlang-national-codeathon)

BIBLIOMETRICS

Library metadata is also being used to generate bibliometrics, statistical analyses of books, articles, and other publications. Using library metadata for digital humanities research projects has much potential. For example, a Library of Congress researcher used bibliographic metadata to trace the history of publishing and copyright, and UCLA researchers have used cataloging metadata to track the commercialization of inventions such as insulin. A novel use of cataloging metadata came from Hachette UK, the United Kingdom's second largest bookseller, which commissioned the Graphic History Company to unlock the histories of Hachette's nine publishing houses and weave them into a cohesive story. The Graphic History Company asked the British Library for every author and book title published by those nine houses over 250 years. The British Library provided a list of over 55,000 authors, from which the 5,000 most prominent were selected to create perhaps the most beautiful example of metadata use: a giant mural spanning eight floors featuring all 5,000 authors in chronological order. (Figure 9 shows one part of the mural; for more images of the mural, see Hachette's River of Authors.)127

FIGURE 9. Hachette UK's "River of Authors," generated from the British Library's catalog metadata

SEMANTIC INDEXING

When controlled vocabularies and thesauri are converted into linked open data and shared publicly, their traditional role of facilitating collection browsing will fade but could find a renewed purpose within web-based knowledge organization systems (KOS).128 As Marcia Zeng points out in Knowledge Organization Systems (KOS) in the Semantic Web: A Multi-dimensional Review, a KOS vocabulary is more than just the source of values to be used in metadata descriptions: by modeling the underlying semantic structures of domains, KOS act as semantic road maps and make possible a common orientation by indexers and future users, whether human or machine.129 Good examples of such repurposing are the Getty Vocabularies, which allow browsing of Getty's representation of knowledge and also help users generate their own SPARQL queries that can be embedded in external applications. Another example is Social Networks and Archival Context (SNAC),130 which enables browsing of entities and relationships independently of their collections of origin. In such cases, the discovery tool pivots to being person-centric (or family-centric, or topic-centric), rather than (only) collection-centric. Rather than working within one "global domain," metadata specialists could add value by building bridges from the metadata in library domain databases to other domains. Wikidata is an example of a platform aggregating entities from different sources and linking to more details in the various language Wikipedias. Some institutions have employed Wikimedians in Residence to accelerate this process.
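One concrete form such a bridge can take is a federated identifier lookup. The sketch below uses the SPARQLWrapper Python library against the public Wikidata query endpoint to pull VIAF (property P214) and ISNI (P213) identifiers for a name heading; the heading string and user-agent value are illustrative assumptions.

    # Minimal sketch: bridge a name heading to external identifiers via Wikidata.
    # P214 = VIAF ID, P213 = ISNI; the example heading is illustrative.
    from SPARQLWrapper import SPARQLWrapper, JSON

    QUERY = """
    SELECT ?person ?viaf ?isni WHERE {
      ?person rdfs:label "Toni Morrison"@en ;
              wdt:P214 ?viaf .
      OPTIONAL { ?person wdt:P213 ?isni . }
    } LIMIT 5
    """

    sparql = SPARQLWrapper("https://query.wikidata.org/sparql",
                           agent="metadata-bridge-sketch/0.1")  # courtesy user-agent
    sparql.setQuery(QUERY)
    sparql.setReturnFormat(JSON)

    for row in sparql.query().convert()["results"]["bindings"]:
        print(row["person"]["value"],
              row["viaf"]["value"],
              row.get("isni", {}).get("value", "no ISNI"))

A script along these lines could attach language-neutral identifiers to local headings in bulk, which is exactly the kind of bridge-building described above.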
Focus Group members hope that Artificial Intelligence—or at least machine learning—could mitigate the amount of manual effort currently required to link names and concepts in research data. Perhaps algorithms could be used to match names based on related metadata or sources, relate topics to each other based on context, disambiguate names based on other available metadata, and analyze datasets to identify possible biases in a collection.131 A few Research Library Partners participate in Artificial Intelligence for Libraries, Archives & Museums (AI4LAM),132 an "international, participatory community focused on advancing the use of artificial intelligence in, for and by libraries, archives, and museums."133 Some high-level recommendations on enhancing descriptions at scale and improving discovery are noted in Thomas Padilla's 2019 OCLC Research position paper Responsible Operations: Data Science, Machine Learning, and AI in Libraries.134
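To make the name-matching idea concrete, here is a minimal sketch using only Python's standard library. Production identity reconciliation would also weigh dates, affiliations, and co-occurring metadata; the sample headings and the 0.8 threshold are illustrative assumptions.

    # Minimal sketch: fuzzy-match an incoming name against authority headings.
    from difflib import SequenceMatcher

    AUTHORITY = ["Morrison, Toni, 1931-2019", "Morison, Samuel Eliot, 1887-1976"]
    INCOMING = "Toni Morrison"

    def normalize(heading: str) -> str:
        """Turn 'Surname, Forename, dates' into 'Forename Surname'."""
        parts = heading.split(",")
        return f"{parts[1].strip()} {parts[0].strip()}" if len(parts) > 1 else heading

    def similarity(a: str, b: str) -> float:
        """Case-insensitive ratio of matching character runs (0.0-1.0)."""
        return SequenceMatcher(None, a.lower(), b.lower()).ratio()

    for heading in AUTHORITY:
        score = similarity(INCOMING, normalize(heading))
        if score > 0.8:  # threshold would be tuned against human-reviewed pairs
            print(f"candidate match: {heading!r} (score {score:.2f})")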
Preparing for Future Staffing Requirements

The anticipated changes from transitioning to the next generation of metadata will also shift staffing requirements. Focus Group members identified new skill sets needed both by professionals entering the field and by seasoned catalogers, driven by the changing information technology landscape and increasing staff attrition. Focus Group members characterized professionals as those who "trail-blaze innovations," which are then routinized for nonprofessionals. These discussions reinforce Padilla's recommendations on investigating core competencies, committing to internal talent, and expanding evidence-based training.135

THE CULTURE SHIFT

Focus Group members reported a delicate balance between allocating staff to "traditional cataloging activities" (such as original and copy cataloging and authority work) and to more exploratory R&D projects, such as linked data projects, exploring new data models and technologies such as Wikidata, and learning about emerging standards and identifiers. A culture shift is needed: from pride in production alone to valuing opportunities to learn, explore, and try new approaches to metadata work. Metadata specialists must understand that improving all metadata is more important than any individual's productivity numbers. This culture shift requires buy-in from administrators to support training programs for staff to learn new workflows for processing multiple formats, and to view metadata specialists as more than just "production machines."

Metadata managers faced with staff reductions while still being expected to maintain production levels must justify allocating staff time for R&D—or "play time"—to explore such questions as: What can we stop doing? What is the one thing you learned that we all need to do more of? What do you need to move forward? What open source software could help us do the work more efficiently? What new methods could enhance discoverability, access, and use of our facilitated collections? Managers must incorporate goals for success that are not based solely on numbers.136

Indications of this culture shift include institutions outsourcing some metadata work or training support staff to create metadata for the "easier stuff" while mandating that catalogers do only what well-trained humans can do. Metadata managers could scope the materials requiring metadata that support staff or students can handle, providing templates where possible. Once those tasks are removed, most of what remains requires highly skilled metadata specialists with expertise in languages, physical formats, and disambiguating and describing persons, organizations, and other entities.

LEARNING OPPORTUNITIES

To encourage metadata specialists to change their mindsets about how they work and to stimulate interest in learning opportunities, Focus Group members have used several approaches:

• Identify who on your team has the aptitude to acquire new skills. At one institution, a staff member shared what she learned and the whole unit became "lively" because she brought her colleagues along. It created an appreciation for continuous learning, and staff presented their activities at national conferences.
• Convene cross-team group discussions to look at problem metadata and come up with solutions, encouraging staff to move forward together. Staff less interested in new skills can pick up some of the production from those who are learning new skills and producing less.
• Launch "reading clubs" in which staff all read an article and respond to three discussion questions, inspiring metadata specialists to think about broader metadata issues outside their daily work.
• Hold weekly group "video-viewing brown-bag lunches" on new developments such as linked data so staff can "watch and learn" together.
• Participate in multi-institutional projects to collaborate with peers to solve problems and cross-pollinate ideas.
• Encourage participation in professional conferences and standards development.

Educating and training catalogers has been at the forefront of many discussions in the metadata community. Both new professionals and seasoned catalogers need new skills to successfully transition to the emerging linked data environment. Catalogers are learning about and experimenting with BIBFRAME while remaining responsible for traditional bibliographic control of collections. Metadata specialists use tools for metadata mapping, remediation, and enhancement. They identify and map semantic relationships among assorted taxonomies to make multiple thesauri intelligible to end users. For the more technical aspects of metadata management, competition for talent from other industries has been increasing. This may intensify as metadata becomes more central to various areas of government, nonprofit, and private enterprise.137

NEW TOOLS AND SKILLS

The extent of metadata specialists' collaboration with IT or systems staff varies among institutions. Such collaboration is necessary for many reasons, including managing data that is outside the library's control. Some noted that "cultural differences" exist between the professions: developers tend to be more dynamic and focus on quick prototyping and iteration, while librarians focus first on documenting what is needed and are more "schematic." Which is more likely to succeed: teaching metadata specialists IT skills, or teaching IT staff metadata principles? The "holy grail" is to recruit someone with an IT background who is interested in metadata services. Retaining staff with IT skills is difficult—they are in demand for higher-paying jobs in the private sector.
Focus Group members' experiences have shown that it is easier for librarians to learn programming skills than for IT specialists to learn the "technical services mindset." Ideally, Focus Group members would like a few staff who have the technical skills to take batch actions on data, or at least who know how to use the available external tools to automate as many tasks as possible. For many years, Focus Group members have been using MarcEdit and other tools such as OpenRefine, scripts (e.g., Python, Ruby, or Perl), and macros for metadata reconciliation and batch processing.138 MarcEdit is the most popular tool and has a large, global, and active user community, as indicated in its 2017 Usage Snapshot.139 Terry Reese, MarcEdit's developer, estimates that about one-third of all users work in non-MARC environments and that two-thirds of the most active users are OCLC members. Focus Group members reported that they use MarcEdit for data transformation, enhancing vendor records, building MARC records from spreadsheets, linked data reconciliation, de-duplicating records within a file, merging two or more records into one, Z39.50 harvesting, and reconciling metadata before sending records to other systems. The 2017 release of MarcEdit 7 includes new features such as lightweight clustering functionality, which provides a powerful way to find relationships between data without introducing a large learning curve, as well as mechanisms that support linked data.140 Reese has created a series of YouTube tutorials, available on his MarcEdit Playlist.141
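For the scripting side of this toolkit, the sketch below shows one of those batch tasks, de-duplicating records within a file, done in Python with the pymarc library. The file names and the choice of the 001 control number as the match key are illustrative assumptions; many workflows key on the 035 (OCLC number) instead.

    # Minimal sketch: drop duplicate records within a MARC file, keyed on the
    # 001 control number. File names are placeholders.
    from pymarc import MARCReader

    seen = set()
    with open("batch.mrc", "rb") as infile, open("deduped.mrc", "wb") as outfile:
        for record in MARCReader(infile):
            if record is None:
                continue  # skip unreadable records
            ctrl = record["001"]
            key = ctrl.data.strip() if ctrl else None
            if key and key in seen:
                continue  # duplicate control number: skip this record
            if key:
                seen.add(key)
            outfile.write(record.as_marc())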
Managers want to focus less on specific schemas and more on metadata principles that can be applied to a range of different formats and environments. Desirable soft skills include problem-solving, effective collaboration, willingness—even eagerness—to try new things, understanding researchers' needs, and advocacy. Although some metadata specialists have always enjoyed experimenting with new approaches, they often lack the time to learn new tools or methodologies while keeping up with their routine work assignments. Libraries should promote metadata as an exciting career option to new professionals in venues such as library schools and ALA's New Members Round Table. Emphasizing that metadata encompasses much more than library cataloging—entity identification; descriptive standards used in various academic disciplines; describing born-digital, archival, and research data that can interact with the semantic web—can increase its appeal. As one Focus Group member noted, "We bring order out of a vacuum."142

SELF-EDUCATION

Metadata is increasingly created outside the library by academics and students with minimal training, leading to a need for more catalogers with record maintenance skills. Focus Group members noted the need for technical skills such as simple scripting, data remediation, and identity management to reconcile equivalents across multiple registries. Frequently mentioned sources of instruction include Library Juice Academy, MarcEdit tutorials, LinkedIn Learning (which acquired Lynda.com), Library of Congress training webinars, ALCTS webinars, Code Academy, Software Carpentry, and conferences such as Code4Lib and Mashcat.143 W3C's Data on the Web Best Practices and Semantic Web for the Working Ontologist were recommended reading.144

Crucial to the success of such training is the ability to quickly apply what has been learned. If new skills are not applied, people forget what they have learned, and staff feel frustrated when they have invested the time to learn something that they cannot use in their daily work. Focus Group members have seen a big shift from relying on Library of Congress instructions to self-education from multiple sources. Some approaches mentioned by participants:

• Emphasize continuity of metadata principles when introducing an expanded scope of work.
• Take advantage of the Library Workflow Exchange,145 a site designed to help librarians share workflows and best practices, including scripts, across institutions.
• Follow the advice from the 2017 Electronic Resources and Libraries Conference: "Don't wait; iterate!" In other words, rather than waiting until staff have all the required skills, let them do tasks iteratively, learning as they go, so they are ready for new tasks when the time comes.
• Have small groups of metadata specialists take programming courses together, after which they can continue to meet and discuss ways to apply their new skills to automating routine tasks.
• Encourage staff to participate in events such as OCLC's DevConnect webinars146 to learn from libraries using OCLC APIs to enhance their library operations and services.
• Create reading and study groups that include cross-campus or cross-divisional staff.
• Expand the scope of current work so that metadata specialists can apply their skills to new domains or terminology, such as using Dublin Core for digital collections. Involve staff in digital projects from the conceptual stage through developing project specifications, quality assurance practices, and tool selection. As a bonus, this fosters collaborative teamwork relationships.
• Hire graduate students in computer science for short-term tasks such as creating scripts.

ADDRESSING STAFF TURNOVER

Turnover in a professional position within a cataloging or metadata unit now comes with a significant risk that it may be impossible to convince administrators to retain the position in the unit and repost it. This is particularly true when the outgoing incumbent performed a high proportion of "traditional" work, such as original cataloging in MARC. The odds of retaining the position are much greater if careful thought goes into how it could be reconfigured or repurposed to meet emerging needs.147

Most Focus Group members have had to address varying amounts of turnover, whether from retirements or from staff leaving for other positions; half of them needed to reconfigure the positions of outgoing librarians. Looking at what other institutions are advertising helps in creating an attractive position description. Many cataloging positions do not require an MLS degree, so recruiting professionals has focused on adaptability, on aligning new positions with university priorities, and on eagerness to learn and take initiative in areas such as metadata for research output, open access, digital collections, and linked data. Mapping out future strategies and designing ways of making metadata interoperate across systems have been components of recent recruitments. New staff with programming skills are sought after, as they can apply batch techniques to metadata that compensate for the loss of staff.
Using technology in support of library services helps catalogers "do more with less." Focus Group members want new staff to be aware of both the shared cataloging community and the overlaps with other cultural heritage organizations such as archives and museums. The library environment keeps evolving, and librarians have had to reflect on their priorities moving forward. Metadata managers need to rethink the roles of metadata specialists beyond "traditional" cataloging work. Potential candidates with more flexible skill sets have become more attractive than those with a traditional cataloging background who may not adapt well to working in new environments. Many cataloging roles and position descriptions may need to be rewritten and retooled. Perhaps the only activities that will perennially remain professional tasks are those like management, scouting new trends, strategizing, participating in new international standards, leading and implementing changes, and thinking about the big picture.

Impact

The next generation of metadata will become even more focused on entities rather than on record-based descriptions of an institution's collections. Focus Group members' linked data activities, including their participation in OCLC Research's Project Passage and CONTENTdm Linked Data pilots, contributed to OCLC obtaining Andrew W. Mellon funding for its two-year Shared Entity Management Infrastructure project,148 launched in January 2020. Eleven members of the Shared Entity Management Infrastructure Advisory Group are also Focus Group members. The project builds on OCLC Research's linked data work and will provide a production infrastructure with persistent, authoritative identifiers for persons and works. It will be largely API-based, allowing librarians to customize their workflows around linked data infrastructure. Focus Group members have long desired this infrastructure, as it will address many of the challenges documented above around persistent identifiers, especially identifiers for "works."

Authoritative, persistent identifiers provided by the Shared Entity Management Infrastructure will supply the needed language-neutral links to trustworthy sources. The metadata that libraries, archives, and other cultural heritage institutions have created and will create will provide the context for these entities, as "statements" associated with those links. The impact will be global, affecting how librarians and archivists will describe the inside-out and facilitated collections, inspiring new offerings of "metadata as a service," and influencing future staffing requirements.

ACKNOWLEDGMENTS

OCLC Research wishes to thank all Research Library Partners Metadata Managers Focus Group members who have shared their experiences and thoughts summarized here. We also extend thanks to the dedicated Metadata Managers Planning Group, which initiated the topics and provided the context statements and question sets, the responses to which served as the basis of our discussions. In addition, we particularly appreciate the insightful comments from the following Focus Group members who reviewed an earlier version of this document; their comments improved this synthesis.
• Charlene Chou, New York University
• Suzanne Pilsk, Smithsonian Institution
• Greg Reeve, Brigham Young University
• Alexander Whelan, Columbia University
• Helen K. R. Williams, London School of Economics

I also extend thanks to current and former OCLC colleagues Rebecca Bryant, Jody DeRidder, Annette Dortmund, Rachel Frick, Janifer Gatenby, Jean Godby, Shane Huddleston, Andrew Pace, Merrilee Proffitt, Nathan Putnam, Stephan Schindehette, and Chela Weber for their careful review of all or parts of earlier versions of this document. Thank you to Erica Melko for her editing, Jeanette McNicol for the design of this report, and JD Shipengrover for the cover artwork. On a personal note, I have greatly benefited from my interactions with the OCLC Research Partners Metadata Managers Focus Group and have been delighted to play a small part in this transition to the next generation of metadata.

APPENDIX

OCLC Research Library Partners Metadata Managers Planning Group, 2015-2020

Planning Group members selected the topics for the OCLC Research Library Partners Metadata Managers discussions, wrote the context statements explaining why each topic was important and timely, and developed the question sets that Focus Group members responded to. The Planning Group initiators for each topic also reviewed the draft summaries that were later posted on the OCLC Research Hanging Together blog. Current Planning Group members are listed in bold; institutional affiliations are given for the time when members served on the Planning Group:

• Jennifer Baxmeyer, Princeton University
• Sharon Farnel, University of Alberta
• Steven Folsom, Harvard University and Cornell University
• Erin Grant, University of Washington
• Dawn Hale, Johns Hopkins University
• Myung-Ja Han, University of Illinois, Urbana-Champaign
• Kate Harcourt, Columbia University
• Corey Harper, New York University
• Stephen Hearn, University of Minnesota
• Daniel Lovins, Yale University
• Roxanne Missingham, Australian National University
• Chew Chiat Naun, Cornell University and Harvard University
• Suzanne Pilsk, Smithsonian
• John Riemer, University of California, Los Angeles
• Carlen Ruschoff, University of Maryland
• Philip Schreur, Stanford University
• Jackie Shieh, George Washington University
• Melanie Wacker, Columbia University

NOTES

1. OCLC Research Library Partnership Metadata Managers Focus Group. https://www.oclc.org/research/areas/data-science/metadata-managers.html.

2. OCLC Research. "The OCLC Research Library Partnership." https://www.oclc.org/research/partnership.html.

3. Smith-Yoshimura, Karen. 2017. "Metadata Advocacy." Hanging Together: The OCLC Research Blog, 17 October 2017. https://hangingtogether.org/?p=6282.

4. British Library. 2019. Foundations for the Future: The British Library's Collection Metadata Strategy 2019-2023. London: British Library. https://www.bl.uk/bibliographic/pdfs/british-library-collection-metadata-strategy-2019-2023.pdf.

5. Ibid., 4.

6. Statistics as of 1 June 2020.

7. Library of Congress. "Program for Cooperative Cataloging." https://www.loc.gov/aba/pcc/.

8. Except for June 2020, when all discussions were held virtually because of the COVID-19 pandemic.

9. See Hanging Together: The OCLC Research Blog, search category Metadata. https://hangingtogether.org/?cat=81.
“What Metadata Managers Expect from and Value about the Research Library Partnership,” Hanging Together: The OCLC Research Blog, 16 April 2018. https://hangingtogether.org/?p=6683. 11. Analyses of the three International Linked Data Surveys for Implementers 2014-2018 and the spreadsheet of all responses to the surveys are available. See OCLC Research. 2020. “Linked Data.” International Linked Data Survey. https://www.oclc.org/research/themes/data-science /linkeddata/linked-data-survey.html. 12. Godby, Jean, Karen Smith-Yoshimura, Bruce Washburn, Kalan Davis, Karen Detling, Christine Fernsebner Eslao, Steven Folsom, Xiaoli Li, Marc McGee, Karen Miller, Honor Moody, Holly Tomren, and Craig Thomas. 2019. Creating Library Linked Data with Wikibase: Lessons Learned from Project Passage. Dublin, OH: OCLC Research. https://doi.org/10.25333/faq3-ax08; OCLC Research. 2020. “CONTENTdm Linked Data pilot.” https://www.oclc.org/research /themes/data-science/linkeddata/contentdm-linked-data-pilot.html; OCLC. 2020. “WorldCat®: OCLC and Linked Data.” Shared Entity Management Infrastructure. https://www.oclc.org/en/worldcat/linked-data/shared-entity-management-infrastructure.html; https://www.oclc.org/research/areas/data-science/metadata-managers.html https://www.oclc.org/research/partnership.html https://hangingtogether.org/?p=6282 https://www.bl.uk/bibliographic/pdfs/british-library-collection-metadata-strategy-2019-2023.pdf https://www.bl.uk/bibliographic/pdfs/british-library-collection-metadata-strategy-2019-2023.pdf https://www.loc.gov/aba/pcc/ https://hangingtogether.org/?cat=81 https://hangingtogether.org/?p=6683 https://www.oclc.org/research/themes/data-science/linkeddata/linked-data-survey.html https://www.oclc.org/research/themes/data-science/linkeddata/linked-data-survey.html https://doi.org/10.25333/faq3-ax08 https://www.oclc.org/research/themes/data-science/linkeddata/contentdm-linked-data-pilot.html https://www.oclc.org/research/themes/data-science/linkeddata/contentdm-linked-data-pilot.html https://www.oclc.org/en/worldcat/linked-data/shared-entity-management-infrastructure.html 36 Transitioning to the Next Generation of Metadata Library of Congress. “BIBFRAME.” Bibliographic Framework Initiative. https://www.loc.gov/bibframe/; Futornick, Michelle. 2019. “LD4P2 Linked Data for Production: Pathway to Implementation.” LS4P2 Project Background and Goals. Lyrasis. Posted 14 January 2019. https://wiki.lyrasis.org/display/LD4P2/LD4P2+Project+Background+and+Goals; Share-VDE (Share Virtual Discovery Environment). “An Effective Environment for the Use of Linked Data by Libraries.” Accessed 17 September 2019. https://www.share-vde.org /sharevde/clusters?l=en; Casalini, Michele, Chiat Naun Chew, Chad Cluff, Michelle Durocher, Steven Folsom, Paul Frank, Janifer Gatenby, Jean Godby, Jason Kovari, Nancy Lorimer, Clifford Lynch, Peter Murray, Jeremy Myntti, Anna Neatrour, Cory Nimer, Suzanne Pilsk, Daniel Pitti, Isabel Quintana, Jing Wang, and Simeon Warner. 2018. National Strategy for Shareable Local Name Authorities National Forum: White Paper. Ithaka, New York: Cornell University Library eCommons digital repository. https://hdl.handle.net/1813/56343. 13. Library of Congress. 2019. PCC (Program for Cooperative Cataloging) Task Group on Linked Data Best Practices. 2019. PCC Task Group on Linked Data Best Practices Final Report: Submitted to PCC Policy Committee 12 September 2019. Washington DC: Library of Congress. 
13. Library of Congress PCC (Program for Cooperative Cataloging) Task Group on Linked Data Best Practices. 2019. PCC Task Group on Linked Data Best Practices Final Report: Submitted to PCC Policy Committee 12 September 2019. Washington, DC: Library of Congress. https://www.loc.gov/aba/pcc/taskgroup/linked-data-best-practices-final-report.pdf; Library of Congress. 2018. "Charge for PCC Task Group on Identity Management in NACO," 5. Program for Cooperative Cataloging, revised 22 May 2018. https://www.loc.gov/aba/pcc/taskgroup/PCC-TG-Identity-Management-in-NACO-rev2018-05-22.pdf; Library of Congress. 2020. "PCC Task Group on URIs in MARC." Programs of the PCC. Charge. Accessed 19 September 2020. https://www.loc.gov/aba/pcc/bibframe/TaskGroups/URI-TaskGroup.html; Library of Congress. 2018. "PCC Linked Data Advisory Committee: Linked Data Advisory Committee Charge." PCC Task Groups 2018. Task Groups. Revised 24 July 2018. [Word doc; 28KB]. https://www.loc.gov/aba/pcc/taskgroup/task-groups.html.

14. Smith-Yoshimura, Karen. 2015. "Shift to Linked Data for Production." Hanging Together: The OCLC Research Blog, 13 May 2015. https://hangingtogether.org/?p=5195.

15. OCLC Research. 2020. "Linked Data." Linked Data Overview. https://www.oclc.org/research/areas/data-science/linkeddata/linked-data-overview.html. [All figures CC BY 4.0]

16. Smith-Yoshimura, Karen. 2019. "'Future Proofing' of Cataloging." Hanging Together: The OCLC Research Blog, 10 November 2019. https://hangingtogether.org/?p=7526.

17. ORCID: Connecting Research and Researchers. "What is ORCID." Our Vision. Accessed 19 September 2020. https://orcid.org/about/what-is-orcid/mission.

18. See, for example, the list of signatories of journal publishers requiring ORCID iDs for authors. ORCID. "ORCID Open Letter - Publishers." Accessed 19 September 2020. https://orcid.org/content/requiring-orcid-publication-workflows-open-letter.

19. ISNI. "What is ISNI." Accessed 19 September 2020. https://isni.org/page/what-is-isni/.

20. HathiTrust is a not-for-profit collaborative of academic and research libraries preserving more than 17 million digitized items. See HathiTrust Digital Library. "Welcome to HathiTrust." Accessed 19 September 2020. https://www.hathitrust.org/about.

21. GeoNames. "Browse the Names." Accessed 19 September 2020. https://www.geonames.org/.

22. Bryant, Rebecca, Annette Dortmund, and Constance Malpas. 2017. Convenience and Compliance: Case Studies on Persistent Identifiers in European Research Information. Dublin, OH: OCLC Research. https://doi.org/10.25333/C32K7M.
23. ISNI currently holds 11.02 million identities: 10.26 million individuals (of which 2.91 million are researchers) and 933,039 organizations. Statistics retrieved from ISNI. See ISNI. "Key Statistics." Accessed 5 May 2020. https://isni.org/.

24. Library of Congress. 2020. "NACO – Name Authority Cooperative Program." Documents and Updates. Program for Cooperative Cataloging (PCC). Accessed 19 September 2020. http://www.loc.gov/aba/pcc/naco/index.html.

25. Smith-Yoshimura, Karen. 2015. "Getting Identifiers Created for Legacy Names." Hanging Together: The OCLC Research Blog, 30 October 2015. https://hangingtogether.org/?p=5463.

26. Smith-Yoshimura, Karen. 2013. "Irreconcilable Differences? Name Authority Control & Humanities Scholarship." Hanging Together: The OCLC Research Blog, 27 March 2013. https://hangingtogether.org/?p=2621.

27. Smith-Yoshimura, Karen. 2017. "Use Cases for Local Identifiers." Hanging Together: The OCLC Research Blog, 5 May 2017. https://hangingtogether.org/?p=5938.

28. OCLC Research. 2020. "Registering Researchers in Authority Files." https://www.oclc.org/research/themes/research-collections/registering-researchers.html.

29. Smith-Yoshimura, Karen, Janifer Gatenby, Grace Agnew, Christopher Brown, Kate Byrne, Matt Carruthers, Peter Fletcher, Stephen Hearn, Xiaoli Li, Marina Muilwijk, Chew Chiat Naun, John Riemer, Roderick Sadler, Jing Wang, Glen Wiley, and Kayla Willey. 2016. Addressing the Challenges with Organizational Identifiers and ISNI. Dublin, OH: OCLC Research. https://doi.org/10.25333/C3FC9Q.

30. Research Organization Registry (ROR). "About." https://ror.org/about/.

31. Abazov, V. M., B. Abbott, B. S. Acharya, M. Adams, T. Adams, J. P. Agnew, G. D. Alexeev, et al. (2014) 2020. "Precision Measurement of the Top-Quark Mass in Lepton+jets Final States." (Archived 24 February 2020.) ArXiv.org: 1501.07912. https://arxiv.org/pdf/1405.1756.

32. Smith-Yoshimura, Karen. 2017. "How Much Metadata Is Practical?" Hanging Together: The OCLC Research Blog, 14 November 2017. https://hangingtogether.org/?p=6328.

33. University of Minnesota. 2020. "Experts@Minnesota." Find Profiles. https://experts.umn.edu/en/persons/ or University of Illinois at Urbana-Champaign. 2020. "Illinois Experts." Find U of I Research, View Scholarly Works, and Discover New Collaborators. https://experts.illinois.edu/.

34. The National Institutes of Health (NIH) National Institute of Allergy and Infectious Diseases (NIAID) as of 7 April 2020 mandates ORCID iDs for training, fellowship, education, or career development awards in FY20. See NIH: NIAID. 2019. "ORCID iD: Required for Some, Encouraged for All." NIAID Funding News. Last reviewed 7 August 2019. https://www.niaid.nih.gov/grants-contracts/orcid-id-required-some-encouraged-all; See also Lyrasis. 2020. "SciENcv and ORCID to Streamline NIH and NSF Grant Applications." LyrasisNow (blog), 8 April 2020.
https://lyrasisnow.org/sciencv-and-orcid-to-streamline-nih-and-nsf-grant-applications/.

35. Smith-Yoshimura, Karen. 2016. "Metadata Reconciliation." Hanging Together: The OCLC Research Blog, 28 September 2016. https://hangingtogether.org/?p=5710.

36. Carruthers, Matt. (2014) 2020. mcarruthers/LCNAF-Named-Entity-Reconciliation. GitHub repository. https://github.com/mcarruthers/LCNAF-Named-Entity-Reconciliation.

37. Deliot, Corine, Steven Folsom, Myung-Ja Han, Nancy Lorimer, Terry Reese, and Adam Schiff. 2019. Formulating and Obtaining URIs: A Guide to Commonly Used Vocabularies and Reference Sources. Library of Congress PCC Task Group on URIs in MARC. https://www.loc.gov/aba/pcc/bibframe/TaskGroups/formulate_obtain_URI_guide.pdf.

38. Smith-Yoshimura, Karen. 2019. "New Ways of Using and Enhancing Cataloging and Authority Records." Hanging Together: The OCLC Research Blog, 2 April 2019. https://hangingtogether.org/?p=7805.

39. Smith-Yoshimura, Karen. 2015. "Persistent Identifiers for Local Collections." Hanging Together: The OCLC Research Blog, 27 October 2015. https://hangingtogether.org/?p=5445.

40. DataCite. "Assign DOIs." https://datacite.org/dois.html; Wilkinson, Laura J. 2020. "Constructing Your DOIs." Crossref: The Crossref Curriculum. Last updated 8 April 2020. https://www.crossref.org/education/member-setup/constructing-your-dois/.

41. See DOI examples in detail from DOI. 2020. "DOI System Examples." Accessed 20 September 2020. https://www.doi.org/demos.html; and see ARK examples in detail from Dallas (Tex.) Police Department. 1963. "[Photographs of Identification Cards]." Collection. University of North Texas, The Portal to Texas History digital repository. https://texashistory.unt.edu/ark:/67531/metapth346793/.

42. "Identity management" here reflects its usage among metadata specialists. (See, for example, Library of Congress. 2018. "Charge for PCC Task Group on Identity Management in NACO," 5. Program for Cooperative Cataloging. Revised 22 May 2018. https://www.loc.gov/aba/pcc/taskgroup/PCC-TG-Identity-Management-in-NACO-rev2018-05-22.pdf.) But the term has other meanings depending on the audience; for example, identity access management, as described in Wikiwand. "Identity Management." https://www.wikiwand.com/en/Identity_management.

43. Smith-Yoshimura, Karen. 2018. "The Coverage of Identity Management Work." Hanging Together: The OCLC Research Blog, 8 October 2018. https://hangingtogether.org/?p=6805.

44. Smith-Yoshimura, Karen. 2017. "Beyond the Authorized Access Point?" Hanging Together: The OCLC Research Blog, 10 October 2017. https://hangingtogether.org/?p=6271.
45. Smith-Yoshimura, "Coverage of Identity Management." (See note 43.)

46. Watch the highly rated webinar by Andrew Lih and Robert Fernandez. 2018. "Works in Progress Webinar: Introduction to Wikidata for Librarians: Structuring Wikipedia and Beyond." Produced by OCLC Research, 12 June 2018. MP4 video presentation, 1:01:51. https://www.oclc.org/research/events/2018/06-12.html.

47. Smith-Yoshimura, Karen. 2020. "Experimentations with Wikidata/Wikibase." Hanging Together: The OCLC Research Blog, 18 June 2020. https://hangingtogether.org/?p=8002.

48. Wikimedia. "WikiCite." Home. https://meta.wikimedia.org/wiki/WikiCite.

49. Smith-Yoshimura, Karen. 2016. "Impact of Identifiers on Authority Workflows." Hanging Together: The OCLC Research Blog, 22 March 2016. https://hangingtogether.org/?p=5603.

50. Smith-Yoshimura, Karen. 2019. "Strategies for Alternate Subject Headings and Maintaining Subject Headings." Hanging Together: The OCLC Research Blog, 29 October 2019. https://hangingtogether.org/?p=7591.

51. OCLC. 2020. "FAST (Faceted Application of Subject Terminology)." https://www.oclc.org/en/fast.html.

52. Smith-Yoshimura, Karen. 2016. "Faceted Vocabularies." Hanging Together: The OCLC Research Blog, 31 October 2016. https://hangingtogether.org/?p=5739.

53. OCLC. 2020. "FAST." (See note 51.)

54. OCLC. 2020. "FAST (Faceted Application of Subject Terminology)." Heading #3, FAST Policy and Outreach (FPOC) Committee: https://www.oclc.org/en/fast.html.

55. Smith-Yoshimura, Karen. 2017. "Vocabulary Control Data in Discovery Environments." Hanging Together: The OCLC Research Blog, 5 October 2017. https://hangingtogether.org/?p=6264.

56. National Library, New Zealand Government. "Ngā Upoko Tukutuku / Māori Subject Headings." http://mshupoko.natlib.govt.nz/mshupoko/; AIATSIS Pathways: Gateway to the AIATSIS Thesauri. "Pathways." http://www1.aiatsis.gov.au/.

57. Deutsche Nationalbibliothek. 2019. "MACS - Multilingual Access to Subjects." (Archived 13 January 2019.) https://web.archive.org/web/20190113003823/https:/www.dnb.de/EN/Wir/Kooperation/MACS/macs_node.html.

58. Smith-Yoshimura, Karen. 2019. "Knowledge Organization Systems." Hanging Together: The OCLC Research Blog, 17 March 2019. https://hangingtogether.org/?p=7135.

59. Synaptica. "Ontology Management – Graphite." https://www.synaptica.com/graphite/.

60. Smith-Yoshimura, Karen. 2018. "Are Distributed Models for Vocabulary Maintenance Viable?" Hanging Together: The OCLC Research Blog, 12 April 2018. https://hangingtogether.org/?p=6672.

61. OCLC Research. 2020. "Equity, Diversity, and Inclusion in the OCLC Research Library Partnership Survey." Overview. Accessed 20 September 2020. https://www.oclc.org/research/areas/community-catalysts/rlp-edi.html.
“Creating Metadata for Equity, Diversity, and Inclusion.” Hanging Together: The OCLC Research Blog, 7 November 2018. https://hangingtogether.org/?p=6833. 63. Smith-Yoshimura. “Distributed Models.” (See note 60.) 64. Smith-Yoshimura, Karen. 2019. “Strategies for Alternate Subject Headings and Maintaining Subject Headings.” Hanging Together: The OCLC Research Blog, 29 October 2019. https://hangingtogether.org/?p=7591. 65. Baxmeyer, Jennifer, Karen Coyle, Joanna Dyla, MJ Han, Steven Folsom, Phil Schreur, and Tim Thompson. 2017. Linked Data Infrastructure Models: Areas of Focus for PCC Strategies. Library of Congress PCC Linked Data Advisory Committee. https://www.loc.gov/aba/pcc /documents/LinkedDataInfrastructureModels.pdf. 66. Bone, Christine, Sharon Farnel, Sheila Laroque, and Brett Lougheed. 2017. “Works in Progress Webinar: Decolonizing Descriptions: Finding, Naming and Changing the Relationship between Indigenous People, Libraries and Archives “ Produced by OCLC Research, 19 October 2017. MP4 video presentation, 54:35.00. https://www.oclc.org/research/events/2017/10-19.html. 67. Smith-Yoshimura, Karen. 2015. “Shift to Linked Data for Production.” Hanging Together: The OCLC Research Blog, 13 May 2015. https://hangingtogether.org/?p=5195. 68. Smith-Yoshimura, Karen. 2015. “Working in Shared Files.” Hanging Together: The OCLC Research Blog, 7 April 2015. https://hangingtogether.org/?p=5091. 69. Bruce Washburn and Jeff Mixter, 2018. “Works in Progress Webinar: Looking Inside the Library Knowledge Vault.” Produced by OCLC Research, 12 August 2018. MP4 video presentation, 57:45:00. https://www.oclc.org/research/events/2015/08-12.html. http://mshupoko.natlib.govt.nz/mshupoko/ http://www1.aiatsis.gov.au/ https://web.archive.org/web/20190113003823/https:/www.dnb.de/EN/Wir/Kooperation/MACS/macs_node.html https://web.archive.org/web/20190113003823/https:/www.dnb.de/EN/Wir/Kooperation/MACS/macs_node.html https://hangingtogether.org/?p=7135 https://www.synaptica.com/graphite/ https://hangingtogether.org/?p=6672 https://www.oclc.org/research/areas/community-catalysts/rlp-edi.html https://www.oclc.org/research/areas/community-catalysts/rlp-edi.html https://hangingtogether.org/?p=6833 https://hangingtogether.org/?p=7591 https://www.loc.gov/aba/pcc/documents/LinkedDataInfrastructureModels.pdf https://www.loc.gov/aba/pcc/documents/LinkedDataInfrastructureModels.pdf https://www.oclc.org/research/events/2017/10-19.html https://hangingtogether.org/?p=5195 https://hangingtogether.org/?p=5091 https://www.oclc.org/research/events/2015/08-12.html Transitioning to the Next Generation of Metadata 41 70. Smith-Yoshimura, Karen. 2019. Systematic Reviews of Our Metadata, Hanging Together: The OCLC Research Blog, 10 April 2019. https://hangingtogether.org/?p=7117. 71. Smith-Yoshimura, Karen. 2015. “Working in Shared File. ”Hanging Together: The OCLC Research Blog, 7 April 2015. https://hangingtogether.org/?p=5091. 72. Jisc Library Services. n.d. “What Is ‘Plan M’?” Accessed 21 September 2020. https://libraryservices.jiscinvolve.org/wp/2019/12/plan-m/; Smith-Yoshimura, Karen. 2020. “Knowledge Management and Metadata.” Hanging Together: The OCLC Research Blog, 9 April 2020. https://hangingtogether.org/?p=7845; For more information about the current phase of “Plan M” (May–November 2020), see Grindley, Neil. “Moving Plan M Forwards – We Need Your Help!” Library Services (PlanM) (blog), Jisc, 6 May 2020. https://libraryservices.jiscinvolve.org/wp/2020/05/planm_nextphase/. 73. Grindley, Neil. 2019. 
“Plan M: Definition, Principles and Direction.” Jisc. (Word docx.) http://libraryservices.jiscinvolve.org/wp/files/2019/12/Plan-M-Definition-and-Direction-1.docx. 74. Dempsey, Lorcan. 2016. “Library Collections in the Life of the User: Two Directions.” LIBER Quarterly 26(4): 338–359. http://doi.org/10.18352/lq.10170. 75. Smith-Yoshimura, Karen. 2019. “Presenting Metadata from Different Sources in Discovery Layers. Hanging Together: The OCLC Research Blog, 16 April 2019. https://hangingtogether.org/?p=7880. 76. Smith-Yoshimura, Karen. 2017. “Metadata for Archival Collections.” Hanging Together: The OCLC Research Blog, 30 May 2017. https://hangingtogether.org/?p=5903. 77. Godby, Jean, Karen Smith-Yoshimura, Bruce Washburn, Kalan Knudson Davis, Karen Detling, Christine Fernsebner Eslao, Steven Folsom, Xiaoli Li, Marc McGee, Karen Miller, Honor Moody, Craig Thomas, and Holly Tomren. 2019. Creating Library Linked Data with Wikibase: Lessons Learned from Project Passage, 49-51. Dublin, OH: OCLC Research. https://doi.org/10.25333/faq3-ax08. 78. The OCLC Research Library Partnership Archives and Special Collections Linked Data Review Group is described at https://www.oclc.org/research/partnership/working-groups/archives -special-collections-linked-data-review.html. 79. Smith-Yoshimura, Karen. 2020. “Metadata Management in Times of Uncertainty.” Hanging Together: The OCLC Research Blog, 15 June 2020. https://hangingtogether.org/?p=7998. 80. Smith-Yoshimura, Karen. 2016. “Metadata for Archived Websites.” Hanging Together: The OCLC Research Blog, 14 March 2016. https://hangingtogether.org/?p=5591. 81. Archive-It. 2008. “Human Rights.” Columbia University Libraries Collection. (Archived May 2008). https://archive-it.org/collections/1068; Archive-It. 2010. “New York City Places and Spaces.” Columbia University Libraries Collection. (Archived January 2010). https://archive-it.org/collections/1757; https://hangingtogether.org/?p=7117 https://hangingtogether.org/?p=5091 https://libraryservices.jiscinvolve.org/wp/2019/12/plan-m/ https://hangingtogether.org/?p=7845 https://libraryservices.jiscinvolve.org/wp/2020/05/planm_nextphase/ http://libraryservices.jiscinvolve.org/wp/files/2019/12/Plan-M-Definition-and-Direction-1.docx http://doi.org/10.18352/lq.10170 http://Presenting metadata from different sources in discovery layers http://Presenting metadata from different sources in discovery layers https://hangingtogether.org/?p=7880 https://hangingtogether.org/?p=5903 https://doi.org/10.25333/faq3-ax08 https://www.oclc.org/research/partnership/working-groups/archives-special-collections-linked-data-review.html https://www.oclc.org/research/partnership/working-groups/archives-special-collections-linked-data-review.html https://hangingtogether.org/?p=7998 https://hangingtogether.org/?p=5591 https://archive-it.org/collections/1068 https://archive-it.org/collections/1757 42 Transitioning to the Next Generation of Metadata Archive-It. 2010. “Burke Library New York City Religions.” Columbia University Libraries Collection. (Archived May 2010). https://archive-it.org/collections/1945. 82. NLA. “Trove.” Archived Websites. Sub Collections. Accessed 20 September 2020. https://trove.nla.gov.au/website. 83. Archive-It. 2014. “Collaborative Architecture, Urbanism, and Sustainability Web Archive (CAUSEWAY).” Ivy Plus Libraries Confederation Collection. (Archived June 2014.) https://archive-it.org/collections/4638; Archive-It. 2013. “Contemporary Composers Web Archive (CCWA).” Ivy Plus Libraries Confederation Collection. 
83. Archive-It. 2014. "Collaborative Architecture, Urbanism, and Sustainability Web Archive (CAUSEWAY)." Ivy Plus Libraries Confederation Collection. (Archived June 2014.) https://archive-it.org/collections/4638; Archive-It. 2013. "Contemporary Composers Web Archive (CCWA)." Ivy Plus Libraries Confederation Collection. (Archived October 2013.) https://archive-it.org/collections/4019; NYARC: New York Art Resources Consortium. "Web Archiving." http://www.nyarc.org/content/web-archiving.

84. OCLC Research. 2020. "Web Archiving Metadata Working Group." The Problem, Addressing the Problem, Outputs. https://www.oclc.org/research/themes/research-collections/wam.html.

85. Dooley, Jackie, and Kate Bowers. 2018. Descriptive Metadata for Web Archiving: Recommendations of the OCLC Research Library Partnership Web Archiving Metadata Working Group. Dublin, OH: OCLC Research. https://doi.org/10.25333/C3005C.

86. Smith-Yoshimura, Karen. 2018. "Metadata for Audio and Videos." Hanging Together: The OCLC Research Blog, 29 October 2018. https://hangingtogether.org/?p=6814.

87. Weber, Chela Scott. 2017. Research and Learning Agenda for Archives, Special, and Distinctive Collections in Research Libraries. Dublin, OH: OCLC Research. https://doi.org/10.25333/C3C34F.

88. Library of Congress. "Standards." Encoded Archival Description (EAD) Official Site. Accessed 21 September 2020. https://www.loc.gov/ead/.

89. Library of Congress. "Standards." Preservation Metadata Maintenance Activity (PREMIS). Accessed 21 September 2020. https://www.loc.gov/standards/premis/.

90. Weber, Chela Scott. 2019. "Assessing Needs of AV in Special Collections." Hanging Together: The OCLC Research Blog, 23 July 2019. https://hangingtogether.org/?p=7405; Weber, Chela Scott. 2019. "Scale & Risk: Discussing Challenges to Managing A/V Collections in the RLP." Hanging Together: The OCLC Research Blog, 1 October 2019. https://hangingtogether.org/?p=7479.

91. Smith-Yoshimura, Karen. 2015. "Managing Metadata for Image Collections." Hanging Together: The OCLC Research Blog, 9 April 2015. https://hangingtogether.org/?p=5130.

92. Library of Congress. "Standards." Metadata Object Description Schema (MODS). Accessed 21 September 2020. http://www.loc.gov/standards/mods/.

93. Ibid.

94. Library of Congress. "Standards." Metadata Authority Description Schema (MADS). Accessed 21 September 2020. http://www.loc.gov/standards/mads/.

95. Smith-Yoshimura, Karen. 2016. "Sharing Digital Collections Workflows." Hanging Together: The OCLC Research Blog, 2 November 2016. https://hangingtogether.org/?p=5744.

96. OCLC Research. 2020. "Europeana Innovation Pilots." Accessed 20 September 2020. http://www.oclc.org/research/themes/data-science/europeana.html?urlm=168921.

97. IIIF (International Image Interoperability Framework): Enabling Richer Access to the World's Images. "Home." Accessed 20 September 2020. https://iiif.io/.

98. OCLC Research. 2020. "OCLC ResearchWorks IIIF Explorer." https://www.oclc.org/research/themes/data-science/iiif/iiifexplorer.html.
99. OCLC Research. 2020. "CONTENTdm Linked Data Pilot." Introduction. https://www.oclc.org/research/themes/data-science/linkeddata/contentdm-linked-data-pilot.html.

100. Smith-Yoshimura, Karen. 2015. "Data Management and Curation in 21st Century Archives – Part 1." Hanging Together: The OCLC Research Blog, 21 September 2015. http://hangingtogether.org/?p=5375.

101. Smith-Yoshimura, Karen. 2016. "Metadata for Research Data Management." Hanging Together: The OCLC Research Blog, 18 April 2016. https://hangingtogether.org/?p=5616.

102. Erway, Ricky, Laurence Horton, Amy Nurnberger, Reid Otsuji, and Amy Rushing. 2015. Building Blocks: Laying the Foundation for a Research Data Management Program, 8. Dublin, OH: OCLC Research. https://doi.org/10.25333/C39P86.

103. See the OCLC Research Data Management Planning Guide at https://www.oclc.org/research/areas/research-collections/rdm/guide.html.

104. Smith-Yoshimura, Karen. 2020. "Knowledge Management and Metadata." Hanging Together: The OCLC Research Blog, 9 April 2020. https://hangingtogether.org/?p=7845.

105. Faniel, Ixchel M. 2019. "Let's Cook Up Some Metadata Consistency." Next (blog), OCLC, 21 November 2019. http://www.oclc.org/blog/main/lets-cook-up-some-metadata-consistency/.

106. NCI (National Computational Infrastructure): Australia. "Home." Accessed 21 September 2020. http://nci.org.au/; ADA (Australian Data Archive). "Home." Accessed 21 September 2020. https://www.ada.edu.au/.

107. Portage Network. "Home." Accessed 21 September 2020. https://portagenetwork.ca/.

108. Metadata 2020 is a "collaboration advocating richer, connected, reusable, open metadata for all research outputs" (http://www.metadata2020.org/). The Metadata 2020 Researcher Communications project is outlined here: http://www.metadata2020.org/projects/researcher-communications/.

109. Digital Curation Centre. "Disciplinary Metadata." List of Metadata Standards. Accessed 21 September 2020. http://www.dcc.ac.uk/resources/metadata-standards/list.

110. RDA Metadata Directory. "Metadata Standards Directory Working Group." GitHub repository. Accessed 21 September 2020. http://rd-alliance.github.io/metadata-directory/.
“CRediT – Contributor Roles Taxonomy.” Accessed 21 September 2020. https://casrai.org/credit/. 112. University of Michigan Library. 2020. “Data Services.” http://www.lib.umich.edu/research -data-services. 113. OCLC Research. 2020. “The Realities of Research Data Management.” Overview. https://www.oclc.org/research/publications/2017/oclcresearch-research-data -management.html. 114. Bryant, Rebecca, Brian Lavoie, and Constance Malpas. 2017. Scoping the University RDM Service Bundle. The Realities of Research Data Management, Part 2, pp. 16, 21. Dublin, OH: OCLC Research. https://doi.org/10.25333/C3Z039. 115. Indiana University. 2018. “IU will Lead $2 Million Partnership to Expand Access to Research Data: IU Libraries and IU Network Science Institute Are Leading a Public-Private Partnership to Create the Shared BigData Gateway for Research Libraries” News at UI, (Science and Technology.) Indiana University, 18 October 2018. https://news.iu.edu/stories/2018/10/iu /releases/18-shared-bigdata-gateway-for-research-networks.html; Microsoft. 2020. “Microsoft Academic Graph.” Established 5 June 2015. https://www.microsoft.com/en-us/research/project/microsoft-academic-graph/; For more details, watch the August 2019 recording of “Democratizing Access to Large Datasets through Shared Infrastructure.” See Wittenberg, Jamie, and Valentin Pentchev. “Works in Progress Webinar: Democratizing Access to Large Datasets through Shared Infrastructure.” Produced by OCLC Research, 8 August 2019. MP4 video presentation, 58:34:00. https://www.oclc.org/research/events/2019/080819-democratizing-access-large -datasets-shared-infrastructure.html. 116. NISO’s Reproducibility Badging and Definitions now out for public comment may also help researchers extend the benefit of their research to others. See “Taxonomy, Definitions, and Recognition Badging Scheme Working Group | NISO Website.” n.d. Accessed 22 September 2020. https://www.niso.org/standards-committees/reproducibility-badging. 117. Smith-Yoshimura, Karen. 2015. “Services Built on Usage Metrics.” Hanging Together: The OCLC Research Blog, 30 September 2015. https://hangingtogether.org/?p=5430. 118. Krista M. Soria, Jan Fransen, Shane Nackerud. 2014. “Stacks, Serials, Search Engines, and Students’ Success: First-Year Undergraduate Students’ Library Use, Academic Achievement, and Retention.” Journal of Academic Librarianship 40: 84-91. https://doi.org/10.1016/j.acalib.2013.12.002. 
http://www.dcc.ac.uk/resources/metadata-standards/list http://rd-alliance.github.io/metadata-directory/ https://casrai.org/credit/ http://www.lib.umich.edu/research-data-services http://www.lib.umich.edu/research-data-services https://www.oclc.org/research/publications/2017/oclcresearch-research-data-management.html https://www.oclc.org/research/publications/2017/oclcresearch-research-data-management.html https://doi.org/10.25333/C3Z039 https://news.iu.edu/stories/2018/10/iu/releases/18-shared-bigdata-gateway-for-research-networks.html https://news.iu.edu/stories/2018/10/iu/releases/18-shared-bigdata-gateway-for-research-networks.html https://www.microsoft.com/en-us/research/project/microsoft-academic-graph/ https://www.oclc.org/research/events/2019/080819-democratizing-access-large-datasets-shared-infrastructure.html https://www.oclc.org/research/events/2019/080819-democratizing-access-large-datasets-shared-infrastructure.html https://www.niso.org/standards-committees/reproducibility-badging https://hangingtogether.org/?p=5430 https://doi.org/10.1016/j.acalib.2013.12.002 Transitioning to the Next Generation of Metadata 45 119. See Jisc. “Library Impact Data Project (LIDP).” Accessed 21 September 2020. http://www.activitydata.org/LIDP.html. 120. Smith-Yoshimura, Karen. 2019. “Alternatives to Statistics for Measuring Success and Value of Cataloging.” Hanging Together: The OCLC Research Blog, 15 April 2019. https://hangingtogether.org/?p=7122. 121. DLF (Digital Library Federation). 2015. “Metadata Librarian, Cornell University Library.” DLF (blog), 11 June 2015. https://www.diglib.org/metadata-librarian-cornell-university-library/. 122. Salary.com. (2019) 2020. “Metadata Librarian.” Posted by Georgia Tech University 13 November 2019. (Archived 2 September 2020) https://web.archive.org/web/20200903061830/https://www.salary.com/job/gt-library /metadata-librarian/e5644ece-c847-4cfb-994f-c4c80fa81e3d. 123. OCLC. 2020. “Locate Items in the Library with StackMap.” https://help.oclc.org/Discovery_and _Reference/WorldCat_Discovery/Search_results/Locate_items_in_the_library_with_StackMap. 124. Yewno: Transforming Information into Knowledge. 2020. “Home.” https://www.yewno.com/. 125. Smith-Yoshimura, Karen. 2019. “New Ways of Using and Enhancing Cataloging and Authority Records” Hanging Together: The OCLC Research Blog, 2 April 2019. https://hangingtogether.org/?p=7805. 126. National Library of Australia (NLA). “Austlang National Codeathon.” Accessed 21 September 2020. https://www.nla.gov.au/our-collections/processing-and-describing-the-collections /Austlang-national-codeathon [Map of Australia. 2020 HERE, Bing, Microsoft Corporation]; NLA. “Trove.” Search. Uncover. Australia. Accessed 21 September 2020. https://trove.nla.gov.au/. 127. The Graphic History Company – Hachette UK. “River of Authors.” Accessed 21 September 2020. http://theghc.co/project.php?project=hachette-uk-a-river-of-authors. 128. Smith-Yoshimura, Karen. 2019. “Knowledge Organization Systems.” Hanging Together: The OCLC Research Blog, 17 April 2019. https://hangingtogether.org/?p=7135. 129. Zeng, Marcia Lei, and Philipp Mayr. 2019. “Knowledge Organization Systems (KOS) in the Semantic Web: A Multi-dimensional Review.” International Journal on Digital Libraries 20: 209- 230. https://doi.org/10.1007/s00799-018-0241-2. 130. SNAC (Social Networks and Archival Context). “About SNAC.” What is SNAC? https://portal.snaccooperative.org/about. 131. Smith-Yoshimura, Karen. 2020. 
“Knowledge Management and Metadata.” Hanging Together: The OCLC Research Blog, 9 April 2020. https://hangingtogether.org/?p=7845. 132. AI4LAM (Artificial Intelligence for Libraries, Archives & Museums). Updated 18 May 2020 https://sites.google.com/view/ai4lam/home. http://www.activitydata.org/LIDP.html https://hangingtogether.org/?p=7122 https://www.diglib.org/metadata-librarian-cornell-university-library/ https://web.archive.org/web/20200903061830/https://www.salary.com/job/gt-library/metadata-librarian https://web.archive.org/web/20200903061830/https://www.salary.com/job/gt-library/metadata-librarian https://help.oclc.org/Discovery_and_Reference/WorldCat_Discovery/Search_results/Locate_items_in_the_library_with_StackMap https://help.oclc.org/Discovery_and_Reference/WorldCat_Discovery/Search_results/Locate_items_in_the_library_with_StackMap https://www.yewno.com/ https://hangingtogether.org/?p=7805 https://www.nla.gov.au/our-collections/processing-and-describing-the-collections/Austlang-national-codeathon https://www.nla.gov.au/our-collections/processing-and-describing-the-collections/Austlang-national-codeathon https://trove.nla.gov.au/ http://theghc.co/project.php?project=hachette-uk-a-river-of-authors https://hangingtogether.org/?p=7135 https://doi.org/10.1007/s00799-018-0241-2 https://portal.snaccooperative.org/about https://hangingtogether.org/?p=7845 https://sites.google.com/view/ai4lam/home 46 Transitioning to the Next Generation of Metadata 133. AI4LAM’s mission is to organize, share, and elevate knowledge about and use of artificial intelligence by libraries, archives, and museums. It was founded in 2018, inspired by the success of the International Image Interoperability Framework (IIIF) in coordinating large scale collaboration on interoperable technology to advance LAMs. See AI4LAM. “About.” Our Mission. https://sites.google.com/view/ai4lam/about. 134. Padilla, Thomas. 2019. Responsible Operations: Data Science, Machine Learning, and AI in Libraries. Dublin, OH: OCLC Research. https://doi.org/10.25333/xk7z-9g97. 135. Ibid, 17-19. 136. Smith-Yoshimura, Karen. 2019. “Alternatives to Statistics for Measuring Success and Value of Cataloging.” Hanging Together: The OCLC Research Blog, 15 April 2019. https://hangingtogether.org/?p=7122. 137. Smith-Yoshimura, Karen. 2017. “New Skill Sets for Metadata Management.” Hanging Together: The OCLC Research blog, 17 April 2017. https://hangingtogether.org/?p=5929. 138. Smith-Yoshimura, Karen. 2018. “MarcEdit and Other Tools for Batch Processing and Metadata Reconciliation.” Hanging Together: The OCLC Research Blog, 26 March 2018. https://hangingtogether.org/?p=6646. 139. Reese, Terry. 2018 “MarcEdit 2017 Usage Information.“ Terry’s Worklog (blog), 9 September 2020. http://blog.reeset.net/archives/2572. 140. Reese, Terry. 2020. “Working with Linked Data In MarcEdit.” MarcEdit Development (blog). Accessed 21 September 2020. https://marcedit.reeset.net/working-with-linked-data-in- marcedit. 141. Reese, Terry. 2018. “MarcEdit Playlist.” 139 YouTube videos. Last updated 26 December 2018. https://www.youtube.com/playlist?list=PLrHRsJ91nVFScJLS91SWR5awtFfpewMWg. 142. Smith-Yoshimura, Karen. 2017. “New Skill Sets for Metadata Management.” Hanging Together: The OCLC Research blog, 17 April 2017. https://hangingtogether.org/?p=5929. 143. “XML and RDF-Based Systems Archives.” n.d. Library Juice Academy (blog). Accessed 22 September 2020. https://libraryjuiceacademy.com/certificate/xml-and-rdf-based-systems/; Reese, Terry. 2013. 
“Tutorials.” YouTube (selected). MarcEdit Development (blog). 14 March 2013. http://marcedit.reeset.net/tutorials; “Lynda: Online Courses, Classes, Training, Tutorials.” n.d. Lynda.com - from LinkedIn Learning. Accessed 22 September 2020. https://www.lynda.com/; “Learn to Code - for Free.” n.d. Codecademy. Accessed 22 September 2020. https://www.codecademy.com/; https://sites.google.com/view/ai4lam/about https://doi.org/10.25333/xk7z-9g97 https://hangingtogether.org/?p=7122 https://hangingtogether.org/?p=5929 https://hangingtogether.org/?p=6646 http://blog.reeset.net/archives/2572 https://marcedit.reeset.net/working-with-linked-data-in-marcedit https://marcedit.reeset.net/working-with-linked-data-in-marcedit https://www.youtube.com/playlist?list=PLrHRsJ91nVFScJLS91SWR5awtFfpewMWg https://hangingtogether.org/?p=5929 https://libraryjuiceacademy.com/certificate/xml-and-rdf-based-systems/ http://marcedit.reeset.net/tutorials https://www.lynda.com/ https://www.codecademy.com/ Transitioning to the Next Generation of Metadata 47 Software Carpentry. “Teaching Basic Lab Skills for Research Computing.” Upcoming Workshops. Accessed 22 September 2020. https://software-carpentry.org/. 144. “Data on the Web Best Practices.” n.d. Accessed 22 September 2020. https://www.w3.org/TR/dwbp/; Semantic Web for the Working Ontologist. (2008) 2020. http://workingontologist.org/. 145. Library Workflow Exchange. n.d. “About.” Accessed 21 September 2020. http://www.libraryworkflowexchange.org/about/. 146. OCLC Developer Network. 2020. “DevConnect Webinars. https://www.oclc.org/developer/ events/devconnect-workshops.en.html. 147. Smith-Yoshimura, Karen. 2019. “Stewardship of Professional FTEs In Metadata Work and Turnover.” Hanging Together: The OCLC Research Blog, 18 October 2019. https://hangingtogether.org/?p=7580. 148. OCLC. 2020. “WorldCat®: OCLC and Linked Data.” Shared Entity Management Infrastructure. https://www.oclc.org/en/worldcat/linked-data/shared-entity-management-infrastructure.html. https://software-carpentry.org/ https://www.w3.org/TR/dwbp/ http://workingontologist.org/ http://www.libraryworkflowexchange.org/about/ https://www.oclc.org/developer/events/devconnect-workshops.en.html https://www.oclc.org/developer/events/devconnect-workshops.en.html https://hangingtogether.org/?p=7580 https://www.oclc.org/en/worldcat/linked-data/shared-entity-management-infrastructure.html For more information about our work related to digitizing library collections, please visit: oc.lc/digitizing 6565 Kilgour Place Dublin, Ohio 43017-3395 T: 1-800-848-5878 T: +1-614-764-6000 F: +1-614-764-6096 www.oclc.org/research ISBN: 978-1-55653-167-5 DOI: 10.25333/rqgd-b343 RM-PR-216787-WWAE 2009 O C L C R E S E A R C H R E P O R T http://oc.lc/digitizing Executive Summary Introduction The Transition to Linked Data and Identifiers Expanding the use of persistent identifiers Moving from “authority control” to “identity management” Addressing the need for multiple vocabularies and equity, diversity, and inclusion Linked data challenges Describing “Inside-Out” and “Facilitated” Collections Archival collections Archived websites Audio and video collections Image collections Research data Evolution of “Metadata as a Service” Metrics Consultancy New applications Bibliometrics Semantic indexing Preparing for Future Staffing Requirements The culture shift Learning opportunities New tools and skills Self-education Addressing staff turnover Impact Acknowledgments Appendix Notes FIGURE 1. 
orr-bootleg-2020 ---- Bootleg: Chasing the Tail with Self-Supervised Named Entity Disambiguation

Laurel Orr†, Megan Leszczynski†, Simran Arora†, Sen Wu†, Neel Guha†, Xiao Ling‡, and Christopher Ré†
†Stanford University ‡Apple
{lorr1,mleszczy,simran,senwu,nguha,chrismre}@cs.stanford.edu, xiaoling@apple.com

arXiv:2010.10363v3 [cs.CL] 23 Oct 2020

Abstract

A challenge for named entity disambiguation (NED), the task of mapping textual mentions to entities in a knowledge base, is how to disambiguate entities that appear rarely in the training data, termed tail entities. Humans use subtle reasoning patterns based on knowledge of entity facts, relations, and types to disambiguate unfamiliar entities. Inspired by these patterns, we introduce Bootleg, a self-supervised NED system that is explicitly grounded in reasoning patterns for disambiguation. We define core reasoning patterns for disambiguation, create a learning procedure to encourage the self-supervised model to learn the patterns, and show how to use weak supervision to enhance the signals in the training data. Encoding the reasoning patterns in a simple Transformer architecture, Bootleg meets or exceeds state-of-the-art on three NED benchmarks. We further show that the learned representations from Bootleg successfully transfer to other non-disambiguation tasks that require entity-based knowledge: we set a new state-of-the-art in the popular TACRED relation extraction task by 1.0 F1 points and demonstrate up to 8% performance lift in highly optimized production search and assistant tasks at a major technology company.

1 Introduction

Knowledge-aware deep learning models have recently led to significant progress in fields ranging from natural language understanding [38, 41] to computer vision [56]. Incorporating explicit knowledge allows models to better recall factual information about specific entities [38]. Despite these successes, a persistent challenge that recent works continue to identify is how to leverage knowledge for low-resource regimes, such as tail examples that appear rarely (if at all) in the training data [16]. In this work, we study knowledge incorporation in the context of named entity disambiguation (NED) to better disambiguate the long tail of entities that occur infrequently during training.1

1 In this work, we define tail entities as those occurring 10 or fewer times in the training data.

Humans disambiguate by leveraging subtle reasoning over entity-based knowledge to map strings to entities in a knowledge base. For example, in the sentence "Where is Lincoln in Logan County?", resolving the mention "Lincoln" to "Lincoln, IL" requires reasoning about relations because "Lincoln, IL"—not "Lincoln, NE" or "Abraham Lincoln"—is the capital of Logan County.
Previous NED systems disambiguate by memorizing co-occurrences between entities and textual context in a self-supervised manner [16, 51]. The self-supervision is critical to building a model that is easy to maintain and does not require expensive hand-curated features. However, these approaches struggle to handle tail entities: a baseline SotA model from [16] achieves less than 28 F1 points over the tail, compared to 86 F1 points over all entities. Despite their rarity in training data, many real-world entities are tail entities: 89% of entities in the Wikidata knowledge base do not have Wikipedia pages to serve as a source of textual training data. However, to achieve 60 F1 points on disambiguation, we find that the prior SotA baseline model should see an entity on the order of 100 times during training (Figure 1 (right)). This presents a scalability challenge as there are 15x more entities in Wikidata than in Wikipedia, the majority of which are tail entities. For the model to observe each of these tail entities 100x, the training data would need to be scaled to 1,500x the size of Wikipedia.

Figure 1: (Left) shows the four reasoning patterns for disambiguation. The correct entity is bolded. (Right) shows F1 versus number of times an entity was seen in training data for a baseline NED model compared to Bootleg across the head, torso, tail, and unseen.

Prior approaches struggle with the tail, yet industry applications such as search and voice assistants are known to be tail-heavy [4, 20]. Given the requirement for high quality tail disambiguation, major technology companies continue to press on this challenge [29, 39]. Instead of scaling the training data until co-occurrences between tail entities and text can be memorized, we define a principled set of reasoning patterns for entity disambiguation across the head and tail. When humans disambiguate entities, they leverage signals from context as well as from entity relations and types. For example, resolving "Lincoln" in the text "How tall is Lincoln?" to "Abraham Lincoln" requires reasoning that people, not locations or car companies, have heights—a type affordance pattern. These core patterns apply to both head and tail examples with high coverage and involve reasoning over entity facts, relations, and types, information which is available for both head and tail in structured data sources.2
Thus, we hypothesize that these patterns assembled from the structured resources can be learned over training data and generalize to the tail.

In this work, we introduce Bootleg, an open-source, self-supervised NED system designed to succeed on head and tail entities.3 Bootleg encodes the entity, relation, and type signals as embedding inputs to a simple stacked Transformer architecture. The key challenges we face are understanding how to use knowledge for NED, designing a model that learns those patterns, and fully extracting the useful knowledge signals from the training data:

• Tail Reasoning: Humans use subtle reasoning patterns to disambiguate different entities, especially unfamiliar tail entities. The first challenge is characterizing these reasoning patterns and understanding their coverage over the tail.
• Poor Tail Generalization: We find that a model trained using standard regularization and a combination of entity, type, and relation information performs 10 F1 points worse on disambiguating unseen entities compared to the two models which respectively use only type and only relation information. We find this performance drop is due to the model's over-reliance on discriminative textual and entity features compared to more general type and relation features.
• Underutilized Data: Self-supervised models improve with more training data [7]. However, only a limited portion of the standard NED training dataset, Wikipedia, is useful: Wikipedia lacks labels [19] and we find that an estimated 68% of entities in the dataset are not labeled.4

2 We find that type affordance patterns apply to over 84% of all examples, including tail examples, while KG relation patterns apply to over 27% of all examples and type consistency applies to over 8% of all examples. In Wikidata, 75% of entities that are not in Wikipedia have type or knowledge graph connectivity signals, and among tail entities, 88% are in non-tail type categories and 90% are in non-tail relation categories.
3 Bootleg is open-source at http://hazyresearch.stanford.edu/bootleg.

Bootleg addresses these challenges through three contributions:

• Reasoning Patterns for Disambiguation: We contribute a principled set of core disambiguation patterns for NED (Figure 1 (left))—entity memorization, type consistency, KG relation, and type affordance—and show that on slices of Wikipedia examples exemplifying each pattern, Bootleg provides a lift over the baseline SotA model on tail examples by 18 F1, 56 F1, 62 F1, and 45 F1 points respectively. Overall, using these patterns, Bootleg meets or exceeds state-of-the-art performance on three NED benchmarks and outperforms the prior SotA by more than 40 F1 points on the tail of Wikipedia.
• Generalizing Learning to the Tail: Our key insight is that there are distinct entity-, type-, and relation-tails. Over tail entities (based on entity count in the training data), 88% have non-tail types and 90% have non-tail relations. The model should balance these signals differently depending on the particular entity being disambiguated. We thus contribute a new 2D regularization scheme to combine the entity, type, and relation signals and achieve a lift of 13.6 F1 points on unseen entities compared to the model using standard regularization techniques. We conduct extensive ablation studies to verify the effectiveness of our approach.
• Weak Labelling of Data: Our insight is that because Wikipedia is highly structured—most sentences on an entity's Wikipedia page refer to that entity via pronouns or alternative names—we can weakly label our training data to label mentions. Through weak labeling, we increase the number of labeled mentions in the training data by 1.7x, and find this provides a 2.6 F1 point lift on unseen entities.

With these three contributions, Bootleg achieves SotA on three NED benchmarks. We further show that embeddings from Bootleg are useful for downstream applications that require the knowledge of entities. We show the reasoning patterns learned in Bootleg transfer to tasks beyond NED by extracting Bootleg's learned embeddings and using them to set a new SotA by 1.0 F1 points on the TACRED relation extraction task [2, 53], where the prior SotA model also uses entity-based knowledge [38]. Bootleg representations further provide an 8% performance lift on highly optimized industrial search and assistant tasks at a major technology company. For Bootleg's embeddings to be viable for production, it is critical that these models are space-efficient: the models using only Bootleg relation and type embeddings each achieve 3.3x the performance of the prior SotA baseline over unseen entities using 1% of the space.

2 NED Overview and Reasoning Patterns

We now define the task of named entity disambiguation (NED), the four core reasoning patterns, and the structural resources required for learning the patterns.

Task Definition: Given a knowledge base of entities E and an input sentence, the goal of named entity disambiguation is to determine the entities e ∈ E referenced in the sentence. Specifically, the input is a sequence of N tokens W = {w_1, ..., w_N} and a set of M non-overlapping spans in the sequence W, termed mentions, to be disambiguated: M = {m_1, ..., m_M}. The output is the most likely entity for each mention.
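To make this interface concrete, the following is a minimal sketch of the task's input and output structure with a popularity-prior baseline; all names here are illustrative, not Bootleg's actual API:

from dataclasses import dataclass
from typing import Dict, List, Tuple

@dataclass
class NEDExample:
    tokens: List[str]                  # W = [w_1, ..., w_N]
    mentions: List[Tuple[int, int]]    # M non-overlapping [start, end) token spans

def disambiguate(example: NEDExample,
                 candidates: List[List[str]],
                 popularity: Dict[str, int]) -> List[str]:
    """Popularity-prior baseline: pick the most popular candidate entity
    for each mention. A learned system like Bootleg replaces this rule."""
    return [max(cands, key=lambda e: popularity.get(e, 0)) for cands in candidates]

Even this trivial baseline does well on head entities, which is why the interesting evaluation question is performance over the tail.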
The Tail of NED: We define the tail, torso, and head of NED as entities occurring less than 11 times, between 11 and 1,000 times, and more than 1,000 times in training, respectively. Following Figure 1 (right), the head represents those entities a simple language-based baseline model can easily resolve, as shown by a baseline SotA model from [16] achieving 86 F1 over all entities. These entities were seen enough times during training to memorize distinguishing contextual cues. The tail represents the entities these models struggle to resolve due to their rarity in training data, as shown by the same baseline model achieving less than 28 F1 on the tail.

4 We computed this statistic by computing the number of proper nouns and the number of pronouns/known aliases for an entity on that entity's page that were not already linked.

2.1 Four Reasoning Patterns

When humans disambiguate entities in text, they conceptually leverage signals over entities, relationships, and types. Our empirical analysis reveals a set of desirable reasoning patterns for NED. The patterns operate at different levels of granularity (see Figure 1 (left))—from patterns which are highly specific to an entity, to patterns which apply to categories of entities—and are defined as follows.

• Entity Memorization: We define entity memorization as the factual knowledge associated with a specific entity. Disambiguating "Lincoln" in the text "Where is Lincoln, Nebraska?" requires memorizing that "Lincoln, Nebraska", not "Abraham Lincoln", frequently occurs with the text "Nebraska" (Figure 1 (left)). This pattern is easily learned by now-standard Transformer-based language models. As this pattern is at the entity level, it is the least general pattern.
• Type Consistency: Type consistency is the pattern that certain textual signals indicate that the types of entities in a collection are likely similar. For example, when disambiguating "Lincoln" in the text "Is a Lincoln or Ford more expensive?", the keyword "or" indicates that the entities in the pair (or sequence) are likely of the same Wikidata type, "car company". Type consistency is a more general pattern than entity memorization, covering 12% of the tail examples in a sample of Wikipedia.5
• KG Relations: We define the knowledge graph (KG) relation pattern as when two candidates have a known KG relationship and textual signals indicate that the relation is discussed in the sentence. For example, when disambiguating "Lincoln" in the sentence "Where is Lincoln in Logan County?", "Lincoln, IL" has the KG relationship "capital of" with Logan County while "Lincoln, NE" does not. The keyword "in" is associated with the relation "capital of" between two location entities, indicating that "Lincoln, IL" is correct, despite being the less popular candidate entity associated with "Lincoln". As patterns over pairs of entities with KG relations cover 23% of the tail examples, this is a more general reasoning pattern than consistency.
• Type Affordance: We define type affordance as the textual signals associated with a specific entity type in natural language. For example, "Manhattan" is likely resolved to the cocktail rather than the borough in the sentence "He ordered a Manhattan." due to the affordance that drinks, not locations, are "ordered". As affordance signals cover 76% of the tail examples, it is the most general reasoning pattern.

5 Coverage numbers are calculated from representative slices of Wikipedia that require each reasoning pattern. Additional details in Section 5.

Required Structural Resources: An NED system requires entity, relation, and type knowledge signals to learn these reasoning patterns. Entity knowledge is captured in unstructured text, while relation signals and type signals are readily available in structured knowledge bases such as Wikidata: from a sample of Wikipedia, 27% of all mentions and 23% of tail mentions participate in a relation, and 97% of all mentions and 92% of tail mentions are assigned some type in Wikidata. As these structural resources are readily available for all entities, they are useful for generalizing to the tail. A rare entity with a particular type or relation can leverage textual patterns learned from every other entity with that type or relation. Given the input signals and reasoning patterns, the next key challenge is ensuring that the model combines the discriminative entity and more general relation and type signals that are useful for disambiguation.
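As a concrete illustration, the structural profile attached to each entity might look as follows; the entity ids, type names, and relation names below are illustrative placeholders rather than actual Wikidata identifiers:

from dataclasses import dataclass, field
from typing import List

@dataclass
class EntityProfile:
    entity_id: str                                      # a knowledge-base id
    types: List[str] = field(default_factory=list)      # Wikidata-style types
    relations: List[str] = field(default_factory=list)  # outgoing KG relations

# A rare city still inherits every textual pattern learned for the "city"
# type and the "capital_of" relation from more popular entities.
lincoln_il = EntityProfile(
    entity_id="lincoln_il",
    types=["city"],
    relations=["capital_of:logan_county_il"],
)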
3 Bootleg Architecture for Tail Disambiguation

We now describe our approach to leverage the reasoning patterns based on entity, relation, and type signals. We then present our new regularization scheme to inject inductive bias about when to use general versus discriminative reasoning patterns, and our weak labeling technique to extract more signal from the self-supervision training data.

Figure 2: Bootleg's neural model. The entity, type, and relation embeddings are generated for each candidate and concatenated to form our entity representation matrix E. This, together with our word embedding matrix W, are inputs to Bootleg's Ent2Ent, Phrase2Ent, and KG2Ent modules, which aim to encode the four reasoning patterns. The most likely candidate for each mention is returned.

3.1 Encoding the Signals

We first encode the structural signals—entities, KG relations, and types—by mapping each to a set of embeddings.

• Entity Embedding: Each entity e is represented by a unique embedding u_e.
• Type Embedding: Let T be the set of possible entity types. Given a known mapping from an entity e to its set {t_{e,1}, ..., t_{e,T} | t_{e,i} ∈ T} of T possible types, Bootleg assigns an embedding t_{e,i} to each type. Because an entity can have multiple types, we use an additive attention [3], AddAttn, to create a single type embedding t_e = AddAttn([t_{e,1}, ..., t_{e,T}]). We further allow the model to leverage coarse named entity recognition types through a mention-type prediction module (see Appendix A for details). This coarse predicted type is concatenated with the assigned type to form t_e.
• Relation Embedding: Let R represent the set of possible relationships any entity can participate in. Similar to types, given a mapping from an entity e to its set {r_{e,1}, ..., r_{e,R} | r_{e,i} ∈ R} of R relationships, Bootleg assigns an embedding r_{e,i} to each relation. Because an entity can participate in multiple relations, we use the additive attention to compute r_e = AddAttn([r_{e,1}, ..., r_{e,R}]).

As in existing work [16, 40], given the input sentence of length N and set of M mentions, Bootleg generates for each mention m_i a set Γ(m_i) = {e_i^1, ..., e_i^K} of K possible entity candidates that could be referred to by m_i. For each candidate and its associated types and relations, Bootleg uses a multi-layer perceptron e = MLP([u_e, t_e, r_e]) to generate a vector representation for each candidate entity, for each mention. We denote this entity matrix as E ∈ R^(M×K×H), where H is the hidden dimension. We use BERT to generate contextual embeddings for each token in the input sentence. We denote this sentence embedding as W ∈ R^(N×H). W and E are passed to Bootleg's model architecture, described next.
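This candidate encoding lends itself to a compact sketch. The PyTorch-style code below is illustrative only; the hidden sizes, the exact attention parameterization, and all names are assumptions rather than the released Bootleg implementation:

import torch
import torch.nn as nn

class AddAttn(nn.Module):
    """Additive attention that pools a set of embeddings into one vector."""
    def __init__(self, dim):
        super().__init__()
        self.score = nn.Sequential(nn.Linear(dim, dim), nn.Tanh(), nn.Linear(dim, 1))
    def forward(self, x):                            # x: (..., S, dim)
        w = torch.softmax(self.score(x), dim=-2)     # weights over the S items
        return (w * x).sum(dim=-2)                   # (..., dim)

class CandidateEncoder(nn.Module):
    """Builds e = MLP([u_e, t_e, r_e]) for each of the K candidates per mention."""
    def __init__(self, num_ents, num_types, num_rels, dim, hidden):
        super().__init__()
        self.ent = nn.Embedding(num_ents, dim)
        self.typ = nn.Embedding(num_types, dim)
        self.rel = nn.Embedding(num_rels, dim)
        self.typ_attn = AddAttn(dim)
        self.rel_attn = AddAttn(dim)
        self.mlp = nn.Sequential(nn.Linear(3 * dim, hidden), nn.ReLU(),
                                 nn.Linear(hidden, hidden))
    def forward(self, ent_ids, type_ids, rel_ids):
        # ent_ids: (M, K); type_ids: (M, K, T); rel_ids: (M, K, R)
        u = self.ent(ent_ids)                        # (M, K, dim)
        t = self.typ_attn(self.typ(type_ids))        # pool T type embeddings
        r = self.rel_attn(self.rel(rel_ids))         # pool R relation embeddings
        return self.mlp(torch.cat([u, t, r], dim=-1))  # E: (M, K, H)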
3.2 Bootleg Model Architecture

The design goal of Bootleg is to capture the reasoning patterns by modeling textual signals associated with entities (for entity memorization), co-occurrences between entity types (for type consistency), textual signals associated with relations along with which entities are explicitly linked in the KG (for KG relations), and textual signals associated with types (for type affordance). We design three modules to capture these design goals: a phrase memorization module, a co-occurrence memorization module, and a knowledge graph connection module. The model architecture is shown in Figure 2. We describe each module next.

Phrase Memorization Module: We design the phrase memorization module, Phrase2Ent, to encode the dependencies between the input text and the entity, relation, and type embeddings. The purpose of this module is to learn textual cues for the entity memorization and type affordance patterns. It should also learn relation context for the KG relation pattern. It will, for example, allow the person type embedding to encode the association with the keyword "height". The module accepts as input E and W and outputs E_p = MHA(E, W), where MHA is the standard multi-headed attention with a feed-forward layer and skip connections [48].

Co-occurrence Memorization Module: We design the co-occurrence memorization module, Ent2Ent, to encode the dependencies between entities. The purpose of the Ent2Ent module is to learn textual cues for the type consistency pattern. The module accepts E and computes E_c = MHA(E) using self-attention.

Knowledge Graph (KG) Connection Module: We design the KG module, KG2Ent, to collectively resolve entities based on pairwise connectivity features. Let K represent the adjacency matrix of a (possibly weighted) graph where the nodes are entities and an edge between e_i and e_j signifies that the two entities share some pairwise feature. Given E, KG2Ent computes E_k = softmax(K + wI)E + E, where I is the identity and w is a learned scalar weight that allows Bootleg to learn to balance the original entity and its connections. This module allows for representation transfer between two related entities, meaning entities with a high-scoring representation will boost the score of related entities. The second term acts as a skip connection between the input and output. In Bootleg, we allow the user to specify multiple KG2Ent modules: one for each adjacency matrix. The purpose of KG2Ent, along with Phrase2Ent, is to learn the KG relation pattern.

End-to-End: The computations for one layer of Bootleg are:

E' = MHA(E, W) + MHA(E)
E_k = softmax(K + wI)E' + E'

where E_k is passed as the entity matrix to the next layer. After the final layer, Bootleg scores each entity by computing S_dis = max(E_k v^T, E' v^T) with S_dis ∈ R^(M×K) and learned scoring vector v ∈ R^H. Bootleg then outputs the highest scoring candidate for each mention. This scoring treats E_k and E' as two separate predictions in an ensemble method, allowing the model to use collective reasoning from E_k when it achieves the highest scoring representation. If there are multiple KG2Ent modules, we use the average of their outputs as input to the next layer and, for scoring, take the maximum score across all outputs. For training, we use the cross-entropy loss of S_dis to compute the disambiguation loss L_dis.
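A minimal sketch of one such layer follows, assuming the candidate matrix is flattened to (batch, M*K, H). It simplifies the paper's MHA (no explicit feed-forward sublayer or skip connections inside the attention blocks) and uses an assumed head count, so it is a schematic of the equations above rather than the released model:

import torch
import torch.nn as nn

class BootlegLayer(nn.Module):
    """One layer: Phrase2Ent + Ent2Ent form E', then KG2Ent's collective step."""
    def __init__(self, hidden, heads=8):
        super().__init__()
        self.phrase2ent = nn.MultiheadAttention(hidden, heads, batch_first=True)
        self.ent2ent = nn.MultiheadAttention(hidden, heads, batch_first=True)
        self.w = nn.Parameter(torch.zeros(1))   # learned scalar on the identity

    def forward(self, E, W, K_adj):
        # E: (B, M*K, H) candidates; W: (B, N, H) tokens; K_adj: (B, M*K, M*K)
        phrase, _ = self.phrase2ent(E, W, W)    # attend from entities to words
        ent, _ = self.ent2ent(E, E, E)          # attend among entities
        E_prime = phrase + ent                  # E' = MHA(E, W) + MHA(E)
        I = torch.eye(E.size(1), device=E.device).unsqueeze(0)
        E_k = torch.softmax(K_adj + self.w * I, dim=-1) @ E_prime + E_prime
        return E_prime, E_k

def score(E_prime, E_k, v):
    # S_dis = max(E_k v^T, E' v^T): treat the two outputs as an ensemble.
    return torch.maximum(E_k @ v, E_prime @ v)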
3.3 Improving Tail Generalization

Regularization is the standard technique to encourage models to generalize, as models will naturally fit to discriminative features. However, we demonstrate that standard regularization is not effective when we want to leverage a combination of general and discriminative signals. We then present two techniques, regularization and weak labeling, to encourage Bootleg to incorporate general structural signals and learn general reasoning patterns.

3.3.1 Regularization

We hypothesize that Bootleg will over-rely on the more discriminative entity features compared to the more general type and relation features to lower training loss. However, tail disambiguation requires Bootleg to leverage the general features. Using standard regularization techniques, we evaluate three models which respectively use only type embeddings, only relation embeddings, and a combination of type, relation, and entity embeddings. Bootleg's performance on unseen entities is 10 F1 points worse on the latter than on either of the former two, suggesting that standard regularization is not sufficient when the signals operate at different granularities (details in Table 9 in Appendix B).

We can improve tail performance if Bootleg leverages memorized discriminative features for popular entities and general features for rare entities. We achieve this by designing a new regularization scheme for the entity-specific embedding u, which has two key properties: it is 2-dimensional, and more popular entities are regularized less than less popular ones.

• 2-dimensional: In contrast to 1-dimensional dropout, 2-dimensional regularization involves masking the full embedding. With probability p(e), we set u = 0 before the MLP layer; i.e., e = MLP([0, t_e, r_e]). By entirely masking the entity embedding in these cases, the model learns to disambiguate using the type and relation patterns, without entity knowledge.
• Inverse Popularity: We find in ablations (Appendix B) that setting p(e) proportional to a power of the inverse of entity e's popularity in the training data (i.e., the more popular, the less regularized) gives us the best performance and improves by 13.6 F1 on unseen entities over standard regularization. In contrast, fixing p(e) at 80% improves performance by over 11.3 F1 over standard regularization, and regularizing proportional to a power of popularity only improves performance by 3.8 F1 (details in Section 4).

The regularization scheme encourages Bootleg to use entity-specific knowledge when the entity is seen enough times to memorize entity patterns and encourages the use of generalizable patterns over the rare, highly masked, entities.
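A minimal sketch of the 2D entity-embedding mask follows; the exponent alpha and the probability cap are illustrative constants, not the values used in the paper:

import torch

def entity_mask_prob(counts, alpha=0.9, p_max=0.95):
    """p(e) proportional to a power of inverse popularity: rarer entities
    are masked more often (constants here are assumptions)."""
    counts = counts.clamp(min=1).float()
    return (counts ** -alpha).clamp(max=p_max)

def mask_entity_embeddings(u, counts, training=True):
    # u: (M, K, D) entity embeddings; counts: (M, K) training popularity.
    if not training:
        return u
    p = entity_mask_prob(counts)
    keep = (torch.rand_like(p) >= p).float().unsqueeze(-1)
    return u * keep   # zero the WHOLE embedding (2D mask), not single units

The masked embedding then feeds the usual e = MLP([u, t_e, r_e]) concatenation, so when u is zeroed the model must fall back on type and relation signals.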
3.3.2 Weakly Supervised Data Labeling

We use Wikipedia to train Bootleg: we define a self-supervision task in which the internal links in Wikipedia are the gold entity labels for mentions during training. Although this dataset is large and widely used, it is often incomplete, with an estimated 68% of named entities being unlabeled. Given the scale and the requirement that Bootleg be self-supervised, it is not feasible to hand-label the data. Our insight is that because Wikipedia is highly structured—most sentences on an entity's Wikipedia page refer to that entity via pronouns or alternative names—we can weakly label our training data [44] to label mentions. We use two heuristics for weak labeling: the first labels pronouns that match the gender of a person's Wikipedia page as references to that person, and the second labels known alternative names for an entity if the alternative name appears in sentences on the entity's Wikipedia page. Through weak labeling, we increase the number of labeled mentions in the training data by 1.7x across Wikipedia, and find this provides a 2.6 F1 lift on unseen entities (full results in Appendix B Table 11).

4 Experiments

We demonstrate that Bootleg (1) nearly matches or exceeds state-of-the-art performance on three standard NED benchmarks and (2) outperforms a BERT-based NED baseline on the tail. As NED is critical for downstream tasks that require the knowledge of entities, we (3) verify Bootleg's learned reasoning patterns can transfer by using them for a downstream task: using Bootleg's learned representations, we achieve a new SotA on the TACRED relation extraction task and improve performance on a production task at a major technology company by 8%. Finally, we (4) demonstrate that Bootleg can be sample-efficient by using only a fraction of its learned entity embeddings without sacrificing performance. We (5) ablate Bootleg to understand the impact of the structural signals and the regularization scheme on improved tail performance.

4.1 Experimental Setup

Wikipedia Data: We define our knowledge base as the set of entities with mentions in Wikipedia (for a total of 5.3M entities). We allow each mention to have up to K = 30 possible candidates. As Bootleg is a sentence disambiguation system, we train on individual sentences from Wikipedia, where the anchor links and our weak labeling (Section 3.3) serve as mention labels.

Table 1: We compare Bootleg to the best published numbers on three NED benchmarks. "-" indicates that the metric was not reported. Bolded numbers indicate the best value for each metric on each benchmark.
Benchmark | Model | Precision | Recall | F1
KORE50 | Hu et al. [24] | 80.0 | 79.8 | 79.9
KORE50 | Bootleg | 86.0 | 85.4 | 85.7
RSS500 | Phan et al. [40] | 82.3 | 82.3 | 82.3
RSS500 | Bootleg | 82.5 | 82.5 | 82.5
AIDA | Févry et al. [16] | - | 96.7 | -
AIDA | Bootleg | 96.9 | 96.7 | 96.8

Our candidate lists Γ are mined from Wikipedia anchor links and the "also known as" field in Wikidata. For each person, we further add their first and last name as aliases linking to that person. We use the mention boundaries provided in the Wikipedia data and generate candidates by performing a direct lookup in Γ. We use the Wikidata and YAGO knowledge graphs and Wikipedia to extract structural data about entity types and relations as input for Bootleg. Further details about the data are in Appendix B.

Metrics: We report micro-average F1 scores for all metrics over true anchor links in Wikipedia (not weak labels). We measure the torso and tail sets based on the number of times that an entity is the gold entity across Wikipedia anchors and weak labels, as this represents the number of times an entity is seen by Bootleg. For benchmarks, we also report precision and recall using the number of mentions extracted by Bootleg and the number of mentions defined in the data as denominators, respectively. The numerator is the number of correctly disambiguated mentions. For Wikipedia data experiments, we filter mentions such that (a) the gold entity is in the candidate set and (b) they have more than one possible candidate. The former is to decouple candidate generation from model performance for ablations.6 The latter is to not inflate a model's performance, as all models are trivially correct when there is a single candidate.

6 We drop only 1% of mentions from this filter.

Training: For our main Bootleg model, we train for two epochs on Wikipedia sentences with a maximum sentence length of 100. For our benchmark model, we train for one epoch and additionally add a title embedding feature, a sentence co-occurrence KG matrix as another KG module, and a Wikipedia page co-occurrence statistical feature. Additional details about the models and training procedure are in Appendix B.

4.2 Bootleg Performance

Benchmark Performance: To understand the overall performance of Bootleg, we compare against reported state-of-the-art numbers on two standard sentence benchmarks (KORE50, RSS500) and the standard document benchmark (AIDA CoNLL-YAGO). Benchmark details are in Appendix B. For AIDA, we first convert each document into a set of sentences, where a sentence is the document title, a BERT SEP token, and the sentence. We find this is sufficient to encode document context into Bootleg.
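The document-to-sentence conversion is simple enough to state in a couple of lines; the function name below is hypothetical:

def doc_to_bootleg_inputs(title, sentences):
    # Prepend the document title and a separator token so each sentence
    # carries document context (the title + [SEP] + sentence scheme above).
    return [f"{title} [SEP] {s}" for s in sentences]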
We fine-tune the pretrained Bootleg model on the AIDA training set with a learning rate of 0.00007, 2 epochs, a batch size of 16, and evaluation every 25 steps. We choose the test score associated with the best validation score.8 In Table 1, we show that Bootleg achieves up to 5.8 F1 points higher than prior reported numbers on benchmarks.

Tail Performance: To validate that Bootleg improves tail disambiguation, we compare against a baseline model from Févry et al. [16], which we refer to as NED-Base.9 NED-Base learns entity embeddings by maximizing the dot product between the entity candidates and fine-tuned BERT-contextual representations of the mention. NED-Base is successful overall on the validation set, achieving 85.9 F1 points, which is within 5.4 F1 points of Bootleg (Table 2). However, when we examine performance over the torso and tail, we see that Bootleg outperforms NED-Base by 8 and 41.2 F1 points, respectively. Finally, on unseen entities, Bootleg outperforms NED-Base by 50 F1 points. Note that NED-Base only has access to textual data, indicating that text is often sufficient for popular entities, but not for rare entities.

Table 2: (top) We compare Bootleg to a BERT-based NED baseline (NED-Base) on validation sets of a Wikipedia dataset. We report micro-average F1 scores. All torso, tail, and unseen validation sets are filtered by the number of entity occurrences in the training data and such that the mention has more than one candidate.
Model | All Entities | Torso Entities | Tail Entities | Unseen Entities
NED-Base | 85.9 | 79.3 | 27.8 | 18.5
Bootleg | 91.3 | 87.3 | 69.0 | 68.5
Bootleg (Ent-only) | 85.8 | 79.0 | 37.9 | 14.9
Bootleg (Type-only) | 88.0 | 81.6 | 62.9 | 61.6
Bootleg (KG-only) | 87.1 | 79.4 | 64.0 | 64.7
# Mentions | 4,065,778 | 1,911,590 | 162,761 | 9,626

8 We use the standard candidate list from Pershina et al. [36] when comparing to existing systems for fine-tuning and inference for AIDA CoNLL-YAGO.
9 As code for the model from Févry et al. [16] is not publicly available, we re-implemented the model. We used our candidate generators and fine-tuned a pretrained BERT encoder rather than training a BERT encoder from scratch, as is done in Févry et al. [16]. We trained NED-Base on the same weakly labelled data as Bootleg for 2 epochs.

4.3 Downstream Evaluation

Relation Extraction: Using the learned representations from Bootleg, we achieve the new state-of-the-art on TACRED, a standard relation extraction benchmark. TACRED involves identifying the relationship between a specified subject and object in an example sentence as one of 41 relation types (e.g., spouse) or no relation. Relation extraction is well-suited for evaluating Bootleg because the substrings in the text can refer to many different entities, and the disambiguated entities impact the set of likely relations. Given an example, we run inference with the Bootleg model to disambiguate named entities and generate the contextual Bootleg entity embedding matrix, which we feed to a simple Transformer architecture that uses SpanBERT [27] (details in Appendix C). We achieve a micro-average test F1 score of 80.3, which improves upon the prior state of the art—KnowBERT [38], which also uses entity-based knowledge—by 1.0 F1 points and the baseline SpanBERT model by 2.3 F1 points on TACRED-Revisited data (Table 3) ([53], Alt et al. [2]). We find that the Bootleg downstream model corrects errors made by the SpanBERT baseline, for example by leveraging entity, type, and relation information or recognizing that different textual aliases refer to the same entity (see Table 4).
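One way such a fusion can be wired is sketched below. This is illustrative only; the authors' exact downstream architecture is described in their Appendix C, and the dimensions and pooling here are assumptions:

import torch
import torch.nn as nn

class RelationHead(nn.Module):
    """Fuse pooled text features with pooled Bootleg entity embeddings
    for relation classification (an assumed, simplified design)."""
    def __init__(self, text_dim, ent_dim, num_relations):
        super().__init__()
        self.classifier = nn.Linear(text_dim + ent_dim, num_relations)

    def forward(self, span_feats, bootleg_ent_embs):
        # span_feats: (B, text_dim) pooled SpanBERT features for the subj/obj pair
        # bootleg_ent_embs: (B, ent_dim) pooled contextual entity embeddings
        return self.classifier(torch.cat([span_feats, bootleg_ent_embs], dim=-1))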
Table 3: Test micro-average F1 score on the revised TACRED dataset.
Model | F1
Bootleg Model | 80.3
KnowBERT | 79.3
SpanBERT | 78.0

Table 4: The following are examples of how the contextual entity representation from Bootleg, generated from entity, relation, and type signals, can help our downstream model. We provide the TACRED example, the signals provided by Bootleg, and our model's and the baseline SpanBERT model's predictions.
TACRED Example | Bootleg Signals | Our Prediction | SpanBERT Prediction
"Vincent Astor, like Marshall (subj), died unexpectedly of a heart attack (obj) in 1959 ..." Gold relation: Cause of Death | Disambiguates "Marshall" to Thomas Riley Marshall and "heart attack" to myocardial infarction, which have the Wikidata relation "cause of death" | Cause of Death | No Relation
"The International Water Management (obj) Institute or IWMI (subj) study said both ..." Gold relation: Alternate Names | Disambiguates the alias "International Water Management Institute" and its acronym, the alias "IWMI", to the same Wikidata entity | Alternate Names | No Relation

In studying the slices for which the Bootleg downstream model improves upon the baseline SpanBERT model, we rank TACRED examples in three ways: by the proportion of words that Bootleg disambiguates as entities, the proportion for which it leverages Wikidata relations for the embedding, and the proportion for which it leverages Wikidata types for the embedding. For each of these three, we report the gap between the SpanBERT model's and the Bootleg model's error rates on the examples with above-median proportion (more Bootleg signal) relative to the below-median proportion (less Bootleg signal). We find that the relative gap between the baseline and Bootleg error rates is larger on the slice above the median than below it by 1.10x, 4.67x, and 1.35x respectively: with more Bootleg information, the improvement our SotA model provides over SpanBERT increases (more details in Appendix C).

Industry Use Case: We additionally demonstrate how the learned entity embeddings from Bootleg provide useful information to a system at a large technology company that answers factoid queries such as "How tall is the president of the United States?". We use Bootleg's embeddings in the Overton [45] system and compare to the same system without Bootleg embeddings as the baseline. We measure the overall test quality (F1) on an in-house entity disambiguation task as well as the quality over the tail slices, which include unseen entities. Per company policy, we report relative to the baseline rather than the raw F1 score; for example, if the baseline F1 score is 80.0 and the subject F1 is 88.0, the relative quality is 88.0/80.0 = 1.1. Table 5 shows that the use of Bootleg's embeddings consistently results in a positive relative quality, even over Spanish, French, and German, where improvements are most visible over tail entities.

Table 5: Relative F1 quality of an Overton [45] model with Bootleg embeddings over one without in four languages.
Validation Set | English | Spanish | French | German
All Entities | 1.08 | 1.03 | 1.02 | 1.00
Tail Entities | 1.08 | 1.17 | 1.05 | 1.03

4.4 Memory Usage

We explore the memory usage of Bootleg during inference and demonstrate that by only using the entity embeddings for the top 5% of entities, ranked by popularity in the training data, Bootleg reduces its embedding memory consumption by 95% while sacrificing only 0.8 F1 points over all entities. We find that the 5.3M entity embeddings used in Bootleg consume the most memory, taking 5.2 GB of space, while the attention network only consumes 39 MB (1.37B updated model parameters in total, 1.36B from embeddings). As Bootleg's representations must be used in a variety of downstream tasks, the representations must be memory-efficient: we thus study the effect of reducing Bootleg's memory footprint by only using the most popular entity embeddings. Specifically, for the top k% of entities ranked by the number of occurrences in training data, we keep the learned entity embedding intact. For the remaining entities, we choose a random entity embedding for an unseen entity to use instead. Instead of storing 5.3M entity embeddings, we thus store (k/100) * 5.3M, which gives a compression ratio of (100 - k).

Figure 3: We show the error across all entities, torso entities, tail entities, and unseen entities as we decrease the number of embeddings we use during inference, assigning the non-popular entities to a fixed unseen-entity embedding. For example, a compression ratio of 80 means only the top 20% of entity embeddings are used, ranked by entity popularity.

Figure 3 shows performance for k of 100, 50, 20, 10, 5, 1, and 0.1. We see that when just the top 5% of entity embeddings are used, we only sacrifice 0.8 F1 points overall and in fact score 2 F1 points higher over the tail. We hypothesize that the increase in tail performance is due to the fact that the majority of mention candidates all have the same learned embedding, decreasing the amount of conflict among candidates from textual patterns.
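A sketch of this compression step is below. It deviates from the paper in one labeled way: the shared fallback row here is the mean embedding rather than a randomly chosen unseen-entity embedding, and the function name is hypothetical:

import torch

def compress_entity_embeddings(emb, counts, keep_frac=0.05):
    """Store only the top keep_frac of rows (by training popularity) plus one
    shared fallback row; remap every other entity to the fallback."""
    num_keep = max(1, int(keep_frac * emb.size(0)))
    top = torch.topk(counts, num_keep).indices
    fallback = emb.mean(dim=0, keepdim=True)        # stand-in unseen-entity row
    small_table = torch.cat([fallback, emb[top]])   # (num_keep + 1, D)
    remap = torch.zeros(emb.size(0), dtype=torch.long)  # default: row 0
    remap[top] = torch.arange(1, num_keep + 1)
    return small_table, remap   # lookup: small_table[remap[entity_id]]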
4.5 Ablation Study

Bootleg: To better understand the performance gains of Bootleg, we perform an ablation study over a subset of Wikipedia (data details explained in Appendix B). We train Bootleg with: (1) only learned entity embeddings (Ent-only), (2) only type information from type embeddings (Type-only), and (3) only knowledge graph information from relation embeddings and knowledge graph connections (KG-only). All model sizes are reported in Appendix B Table 10. In Table 2, we see that just using type or knowledge graph information leads to improvements on the tail of over 25 F1 points and on the unseen entities of over 46 F1 points compared to the Ent-only model. However, neither the Type-only nor KG-only model performs as well on any of the validation sets as the full Bootleg model. An interesting comparison is between Ent-only and NED-Base. NED-Base overall outperforms Ent-only due to the fine-tuning of BERT word embeddings. We attribute the high performance of Ent-only on the tail compared to NED-Base to our Ent2Ent module, which allows for memorizing co-occurrence patterns over entities.

Regularization: To understand the impact of our entity regularization function p(e) on overall performance, we perform an ablation study on a sample of Wikipedia (explained in Appendix B). We apply (1) a fixed regularization set to a constant percent of 0, 20, 50, and 80, (2) a regularization function proportional to a power of the inverse popularity, and (3) the inverse of (2). Table 6 shows results over unseen entities (full results and details in Appendix B). We see that the fixed regularization of 80% achieves the highest F1 over the fixed regularizations of (1). The method that regularizes by inverse popularity achieves the highest overall F1. We further see that the scheme where popular entities are more regularized sees a drop of 9.8 F1 points in performance compared to the inverse popularity scheme.

Table 6: We show the micro F1 score over unseen entities for a Wikipedia sample as we vary the entity regularization scheme p(e). A scalar percent means a fixed regularization. InvPop (inverse popularity scheme) applies less regularization for more popular entities and Pop applies more regularization for more popular entities.
p(e) | 0% | 20% | 50% | 80% | Pop | InvPop
Unseen Entities | 48.6 | 52.5 | 57.7 | 59.9 | 52.4 | 62.2

5 Analysis

We have shown that Bootleg excels on benchmark tasks and that Bootleg's learned patterns can transfer to non-NED tasks. We now verify whether the defined entity, type consistency, KG relation, and affordance reasoning patterns are responsible for these results. We evaluate each over a representative slice of the Wikipedia validation set that exemplifies one of the reasoning patterns and present the results from each ablated model (Table 7).

• Entity: To evaluate whether Bootleg captures factual knowledge about entities in the form of textual entity cues, we consider the slice of 28K overall, 5K tail examples where the gold entity has no relation or type signals available.
• Type Consistency: To evaluate whether Bootleg captures consistency patterns, we consider the slice of 312K overall, 19K tail examples that contain a list of three or more sequential distinct gold entities, where all items in the list share at least one type.
• KG Relation: To evaluate whether Bootleg captures KG relation patterns, we consider the slice of 1.1M overall, 37K tail examples for which the gold entities are connected by a known relation in the Wikidata knowledge graph.
• Type Affordance: To evaluate whether Bootleg captures affordance patterns, we consider a slice where the sentence contains keywords that are afforded by the type of the gold entity. We mine the keywords afforded by a type by taking the 15 keywords that receive the highest TF-IDF scores over training examples with that type. This slice has 3.4M overall, 124K tail examples.

Pattern Analysis: For the slice representing each reasoning pattern, we find that Bootleg provides a lift over the Entity-only and NED-Base models, especially over the tail. We find that Bootleg generally combines the entity, relation, and type signals effectively, performing better than the individual Entity-only, KG-only, and Type-only models, although the KG-only model performs well on the KG relation slice. The lift from Bootleg across slices indicates the model's ability to capture the reasoning required for the slice. We provide additional details in Appendix D.

Table 7: We report the Overall/Tail F1 scores across each ablation model for a slice of data that exemplifies a reasoning pattern. Each slice is representative but may not cover every example that contains the reasoning pattern.
Model | Entity | Type Consistency | KG Relation | Type Affordance
NED-Base | 59/29 | 84/29 | 91/30 | 87/28
Bootleg | 66/47 | 95/85 | 98/92 | 93/73
Bootleg (Ent-only) | 59/31 | 87/45 | 90/42 | 87/39
Bootleg (Type-only) | 53/44 | 93/80 | 93/69 | 90/66
Bootleg (KG-only) | 40/29 | 92/79 | 97/93 | 89/68
% Coverage | 0.7%/3.3% | 8%/12% | 27%/23% | 84%/76%

Error Analysis: We next study the errors made by Bootleg and find four key error buckets.

• Granularity: Bootleg struggles with granularity, predicting an entity that is too general or too specific compared to the gold entity (example in Table 8). Considering the set of examples where the predicted entity is a Wikidata subclass of the gold entity or vice versa, Bootleg predicts a too general or too specific entity in 12% of overall and 7% of tail errors.
• Numerical: Bootleg struggles with entities containing numerical tokens, which may be due to the fact that the BERT model represents some numbers with sub-word tokens and is known to not perform as well for numbers as other language models [49] (example in Table 8). To evaluate examples requiring reasoning over numbers, we consider the slice of data where the entity title contains a year, as this is the most common numerical feature in a title. This slice covers 14% of overall and 25% of tail errors.
• Multi-Hop: There is room for improvement in multi-hop reasoning. In the example shown in Table 8, none of the present gold entities—Stillwater Santa Fe Depot, Citizens Bank Building (Stillwater, Oklahoma), Hoke Building (Stillwater, Oklahoma), or Walker Building (Stillwater, Oklahoma)—are directly connected in Wikidata; however, they share connections to the entity "Oklahoma". This indicates that the correct disambiguation is Citizens Bank Building (Stillwater, Oklahoma), not Citizens Bank Building (Burnsville, North Carolina). To evaluate examples requiring 2-hop reasoning, we consider examples where none of the present entities are directly linked in the KG, but a present pair connects to a different entity that is not present in the sentence. We find this occurs in 6% of overall and 7% of tail errors. This type of error represents a fundamental limitation of Bootleg, as we do not encode any form of multi-hop reasoning over a KG in Bootleg. Our KG information only encodes single-hop patterns (i.e., direct connections).
• Exact Match: Bootleg struggles on several examples in which the exact entity title is present in the text. Considering examples where the BERT baseline is correct but Bootleg is incorrect, in 28% of the examples the textual mention is an exact match of the entity title. Further, 32% of the examples contain a keyword from the entity title that Bootleg misses (example in Table 8). We attribute this decrease in performance to Bootleg's regularization. This mention-to-entity similarity would need to be encoded in Bootleg's entity embedding, but the regularization encourages Bootleg to not use entity-level information.

6 Related Work

We discuss related work in terms of both NED and the broader picture of self-supervised models and tail data. Standard, pre-deep-learning approaches to NED have been rule-based [1] or leverage statistical techniques and manual feature engineering to filter and rank candidates [50]. For example, link counts and similarity scores between entity titles and mentions are two such features [12]. These systems tend to be hard to maintain over time, with the work of Petasis et al. [37] building a model to detect when a rule-based NED system needs to be retrained and updated. In recent years, deep learning systems have become the new standard (see Mudgal et al. [32] for a high-level overview of deep learning approaches to entity disambiguation and entity matching problems).
The most recent state-of-the-art models generally rely on deep contextual word embeddings with entity embeddings [16, 46, 51]. As we showed in Table 2, these models perform well over popular entities but struggle to resolve the tail. Jin et al. [26] and Hoffart et al. [23] study disambiguation at the tail, and both rely on phrase-based language models for feature extraction. Unlike our work, they do not fuse type or knowledge graph information for disambiguation.

Table 8: We identify four key error buckets for Bootleg: granularity, numerical errors, multi-hop reasoning, and missed exact matches. We provide a Wikipedia example, the gold entity, and Bootleg's predicted entity for each.

Granularity. Example: "Posey is the recipient of a Golden Globe Award nomination, a Satellite Award nomination and two Independent Spirit Award nominations." Prediction: Satellite Awards. Gold: Satellite Award for Best Actress – Motion Picture.

Numerical. Example: "He competed in the individual road race and team time trial events at the 1976 Summer Olympics." Prediction: Cycling at the 1960 Summer Olympics – 1960 Men's Road Race. Gold: Cycling at the 1976 Summer Olympics – 1976 Men's Road Race.

Multi-hop. Example: "Other nearby historic buildings include the Santa Fe Depot, the Citizens Bank Building, the Hoke Building, the Walker Building, and the Courthouse." Prediction: Citizens Bank Building (Burnsville, North Carolina). Gold: Citizens Bank Building (Stillwater, Oklahoma).

Exact Match. Example: "According to the Nielsen Media Research, the episode was watched by 469 million viewers..." Prediction: Nielsen ratings. Gold: Nielsen Media Research.

Disambiguation with Types: Similar to our work, recent approaches have found that type information can be useful for entity disambiguation [9, 14, 21, 31, 43, 55]. Dredze et al. [14] use predicted coarse-grained types as entity features in an SVM classifier. Chen et al. [9] model type information as local context and integrate a BERT contextual embedding into the model from [17]. Raiman and Raiman [43] learn their own type system and perform disambiguation through type prediction alone (essentially capturing the type affordance pattern). Ling et al. [31] demonstrate that the 112-type FIGER type ontology can improve entity disambiguation, and the LATTE framework [55] uses multi-task learning to jointly perform type classification and entity disambiguation on biomedical data. Gupta et al. [21] add both an entity-level and a mention-level type objective using type embeddings. We build on these works, using fine- and coarse-grained entity-level type embeddings and a mention-level type prediction task.

Disambiguation with Knowledge Graphs: Several recent works have also incorporated (knowledge) graph information through graph embeddings [35], co-occurrences in the Wikipedia hyperlink graph [42], and the incorporation of latent relation variables [30] to aid disambiguation. Cetoli et al. [8] and Mulang et al. [33] incorporate Wikidata triples as context into entity disambiguation by encoding triples as textual phrases to use as additional inputs, along with the original text to disambiguate, into a language model. In Bootleg, the Wikidata connections through the KG2Ent module allow for collective resolution and are not just additional features.

Entity Knowledge in Downstream Tasks: The works of Broscheit [6], Peters et al. [38], Poerner et al. [41], and Zhang et al. [54] all try to add entity knowledge into a deep language model to improve downstream natural language task performance. Peters et al.
[38], Poerner et al. [41], and Zhang et al. [54] incorporate pretrained entity embeddings and finetune either on the standard masked sequence-to-sequence prediction task or combined with an entity disambiguation/linking task.10 On the other hand, Broscheit [6] trains its own entity embeddings. Most works, like Bootleg, see a lift from incorporating entity representations in downstream tasks.

10 Entity disambiguation refers to the setting where mentions are pre-detected in the text; entity linking includes the mention detection phase. In Bootleg, we focus on the entity disambiguation task.

Wikipedia Weak Labelling: Although uncommon, Broscheit [6], De Cao et al. [13], Ghaddar and Langlais [19], and Nothman et al. [34] all apply heuristic weak labelling techniques to increase link coverage in Wikipedia for either entity disambiguation or named entity recognition. All methods generally rely on finding known surface forms for entities and labelling those in the text. Bootleg is the first to investigate the lift from incorporating weakly labelled Wikipedia data over the tail.

Self-Supervision and the Tail: The works of Tata et al. [47], Chung et al. [11], Ilievski et al. [25], and Chung et al. [10] all focus on the importance of the tail during inference and the challenges of capturing it during training. They all highlight the data management challenges of monitoring the tail (and other missed slices of data) and improving generalizability. In particular, Ilievski et al. [25] study the tail in NED and encourage the use of separate head and tail subsets of data. From the broader perspective of natural language systems and generalizability, Ettinger et al. [15] highlight that many NLP systems are brittle in the face of tail linguistic patterns. Bootleg builds off this work, investigating the tail with respect to NED and demonstrating that generalizable reasoning patterns over structural resources can aid tail disambiguation.

7 Conclusion

We present Bootleg, a state-of-the-art NED system that is explicitly grounded in a principled set of reasoning patterns for disambiguation, defined over entities, types, and knowledge graph relations. The contributions of this work include the characterization and evaluation of core reasoning patterns for disambiguation, a new learning procedure to encourage the model to learn the patterns, and a weak supervision technique to increase utilization of the training data. We find that Bootleg improves over the baseline SotA model by over 40 F1 points on the tail of Wikipedia. Using Bootleg's entity embeddings for a downstream relation extraction task improves performance by 1.0 F1 points, and Bootleg's representations lead to an 8% lift on highly optimized production tasks at a major technology company. We hope this work inspires future research on improving tail performance by incorporating outside knowledge in deep models.

Acknowledgements: We thank Jared Dunnmon, Dan Fu, Karan Goel, Sarah Hooper, Monica Lam, Fred Sala, Nimit Sohoni, and Silei Xu for their valuable feedback and Pallavi Gudipati for help with experiments. We gratefully acknowledge the support of DARPA under Nos. FA86501827865 (SDH) and FA86501827882 (ASED); NIH under No. U54EB020405 (Mobilize); NSF under Nos. CCF1763315 (Beyond Sparsity), CCF1563078 (Volume to Velocity), and 1937301 (RTML); ONR under No.
N000141712266 (Unifying Weak Supervision); the Moore Foundation, NXP, Xilinx, LETI-CEA, Intel, IBM, Microsoft, NEC, Toshiba, TSMC, ARM, Hitachi, BASF, Accenture, Ericsson, Qualcomm, Analog Devices, the Okawa Foundation, American Family Insurance, Google Cloud, Swiss Re, the HAI-AWS Cloud Credits for Research program, and members of the Stanford DAWN project: Teradata, Facebook, Google, Ant Financial, NEC, VMWare, and Infosys. The U.S. Government is authorized to reproduce and distribute reprints for Governmental purposes notwithstanding any copyright notation thereon. Any opinions, findings, and conclusions or recommendations expressed in this material are those of the authors and do not necessarily reflect the views, policies, or endorsements, either expressed or implied, of DARPA, NIH, ONR, or the U.S. Government.

References

[1] John Aberdeen, John D Burger, David Day, Lynette Hirschman, David D Palmer, Patricia Robinson, and Marc Vilain. MITRE: Description of the Alembic system as used in MET. In TIPSTER TEXT PROGRAM PHASE II: Proceedings of a Workshop held at Vienna, Virginia, May 6-8, 1996, pages 461–462, 1996.

[2] Christoph Alt, Aleksandra Gabryszak, and Leonhard Hennig. TACRED revisited: A thorough evaluation of the TACRED relation extraction task. In ACL, 2020.

[3] Dzmitry Bahdanau, Kyunghyun Cho, and Yoshua Bengio. Neural machine translation by jointly learning to align and translate. arXiv preprint arXiv:1409.0473, 2014.

[4] Michael S Bernstein, Jaime Teevan, Susan Dumais, Daniel Liebling, and Eric Horvitz. Direct answers for search queries in the long tail. In SIGCHI, 2012.

[5] Terra Blevins and Luke Zettlemoyer. Moving down the long tail of word sense disambiguation with gloss-informed biencoders. arXiv preprint arXiv:2005.02590, 2020.

[6] Samuel Broscheit. Investigating entity knowledge in BERT with simple neural end-to-end entity linking. arXiv preprint arXiv:2003.05473, 2020.

[7] Tom B Brown, Benjamin Mann, Nick Ryder, Melanie Subbiah, Jared Kaplan, Prafulla Dhariwal, Arvind Neelakantan, Pranav Shyam, Girish Sastry, Amanda Askell, et al. Language models are few-shot learners. arXiv preprint arXiv:2005.14165, 2020.

[8] Alberto Cetoli, Mohammad Akbari, Stefano Bragaglia, Andrew D O'Harney, and Marc Sloan. Named entity disambiguation using deep learning on graphs. arXiv preprint arXiv:1810.09164, 2018.

[9] Shuang Chen, Jinpeng Wang, Feng Jiang, and Chin-Yew Lin. Improving entity linking by modeling latent entity type information. arXiv preprint arXiv:2001.01447, 2020.

[10] Yeounoh Chung, Peter J Haas, Eli Upfal, and Tim Kraska. Unknown examples & machine learning model generalization. arXiv preprint arXiv:1808.08294, 2018.

[11] Yeounoh Chung, Neoklis Polyzotis, Kihyun Tae, Steven Euijong Whang, et al. Automated data slicing for model validation: A big data-AI integration approach. TKDE, 2019.

[12] Silviu Cucerzan. Large-scale named entity disambiguation based on Wikipedia data. In Proceedings of the 2007 Joint Conference on Empirical Methods in Natural Language Processing and Computational Natural Language Learning (EMNLP-CoNLL), pages 708–716, 2007.

[13] Nicola De Cao, Gautier Izacard, Sebastian Riedel, and Fabio Petroni. Autoregressive entity retrieval. arXiv preprint arXiv:2010.00904, 2020.

[14] Mark Dredze, Paul McNamee, Delip Rao, Adam Gerber, and Tim Finin. Entity disambiguation for knowledge base population. In Proceedings of the 23rd International Conference on Computational Linguistics (Coling 2010), pages 277–285, 2010.
[15] Allyson Ettinger, Sudha Rao, Hal Daumé III, and Emily M Bender. Towards linguistically generalizable NLP systems: A workshop and shared task. arXiv preprint arXiv:1711.01505, 2017.

[16] Thibault Févry, Nicholas FitzGerald, Livio Baldini Soares, and Tom Kwiatkowski. Empirical evaluation of pretraining strategies for supervised entity linking. In AKBC, 2020.

[17] Octavian-Eugen Ganea and Thomas Hofmann. Deep joint entity disambiguation with local neural attention. arXiv preprint arXiv:1704.04920, 2017.

[18] Daniel Gerber, Sebastian Hellmann, Lorenz Bühmann, Tommaso Soru, Ricardo Usbeck, and Axel-Cyrille Ngonga Ngomo. Real-time RDF extraction from unstructured data streams. In ISWC, 2013.

[19] Abbas Ghaddar and Philippe Langlais. WiNER: A Wikipedia annotated corpus for named entity recognition. In Proceedings of the Eighth International Joint Conference on Natural Language Processing (Volume 1: Long Papers), pages 413–422, 2017.

[20] Ben Gomes. Our latest quality improvements for search. https://blog.google/products/search/our-latest-quality-improvements-search/, 2017.

[21] Nitish Gupta, Sameer Singh, and Dan Roth. Entity linking via joint encoding of types, descriptions, and context. In Proceedings of the 2017 Conference on Empirical Methods in Natural Language Processing, pages 2681–2690, 2017.

[22] Johannes Hoffart, Mohamed Amir Yosef, Ilaria Bordino, Hagen Fürstenau, Manfred Pinkal, Marc Spaniol, Bilyana Taneva, Stefan Thater, and Gerhard Weikum. Robust disambiguation of named entities in text. In Proceedings of the Conference on Empirical Methods in Natural Language Processing, pages 782–792. Association for Computational Linguistics, 2011.

[23] Johannes Hoffart, Stephan Seufert, Dat Ba Nguyen, Martin Theobald, and Gerhard Weikum. KORE: Keyphrase overlap relatedness for entity disambiguation. In CIKM, 2012.

[24] Shengze Hu, Zhen Tan, Weixin Zeng, Bin Ge, and Weidong Xiao. Entity linking via symmetrical attention-based neural network and entity structural features. Symmetry, 2019.

[25] Filip Ilievski, Piek Vossen, and Stefan Schlobach. Systematic study of long tail phenomena in entity linking. In Proceedings of the 27th International Conference on Computational Linguistics, pages 664–674, 2018.

[26] Yuzhe Jin, Emre Kiciman, Kuansan Wang, and Ricky Loynd. Entity linking at the tail: Sparse signals, unknown entities and phrase models. In WSDM '14: Proceedings of the 7th ACM International Conference on Web Search and Data Mining, pages 453–462. ACM, February 2014.

[27] Mandar Joshi, Danqi Chen, Yinhan Liu, Daniel S. Weld, Luke Zettlemoyer, and Omer Levy. SpanBERT: Improving pre-training by representing and predicting spans. arXiv preprint arXiv:1907.10529, 2019.

[28] Diederik P. Kingma and Jimmy Ba. Adam: A method for stochastic optimization. In Yoshua Bengio and Yann LeCun, editors, 3rd International Conference on Learning Representations, ICLR 2015, San Diego, CA, USA, May 7-9, 2015, Conference Track Proceedings, 2015.

[29] Tom Kwiatkowski, Jennimaria Palomaki, Olivia Redfield, Michael Collins, Ankur Parikh, Chris Alberti, Danielle Epstein, Illia Polosukhin, Jacob Devlin, Kenton Lee, et al. Natural questions: a benchmark for question answering research. Transactions of the Association for Computational Linguistics, 7:453–466, 2019.

[30] Phong Le and Ivan Titov. Improving entity linking by modeling latent relations between mentions.
arXiv preprint arXiv:1804.10637, 2018.

[31] Xiao Ling, Sameer Singh, and Daniel S Weld. Design challenges for entity linking. Transactions of the Association for Computational Linguistics, 3:315–328, 2015.

[32] Sidharth Mudgal, Han Li, Theodoros Rekatsinas, AnHai Doan, Youngchoon Park, Ganesh Krishnan, Rohit Deep, Esteban Arcaute, and Vijay Raghavendra. Deep learning for entity matching: A design space exploration. In SIGMOD, 2018.

[33] Isaiah Onando Mulang, Kuldeep Singh, Chaitali Prabhu, Abhishek Nadgeri, Johannes Hoffart, and Jens Lehmann. Evaluating the impact of knowledge graph context on entity disambiguation models. arXiv preprint arXiv:2008.05190, 2020.

[34] Joel Nothman, James R Curran, and Tara Murphy. Transforming Wikipedia into named entity training data. In Proceedings of the Australasian Language Technology Association Workshop 2008, pages 124–132, 2008.

[35] Alberto Parravicini, Rhicheek Patra, Davide B Bartolini, and Marco D Santambrogio. Fast and accurate entity linking via graph embedding. In Proceedings of the 2nd Joint International Workshop on Graph Data Management Experiences & Systems (GRADES) and Network Data Analytics (NDA), pages 1–9, 2019.

[36] Maria Pershina, Yifan He, and Ralph Grishman. Personalized page rank for named entity disambiguation. In NAACL, 2015.

[37] Georgios Petasis, Frantz Vichot, Francis Wolinski, Georgios Paliouras, Vangelis Karkaletsis, and Constantine D Spyropoulos. Using machine learning to maintain rule-based named-entity recognition and classification systems. In Proceedings of the 39th Annual Meeting of the Association for Computational Linguistics, pages 426–433, 2001.

[38] Matthew E. Peters, Mark Neumann, Robert Logan, Roy Schwartz, Vidur Joshi, Sameer Singh, and Noah A. Smith. Knowledge enhanced contextual word representations. In Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing (EMNLP-IJCNLP), pages 43–54, Hong Kong, China, November 2019. Association for Computational Linguistics. doi: 10.18653/v1/D19-1005.

[39] Fabio Petroni, Aleksandra Piktus, Angela Fan, Patrick Lewis, Majid Yazdani, Nicola De Cao, James Thorne, Yacine Jernite, Vassilis Plachouras, Tim Rocktäschel, et al. KILT: A benchmark for knowledge intensive language tasks. arXiv preprint arXiv:2009.02252, 2020.

[40] Minh C. Phan, Aixin Sun, Yi Tay, Jialong Han, and Chenliang Li. Pair-linking for collective entity disambiguation: Two could be better than all. TKDE, 2019.

[41] N Poerner, U Waltinger, and H Schütze. E-BERT: Efficient-yet-effective entity embeddings for BERT. arXiv preprint arXiv:1911.03681, 2019.

[42] Priya Radhakrishnan, Partha Talukdar, and Vasudeva Varma. ELDEN: Improved entity linking using densified knowledge graphs. In Proceedings of the 2018 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long Papers), pages 1844–1853, 2018.

[43] Jonathan Raphael Raiman and Olivier Michel Raiman. DeepType: Multilingual entity linking by neural type system evolution. In Thirty-Second AAAI Conference on Artificial Intelligence, 2018.

[44] Alexander Ratner, Stephen H Bach, Henry Ehrenberg, Jason Fries, Sen Wu, and Christopher Ré. Snorkel: Rapid training data creation with weak supervision. In VLDB, 2017.

[45] Christopher Ré, Feng Niu, Pallavi Gudipati, and Charles Srisuwananukorn. Overton: A data system for monitoring and improving machine-learned products. In CIDR, 2020.
[46] Hamed Shahbazi, Xiaoli Z Fern, Reza Ghaeini, Rasha Obeidat, and Prasad Tadepalli. Entity-aware ELMo: Learning contextual entity representation for entity disambiguation. arXiv preprint arXiv:1908.05762, 2019.

[47] Sandeep Tata, Vlad Panait, Suming Jeremiah Chen, and Mike Colagrosso. ItemSuggest: A data management platform for machine learned ranking services. In CIDR, 2019.

[48] Ashish Vaswani, Noam Shazeer, Niki Parmar, Jakob Uszkoreit, Llion Jones, Aidan N Gomez, Łukasz Kaiser, and Illia Polosukhin. Attention is all you need. In NeurIPS, 2017.

[49] Eric Wallace, Yizhong Wang, Sujian Li, Sameer Singh, and Matt Gardner. Do NLP models know numbers? Probing numeracy in embeddings. arXiv preprint arXiv:1909.07940, 2019.

[50] Vikas Yadav and Steven Bethard. A survey on recent advances in named entity recognition from deep learning models. arXiv preprint arXiv:1910.11470, 2019.

[51] Ikuya Yamada and Hiroyuki Shindo. Pre-training of deep contextualized embeddings of words and entities for named entity disambiguation. arXiv preprint arXiv:1909.00426, 2019.

[52] Mohamed Amir Yosef, Sandro Bauer, Johannes Hoffart, Marc Spaniol, and Gerhard Weikum. HYENA: Hierarchical type classification for entity names. In Proceedings of COLING 2012: Posters, pages 1361–1370, Mumbai, India, December 2012. The COLING 2012 Organizing Committee.

[53] Yuhao Zhang, Victor Zhong, Danqi Chen, Gabor Angeli, and Christopher D. Manning. Position-aware attention and supervised data improve slot filling. In Proceedings of the 2017 Conference on Empirical Methods in Natural Language Processing (EMNLP 2017), pages 35–45, 2017.

[54] Zhengyan Zhang, Xu Han, Zhiyuan Liu, Xin Jiang, Maosong Sun, and Qun Liu. ERNIE: Enhanced language representation with informative entities. arXiv preprint arXiv:1905.07129, 2019.

[55] Ming Zhu, Busra Celikkaya, Parminder Bhatia, and Chandan K. Reddy. LATTE: Latent type modeling for biomedical entity linking. In AAAI, 2020.

[56] Yuke Zhu, Joseph J Lim, and Li Fei-Fei. Knowledge acquisition for visual question answering via iterative querying. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pages 1154–1163, 2017.

A Extended Model Details

We now provide additional details about the model introduced in Section 3. We first describe our type prediction module and then describe the added entity positional encoding.

Type Prediction: To allow the model to further infer the correct types for an entity, especially when the entity does not have a preassigned type, we add a coarse mention type prediction task given the mention embedding. Given a mention m and a coarse type embedding matrix T, the task is to assign a coarse type embedding for the mention m, i.e., to determine t_m. We do so by adding the first and last token of the mention from W to generate a contextualized mention embedding m. We predict the coarse type of the mention, t̂_m, by computing

S_type = softmax(MLP(m)),    t̂_m = S_type T

where S_type generates a distribution over coarse types. For each entity candidate of m, t̂_m is concatenated to the other type embedding t_e before the MLP. This is supervised by minimizing the cross entropy between S_type and the true coarse type of the gold entity, generating a type prediction loss L_type. When performing type prediction, our overall loss is L_dis + L_type.
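As a concrete illustration of the type prediction module, the following is a minimal PyTorch sketch. The tensor shapes, module names, and MLP depth are our own assumptions; the paper specifies only the computation S_type = softmax(MLP(m)) and t̂_m = S_type T:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class MentionTypePredictor(nn.Module):
    """Sketch of the coarse mention type prediction head (hypothetical shapes)."""
    def __init__(self, hidden_dim, num_coarse_types):
        super().__init__()
        self.mlp = nn.Sequential(
            nn.Linear(hidden_dim, hidden_dim), nn.ReLU(),
            nn.Linear(hidden_dim, num_coarse_types),
        )
        # T: one learned embedding per coarse type.
        self.T = nn.Parameter(torch.randn(num_coarse_types, hidden_dim))

    def forward(self, word_emb, first_idx, last_idx, gold_type=None):
        # word_emb: (seq_len, hidden) contextual word embeddings W for one sentence.
        # Contextualized mention embedding: sum of first and last mention tokens.
        m = word_emb[first_idx] + word_emb[last_idx]
        logits = self.mlp(m)                 # (num_coarse_types,)
        s_type = logits.softmax(dim=-1)      # S_type: distribution over coarse types
        t_hat = s_type @ self.T              # t̂_m = S_type T, predicted type embedding
        loss = None
        if gold_type is not None:            # L_type, added to L_dis during training
            loss = F.cross_entropy(logits.unsqueeze(0), gold_type.view(1))
        return t_hat, loss                    # t̂_m is concatenated to each candidate's t_e
```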
Position Encoding: We need Bootleg to be able to reason over the absolute and relative positioning of the words in the sentence and the mentions. For example, in the sentence "Where is America in Indiana?", "America" refers to the city in Indiana, not the United States. In the sentence "Where is Indiana in America?", "America" refers to the United States. The relative position of "Indiana", "in", and "America" signals the correct answer. To achieve this signaling, we add the sinusoidal positional encoding from Vaswani et al. [48] to E before it is passed to our neural model. Specifically, for mention m, we concatenate the positional encodings of the first and last token of m, project the concatenation to dimension H, and add it to each of m's K candidates in E. As we use BERT word embeddings for W, the positional encoding is already added to the words in the sentence.

B Extended Results

We now give the details of our experimental setup and training. We then give extended results over the regularization scheme and model ablations. Lastly, we extend our error analysis to validate Bootleg's ability to reason over the four patterns.

B.1 Evaluation Data

Wikipedia Datasets: We use two main datasets to evaluate Bootleg.

• Wikipedia: we use the November 2019 dump of Wikipedia to train Bootleg. We use the set of entities that are linked to in Wikipedia, for a total of 3.3M entities. After weak labelling, we have a total of 5.7M sentences.

• Wikipedia Subset: we use a subset of Wikipedia for our micro ablation experiments over regularization parameters. We generate this subset by taking all sentences where at least one mention is a mention from the KORE50 disambiguation benchmark. Our set of entities is all entities and entity candidates referred to by mentions in this subset of sentences. We have a total of 370,000 entities and 520,000 sentences.

For our Wikipedia experiments, we use an 80/10/10 train/test/dev split by Wikipedia pages, meaning all sentences from a single Wikipedia page are placed into one of the splits (see the sketch after the benchmark list below). For our benchmark model, we use a 96/2/2 train/test/dev split over sentences to allow our model to learn as much as possible from Wikipedia for our benchmark tasks.

Benchmark Datasets: We use three benchmark NED datasets. Following standard procedure [17], we only consider mentions whose linked entities appear in Wikipedia. The datasets are summarized as follows:

• KORE50: KORE50 [23] represents difficult-to-disambiguate sentences and contains 144 mentions to disambiguate. Note that, as of the November 2019 Wikipedia dump, one of the 144 mentions does not have a Wikipedia page. Although it is standard to remove mentions that do not link to an entity in Wikipedia, to be comparable to other methods we measure with 144 mentions, not 143.

• RSS500: RSS500 [18] is a dataset of news sentences and contains 520 mentions (4 of the mentions did not have entities in E).

• AIDA CoNLL-YAGO: AIDA CoNLL-YAGO [22] is a document-based news dataset containing 4,485 test mentions, 4,791 validation set mentions, and 18,541 training mentions. As Bootleg is a sentence-level NED system, we create sentences from documents following the technique from Févry et al. [16], where we concatenate the title of the document to the beginning of each sentence.
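As an aside on the page-level split above, one simple way to implement such a grouped split is to hash page identifiers so that every sentence from a page lands in the same split. This is our own sketch under assumed (page_id, sentence) records, not a procedure described by the paper:

```python
import hashlib

def split_for_page(page_id, train=0.8, test=0.1):
    """Assign every sentence of a Wikipedia page to the same split by hashing
    the page id, so no page straddles train/test/dev (80/10/10)."""
    # Stable hash mapped into [0, 1); Python's built-in hash() is salted per process.
    h = int(hashlib.md5(str(page_id).encode()).hexdigest(), 16) / 16**32
    if h < train:
        return "train"
    return "test" if h < train + test else "dev"

splits = {"train": [], "test": [], "dev": []}
for page_id, sentence in [(42, "Paris is the capital of France."),
                          (42, "It lies on the Seine."),
                          (7, "Lincoln is a city in Nebraska.")]:
    splits[split_for_page(page_id)].append(sentence)
```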
To improve the quality of annotated mention boundaries in the benchmarks, we follow the technique of Phan et al. [40] and allow for mention boundary expansion using a standard off-the-shelf NER tagger.11 For candidate generation, as aliases may not exist in Γ, we gather possible candidates by looking at n-grams in descending order of length and determine the top 30 by measuring the similarity of the proper nouns in the example sentence to each candidate's Wikipedia page text.

11 We use the spaCy NER tagger from https://spacy.io/.

Structural Resources: The last source of input data to Bootleg is the structural resources of types and knowledge graph relations. We extract relations from Wikidata knowledge graph triples. For our pairwise KG adjacency matrix used in KG2Ent, we require the subject and object to be in E. For our relation embeddings, we only require the subject to be in E, as our goal is to extract all relations an entity participates in, independent of the other entities in the sentence. We have a total of 1,197 relations. We use two different type sources to assign types to entities (Wikidata types and HYENA types [52]) and use coarse HYENA types for type prediction. The Wikidata types are generated from Wikidata's "instance of", "subclass of", and "occupation" relationships. The "occupation" types are used to improve disambiguation of people, who otherwise only receive "human" types in Wikidata. We filter the set of Wikidata types to those occurring 100 or more times in Wikipedia, leaving 27K Wikidata types in total. The HYENA type hierarchy has 505 types derived from the YAGO type hierarchy. We also use the coarsest HYENA type for an entity as the gold type for type prediction. The coarse HYENA types are person, location, organization, artifact, event, and miscellaneous.

B.2 Training Details

Model Parameters: We run three separate models of Bootleg: two on our full Wikipedia data (one for the ablation study and one for the benchmarks) and one on our micro data. For all models, we use 30 candidates for each mention and incorporate the structural resources discussed above. We set T = 3 and R = 50 for the number of types and relations assigned to each entity. For the models trained on our full Wikipedia data, we set the hidden dimension to 512, the dimension of u to 256, and the dimension of all other type and relation embeddings to 128. For our model trained on the micro dataset, we set the hidden dimension to 256, the dimension of u to 256, and the dimension of all other type and relation embeddings to 128.

The final differences to discuss are between the benchmark model and the ablation model over all of Wikipedia. To make the best performing model for benchmarks, we add two additional features we found improved performance:

• We use an additional KG2Ent module in addition to the adjacency matrix indicating whether two entities are connected in Wikidata. We add a matrix containing the log of the number of times two entities occur in a sentence together in Wikipedia; if they co-occur fewer than 10 times, the weight is 0. We found this helped the model better learn entity co-occurrences from Wikipedia.

• We allow our model to use additional entity-based features concatenated into our final E matrix. We add two features. The first is the average BERT WordPiece embedding of the title of an entity. This is
similar to improving tail generalization by embedding a word definition in word sense disambiguation [5]. This allows the model to better capture textual cues indicating the correct entity. We also append a 1-dimensional feature counting how many other entities in the sentence appear on the entity's Wikipedia page. This increases the likelihood of an entity that has more connections to other candidates in the sentence. We empirically find that using the page co-occurrences as an entity feature rather than as a KG2Ent module performs similarly and reduces the runtime. Further, our benchmark model uses a fixed regularization scheme of 80%, which did not hurt benchmark performance, and training was marginally faster than with the inverse popularity scheme. We did not use these features for the ablations, as we wanted a clean study of the model components as described in Section 3 with respect to the reasoning patterns.

Training: We initialize all entity embeddings to the same vector to reduce the impact of noise from unseen entities receiving different random embeddings. We use the Adam optimizer [28] with a learning rate of 0.0001, a dropout of 0.1 in all feedforward layers, and 16 heads in our attention modules, and we freeze the BERT encoder stack. Note that for the NED-Base model we do not freeze the encoder stack, to be consistent with Févry et al. [16]. For the models trained on all of Wikipedia, we use a batch size of 512 and train for 1 epoch for the benchmark model and 2 epochs for the ablation models on 8 NVIDIA V100 GPUs. For our micro data model, we use a batch size of 96 and train for 8 epochs on an NVIDIA P100 GPU.

B.3 Extended Ablation Results

Ablation Model Size: Table 10 reports the model sizes of each of the five ablation models from Table 2. As we finetuned the BERT language model in NED-Base (to be consistent with Févry et al. [16]) but do not do so in Bootleg, we do not count the BERT parameters in our reported sizes, to be comparable.

Table 10: We report the model sizes in MB of each of the five ablation models: NED-Base, Bootleg, Bootleg (Ent-Only), Bootleg (Type-Only), and Bootleg (KG-Only).

Model                 NED-Base   Bootleg   Ent-Only   Type-Only   KG-Only
Embedding Size (MB)   5,186      5,201     5,186      13          1
Network Size (MB)     4          39        35         38          34
Total Size (MB)       5,190      5,240     5,221      51          35

Regularization: We now present the extended results of our regularization and weak labelling ablations over our representative micro dataset. Table 9 gives full ablations over a variety of regularization techniques. As in Table 2, we include results from models using only the entity, type, or relation information, in addition to the BERT and Bootleg models.

Table 9: (top) We compare Bootleg to a BERT-based NED baseline (NED-Base) on validation sets of our micro Wikipedia dataset and ablate Bootleg by only training with entity, type, or knowledge graph data. We further ablate (bottom 8 rows) the regularization schemes for the entity embeddings of Bootleg.

Model                  All Entities   Torso Entities   Tail Entities   Unseen Entities
NED-Base               90.2           91.6             50.5            21.5
Bootleg (Ent-only)     89.1           89.0             48.3            15.5
Bootleg (Type-only)    91.6           90.4             65.9            56.8
Bootleg (KG-only)      91.8           90.8             65.3            58.6
Bootleg (p(e) = 0%)    92.5           92.3             67.7            48.6
Bootleg (p(e) = 20%)   92.8           92.5             68.9            52.5
Bootleg (p(e) = 50%)   92.9           92.7             70.1            57.7
Bootleg (p(e) = 80%)   92.8           92.2             69.5            59.9
Bootleg (InvPopLog)    92.7           91.9             69.7            61.1
Bootleg (InvPopPow)    92.8           92.3             70.5            62.2
Bootleg (InvPopLin)    92.6           91.8             69.7            61.0
Bootleg (PopPow)       92.9           92.5             68.9            52.4
# Mentions             96,237         37,077           11,087          2,810
Table 11: We report Bootleg trained with versus without weak labelling on our micro Wikipedia dataset. The slices are defined by gold anchor counts (pre-weak labelling). We use the InvPopPow regularization for both.

Model             All Entities   Torso Entities   Tail Entities   Unseen Entities
Bootleg           92.8           92.6             70.5            63.3
Bootleg (No WL)   93.3           93.1             70.2            60.7
# Mentions        96,237         36,904           11,541          3,146

We report the results of inverse popularity regularization based on three different functions that map the curve of entity counts in training to the regularization value. For each function, we fix that entities with a frequency of 1 receive a regularization value of 0.95, while entities with a frequency of 10,000 receive a value of 0.05, and assign intermediate values to generate a linear, logarithmic, and power curve that applies less regularization to more popular entities (a sketch of these mappings follows this discussion). The regularization reported in Table 6 uses a power law function, f(x) = 0.95x^(−0.32). We also report in Table 9 a linear function, f(x) = −0.00009x + 0.9501, and a logarithmic function, f(x) = −0.097 log(x) + 0.96. Each regularization function is bounded to the range 0.05 to 0.95. We leave exploring other regularization functions as future work.

Table 9 shows trends similar to those reported in Section 4: Bootleg with all sources of information and the power law inverse regularization performs best over the tail and the unseen entities. We do see that the model trained with a fixed regularization of 0.5 performs marginally better on the torso and over all entities, likely because this involves less regularization of those entity embeddings, allowing it to better leverage the memorized entity patterns while also leveraging some type and relational information (as shown by its improved performance over a lower fixed regularization). This model, however, performs 4.5 F1 points worse over unseen entities than the best model.

Weak Labeling: Table 11 shows Bootleg's results with the inverse power law regularization with and without weak labelling. For this ablation, we define our sets of torso, tail, and unseen entities by counting entity occurrences before weak labelling, to better understand the lift from adding weak labelling (rather than the drop without it). We see that weak labelling provides a lift of 2.6 F1 points over unseen entities and 0.3 F1 points over tail entities. Surprisingly, without weak labelling, Bootleg performs 0.5 F1 points better on torso entities. We hypothesize this occurs because the noisy weak labels increase the amount of available signal from which Bootleg can learn consistency patterns for rarer entities (noisy signals are better than no signals); popular entities, however, have enough less-noisy gold labels in the training data, so the noise from weak labels may create conflicting signals that hurt performance. To validate this hypothesis, we see that overall, counting both true and weakly labelled mentions, 4% of mentions without weak labelling share the same types as at least one other mention in the sentence, while 14% of mentions with weak labelling do. Our model predicts a consistent answer only 4% of the time without weak labelling, compared to 13% of the time with weak labelling. Note these are slightly higher coverage numbers than reported in Section 5, as we are using a weaker form of consistency (two mentions in the sentence share the same type, independent of position and ordering) and are including weakly labelled mentions. This indicates that consistency is a significantly more useful pattern with weak labelling, and our model predicts more consistent answers with weak labelling than without.
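Returning to the three inverse-popularity mappings above, here is a small sketch of how they could be implemented; the clipping to [0.05, 0.95] and the use of the natural logarithm are assumptions inferred from the stated endpoint values, not details given in the paper:

```python
import numpy as np

def reg_pow(count):
    # Power-law inverse popularity: f(1) = 0.95, f(10,000) ≈ 0.05.
    return np.clip(0.95 * count ** -0.32, 0.05, 0.95)

def reg_lin(count):
    # Linear inverse popularity: f(1) ≈ 0.95, f(10,000) ≈ 0.05.
    return np.clip(-0.00009 * count + 0.9501, 0.05, 0.95)

def reg_log(count):
    # Logarithmic inverse popularity (natural log assumed).
    return np.clip(-0.097 * np.log(count) + 0.96, 0.05, 0.95)

# More popular entities receive smaller regularization values:
for c in (1, 100, 10_000):
    print(c, reg_pow(c), reg_lin(c), reg_log(c))
```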
Over the torso with weak labelling, we find that 14% of the errors across all mentions (weakly labelled and anchor) occur when Bootleg uses consistency reasoning but the correct answer does not follow the consistency pattern. Without weak labelling, only 5% of the errors are due to consistency.

C Extended Downstream Details

We now provide additional details of our SotA TACRED model, which uses Bootleg embeddings.

Input: We use the revisited TACRED dataset [2]: each example includes text and subject and object positions in the text. The task involves extracting the relation between the subject and object. There are 41 potential relations as well as a "no relation" option. The other features we use as inputs are NER and POS tags, and contextual Bootleg embeddings for entities that Bootleg disambiguates in the sentence.

Bootleg Model: As TACRED does not come with existing mention boundaries, we perform mention extraction by searching over n-grams, from longest to shortest, in the sentence and extracting those that are known mentions in Bootleg's candidate maps. We use the same Bootleg model from our ablations with entity, KG, and type information, except with the addition of fine-tuned BERT word embeddings. For efficiency, we train on a subset of Wikipedia training data relevant to TACRED. To obtain the relevant subset, we take Wikipedia sentences containing entities extracted during candidate generation from a uniform sample of TACRED data; i.e., entities in the candidate lists of detected mentions from a uniform TACRED sample. The contextualized entity embeddings from Bootleg over all of TACRED are fed to the downstream model.

Downstream Model: We first use standard SpanBERT-Large embeddings to encode the input text, concatenate the contextual Bootleg embeddings with the SpanBERT embeddings, and then pass this through four transformer layers. We then calculate the cross-entropy loss and apply a softmax for scoring. We freeze the Bootleg embeddings and fine-tune the SpanBERT embeddings. We use the following hyperparameters: learning rate 0.00002, batch size 8, gradient accumulation 6, number of epochs 10, L2 regularization 0.008, warm-up percentage 0.1, and maximum sequence length 128. We train with an NVIDIA Tesla V100 GPU. A sketch of this architecture follows the extended results below.

Extended Results: We study the model performance as a function of the signals provided by Bootleg. In Table 12, we show that on slices with above-median numbers of Bootleg entity, relation, and type signal counts identified in the TACRED example, the relative gap between BERT and Bootleg errors is larger on the slice above the median than below it, by 1.10x, 4.67x, and 1.35x respectively. In Table 13, we show the relative error rates of the baseline SpanBERT model versus the Bootleg model for the slices where Bootleg provides an entity, relation, or type signal for the TACRED example's subject or object. On the slices where these signals are present, the baseline model performs 1.20x, 1.18x, and 1.20x worse, respectively, than the Bootleg TACRED model. These results indicate that the knowledge representations from Bootleg successfully transfer useful information to the downstream model.
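To make the downstream architecture concrete, here is a minimal PyTorch sketch of the concatenation scheme described in the Downstream Model paragraph. The dimensions, pooling strategy, and module names are our own assumptions rather than the released implementation:

```python
import torch
import torch.nn as nn

class BootlegRelationExtractor(nn.Module):
    """Sketch: SpanBERT token embeddings concatenated with frozen Bootleg
    entity embeddings, passed through transformer layers, then classified."""
    def __init__(self, spanbert_dim=1024, bootleg_dim=512, num_relations=42):
        super().__init__()
        d = spanbert_dim + bootleg_dim
        layer = nn.TransformerEncoderLayer(d_model=d, nhead=8, batch_first=True)
        self.encoder = nn.TransformerEncoder(layer, num_layers=4)
        self.classifier = nn.Linear(d, num_relations)  # 41 relations + "no relation"

    def forward(self, spanbert_emb, bootleg_emb):
        # spanbert_emb: (batch, seq, spanbert_dim); bootleg_emb: (batch, seq, bootleg_dim),
        # with zero vectors where Bootleg found no entity (an assumption on our part).
        x = torch.cat([spanbert_emb, bootleg_emb.detach()], dim=-1)  # freeze Bootleg side
        x = self.encoder(x)
        # Mean-pool over tokens before classifying (pooling choice is assumed);
        # training would use cross entropy over these logits.
        return self.classifier(x.mean(dim=1))
```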
Table 12: We rank TACRED examples by the proportion of words that receive Bootleg embedding features where: Bootleg disambiguates an entity, leverages Wikidata relations for the embedding, or leverages Wikidata types for the embedding. We take examples where the proportion is greater than 0. For each of these three slices, we report the gap between the SpanBERT model's and the Bootleg model's error rates on the examples with above-median proportion (more Bootleg signal) relative to the below-median proportion (less Bootleg signal). With more Bootleg information, the improvement our SotA model provides over SpanBERT increases.

Bootleg Signal   # Examples with the Signal   Gap Above/Below Median
Entity           15,323                       1.10
Relation         5,400                        4.67
Type             15,509                       1.35

Table 13: We compute the error rate of SpanBERT relative to our Bootleg downstream model for three slices of TACRED data where, respectively, Bootleg disambiguates the subject and/or object, Bootleg leverages Wikidata relations for the embedding of the subject and object pair, and Bootleg leverages Wikidata types for the embedding of the subject and/or object in the example.

Subject-Object Signal   # Examples   BERT/Bootleg Error Rate
Entity                  12,621       1.20
Relation                542          1.18
Obj Type                12,044       1.20

D Extended Error Analysis

D.1 Reasoning Patterns

Here we provide additional details about the distributions of types and relations in the data.

Distinct Tails: Like entities, types and relations also have tail distributions. For example, types such as "country" and "film" appear 2.7M and 800K times respectively, while types such as "quasiregular polyhedron" and "hospital-acquired infection" appear once each in our Wikipedia training data. Meanwhile, relations such as "occupation" and "educated at" appear 35M and 16M times respectively, while relations such as "positive diagnostic predictor" and "author of afterword" each appear 7 times in the Wikipedia training data. However, we find that the entity, relation, and type tails are distinct: 88% of the tail entities by entity count have Wikidata types that are non-tail types, and 90% of the tail entities by entity count have non-tail relations.12 For example, the head Wikidata type "country" contains the rare entities "Palaú" and "Belgium–France border". We observe that Bootleg significantly improves tail performance over each of these tails.

12 Similar to tail entities, tail types and tail relations are defined as those appearing 1-10 times in the training data.

We rank the Wikidata types and relations by the number of occurrences in the training data and study the lift from Bootleg as a function of the number of times the signal appears during training. Bootleg performs 9.4 and 20.3 F1 points better than the NED-Base baseline for examples with gold types appearing more and less, respectively, than the median number of times during training. Bootleg performs 7.8 and 13.7 F1 points better than the baseline for examples with gold relations appearing more and less, respectively, than the median number of times during training. These results indicate that Bootleg excels on the tails of types and relations as well. Next, ranking the Wikidata types and relations by the proportion of rare (tail and unseen) entities they comprise, we further find that Bootleg has the lowest error rates across types and relations regardless of the proportion of rare entities, while the baseline and Entity-Only models give relatively larger error rates as the proportion of rare entities increases (Figure 4). The trend for types is flat as the proportion of rare entities increases, while the trend for relations slopes upwards. These results indicate that Bootleg is better able to transfer the patterns learned from one entity to other entities that share its types and relations.
The improvement from Bootleg over the baseline increases as the rare proportion increases, indicating that Bootleg is able to efficiently transfer knowledge even when a type or relation category contains few or no popular entities.

Figure 4: For all the entities of a particular type or relation, we calculate the percentage of rare entities (tail and toes entities). We show the error rate on the Wikipedia validation set as a function of the rare proportion of entities of a given (left) relation or (right) type appearing in the validation set. [Figure: error-rate curves for the Bootleg, BERT, KG-Only, Type-Only, and Entity-Only models.]

Type Affordance: For the type affordance pattern, we find that the TF-IDF keywords provide high coverage over the examples containing the gold type: 88% of examples where the gold entity has a particular type contain an affordance keyword for that type. An example of a type with full coverage by the affordance keywords is "café", with keywords such as "coffee", "Starbucks", and "Internet"; in each of the 77 times an entity of the type "café" appears in the validation set, an affordance keyword is present. Types with low coverage by affordance keywords in the validation set tend to be the rare types: for the types with coverage less than 50%, such as "dietician" or "chess official", the median number of occurrences in the validation set is 1. This supports the need for knowledge signals with distinct tails, which can be assembled together to address the rare examples.

plumb-humanities-2021 ---- Chapter 3 Humanities and Social Science Reading through Machine Learning

Marisa Plumb, San Jose State University

Introduction

The purposes of computational literary studies have evolved and diversified a great deal over the last half century. Within this dynamic and often contentious space, a set of fundamental questions deserves our collective attention: does the computation and digitization of language recast the ways we read, value, and receive words? In what ways can research and scholarship on literature become a more meaningful part of the future development of computer systems? As the theory and practice of computational literary studies evolve, their potential to play a direct role in revising historical narratives and framing new research questions carries cross-disciplinary implications.
It's worthwhile to anchor these questions in the origin stories that today's digital humanists tell, from the work of Josephine Miles at Berkeley in the 1930s (Buurma and Heffernan 2018) to Roberto Busa's work in the 1940s to work that links Structuralism and Russian Formalism at the turn of the 19th century (Algee-Hewitt 2015) to today's systemized explorations of texts. The sciences and humanities have a shared history in their desire to solve the patterns and systems that make language functional and impactful, and there have long been linguistic and computational tools that help advance this work. What is more challenging to unravel and articulate from these origin stories are the mathematical concepts behind the tools that humanists wield. Ideally, one would navigate this historical landscape when assessing the fitness of any given computational technique for addressing a specific humanities research question, but often researchers choose tools because they are powerful and popular, without a robust understanding of the conceptual assumptions they embody, which are defined by the mathematical and statistical principles they are based on. This can make it difficult to generate reproducible results that contribute to a tool's methodological development.

This is related to a set of issues that drive debates among computationally-minded scholars, which regularly appear in digital humanities forums. In 2019, for instance, Nan Da issued a harsh critique of humanists' implementation of statistical methods in their research.1 Her claim is that computational methods are not a good match for literary research, and she systematically shows how the results of several computational humanities studies are not only difficult to reproduce but can easily be skewed by minor changes to how an algorithm is implemented. Although this debate about digital methods points to a necessary evolution in the field (in which researchers become more accountable to the computational laws that they are utilizing), her essay's broader mission is to question the appropriateness of using computational tools to investigate literary objects and ideas.

Refutations of this claim were swift and abundant (Critical Inquiry 2019), and they highlight a number of concepts central to my concern here with future intersections of machine learning and literary research. Respondents such as Mark Algee-Hewitt pointed out that literary scholars employ computational statistical models in order to reveal something about texts that human readers could not. In doing so, literary scholars are at liberty to note where computation reaches its useful limit2 and take up more traditional forms of literary analysis (Algee-Hewitt 2019). Katherine Bode explores the promise and pitfalls of this hybrid "close and distant reading" approach in her 2020 article on the intersection of topic modeling and bias. Imperfect as the hybrid method is, stressing the value of familiar interpretive methods remains important, politically and practically, when bringing computation into humanities departments.

This essay extends the argument that computational tools do more than turn big data into novel close reading opportunities. Machine learning, and word embedding algorithms in particular, may have a unique ability to shift this conversation into new territory, where scholars begin to ask how historical research can contribute more sophisticated approaches to treating words as data.
With historically-minded approaches to dataset creation for machine learning, issues emerge that engender new theoretical frameworks for evaluating the ability of statistical models of information to reveal cultural and artistic dimensions of language. I will first contextualize what word embeddings do, and then show a few of the mathematical concepts that have driven their development.

Of the many available machine learning algorithms, word embedding algorithms have shown particular promise in capturing contextual meanings (of words or other units of textual data) more accurately than previous techniques in natural language processing. Word embeddings encompass a set of language modeling techniques where words or phrases from a large set of texts (i.e., a "corpus") are analyzed through the use of a neural network architecture. For each vocabulary term in the corpus, the neural network algorithm uses the term's proximity to other words to assign it values that become a vector of real numbers; one high-dimensional vector is generated for each word. (The term "embedding" refers to the mathematics that turns a space with many dimensions per word into a continuous vector space with a much lower dimension.)3 Word embeddings raise three critical issues for this essay: How do word embeddings reflect the contexts of words in order to capture their relative meanings? If word embeddings approximate word meanings, do they also reflect culture? And how can literary history and cultural studies inform how scholars use them?

Word embeddings are powerful because they calculate semantic similarities between words based on their distributional properties in large samples of language data. As computational linguist Jussi Karlgren puts it:

Language is a general-purpose representation of human knowledge, and models to process it vary in the degree they are bound to some task or some specific usage. Currently, the trend is to learn regularities and representations with as little explicit knowledge-based linguistic processing as possible, and recent advances in such general models for end-to-end learning to address linguistics tasks have been quite successful. Most of those approaches make little use of information beyond the occurrence or co-occurrence of words in the linguistic signal and take the single word to be the atomary unit.

This is notable because it highlights the power of word embeddings to assign values to words in order to represent their relative meanings, simply based on unstructured language data, without a system of linguistic rules or a labelling system. It also highlights the fact that a word embedding model's success is based on the parameters of the task it is designed to address. So while the accuracy and power of word vector algorithms might be most recognizable in general-purpose applications that improve with larger training corpora (for instance, Google News and Wikipedia), they can be equally powerful representation learning systems for specific historical research tasks that use different benchmarks for success.

1 Da's critique of statistical model usage in computational humanities work sparked a forum of responses in Critical Inquiry.
2 This limit typically exists for a combination of three reasons: computer programs can only generate models based on the data we give them, a tool isn't fully understood and so not robustly explored, and many algorithms and tools are being used in experimental ways.
3 See Koehrsen 2018 for a fuller explanation of the process.
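To ground the description above, the following is a minimal sketch of this pipeline using the gensim library's word2vec implementation. The toy corpus and query term are placeholders of our own; a real historical corpus would need far more text for stable vectors:

```python
from gensim.models import Word2Vec

# Each document is pre-tokenized into a list of lowercase words; a real corpus
# would be thousands of texts, e.g. a digitized collection of literary annuals.
corpus = [
    ["the", "poet", "praised", "her", "wit", "and", "grace"],
    ["his", "wit", "was", "sharp", "in", "argument"],
    ["grace", "and", "beauty", "filled", "the", "verse"],
]

# Train a small embedding space: 100-dimensional vectors, 5-word context window.
model = Word2Vec(corpus, vector_size=100, window=5, min_count=1, seed=1)

# Ask for the terms whose vectors lie closest to "wit" (cosine similarity).
print(model.wv.most_similar("wit", topn=5))
```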
Humanists using these machine learning methods are learning to think differently about corpus size, corpus content, and the utility of a successfully-trained model for analysis and interpretation. No matter the application, the success of machine learning is predicated on creating good datasets. As a recent paper in IEEE Transactions on Knowledge and Data Engineering notes, "the majority of the time for running machine learning end-to-end is spent on preparing the data, which includes collecting, cleaning, analyzing, visualizing, and feature engineering" (Roh et al. 2019, 1). Acknowledging this helps contextualize machine learning algorithms for text analysis tasks in the humanities, but it also highlights data curation challenges that can be taken up in new ways by humanists. This naturally raises questions about how machine learning algorithms like word embeddings are implemented for text analysis, and how they should be modified for historical research: they require different computational priorities and frameworks.

In parallel to the corpus considerations that computational humanities scholars ponder, there is an abundance of work across disciplines such as cognitive science and psychology (Griffiths et al. 2007) that attempts to refine the problems and limits of using large collections of text for training embeddings. These large collections tend to reflect the biases that exist in society and history, and in turn, systems based on these datasets can make troubling inferences, now well documented as algorithmic bias.4 Computer science researchers need to evaluate the social dimensions of their applications in diverse societies and find ways to fairly represent all populations.

4 As investigated, for instance, in Noble 2018.

Digital humanities practices can implicitly help address these issues. Literary studies, as it evolves towards multivocality and canon expansion, makes explicit a link between methods of literary analysis and digital practices that are deliberately inclusive, less biased, and diachronic (rather than ahistorical). Emerging literary scholarship uses computational methods to question hegemonic practices in the history of the field, through the now-familiar practice of data curation (Poole 2013). But this work can also help combat algorithmic bias more broadly, and expand beyond corpus development into algorithmic design. As digital literary scholarship continues to deepen its exchanges with Sociology, History, and Information Science, stronger methodologies for using fair and representative data will become pervasive throughout these disciplines, as well as in commercial applications. Interdisciplinary methodologies are foundational to future computational literary research that can make sophisticated contributions to text analysis.

The Bengal Annual: A Case Study

Complex relationships between words cannot be fully assessed with one flat application of a powerful tool to a set of texts. But this does not mean that the usefulness of machine learning for literature is limited: rather, scholars can wield it to control how machines learn sets of relationships between concepts. Choosing which texts to include in a corpus is coupled to decisions about whether and how to label its contents, and how to tune the parameters of the algorithms.
For the purposes of literary analysis, these should be embraced as interpretive, biased acts (ones that deepen understanding of commonly-employed computational methods) and folded into emerging methodologies. Because humanities scholars are not generating models to serve applications with thousands of end-users who primarily expect accuracy, they can exploit the fallacies of machine learning in order to improve how dataset management and feature engineering are conducted. Working with big data in order to generate models isn't valuable because it reveals history's "true" cultural patterns, but because it demonstrates how machines already circulate those "truths." A scholar's deep knowledge of the historical content and formalities of language can determine how corpora are compared, how we experiment with known biases, and how we move towards a future landscape of literary analysis that is inclusive of marginalized texts and the latest cultural theory.

Roopika Risam, for instance, advocates for both a theoretical and a practice-based decolonization of the digital humanities, noting ways that postcolonial digital archives can intervene in knowledge production in society (2018, 79). Corpora created from periods of revolution, then, might reveal especially useful vector relationships and lead to better understanding of semantic changes during those times. Those word embeddings might be useful for teaching computers racialized language over timelines, so that machine learning applications do not only "read" history as a flat set of relationships and inevitably reflect the worst of its biases.

To begin to unpack this process, I will present a case study on the 1830 Bengal Annual and a corpus of similarly-situated texts. Our team, made up of students in Katherine D. Harris's graduate seminar on decolonizing Romantic Literature at San Jose State University, asked: can we operationalize questions that arise from close readings of texts to turn problematic quantitative evaluations of words into more complex methods of interpretation? A computer cannot interpret complex cultural concepts, but it can be instructed to weigh time period, narrative perspective, and publication venue, much as a literary scholar would.

With the explosion of print culture in England in the first half of the nineteenth century, publishers began introducing new forms of serialized print materials, which included serialized publications known as literary annuals (Harris 2015). These multi-author texts were commonly produced as high-quality volumes that could be purchased as gifts in the months leading up to the holiday season. As a genre, the annual included poetry, prose, and engravings, among other varieties of content, very often from well-known authors. Literary annuals represent a significant shift in the economics surrounding the production of print materials for mass consumption: for instance, contributors were typically paid. And annuals, though a luxury item, were more affordable than books sold before the mechanization of the printing press (Harris 2015, 1-29). Literary annuals and other periodicals are interesting sites of literary study because they can be read as reinforcing or resisting the British Empire. London-based periodicals were eventually distributed to all of Britain's colonial holdings, including India (Harris 2019).
As The Bengal Annual was written in India and contains a small representation of Indian authors, our project investigates it as a variation on British-centric reading materials of the time, which perhaps offered a provisional voice to a wider community of writers (though not without claims of superiority over the colonized territory it exploits). Some of the contents invoke themes that are affiliated with major Romantic writers such as William Wordsworth and Samuel T. Coleridge, but editor D.L. Richardson included short stories and fiction, which were not held in the same regard as poetry. He also employed local native Indian engravers and writers. To explore the thesis that the concepts and genres typically associated with British Romantic Literature are represented differently in a text that was written and produced in a different space with a set of contributors who were not exclusively British natives, we experimented with word embeddings on semantic similarity tasks, comparing the annual to texts like Lyrical Ballads. Such a task is within the scope of traditional literary analysis, but my agenda was to probe the idea that we need large-scale representations of marginalized voices in order to show real differences from the ideas of the dominant race, class, and gender.[5]

The project team first used statistical tools to find out if the Annual’s poetry, non-fiction, and fiction contained interesting relationships between vocabularies about body parts, social class, and gender. We gathered information about terms that might reveal how different parts of the body were referenced depending on sex. These differences were validated by traditional close-reading knowledge about British Romantic Literature and its historical contexts,[6] and signaled the need to read and analyze the Annual’s passages about body parts, especially ones by writers of different genders and social backgrounds. These simple methods allowed us to take a streamlined approach to confirming that an author’s perspective indeed altered their word choices and other aspects of their references to male vs. female bodies.

Collecting and mapping those references, however, was not enough to build a larger argument about how discourse on bodies might be different in non-canonical British Romantic Literature. Based on the potential for word embeddings to model semantic spaces for different corpora and compare the distribution of terms, the next step was to build a corpus of non-canonical texts of similar scope to a corpus of canonical works, so that models for each could be legitimately compared. This work, currently in progress, faces challenges that are becoming more familiar to digital historians: the digitization of rare texts, the review of digitization processes for accuracy, and the cleaning of data. The primary challenge is to find the correct works to include: this requires historical expertise, but also raises the question of how to uncover unknown authors.

[5] Such textual repositories are important outside of literature departments, too. We need data to represent all voices in training machines to represent any social arena.
[6] Some of these findings are illustrated in the project’s Scalar site: http://scalar.usc.edu/works/the-bengal-annual/bodies-in-the-annual.
Manu Chander’s Brown Romantics calls for a global assessment of Romantic Literature’s impact by “calling attention to its genuinely unacknowledged legislators” (Chander 2017, 11). But he contends that even the authors he was able to study were already individuals who aspired to assimilate with British culture and ideologies in some ways, and perhaps don’t represent political resistance or views entirely antithetical to the British Empire.

Guided by Chander’s questions about how to locate dissent in contexts of colonization, we documented instances in the text that highlight the dynamics of colonialism, race, and nationalism, and compared them to a set of statistical explorations of the text’s vocabulary (particularly terms related to national identity, gender, and bodies). Chander’s call for a more globally comprehensive study of Romanticism speaks to the politics of corpora curation discussed above, but also suggests that corpus comparison can benefit from formal methodological guidelines. Puzzling out how to best combine traditional close readings with quantitative inquiries, and then map that work to a machine-learning research framework, revealed several shortcomings in methodological standardization. It also revealed several opportunities for rethinking the way algorithms could be implemented, by adopting and systematizing familiar comparative research practices. Ideas about such methodologies are emerging in many disciplines, which I highlight later in this essay.

Disciplinary directions for word vector research

The potential of word embedding techniques for projects such as our Bengal Annual analysis can be seen in the new computational research directions that have emerged in humanities research.[7] Vector-space representations are based on high-dimensional vectors[8] of real numbers.[9] Those vectors’ values are assigned using a word’s relationship to the words near it in a text, based on the likelihood that a word will appear in proximity to other words it is told to “look” at. For example, the visualization in figure 3.1 demonstrates an embedding space for a historical corpus (1640–1699) using the values assigned to word vectors.

In a visualized space (with reduced dimensions) such as the one in figure 3.1, distances among vectors can be assessed, for example, to articulate the forty words most similar to wit. This particular model (trained using the word2vec algorithm), published in the 2019 Debates in the Digital Humanities,[10] allowed the authors to visualize the term wit with synonyms on the left side, and terms related to argumentation on the right, such as indeed, argues, and consequently. This initial exploration prompted Gavin and his co-authors to look at a vector space model for a single author (John Dryden), in order to both validate the model against their subject matter expertise and explore the model’s results. Although word vectors are often employed for machine translation tasks[11] or to project analogistic relationships between concepts,[12] they can also be used to question concepts that are traditionally associated with particular literary periods and evaluate those associations with new kinds of evidence.

[7] See Kirschenbaum 2007 and Argamon and Olsen 2009.
[8] A word vector may have hundreds or even thousands of dimensions.
[9] Word embedding algorithms are modelled on the linguistic concept that context is a primary way that word meanings are produced. Their usefulness is dependent on the breadth and domain-relevance of the corpus they are trained on, meaning that a corpus of medical research vs. a corpus of 1980s television guides vs. a corpus of family law proceedings would generate models that show different relationships between words like “family,” “health,” “heart,” etc.
[10] See Goldstone 2019.
[11] Software used to translate text or speech from one language to a target language. Machine translation is a subfield of computational linguistics that can now allow for domain-based (i.e., specialized subject matter) customizations of translations, making translated word choices more context-specific.
[12] Although word embeddings aren’t explicitly trained to learn analogies, the vectors exhibit seemingly linear behavior (such as “woman is to queen as man is to king”), which approximately describes a parallelogram. This phenomenon is explored in Allen and Hospedales 2019.

Figure 3.1: A visualized space with reduced dimensions of a neighborhood around wit (Gavin et al. 2019, Figure 21.2).

What this type of study suggests is that we can look at cultural concepts like wit in new ways. These results can also facilitate a comparison of historical models of wit to contemporary ones—to show how its meaning may have shifted, using its changing relationship to other words as evidence. This is a growing area of research in the social sciences, computational linguistics, and other disciplines (Kutuzov et al. 2018). In a survey paper on current work in diachronic word embeddings and semantic shifts, Kutuzov et al. note that the surge of interest points to its importance for natural language processing, but that it currently lacks “cohesion, common terminology and shared practices.”

Some of this cohesion might be generated by putting the usefulness of word vectors in the context of the history of information retrieval and the history of distributed representation. Word embeddings emerged in the 1960s, with data modeled as a matrix, and a user’s query of a database represented as a vector. Simple vector operations could be used to locate relevant data or documents. Gerard Salton is generally credited as one of the first to do this, based on the idea that he could represent a document as a vector of keywords and use measures like cosine similarity and dimensionality reduction to compare documents.[13] Since the 1990s, vector space models have been used in distributional semantics. In a paper on the history of vector space models, which examines the trajectory of Gerard Salton’s work, David Dubin notes that these mathematical models can be defined as “a consistent mathematical structure designed to correspond to some physical, biological, social, psychological, or conceptual entity” (2004). In the case of word vectors, word context and colocations give us quantifiable information about a word’s meaning.

[13] Algorithms like word2vec take as input the linguistic context of words in a given corpus of text, and output an N-dimensional space of those words—each word is represented as a vector of dimension N in that Euclidean space. Word vectors with thousands of values are transformed to lower-dimensional spaces in which the directionality of two vectors can be measured using cosine similarity—words that exist in similar contexts would be expected to have a similar cosine measurement and map to like clusters in the distributed space.
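To make these mechanics concrete, the sketch below trains a word2vec model and queries a term’s neighborhood, echoing the forty-word neighborhood around wit in figure 3.1. It is a minimal illustration rather than a reconstruction of the Gavin et al. pipeline: the gensim library, the corpus file, and the preprocessing choices are all assumptions introduced here.

```python
# A minimal word2vec sketch (gensim 4.x assumed); the corpus file and
# preprocessing are hypothetical stand-ins for a curated historical corpus.
from gensim.models import Word2Vec
from gensim.utils import simple_preprocess

# Assume one document (or sentence) per line in a plain-text file.
with open("corpus_1640_1699.txt", encoding="utf-8") as f:
    sentences = [simple_preprocess(line) for line in f]

# vector_size is the dimension N of the embedding space; window is the
# width of the context used to learn co-occurrence statistics.
model = Word2Vec(sentences, vector_size=100, window=5, min_count=5, sg=1)

# Cosine similarity over the trained vectors surfaces the terms that share
# contexts with "wit" in this corpus (raises KeyError if the term is absent).
for word, score in model.wv.most_similar("wit", topn=40):
    print(f"{word}\t{score:.3f}")
```

In practice the resulting neighborhood depends heavily on corpus size, date range, and preprocessing, which is exactly why the curation decisions discussed above are interpretive acts.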
But research in cognitive science has long questioned the property of linguistic similarity in spatial representations, because such representations don’t align with important aspects of human semantic processing (Tversky 1977). Tversky shows, for example, that people’s interpretation of semantic similarity does not always obey the triangle inequality; i.e., the words w1 and w3 are not necessarily similar when both pairs (w1, w2) and (w2, w3) are similar. While “asteroid” is very similar to “belt” and “belt” is very similar to “buckle”, “asteroid” and “buckle” are not similar (Griffiths et al. 2007). One reason this violation arises is because a word is represented as a single vector even when it has multiple meanings. This has led to research that attempts new methods to capture different senses of words in embedding applications. In a paper surveying techniques for differentiating words at the “sense” level, Jose Camacho-Collados and Mohammad Taher Pilehvar show that these efforts fall in two camps: “Unsupervised models directly learn word senses from text corpora, while knowledge-based techniques exploit the sense inventories of lexical resources as their main source for representing meanings” (2018, 744).

The first method, an unsupervised model, induces different meanings of a word—it is trained to analyze and represent each word sense based on statistical knowledge derived from the contexts within a corpus. The second method for disambiguation relies on information contained in other databases or sources. WordNet, for instance, associates multiple words with concepts, providing a sense inventory for terms. It is made up of synsets, which represent unique concepts that can be expressed through nouns, verbs, adjectives, or adverbs. The synset of a concept such as “a business where patrons can purchase coffee and use WiFi” might be “cafe, coffeeshop, internet cafe,” etc. Camacho-Collados and Pilehvar review different ways to process word embedding results using WordNet and similar resources, which essentially provide synonyms that share a common meaning.
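The knowledge-based camp is easy to see in miniature. The sketch below uses NLTK’s interface to WordNet (an assumption here; the survey itself is not tied to any particular toolkit) to list the sense inventory for belt, the pivot word in the triangle-inequality example above. Each synset is a distinct concept that a single word vector would otherwise collapse into one point.

```python
# A minimal sketch of a WordNet sense-inventory lookup via NLTK (assumed).
import nltk
nltk.download("wordnet", quiet=True)  # fetch the WordNet data once
from nltk.corpus import wordnet as wn

# Each synset is one concept; its lemmas are the words that express it.
for synset in wn.synsets("belt"):
    lemmas = ", ".join(lemma.name() for lemma in synset.lemmas())
    print(f"{synset.name():<18} [{lemmas}] {synset.definition()}")
```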
There exists a relationship between work that addresses word disambiguation and work that addresses the biases that word vector algorithms produce. Just as researchers can modify general word embedding models to capture a word’s multiple meanings, they can also modify them according to a word’s usage over time. These evolving methods begin to account for the social, historical, and psychological dimensions of language. If one can show that applying word embedding algorithms to diachronic corpora or corpora of different domains produces different biases, this would suggest that nuanced shifts in vocabulary and word usage can inform data curation practices that seek to isolate and remove historical bias from other word embedding models. Biases, one might say, persist despite contextual changes. Or, one might say that the shortcomings of word embeddings don’t account for changes in bias that are present in context. This is where the domain expertise of literary scholars also becomes essential. Historians’ domain expertise and natural interest in comparative corpora (from different time periods or containing different types of documents) situates their ability to curate datasets that attend to both data ethics and computational innovation. Such work could have impact beyond historical research, and result in data-level corrections to biases that emerge in more general-purpose embedding applications. This could be more effective and reproducible than correcting them superficially (Gonen and Goldberg 2019). For instance, if novel cultural biases can be traced to an origin period, texts from that period could constitute a sub-corpus. Embedding models specific to that corpus might be subtracted from the vectors generated from a broader dataset.

Examining a methodology’s history is an essential way in which scholars can strengthen the validity of computationally driven research and its integration into literary departments—this type of scholarship reconstitutes literary insights after the risky move of flattening literary texts with the rigor of machines. But as Lauren Klein (2019) and others reveal, scholars have begun to apply interpretation and imagination in both the computational and the “close reading” aspects of their research. This reinforces that computational shifts in the study of literature are more than just the adoption of useful tools for the sake of locating a novel pattern in data. An increasingly important branch of digital literary research demonstrates the efficacy of engaging the interdisciplinary complexity of computational tools in relation to the complexity of literary analysis.

New ideas for close readings and analysis can serve as windows into defining secondary computational research questions that emerge from an initial statistical exploration. As in the work reviewed by Camacho-Collados and Pilehvar, outside knowledge of word senses can be used for post-processing word embeddings in ways that address theoretical issues. Implementing this type of process for humanities research, one might begin with the question: can I generate word vector models that attend to both author gender and word context if I train them in innovative ways? Does this require a corpus of male authors and one of female authors? Or would this be better accomplished with an outside lexical source that has already associated word senses with genders?

Multi-disciplinary scholars are experimenting with a variety of methods to use word vector algorithms to track semantic complexities, and humanities researchers need an awareness of the technical innovations across a range of these disciplines because they are in a position to bring important domain knowledge to these efforts. Ideally, the questions that unite these disciplinary efforts might be: how do we make word contexts and distributional semantics more useful for both historians, who need reproducible results that lead to new interpretation, and technologists, who need historical interpretation to play a larger role in language generalization? Modeling language histories depends on how deeply humanists can understand word embedding models, so that they can augment their inherent shortcomings. Cross-disciplinary collaborations help scholars return to fundamental issues that arise when we treat words as data, and help bring more cohesive methodological standards to language modeling.
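One way to begin operationalizing the period-specific sub-corpus idea above, under the assumption that separate word2vec models have already been trained on two period-bound corpora (the file names below are hypothetical), is to compare a term’s nearest neighbors across the two models. Because independently trained vector spaces are not directly comparable coordinate by coordinate, comparing neighbor sets sidesteps the alignment problem while still surfacing candidate semantic shifts for close reading.

```python
# A minimal sketch: contrast a term's neighborhoods in two period-specific
# word2vec models (hypothetical files, each trained separately with gensim).
from gensim.models import Word2Vec

early = Word2Vec.load("model_1780_1810.w2v")
late = Word2Vec.load("model_1810_1840.w2v")

def neighborhood(model, term, k=15):
    """Return the set of the k nearest neighbors of term by cosine similarity."""
    return {word for word, _ in model.wv.most_similar(term, topn=k)}

term = "nation"
stable = neighborhood(early, term) & neighborhood(late, term)
emergent = neighborhood(late, term) - neighborhood(early, term)
print("stable neighbors:", sorted(stable))
print("neighbors new to the later period:", sorted(emergent))
```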
New directions in cross-disciplinary machine learning frameworks

Literary scholars set up computational inquiries with attention to cultural complexity, and seek out instances of language that convey historical context. So while they aren’t likely to lead the charge in correcting fundamental shortcomings of language representation algorithms, they can increasingly impact social assessments of those algorithms, provide methodologies for those algorithms to locate anomalies in language usage, and assess whether those algorithms embody socially just practices (D’Ignazio and Klein 2020). Some literary scholars also critique the non-neutral ideologies that are in place in both computing and the humanities (Rhody 2017, 660). These efforts not only make the field of literary studies (and its history) more relevant to a digitally and computationally driven future, but also help literary scholars create meaningful intersections between their computational tools and theoretical training. That training includes frameworks for reading and analysis that computers cannot yet perform, but should aspire to—from close reading, Semiotic Criticism, and Formalism to Post-structuralism, Cultural Studies, and Feminist Theory. The varied systems literary scholars have developed for thinking about signs, words, and symbols should not be seen as irreconcilable with computational tools for text analysis. Instead, they should become the foundation for new methodologies that tackle the shortcomings of machine learning algorithms and project future directions for text analysis.

Linguists and scientists interested in natural language processing have often looked to the humanities for methods that assign rules to the production of meaning. Such methods exist within the history of literary criticism, some of which are being newly explored as concepts for language modeling algorithms. For instance, data curation takes inspiration from cultural studies, which empowers literary scholars to correct for bias and underrepresentation in literature by expanding the canon. Subsequent literary findings from that research need not only be literary ones: they have the potential to serve as models for best practices for computational tools and datasets more broadly. While the rift between society’s most progressive ideas and its technological advancement is not unique to the rise of machine learning, practical opportunities exist to repair the rift with a blend of literary criticism and computational skills, and there are many recent examples[14] of the growing importance of combining rich technical explanations, interdisciplinary theories, and original computational work in corpus linguistics and beyond.

A desire to wield social and computational concerns simultaneously is evident also in recent work in Linguistics,[15] Sociology,[16] and History.[17] Studies in computational sociology by Laura K. Nelson, Austin C. Kozlowski, Matt Taddy, James A. Evans, Peter McMahan, and Kenneth Benoit contain important parallels for machine learning-driven text analysis. Nelson, for instance, calls for a new three-step methodology for computational sociology, one that “combines expert human knowledge and hermeneutic skills with the processing power and pattern recognition of computers, producing a more methodologically rigorous but interpretive approach to content analysis” (2020, 1). She describes a framework that can aid in reproducibility, which was noted as a problem by Da. Kozlowski, Taddy, and Evans, who study relationships between attention and knowledge, use a vector space model in a September 2019 paper on the “geometry of culture” to analyze a century of books.
They show “that the markers of class continuously shifted amidst the economic transformations of the twentieth century, yet the basic cultural dimensions of class remained remarkably stable. The notable exception is education, which became tightly linked to affluence independent of its association with cultivated taste” (1). This implies that disciplinary expertise can be used to isolate sub-corpora for use in secondary word embedding research problems. Resulting word similarity findings could aid in both validating the initial research finding and defining domain-specific datasets that are reusable for future research.

[14] See Whitt 2018 for a state-of-the-art overview of the intersecting fields of corpus linguistics, historical linguistics, and genre-based studies of language usage.
[15] A special issue in the journal Language from the Linguistic Society of America published responses to a call to reconcile the unproductive rift between generative linguistics and neural network models. Christopher Potts’s response (2019) advocates an imperative integration between deep learning and traditional linguistic semantics.
[16] Sociologist Laura K. Nelson (2020) calls for a three-step methodological framework called computational grounded theory.
[17] Another special issue, this one from Isis, a journal from the History of Science Society, suggests that “the history of knowledge can act as a bridge between the world of the humanities, with its tradition of close reading and detailed understanding of individual cases, and the world of big data and computational analysis” (Laubichler, Maienschein, and Renn 2019, 502).

The idea of using humanities methodologies to inform model architectures for machine learning is part of a wider history of computational scientists drawing inspiration from other fields to make AI systems better. Designing humanities research with novel word embedding models stands to widen the territory where machine learning engineers look for concepts to inspire strategies for improving the performance of artificial language understanding. Many computer scientists are investigating the figurative (Gagliano et al. 2019) and the metaphorical (Mao et al. 2018) in language. As machines get better at reading and interpreting texts, literary studies and theories will become more applicable to how those machines are programmed to look at multiple layers and dimensions of language. Ted Underwood, Andrew Piper, Katherine Bode, James Dobson, and others make connections between computational literary research and social dimensions of the history of vector space model research. Since vector models are based on the 1950s linguistic notion of similarity (Firth 1957), researchers working to show superior algorithmic performance focus on different aspects of why similarity is important than do researchers seeking cultural insights within their data. But Underwood points out that a word vector can also be seen as a way to quantitatively account for more aspects of meaning (2019). Already, cross-disciplinary scholarship draws on computational linguistics,[18] information science,[19] and semantic linguistics, and the imperative to understand concepts from all of these fields is growing. As better methods are developed for using word embeddings to better understand texts from different domains and time periods, more sophisticated tools and paradigms emerge that echo the complexity of traditional literary and historical interpretation.

[18] Linguistics scholars are also adopting computational models to make progress with theories related to semantic similarity. For instance, see Potts 2019.
[19] See Lin 1998, for example.
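The geometry-of-culture approach lends itself to a compact illustration. The sketch below follows the general shape of Kozlowski, Taddy, and Evans’s method, averaging the difference vectors of antonym pairs to define a cultural dimension and then projecting other terms onto it; the pre-trained vector file, the word pairs, and the probe terms are all assumptions, not their published materials.

```python
# A minimal sketch of projecting terms onto a "cultural dimension" built
# from antonym pairs; the vector file and word lists are hypothetical.
import numpy as np
from gensim.models import KeyedVectors

wv = KeyedVectors.load("century_of_books.kv")  # assumed pre-trained vectors

pairs = [("rich", "poor"), ("affluence", "poverty"), ("luxurious", "cheap")]
# The affluence axis is the mean of the pairwise difference vectors.
axis = np.mean([wv[a] - wv[b] for a, b in pairs], axis=0)

def projection(word):
    """Cosine of the angle between a word vector and the affluence axis."""
    v = wv[word]
    return float(v @ axis / (np.linalg.norm(v) * np.linalg.norm(axis)))

for word in ["education", "taste", "labor"]:
    print(f"{word:>10} {projection(word):+.3f}")
```

Tracking such projections across decade-specific models is how one could test, against a different corpus, the kind of shift in the education–affluence link that the authors report.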
Systematic data curation, combined with word embedding algorithms, represents a new interpretive system for literary scholars. The potential of machine learning methods for text analysis goes beyond historical literary text analysis, and the methods for literary text analysis using machine learning also go beyond literature departments. The corpora scholars model and the way they frame their research questions reframe the potential to use systems like word vectors to understand aspects of historical language, and could have broader ramifications on how other applications model word meanings. Because such literary research generates novel frameworks for using machine learning to represent language, it’s imperative to explore the question: Are there ways that humanities methodologies and research goals can exert greater influence in the computational sciences, make the history of literary studies more relevant in the evolution of machine learning techniques, and better serve our shared social values?

References

Algee-Hewitt, Mark. 2015. “The Order of Poetry: Information, Aesthetics and Jakobson’s Theory of Literary Communication.” Presented at the Russian Formalism & the Digital Humanities Conference, April 13, Stanford University, Palo Alto, CA. https://digitalhumanities.stanford.edu/russian-formalism-digital-humanities.

Algee-Hewitt, Mark. 2019. “Criticism, Augmented.” In the Moment (blog). April 1, 2019. https://critinq.wordpress.com/2019/04/01/computational-literary-studies-participant-forum-responses/.

Allen, Carl, and Timothy Hospedales. 2019. “Analogies Explained: Towards Understanding Word Embeddings.” In International Conference on Machine Learning, 223–31. PMLR. http://proceedings.mlr.press/v97/allen19a.html.

Argamon, Shlomo, and Mark Olsen. 2009. “Words, Patterns and Documents: Experiments in Machine Learning and Text Analysis.” Digital Humanities Quarterly 3 (2). http://www.digitalhumanities.org/dhq/vol/3/2/000041/000041.html.

Bode, Katherine. 2020. “Why You Can’t Model Away Bias.” Modern Language Quarterly 81 (1): 95–124. https://doi.org/10.1215/00267929-7933102.

Buurma, Rachel Sagner, and Laura Heffernan. 2018. “Search and Replace: Josephine Miles and the Origins of Distant Reading.” Modernism / Modernity Print+ 3, Cycle 1 (April). https://modernismmodernity.org/forums/posts/search-and-replace.

Camacho-Collados, Jose, and Mohammad Taher Pilehvar. 2018. “From Word To Sense Embeddings: A Survey on Vector Representations of Meaning.” Journal of Artificial Intelligence Research 63 (December): 743–88. https://doi.org/10.1613/jair.1.11259.

Chander, Manu Samriti. 2017. Brown Romantics: Poetry and Nationalism in the Global Nineteenth Century. Lewisburg, PA: Bucknell University Press.

Critical Inquiry. 2019. “Computational Literary Studies: A Critical Inquiry Online Forum.” In the Moment (blog). March 31, 2019. https://critinq.wordpress.com/2019/03/31/computational-literary-studies-a-critical-inquiry-online-forum/.
Da, Nan Z. 2019. “The Computational Case against Computational Literary Studies.” Critical Inquiry 45 (3): 601–39. https://doi.org/10.1086/702594.

D’Ignazio, Catherine, and Lauren Klein. 2020. Data Feminism. Cambridge: MIT Press.

Douglas, Samantha, Dan Dirilo, Taylor-Dawn Francis, Keith Giles, and Marisa Plumb. n.d. “The Bengal Annual: A Digital Exploration of Non-Canonical British Romantic Literature.” https://scalar.usc.edu/works/the-bengal-annual/index.

Dubin, David. 2004. “The Most Influential Paper Gerard Salton Never Wrote.” Library Trends 52 (4): 748–64. https://www.ideals.illinois.edu/bitstream/handle/2142/1697/Dubin748764.pdf?sequence=2.

Firth, J.R. 1957. “A Synopsis of Linguistic Theory.” In Studies in Linguistic Analysis, 1–32. Oxford: Blackwell.

Gagliano, Andrea, Emily Paul, Kyle Booten, and Marti A. Hearst. 2019. “Intersecting Word Vectors to Take Figurative Language to New Heights.” In Proceedings of the Fifth Workshop on Computational Linguistics for Literature, 20–31. San Diego, California: Association for Computational Linguistics. https://doi.org/10.18653/v1/W16-0203.

Gavin, Michael, Collin Jennings, Lauren Kersey, and Brad Pasanek. 2019. “Spaces of Meaning: Conceptual History, Vector Semantics, and Close Reading.” In Debates in the Digital Humanities 2019, edited by Matthew K. Gold and Lauren F. Klein, 243–267. Minneapolis: University of Minnesota Press.

Goldstone, Andrew. 2019. “Teaching Quantitative Methods: What Makes It Hard (in Literary Studies).” In Debates in the Digital Humanities 2019, edited by Matthew K. Gold and Lauren F. Klein. Minneapolis: University of Minnesota Press. https://dhdebates.gc.cuny.edu/read/untitled-f2acf72c-a469-49d8-be35-67f9ac1e3a60/section/620caf9f-08a8-485e-a496-51400296ebcd#ch19.

Gonen, Hila, and Yoav Goldberg. 2019. “Lipstick on a Pig: Debiasing Methods Cover up Systematic Gender Biases in Word Embeddings But do not Remove Them.” ArXiv:1903.03862, September. https://arxiv.org/abs/1903.03862.

Griffiths, Thomas L., Mark Steyvers, and Joshua B. Tenenbaum. 2007. “Topics in Semantic Representation.” Psychological Review 114 (2): 211–44. https://doi.org/10.1037/0033-295X.114.2.211.
Harris, Katherine D. 2015. Forget Me Not: The Rise of the British Literary Annual, 1823–1835. Athens: Ohio University Press.

Harris, Katherine D. 2019. “The Bengal Annual and #bigger6.” Keats-Shelley Journal 68: 117–18. https://muse.jhu.edu/article/771132.

Kirschenbaum, Matthew. 2007. “The Remaking of Reading: Data Mining and the Digital Humanities.” Presented at the National Science Foundation Symposium on Next Generation of Data Mining and Cyber-Enabled Discovery for Innovation, Baltimore, MD, October 11. https://www.csee.umbc.edu/~hillol/NGDM07/abstracts/talks/MKirschenbaum.pdf.

Klein, Lauren F. 2019. “What the New Computational Rigor Should Be.” In the Moment (blog). April 1, 2019. https://critinq.wordpress.com/2019/04/01/computational-literary-studies-participant-forum-responses-5/.

Koehrsen, Will. 2018. “Neural Network Embeddings Explained.” Towards Data Science, October 2, 2018. https://towardsdatascience.com/neural-network-embeddings-explained-4d028e6f0526.

Kozlowski, Austin C., Matt Taddy, and James A. Evans. 2019. “The Geometry of Culture: Analyzing the Meanings of Class through Word Embeddings.” American Sociological Review 84 (5): 905–949. https://doi.org/10.1177/0003122419877135.

Kutuzov, Andrey, Lilja Øvrelid, Terrence Szymanski, and Erik Velldal. 2018. “Diachronic word embeddings and semantic shifts: a survey.” In Proceedings of the 27th International Conference on Computational Linguistics, 1384–1397. Santa Fe, New Mexico: Association for Computational Linguistics. https://www.aclweb.org/anthology/C18-1117.

Laubichler, Manfred D., Jane Maienschein, and Jürgen Renn. 2019. “Computational History of Knowledge: Challenges and Opportunities.” Isis 110 (3): 502–512.

Lin, Dekang. 1998. “An Information-Theoretic Definition of Similarity.” In Proceedings of the Fifteenth International Conference on Machine Learning, 296–304. San Francisco, California: Morgan Kaufmann Publishers Inc.

Mao, Rui, Chenghua Lin, and Frank Guerin. 2018. “Word Embedding and WordNet Based Metaphor Identification and Interpretation.” In Proceedings of the 56th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), 1222–31. Melbourne, Australia: Association for Computational Linguistics. https://doi.org/10.18653/v1/P18-1113.
Nelson, Laura K. 2020. “Computational Grounded Theory: A Methodological Framework.” Sociological Methods & Research 49 (1): 3–42. https://doi.org/10.1177/0049124117729703.

Noble, Safiya Umoja. 2018. Algorithms of Oppression: How Search Engines Reinforce Racism. New York: New York University Press.

Poole, Alex H. 2013. “Now Is the Future Now? The Urgency of Digital Curation in the Digital Humanities.” Digital Humanities Quarterly 7 (2). http://www.digitalhumanities.org/dhq/vol/7/2/000163/000163.html.

Potts, Christopher. 2019. “A Case for Deep Learning in Semantics: Response to Pater.” Language 95 (1): e115–24. https://doi.org/10.1353/lan.2019.0019.

Rhody, Lisa. 2017. “Beyond Darwinian Distance: Situating Distant Reading in a Feminist Ut Pictura Poesis Tradition.” PMLA 132 (3): 659–667.

Risam, Roopika. 2018. “Decolonizing the Digital Humanities in Theory and Practice.” In The Routledge Companion to Media Studies and Digital Humanities, edited by Jentery Sayers, 78–86. New York: Routledge.

Roh, Yuji, Geon Heo, and Steven Euijong Whang. 2019. “A Survey on Data Collection for Machine Learning: A Big Data - AI Integration Perspective.” IEEE Transactions on Knowledge and Data Engineering Early Access: 1–20. https://doi.org/10.1109/TKDE.2019.2946162.

Tversky, Amos. 1977. “Features of Similarity.” Psychological Review 84 (4): 327–52. https://doi.org/10.1037/0033-295X.84.4.327.

Underwood, Ted. 2019. Distant Horizons: Digital Evidence and Literary Change. Chicago: University of Chicago Press.

Whitt, Richard J., ed. 2018. Diachronic Corpora, Genre, and Language Change. John Benjamins Publishing Company.

prudhomme-taking-2021 ---- Chapter 11 Taking a Leap Forward: Machine Learning for New Limits

Patrice-Andre Prud’homme, Oklahoma State University

Introduction

Today, machines can analyze vast amounts of data and increasingly produce accurate results through the repetition of mathematical or computational procedures.
With the increasing computing capabilities available to us today, artificial intelligence (AI) and machine learning applications have made a leap forward. These rapid technological changes are inevitably influencing our interpretation of what AI can do and how it can affect people’s lives. Machine learning models that are developed on the basis of statistical patterns from observed data provide new opportunities to augment our knowledge of text, photographs, and other types of data in support of research and education. However, “the viability of machine learning and artificial intelligence is predicated on the representativeness and quality of the data that they are trained on,” as Thomas Padilla, Interim Head, Knowledge Production at the University of Nevada Las Vegas, asserts (2019, 14). With that in mind, these technologies and methodologies could help augment the capacity of archives and libraries to leverage their creation-value and minimize their institutional memory loss while enhancing the interdisciplinary approach to research and scholarship.

In this essay, I begin by placing artificial intelligence and machine learning in context, then proceed by discussing why AI matters for archives and libraries, and describing the techniques used in a pilot automation project from the perspective of digital curation at Oklahoma State University Archives. Lastly, I end by challenging other areas in the library and adjacent fields to join in the dialogue, to develop a machine learning solution more broadly, and to explore opportunities that we can reap by reaching out to others who share a similar interest in connecting people to build knowledge.

Artificial Intelligence and Machine Learning: Why Do They Matter?

Artificial intelligence has seen a resurging interest in the recent past—in the news, in the literature, in academic libraries and archives, and in other fields, such as medical imaging, inspection of steel corrosion, and more. John McCarthy, American computer scientist, defined artificial intelligence as “the science and engineering of making intelligent machines, especially intelligent computer programs. It is related to the similar task of using computers to understand human intelligence, but AI does not have to confine itself to methods that are biologically observable” (2007, 2). This definition has since been extended to reflect a deeper understanding of AI today and what systems run by computers are now able to do. Dr. Carmel Kent notes that “AI feels like a moving target” as we still need to learn how it affects our lives (2019). Within the last decades, the jump in computing capabilities has been transformative in that machines are increasingly able to ingest and analyze large amounts of data, and more complex data, to automatically produce models that can deliver faster and more accurate results.[1] Their “power lies in the fact that machines can recognize patterns efficiently and routinely, at a scale and speed that humans cannot approach,” writes Catherine Nicole Coleman, digital research architect for Stanford University (2017).

A Paradigm Shift for Archives and Libraries

Within the context of university archives, this paradigm shift has been transforming the way we interpret archival data. Artificial intelligence, and specifically machine learning as a subfield of AI, has direct applications through pattern recognition techniques that predict the labeling values for unlabeled data.
As the software analytics company SAS argues, it is “the iterative aspect of machine learning [that] is important because as models are exposed to new data, they are able to independently adapt. They learn from previous computations to produce reliable, repeatable decisions and results” (n.d.). Case in point: how can we use machine learning to train machines and apply facial and text recognition techniques to interpret the sheer number of photographs and texts, in either analog or born-digital formats, held in archives and libraries? Combining automatic processes to assist in supporting inventory management with a focus on descriptive metadata, a machine learning solution could help alleviate time-consuming and relatively expensive metadata tagging tasks, and thus scale the process more effectively using relatively small amounts of data. However, the traditional approach of machine learning would still require a significant time commitment by archivists and curators to identify essential features to make patterns usable for data training. By contrast, deep learning algorithms are able “to learn high-level features from data in an incremental manner. This eliminates the need of domain expertise and hard core feature extraction” (Mahapatra 2018).

Deep learning has regained popularity since the mid-2000s due to “fast development of high-performance parallel computing systems, such as GPU clusters” (Zhao 2019, 3213). Deep learning neural networks are more effective in feature detection as they are able to solve complex problems such as image classification with greater accuracy when trained with large datasets. The challenge is whether archives and libraries can afford to take advantage of greater computing capabilities to develop sophisticated techniques and make complex patterns from thousands of digital works.[1] The sheer size of library and archive datasets, such as university photograph collections, presents challenges to properly using these new, sophisticated techniques. As Jason Griffey writes, “AI is only as good as its training data and the weighting that is given to the system as it learns to make decisions. If that data is biased, contains bad examples of decision-making, or is simply collected in such a way that it isn’t representative of the entirety of the problem set[…], that system is going to produce broken, biased, and bad outputs” (2019, 8). How can cultural heritage institutions ensure that their machine learning algorithms avoid such bad outputs?

[1] See SAS n.d. and Brennan 2019.

Implications to Machine Learning

Machine learning has the potential to enrich the value of digital collections by building upon experts’ knowledge. It can also help identify resources that archivists and curators may never have the time for, and at the same time correct assumptions about heritage materials. It can generate the necessary added value to support the mission of archives and libraries in providing a public good. Annie Schweikert states that “artificial intelligence and machine learning tools are considered by many to be the next step in streamlining workflows and easing workloads” (2019, 6). For images, how can archives build a data-labeling pipeline into their digital curation workflow that enables machine learning of collections?
With the objective being to augment knowledge and create value, how can archives and libraries “bring the skills and knowledge of library staff, scholars, and students together to design an intelligent information system” (Coleman 2017)? Despite the opportunities to augment knowledge from facial recognition, models generated by machine learning algorithms should be scrutinized so long as it is unclear how choices are made in feature selection. Machine learning “has the potential to reveal things …that we did not know and did not want to know,” as Charlie Harper asserts (2018). It can also have direct ethical implications, leading to biased interpretations for nefarious motives.

Machine Learning and Deep Learning on the Grounds of Generating Value

In the fall of 2018, Oklahoma State University Archives began to look more closely at a machine learning solution to facilitate metadata creation in support of curation, preservation, and discovery. Conceptually, we envisioned boosting the curation of digital assets, setting up policies to prioritize digital preservation and access for education and research, and enhancing the long-term value of those data. In this section, I describe the parameters of automation and machine learning used to support inventory work and experiment with face recognition models to add contextualization to digital objects. From a digital curation perspective, the objective is to explore ways to add value to digital objects for which little information is known, if any, in order to increase the visibility of archival collections.

What started this Pilot Project?

Before proceeding, we needed to gain a deeper understanding of the large quantity of files held in the archives—both types of data and metadata. The challenge was that, with so many files in so many formats, files become duplicated and renamed, doctored, and scattered throughout directories to accommodate different types of projects over time, making them hard to sift through due to sparse metadata tags that may have differed from one system to another. In short, how could we justify the value of these digital assets for curatorial purposes? How much could we rely on the established institutional memory within the archives? Lastly, could machine learning or deep learning applications help us build a greater capacity to augment knowledge? In order to optimize resources and systematically make sense of data, we needed to determine that machine learning could generate value, which in turn could help us more tightly integrate our digital initiatives with machine learning applications. Such applications would only be as effective as the data are good for training and the value we could derive from them.

Methodology and Plan of Action

First, we recruited two student interns to create a series of processes that would automatically populate a comprehensive inventory of all digital collections, including finding duplicate files by hashing (a sketch of that step follows below). We generated the inventory by developing a process that could be universally adapted to all library digital collections, setting up a universal list of works and their associated metadata, with a focus on descriptive metadata, which in turn could support digital curation and discovery of archival materials—digitized analog materials and born-digital materials.
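As one hedged illustration of the duplicate-finding step in this first phase, the sketch below walks a directory tree and groups files by content hash. The root path is a hypothetical placeholder; the pipeline the interns actually built is not published here.

```python
# A minimal sketch of duplicate detection by content hashing (hashlib);
# the directory path is a hypothetical stand-in for an archival share.
import hashlib
from collections import defaultdict
from pathlib import Path

def sha256_of(path, chunk_size=1 << 20):
    """Hash a file in chunks so large scans don't have to fit in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()

groups = defaultdict(list)
for path in Path("digital_collections").rglob("*"):
    if path.is_file():
        groups[sha256_of(path)].append(path)

# Any hash with more than one path marks byte-identical duplicates,
# however the files have been renamed or scattered across directories.
for digest, paths in groups.items():
    if len(paths) > 1:
        print(digest[:12], *paths, sep="\n  ")
```

Note that content hashing only catches exact duplicates; renamed copies are found, but re-saved or doctored derivatives are not, which is part of why the face and text recognition steps below were needed.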
We developed a universal policy for digital archival collections, which would allow us to incorporate all forms of metadata into a single format to remedy inconsistencies in existing metadata. This first phase was critical in the sense that it would condition the cleansing and organizing of data. We could then proceed with the design of a face recognition database, with the intent to trace individuals featured in the inventoried works of the archives to the extent that our data were accurate. We utilized the Oklahoma State University Yearbook collections and other digital collections as authoritative references for other works, for the purpose of contextualization to augment our data capacity.

Second, we implemented our plan: we worked closely with the Library Systems team within a Windows-based environment; decided on Graphics Processing Unit (GPU) performance and cost, taking into consideration that training neural networks necessitates computing power; determined storage needs; and fulfilled other logistical requirements to begin the step-by-step process of establishing a pattern recognition database. We designed the database on known objects before introducing and comparing new data to contextualize each entry. With this framework, we would be able to add general metadata tags to a uniform storage system using deep learning technology.

Third, we applied Tesseract OCR to a series of archival image-text combinations from the archives to extract printed text from those images and photographs. “Tesseract 4 adds a new neural net (LSTM) [Long Short-Term Memory] based OCR engine which is focused on line recognition,” while also recognizing character patterns (“Tesseract” n.d.). We were able to obtain successful output for the most part, with the exception of a few characters that were hard to detect due to pixelation and font types.
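For this OCR step, a minimal sketch follows. It uses pytesseract, a common Python wrapper for the Tesseract engine; the wrapper choice and the image path are assumptions, since the chapter names only the engine itself.

```python
# A minimal Tesseract OCR sketch via pytesseract (assumed wrapper);
# the image path is a hypothetical archival scan.
from PIL import Image
import pytesseract

image = Image.open("archives/scan_with_caption_0042.tif")
# "--oem 1" selects Tesseract 4's LSTM-based recognition engine.
text = pytesseract.image_to_string(image, config="--oem 1")
print(text)
```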
Fourth, we looked into object identifiers, keeping in mind that “When there are scarce or insufficient labeled data, pre-training is usually conducted” (Zhao 2019, 3215). Working through the inventory process, we knew that we would also need to label more data to grow our capacity. We chose to use ResNet 50, a smaller backbone version of Keras-RetinaNet, frequently used as a starting point for transfer learning. ResNet 152 was another implementation layer we used, as shown in Figure 11.1, which demonstrates the output of a training session, or epoch, for testing purposes.

Figure 11.1: ResNet 152 application using PASCAL VOC 2012

Keras is a deep learning network API (Application Programming Interface) that supports multiple back-end neural network computation engines (Heller 2019), and RetinaNet is a single, unified network consisting of a backbone network and two task-specific subnetworks used for object detection (Karaka 2019). We proceeded by first loading a large amount of pre-tagged information from pre-existing datasets into this neural network. We experimented with three open source datasets: PASCAL VOC 2012, a set including 20 object categories; Open Images Database (OID), a very large dataset annotated with image-level labels and object bounding boxes; and Microsoft COCO, a large-scale object detection, segmentation, and captioning dataset. With a few faces from the OID dataset, we could compare and see if a face was previously recognized. Expanding our process to data known from the archives collection, we determined facial areas and, more specifically, assigned bounding box regressions to feed into the facial recognition API, based on Keras code written in Python.

The face recognition API is available via GitHub.[2] It uses a method called Histogram of Oriented Gradients (HOG) encoding, which makes the actual face recognition process much easier to implement for individuals because the encodings are fairly unique for every person, as opposed to encoding images and trying to blindly figure out which parts are faces based on our label boxes. Figure 11.2 illustrates our test, confirming from two very different photographs the presence of Jessie Thatcher Bost, the first female graduate from Oklahoma A&M College in 1897.

Figure 11.2: Face recognition API test

[2] See https://github.com/ageitgey/face_recognition.

Ren et al. state that it is important to construct a deep and convolutional per-region object classifier to obtain good accuracy using ResNets (2015). Going forward, we could use the tool “as is” despite the low tolerance for accuracy, or instead try to establish large datasets of faces by training on our own collections in hopes of improving accuracy. We proceeded by utilizing the Oklahoma State University Yearbook collections, comparing image sets with other photographs that may include these faces. We look forward to automating more of these processes.
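A minimal sketch of the kind of comparison shown in Figure 11.2, using the face_recognition library cited above, might look like the following; the file names are hypothetical, with the known image standing in for an authoritative yearbook portrait.

```python
# A minimal face comparison sketch with the face_recognition library;
# file names are hypothetical placeholders.
import face_recognition

known = face_recognition.load_image_file("yearbook_1897_portrait.jpg")
candidate = face_recognition.load_image_file("unidentified_photograph.jpg")

# Each encoding is a 128-dimension vector derived from a detected face;
# indexing [0] assumes the known image contains exactly one face.
known_encoding = face_recognition.face_encodings(known)[0]

for encoding in face_recognition.face_encodings(candidate):
    match = face_recognition.compare_faces([known_encoding], encoding)[0]
    distance = face_recognition.face_distance([known_encoding], encoding)[0]
    print(f"match: {match}  distance: {distance:.3f}")
```

Lower distances mean more similar encodings; the library treats distances under its default tolerance of 0.6 as a match, a threshold that can be tightened when false positives matter more than recall.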
A Conclusive First Experiment

We can say that our first experiment developing a machine learning solution on a known set of archival data resulted in positive output, while recognizing that it is still a work in progress. For example, the model we ran for the pilot is not natively supported on Windows, which hindered team collaboration. In light of these challenges, we think that our experiment was a step in the right direction of adding value to collections by bringing in a new layer of discovery for hidden or unidentified content.

Above all, this type of work relies greatly on transparency. As Schweikert notes, “Transparency is not a perk, but a key to the responsible adoption of machine learning solutions” (2019, 72). More broadly, issues in transparency and ethics in machine learning are important concerns in the collecting and handling of data. In order to boost adoption and get more buy-in with this new type of discovery layer, our team shared information intentionally about the process to help add credibility to the work and foster a more collaborative environment within the library. The team also developed a graphical user interface (GUI) to search the inventory within the archives and ultimately grow the solution beyond the department.

Challenges and Opportunities of Machine Learning

Challenges

In a National Library of Medicine blog post, Patti Brennan points out “that AI applications are only as good as the data upon which they are trained and built” (2019), and having these data ready for analysis is a must in order to yield accurate results. Scaling of input and output variables also plays an important role in performance improvement when using neural network models. Jerome Pesenti, Head of AI at Facebook, states that “When you scale deep learning, it tends to behave better and to be able to solve a broader task in a better way” (2019). Clifford Lynch affirms, “machine learning applications could substantially help archives make their collections more discoverable to the public, to the extent that memory organizations can develop the skills and workflows to apply them” (2019). This raises the question whether archives can also afford to create the large amount of data from print heritage materials or refine their born-digital collections in order to build the capacity to sustain the use of deep-learning applications. Granted, while the increasing volume of born-digital materials could help leverage this data capacity somehow, it does not exclude the fact that all data will need to be ready prior to using deep learning. Since machine learning is only good so long as value is added, archives and libraries will need to think in terms of optimization as well, deciding when value-generated output is justified compared to the cost of computing infrastructure and skilled labor needs. Besides value, operations such as storing and ensuring access to these data are just as important considerations in making machine learning a feasible endeavor.

Opportunities

Investment in resources is also needed for interpreting results, in that “results of an AI-powered analysis should only factor into the final decision; they should not be the final arbiter of that decision” (Brennan 2019). While this could be a challenge in itself, it can also be an opportunity when machine learning helps minimize institutional memory loss in archives and libraries (e.g., when long-time archivists and librarians leave the institution). Machine learning could supplement practices that are already in place—it may not necessarily replace people—and at the same time generate metadata for the access and discovery of collections that people may never have the time to get to otherwise. But we will still need to determine accuracy in results. As deep learning applications will only be as effective as the data, archives and libraries should expand their capacity by working with academic departments and partnering with university supercomputing centers or other highly performant computing environments across consortium aggregating networks. Such networks provide a computing environment with greater data capacity and more GPUs. Along similar lines, there are opportunities to build upon Carpentries workshops and the communities of practice that surround this type of interest.

These growing opportunities could help boost the use of machine learning and deep learning applications to minimize our knowledge gaps about local history and the surrounding community, bringing together different types of data scattered across organizations. This increased capacity for knowledge could grow through collaborative partnerships, connecting people, scholars, computer scientists, archivists, and librarians to share their expertise through different types of projects. Such projects could emphasize the multi- and interdisciplinary academic approach to research, including digital humanities and other forms or models of digital scholarship.

Conclusion

Along with greater computing capabilities, artificial intelligence could be an opportunity for libraries and archives to boost the discovery of their digital collections by pushing text and image recognition machine learning techniques to new limits.
Machine learning applications could help increase our knowledge of texts, photographs, and more, and determine their relevance within the context of research and education. They could minimize institutional memory loss, especially as long-time professionals leave the profession. However, these applications will only be as effective as the data they are trained on and the added value they generate.

At Oklahoma State University, we took a leap forward in developing a machine learning solution to facilitate metadata creation in support of curation, preservation, and discovery. Our experiment with text extraction and face recognition models generated conclusive results within one academic year with two student interns. The team was satisfied with the final output, and so was the library as we reported on our work. Again, it is still a work in progress, and we look forward to taking another leap forward.

In sum, it will be organizations' responsibility to build their data capacity to sustain deep learning applications and to justify their commitment of resources. Nonetheless, as Oklahoma State University's face recognition initiative suggests, these applications can augment archives' and libraries' support for multi- and interdisciplinary research and scholarship.

References

Brennan, Patti. 2019. "AI Is Coming. Are the Data Ready?" NLM Musings from the Mezzanine (blog). March 26, 2019. https://nlmdirector.nlm.nih.gov/2019/03/26/ai-is-coming-are-the-data-ready/.

Carmel, Kent. 2019. "Evidence Summary: Artificial Intelligence in Education." European EdTech Network. https://eetn.eu/knowledge/detail/Evidence-Summary-Artificial-Intelligence-in-education.

Coleman, Catherine Nicole. 2017. "Artificial Intelligence and the Library of the Future, Revisited." Stanford Libraries (blog). November 3, 2017. https://library.stanford.edu/blogs/digital-library-blog/2017/11/artificial-intelligence-and-library-future-revisited.

"Face Recognition." n.d. Accessed November 30, 2019. https://github.com/ageitgey/face_recognition.

Griffey, Jason, ed. 2019. "Artificial Intelligence and Machine Learning in Libraries." Special issue, Library Technology Reports 55, no. 1 (January). https://journals.ala.org/index.php/ltr/issue/viewIssue/709/471.

Harper, Charlie. 2018. "Machine Learning and the Library or: How I Learned to Stop Worrying and Love My Robot Overlords." Code4Lib Journal, no. 41 (August). https://journal.code4lib.org/articles/13671.

Heller, Martin. 2019. "What Is Keras? The Deep Neural Network API Explained." InfoWorld. January 28, 2019. https://www.infoworld.com/article/3336192/what-is-keras-the-deep-neural-network-api-explained.html.

Karaka, Anil. 2019. "Object Detection with RetinaNet." Weights & Biases. July 18, 2019. https://www.wandb.com/articles/object-detection-with-retinanet.

Lynch, Clifford. 2019. "Machine Learning, Archives and Special Collections: A High Level View." International Council on Archives Blog. October 1, 2019. https://blog-ica.org/2019/10/02/machine-learning-archives-and-special-collections-a-high-level-view/.

Mahapatra, Sambit. 2018. "Why Deep Learning over Traditional Machine Learning?" Towards Data Science. March 21, 2018. https://towardsdatascience.com/why-deep-learning-is-needed-over-traditional-machine-learning-1b6a99177063.

McCarthy, John. 2007. "What Is Artificial Intelligence?" Professor John McCarthy (website). Revised November 12, 2007. http://jmc.stanford.edu/articles/whatisai/whatisai.pdf.
Padilla, Thomas. 2019. Responsible Operations: Data Science, Machine Learning, and AI in Libraries. Dublin, OH: OCLC Research. https://doi.org/10.25333/xk7z-9g97.

Pesenti, Jerome. 2019. "Facebook's Head of AI Says the Field Will Soon 'Hit the Wall.'" Interview by Will Knight. Wired. December 4, 2019. https://www.wired.com/story/facebooks-ai-says-field-hit-wall/.

Ren, Shaoqing, Kaiming He, Ross Girshick, Xiangyu Zhang, and Jian Sun. 2015. "Object Detection Networks on Convolutional Feature Maps." IEEE Transactions on Pattern Analysis and Machine Intelligence 39, no. 7 (April).

SAS. n.d. "Machine Learning: What It Is and Why It Matters." Accessed December 17, 2019. https://www.sas.com/en_us/insights/analytics/machine-learning.html.

Schweikert, Annie. 2019. "Audiovisual Algorithms: New Techniques for Digital Processing." Master's thesis, New York University. https://www.nyu.edu/tisch/preservation/program/student_work/2019spring/19s_thesis_Schweikert.pdf.

"Tesseract OCR." n.d. Accessed December 11, 2019. https://github.com/tesseract-ocr/tesseract.

Zhao, Zhong-Qiu, Peng Zheng, Shou-tao Xu, and Xindong Wu. 2019. "Object Detection with Deep Learning: A Review." IEEE Transactions on Neural Networks and Learning Systems 30, no. 11: 3212-3232.
white-entrepreneurship-1987 ---- Entrepreneurship and the Library Profession

Herbert S. White. "Entrepreneurship and the Library Profession." Journal of Library Administration 8, no. 1 (Spring 1987): 11-27. DOI: 10.1300/J111V08N01_03.

Herbert S. White is Dean of the School of Library and Information Science, Indiana University, Bloomington, Indiana.

The management literature, along with numerous biographies and autobiographies, serves to describe the entrepreneur for us quite accurately. The dictionary definition of an individual who starts and conducts an enterprise does not really begin to scratch the surface. Entrepreneurs are perceived to be risk-taking innovators, individualistic, believers in themselves and in their own competence regardless of the views of others, and as often as not stubborn, selfish, insensitive to the concerns of others at least when those concerns get in their way, and sometimes arrogant and ruthless. We know from statistics put out by the Department of Commerce and other government agencies that most individual entrepreneurs fail, and yet we have a tremendous admiration for those who succeed. Somehow their perseverance and courage in the face of overwhelming odds, negative research findings, and the advice of friends, colleagues, and "experts" strikes a responsive chord, and the careers of Howard Hughes and Ted Turner fascinate us, perhaps because these individuals have dared to do and say what most of us know we would never have the courage to attempt.

Entrepreneurs of the "old breed" may also command our awe because we recognize them as a dying breed, killed off by changes in the organizational decision process. Entrepreneurs, as already noted, are rugged individualists, who will ignore the admonitions of others because they believe so firmly in their own judgments. Entrepreneurs are not only thought to be too often wrong to be tolerated by the organizational structure, but perhaps more importantly we believe that organizational decision making now tends toward committee approaches, consensus, participation, and consultation.
In part this is because of the perception that this leads to better decisions, but to a greater extent because it leads to acceptable decisions. Entrepreneurs frequently rub people the wrong way, in part because of their own low level of patience or tolerance for disagreement. The development of management theories that urge a greater level of involvement and decision sharing is largely based on the argument that this not only improves the quality of decisions but also enhances morale and commitment in the office, factory, or laboratory. At the same time, there are other and less glorious reasons. Participation and the use of committees can be seen as an escape hatch for managers who wish to avoid making decisions at all costs, and who would prefer to distribute them to a committee decision process which is risk free simply because that many people cannot be fired or even punished.

Management writers have therefore noted, in describing the growth of formal decision making structures, what has been called the end of an era. Not in totality, of course. The Ross Perots do occasionally surface, but the greater trend is evidenced by the risk avoidance strategy of the leveraged buyout, or the stock market tactic of not building an organization but of acquiring wealth by infiltrating one through the process of borrowing on its own equity, or of threatening to do so in order to receive a bribe which, under the nomenclature of "greenmail," is perfectly legal even if ethically odious.

The less we now seek to emulate entrepreneurs in the business community, the more we have come to look upon them as heroic, larger-than-life figures. We recognize, at the same time, that entrepreneurs, even successful entrepreneurs, have their failings as executives. Most significantly, they are usually done in by the very success of the organization they have created. As organizations prosper and grow as the result of the drive, innovation, and ingenuity of their entrepreneur founders, they take on characteristics of the more standard bureaucratic model and exceed the ability of that individual to make all decisions. Entrepreneurially started organizations which have succeeded have done so because that success has then been transformed into a more traditionally structured mechanism, or they have failed because of their success and growth: either the span of decision making exceeded the ability of the one key individual unwilling to delegate to others, or other fundamental problems, such as unchecked growth leading to cash flow imbalances, did the organization in. It is an interesting study to note that IBM was totally restructured to diffuse both authority and responsibility by Thomas Watson, Jr. after the death of his entrepreneurial father, who founded the company. Entrepreneur founders often insist on making all decisions even when the sheer size of the company makes bad decisions inevitable and no decisions even more likely. Management analysts suggest that this happened just in time, not because IBM would have perished (at least not yet) but because it could not have continued to grow. A library example can be drawn from the administration of Librarian of Congress Archibald MacLeish.
His greatest contribution perhaps was a restructuring of the decision process to circumvent policies established by his predecessor, a great entrepreneurial builder who failed to recognize that his unwillingness to delegate was now strangling decisions. Entrepreneurs are generally seen as unwilling to adapt and as resistant to the sage advice of others who presumably know better. The willingness to make decisions is their strength, and the insistence on making all decisions is also their weakness. They can start organizations, but they do not often manage mature ones.

In recognition of this trait, it is generally acknowledged that entrepreneurs do not stay within organizations; they leave and start their own. Some, such as the brilliant innovator who founded Apple Computer, Steven Jobs, do it more than once. Their knack, their accomplishment, and their success come from starting organizations, not from managing mature ones. In fact, they often find the management of mature organizations to be boring. This perception of entrepreneurs as anti-managers is so closely held that it has come to be accepted as an obvious truth.

For the library profession this dichotomy is perceived to be equally stark. Libraries are, after all, very mature organizations. They have a hardening of decision arteries brought about not only by the risk avoidance tendencies of many librarians, but by a preference for minimal or no changes by the library's clientele, be these academic faculty members, special library users whose preconceptions come from what they have seen as university students, or the public library patrons heavily skewed toward children and the elderly. All are groups that have an affinity toward the library just as it is. Individuals who do not like the library just as it is do not tend to try to change it. They just ignore it, and as we already know from a variety of research investigations, the existence of an inadequate library does not pose an insurmountable barrier. Users adapt to poor service, find other approaches to information, or pretend they never needed the information in the first place.

All this tends to create a scenario which, in libraries as in other organization structures, but perhaps particularly in libraries, stresses an environment of cooperation and coordination, of "getting along," of working "as a team," and of not being aggressive or confrontational, or even assertive and outspoken. As a library educator I see recruiters who search for those who work well within groups, and who avoid the students whose academic brilliance and articulateness makes them unique. As an administrator I can certainly understand and even appreciate that preference. At the same time, we must recognize, as our colleagues in the more generic management field have recognized, that this will almost inevitably deprive us of entrepreneurs and the strengths that such individuals bring to an organizational dynamism. Entrepreneurs are not usually good team players. They can be opinionated, rude, loud, stubborn, even obnoxious. The redeeming quality only comes into play when they are right. Some organizational managements consider it too great a price to have such people around disturbing the equilibrium. Not all.
The aforementioned Thomas Watson, Jr., having created for IBM a balanced structure combining delegation, authority, and responsibility, and certainly based in large measure on committee input and coordination, nevertheless argued passionately for the protection of what he called the organization's "wild ducks": those individuals who march to the beat of a different drummer, create difficulties and tensions wherever they go, but can also, just once in a while, be counted on to make the major breakthroughs that "organization men" could never make. We now know enough about hiring and selection policies to recognize that nobody would, in the 1980s, hire Thomas Edison as a junior research scientist. And yet, presumably, somebody should.

For libraries, oriented toward tradition and risk avoidance to a greater extent than most organizations, and without the incentive of profits to justify chance taking, the temptation to hire a long line of inoffensive looking "grey" people is just about insurmountable. Nowhere is this pattern clearer than in the hiring of administrators for major academic libraries. Candidates for these posts come from three pools: (1) administrators of smaller academic libraries; (2) assistant administrators of large and even larger libraries; and (3) faculty scholars who would like to take a crack at running the library. All three of these candidate pools are almost automatically conditioned in support of the status quo, or at best of carefully suggested minuscule change. As this article will argue later, such attitudes will not serve us in the face of the crises we now face. Nevertheless, the climate is not ready for entrepreneurs in this setting. Entrepreneurs would without doubt make major changes, and there is no awareness that major changes are needed. Such changes could of course dramatically improve the library. They could also harm it, and that is a chance nobody is prepared to take.

In addressing the issue of entrepreneurship in the library and information industry, Helena Strauch's chapter in "Careers in Information" assumes at the outset that entrepreneurial careers will be outside the structure of the traditional setting and outside the present library.[1] She stresses that such entrepreneurs leave their present employers and blaze new trails for themselves. She emphasizes quite correctly the dangers, misconceptions, and myths that accompany entrepreneurship. The most significant of these is the point that while entrepreneurs are indeed free to follow their own hunches and implement their initiatives, their success will nevertheless be dependent on their ability to get others to see and accept the validity of what they see so clearly. The examples of library and information science pioneers she cites are Mortimer Taube, Saul Herner, Eugene Garfield, and Earl Coleman, among others. All fit the classic definition of the entrepreneur: brilliant, individualistic, visionary, courageous, impatient of the weaknesses of others; not builders of teams, delegators of authority, or developers of subordinates. If that perception of irreconcilable differences between entrepreneurs and team managers were allowed to continue into the last 15 years of this century, then the impact on libraries of individuals with entrepreneurial spirit would indeed continue to be minimal.
Bureaucratic organizations have efficient mechanisms for driving out the person who is different from the norm, because even those who are brighter, quicker, and more efficient than the norm tend to make others uncomfortable. This is of course true throughout organizations, because human behavior is fairly generic. It is nowhere more true than in libraries, for reasons already stated in part and to be elaborated later in this article. Students do not normally choose this profession in search of wealth or notoriety, and the admonition of the 1970s that we should seek a consensus through consultation and participation has found an eager audience among librarians. That consensus decisions are invariably "safe," and that they are usually unimaginative in conception and slow in development, have been recognized as well for some time. Some administrators in the for-profit sector, such as Thomas Watson, Jr., though, sought ways to balance individual initiative and innovation with the characteristics of a large, slow-moving bureaucratic structure. They had the motive and incentive of greater profits and a greater market share, however. Libraries have no such performance measurement criteria, and neither their staffs nor their users have come to expect any. We could therefore perceive ourselves as in a trap, discouraging entrepreneurs from entering our midst and driving away those who wandered in by accident.

A recent development, named by its originator Gifford Pinchot III as intrapreneurship, argues that entrepreneurs do not have to leave and that they can work within the organization.[2] Pinchot's ideas are worth examining because, as will be argued later, probably no profession has a greater need for this newly termed intrapreneur than librarianship. Pinchot stresses the importance of entrepreneurial approaches within organizations, with examples drawn directly from the for-profit sector, and argues persuasively that organizations that depend on "style" of organizational behavior as opposed to a concentration on results pay a heavy penalty.

While the term "intrapreneur" may be new, the ideas are really not. Other managers and teachers of management have stated that a continued reliance on innovation is essential for success in any organizational setting. William Zucker, professor at the Wharton School of Business at the University of Pennsylvania, has argued that entrepreneurship is a part of the warp and woof of any organization. Robert T. Grohman, president of the clothing firm Levi Strauss, has stated that "Scarcity of innovation is the surest path to slackened competition, to emphasis on maintenance of effort, and finally, to inertia."[3] These writers, and others, argue that an atmosphere open to innovation requires a tolerance for failure, and an openness to risk.

It is perhaps Peter Drucker, the articulate and outspoken guru of management concepts, who puts the idea into its most useful perspective and who permits us to apply it to librarianship. It was Drucker who, a number of years ago, pointed out that managers only get credit for two activities, innovation and marketing, because operational maintenance of the status quo is assumed and earns no recognition. It was he who noted that managers needed to be innovative to avoid the risk of being boring, because boredom was a deadly sin in any management environment. Librarians, of course, can take the example from there.
If we are taken for granted, if we are trivialized, and if our decision not to generate an atmosphere of crisis means that others never rush to solve our problems, it may be because we have not taken these injunctions to heart.

In his 1985 book Innovation and Entrepreneurship, Drucker examines these issues in greater detail, and he makes some statements which librarians might find startling.[4] He argues, for example, that the past 15 years have seen the emergence of a truly entrepreneurial economy in the United States. This timing is interesting, because it would relegate the concept of "team management" to the 1960s, and replace it in the 1970s and 1980s with a more rugged individualistic model. For libraries, as always well behind their industrial role models, such a suggestion becomes particularly ironic. Our literature is still filled with urgings that we develop concepts of greater participation, consensus seeking, and committee decision structures, when Drucker now suggests that such an approach has been passé for some time.

The suggestion of the 1960s sociological argument was that people will work harder if they are happier, and that what we really need are managers who are sensitive to the concerns and needs of people. That this "fad" has now run its course can also be seen from a general examination of the longevity of management theories and from an examination of recent newspaper accounts. It is most directly evident from the actions of corporations in the mid-1980s, which are ruthlessly stripping their organizations of middle management layers of coordinators, staff assistants, and facilitators, and relying on keeping individuals who "do things" and who "make things happen." The change, of course, is never total. Management fads have a way of swinging like pendulums, overcorrecting perceived imbalances and then precipitating a counteraction against an activity that had gone too far. Certainly no one would argue that sensitivity to individuals, an ability to listen, and a willingness to compromise are bad. What appears clear, however, is that these values as virtues in themselves have fallen by the wayside. Organizations do want individuals with these values if possible; more importantly, they want people who will make a direct and personal contribution to the program of the organization, if necessary by cutting through the red tape and caution flags set up by the bureaucracy's "people people," the ones who do little but convene meetings and report the consensus achieved there.

I attempted to address this fascination in our own field with style as opposed to substance of decision making in a column entitled "Participative Management Is the Answer, But What Was the Question?"[5] The column drew little direct response, but at least some comment that I had rather cleverly exposed some of the weaknesses in an excessive use of participative processes. This exposure only pointed out that we had to work harder to make the process work, because participation was desirable as a social good and we needed to make it effective. Why, for heaven's sake? Because it makes for more efficient and effective libraries? Because it makes for more fulfilled and happy librarians? Neither point can be "proven," but in the 1960s such things did not have to be proven if they were "obvious."
What we do know about job fulfillment would suggest that protecting staff members against abuse, unfairly low salaries, and objectives for which there are no resources for implementation are far more effective techniques than the recognized absurdity of sitting around in lengthy meetings pondering the undoable. I cannot help noting that as this is being written the radio reports the death of Admiral Hyman Rickover, one of the great, most effective, and to some, most obnoxious entrepreneurs. Rickover developed the atomic submarine, and in that process his unflagging enemy was the bureaucracy of the U.S. Navy, most particularly those admirals whose route to success was through the socialization process of getting along with others. The country club is still an effective route to success in the United States, as school ties are in Great Britain, party membership in the Soviet Union, and important relatives in the Middle East. I hope that the reader will forgive the digression, which somehow seemed a necessary tribute at the time of writing.

For libraries, the developments in management practice, and our own approaches to seeking collegiality and cooperation just as others turn to individuality and contribution, may suggest that we are still, or perhaps once again, out of phase. As an educator and administrator I know that some libraries shun candidates who are "different," even though they know that in this case different may mean brighter and more articulate. I also know that some educators assign group projects and assign group grades, although they know that they cannot really tell who contributed what portion to the overall outcome, except that this contribution was almost never equally distributed. Hyman Rickover, by contrast, prided himself on refusing to accept recommendations with several equal signatures. He wanted to know who was taking responsibility and who would ultimately be credited or blamed. Rickover was unusual because he insisted on this practice in the 1960s when such behavior was considered bizarre. It is bizarre no longer, except perhaps in libraries and similar institutions.

In discussing entrepreneurship within the organization, Peter Drucker stresses that opportunities for innovation come from unexpected successes (the development of database access services for libraries comes immediately to mind), or unexpected failures (here the obvious example would seem to be our failure to secure adequate financial support for our historic and traditional approaches). Drucker further argues that innovation should be based on an analysis of opportunities, that it should be kept as simple as possible, that it should start small, and that innovation should be for the present and not for the future.

Although Drucker does not specifically discuss libraries, his book points to the particular difficulty of public service institutions in attempting to deal with innovation. The problems he identifies are certainly descriptive of libraries; yet it is his unswerving conclusion that ways must be found around these difficulties:

1. We are judged by budgets rather than by results. True enough, but the solution is self-evident. It lies in the concept of program budgeting, of starting with proposed activities and moving from these to budgets, rather than the other way around. Program budgeting, of course, has been in vogue in responsible administrative circles for 20 years.
If it has not been applied to library budgets, it is because we have not developed either the justification or the insistence that it should.

2. We depend on a multitude of constituents, any of whom has at least a partial veto over what we do. Again, this is certainly an accurate description of libraries. It opens questions of "turf," of who controls the decision process of how libraries function: the professionals or the amateurs. Turf battles are not uncommon in any discipline, but doctors and lawyers won theirs long ago. Teachers are involved in the struggle to determine who decides curriculum and classroom size. Librarians, by contrast, have generally been loath to broach the issue. Some, particularly in academia, even argue that users know better than we what the library should do, a statement made specious by the recognition that most users do not even know what the library could do, only what it does.

3. Public service institutions see their mission as one of "doing good," as a moral absolute rather than in economic terms. If we do not succeed, we assume either that we must try harder or obtain more resources. It does not readily occur to public service institutions to examine what they do and why they do it. Drucker must have been looking over our shoulders when he wrote that.

Drucker's most telling points for us come when he describes what is needed to develop an organizational climate hospitable to entrepreneurship and innovation. It is my preference to list these all at once and then comment on them in greater detail. Drucker's points are:

1. A clear definition of organizational mission
2. A realistic statement of objectives
We are often reluctant, though, to make that necessary statement to complete the scenario. As library managers, we give our subordinates assignments that cannot be satisfactorily completed because there is not enough time or not enough resources. When we knowingly perpetrate this fraud by urging them to "do the best they can," we do so much damage to organizational morale that management styles of autocracy, consultation, or participation be- come irrelevant by comparison. When individuals are given jobs in which they cannot succeed that injustice transcends the style in which we do it. And yet, as a wholly emerging literature on the management of declin- ing resources suggests, we do badly in relating our resources to our objec- tives, likely because we are measured on the basis of budgets rather than results. If this is the case, and 1 am sure it is, it becomes puzzling that we are unwilling or at least unable to renegotiate those expected results based on the resources provided. The clue may be found in the last and by his definition most important characteristics in the Drucker description of public service institutions. Because we are here to do "good" and be- 20 JOURNAL OF LIBRARYADMINISTRATION cause we see our mission as a moral imperative rather than in more mun- dane down to earth terms, it may be precisely because we do not want to limit our objectives that we fail to tie them to the resources provided. However, in this noble and perhaps understandable aim, we make two fatal mistakes. The first is that by a failure to tie accomplishments to resources we insure both decreased resources and ultimately decreased accomplishments. In other words, we doom ourselves to failure because we are unwilling to take the risks necessary to be managers. We also fail to recognize, as Thomas Galvin has reminded us, that "management is a contact ~ p o r t . " ~ Our second mistake is even more serious. because it makes victims of our subordinates. In our unwillingness to fight for the resources they need to accomolish their iobs or to restructure the iob to meet the resources. we sentence subordinates to play a game they cannot win. It seems ironic to me that in this profession we spend so much time trying to implement in the 1980s a 1960s philosophy of personal decision involvement. At the same time we ignore a reality that has been around a lot longer and is not going to change - that we owe our subordinates a job they can accomplish with the training and resources provided -and that we owe them a chance to win, if they are willing to try. Ht takes no genius to figure out what happens to the morale of individuals who learn that they cannot win no matter how hard they try. A number of recent articles dealing with the management of declining resources i n the public sector, but most directly that by Bo Hedberg writ- ten a decade ago, tell us quite directly how to deal with this phenome- non.' Hedberg outlines the stages through which a management con- fronted by declining resources must pass to ultimately deal with the problem. The first is the premise that the decline is temporary and that nothing needs to be done because the problem will disappear on its own. This is of course occasionally true, but not nearly as frequently as we would like to believe, or as often our managers urge us to believe. The most obvious test for such a hypothesis is to ask what situation is likely to improve and why this will occur. Frequently, the hope for better days ahead is nothing more than wishful thinking. 
As often as not the cut presages further cuts i n the future. The prudent manager, of course, deals with realities and not with fond hopes. ]In contrast to the teaching of children's fairy tales, wishing will not make it so. Hedberg's second stage recognizes that the cut is not temporary, but that somehow "it can be absorbed." Many libraries allow themselves to be cajoled or pressured into accepting this scenario, but Wedberg notes quite clearly that the argument is bankrupt, for two reasons. The first is that it creates yet more pressures on efficiency, on doing what we do as rapidly and cheaply as possible, rather than allowing a concentration of effectiveness, an investigation of why we are doing what we are doing Herbert S. White 21 and a determination of possible alternatives. The second is that as we suggest to our subordinates that they must "do more" because the budget has been cut, we create two unacceptable premises. The first is that they must somehow bear the brunt and accept the blame for our failure to succeed as managers, because clearly one of our jobs as managers is the obtaining of resources. The second is that we entrap them into an admis- sion that they can absorb more work, and thereby into a self-indictment of not having worked as hard as they could have. Subordinates are smarter than that. The suggestion that more work can be absorbed must be cate- gorically rejected whether or not it has validity. If higher production quo- tas are unilaterally imposed, they are met with dire warnings that quality will suffer. Of course, having been issued, these warnings must be made to come true and therefore they do come true. Ultimately, Hedberg argues, organizations can only deal with declining resources by reexamining and changing the premises of what is to be done, how it is to be done, or both. That clearly involves us in renegotiat- ing the "contract" with our constituent groups. Sometimes it is these individuals who also directly or indirectly provide our funds and that negotiation should be relatively simple. In other situations, the constitu- ents and source of funds represent different communities. It is then neces- sary for the librarian to clearly establish the relationship in the under- standing of both groups-so that constituents know whom to blame and that budget cutters understand whatever risks they incur in cutting budg- ets. It must be obvious to all readers that if there are no perceived risks in the cutting of budgets or alternatively no credit to be gained in restoring or augmenting a library budget, then library budgets will indeed be auto- matically cut. This is true because there is inevitably credit to be gained for frugality and economy. If that action has no price it becomes an ab- surdly simple decision to make. Librarians who do not understand that political reality participate in their own destruction and fail to protect their organizations or their staffs. It is these individuals who are our truly incompetent managers. Budget cuts and other failures of programs and initiatives, Drucker reminds us, provide excellent opportunities for innovation. It is impor- tant in this case to stop harboring grudges, to cease looking for scape- goats, and not to engage in lengthy and paralyzing analysis. Rather, Drucker suggests that the potential innovator and entrepreneur simply look around. Options will suggest themselves as long as we concentrate properly on objectives and not on processes themselves. 
I t can be argued that declining environments not only provide opportunities for innova- tion, they make the process essential. For, as my college tennis coach well advised me, "always change a losing game. The worst that can happen is that you will still lose. You therefore risk n~thing."~ As librarians have the opportunity to examine the admonitions of 22 JOURNAL OF LIBRARY ADMINISTRATION Drucker and others dealing with nurturing innovation and entrepreneur- ship within the organization structure, there are a number of concerns that must be kept in mind. The first is that libraries, perhaps even beyond other public service institutions, tend to attract and then promote individuals with other than entrepreneurial and innovative characteristics. Entrepreneurs, we must remind ourselves, are not necessarily the most pleasant and affable of individuals. We can see this from the list of information entrepreneurs identified in Helena Strauch's chapter, all of them personally known to this writer. The point is that what they have done shows courage and it works. The biographical synopses of Admiral Hyman Rickover which now flood the media stress that same contradiction. Entrepreneurs and innovators are not likely to win popularity contests, if for no other reason than their advocacy for change and most of the organization resists change. The job of the manager in this environment is to separate what is perceived to be beneficial change from change which can have negative implications or which is change simply for the sake of change. However, as libraries, rushing to implement the 1960s sociologicaP model of gov- ernance, seek individuals less noted for iconoclastic brilliance than an ability to fit into a collegial mold, the likelihood of the survival of inno- vators and entrepreneurs bewmes lessened. Protecting entrepreneurs in an organization requires effort as Thomas Watson, Jr., in his admonition to nurture and protect "wild ducks," constantly sought to remind IBM lower level administrators. B recall telling one of my subordinate supervi- sors in one of my information industry assignments: "You cause me more trouble than any three of my other subordinates put together. But you are worth it." It was the nicest compliment 1 could think of. Of course, this puts us in mind of one of the premises that management theory has rewg- nized all along. G m d subordinates make trouble. They are impatient for change, for improvement, and for personal reward. Only a self-confident supervisor can handle this. When organizational dynamics demand that the individual first "persuade" his or her co-workers, the cause is likely to be lost. Groups are unlikely to foster innovation. Group decisions tend to foster compromise and safe approaches. When we deliberately opt for such a tensionless environment, we give up a great deal. I t was indicative to me that at least one of the criticisms about my article that questioned the blind adherence to participation as a management style conceded that indeed participation did not always work, but that we should strive to make it work. That approach has nothing to do with organizational objec- tives and it has nothing to do with the role of the library. Nor, as we have learned, does it have much to do with making people "happy." Giving people job assignments at which they can succeed is more likely to do this, but such an action of course requires protecting them and the library from the imposition of objectives that cannot be accomplished. Herbert S. 
White 23 Innovation and entrepreneurship involve risk. There is clearly the risk of failure, and would-be innovators must know that we are not just look- ing over their shoulder waiting for them to fail. Failure, in institutional initiatives and for individuals, is a normal thing to be expected, as long as there is some reasonable balance with success. I t was interesting to note t h a t Coca-Cola conferred a substantial salary increase on the executive who made the decision to implement the new formula Coca-Cola. That decision, marketing statistics clearly indicate, was a mistake and corpo- rate management obliquely acknowledged that fact. The reward was not for success, it was for the courage to try. Public service institutions such as libraries, we are reminded by Drucker, are not expected to take risks, because they are judged by their budgets and not by their accomplishments. If budgets were adequate that might be an acceptable approach, although not for the innovators and entrepreneurs who had accidentally drifted into our profession. However, we know that budgets are not adequate, and they are becoming less ade- quate all the time. We therefore must, for our own professional survival, concentrate on results rather than budgets, and apply a healthy dose of innovation and entrepreneurship to the process. Drucker also notes that unexpected successes, unexpected problems, incongruities between what is and what ought to be, and the development of new knowledge and new technology all provide a fertile ground for the consideration of innovation. All these can be seen to apply to libraries and only a few examples will suffice to make the point. The reader can easily find his or her own. 1. The application of technology to the cataloging process, most spe- cifically through the development of the MARC system, allows us to reexamine the premises and redesign the parameters of how we analyze. That is particularly true for subject analysis. Subject analysis for books is sparse in the library cataloging process for a number of reasons, but two immediately come to mind. The economics of manual card filing limited the number of cards to be produced and filed. Having opted for a detailed analysis on each basic card of the descriptive features for the book and having decided to file at least under author, title, and other applicable descriptive tags, only at most two or three subject cards appeared afforda- ble. Moreover, the difficulty with card catalogs of performing coordi- nated multi-term searches suggested the use of more generic subject terms, even though we recognized (or at least presumably recognized) that such broad headings with many cards would make subject searches difficult. If anyone doubts this he or she need only spend a little time at the divided catalog of a major academic library. Most subject searches are begun in the author-title catalog. Whatever the rationale for this historic approach, it is valid no longer. Computer access to bibliographic informa- tion permits not only the economic storage of more records, but also the 24 J O U R N A L OF LIBR4RY ADMINISTRATION coordinated searching of several access points. It provided for us the innovative and entrepreneurial opportunity to take advantage of the new technology by developing new analytical approaches, rather than just transferring manual cataloging rules to machines. 
It is perhaps ironic to note that one entrepreneurial commercial information service is now bringing out a tool with which to search for book information through greater subject detail. It is a need we have failed to address and at least somebody perceives the gap. 2. The development of computerized access for bibliographic search- ing has made that exercise far more productive, far more worthwhile, far more interesting, and at the same time far more complex. It makes his- toric reference department budgets based on the manual perusal of card files and published indexes totally irrelevant. This development also sug- gests a relatively simple justification for a manyfold increase in the public service budget of libraries, based not on history but on need and opportu- nities. It is, in any budgetary setting that concentrates on results, rela- tively easy to do. In the case of bibliographic searching, it is easy to demonstrate that the funds will be spent in any case and undoubtedly spent less efficiently if outside the library budget. A budgetary increase is also easy to do because the management literature reminds us that large increases with significant results are far easier to justify than small in- creases which produce no noted change. The sad indictment is not that we have failed, it is that by and large we have not tried. Instead, we have produced a tortuous literature that argues that teaching individuals to do their own information searching work is the "better" approach. This ar- gument runs counter to evidence that we do it more economically and more effectively and that most users would rather delegate the process if they could. What we have done here is to continue to define the library's tasks as we have always defined them and to expel all tasks that do not fit priority definitions. We have of course applied new techniques to the tasks we have always performed, but that is neither innovative nor entre- preneurial. It is safe, it is sure-it is also boring and in the long term probably suicidal. 3. As a corollary to increased bibliographic access we have also devel- oped a heightened demand for document delivery, as users and reference librarians are no longer restricted to finding out about things in our own card catalogs. Statistics indicate a growing reliance on interlibrary loan. We ought to be thrilled at that opportunity to broaden our services, al- though these increases are usually reported in our literature as trouble- some problems. The reason, of course, is that we have never developed an updated strategy for dealing with the issue of shipping material from one library to another. We prefer to avoid the problem by pretending that this is nothing more than a mutually shared exchange. That, of course, becomes the ultimate trivialization of the process. Fortunately, technol- ogy has developed techniques to help us with this problem and these techniques are widely in use. Paper copy can be reproduced by bouncing it off a satellite, or by transmitting it via telephone lines to a printer at the receiving end. I recall a demonstration of this process at the Library of Congress over 30 years ago. The quality was not very good, but the inventor clearly saw value for libraries as soon as quality improved. Little did he know that quality had nothing to do with our use of this technique. Our concern was and remains cost. However, we do not need to be as esoteric. Many commercial services, and even the formerly tradition- bound U.S. 
Postal Service, now have mechanisms for one day delivery. Why do we not utilize this service? Why do we continue to insist on using package delivery and lower class delivery at that? That process, added to delays at both library ends, can result in a wait of four weeks or longer. Is it because we really believe that what we do is so trivial that it is not worth a greater effort? I hope nobody will suggest that it is not because we do not have the money, because that excuse, if accepted, would effec- tively prevent any change from ever taking place. We obviously don't have the money, but the answer is to get some. Innovation and entrepreneurship in this profession are rare enough that when some example does come to mind, it really stands out. Several years ago one of our brightest Indiana graduates accepted a position with an industrial firm to provide on-line search services in that organization's small library, services never offered previously. The corporation made it clear that this was only a one year appointment, funded by appropriations that would not and could not be sustained and that under no circum- stances would the job extend beyond one year. Many students were and are reluctant to take such a position, but this young entrepreneur and innovator had no reservations. She told me that she had a full year to make herself totally indispensable and that long before that year was up management would be convinced that the funds had to be found some- where to maintain the service. Of course, she was completely correct. The decision to turn this into a full-time position was made before the end of six months, to make sure she did not start looking for another job. Peter Drucker has also told us this. He reminds us that if we can create utility price is almost always irrelevant. Put another way, we know, or we ought to know, that corporations, universities, and even small municipali- ties find ways to afford what they really want to afford, and that could include library activities. Entrepreneurs and innovators know that too, probably instinctively. I am not sure that one can really develop entrepreneurial and innovative perceptions in individuals unless the germ is already there. What we must do is find such individuals, attract them to our profession, and then nur- ture and protect them from all of the organizational forces that demand that they resemble everybody else. 26 J O U R N A L O F LIBRARYADMINISTRATION Entrepreneurs are not often themselves the managers of large entcr- prises, including major libraries, although small and special libraries can function with entrepreneurs in charge. There is in management too much of the administrative, of the routine, of the maintenance of stability to allow senior executives in charge that much flexibility. It is an old man- agement axiom that the higher one progresses within a management structure, the less freedom one has. What senior library managers can and should do is create a climate that welcomes and encourages innova- tors and entrepreneurs, that protects them from second-guessers when they fail, that guards them in their "difference" from colleagues and coworkers, and that makes it clear that risk is welcome even when it leads to failure, and is still preferable to never trying or suggesting anything. 
Finally, upper level library managers must create an environment in which people care about the organization, in which they understand the importance of their own contribution, and in which they are given assign- ments with a relationship to the resources provided-assignments in which they have a chance to succeed. That is of course true in all organi- zational dynamics, but it is particularly true in settings such as libraries, in which such a climate does not automatically develop. This, as already noted, is because those outside the library with whom we deal expect little and least of all innovation and change. By and large, our users want us to continue what they already find comfortable, only they want more of it. Our bosses want the same thing, only they want it all at lower cost. It is an intolerable no-win strategy, and it can only be changed by individuals with the foresight to seek a better way, and the courage to fight for it. Intrapreneurs are revolutionaries. They cause trouble but they are worth it. Hewlett Packard, one of the organizations mentioned by Pinchot as supportive of entrepreneurship and innovation, goes so far as to issue a Medal of Defiance, awarded in recognition of extraordinary contempt and defiance beyond the normal call of duty, It is not at all certain that even Hewlett Packard, which encourages this process among its engineers, either welcomes or expects it from its librar- ians. As we already know from Drucker, librarians are not judged by their accomplishments, only by their budgets. Because it is for us a no-win situation, we have to change it in our dealings with those who fund us. However, we have to do more. We have to attract innovative and entre- preneurial individuals into our profession, we have to hire them knowing that they will make trouble and protect them when they do. Obviously, only if their contribution turns out to be constructive. However, our 1980s application of outmoded 1960s management values which concen- trate not on accomplishment but on getting along within an overall group model, a team approach whose theme is consensus, compromise, con- formity and comfort, will quite effectively weed out whatever entrepre- Herbet? S. White 27 neurs we might attract before they can even begin to make a difference. We must have the discipline to change this model ourselves. We have no choice. Our present system of management structure and decision distri- bution is safe and comfortable, but it doesn't work. It would not work in a corporate profit center setting, either, but at lcast there we would find an executive like Thomas Watson, Jr. to worry about it. The discipline must come from within, because as already noted nobody else cares what we do, only how much we spend. Many in our user communities, of course, would just as soon we changed nothing that is already comfortable. That is unacceptable, because it trivializes our own professional role. Most of all, we must search for individuals who seek to destabilize the status quo and who are willing to take risks i n search of improvement and change. Not high risks. Calculated moderate risks, but risks all the same. Why? Because what we are doing now does not work and I do not think I need to prove that contention. In the final analysis, my old tennis coach is still a good management philosopher. Always change a losing game. NOTES 1. Strauch. Hclcna M. Enrreprenerrrship in ihe informarion indlrsrry. I n Oreers in informoiion, cditcd by Janc F. Spivack. 
White Plains, NY: Knowledge Industry Publications, 1982, pp. 73-101. 2. Pinchot, Gifford, III. Intrapreneuring: Why you don't have to leave the corporation to become an entrepreneur. New York: Harper & Row, 1985. 3. Nelton, Sharon. Finding room for the intrapreneur. Nation's Business, 72, Feb. 1984, pp. 50-. 4. Drucker, Peter F. Innovation and entrepreneurship. New York: Harper & Row, 1985. 5. White, Herbert S. Participative management is the answer: but what was the question? Library Journal 110 (no. 13), Aug. 1985, pp. 62-63. 6. Galvin, Thomas J. Maxims for managerial survival in tough times. Conference handout available from the author. 7. Hedberg, Bo, and others. Camping on seesaws: prescriptions for a self-designing organization. Administrative Sciences Quarterly 21(1), pp. 41-65, 1976. 8. White, Herbert S. Bjorn Borg and the library materials budget. Information and Library Manager 1, June 1981, pp. 3-4. yoon-what-2020 ---- What is the difference between Azure Security Center and Azure Sentinel? | by John Yoon | The Cloud Builders Guild | Medium John Yoon Feb 22, 2020 · 5 min read Azure Security Center vs Azure Sentinel Many cloud architects and engineers find it difficult to grasp the difference between Azure Security Center (ASC) and Azure Sentinel. Both products look quite similar at first glance, and both are offered by Microsoft to secure your Azure infrastructure. Moreover, in all of Microsoft's cybersecurity reference designs these products work shoulder-to-shoulder. There are several main reasons for this confusion: the historical set of functionality that both products offer, the complementary functions they perform, and, most importantly, the fact that they share a subset of functionality in the cybersecurity activities life-cycle. End-to-end cybersecurity cycle. The picture above represents a high-level sequence of activities happening in a typical Security Operations Center (SOC). Both ASC and Sentinel play a significant part in some of these activities. Azure Security Center plays a vital role in the "Collect" and "Detect" stages, while Azure Sentinel, in addition to those first two, is also designed to perform the "Investigate" and "Respond" roles. To understand the differences, we shall look deeper into both offerings. Azure Security Center is a unified infrastructure security management system that strengthens the security posture of your data centers and provides advanced threat protection across your hybrid workloads in the cloud — whether they're in Azure or not — as well as on-premises. Azure Security Center addresses the three most urgent security challenges: Rapidly changing workloads — It's both a strength and a challenge of the cloud. On the one hand, end-users are empowered to do more. On the other, how do you make sure that the ever-changing services people are using and creating are up to your security standards and follow security best practices? Increasingly sophisticated attacks — Wherever you run your workloads, the attacks keep getting more sophisticated. You have to secure your public cloud workloads, which are, in effect, Internet-facing workloads that can leave you even more vulnerable if you don't follow security best practices.
Security skills are in short supply — The number of security alerts and alerting systems far outnumbers the number of administrators with the necessary background and experience to make sure your environments are protected, and staying up-to-date with the latest attacks is a constant challenge in a security landscape that never stands still. To help you protect yourself against these challenges, Security Center provides you with the tools to: Strengthen security posture: Security Center assesses your environment and enables you to understand the status of your resources and whether they are secure. Protect against threats: Security Center assesses your workloads and raises threat prevention recommendations and threat detection alerts. Get secure faster: In Security Center, everything is done at cloud speed. Because it is natively integrated, deployment of Security Center is easy, providing you with auto-provisioning and protection for Azure services. Microsoft Azure Sentinel is a scalable, cloud-native security information and event management (SIEM) and security orchestration, automation, and response (SOAR) solution. Azure Sentinel delivers intelligent security analytics and threat intelligence across the enterprise, providing a single solution for alert detection, threat visibility, proactive hunting, and threat response. Azure Sentinel core capabilities: Collect data at cloud scale across all users, devices, applications, and infrastructure, both on-premises and in multiple clouds. Detect previously undetected threats, and minimize false positives using Microsoft's analytics and unparalleled threat intelligence. Investigate threats with artificial intelligence, and hunt for suspicious activities at scale, tapping into years of cybersecurity work at Microsoft. Respond to incidents rapidly with built-in orchestration and automation of common tasks. Azure Sentinel performs more roles, including hunting, automated playbooks, and incident response, as well as assistance with manual incident investigations. On the other hand, Azure Security Center is a great source of recommendations, alerts, and diagnostics that can be utilised by Azure Sentinel to provide even better analytics and incident alerts. Therefore, both products should be used in a well-architected SOC; they are highly complementary and can be easily enabled thanks to the great out-of-the-box integration. Below is an illustration of the entire process and where Azure Sentinel and ASC play their roles. Security Center is one of the many sources of threat protection information that Azure Sentinel collects data from to create a view for the entire organization. Microsoft recommends that customers using Azure use Azure Security Center for threat protection of workloads such as VMs, SQL, Storage, and IoT; in just a few clicks, they can connect Azure Security Center to Azure Sentinel. Once the Security Center data is in Azure Sentinel, customers can combine that data with other sources like firewalls, users, and devices, for proactive hunting and threat mitigation with advanced querying and the power of artificial intelligence. To reduce confusion and simplify the user experience, two of the early SIEM-like features in Security Center, namely the investigation flow in security alerts and custom alerts, will be removed in the near future. Individual alerts remain in Security Center, and there are equivalents for both security alerts and custom alerts in Azure Sentinel.
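To make that integration a little more concrete, here is a minimal sketch (not from the original article) of how one might pull Security Center alerts out of a Sentinel workspace once the connector is enabled, using Python with the azure-identity and azure-monitor-query packages; the workspace ID is a placeholder, and the KQL follows the common Log Analytics schema, where ASC alerts land in the shared SecurityAlert table.

# A hedged sketch: query recent Azure Security Center alerts from the
# Log Analytics workspace behind Azure Sentinel. Assumes the ASC data
# connector is already enabled; WORKSPACE_ID is a placeholder.
from datetime import timedelta

from azure.identity import DefaultAzureCredential
from azure.monitor.query import LogsQueryClient

WORKSPACE_ID = "00000000-0000-0000-0000-000000000000"  # placeholder

# KQL: ASC alerts share the SecurityAlert table with alerts from other
# connected products, so we filter on ProductName.
QUERY = """
SecurityAlert
| where ProductName == "Azure Security Center"
| summarize AlertCount = count() by AlertSeverity, AlertName
| order by AlertCount desc
"""

def main() -> None:
    client = LogsQueryClient(DefaultAzureCredential())
    response = client.query_workspace(
        workspace_id=WORKSPACE_ID,
        query=QUERY,
        timespan=timedelta(days=7),  # look back one week
    )
    # A fully successful query returns tables of columns and rows.
    for table in response.tables:
        for row in table.rows:
            print(dict(zip(table.columns, row)))

if __name__ == "__main__":
    main()

The same SecurityAlert table is what lets analysts join ASC findings with firewall, user, and device data in a single hunting query, which is the "combine that data with other sources" step described above.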
Microsoft will continue to invest in both Azure Security Center and Azure Sentinel: Azure Security Center will continue to be the unified infrastructure security management system for cloud security posture management and cloud workload protection, while Azure Sentinel will continue to focus on SIEM. If you have any business or technology ideas or challenges that you would like to discuss, please post your questions, challenge my opinion, or send me a message. John Yoon, Cloud Solution Architect, The Cloud Builders Guild. matienzo-lighting-2020 ---- Lighting the Way: A Preliminary Report on the National Forum on Archival Discovery and Delivery Mark A. Matienzo, Dinah Handel, Josh Schneider, Camille Villa Stanford University Libraries November 2020 This project was made possible in part by the Institute of Museum and Library Services, through grant LG-35-19-0012-19. The views, findings, conclusions or recommendations expressed in this publication do not necessarily represent those of the Institute of Museum and Library Services. This work is licensed under a Creative Commons Attribution 4.0 International License (https://creativecommons.org/licenses/by/4.0/).
Table of contents Executive summary Acknowledgements Project background Key concepts Primary audiences and principles Project activities and goals Forum design and structure The application process, response rate, and travel funding Forum conceptual background and overview Day 1 (February 10, 2020) Day 2 (February 11, 2020) Day 3 (February 12, 2020) Evaluation and analysis Retrospective and facilitator reflection Participant feedback survey Emerging themes Discussion and next steps Scope and focus Participation and community engagement Facilitation and structure of meetings and activities Written contributions Next steps Appendices Application form Community Agreements and Code of Conduct Lighting the Way Forum Playbook 25/10 Crowd Sourcing Ideas Anonymized Who/What/When Matrix actions Feedback survey questions Quantitative feedback summary Executive summary Stanford Libraries hosted Lighting the Way: A National Forum on Archival Discovery and Delivery, which focused on information sharing and collaborative problem solving around improving discovery and delivery for archives and special collections. Archival discovery and delivery is how the project describes what people and systems do to support finding, accessing, and using material from archives and special collections. Systems include not just software, but also workflows, paper forms, standards, and more. The meeting had 71 participants drawn from multiple disciplines and job functions in the archives, library, and technology sectors. Participants were selected through an application process, which received over 200 submissions. Participants primarily came from US academic research libraries, but also included participants from government archives, tribal archives, software vendors, and museums. Approximately 300 remote attendees joined the livestreamed portion of the Forum. The Forum, held over two and a half days, included a series of plenary presentations in the first half of the first day, and a series of facilitated activities held over the rest of the event. The presentations and facilitated activities were intended to highlight both successes and ongoing challenges faced by participants and their institutions in terms of archival discovery and delivery. The presentations and activities were also designed to serve the goals of the Forum, which included 1) allowing participants to see, map, and build connections – between one another, their work, the systems they rely on, and the communities they serve; 2) organizing around shared opportunities and challenges, identified by participants during group activities; and 3) providing a platform for engagement with the project. Participants used idea generation activities focused on eliciting these opportunities and challenges, as well as potential areas for further work. Many ideas emerged about potential projects and initiatives for both participants and the project to carry forward, but the most consistent themes centered around the ongoing need for a community focused on archival discovery and delivery.
Additional areas of interest included removing barriers and addressing cultural issues with archives and special collections, developing a national "virtual reading room" for controlled online access to digital materials, developing a community of practice around user studies in archives, incorporating decolonial and anti-racist practices, and contributing to shared technical infrastructure. Participants were provided opportunities to give structured feedback through both daily retrospectives during the Forum and a feedback survey distributed after the event's conclusion. While participants were overall satisfied with the Forum and saw it as a space for productive conversations to emerge, there were also several areas identified for improvement. Participants appreciated the breadth of the Forum's focus, although this led to some confusion around the scope and focus of the Forum and the overall project. Following the event, the project team synthesized the activities, outputs, and feedback from the Forum into this written report, shared with the participants and the broader archives community. The project is recentering its efforts and will host an online working meeting and asynchronous activities in Spring 2021 focused on collaborative writing and in-depth exploration of topics and themes raised in the Forum. Acknowledgements The project team would like to acknowledge the Institute of Museum and Library Services, whose support made this project possible, including Senior Library Program Officer Ashley Sands, who has provided essential guidance throughout the project. We also thank Stanford University Libraries leadership, namely Michael A. Keller, Tom Cramer, and Roberto Trujillo, for their administrative backing for the project. In addition, the project team wishes to acknowledge Franz Kunst and Sally DeBauche for their contributions to the project's foundational research which informed the Forum's design, and Supavadee Kiattinant, who supported the Forum's administrative and logistical needs. We have been lucky to receive tremendous support from our participant-advisors: Amelia Abreu, Hillel Arnold, Elvia Arroyo-Ramírez, Dorothy Berry, Max Eckard, Amanda Ferrara, Geoff Froh, Julie Hardesty, Linda Hocking, Sara Logue, Sandra Phoenix, Gregory Wiedeman, and Audra Eagle Yun. We would like to acknowledge Hillel for his assistance in planning and designing the Forum and its activities, and Max and Dorothy for their assistance in reviewing forum applications. Finally, we wish to thank the individuals listed below who participated in the Forum and provided feedback on earlier versions of this report. The success of the Forum, and the project's future activities, would not have been possible without their active participation. List of participants and facilitators (* facilitator; † notetaker; ‡ project participant-advisor) Amelia Abreu, UX Night School‡ Valerie Addonizio, Atlas Systems Sean Aery, Duke University Carla O. Alvarez, University of Texas Libraries Krystal Appiah, University of Virginia Hillel Arnold, Rockefeller Archive Center*†‡ Elvia Arroyo-Ramírez, UC Irvine*‡ Anne Bahde, Oregon State University Erin Baucom, The University of Montana - Missoula Stephanie Becker, Case Western Reserve University Dorothy Berry, Houghton Library, Harvard University*‡ Elena Colón-Marrero, Computer History Museum Rose Chiango, Philadelphia Museum of Art Greg Cram, The New York Public Library Katherine Crowe, Nat'l.
Anthropological Archives, Smithsonian Institution Michelle Dalmau, Indiana University* Kira Dietz, Virginia Tech Sally DeBauche, Stanford University Libraries Kate Donovan, Harvard Library Sarah Dorpinghaus, University of Kentucky Libraries Max Eckard, University of Michigan Bentley Historical Library*‡ Glynn Edwards, Stanford University Libraries Danielle Emerling, West Virginia University Amanda Ferrara, Princeton University†‡ Tanis Franco, University of Toronto Scarborough Geoff Froh, Densho‡ Katie Gillespie, Atlas Systems Kevin Glick, Yale University Library, Manuscripts & Archives Sara Angela Guzman, Tohono O'odham Cultural Center & Museum Wendy Hagenmaier, Georgia Tech Dinah Handel, Stanford University Libraries* DeLisa Minor Harris, Fisk University Aaisha Haykal, College of Charleston Michelle Herman, Archives of American Art, Smithsonian Institution Linda Hocking, Litchfield Historical Society†‡ Shane Huddleston, OCLC Noah Huffman, Duke University Carrie Hintz, Emory University Nancy Kennedy, Smithsonian Institution Emily Lapworth, University of Nevada, Las Vegas Charlie Macquarie, University of California, San Francisco Jenny Manasco, American Baptist Historical Society Mark A. Matienzo, Stanford University Libraries* Anna McCormick, New York University Giordana Mecagni, Northeastern University Archives & Special Collections Daisy Muralles, formerly UCSB Library (now CSU East Bay) Lori Myers-Steele, Berea College Lisa Nguyen, Hoover Institution Library & Archives, Stanford University Donovan Pete, Indigenous Digital Archives Christie Peterson, Smith College Special Collections*† Kim Pham, University of Denver Sandra Phoenix, HBCU Library Alliance‡ Chris Powell, University of Michigan Library Genevieve Preston, San Bernardino County Historical Archives Merrilee Proffitt, OCLC Caitlin Rizzo, Pennsylvania State University T-Kay Sangwand, UCLA Josh Schneider, Stanford University* Bethany Scott, University of Houston Libraries Sarah Seestone, Stanford University*† Heather Smedberg, UC San Diego Library Trevor R Thornton, NC State University Libraries Althea Topek, formerly Tulane University (now Smith College) Anna Trammell, Pacific Lutheran University Adrian Turner, California Digital Library Amanda Whitmire, Stanford University Libraries Amy Wickner, University of Maryland Audra Eagle Yun, UC Irvine*‡ Camille Villa, Stanford University Libraries* Jennifer Vine, Stanford University Libraries Greg Wiedeman, University at Albany, SUNY*‡ Project background Lighting the Way, facilitated by Stanford University Libraries and funded by the Institute of Museum and Library Services, is convening a series of meetings focused on improving discovery and delivery for archives and special collections. Through its activities, meetings, and deliverables, the project is intended to engage stakeholders and experts including archives, library, and technology workers. The meetings are intended to build consensus around strategic and technical directions to improve user experience, access, and interoperability across user-facing discovery and delivery systems for archives, and to provide a model for values-driven technology work within cultural heritage.
While archivists, librarians, and technologists have begun to explore and understand how effective systems integration impacts their work, most integration efforts have tended to focus on the integration of staff-facing systems like collection management systems, repositories, and digital preservation infrastructure. Based on Stanford Libraries' work on the ArcLight archival discovery platform,2 and informed by work across the library, archives, and technology sectors, we believe there is an opportunity to gain a broader and more in-depth understanding of how networks of people and technology impact archival discovery and delivery, and to develop a forward-looking agenda describing an ethical, equitable, sustainable, and well-integrated future for archives and special collections. Key concepts Archival discovery and delivery is how we describe what people and systems do to support finding, accessing, and using material from archives and special collections. Systems include software, workflows, paper forms, standards, and more. We also often refer to software that specifically serves these functions as "front-end systems," which include those supporting search and presentation of archival description, delivery and presentation of digital objects, request management systems, and interpretation and crowdsourcing. Part of the broader challenge is to determine how to effectively integrate all those systems to work together as a coordinated whole, which serves as the fundamental area of focus for Lighting the Way. Integration is the use of processes or tools to join these systems so that they work together as a coordinated whole, providing a "functional coupling" between systems. Inadequate integration for archival discovery and delivery not only impacts researchers, but can also impact archives, library, and technology workers responsible for those functions and systems. Integration also requires close collaboration across job roles and responsibilities, departments, and institutions, and thus fundamentally relies on people as well. 2 "ArcLight," Stanford University Libraries, accessed May 3, 2020, https://library.stanford.edu/projects/arclight. Primary audiences and principles Participants in the project represent the primary audiences and stakeholders for the project across multiple disciplines and job functions both within and outside the context of archives and libraries, in three complementary and inter-reliant groups: ● Archives, special collections, and other library workers, across job functions (e.g. technical services, public services and reference, metadata management, digital collections, and administration), position classification (e.g. support staff, credentialed professional), and type of institution (e.g. academic, public libraries, museums, historical societies, government archives, tribal archives). ● Technology workers, across job functions (e.g. software developers, user experience designers, product managers, systems architects, etc.), position classification, and type of institution (e.g. archives- or library-affiliated, vendors, service providers, consortia, open source software communities). ● People with interest or expertise in legal and ethical issues in archives/special collections, across areas of focus (e.g. intellectual property, inclusive description, cultural sensitivity, risk management, and open access).
This audience definition helps ensure that the project and its meetings remain focused on the needs and experience of practitioners across these categories. In addition, our project remains focused on providing opportunities for deeper collaboration and conversation between archives workers and technology workers. We recognize that archival discovery and delivery is supported by a wide range of responsibilities and kinds of expertise, across institutional contexts, levels of resourcing, and the types of communities we serve. We also recognize that people may be discouraged or excluded from these conversations, both within their institution and in larger community settings, based on their identity or systemic issues. To this end, we have established a core set of principles for the project: ● We believe everyone from our core audiences has something to contribute; not everyone needs to be a self-identified expert. ● We focus on shared and holistic concerns and recommendations, rather than focusing on specific technologies or tools. ● We enable the adaptability of recommendations across contexts, communities, and levels of resourcing. ● We develop recommendations consciously as an inclusive expression of professional ethics and values. To be truly transformational, our work must be conducted in a space that acknowledges the power dynamics of bringing together workers across professional contexts, roles, and job classifications, acknowledging institutional privilege, and the lack of representation of marginalized people within the archives, library, and technology sectors. The Lighting the Way project is committed to providing a productive, inclusive, and welcoming environment for discussion and collaboration about archival discovery and delivery. All participants are expected to follow our Community Agreements and Code of Conduct, including project staff, advisors, event participants, and other contributors. This document has been included as an appendix to this report. Project activities and goals The project has two major categories of activities: meetings, and research and communications. ● Meetings include the Forum (the topic of this report) and the working meeting (a smaller planned meeting of 25 people, focused on collaborative writing). ● Research and communications include foundational research (undertaken before the Forum to provide background), the integration handbook (containing short case studies about current or planned archival discovery and delivery efforts, position papers, and other written contributions from project participants), a statement of principles (generated in the working meeting), a white paper (summarizing the overall project), and peer-reviewed articles and presentations. These activities support the following project goals: ● Map the ecosystem supporting archival discovery and delivery. This will allow us to better understand the purpose of systems, software, and standards, why they need to work together, how people work together to implement them, how better integration of these systems might be achieved, and challenges for this work. This activity is supported by the foundational research before the Forum, as well as the Forum and working meeting. ● Develop both conceptual and actionable recommendations for technical, ethical, and practical concerns related to archival discovery and delivery and integration.
This is intended to provide high-level guidance, informed by comparable statements of principles, such as the revised DACS principles3 and the DLF Born-Digital Access Working Group's Levels of Access4 and Access Values documents.5 This will be supported by the integration handbook, the white paper, the statement of principles, and the Forum and working meeting. ● Build a shared understanding between archives and technology workers undertaking this work. This is intended to allow for effective collaboration between these groups. This is primarily supported by the Forum and working meetings, and secondarily by research and communications. ● Activate a diverse group of project participants to adopt the recommendations and findings developed during the project across institutional contexts, capacities, and software platforms. This is intended to encourage a broad community to engage with the project and provide us feedback on the recommendations of the project. This is primarily supported by the Forum and working meetings, and secondarily by research and communications. 3 "Statement of Principles," Describing Archives: A Content Standard, Version 2019.0.3. Society of American Archivists, 2019. https://saa-ts-dacs.github.io/dacs/04_statement_of_principles.html. 4 Elvia Arroyo-Ramírez, Kelly Bolding, Danielle Butler, Alston Cobourn, Brian Dietz, Jessica Farrell, Alissa Helms, Kyle Henke, Charles Macquarie, Shira Peltzman, Camille Tyndall Watson, Ashley Taylor, Jessica Venlet, and Paige Walker, "Levels of Born-Digital Access," February 2020, https://doi.org/10.17605/OSF.IO/R5F78. 5 Digital Library Federation Born-Digital Access Working Group, "Access Values," Version 1, https://doi.org/10.17605/OSF.IO/ED7VK. Forum design and structure Lighting the Way: A National Forum on Archival Discovery and Delivery ("the Forum") was designed to support information sharing and collaborative problem solving around archival discovery and delivery. The goals for the Forum were: ● To allow participants to see, map, and build connections – between one another, their work, the systems they rely on, and the communities they serve. ● To identify and organize around shared opportunities and challenges, identified by participants during group activities. ● To provide a platform for engagement with the project, leading to participation in other project activities (e.g. attending the working meeting or contributing to written products like the integration handbook). The application process, response rate, and travel funding As the first of the two project meetings, the Forum served as the beginning point of engagement for most participants in the project. While participant advisors and other project supporters were identified early in the project lifecycle, the Forum was intended as an essential form of outreach. Realizing there was likely wide interest in the project, the project team prepared a call for participation and application process, informed by previous IMLS National Forum Grant projects, such as Always Already Computational: Collections as Data6 and the National Web Privacy Forum.7 The project team created an application using the Qualtrics survey platform, intended to gather information about prospective participants, their responsibilities, their work related to archival discovery and delivery, and successes and challenges therein.
6 Thomas Padilla, Laurie Allen, Hannah Frost, Sarah Potvin, Elizabeth Russey Roke, and Stewart Varner, "Always Already Computational: Collections as Data. Final Report," May 22, 2019. https://doi.org/10.5281/zenodo.3152935. 7 Scott W.H. Young, Sara Mannheimer, Jason A. Clark, and Lisa Janicke Hinchliffe, "A Roadmap for Achieving Privacy in the Age of Analytics: A White Paper from A National Forum on Web Privacy and Web Analytics," Montana State University Library, May 1, 2019. http://doi.org/10.15788/20190416.15445. Given the project's focus on equity and inclusion, the application asked prospective participants whether they receive travel support from their employer, whether they would need travel support to participate in the Forum, whether they identified as a member of any underrepresented or marginalized populations, and whether their work directly or indirectly supports underrepresented or marginalized populations. Applicants were not asked to self-disclose any further information about marginalized aspects of their identity. Applications were evaluated by the Project Director and two of the project's participant advisors against a rubric of up to 25 points: ● Level of engagement with the project (up to 4 points), used to identify participant advisors, authors of letters of support, invited guests, or referrals from the project team or participant advisors; ● Match with defined project audience (up to 4 points), used to ensure that the event remained practitioner-focused; ● Whether the applicant was a member of an underrepresented or marginalized population (up to 2 points); ● Whether the applicant's work directly or indirectly supports an underrepresented or marginalized population (up to 2 points); ● Whether the applicant could attend regardless of their access to funding (up to 4 points); ● The depth of their answers about their work in relation to archival discovery and delivery and successes and challenges therein (up to 4 points); ● Whether the applicant was willing to present or write about the work they described (up to 2 points); ● Whether the applicant provided any actionable feedback or referrals (up to 3 points). Applications were open between November 13 and December 15, 2019, with a total of 422 applications received (both complete and incomplete), and 203 complete applications that were evaluated by the reviewers. While the Forum was originally envisioned as a 2.5-day meeting of up to 50 participants, with 30 fully funded participants, the high response rate led the project team to expand the Forum to a total of 71 participants, including facilitators. 42 participants (59.15%) received travel funding of some form, including lodging costs paid directly by Stanford and reimbursements for actual or per diem travel costs, with an average of $1,130.65 offered per funded participant, and an average of $1,055.80 in paid participant support costs per funded participant (either directly charged to Stanford or reimbursed). Forum conceptual background and overview The Forum used a mix of plenary presentations and facilitated breakout activities to achieve the Forum goals. The Forum was notably influenced by the work of the Montana State University Library's National Web Privacy Forum,8 which used collaborative design exercises to allow participants to help actively shape their project's agenda.
During both project conception and planning, the Lighting the Way project team wanted to leverage proven exercises and techniques drawn from existing methodologies used in ideation sessions for human-centered design,9 as we believed that this could serve as a useful resource to others interested in facilitating such a meeting. Ideation sessions rely heavily on the use of lateral thinking, a concept developed by Edward de Bono, which allows indirect and creative approaches to problem solving to arise through disrupting constraining thought patterns.10 Ideation sessions also often have a natural "flow" to them, following a process that guides participants through three modes of thinking: divergent thinking (generating a large number of ideas), emergent thinking (building from and upon past ideas), and convergent thinking (sorting, clustering, and evaluating ideas).11 Collaborative activities were drawn from two primary sources: Liberating Structures,12 a framework of facilitation techniques that allow for distributed control, developed by Keith McCandless and Henri Lipmanowicz, and Gray, Brown, and Macanufo's Gamestorming,13 which leverages the "divergent/emergent/convergent" model. Both frameworks were chosen because of their use of engaging activities that could center the expertise of the Forum participants and maximize participation by using a variety of communication methods and modes. The activities also allowed the project team and facilitators to structure activities around groups of varying sizes, allowing time for individual reflection, small group discussion, and larger interactions between groups and across the entire Forum. The Forum was held across 2.5 days, from February 10-12, 2020 at the Bechtel Conference Center, Encina Hall, on the Stanford University campus. Day 1 focused on context-setting presentations and divergent activities intended to set the stage and develop Forum themes. Presentations held in the first half of Day 1 were also livestreamed using a Zoom webinar and YouTube Live, and were recorded. Day 2 focused on emergent activities, intended to support participants in examining, exploring, and experimenting in the problem space. Day 3 focused on convergent activities intended to move participants towards conclusions, decisions, and both individual and collective action. 8 ibid. 9 See, for example, LUMA Institute, Innovating for People: Handbook of Human-Centered Design Methods (Pittsburgh: LUMA Institute, 2012). 10 Rikke Friis Dam and Yu Siang Teo, "Understand the Elements and Thinking Modes That Create Fruitful Ideation Sessions," The Interaction Design Foundation, February 11, 2018. https://www.interaction-design.org/literature/article/understand-the-elements-and-thinking-modes-that-create-fruitful-ideation-sessions. 11 ibid. 12 Henri Lipmanowicz and Keith McCandless, "Liberating Structures - Introduction," Liberating Structures, accessed May 3, 2020. http://www.liberatingstructures.com/. 13 Dave Gray, Sunni Brown, and James Macanufo, Gamestorming: A Playbook for Innovators, Rulebreakers, and Changemakers. Sebastopol, CA: O'Reilly, 2010. Participants were seated at 10 round tables facing a stage on one side.
For much of the three days, participants could sit where they pleased, but were encouraged to sit with people they did not know. While some of the Forum activities took place at the level of these tables or groups, many of the exercises took place across affinity groups, and several included physical movement or interaction with physical artifacts. In the second half of Day 2, tables were aligned with participant affinity groups based roughly on job function, including "Technical Services," "Heads/Managers/Leadership," "Digital Collections," "Developers/UX/IT/Product Managers," and "Public Services." Participants were provided with Participant Packets in advance of the meeting, which included a Local Map, Forum Schedule, Community Agreements and Code of Conduct, Project Overview, Participant Preparation Guide, Reimbursement Instructions, and List of Participants. A total of 15 facilitators and notetakers were selected from the project team, participant advisors, and a select number of Forum participants who had self-identified as being willing to volunteer for these roles. Facilitators helped to keep the conversation going, determined how to help people stay engaged, helped answer questions about specific activities, watched for and responded to difficult social dynamics, and made sure that everyone participated and was heard. Facilitators included four primary facilitators, who organized the days and led specific activities, as well as one or more table facilitators assigned to each table. Notetakers were asked to help document the Forum and its activities, and to assist facilitators in making space for relationship-building among the participants. Facilitators and notetakers received access to a Forum "playbook," which included detailed logistical information about the event and schedule, each activity, supplies needed, and a basic script to follow for the day. A lightly redacted version of the playbook is included as an appendix to this report. Day 1 (February 10, 2020) Overview The first day focused on activities which introduced participants to each other, and enabled and encouraged divergent thinking. In the "divergent/emergent/convergent" model, the first day set the stage for subsequent discussions: The more ideas you can get out in the open, the more you will have to work with in the next stage. The opening is not the time for critical thinking or skepticism; it's the time for blue-sky thinking, brainstorming, energy, and optimism. The keyword for opening is "divergent": you want the widest possible spread of perspectives; you want to populate your world with as many and as diverse a set of ideas as you can.14 During breakfast, participants were encouraged to introduce themselves to everyone at their table, with sample icebreaker or introductory questions provided by the facilitators. Trading Cards To begin the first day, participants were asked to create a trading card for themselves to share with others at their table. The Trading Cards activity15 was intended to serve as an icebreaker, introducing participants to their tablemates and generating discussion, while avoiding the large-scale, often depersonalizing, and time-consuming whole-room introductions that might otherwise serve this purpose. A participant's trading card consisted of a drawing or representation of themself, along with a few words that described the kind of work they did, and why they were excited about the Forum. 14 Gray, Brown, and Macanufo, "What is a Game?" Gamestorming, 11.
At each table, these cards were passed around, and tablemates wrote questions on the backs of the cards. Then, the table passed each trading card back to its creator, and each person took a turn answering a question on their card. Plenary presentations and themes After the initial icebreaker, the Forum's public program was livestreamed via Zoom webinar and YouTube Live and recorded. Following the Forum, recordings of the presentations and slide decks were made available under a Creative Commons Attribution-NonCommercial-NoDerivatives (CC BY-NC-ND) license via the Stanford Digital Repository.16 The program began with welcome announcements by the Project Director and Associate University Librarian, followed by an acknowledgement of IMLS funding support, a land acknowledgement, review of the core principles, Community Agreements and Code of Conduct, and associated logistical announcements. Introductory remarks were rounded out by a presentation from the project team, which identified team members; described project concepts, audiences, goals, activities, and related research; provided a high-level overview of the "divergent/emergent/convergent" model that would guide the Forum; and ended with a review of the Forum schedule. The project team presentation was followed by plenary presentations, organized into four themes: The Evolving Systems Ecosystem; Networks and the Big Picture; Ethical, Legal, and Cultural Concerns; and Impacts on Public Services and Outreach. Each theme had its own question and answer session to engage the participants. ● The Evolving Systems Ecosystem: What software and other systems do we use to make archival discovery and delivery possible, and how is that changing within institutional contexts? Presenters included Trevor Thornton (North Carolina State University), Lori Myers-Steele (Berea College), Anna Trammell (Pacific Lutheran University), and Kim Pham (University of Denver). ● Networks and the Big Picture: What issues are impacting archives and libraries at the level of the sector, consortia, or beyond, related to discovery and delivery? Presenters included Adrian Turner (California Digital Library) and Merrilee Proffitt (OCLC Research). ● Ethical, Legal, and Cultural Concerns: How have factors like privacy, cultural protocols, copyright, and others impacted our ability to address archival discovery and delivery, on a technical, operational, or strategic level? Presenters included Amanda Whitmire (Miller Library, Hopkins Marine Station, Stanford University), Tanis Franco (University of Toronto Scarborough), T-Kay Sangwand (University of California, Los Angeles), and Greg Cram (The New York Public Library). ● Impacts on Public Services and Outreach: How does archival discovery and delivery fit within the front-line work of library and archives workers focused on reference, outreach, public service, and community needs? 15 Dave Gray, "Trading Cards," Gamestorming (blog), January 27, 2011, https://gamestorming.com/trading-cards/. 16 Stanford University Libraries, Lighting the Way: A National Forum on Archival Discovery and Delivery, February 10, 2020. Presentations. https://searchworks.stanford.edu/view/jp429tw5870.
Presenters included Genevieve Preston (San Bernardino County Historical Archives), Daisy Muralles (University of California, Santa Barbara), Heather Smedberg (University of California, San Diego), and Sara Guzman (Himdag Ki - Tohono O'odham Nation Cultural Center & Museum). The livestreamed public portion of the Forum formally concluded with remarks by the Project Director, followed by lunch. Mad Tea The first activity following lunch, Mad Tea,17 was intended to build energy, foster creativity, and further introduce participants to one another while articulating shared concerns around archival discovery and delivery. Participants were invited to finish a list of open-ended sentences that were projected on the screen and read aloud by the facilitator. Participants formed two concentric circles and completed the sentences with rotating partners. There were 17 questions, with 30 seconds allotted for each exchange. Questions included: 1. What first inspired me in this work is… 2. Something we must learn to live with is… 3. An uncertainty we must creatively adapt to is… 4. What I find challenging in our current situation is… 5. Before we make our next move, we cannot neglect to... 6. Something we should stop doing (or divest) is… 7. What I hope can happen for us in this work is… 8. A big opportunity I see for us is… 9. If we do nothing, the worst thing that can happen for us is… 10. A courageous conversation we are not having is… 11. An action or practice helping us move forward is… 12. A project that gives me confidence we are transforming is… 13. Something we need to research is… 14. A bold idea I recommend is… 15. A question that is emerging for me is… 16. When all is said and done, I want to... 17. Something I plan to do is... Participants were then asked to spend 10 minutes answering the following questions by themselves: 1. What is the deepest need for my / our work? 2. What is happening around me / us that demands creative adaptation? 3. Where am I / are we starting, honestly? 4. Given my / our purpose, what seems possible now? 5. What paradoxical challenges must I / we face down to make progress? 6. How am I / are we acting our way forward toward the future? Participants were then asked to spend a few additional minutes reflecting on the question: What are the biggest opportunities we have in terms of improving archival discovery and delivery? Afterwards, they were invited to spend 5 minutes reflecting with one or two people at their table about their answers to this question and identifying other opportunities. After those 5 minutes had passed, they were asked to spend 5 additional minutes sharing answers and generating additional ideas at their tables. Finally, each table had 90 seconds to share one opportunity and one challenge that stuck out in their discussion to the rest of the Forum attendees. 17 "Mad Tea (v1.2): Rearrange the context for taking action," Liberating Structures, accessed May 3, 2020, http://www.liberatingstructures.com/mad-tea/. Speedboat The next activity designed to encourage divergent thinking was Speedboat,18 which expanded on the previous exercise by identifying key drivers and hindrances in the problem space of archival discovery and delivery.
Participants were provided with Post-It easel pads and markers on which they were instructed to draw a boat, "the good ship Archival Discovery and Delivery." Participants were then asked to spend 15 minutes on their own adding "sails" and "anchors" to that drawing: sails representing strengths or supports that provide forward momentum, and anchors representing challenges and obstacles that slow it down. Participants then spent a further 15 minutes discussing all the sails and anchors that their table had added as a group, considering whether any of the sails or anchors seemed more significant than others, and adding additional sails or anchors identified as part of that discussion. Tables then spent an additional 15 minutes presenting their drawings to the other participants, considering as a group the extent to which sails and anchors aligned across tables, and especially which sails and anchors seemed directly oppositional. Speedboat diagrams were photographed and affixed to the walls of the room for review by all attendees. This exercise was selected in part to inform TRIZ, an activity in Day 2, which asked attendees to consider how we might be unknowingly perpetuating these anchors and others in our day-to-day work. 18 Dave Gray, "Speed Boat," Gamestorming (blog), April 5, 2011, https://gamestorming.com/speedboat/. Low-Tech Social Network The final exercise of Day 1 was the Low-Tech Social Network,19 which continued to encourage divergent thinking by asking participants to consider the social networks represented in the room, as well as the individuals and other network nodes that were not represented. Participants either reused their trading card from earlier in the day, or created a new one which included keywords describing their interests and affiliations. They were then asked to "upload" their profiles to the social network by taping their trading cards to a sheet of easel paper. With guidance from facilitators, participants identified the contours of the network at their tables by spending 10 minutes drawing connections such as "friends with," "works with," "went to school with," etc. Participants were then encouraged to further elaborate and build out these connections through discussions, first with their tablemates, and later with a neighboring table. The easel paper for each augmented network was then taped to the wall, and participants were asked to spend 15 minutes reviewing the social networks of the other tables, reflecting on who else they were connected to in the network, and considering which connections they found the most noticeable or striking. Finally, participants rejoined their tables and were invited to share anything with the room that stood out to them about the social networks uncovered through the exercise. Low-tech social networks were photographed and affixed to the walls of the room for review by all attendees. 19 Dave Gray, "Low-Tech Social Network," Gamestorming (blog), January 27, 2011, https://gamestorming.com/low-tech-social-network/. Retrospective The day concluded with a retrospective, during which participants were encouraged to provide feedback to the facilitators, to inform the design and approach of remaining Forum activities. The specific retrospective technique chosen was the "4Ls,"20 through which participants reflected on and shared what they liked, learned, lacked, and longed for: ● Liked: What did you enjoy? What worked well?
● Learned: What did you discover? What new information did you get? ● Lacked: What was missing? What would have helped your participation go more smoothly? ● Longed For: What needed to exist but didn't? What would help the Forum be (more) successful? Participants spent 5 minutes developing their own responses, and 10 minutes sharing and compiling responses at their tables. Each table then took 1 minute to share highlights of these reflections with the rest of the room. After participants were released for the day and the Forum space was cleaned up and reset for the next day's activities, a facilitators' retrospective and discussion was held to reflect on the day's events and consider any modifications needed for the next day. An in-depth discussion of this feedback on Day 1 follows in the Evaluation and analysis section below. 20 Mary Gorman and Ellen Gottesdiener, "The 4L's: A Retrospective Technique," EBG Consulting (blog), June 24, 2010. https://www.ebgconsulting.com/blog/the-4ls-a-retrospective-technique/. Day 2 (February 11, 2020) Overview The second day focused on exploring themes and ideas from the divergent phase more deeply. The authors of Gamestorming write, "The keyword for the exploring stage is 'emergent': you want to create the conditions that will allow unexpected, surprising, and delightful things to emerge."21 To that end, the day's activities asked participants to reflect on context, barriers (actual and imagined), affinities, workplace agency, and bigger and bolder ideas. Before the first activity, facilitators began with introductory remarks that acknowledged two events that arose through national news: the Trump Administration's budget request for funds to support the orderly closure of IMLS, and the agency's response,22 and controlled blasting at the Organ Pipe Cactus National Monument, which desecrated burial sites and ancestral homelands of the Tohono O'odham Nation during construction of the border wall between the United States and Mexico.23 Facilitators also presented a recap of Day 1, the plan for Day 2, and reminders of the Forum goals and what the Forum design expected of participants. The day ended with a substantive participant retrospective, which influenced the Day 3 agenda. 21 Gray, Brown, and Macanufo, 11. 22 Institute of Museum and Library Services, "IMLS Statement on the President's FY 2021 Budget Proposal," February 10, 2020. https://www.imls.gov/news/imls-statement-presidents-fy-2021-budget-proposal. 23 BBC, "Native burial sites blown up for US border wall," February 10, 2020. https://www.bbc.com/news/world-us-canada-51449739. Trading Cards To begin the second day, participants were encouraged to sit at different tables than the day prior, and to remake a trading card for themselves. Otherwise, the activity followed the same structure as the first day. The activity was repeated to allow participants to reintroduce themselves to one another, and to build on connections created in the first day of the Forum.
Context Map Following the trading card warm-up exercise, participants at each table were guided through a Context Map exercise.24 Context mapping is intended to allow participants to build a systemic view of archival discovery and delivery, identifying factors and trends that impact it, like changes in technology as well as political and economic climates. First, the participants drew a context map composed of six sections: two sections for trends (the type of trend to be determined by the map-makers), political climate, economic climate, stakeholder needs, and uncertainties. Each table spent 60 minutes filling in the map, discussing as a group the two types of trends they wanted to highlight, and filling out each portion of the map. Maps were taped onto the wall so participants could view them on their own and compare their maps with those from other tables. 24 Dave Gray, "Context Map," Gamestorming (blog), October 27, 2010. https://gamestorming.com/context-map-2/. TRIZ Following a short break, a facilitator led the participants in an exercise called TRIZ,25 which is designed to identify counterproductive behaviors and unpack how they might be unknowingly perpetuated. Participants were asked to recall the "anchors" (things that may hold us back or weigh us down; barriers) identified during the Speedboat exercise on Day 1. Upon recalling the anchors, participants were then asked to imagine ways in which they could maximize these anchors, listing out undesirable and unwanted outcomes with regard to archival discovery and access. Example maximized undesirable and unwanted outcomes included: ● Negative economic and labor outcomes (making all funding temporary and competitive based on funder priorities; transitioning to gig employment; corporate sponsorship); ● Undermining user privacy (e.g. collecting biometric and medical data and selling it) and collaborating with law enforcement and intelligence; ● Design without user input; ● Accepting new collections without regard for the availability of resources necessary to preserve, describe, or provide access to them; ● Ending fair use, copyright exceptions, and limitations; ● Having a lack of bravery to make changes; ● Intensifying a focus on quantifiable measures of success only; and ● Continuing to pursue a technology because it took a lot to implement (sunk costs). After each table finished producing a list of unwanted outcomes, they were asked to consider actions or behaviors which enable those unwanted outcomes and assess ways in which these imagined behaviors might resemble actual behaviors. Discussion was held at the table level, and final lists of unwanted outcomes were shared with the room. Although this exercise provoked a great deal of laughter, it also emphasized how close some of the maximized imagined anchors were to real ones. 25 Henri Lipmanowicz and Keith McCandless, "Making Space with TRIZ," Liberating Structures, accessed May 3, 2020. http://www.liberatingstructures.com/6-making-space-with-triz/. Affinity Map After lunch, participants were asked to change tables and sit with their "affinity" group for the next exercise. Affinity groups were based loosely on the job titles that participants provided in their applications and were intended to be roughly the same size (double the size of the existing table-based groups). The Affinity Map exercise26 was an opportunity to identify and reflect on patterns and themes based on institutional roles. Participants were asked to brainstorm answers to two questions: ● What changes must take place to improve and enhance archival delivery and discovery? ● What are steps we can take to improve archival delivery and discovery? Each person spent 15 minutes brainstorming between 10-15 solutions or ideas on their own. Then, as a group, participants attempted to place similar ideas together into a cluster, without naming or quantifying the cluster. Once clustering of ideas was complete, the group then labeled each cluster. 26 Dave Gray, "Affinity Map," Gamestorming (blog), October 15, 2010. https://gamestorming.com/affinity-map/. At the end of the exercise, the groups shared among themselves what might have been surprising or unexpected about their solutions and groupings. 15% Solutions and 25/10 Crowd Sourcing The final activity of the day combined two distinct techniques drawn from Liberating Structures: 15% Solutions27 and 25/10 Crowd Sourcing.28 The two activities were intended to build on past activities by having participants consider both small steps they could take individually and big ideas that they wanted to pursue. 27 Henri Lipmanowicz and Keith McCandless, "15% Solutions," Liberating Structures, accessed May 3, 2020. http://www.liberatingstructures.com/7-15-solutions/. 28 Henri Lipmanowicz and Keith McCandless, "25/10 Crowd Sourcing," Liberating Structures, accessed May 3, 2020. http://www.liberatingstructures.com/12-2510-crowd-sourcing/. The activity started with 15% Solutions, which focused on revealing actions that all participants could take to move archival discovery and delivery forward, and encouraged participants to think positively. Participants were asked to reflect on where they individually had discretion and freedom to act, and what they could do without more resources or authority. Participants were given pens and notecards and were asked to generate lists of these "15% solutions" over the course of five minutes. After that, participants met in groups of three or four people to discuss their lists of solutions. Each participant was allowed 5 minutes to be the center of discussion, and during that time, the other group members were asked not to provide advice or critique, and instead to ask clarifying questions and offer encouragement. Following 15% Solutions, the activity transitioned into 25/10 Crowd Sourcing, which asked participants to think ten times bolder to generate and identify "big ideas" to improve archival discovery and delivery, and then to "peer review" these ideas. First, participants were asked to think of one big idea and a first step to set it in motion, and to write it on an index card without their name. Then, everyone in the room walked around, repeatedly exchanging index cards with other participants for 30 seconds. When a bell sounded, everyone read the card they had in hand and rated the strength and promise of the idea on a scale of 1 to 5, with 5 being the best. This process of "milling and passing" and then rating ideas was repeated 5 times. After the fifth rating, the primary facilitator asked the room to tally the total score on the cards, with the highest possible score being 25.
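For readers unfamiliar with the technique, the scoring arithmetic can be illustrated with a small simulation, which is not part of the report: each card accumulates five ratings between 1 and 5, so totals range from 5 to 25, and the facilitator then surfaces the top-scoring ideas. The card texts and random seed below are invented for illustration.

# A minimal sketch of 25/10 Crowd Sourcing scoring: five rounds of 1-5
# ratings per card, then the highest-scoring ideas are read out.
# Card texts and the seed are hypothetical.
import random

random.seed(20)  # hypothetical seed, for a repeatable example

cards = [
    "Build a shared virtual reading room",
    "Start a community of practice for user studies",
    "Contribute to shared technical infrastructure",
]

ROUNDS = 5  # each card is rated five times by different participants

# Simulate the five rounds of ratings; a real tally simply sums the
# handwritten ratings on the back of each card.
totals = {
    card: sum(random.randint(1, 5) for _ in range(ROUNDS)) for card in cards
}

# Read out ideas from the highest total downward (25, 24, 23, 22, ...).
for card, total in sorted(totals.items(), key=lambda kv: -kv[1]):
    print(f"{total:2d}/25  {card}")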
The Affinity Map exercise26 was an opportunity to identify and reflect on patterns and themes based on institutional roles. Participants were asked to brainstorm answers to two questions:

● What changes must take place to improve and enhance archival delivery and discovery?
● What are steps we can take to improve archival delivery and discovery?

Each person spent 15 minutes brainstorming 10 to 15 solutions or ideas on their own. Then, as a group, participants attempted to place similar ideas together into clusters, without naming or quantifying the clusters. Once the clustering of ideas was complete, the group labeled each cluster. At the end of the exercise, the groups shared among themselves what might have been surprising or unexpected about their solutions and groupings.

26 Dave Gray, “Affinity Map,” Gamestorming (blog), October 15, 2010. https://gamestorming.com/affinity-map/.

15% Solutions and 25/10 Crowd Sourcing

The final activity of the day combined two distinct techniques drawn from Liberating Structures: 15% Solutions27 and 25/10 Crowd Sourcing.28 Each activity was intended to build on past activities by having participants consider both small steps they could take individually and big ideas they wanted to pursue.

27 Henri Lipmanowicz and Keith McCandless, “15% Solutions,” Liberating Structures, accessed May 3, 2020. http://www.liberatingstructures.com/7-15-solutions/.
28 Henri Lipmanowicz and Keith McCandless, “25/10 Crowd Sourcing,” Liberating Structures, accessed May 3, 2020. http://www.liberatingstructures.com/12-2510-crowd-sourcing/.

The activity started with 15% Solutions, which focused on revealing actions that all participants could take to move archival discovery and delivery forward and encouraged participants to think positively. Participants were asked to reflect on where they individually had discretion and freedom to act and on what they could do without more resources or authority. Participants were given pens and notecards and asked to generate lists of these “15% solutions” over the course of five minutes. Participants then met in groups of three or four people to discuss their lists of solutions. Each participant was allowed 5 minutes to be the center of discussion, during which the other group members were asked not to provide advice or critique but instead to ask clarifying questions and offer encouragement.

Following 15% Solutions, the activity transitioned into 25/10 Crowd Sourcing, which asked participants to think ten times bolder to generate and identify “big ideas” to improve archival discovery and delivery, and then to “peer review” these ideas. First, participants were asked to think of one big idea and a first step to set it in motion, and to write it on an index card without their name. Everyone in the room then walked around, repeatedly exchanging index cards with other participants for 30 seconds. When a bell sounded, everyone read the card in hand and rated the strength and promise of the idea on a scale of 1 to 5, with 5 being the best. This process of “milling and passing” and then rating ideas was repeated 5 times. After the fifth rating, the primary facilitator asked the room to tally the total score on each card, with the highest possible score being 25.
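To make the scoring arithmetic concrete, the following minimal sketch (ours, not part of the Forum materials; the card titles are invented) simulates the tally: each card accumulates five ratings of 1 to 5 points, so the maximum possible total is 25.

    import random

    def tally_25_10(cards, rounds=5):
        """Sum five 1-5 ratings per card, as in 25/10 Crowd Sourcing.

        Five rounds of ratings worth at most 5 points each yield the
        maximum total of 25 that the facilitator asked the room to tally.
        """
        totals = {card: 0 for card in cards}
        for _ in range(rounds):
            for card in cards:
                # Stand-in for the 1-5 rating given by whichever
                # participant holds the card when the bell sounds.
                totals[card] += random.randint(1, 5)
        return totals

    # Invented example cards, not actual Forum ideas.
    cards = ["regional aggregation pilot", "virtual reading room", "shared UX study"]
    for card, total in sorted(tally_25_10(cards).items(), key=lambda kv: -kv[1]):
        print(f"{total:>2}  {card}")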
The primary facilitator then asked the room to share any ideas that scored 25, 24, 23, or 22. These ideas were recorded as possible next steps for Forum participants to explore upon returning home from the Forum, and they served to inform the activities on Day 3.

Retrospective

As on Day 1, the facilitators led a retrospective using the “4Ls” technique, allowing participants to provide feedback to inform potential changes in Forum design, followed by a separate facilitator-only discussion. An in-depth discussion of this feedback on Day 2 follows in the Evaluation and analysis section below.

Day 3 (February 12, 2020)

Overview

The third day aimed to move participants from idea generation to planning for individual and collective action:

In the final act you want to move toward conclusions—toward decisions, actions, and next steps. This is the time to assess ideas, to look at them with a critical or realistic eye. You can’t do everything or pursue every opportunity. Which of them are the most promising? Where do you want to invest your time and energy? The keyword for the closing act is “convergent”: you want to narrow the field in order to select the most promising things for whatever comes next.29

29 Gray, Brown, and Macanufo, 11.

Following logistical updates, facilitators recapped Day 2’s focus on emergent thinking and reviewed the outcomes of the facilitator discussion. This included a review of how people had been assigned to affinity groups and an acknowledgement that some of the previous day’s exercises were less successful than others. The project team also publicly acknowledged problematic group dynamics that resulted in experiences of marginalization reported by participants.

Social Network Webbing

The day’s first activity, Social Network Webbing,30 was designed to articulate connections between individuals and roles at the Forum, and to identify individuals and roles whose voices might be missing. First, attendees at each table were asked to spend 5 minutes adding their names to Post-It notes color-coded by the affinity groups identified earlier and affixing those to a large easel pad. Attendees were then asked to spend 10 minutes adding Post-It notes for others engaged in this work but not present at the Forum, again using the associated color, and arranging the Post-Its based on each person’s degrees of separation. Attendees were then asked to spend an additional 10 minutes identifying who else they would like to include in these discussions, writing their names (or roles) on Post-Its with the associated color and continuing to think about the actual and desired spread of participation. Tables were then asked to spend 15 minutes reflecting on their social network web, considering the questions:

● Who knows whom?
● Who has influence and expertise?
● Who can block progress?
● Who can boost progress?

After that reflection, attendees at each table were asked to spend 10 minutes discussing strategies to 1) invite, attract, and weave new people into this work; 2) work around blockages; and 3) boost progress. Finally, tables spent 10 minutes reporting out on their discussions. Social network webs were photographed and affixed to the walls of the room for review by all attendees.

30 Henri Lipmanowicz and Keith McCandless, “Social Network Webbing,” Liberating Structures, accessed May 3, 2020. http://www.liberatingstructures.com/23-social-network-webbing/.
Who/What/When Matrix

The final activity, the Who/What/When Matrix,31 was designed to identify specific next steps and the individuals responsible for moving them forward. Each table was asked to draw three columns on a large sheet of paper, labeled Who (the person or people taking the action), What (the action to be taken), and When (the date by which the action would be done). All participants were asked to spend 10 minutes adding at least one action that they were personally committing to take on in the next year. Although the original plan was to ask attendees to determine an action on their own, the Forum decided as a group to focus on individual actions that would directly support the “top 10 ideas” generated from the 25/10 Crowd Sourcing exercise on Day 2. Participants then spent an additional 10 minutes in groups of two to three people discussing the proposed actions. Each participant was given a chance to speak and receive feedback.

31 Dave Gray, “Who/What/When Matrix,” Gamestorming (blog), March 30, 2011. https://gamestorming.com/whowhatwhen-matrix/.

Who/What/When Matrices were affixed to the walls of the room for review by the table and by all attendees. Matrices were also photographed and transcribed by meeting facilitators and participant advisors. An anonymized list of all actions from this activity is included as an appendix to this report.

Retrospective

As during the first two days, the third day concluded with a retrospective using the “4Ls” technique, through which participants reflected on and shared what they liked, learned, lacked, and longed for.

Forum Conclusion

The Forum ended after a discussion session that reviewed next steps and opportunities for participation and provided additional time for questions and answers. Participants were invited to continue discussions informally in person over a provided lunch. Participant advisors and facilitators compiled and transcribed artifacts from the first three days.

Evaluation and analysis

This section of the report evaluates the Forum’s design and outcomes by looking at feedback and Forum artifacts (notes, outputs from activities, etc.) and through the reflections of the project team and facilitators. This information has also been used to develop the recommendations described in the Next steps section that follows below.

Retrospective and facilitator reflection

As noted in the Forum design section, the project team relied heavily on retrospectives to receive feedback from participants and to discuss potential changes in Forum design among facilitators during the event.

Day 1

Day 1 activities received generally positive feedback from participants, although some noted what felt like an abrupt transition from the presentations, which provided a more passive mode of interaction, to the facilitated activities that started in the afternoon. Participants also consistently noted that they longed for more time to get to know one another through introductions.

● Trading Cards was viewed as a positive and low-barrier introductory exercise.
● Mad Tea felt chaotic to some participants, and the loudness of the activity made it hard to hear instructions from the facilitators.
● Speedboat drew minimal in-depth feedback, but facilitators noted that most participants were highly engaged.
● Low-Tech Social Network was also identified as fun, although participants noted that they were not clear on the purpose of the activity.
This is a case where facilitators needed to be clearer about the goal of drawing these connections between participants.

Facilitators noted that participants seemed to enjoy the broader context provided for the project and the understanding of archival discovery and delivery as a wide but shared area of concern. While the introductory presentations from the project team were intended to reflect the “fuzzy” nature of a human-centered design process, we also heard that some participants would have benefited from more concrete goals. Participants likewise noted that a better structured introductory exercise would have provided more context by allowing them to learn more about one another; the lack of an introduction session was consistently cited as something participants wanted in order to feel better connected to one another. Facilitators also recognized that participants would benefit from additional context about project outcomes and goals and about the “fuzzy,” participant-driven nature of the Forum, and that participants should be encouraged to take breaks and given additional time to organize their thoughts.

In response to this feedback, the facilitators made a conscious decision to spend additional time at the beginning of Day 2 and at the end of Day 3 talking about project goals, outcomes, and directions, with a focus on tangible next steps for participants to continue their engagement. Participants noted that they began to see synergy between the activities and the focus of the Forum by the end of Day 2. Ensuring that participants have clarity on project direction is essential, especially given challenges with forward momentum on the project related to the rise of the COVID-19 pandemic.

Day 2

Day 2 activities ranged widely in terms of success, based upon feedback provided by participants during the retrospectives.

● Context Map was organized as a more in-depth activity to start the day, giving participants a broader landscape view. Explicit feedback on this activity was minimal.
● TRIZ seemed to be incredibly successful: participants found it fun, engaging, reflective, and cathartic. They enjoyed the creative inversion aspect of the activity and noted that many of the extreme unproductive behaviors they identified resembled situations at their own institutions.
● Affinity Map was consistently identified as the activity that participants struggled with most. As a 90-minute activity, it was the longest activity block of the entire Forum and had the largest group size as well; both factors made it the most challenging exercise to facilitate. Participants were also unsure why they had been assigned to a particular affinity group; only well after the exercise was over did the facilitators realize that the group categories had not been shared with participants on the slides. This may have exacerbated situations in which some participants felt marginalized (see below). Facilitators also noted that some participants would have preferred to select or opt into a group for this exercise rather than being placed in predefined groups.
● 15% Solutions and 25/10 Crowd Sourcing received significant positive feedback. While some participants found 15% Solutions empowering, others noted that it reinforced a notion that everyone should be doing more, with one participant asking for an exercise about what one could give up in order to work towards their 15% Solution.
25/10 Crowd Sourcing was seen as positive and democratic, and it enabled a wide variety of ideas to come to the foreground; however, one participant wanted more guidance on how to score the ideas generated in the activity.

In the facilitators’ retrospective for Day 2, significant concerns arose around experiences of marginalization identified by some participants, which had not been disrupted by facilitators in a timely way. While this seemed to be a consistent thread throughout the day, it became most notable when reviewing the inclusion of perspectives from marginalized people in the topic “clusters” generated in the Affinity Map activity. This was particularly troubling for both participants and facilitators given some of the ideas generated by 25/10 Crowd Sourcing: some of these ideas invoked decolonization or anti-racist work, but in ways that lacked the well-defined initial steps prompted by the second question in the Affinity Map and 25/10 Crowd Sourcing activities. In addition, there was recognition that the Forum needed discussion spaces that allowed for greater nuance on concepts that do not always have agreed-upon definitions. Facilitators took these areas of feedback seriously and made changes in the plan for Day 3 by:

● acknowledging the problematic dynamics that arose on Day 2 and reminding participants about the Community Agreements and Code of Conduct;
● using active facilitation throughout the day to make space for people to participate in ways most comfortable to them;
● allowing for optional facilitated conversations about both big and nuanced ideas over lunch for participants able to attend;
● modifying the Social Network Webbing activity to remove references to the underspecified and miscommunicated affinity groups used for the Affinity Map activity;
● modifying the Who/What/When Matrix activity to draw from ideas generated in 25/10 Crowd Sourcing as a starting point; and
● removing a group-based idea-evaluation subactivity (Dot Voting)32 applied to ideas generated in the Who/What/When Matrix, instead allowing for small group discussion in dyads or triads.

32 Sarah Gibbons, “Dot Voting: A Simple Decision-Making and Prioritizing Technique in UX,” Nielsen Norman Group, July 7, 2019. https://www.nngroup.com/articles/dot-voting/.

Day 3

Day 3 wrapped up on a high note with its two activities:

● Social Network Webbing was challenging for some groups. The facilitators decided to change direction on this exercise as a result of discussion in Day 2’s retrospective: the roles were initially intended to reflect the affinity groups selected for the Affinity Map exercise, but facilitators removed this constraint and allowed table groups to determine the roles they wanted to discuss. Participants noted that they wanted a clearer end goal for the activity, with some expressing confusion about whether they should represent individuals or abstract categories of people connected to archival discovery and delivery. Other participants noted that they lacked enough time for the exercise. Overall, facilitators observed that participants arrived at the ultimate purpose of the exercise: to better understand power and influence.
● Who/What/When Matrix was viewed positively by participants, as it provided a lightweight accountability structure for them to carry out future efforts based upon areas of interest. Some participants noted that they wanted more information about how these efforts would manifest over time.

Overall, the mix of activities chosen for the Forum was seen as energizing and invigorating for participants, despite the fact that many were new to the highly interactive exercises used.
Initially, some participants struggled to engage effectively with the Forum, expressing anxiety over the unfamiliar facilitation techniques and uncertainty over whether and when they could leave the room without missing content. Participants also consistently noted that the pace and transitions occasionally felt abrupt and left them exhausted at the end of each day, and they asked us to consider allowing more time for reflection and journaling to digest the big ideas under discussion. While facilitators encouraged participants to take care of themselves and their needs throughout the event, a participant familiar with comparable facilitation techniques noted that advising participants before the Forum that they would be moving around and should wear comfortable clothing would have been useful.

Participants in general valued the roles of facilitators and notetakers and wanted more details on the methods used in the Forum. The biggest challenge for facilitators and notetakers was that they were stretched too thin during some activities. The project team expects that this was influenced both by the increase in the size of the event from 50 to 71 participants and by the larger group size in the Affinity Map activity, and that smaller events would likely be more manageable to facilitate.

Participant feedback survey

The project team asked participants, livestream viewers, and facilitators to provide additional feedback and reflections through a survey hosted on the Qualtrics platform and distributed via email. The survey received a total of 120 responses: 51 responses were from Forum participants or facilitators, or approximately 71.8% of the 71 Forum participants and facilitators, and 69 responses were from livestream viewers, or approximately 17.7% of the 389 confirmed livestream attendees (approximately 12.7% of the 536 total livestream registrants). Feedback questions included an evaluation to calculate a Net Promoter Score, quantitative questions on participant interest and experience, and qualitative questions covering what participants liked most or least about the Forum, what they learned at the Forum, and which future project-facilitated activities could be the most valuable, along with an open-ended feedback question.

Overall, quantitative feedback indicated a high level of satisfaction with the event in terms of content, goals, and logistics, with Net Promoter Scores indicated as favorable (33.62 overall; 46.00 for Forum participants; 23.08 for livestream viewers). However, while feedback indicated positive satisfaction and sentiment, the survey results indicate that further work will be necessary for the project to properly leverage the Forum and to meet its overall goals consistently. This quantitative feedback can be seen in more detail in the Quantitative feedback summary appendix.
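For context on the figures above: a Net Promoter Score is conventionally computed from a 0–10 “how likely are you to recommend” question as the percentage of promoters (ratings of 9–10) minus the percentage of detractors (ratings of 0–6), yielding a value between -100 and 100. A minimal sketch, using invented ratings rather than the Forum’s survey data:

    def net_promoter_score(ratings):
        """Standard NPS: percent promoters (9-10) minus percent
        detractors (0-6) on a 0-10 recommendation scale."""
        promoters = sum(1 for r in ratings if r >= 9)
        detractors = sum(1 for r in ratings if r <= 6)
        return 100.0 * (promoters - detractors) / len(ratings)

    # Invented ratings for illustration only.
    sample = [10, 9, 9, 8, 8, 7, 10, 6, 9, 5]
    print(round(net_promoter_score(sample), 2))  # -> 30.0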
While the qualitative feedback submitted by participants, livestream viewers, and facilitators varied widely, the project team identified a set of consistent themes, which helps assess the effectiveness of the Forum and can inform future project activities.

What people enjoyed the most about the Forum

Participants enjoyed meeting, interacting with, and hearing from other participants from a variety of backgrounds and in a variety of roles. Some participants noted that this included a range of institutional and organizational types, and that for some it was an opportunity to hear a variety of opinions and viewpoints, including challenging ideas from new voices. Multiple participants felt that the Forum was a great opportunity to network and make connections with colleagues from other institutions or roles, and noted that the expertise of the attendees enhanced their experience.

The majority of the Forum participants gave positive feedback on the format, given its mix of structured activities, brainstorming, and plenary presentations, as well as the “cooperative and optimistic spirit” present at the Forum. Feedback indicated an appreciation for the Forum’s active facilitation and interactive nature, with thanks extended to the facilitators for providing a safe space to share ideas and promote discussion in an informal and inclusive venue. Some attendees appreciated that the Forum provided opportunities for more participants to speak up and contribute their perspectives. A few participants even remarked that they would be bringing some of the facilitation techniques back to their own institutions. Webinar attendees enjoyed the presentations, noted that the pace of the presentations worked well, and appreciated being able to watch them. Finally, some participants appreciated the range of topics and learning about issues other than technology, although this also ties to concerns about focus described in the next section.

What people enjoyed the least about the Forum

The project team also encouraged participants to reflect on aspects of the Forum and programming that needed improvement or change. Most notably, several participants felt the Forum suffered from a lack of focus (“scope creep”). At times, discussion strayed away from archival discovery and delivery and systems integration and felt unfocused. Others noted that the discussions felt overbroad and that it was difficult to see how they would tie together. Several participants noted that they would have preferred more explicitly practical discussion of technology and systems, and well-defined outcomes, with some qualifying their answers to say that the broad conversations were nonetheless useful. A few participants noted that some activities “devolved” into milling about, and that the challenges around the Affinity Map particularly amplified this feeling of a lack of focus.

Some participants suggested that a clearer description of the Forum and project goals would have been useful, while another was concerned that they may have misinterpreted the description of the Forum in advance of the event, given the topics that were actually covered. One participant noted that a concrete deliverable for the Forum itself could have helped to maintain focus for the event. For one participant, more background on the Forum and its focus would have meant additional context on the readings provided by the foundational research team; for another, this could have included a better explanation of the facilitated exercises and a roadmap of the Forum’s activities. Several participants noted the lack of representation of researchers as participants in the Forum and project, and the lack of reference to specific user studies to inform the discussion.
Participants noted that the days of the Forum seemed packed and felt too long, and that they would have appreciated more unstructured time to sit alone with their thoughts, reflect on the day’s activities, hold smaller group discussions, or intermingle with participants who were not at their table or in their affinity groups. Participants suggested that scheduling fewer activities might be helpful in the future, or that the second day could be shortened and the third day extended. Some participants also noted that they wanted more time to dive into specific topics or to workshop potential solutions.

While participants otherwise provided positive feedback about the Forum design itself, some also gave important feedback regarding the Forum’s design and the project’s goals of inclusivity. Some participants noted that while the Forum had participants serving tribal communities, it did not seem to be designed with tribal institutions in mind, meaning that it had less relevance to the issues faced by tribal communities and institutions. Another participant raised a concern that the Forum’s format allowed complex or important issues to get passed over quickly, rather than being discussed with appropriate nuance. Finally, a few participants mentioned that they would have benefited from a better understanding of the nature of the Forum and its interactive aspects in advance, such as being advised to wear comfortable clothing, to expect to move around, and that some activities (e.g. Mad Tea) could trigger anxiety.

While appreciative of the willingness of the program organizers to livestream the plenary talks to an online audience that could not attend in person, some livestream attendees wished that they could have participated more fully in the rest of the Forum’s activities. Other respondents, including both livestream attendees and Forum participants, wanted better insight into how the project team selected participants for the Forum, or wanted the project team to increase the number of in-person participants.

What people learned at the Forum

Many participants noted that the Forum helped uncover common ground across participant roles and institutional types, as well as the realization that not everyone shared the same understanding of common challenges, or the same priorities. Many participants were nonetheless able to obtain a deeper understanding of the shared struggles around archival discovery and delivery across institutions, and generally felt validated knowing that they were not facing these challenges alone. They also noted that they learned about the tension between collaborative efforts and local needs, and that sharing tools built by one institution can be challenging because they will not always be useful elsewhere.

Participants reported learning about the range of technologies used across different archives for archival discovery and delivery, most notably from success stories and evaluations of tools used at other institutions. Participants also noted that the Forum was a useful opportunity to learn about existing collaborative efforts towards a national finding aid network. Others noted that the Forum provided useful momentum towards collaborating on a shared virtual reading room service. Overall, these reflections aligned with the recognition by several participants that there needs to be shared responsibility or leadership for this activity to continue to move forward.
Several participants felt that the facilitation techniques adopted were remarkably successful at supporting information exchange in a large group, and expressed plans to explore or encourage their adoption in other contexts. Some participants expressed a desire for a publicly shareable version of the “playbook” used by the Forum’s facilitators and project team. One participant noted that they learned that feedback to the facilitators was itself important, and appreciated the mindfulness with which the facilitators accepted and acted upon that feedback. Another participant expressed relief to a notetaker when told that physical copies of the playbook were available at the table for the facilitators and notetakers to follow; they noted that having this knowledge in advance and in the moment would have helped them interact better with the Forum by seeing the connections, impact, and structure of its activities. This reflects the importance of providing better insight into the plans for the Forum or similar events to support participants with varied styles of learning and comprehension.

Some Forum participants described leaving the event with a greater understanding of a variety of concepts surrounding archival discovery and delivery. In some cases, participants reported that shared goals were themselves a valuable takeaway from the Forum. Participants also learned more about the political and economic factors that archives face, noting that many issues around archival discovery and delivery relate to resources, labor issues, institutional priorities, and power. One participant noted that they learned how using open source software, or participating in an open source community, can be a privilege, while another described learning that technology is highly bound to cultural issues in and across organizations.

For some participants, the discussions around equity and inclusion in archives were illuminating. Several participants indicated that learning about the work of institutions actually doing post-custodial and reparative work was valuable, as was learning about work within tribal and indigenous libraries and archives. Moreover, several participants noted that they learned about the perceptions around decolonization and anti-racist work, the unevenness of understanding of these concepts across Forum participants, and how these practices might be incorporated into their day-to-day work. Forum participants also learned from one another about different practices, such as offering office hours on systems for public services staff, and planned to implement similar practices at their home institutions.

The most valuable experience that the project can offer

Forum participants had extensive suggestions on the most valuable experiences the project could offer. Many suggested that the Forum and its participants grow into a community of practice and facilitate opportunities for collaborative work on ideas from the crowdsourcing activity at the Forum, or on other projects related to archival discovery and delivery. Certain participants noted that this could be achieved by providing a space for continued conversation and organized activities over time.
Some participants suggested that the Forum could help the community organize around addressing and refining the prioritized actions that were shared, including by identifying ongoing work in these areas to prompt action or build networks, or by providing a community framework to coordinate and maintain accountability for those actions. This might include a platform for continued communications and continued participant input, as well as an expansion of the call for participation to others with aligned responsibilities and interests. Some proposed activities that would allow written contributions from each participant, such as mapping out different systems integrations, or identifying underserved user groups and understanding their needs and preferences around archival discovery and delivery. Other participants noted that additional interactive virtual meetings open to all archivists, or events at local, regional, and national conferences, would also be valuable. A few also suggested that the Forum create opportunities for cross-institutional knowledge-sharing, such as pairing technical services staff with public services staff.

Some participants and webinar attendees were interested in further incorporating user experience research methods into archival discovery and delivery. Others suggested that supporting professional networking, offering practical advice on tool selection, and furthering the sharing of workflows, methods, and user study results would be an interesting role for the Forum to play after the event. One participant suggested that the project focus on enabling participants to plan an international research agenda that illuminates the needs of users who discover and rely on archival collections. Several participants suggested that the Forum could play a role in further amplifying concerns raised at the Forum by undertaking targeted outreach to administrators and managers, connecting smaller groups working on similar problems and outcomes, and working to better include and understand the needs of indigenous organizations. Suggestions about facilitation included ensuring space for deeper conversations on nuanced topics and building in more time for reflection. One participant also noted that clarity on the role of participant advisors and better identification of the facilitators would have been useful.

Emerging themes

Day 1

Given Day 1’s focus on divergent thinking, the project team expected to gain a broader understanding of how participants viewed archival discovery and delivery. As noted earlier, presentations and activities focused on encouraging everyone to think about possibilities and to build energy.

Plenary presentations

The plenary presentations from participants were intended to provide a broader perspective on archival discovery and delivery and to establish the potential divergence in concerns, both within each thematic grouping and across them. In the session on The Evolving Systems Ecosystem, Forum participants heard from a range of institutions about their recent efforts to adopt new discovery systems. Institutions took a variety of approaches: building new homegrown systems, wiring together microservices via APIs and custom integrations, or selecting from several types of proprietary software. Factors that impacted the selection of a solution included the organization’s size, existing archivist and IT staffing, and the software systems institutions were migrating away from.
Across institutions, presenters emphasized the importance of interdepartmental collaboration in managing the complexity of migrating to new systems.

In Networks and the Big Picture, presentations focused on upcoming trends in archival aggregation, with both speakers emphasizing the importance of avoiding the siloing or exclusion of archives from future discovery work. Adrian Turner reported on California Digital Library’s ongoing research on establishing a national finding aid network (NAFAN). In a concerted effort to move away from a “build it and they will come” mindset, NAFAN’s work focuses on understanding and validating user needs, scoping potential maintenance costs, and designing intake to minimize technical debt for regional aggregators. Merrilee Proffitt of OCLC Research warned of the potential for archives to be siloed or excluded from other work around linked data and artificial intelligence.

Presentations in Ethical, Legal, and Cultural Concerns highlighted areas such as privacy, cultural protocols, and copyright that impact archival discovery and delivery. On the policy end, Amanda Whitmire (Stanford University Libraries) illustrated how FERPA restricted access to making undergraduate environmental research widely available. Greg Cram (NYPL) detailed how the cost of rights review forms a bottleneck on making audiovisual collections available online. Tanis Franco (University of Toronto, Scarborough) and T-Kay Sangwand (UCLA) highlighted the importance of providing finding aids and training materials in languages relevant to the communities described and served by archives.

In Impacts on Public Services and Outreach, presentations focused on the tension between understanding users and their needs and addressing those needs. Constraints discussed by presenters included the need to support more complex models of access, to address gaps in infrastructure and other essential services, and to provide opportunities for active engagement for staff and students.

Mad Tea

The Mad Tea exercise provided participants with the opportunity to raise shared concerns around archival discovery and delivery. At the end of the exercise, each table shared their answers to the question “What are the biggest opportunities we have in terms of improving archival discovery and delivery?” The primary themes that arose from sharing were: collaboration across repositories and technology platforms, in particular sharing knowledge and material resources; a need to better understand our vast community of users so that we are well positioned to support their research; open source, community-driven development that is thoughtful and impactful; and the necessity of aggregating and sharing archival collections data.

Speedboat

To encourage divergent thinking and build upon the Mad Tea exercise, attendees participated in the Speedboat exercise to identify sails (drivers) and anchors (hindrances) for archival discovery and delivery. In the sails category, common themes were: access to resources, such as time, money, and permanent staff; a collaborative work environment that includes good leaders, colleagues who are open to change, and trust; functional, well-documented workflows; and advocacy and engagement across and outside of the organization in the form of stakeholders and champions.
In the anchors category, common themes were: lack of resources, such as time and precarious labor; “legacy” problems, such as technical debt and processing backlogs; organizational issues, such as silos, risk-aversion, low morale and burnout culture, and bad leadership; copyright; and embedded bias and oppression within the systems and standards used in archival discovery and delivery. In general, this exercise brought to light organization-wide drivers and hindrances that impact archives and special collections.

Low-Tech Social Network

While intended to orchestrate divergent thinking, the Low-Tech Social Network exercise did not surface any significant themes related to the focus of the Forum. Some participants reflected on professional connections or shared areas of work, while others focused on areas of personal interest.

Retrospective

Day 1 of the Forum ended with an audience-wide retrospective, using the 4Ls technique to discuss what participants liked, learned, lacked, and longed for. Participants liked the variety of activities for the day, the catering, the presentations, meeting new people and sharing experiences, and the communal nature of the gathering. Participants remarked that they came away with a greater understanding of the context they were working in, as well as of the differences in roles and workflows across different institutions. In general, participants noted they learned more about the overall discovery landscape from the day’s presentations and discussion. Participants felt that the day’s events lacked opportunities to reflect on the larger goals and purpose of the Forum and on user insights and perspectives, and they wanted more downtime for introverts. By the end of Day 1, participants found themselves longing for more discussion time and deeper inquiry into certain issues, more time for question-and-answer sessions after the presentations, and additional perspectives from HBCUs.

Day 2

Context Map

The Context Map exercise asked each table to identify contextual factors for archival discovery and access across six vectors, including the political and economic climates, stakeholder needs, technological factors, “uncertainties,” and two “trend” vectors determined by the participants at each table. Despite the relatively abstract scaffolding, the context maps created at each table proved to be surprisingly consistent. Broad emerging themes included a focus on meeting evolving user expectations and needs, including in the context of teaching, learning, and research, as well as professional labor trends. Societal trends beyond those considered primarily political or economic were also broadly featured and crossed multiple vectors. One common takeaway was that while several of the major societal challenges identified are not ones that participants are perhaps best positioned to solve in their professional capacities, the most immediate challenges are not intractable; they will, however, require broad collaboration and significant institutional investment and support.
Recurring factors and emerging trends across vectors included:

● Political/economic: devaluation and defunding of cultural memory work; unequal distribution of funding and competition for limited resources; precarious labor/gig economy; lack of transparency in our political systems; less accountability and a general attack on facts/history; a focus on STEM negatively impacting the humanities; institutional overemphasis on return-on-investment; wealth inequality leading to inequitable access to cultural heritage materials/digital divide; collection monetization by donors and institutions; privatization and monetization of user data; climate change and its broader impact on our lives and institutions; ethics of ownership; ethics of representation; copyright and access restrictions.
● User-focused: vast spectrum of potential users and their needs; expectations for easy, unified, streamlined access; access to ubiquitous commercial digitization and storage services; remote research and virtual reading rooms; accessibility needs; teaching/research trends such as an emphasis on primary source materials and archival literacy in undergraduate education; digital literacy; data reuse; archives in the research lifecycle; on-demand digitization; on-demand learning; privacy concerns; right to be forgotten.
● Technological: systems integration; metadata interoperability protocols and APIs; software dependence and emulation; shift from record-based to entity-based description (linked data); online viewers; open source; machine learning/artificial intelligence.
● Uncertainties: cut across all categories.
● Participant-selected trends: factors impacting institutional labor/resourcing (selected by 7 tables); factors impacting user behavior and expectations for discovery/delivery (selected by 5 tables); factors impacting institutional policies/activities (selected by 4 tables).

TRIZ

TRIZ asked participants to identify the unproductive or counterproductive behaviors that encourage or reinforce barriers to archival discovery and access. The behaviors and actions identified align with emerging themes across five categories:

● Institutional: a lack of clear decision-making responsibility, leadership, and mission; a lack of institutional bravery and risk tolerance; a scarcity mindset that drives inequitable labor decisions.
● Labor/economic: a trend towards adopting temporary/term/contingent staffing models, leading to a lack of continuity not only for employees but for the work and responsibilities themselves; less public funding leading to greater dependence on private donors and granting agencies, intensifying a focus on quantifiable success; and data-driven processing priorities that can perpetuate economic inequality or continue to suppress inclusive access to collections. These behaviors and trends are exacerbated at smaller, public, and/or less resourced institutions.
● Political/legal: threats to fair use and other copyright exceptions.
● Technological: continuing to pursue a technology because it took a lot to implement (sunk costs).
● User-focused: user privacy not respected or considered (institutional collaboration with law enforcement and intelligence; retention of usage data); user requirements, needs, and experience not considered (a lack of user studies, user research, and usability testing); lack of compliance with online accessibility standards; and challenges to physical access, i.e. economic and physical barriers that make collections inaccessible, e.g.
physical spaces that are not ADA-compliant, gendered restrooms or restrooms without changing tables, and parking costs.

Affinity Map

Within the Affinity Map, participants were asked both to brainstorm individually and to identify emerging patterns in those ideas by clustering them into groups. Accordingly, the emerging themes reflected in this description were initially identified by the participants themselves. The project team has clustered these further in preparing this report, with the original clusters listed below for each associated question (“Changes” and “Steps”).

● Addressing structural barriers and oppression
○ Changes clusters: “anti-racist practice”; “disrupt hierarchies and racism”; “ethical responsibilities and decolonizing the archives”; “removing or lowering barriers in the culture of our spaces”; “increase access”; “accessibility”; “responsible access”
○ Steps clusters: “multi-language and cultural awareness”; “DEI”; “DEI + labor”; “inclusive description”; “accessibility”; “improve data”
● Ethical access and transformational relationships with communities
○ Changes clusters: “ethical responsibilities and decolonizing the archives”; “responsible access”; “creator needs”
○ Steps clusters: “outreach”; “open/expand leadership”; “community outreach and collaboration”
● Advocacy, organization, leadership, and staff development
○ Changes clusters: “staff empowerment”; “library culture”; “professional skills and training”; “building staff skills”; “push back”; “advocacy and awareness”
○ Steps clusters: “general principles & values & culture shifts”; “affecting staff education and support”; “transparency about our processes and strategies”; “improve teaching & education for staff and community”; “open/expand leadership”; “influence professional organizations towards positive change”; “advocacy”; “open/expand leadership”
● Addressing user needs and understanding users
○ Changes clusters: “user needs”; “involve users”; “users and usability”; “user focus”; “users”; “accessibility”
○ Steps clusters: “users/usability”; “user studies, testing/UX”; “finding and developing user centered tools and workflow tools”; “research user needs”; “UX-improve/do testing”
● Improving communication and collaboration within organizations
○ Changes clusters: “collaboration”; “high level/beyond archival collaboration”; “sharing resources/ideas”; “sharing knowledge with communities (professional and non)”
○ Steps clusters: “multi-institutional collaboration”; “transparency about our processes and strategies”; “transparency/documentation”; “communicate”; “share/collaborate”; “build consortia/collectivize (contribute to/share)”; “vendors”; “collaboration”
● Strategic planning and strategic thinking
○ Changes clusters: “mindset/strategy”; “aspirations”; “library culture”; “high level/beyond archival collaboration”
○ Steps clusters: “stakeholder goals”; “navigating true north”
● Improving systems interoperability and integration
○ Changes clusters: “systems interoperability”; “systems and integration”; “integrations”
○ Steps clusters: “systems integration and practices integration”; “systems and system integration”; “integrate/use appropriate systems”
● General technology and systems improvements
○ Changes clusters: “simplifying systems”; “leverage automation”; “systems and integration”; “build better stuff”
○ Steps clusters: “tech advancements and feature requests”; “systems integration and practices integration”; “better workflows”; “systems and system integration”; “integrate/use appropriate systems”;
“search”
● Discovery and delivery
○ Changes clusters: “search”; “delivery”
○ Steps clusters: “search”
● Improving aggregation, data interoperability, and cross-LAM discovery
○ Changes clusters: “aggregation” (across multiple groups); significant discussion of “bridging the gap” between digital libraries and archives; “systems interoperability”
○ Steps clusters: “aggregate”; “APIs + interoperability”
● Improving archival description and metadata
○ Changes clusters: “description”; “metadata”; “description and/or metadata”
○ Steps clusters: “description”; “inclusive description”; “improve data”; “effective/transparent/quality metadata”
● Resourcing and staffing, including ethical aspects
○ Changes clusters: “resourcing”; “$$$”; “sharing resources and ideas”
○ Steps clusters: “resources”; “advocacy”; “staff and labor”; “DEI and labor”; “$$$”; “advocate for permanent labor and funding”
● Proactively responding to risk management, copyright, and other rights issues
○ Changes clusters: “clarify copyright and open source statuses”; “risk management”
○ Steps clusters: “rights/reuse”; “transparent rights”

While participants and facilitators both found this activity challenging, many of the clusters represent cohesion in topics across the five groups. However, there was also significant topical cohesion not captured in the clusters above around ideas intended to support indigenous communities and tribal archives, such as creating a tribal archives consortium, promoting indigenous data sovereignty, supporting orthography and typology for indigenous languages, and better understanding which systems can best support specific tribal communities.

15% Solutions and 25/10 Crowd Sourcing

The final activity on Day 2 conjoined two shorter activities so that individual ideas and smaller steps for action could expand into broader and bolder possibilities. The 15% Solutions activity was intended to have participants reflect on individual steps that they could take themselves, without needing additional resources or permission. Key themes from participants’ answers to this activity include:

● Finding opportunities for action and advocacy;
● Using their current influence to advise, raise, or act on ethical and inclusion concerns (e.g. avoiding the use of contingent labor);
● Improving communication and collaboration, including sharing documentation, experience, and information resources (e.g. best practices), as well as holding regular calls with colleagues to share information;
● Understanding more about users (including data gathering and formal assessment), undertaking user testing, and sharing accessibility and usability feedback with platform maintainers;
● Envisioning new working relationships and understanding more about the roles and responsibilities of colleagues across both teams and organizations (especially between archives and technology workers);
● Evaluating new systems and tools (including archival discovery systems and machine learning), gathering and specifying requirements, and identifying key points of integration between systems;
● Rethinking workflows supporting digitization and reproduction (including both rights workflows and workflows used by researchers);
● Increasing concrete support for aggregation and consortial networks;
● Improving the discovery of born-digital material by including it in finding aids;
● Revising description and metadata to use inclusive and anti-racist language, drawing on existing resources that address issues in description (e.g.
Archives for Black Lives in Philadelphia’s Anti-Racist Description Resources);
● Cross-training or improving skills, especially technical skills;
● Strategizing thoughtfully, and asking more questions about why specific activities (e.g. digitization) are prioritized;
● Reducing the use of archival jargon;
● Removing barriers to access, both physical and financial;
● Taking direct input from community members;
● Avoiding burnout; and
● Translating archival description.

The themes identified in the outcomes of this activity suggested that participants were able to reflect on the connections throughout the day’s activities. The project team followed this activity with 25/10 Crowd Sourcing, with the intention of having the earlier activity’s ideas serve as a “springboard” for the bolder steps identified by participants in the final activity of the day. While the full list of ideas generated by the activity is included as an appendix to this report, the generated ideas were grouped by the project team into one or more of the following sixteen themes:

● Structural change, including integrating anti-oppressive and inclusive practices into archives and technology work;
● Developing shared regional or national projects and infrastructure;
● Developing “virtual reading rooms”;
● Improving and rethinking description and metadata;
● Community engagement and community-specific platforms;
● Collaboration and communication between roles and institutions;
● Collection development and digitization strategies;
● Creating best practices and other forms of professional guidance;
● Prioritizing and investing in open source software development;
● Gathering and analyzing data;
● User experience and usability studies;
● Crowdsourcing;
● Prioritizing accessibility;
● Sharing resources;
● Improving and simplifying discovery and delivery and the technology that supports it; and
● Improving and rethinking policies around access, rights, and reuse.

Participants identified a notable demand for work to develop larger collaborative networks around improving archival discovery and delivery, especially around shared software or platform development, and a clear desire to reduce the use of “bespoke” or “homegrown” solutions. This intersected with significant interest in working towards establishing a national “virtual reading room” for restricted material. Additionally, there was significant interest in sharing resources between institutions, with an expectation that well-resourced institutions could commit ongoing funds. Many participants also recognized that more user studies, especially a larger-scale study across multiple institutions, could be beneficial. At the same time, these ideas surfaced a strong desire for structural change that can be expressed in part through archives and technology work. Participant reflection demonstrated a recognition that participants needed deeper relationships with the communities they serve, which could help to develop “hyperlocal” systems suited to the needs of those communities.

Retrospective

In addition to the feedback received during the retrospective discussed above, participants identified the following emerging themes reflecting the day’s activities:

● Cultural challenges are harder to fix than technical ones, and investing in people and relationship building, even within one’s own institution, is essential to this work.
● User needs must be prioritized; in fact, professional trends in archives were seen by some as completely disconnected from user trends.
● Participants noted that connections across activities became more coherent, and while they saw the synergy between the activities, some lacked sufficient focus to carry ideas across exercises.
● Collaboration and information sharing are essential, even if institutions may be working towards divergent goals. Participants noted that there are often similar issues and concerns across institutions, even when those institutions or their goals may be perceived as very different. Strategic needs, as well as values, were seen as shared across institutions, and we should find opportunities to communicate in ways that align our work more. In addition, participants emphasized that we cannot afford to leave smaller peers behind as this work advances.
● The Forum’s activities on Day 2 helped reinforce that all stages of the archival workflow impact archival discovery and delivery.
● There is a prevalent reliance on temporary labor in archives, which was reinforced as an urgent problem in undertaking improvements to archival discovery and delivery.
● There is real interest in developing a community around structural change, decolonization, and equity concerns in archives and special collections, and continued work is necessary to center the expertise of Black, Indigenous, people of color, queer, non-binary, and other marginalized people who participate in events like the Forum.

Day 3

As Day 3 was intended to have convergent thinking as its focus, the project team expected that themes and ideas would continue the coherence expressed on Day 2. The intent was to activate participants to interpret these connections in ways that would lead to potential future action and provide guidance for the efforts of the Lighting the Way project.

Social Network Webbing

The focus of the Social Network Webbing exercise was to understand working relationships and power in relation to the Forum and the work of archival discovery and delivery. Participant groups varied in how they selected people not represented at the Forum: some groups identified roles or other groupings of people, while others identified individuals in relation to specific roles. Groupings of people identified as missing from the conversation included the following:

● Library administration and leadership (viewed as an important conduit to resources);
● Users, including “casual researchers”;
● Curators, bibliographers, and subject librarians responsible for collection development;
● Funders (e.g. granting agencies);
● Technology vendors;
● Other cultural heritage organizations: museums, public library archives, etc.;
● Communities documented in collections;
● Donors;
● Experts in specific domains, including preservation (both physical and digital), copyright, scholarly communication, indigenous languages, and accessibility;
● International partners;
● Library-wide technical services and IT groups;
● Acquisitions staff;
● Term archivists;
● Adjacent professional communities (e.g.
the BitCurator Consortium, ArchivesSpace, Aeon users);
● Public services staff;
● Privacy and security experts;
● Global South archivists, archivists of color, and archivists from specific regions like Appalachia;
● National organizations and initiatives, including DPLA, the National Archives and Records Administration, and the Library of Congress;
● Professional organizations; and
● Individuals who could be boosters because of their platforms.

In reviewing blockers and boosters for this work, many groups reflected existing and known cultural challenges and strengths within their organizations, and several groups noted that some stakeholders could be both boosters and blockers. Participants also consistently noted the importance of engaging users in the conversation, and of the ways resource allocators or providers (administrators, funders, and donors) influence this work.

Who/What/When Matrix

As the final activity, the Who/What/When Matrix was intended to draw the Forum to a close and provide meaningful next steps that participants could take alone or together to improve archival discovery and delivery, or to otherwise take action on ideas that came up in the Forum. As described above, the activity used the top-scored ideas drawn from 25/10 Crowd Sourcing as a starting point for participants' own ideas. While an anonymized list of all actions from this activity is included as an appendix to this report, themes across the actions include the following:

Personal research: Many participants indicated that they were interested in conducting personal research to deepen their understanding of topics brought up at the Forum. A selection of those topics includes:

● Learn more about "minimal computing" and how it can apply to archives and digital collections
● Learn more about anti-racist and feminist frameworks
● Learn more about POC-centered [people of color-centered] materials in the collection
● Read more about instating a reparations framework, and moving away from philanthropic thinking
● Learn more about indigenous traditional knowledge practices
● Learn more about ArcLight implementation at other institutions

Description projects: Forum participants also identified ways in which they could change, enhance, or add more description to their collections. These ideas include:

● Work on collection-specific finding aids
● Determine how to integrate crowdsourced description into the repository and collection metadata
● Advocate for indigenous data sovereignty
● Have staff read anti-racist description models, and integrate guidance into the local description model
● Explicitly identify women and POC [people of color] in collections materials
● Deconstruct the finding aid
● Revise existing description to remove racist and sexist subject headings
● Contribute to the National Finding Aid Network project

Copyright work: Presentations and discussions at the Forum around copyright influenced many participants' next steps. Some actions participants planned to take around copyright include:

● Continue conversations about making archival data open
● Take a copyright risk management course
● Develop a list of copyright variables relevant to fair use and virtual reading rooms
● Open a conversation with the State House of Representatives to update copyright law, and share the template so others can do the same
● Revise copyright workflows to adhere to a risk assessment model instead of a rights clearance model

Public services: For some participants, doing public-facing work around outreach felt most impactful.
Some ideas include:

● Host a zine workshop
● Schedule events to build connections with underrepresented communities

Patron services and user research: The necessity of understanding users of archival systems was a paramount concern throughout the Forum. Many participants saw opportunities to work directly with patrons and perform user research. Specific actions listed include:

● Develop a policy to fund user-driven digitization
● Share findings of a local user study
● Solidify connections to underrepresented user communities
● Develop a digitization and digital collections strategy to best serve patrons
● Gather and create resources for usability and accessibility for archives

Technical systems: Forum participants made a clear connection between the archival systems that we use and issues around archival discovery and delivery. Many participants identified both local and national steps to take to explore these connections in more depth:

● Continue doing open source development with Islandora 8
● Establish a Virtual Reading Room service:
○ Begin conversations with stakeholders
○ Engage Atlas Systems
● Develop an ArchivesSpace API helpers library (an illustrative sketch of such a helper appears at the end of this section)
● Send a list of feature requests to the Circa developer at NCSU
● Work on developing Circa, including an accessibility audit
● Convene a meeting with stakeholders at the local institution to make systems interoperable
● Replace a bespoke finding aid application with an open source product

Professional service: Some participants identified actions they can take in the context of professional organizations, or within the professional community more broadly, in order to enact change. Ideas include:

● Do DEI [diversity, equity, and inclusion] work with the SAA education committee
● Find opportunities to discuss rightsstatements.org with the archival community
● Bring awareness to institutions with large holdings of indigenous materials

Retrospective

In addition to feedback received during the retrospective, participants identified the following emerging themes reflecting the day and the Forum overall:

● Non-advice and non-critique focused activities were valuable in allowing participants to communicate their ideas in the Forum setting.
● The desire for structural change, especially decolonizing archival work and associated technology work, was strong for some participants, although some also noted that they lacked other people committed to bringing decolonial values into their technology work specifically.
● Some participants also expressed a strong desire for a greater presence from indigenous and tribal archivists, and suggested that hearing from and hosting more of them would help others empathize with and assist them in their work.
● There was a strong desire to identify what next steps could occur, both in terms of the activities of the Lighting the Way project and in terms of how these efforts could connect to the interests or goals of individual participants. This was also reflected in participant feedback that mentioned that more time for unstructured conversation would have been beneficial.
● In an open question and answer session following the retrospectives, participants also wanted to know how gaps in representation identified through the Social Network Webbing activity would be addressed in the working meeting. Facilitators and participant advisors noted that this was an area in which more work and conversation were needed, and agreed to follow up.
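To make the "ArchivesSpace API helpers library" idea above concrete, the following is a minimal sketch of what such a helper might look like in Python, assuming a stock ArchivesSpace backend (which serves its REST API on port 8089 by default). The class name and the identifiers in the usage example are illustrative, not drawn from any existing library.

import requests

class ArchivesSpaceClient:
    """Minimal, illustrative helper for the ArchivesSpace REST API."""

    def __init__(self, base_url, username, password):
        self.base_url = base_url.rstrip("/")
        # POST /users/:username/login returns a session token on success.
        resp = requests.post(
            f"{self.base_url}/users/{username}/login",
            params={"password": password},
        )
        resp.raise_for_status()
        self.session_token = resp.json()["session"]

    def get(self, path, **params):
        # Authenticated calls carry the X-ArchivesSpace-Session header.
        resp = requests.get(
            f"{self.base_url}{path}",
            headers={"X-ArchivesSpace-Session": self.session_token},
            params=params,
        )
        resp.raise_for_status()
        return resp.json()

# Hypothetical usage against a local test instance:
# client = ArchivesSpaceClient("http://localhost:8089", "admin", "admin")
# resource = client.get("/repositories/2/resources/1")
# print(resource["title"])

The session-token pattern shown here (log in once, then send the X-ArchivesSpace-Session header on each request) is how the ArchivesSpace API authenticates; a fuller helpers library would layer pagination, retries, and write operations on top of it.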
Discussion and next steps

Participant feedback and the themes identified throughout the Forum provide important insights into what was most valuable and what could be improved about the Forum, as well as how to structure future project activities. Given the disruption to the project caused by the COVID-19 pandemic, which led to the postponement of the Working Meeting and delays in community engagement, this information is also valuable in considering how the project can refocus activities to meet its overall objectives. The project team has identified the following areas for further discussion and attention as it continues planning its activities.

The project team has been considering revisiting the overall approach of the project to include both a broader "open track" with a low level of participation and engagement and a narrower group focused directly on advancing progress on the project deliverables. This follows in part the project team's original plan for the working meeting, but we expect that the event may be organized differently than originally intended. Considering lessons learned from the Forum, both of these tracks will be informed by a more focused understanding of archival discovery and delivery. We will refine our facilitation techniques to reinforce concrete goals and expectations, to give participants dedicated time to reflect and organize their thoughts, and to make space for discussion around more nuanced topics, all while meeting the project's overall goals.

Scope and focus

The project team has spent significant time reflecting on the feedback regarding the perceived "scope creep" and lack of focus felt by Forum participants. In some ways, this is unsurprising, as Forum participants were selected across a wide range of roles and expertise, and the intent was to make the Forum broadly inclusive across this range (e.g. the communication to prospective participants that they need not be "technical experts"). The Forum planning team also took a slightly broader understanding of archival discovery and delivery to allow for additional generative conversations to take place. However, given the feedback, the project team needs to define and communicate the scope of the project and its remaining activities more clearly. A better definition of archival discovery and delivery, and communication of that definition, is an essential part of this, as a definition based solely on systems integration may be too narrow. This consideration must also contend with the suggestions from Forum participants about the broader need for user studies across archives and special collections, and the desire for more depth around how decolonization and anti-racism will impact archival discovery and delivery. Given the impact of the COVID-19 pandemic and broader efforts to achieve racial equity and address systemic racism, these priorities have likely become even more important. At the same time, the resources and structural arrangements of many institutions have been significantly impacted. Future efforts of the project to engage participants should allow for continued discussion and reflection on how to consider and incorporate these factors into all work supporting archival discovery and delivery.

Participation and community engagement

Overall, participants gave positive feedback around the breadth of roles and responsibilities represented by Forum participants.
Despite the project team increasing the overall capacity of the Forum, both livestream viewers and participants desired opportunities to engage more broadly, and wanted an even larger convening. The project should consider how remote participation can be best leveraged to meet the project's goals and objectives, especially as the pandemic will likely limit travel until well into 2021. The project team expects that the Working Meeting, originally scheduled for June 2020, will be held entirely online with remote participants, with synchronous and asynchronous components. The project team is considering having a broader "open track" with a low level of participation and engagement, and a narrower group focused directly on advancing progress on the project deliverables, intended as the focus for the Working Meeting. In addition, the project team will revisit the list of groups identified as missing or desired from the Social Network Webbing exercise and consider how participation can be targeted or refined to include them. Most notably, participants highlighted outreach to administrators and managers to ensure concerns are suitably amplified.

Participant feedback identified the value of the project providing a community framework for moving activities forward, with particular interest in developing larger collaborative networks. Specific areas of focus within these activities could include communities of practice supporting archives-focused user studies, virtual reading rooms, or other topics related to shared software development. In addition to the Working Meeting, which is focused on advancing progress on specific written deliverables, the project team is considering holding additional online events for further engagement. These events and community frameworks should also identify ways in which they can be sustainably connected to other efforts and groups beyond the grant period, such as sections of the Society of American Archivists, Digital Library Federation working groups, and communities and consortia supporting specific software or tools such as ArchivesSpace or BitCurator.

All of these possibilities under consideration must also be balanced against the availability of project resources in terms of both budget and personnel, especially given the challenges identified in the facilitator retrospective. While valuable, broader participation in online events and lightweight community frameworks will take planning and time to execute successfully. The project team will consult with the project's participant-advisors, and potentially with Forum participants, on which options are most impactful and can be balanced against resource constraints.

Facilitation and structure of meetings and activities

While the facilitation and structure of the Forum overall received positive feedback from participants, the project team also noted areas for improvement and refinement. Participants stated that additional presentations and reinforcement of concrete goals and expectations would have helped situate them at the Forum. The project team will carry these recommendations forward into its future activities. Additionally, consistent feedback regarding the need for participants to have dedicated time to reflect and organize their thoughts has led the project team to ensure that this is prioritized in future meeting and activity planning.
The project team will also identify facilitation techniques that can be used to make space for discussion around more nuanced topics, beyond relegating them to an unactionable "parking lot" list. Finally, the project team is determining how best to respond to the feedback requesting a sharable version of the Forum playbook, and what form related resources produced for external audiences might take to ensure they provide adequate context and detail about the facilitation methods and activities chosen.

Written contributions

In addition to the statement of principles to be developed within the forthcoming working meeting, the integration handbook serves as a primary written output for the project created through contributions from project participants. It is intended to describe use cases related to archival discovery and delivery for a particular institution or project, the systems to be integrated, and specific integration patterns and strategies as practical recommendations. It has become clear that this working title and focus do not resonate well with potential contributors, and thus the project team needs to provide a clearer description of what contributions could look like. The intent for this particular deliverable is to have contributors communicate about successes or challenges regarding archival discovery and delivery; this may take a variety of forms, including mapping out different systems integrations, identifying underserved user groups and understanding their needs around archival discovery and delivery, or providing a position paper on a particular area of interest. The project team will work to create a set of clear guidelines for contributions as it continues to organize its engagement activities.

Next steps

With a draft form of this report ready and IMLS approval of a no-cost extension for the project for another year, the project team spent much of the summer and fall of 2020 reengaging with the project's participant-advisors to consult on project direction. The project team also plans to spend much of its time over the coming months on community engagement, both with Forum participants and with the broader project audience. This is intended both to share the outcomes of the Forum and to start the process of engagement with potential contributors to the project's written products. The project team looks forward to engaging with participant-advisors and others on feedback regarding project direction, and will resume communicating regularly through its website, social media, and email list channels. The project team expects to issue a call for participation in January 2021 for the working meeting and authorship of contributions to the project's handbook, with the working meeting to be held over a series of two-hour online meetings and asynchronous work over 4-6 weeks in Spring 2021.

Appendices

● Application form
● Community Agreements and Code of Conduct
● Lighting the Way Forum Playbook
● 25/10 Crowd Sourcing Ideas
● Anonymized Who/What/When Matrix actions
● Feedback survey questions
● Quantitative feedback summary: https://docs.google.com/spreadsheets/d/1IFmcXUPr8v9cIN0PwbqK3-Vn7nIPLKsNLqquRlJi1Yw/edit

Lighting the Way Forum: Application Form

NOTICE: Please submit all applications through the online application form linked from the announcements and project website. This PDF of the application form is provided for your reference only.
Start of Block: Background and Call for Participation

Stanford University Libraries invites archives, library, and technology workers and those in related fields to self-nominate as participants for Lighting the Way: A National Forum on Archival Discovery and Delivery, funded by IMLS grant LG-35-19-0012-19. The forum event will take place at Stanford University in Stanford, California from February 10-12, 2020, with approximately 50 participants. Grant funds will allow us to fund partial to full travel costs, meals during the event, and lodging for most participants.

To apply, please complete the following application form, which requests information about you, your responsibilities, and your work related to the focus of the project. Please answer all questions to the best of your ability. The application should take 15-20 minutes to complete. The initial call for participation will be open from November 13 to December 15, 2019. Our project team will then review the nominations on a rolling basis, and will respond no later than January 10, 2020.

Information gathered in this application form will be used to select participants for the Forum, to inform Forum planning, and to identify opportunities for the project team to follow up with you. Your responses will not be shared beyond the project team and its participant advisors. If you have any questions or feedback about the application process or the project, please contact Mark A. Matienzo, the Project Director, at matienzo@stanford.edu.

End of Block: Background and Call for Participation

Start of Block: Background information

Background information
The following information is about you and your affiliation with an organization or project.

Contact information
o Name (1) ________________________________________________
o E-mail address (2) ________________________________________________
o Primary affiliation (e.g. employer or project) (3) ________________________________________________
o Position title at primary affiliation (4) ________________________________________________

Which of the following best describes your primary affiliation?
o 2 year college/university (1)
o 4+ year college/university (2)
o Other academic institution (3)
o Government agency (4)
o Tribal agency (5)
o Nonprofit organization (6)
o For-profit organization (7)
o Community archives (8)
o Self employed/Consultant (9)
o Other (please specify) (10)
o Don't know (11)

How would you describe your primary affiliation? Examples: "A small special collections library within a large public library system"; "a vendor focusing on digital collections systems"
________________________________________________________________

In which US state or territory do you currently reside?
▼ Alabama (1) ... I do not reside in the United States (58)

Display This Question: If 50 States, D.C. and Puerto Rico = I do not reside in the United States

In which non-US country do you currently reside? Please note that while we welcome applications from potential participants outside of the United States, international travel support is only available on a case by case basis.
▼ Afghanistan (1) ... Zimbabwe (1357)

We strongly encourage self-nominations from individuals who identify with or whose work directly serves underrepresented and/or marginalized populations, including those not well-represented within libraries, archives, or technology (e.g.
women, people of color, LGBTQ+, ability/disability, non-binary gender identities, etc.). We also encourage applications from members of underrepresented and/or marginalized groups that don't fit into the categories listed above.

Do you identify as a member of any underrepresented or marginalized populations?
o Yes (1)
o No (2)
o Don't know (3)

Does your work support underrepresented or marginalized populations?
o Yes, it directly supports underrepresented or marginalized populations (1)
o Yes, it indirectly supports underrepresented or marginalized populations (2)
o No (3)
o Don't know (4)

Do you receive travel support from your employer or primary affiliation for meetings, conferences, or other professional travel?
o Yes, I receive full support (1)
o Yes, I receive partial support (e.g. lodging only; flights only; no meals) (2)
o No (3)
o Don't know (4)

If selected, will you need the Lighting the Way Forum to fund your travel to attend in person?
o Definitely yes (1)
o Probably yes (2)
o Might or might not (3)
o Probably not (4)
o Definitely not (5)

Regardless of funding, are you otherwise able to attend the Lighting the Way Forum, to be held February 10-12, 2020 at Stanford University in Stanford, California?
o Definitely yes (1)
o Probably yes (2)
o Might or might not (3)
o Probably not (4)
o Definitely not (5)

End of Block: Background information

Start of Block: Information about your current role

Information about your current role
The following sets of questions relate to your current role or position at CURRENT_AFFILIATION.

How would you describe your current role or position at CURRENT_AFFILIATION? (Select all that apply.)
▢ Archives or library worker (1)
▢ Technology worker (2)
▢ Legal, copyright, or risk management worker (3)
▢ Managing a program that employs archives or library workers (4)
▢ Managing a program that employs technology workers (5)
▢ Managing a program that employs legal, copyright, or risk management workers (6)
▢ Teaching in an archival, library, or technology-related education program (7)
▢ Studying to be an archives, library, technology, or legal/risk management worker (8)
▢ Working in another profession or occupation, but with archives or library-related responsibilities (9)
▢ Working in another profession or occupation, but with technology-related responsibilities (10)
▢ Working in another profession or occupation, but with legal, copyright, or risk management-related responsibilities (11)
▢ Administering a program serving archives or library interests but not working directly with collections (e.g., consortium, vendor, granting agency, education provider, professional association) (12)
▢ Administering a program serving technology interests but not working directly with archives and special collections (e.g., consortium, vendor, granting agency, education provider, professional association) (13)
▢ Administering a program serving legal, copyright, or risk management interests but not working directly with archives and special collections (e.g., consortium, vendor, granting agency, education provider, professional association) (14)
▢ Other (Please specify) (15)

Display This Question: If How would you describe your current role or position at ${q://QID3/ChoiceTextEntryValue/3}? (Sele...
= Other (Please specify)

Briefly describe your current role or position at CURRENT_AFFILIATION:
________________________________________________________________

Select that which best describes your current employment status in regards to role or position at CURRENT_AFFILIATION:
▼ Employed, full time (1) ... Other (please describe) (8)

Display This Question: If Select that which best describes your current employment status in regards to role or position at... = Employed, full time Or Select that which best describes your current employment status in regards to role or position at... = Employed, part time

Please indicate whether your position is a permanent position or a term, temporary, or contingent position:
o Permanent (1)
o Term, temporary, or contingent (2)
o Rather not say (3)

Display This Question: If Select that which best describes your current employment status in regards to role or position at... = Other (please describe)

Please describe your current employment status:
________________________________________________________________

What are your primary duties that relate to archives/library work at CURRENT_AFFILIATION? (Select all that apply.)
▢ Public services (reference, instruction, outreach, exhibits) (1)
▢ Technical services (arrangement, description, accessioning, metadata, cataloging) (2)
▢ Collection development (acquisition, appraisal, donor relations) (3)
▢ Digital library projects (including digitization) (5)
▢ Preservation (conservation; physical materials only) (6)
▢ Born-digital archives or digital preservation (7)
▢ Other (please describe) (4)

Display This Question: If What are your primary duties that relate to archives/library work at ${q://QID9/ChoiceTextEntry = Other (please describe)

Specify any additional duties that relate to archives/library work at CURRENT_AFFILIATION:
________________________________________________________________

What are your primary duties or responsibilities that relate to technology work at CURRENT_AFFILIATION?
▢ Software development (1)
▢ User experience design (2)
▢ Project management (3)
▢ Product management (4)
▢ Support (5)
▢ Systems administration (6)
▢ Other (please describe) (7)

Display This Question: If What are your primary duties or responsibilities that relate to technology work at ... = Other (please describe)

Specify any additional duties that relate to technology work at CURRENT_AFFILIATION:
________________________________________________________________

What are your primary duties or responsibilities that relate to legal/ethical/risk management work at CURRENT_AFFILIATION?
▢ Policy development (1)
▢ Privacy issues (2)
▢ Copyright/intellectual property (3)
▢ Legal compliance (4)
▢ Policy compliance (5)
▢ Cultural protocols (6)
▢ Other (please describe) (7)

Display This Question: If What are your primary duties or responsibilities that relate to legal/ethical/risk management wor... = Other (please describe)

Specify any additional duties that relate to legal/ethical/risk management work at CURRENT_AFFILIATION:
________________________________________________________________

End of Block: Information about your current role

Start of Block: Information about your work and projects

Information about your work and projects in relation to archival discovery and delivery
This section focuses on getting information about the work that you or your organization/project is doing in relation to the Forum.
Please be as specific as you can within the character limits for each question. "Archival discovery and delivery" is how we describe what people and systems do to find, access, and use material from archives and special collections. Systems that support archival discovery and delivery include but are not limited to those supporting search and presentation of archival description, delivery and presentation of digital objects, request management systems, and interpretation and crowdsourcing.

Please describe your (or your organization/project's) work on past, current or planned projects or needs related to archival discovery/delivery.
________________________________________________________________

Please list any systems (e.g. software, tools, etc.) that you use in your work to support archival discovery/delivery.
________________________________________________________________

Successes and challenges
For each question below, please include any detail about specific technologies (systems or tools), how staff across job functions work together, institutional contexts, or other issues that describe how your work on archival discovery and delivery has been successful or is challenging.

Please describe any successes you have had in archival discovery/delivery.
________________________________________________________________

Please describe any current/continuing challenges you face around archival discovery/delivery.
________________________________________________________________

Would you be willing to present or write about your work (or that of your institution/project) in relation to the forum?
o Yes (1)
o No (2)
o Don't know (3)

Please describe any other areas of expertise, interests, topics, or perspectives you could bring to the Forum.
________________________________________________________________

End of Block: Information about your work and projects

Start of Block: Feedback

Feedback to the Project Team
This is the final section of the application, and is optional. It allows you to provide additional feedback to the project team, such as recommending other potential participants or suggesting particular topics for discussion.
If you have specific suggestions about people or topics, please identify both who/what they are, and why you are proposing them.

Do you have other suggestions about potential participants?
________________________________________________________________

Are there specific topics you want the Forum to cover?
________________________________________________________________

Are there specific topics you want the Forum to avoid?
________________________________________________________________

Do you have other questions/feedback for us about the project?
________________________________________________________________

End of Block: Feedback

Community Agreements and Code of Conduct

The Lighting the Way project is committed to providing a productive, inclusive, and welcoming environment for discussion and collaboration about archival discovery and delivery, following the Stanford University Libraries policy on workplace and sponsored conference conduct.[1] To support this and to further the goals of the project, we expect all participants to follow our Community Agreements and Code of Conduct, including project staff, advisors, event participants, and other contributors.

The Community Agreements outline ways in which we encourage and expect each other to hold safe, engaging, and respectful discussions. The Code of Conduct outlines behaviors which will not be tolerated, how to report concerns or incidents, and how the code will be applied.

Community Agreements

Our project seeks to address the broader challenge of how to improve archival discovery and delivery, or what people and systems do to find, access, and use materials from archives and special collections. We recognize that this work is supported by a wide range of responsibilities and kinds of expertise, across institutional contexts, levels of resourcing, and the types of communities we serve. We also recognize that people may be discouraged or excluded from these conversations in a local context based on their identity or systemic issues including racism, classism, sexism, homophobia, and more.
To this end, we have established a core set of principles for the project:

● We believe everyone has something to contribute; not everyone needs to be a self-identified expert.
● We focus on shared and holistic concerns and recommendations, rather than focusing on specific technologies or tools.
● We enable the adaptability of recommendations across contexts, communities, and levels of resourcing.
● We develop recommendations consciously as an inclusive expression of professional ethics and values.
● To be truly transformational, our work must be conducted in a space that acknowledges the power dynamics of bringing together workers across professional contexts, roles, and job classifications, acknowledging institutional privilege, and the lack of representation of marginalized people within the archives, library, and technology sectors.

We expect all participants to practice community by agreeing to the following:

● To ensure only one person speaks at a time, and to consider pausing to allow those who need more time to process or interject in conversation to do so.
● To make space and take space - encourage and yield the floor to those whose viewpoints may be under-represented in a group, and take space made for you as you're able.
● To listen to and respect a person's description of their experiences, including but not limited to those related to marginalization and discrimination.
● To recognize the interdependent nature of our work to support archival discovery and delivery.
● To acknowledge that choices around practice, implementation, and technology vary widely and can be dependent on the availability of resources, and to respect our work as incremental.
● To provide a space where everyone can feel comfortable participating, even if they don't use specific terminology or the perfect way to express their ideas or knowledge.
● To embrace curiosity and creativity, allowing for the opportunity to try new ideas, consider other perspectives, and establish new patterns.
● To use welcoming language (including a person's pronouns) and to favor gender-neutral collective nouns ("folks" or "y'all," not "guys").
● To give credit where it's due, and to uplift each other's work and ideas.
● To accept critique and feedback graciously, and to offer it constructively.
● To seek concrete ways to make our physical spaces and online resources more universally accessible.
● To acknowledge the difference between intent and impact, and to look for ways to take responsibility for the negative impact that we have.
● To be aware of time, respecting the commitment of all participants and project staff to accomplish the goals of the meeting.
● To take the moments we each need to care for ourselves and our community, by paying attention to the needs of our bodies and minds, and to the welfare of those around us.

[1] "Workplaces and sponsored conference conduct." Stanford University Libraries, accessed May 3, 2020. https://library.stanford.edu/using/special-policies/workplace-and-sponsored-conference-conduct/

Code of Conduct

The Lighting the Way project seeks to provide participants with opportunities for collaboration that are free from all forms of harassment and inclusive of all people. All communication should be appropriate for a professional audience including people of many different backgrounds. Verbal comments that reinforce social structures of domination related to gender, gender identity and expression, sexual orientation, disability, physical appearance, national or regional origin, body size, accent, race, age, religion, or other marginalized characteristics are inappropriate. Do not insult or put down other participants. Be careful in the words that you choose. Sexist, racist, and other exclusionary jokes are not appropriate for the forum.

Harassment is understood as any behavior that threatens or demeans another person or group, or produces an unsafe environment. It includes offensive verbal comments or non-verbal expressions that reinforce social structures of domination; sexual or discriminatory images in public spaces (including online); deliberate intimidation, stalking, or following; threats or incitement of violence; photography or recording without clear permission; sustained disruption of presentations or discussion; inappropriate physical contact; and unwelcome sexual attention.

Photography and Recording

Presentations from the Lighting the Way Forum (e.g. in the 9:00 AM-12:30 PM block on Monday, February 10) will be live-streamed and recorded. All speakers are required to review and sign Stanford University's speaker release form.

We otherwise ask you not to photograph fellow participants without the permission of all those being photographed. Please ensure when taking group photos that everyone in the picture agrees where the photograph will be shared. If you wish to record at the event for personal use, please speak with the project team before the Forum.

Applying the Code of Conduct

All project participants — including the project team, facilitators, and participants — are expected to abide by this Code of Conduct in person, in online spaces, and while present in any groups of project participants inside or outside a formal project event (e.g. receptions and informal gatherings). Participants violating the Code of Conduct will be warned and may be asked to leave an event, and in some cases, may be asked to no longer participate in the project. If you are being harassed, witness another participant being harassed, or have any other concerns, please contact a person listed below.
For guidance on how to address reports of violations of the Code of Conduct or Community Agreements, see "Procedures for Responding to Violations of the Code of Conduct and Community Agreements."

The project team and designated facilitators will be on hand to respond to Code of Conduct violations and assist in following the Community Agreements. If you witness, suspect, or are the target of a violation of the Code of Conduct at the Forum, contact a project team member or facilitator. At events, they are identifiable by distinctive striped lanyards for their badges.

Procedures for Responding to Violations of the Code of Conduct and Community Agreements

Our project Code of Conduct and Community Agreements are a statement of values. Ultimately, however, they are only as good as their enforcement procedures. This procedure documents actions to be taken by project staff and volunteers in the event of a violation of either the Code of Conduct or the Community Agreements.

Taking reports

Upon receiving a report of a violation of the Code of Conduct or Community Agreements, ask the reporter if they would like to make a formal report. Let them know that you can't make any promises about how it will be handled, but their safety and confidentiality will be a priority. Take a written report, or write down verbal reports as soon as possible. Reports of any length should be taken in a quiet, private space (e.g. the "VIP Room" off the Bechtel Conference Center's Main Hall), not a reporter's hotel room. If the following information is not volunteered in the written or verbal report, ask for it and include it, but do not pressure the reporter:

● Identifying information (name if possible) of the participant violating the Code of Conduct or Community Agreements
● Reporter's name and contact information
● The approximate time and date of the behavior (if different than the time the report was made)
● Place of the incident
● What happened (try to collect as much information as possible to provide a clear understanding of what occurred)
● Other people involved in the incident

Do not question the reporter's truthfulness. It is your job to maintain a supportive environment and ensure that fair procedures are followed, not to conduct an investigation. Do not summon law enforcement unless there is a threat to physical safety, or at the request of the reporter (see the Threats to physical safety and law enforcement section, below).

If the reporter is distressed and/or needs additional assistance, offer them a private space to be in, ask how you can help, and make sure they have local emergency contact information (included in the Code of Conduct). Ask if there is a trusted friend they would like you to get; if so, have someone bring that person.
If the incident was widely witnessed: Thank them for the report and tell them you will convene the members of the project team and/or designated facilitators.

If the incident was private: Thank them for the report and say you will convene the relevant project staff if that is okay with them. Consent is critical. Be explicit with the reporter about with whom you intend to share the report, e.g. project staff, facilitators, or other volunteers.

Do not:

● Pressure them to withdraw the report.
● Ask for their advice on handling the report or imposing penalties. This is the responsibility of project staff and facilitators.
● Share details of the incident with anyone, including project staff, facilitators, or other Stanford employees, without the specific consent of the reporter.

Be aware that people who have experienced harassment and abuse may be re-traumatized if the details become public. In addition, abusers may recognize these details, even if they have been anonymized, become angry at the reporter, and enact further trauma. Again, confidentiality and consent are incredibly important.

Threats to physical safety and law enforcement

If you have any concerns as to anyone's physical safety, contact venue security or local law enforcement immediately.

Do not involve law enforcement under any other circumstances except by request of the reporter. Remember that some participants will experience law enforcement as increasing, not diminishing, threats to their safety, so it is very important that they be in control of this choice.

If escalation leads to a harasser being required to leave an event, and they refuse to leave, it may be necessary to involve venue staff, other Stanford employees, or law enforcement as a last resort.

Addressing reports involving Stanford community members

If a report concerns a Code of Conduct violation and directly involves a Stanford community member (either the reporter or the participant violating the Code of Conduct is Stanford faculty, staff, student, postdoc, etc.), then the report should be brought to the attention of the following Stanford Libraries staff:

● Tom Cramer, AUL for DLSS
● Catalina Rodriguez, Director of HR
● Gary Harris, Associate Director of HR

As indicated above, such reports should be shared only if the reporter has granted their consent to do so.
Recusal process

Conflicts of interest may include relationships of the following nature with either party:

● Close friendships
● Business partnerships
● Romantic relationships
● Family relationships
● Hierarchical academic or business relationships
● Any other significant power relationship
● Significant personal conflict
● Involvement in the incident

If you think the nature of your relationship with either party is such that you would be significantly biased for or against them, or if you would be in a position to retaliate against or receive retaliation from either party depending on the outcome, you should recuse yourself. Additionally, if the nature of your relationship is such that outside people might reasonably perceive a conflict of interest, you should recuse yourself.

It is not necessary to recuse yourself on the basis of having been present at a public violation under discussion, or on the basis of the sort of general friendships and acquaintanceships which many people share in professional spaces.

Recusing yourself means you should stop influencing the decision in any way. Don't participate in the discussion, and don't discuss the decision with others (including other staff), read or write the documentation, etc. If there are email threads, group chats, etc., leave them if possible (and if you haven't recused yourself, don't include people who are recused in these group communications).

Responding to reports

Send the report immediately to the team of facilitators and project staff listed as contacts for the event using established private communication channels, and/or convene a meeting (physical or virtual) as soon as possible (within 2 hours if during an event, or within 1 business day if not at an event). Do let the alleged harasser know that a complaint has been lodged (reread the language above about confidentiality and consent first). Project staff and facilitators are not in a position to conduct exhaustive investigations, so don't. It may be necessary and prudent to gather some additional information before reaching a decision, however.

At the meeting, discuss:

● What happened?
● Are you doing anything about it?
● If so, who is doing it?
● When will they do it?
Specific sanctions may include but are not limited to:

● warning the harasser to cease their behavior and that any further reports will result in other sanctions
● requiring that the harasser avoid any interaction with, and physical proximity to, the reporter(s) for the remainder of the event
● early termination of a presentation that violates the policy
● not publishing the video or slides of a presentation that violated the policy
● not allowing a speaker who violated the policy to give (further) talks at the event
● immediately ending any event volunteer responsibilities and privileges the harasser holds
● requiring that the harasser not volunteer for future project events (either indefinitely or for a certain time period)
● requiring that the harasser immediately leave the event and not return
● banning the harasser from future events (either indefinitely or for a certain time period)
● publishing an account of the harassment

Keep in mind that it is never a good idea to require an apology. If a harasser would like to apologize, this may also be a bad idea. Do not include the reporter, the alleged harasser, or anyone with a conflict of interest in this meeting.

If there is no consensus in the group on a response, the coordinators will determine and communicate the course of action.

Violations that have been reported secondhand, not by the target of the violation, should be handled on a case by case basis. Keep an eye on those involved in the report and, if need be, approach the affected parties.

Communications

Project staff and facilitators will determine whether private (not widely witnessed) incidents need to be addressed with the community of project participants. Widely witnessed incidents should be addressed to the broader community of project participants.

Involved parties

As soon as possible after the meeting, communicate your decision and any actions you are taking to involved parties.

When meeting with someone accused of harassment, follow the Rule of Two: have two volunteers in the room. Any more than two might be viewed as piling on the person; any fewer than two is a safety concern.

Remind individuals of the Community Agreements and point out any pertinent sections regarding the nature of the report. The Community Agreements serve to provide a structure for supportive, effective, and inclusive collaboration.

The broader project community

First, reread the language above about confidentiality and consent, and consider this section in that light.

● Do respond quickly.
● Do keep individuals on both sides of an incident anonymous. (Potential exceptions: when a harasser is a conference staffer; when the incident was public and high-profile.)
● Do provide a general sense of the nature of the incident.
● Do say what you have done in response to the incident.
● You may briefly note any steps taken by harassers to remedy the situation
(e.g. apology, leaving the conference). Don't give them a cookie for it.
● Do provide multiple avenues for community feedback to project staff. This feedback should be private. If you provide only one feedback mechanism, make sure it is accessible to everyone (e.g. email good, in-person conversations bad).
● Do reiterate the project's values.

Your goal is to be transparent about your process and values while respecting the privacy of the individuals involved. Keep it brief and clear. There will probably be upset community members who want to talk. Conference staff should listen to them nonjudgmentally, take notes if needed, thank them for their feedback, and not flip into problem-solving or explaining mode. Apologize as needed; avoid defensiveness.

After events

If someone's conduct was egregious enough that they should be banned from further participation in the project (e.g. future events or contributions), this needs to be recorded and communicated to the Project Director (Mark Matienzo), Tom Cramer (Associate University Librarian for DLSS), or someone else on the core project team.

References

The Community Agreements and Code of Conduct were developed through consultation and adaptation of a number of existing sources, including:

● The Collective Responsibility Code of Conduct and Community Agreement[2]
● The LDCX 2019 Code of Conduct[3]
● The Digital Library Federation Code of Conduct[4]
● The Recurse Center Social Rules[5]
● AORTA's Anti-Oppressive Facilitation for Democratic Process: Making Meetings Awesome for Everyone[6]
● Seeds for Change's Group Agreements for Workshops and Meetings[7]
● Valerie Aurora and Mary Gardiner's How to Respond to Code of Conduct Reports[8]

Response procedures were adapted from the Code4Lib response procedures.[9]

[2] "Code of Conduct and Community Agreement." Collective Responsibility: National Forum on Labor Practices for Grant-Funded Digital Positions, accessed May 3, 2020. https://laborforum.diglib.org/code-of-conduct-and-community-agreement/
[3] "Code of Conduct - LDCX 2019 Conference." Stanford University Libraries, accessed May 3, 2020. https://library.stanford.edu/projects/ldcx/2019-conference/code-conduct
[4] "DLF Code of Conduct." Digital Library Federation, accessed May 3, 2020. https://www.diglib.org/about/code-of-conduct/
[5] "Social rules." Recurse Center, accessed May 3, 2020. https://www.recurse.com/social-rules
[6] AORTA (Anti-Oppression Resource & Training Alliance). "Anti-Oppressive Facilitation for Democratic Process: Making Meetings Awesome for Everyone," June 2017. https://aorta.coop/portfolio_page/anti-oppressive-facilitation/
[7] "Group Agreements for workshops and meetings." Seeds for Change, accessed May 3, 2020. https://www.seedsforchange.org.uk/groupagree
[8] Valerie Aurora and Mary Gardiner, How to Respond to Code of Conduct Reports. Frameshift Consulting LLC, 2019. https://files.frameshiftconsulting.com/books/cocguide.pdf
[9] "Procedures for reporting and responding to violations of Code of Conduct." Code4Lib Code of Conduct, January 15, 2020. https://github.com/code4lib/code-of-conduct/blob/master/procedures.md.

Lighting the Way Forum Playbook
A Resource for Facilitators, Notetakers, and Vendors

Overview

Goals
The Lighting the Way Forum focuses on information sharing and collaborative problem solving around improving how user-facing systems support discovery and delivery for archives and special collections. The goals for the Forum are:

● To allow participants to visualize, map, and build connections – between one another, their work, the systems they rely on, and the communities they serve
● To organize around shared opportunities and challenges, identified by participants during group activities
● To provide a platform for engagement with the project, leading to participation in other project activities (e.g. attending the working meeting or contributing to written products like the integration handbook)

These goals align with the four primary project goals:

● Map the ecosystem supporting archival discovery and delivery.
● Develop both conceptual and actionable recommendations for systems integration for technical, ethical, and practical concerns.
● Build a shared understanding between workers in fields like archives, library, and technology undertaking this work.
● Activate a diverse group of project participants.

Design
The Forum uses a mix of plenary presentations and facilitated breakout activities to focus on addressing the goals listed above, following the "divergent - emergent - convergent" model described in Gray, Brown, and Macanufo's Gamestorming (https://www.oreilly.com/library/view/gamestorming/9781449391195/ch01.html) and widely used in design ideation sessions (https://www.interaction-design.org/literature/article/understand-the-elements-and-thinking-modes-that-create-fruitful-ideation-sessions). Day 1 focuses on divergent activities (set the stage, develop themes, etc.); day 2 focuses on emergent activities (examine, explore, and experiment); and day 3 focuses on convergent activities (conclusions, decisions, action).

Schedule overview
The draft schedule (https://docs.google.com/spreadsheets/d/1bwT53AzdynbwuJxutyLN7ptbc-yVlnworngxSitgunY/edit#gid=0) has the forum running across two and a half days, with meals and breaks in blue, facilitated activities in red, presentations in green, facilitator announcements in yellow, and debrief sessions for facilitators in purple. The public agenda (https://library.stanford.edu/projects/lightingtheway/forum-february-2020/agenda) shows the schedule for participants.

Room layout
[Room layout diagram]

Facilitators and notetakers
Facilitators and notetakers are essential to the success of the Forum.
We want to make sure that facilitators and notetakers understand their responsibility, and encourage them to review the following resources:

● Community Agreements and Code of Conduct (read this first!) - https://docs.google.com/document/d/1EpIj8JlzD114GlNkvoiNf89Q79MZ4rmIkHpTORj1mX8/edit
● AORTA Collective Anti-Oppressive Facilitation Guide (read this second!) - http://aorta.coop/wp-content/uploads/2017/06/AO-Facilitation-Resource-Sheet.pdf
● Procedures for Responding to Violations of the Code of Conduct and Community Agreements - https://docs.google.com/document/d/1MmNYz1eG1ZqchaczDX3Kup5iYAyRPB9YP7p0FeQBGsE/edit

Facilitators help keep the conversation going, determine how to help people stay engaged, help answer questions about specific activities, watch and respond to difficult social dynamics, and make sure everyone can participate and be heard. The Forum relies on structured activities that allow people to engage with one another in small groups, at 7-8 person tables, in 12-16 person large groups, and across the room. Facilitators at the table level can participate in exercises, but must stay attuned to their role and not dominate conversations or allow them to be derailed.

Notetakers help document the Forum and its activities to help our project achieve its goals, and to help make space for relationship-building among the participants. We may ask for notetakers at the end of a given exercise during group discussion to identify the most promising outcomes, and sometimes we may need them during an exercise to help with reporting out to the larger room.

Brief descriptions of key roles follow:

● Primary facilitator: a facilitator responsible for the overall structure, timing, and flow of a given day
● Activity facilitator: a facilitator responsible for a specific activity or section of the program; acts as timekeeper or delegates timekeeping to a floater facilitator
● Floater facilitator: a facilitator who helps support the activity facilitator while a given activity is underway; assists the activity facilitator with timekeeping
● Table facilitator: a facilitator who helps a given table/group stay on track during activities
● Notetaker: someone to record the output of activities using Google Docs or other techniques as needed or preferred, either at the level of an individual table or in room-wide report-outs
● Livestream monitor/moderator: responsible for checking that the livestreaming function is working, and watching for any questions from remote participants.

Day 1

Set up

● Room should be set up with 10 tables of 7 chairs each, plus 16-20 additional chairs in the back. Each table should have the following: (confirm stock each day)
○ 1 Post-It easel pad
○ 10+ pens
○ 10+ markers (various colors; colors not significant)
○ 15+ 5x8" index cards
○ 1 roll of painter's tape
○ 7 pads of 3x3" Post-It notes per table (various colors; colors significant)
○ 1 or 2 printed copies of the Forum Playbook (this document)
● Ensure an additional supply is available of the following:
○ A few extra Post-It easel pads
○ Pens
○ Markers
○ Index cards
○ Post-It notes
○ Dot stickers
○ 1 ream printer paper
○ 24-pack bottled water
● Ensure power (surge protector?) is available at each table and at the podium
Registration

● Participants should receive the following:
○ Name badge/lanyard
○ Folder with handouts
■ Schedule
■ Directions
■ List of participants (no contact information)
■ Community Agreements and Code of Conduct
■ Stanford Libraries swag (stickers, laptop camera cover, etc.)
● If a participant is listed as having specific food needs (restrictions/allergies), confirm with them.
● People can sit with whomever they'd like, although they're encouraged to sit with people they don't know.

Breakfast / Getting to know one another

Script:
● We encourage you to introduce yourself to everyone at your table. Let the conversation flow naturally, or choose to answer any of these questions.
● For people who like icebreakers:
○ What's your favorite snack?
○ How do you celebrate a job well done?
○ What was your favorite vacation you've ever taken and why?
● For people who don't like icebreakers:
○ What brought you to the Forum?
○ What motivates you?
○ What do you think is the most rewarding outcome we could achieve for the Forum?

Pre-Activity: Trading Cards (15 minutes)
Lead(s): Individual table facilitators
Activity Type: Divergent
Goal(s): Get participants to talk to one another
Reference material: Trading Cards - Gamestorming (https://gamestorming.com/trading-cards/)
Supplies: 1 index card and 1 marker/pen per person
Outcomes: Artifact to generate discussion.
Script:
● We see that you are starting to have conversations with one another and want to give you another way to get to know the participants at your table.
● We're going to have each of you make a "trading card" for yourself. Take a blank index card and a marker or pen. At the top, write your name, and below that draw a picture that represents you. Save a couple lines at the bottom. On one line, write a few words that describe what kind of work you do. Below that, write a few words that describe something you're excited about in terms of the Forum. We'll take five minutes for you to draw your trading card.
● (After 5 minutes) OK - you all should have your trading cards now. Now, we'll spend five minutes trading cards around your table. When someone hands you a card, hold onto it for a moment, think of a question you might ask that person, and write it on the back of the card. After you've done that, you can trade your card to someone else at the table. You can hold onto at most one card if you want to follow up on a specific question that either you or someone else wrote.
● (After 5 minutes) Now, you should all have another person's card with a question or two on the back. At each table, you can ask the person whose card you're holding the question on the back, and they can either answer or pass. We'll take only three minutes to do this, so make your answers fast!
Welcome, Logistics, Announcements, and Invited Talks (3 hours)

Activity 1 - Mad Tea (60 minutes)
Activity Type: Divergent
Goal(s): Get participants to talk to one another; articulate shared concerns; build energy and creativity
Reference material: Mad Tea - Liberating Structures (http://www.liberatingstructures.com/mad-tea/); 1-2-4-All - Liberating Structures (http://www.liberatingstructures.com/1-1-2-4-all/)
Supplies: Bell or other noisemaker for activity facilitator; paper for idea generation; Google Docs for notes
Outcomes: Idea sharing/reflection using strategy worksheet; notes doc on opportunities
Script:
● Our first activity invites you to finish a list of open-ended sentences that relate to shaping the direction of how we think about and discuss archival discovery and delivery. We'll show you questions on the screen and read them off to you.
● We'll form two concentric circles. For those of you on the inside of the tables towards the center of the room, face the outside of the room; for those on the outside of the tables, face the inside of the room. If you're able and willing to stand, we recommend doing so.
● You will start by pairing with someone directly across from you. One of you completes the sentence first. When you hear the sound of [whatever] once, or see me raise [whatever], the other person completes the sentence. When you hear the sound of [whatever] twice, or see me raise [whatever], rotate two people to your right if you are willing to move.
● [Questions - use list from LS Mad Tea; add to a slide deck]
○ [Show first question] And we are going to start with this as the first question. [read question]
○ [After 45 seconds, make sound/signal once so partners switch] OK, now the other person completes the sentence.
○ [After 45 seconds, make sound/signal twice] Time to rotate two people to your right. [Show/read next question]
○ [Repeat for the second and third questions until done]
● Now that you've gotten a chance to talk to a lot more people, let's have you return to your original seat if you can find it. You're now invited to reflect on your responses and what you heard. Please spend ten minutes answering the following questions: [Present these on a slide]
○ 1. What is the deepest need for my / our work?
○ 2. What is happening around me / us that demands creative adaptation?
○ 3. Where am I / are we starting, honestly?
○ 4. Given my / our purpose, what seems possible now?
○ 5. What paradoxical challenges must I / we face down to make progress?
○ 6. How am I / are we acting our way forward toward the future?
● [After 10 minutes] Now that you've reflected on these questions, spend two minutes in silent reflection answering the following question. What are the biggest opportunities we have in terms of improving archival discovery and delivery?
● [After 2 minutes] I invite you to now spend five minutes reflecting with one or two people at your table about your answers to this question, and identifying other opportunities. Make sure someone is a note taker within your group.
● [After 5 minutes] Now that you've discussed in groups of two or three, I encourage you to share your answers and generate additional ideas at your table. Let's spend five minutes doing that. Make sure you have a note taker who is listing all of the ideas.
● [After 5 minutes] So by now, you should have discussed opportunities for improving archival discovery and delivery at your table.
Let's share around the room - name one opportunity and one challenge that stuck out in your discussion. Let's keep the discussion brief, perhaps 90 seconds at most per table. [Ensure we have a note taker for the room]

Break (30 minutes)
Activity type: Break
Outline:
● Remind people that we're starting promptly at 3:00

Activity 2A - Speedboat (45 minutes)
Activity Type: Divergent
Goal(s): Identify drivers/hindrances; scope out problem space; expand on the previous exercise and its data
Reference material: Speedboat - Innovation Games (https://www.innovationgames.com/speed-boat/); Speed Boat - Gamestorming (https://gamestorming.com/speedboat/)
Supplies: 1 easel pad and post-its/markers per table; notes document for group sharing
Outcomes: Data to support emergent exercises (keep notes/transcribe); clearer set of positive/negative perspectives on problem space
Script:
● In this next activity, we're going to focus on strengths and challenges by using the metaphor of a boat. At each table, you should have an easel pad and some markers. Start by drawing a boat on the easel pad. This boat is the good ship Archival Discovery and Delivery. At your table, each of you should start adding sails that speed the boat up — representing strengths or things that provide us forward momentum — or anchors that slow it down (challenges, obstacles, and the like). Spend 15 minutes doing this, taking turns adding ideas, but keep discussion at this point minimal. Use post-it notes to add sails and anchors to make it easier.
● After 15 minutes, discuss all the sails and anchors your table added as a group. Do any of them seem more significant than others? If the discussion helps you identify more sails or anchors, you can add them. Spend 15 minutes on that part. After that, we'll share with the full group for 15 minutes. Ready to start?
● [After 15 minutes] By now your tables should have a boat with some anchors and sails. Begin shifting to the group discussion.
● [After 15 minutes] OK, let's now share across each group. [Ensure notetakers are documenting this; this will help feed into TRIZ on day 2]
○ Suggested questions:
■ Do any of these anchors and sails feel directly oppositional?
■ What stands out to you about these ideas?

Activity 2B - Low-Tech Social Networking (45 minutes)
Activity Type: Divergent
Goal(s): Identify connections between people
Reference material: Low-Tech Social Network - Gamestorming (https://gamestorming.com/low-tech-social-network/)
Supplies: 1 easel pad, 5x8" index cards, painter's tape per table; butcher paper?
Outcomes: Data to support emergent exercises (keep notes/transcribe); clearer set of positive/negative perspectives on problem space
Script:
● As a group, we are going to build the social network that is in the room right now. We're going to use this wall to do it. But first, we need to create the most fundamental elements of the network: who you are. If you're able to get your trading card from another person at your table, do so, or create a new one with a drawing of your avatar, your name at the top, and a few words or "tags" at the bottom that describe who you are, what you do, or what you're interested in. We'll give you two minutes to do that now.
● [After 2 minutes] Now that you have your new avatar, take a blank sheet from your easel pad. "Upload" your profiles to the social network by taping your cards to the sheet. Once you've done that, starting at your tables, start with the people you know and draw lines to make the connections.
You can ask people questions to see how you're connected with people that you don't already know well. Label the lines if you can, like "friends with," "works with," "went to school with," etc. We'll spend ten minutes doing that.
● [After 10 minutes] Let's do this again with two tables - meet and combine your social networks, and draw the connections. We'll do this for ten minutes.
● [After 10 minutes] Now, let's take all the sheets and put them on this wall. Spend the next 15 minutes looking at the network and reflecting on who else you happen to be connected to on the network. We're not going to draw lines at this point because we don't want the markers to bleed through the wall. As you look at the network, ask yourself which connections you find the most noticeable or striking.
● [After 15 minutes] Now that you've had some time to reflect, let's take a few minutes to talk about what you noted about the connections. What stood out to you the most?

Retrospective and Prep for Day 2 (30 minutes)
Activity Type: Divergent
Goal(s): Feedback
Reference material: The 4L's: A Retrospective Technique (https://www.ebgconsulting.com/blog/the-4ls-a-retrospective-technique/)
Supplies: Google Docs for notes, post-its and pens
Outcomes: Information to check on course correction
Script:
● Now that we've concluded the last exercise for the day, we're going to spend some time reflecting on today. As a reminder, we started the day with a trading card exercise, had a series of presentations, took a break for lunch, and had three activities in the afternoon. The first was Mad Tea, which got you talking to one another around the room and built in time for reflection. The second was the Speedboat, where we looked at strengths we have and challenges we face in terms of archival discovery and delivery. Finally, we started building our Low-Tech Social Network to make connections to one another clearer. The focus today was on a divergent approach to get you to articulate ideas and new possibilities.
● To help you reflect, we're going to use a technique called the 4Ls retrospective. The 4Ls are Liked, Learned, Lacked, and Longed For. For "Liked," think about what you enjoyed or what went well - it could be anything from the content to the tools to conversations you've had with people. For "Learned," think about something that you learned today - any new discoveries, points of interest, or highlights. For "Lacked," think about what seemed to be missing today. Was anything unclear? Did you need something to make your participation go more smoothly? For "Longed For," try to think of something that you wish existed or was possible that would ensure that the Forum would be successful. Write each of your items on a post-it note, and remember which of the 4Ls each post-it is associated with. We'll give you five minutes in individual reflection.
● [After 5 minutes] At your tables, have one person draw two lines on a sheet of the easel pad to divide it into four quadrants. Label each quadrant with one of the 4Ls. Share your responses with one another, and place them on the easel sheet. Discuss with one another, and feel free to add any others should they come up. We'll spend 10 minutes doing that.
[Ensure a notetaker transcribes any significant points in discussion]
● [After 10 minutes] OK, let's place the sheets on the wall, and we'll go around each group. You only have about a minute each to walk us through some of the 4Ls that stuck out in your group. [Ensure a notetaker transcribes these]
● Thanks for your feedback!

Facilitator debrief (15 minutes)
Activity type: Debrief
Outline:
● What do we need to start/stop/continue doing?
● What went particularly well, or where did participants struggle?
● Were there any concerns about potential Code of Conduct or Community Agreements violations?
● Do we need to change plans for tomorrow?

Day 2

Set up
● Room should be set up with 10 tables of 7 chairs each, plus 16-20 additional chairs in the back. Each table should have the following: (confirm stock each day)
○ 1 Post-It easel pad
○ 10+ pens
○ 10+ markers (various colors; colors not significant)
○ 15+ 5x8" index cards
○ 1 roll of painter's tape
○ 7 pads of 3x3" Post-It notes per table (various colors; colors significant)
○ 1 or 2 printed copies of the Forum Playbook (this document)
● Ensure an additional supply is available of the following:
○ A few extra Post-It easel pads
○ Pens
○ Markers
○ Index cards
○ Post-It notes
○ Dot stickers
○ 1 ream printer paper
○ 24-pack bottled water
● Ensure power (surge protector?) is available at each table

Breakfast (60 minutes)
● Encourage people to sit at different tables/with different people than in Day 1

Recap of Day 1/What's Today? (15 minutes)
Activity type: Logistics
Goal(s): Give participants important information about the event, and refresh participants on what we did yesterday.
Script:
● Welcome to day two of the forum. To start, let's see if there are any announcements.
● Let's start by recapping what we did yesterday. We started with breakfast and sharing Trading Cards as a getting-to-know-you exercise, and moved into a series of presentations by the project team and from some of your fellow participants around four themes: the evolving systems ecosystem, networks and the big picture, cultural/legal/ethical concerns, and impacts on public services and outreach. We took a break for lunch, and moved into Mad Tea, a more in-depth getting-to-know-you exercise that focused on getting you to talk about potential opportunities related to archival discovery and delivery. We then did the Speedboat activity to identify the things that move us forward and the things that slow us down, and rounded out the day with the Low-Tech Social Network to map connections to one another and a retrospective to talk about how the day went. Yesterday marked the start of the divergent phase of the forum, where we encouraged you to start exploring new possibilities. Does anyone have any questions or comments about yesterday?
● So, what are we going to be doing today? Today, we're focusing on the emergent phase, undertaking deeper exploration of these possibilities and allowing the unexpected and surprising to bubble up to the surface. We'll start with the Trading Cards again as a getting-to-know-you exercise at your new tables, and then move into a Context Map activity that allows us to see some of the external factors, trends, and forces that impact our work.
We'll take a short break, and then move into an activity called TRIZ, which will help us imagine the worst possible results or outcomes, and see how our own counterproductive behaviors play into that. We'll take a break for lunch, and then reorganize in groups based roughly on job function and undertake work to generate and organize potential ideas into possible themes. We'll have another break, and close out the day with two activities - one focused on potential actions that you can take (15% Solutions), and an activity where we'll generate ideas individually and share them across the room to see which may be the most promising. At the end of the day, we'll take time to review what we've done today and have another retrospective.
● Are there any questions about what we're up to today?

Activity 3A - Trading Cards (15 minutes)
Activity Type: Divergent
Goal(s): Get participants to talk to one another
Reference material: Trading Cards - Gamestorming (https://gamestorming.com/trading-cards/)
Supplies: 1 index card and 1 marker/pen per person
Outcomes: Artifact to generate discussion.
Script:
● Because we are sitting at new tables today, we want to give you an opportunity to get to know your new tablemates and warm up for the day.
● We're going to have each of you make a "trading card" for yourself. Take a blank index card and a marker or pen. At the top, write your name, and below that draw a picture that represents you. Save a couple lines at the bottom. On one line, write a few words that describe what kind of work you do. Below that, write a few words that describe something you're excited about in terms of the Forum. We'll take five minutes for you to draw your trading card.
● (After 5 minutes) OK - you all should have your trading cards now. Now, we'll spend five minutes trading cards around your table. When someone hands you a card, hold onto it for a moment, think of a question you might ask that person, and write it on the back of the card. After you've done that, you can trade your card to someone else at the table. You can hold onto at most one card if you want to follow up on a specific question that either you or someone else wrote.
● (After 5 minutes) Now, you should all have another person's card with a question or two on the back. At each table, you can ask the person whose card you're holding the question on the back, and they can either answer or pass. We'll take only three minutes to do this, so make your answers fast!

Activity 3B - Context Map (75 minutes)
Activity Type: Divergent → Emergent
Goal(s): Give participants an opportunity to map out the context they are working in.
Reference material: Context Map - Gamestorming (https://gamestorming.com/context-map-2/)
Supplies: Six sheets of paper from easel pad, markers.
Outcomes: Data to support emergent exercises, including current state of community and forecasting trends.
Script:
● With this exercise, we want to explore the greater context we currently operate in, and gain a better understanding of the factors that influence archival discovery and delivery.
● [Creating the sheets and drawing should take approximately 5 minutes] First, we need six pieces of paper from the easel pad. Arrange the paper into a two-row, three-column format. On the top middle piece of paper, draw a representation of archival discovery and delivery - it can be as simple as an image of a folder or a finding aid - don't overthink it!
On that same piece of paper, above and to the left of what you've drawn, write the words "POLITICAL CLIMATE" and to the right, write the words "ECONOMIC CLIMATE".
● On the top left sheet of paper, draw several large/thick arrows pointing to the middle sheet. Label this sheet "TRENDS" but leave a blank space before the word so you can add a qualifier later. On the top right sheet of paper, draw the same arrows, again pointing at the middle sheet. Label this sheet in the same way you labeled the other one.
● On the bottom-left sheet, draw large arrows pointing up at the top-row middle sheet, and label this sheet "TECHNOLOGY FACTORS".
● On the bottom-middle sheet, draw an image representing your stakeholders/users, and label the sheet "STAKEHOLDER/USER NEEDS."
● On the bottom-right sheet, draw a thundercloud and label this sheet "UNCERTAINTIES".
● [Spend no more than 10 minutes brainstorming and filling out each sheet] To begin filling out the context map, choose any sheet (aside from the two labeled TRENDS), and begin discussing with your group. Identify a volunteer to populate the sheet with the relevant discussion from the group. Repeat this for the other two non-TRENDS sheets.
● When it comes time to do the "TRENDS" sheets, discuss as a group how you want to qualify/label the trends. Once this has been decided, discuss the trends and populate the sheet with the relevant discussion from the group.
● [Spend 10 minutes discussing the overall map] Congratulations, you have made a context map! Discuss the overall findings with the group and ask for observations. What sticks out to you? Is there something new you learned from this exercise about the context you/your colleagues work in? [Ensure there are group note takers documenting the observations]

Break (15 minutes)
Activity type: Break
Outline:
● Remind people that we're starting promptly at 11:00

Activity 4 - TRIZ (60 minutes)
Activity Type: Emergent
Goal(s): Identify negative behaviors and ways in which we participate in them
Reference material: TRIZ - Liberating Structures (http://www.liberatingstructures.com/6-making-space-with-triz/)
Supplies: Google Docs for notes and recap of previously identified anchors; paper for individual notes; easel for group notes
Outcomes:
Script:
● We spent some time yesterday in our Speedboat exercise talking about "sails" and "anchors", or strengths and limitations of archival discovery and delivery. We've also just done some mapping of those factors. In this next exercise we want to dig into the limiting factors a bit more in a way that we hope is fun and engages all of our inner supervillains.
● First, let's remind ourselves of some of the anchors we identified yesterday. [Ask for ideas from the audience, which are then added to a list that is visible to all. Have a list gathered after Speedboat to make sure nothing is missed]
● [After 5 minutes] So, given this list of anchors or negative attributes, what are some ways we could maximize these barriers? [Provide an example or two] Take fifteen minutes in your groups to brainstorm ways to be evil, using your easels to jot down ideas. Then see if you can come up with a top five most effective ways to achieve unwanted results.
● [After 15 minutes] Now, let's take some time to go around the room and have each group share their top five ways to maximize barriers to archival discovery and delivery. You may need to specify which barrier a particular behavior is associated with.
● [After 10 minutes] Now, take five minutes to reflect on the list you created together, and then make a second list of everything you are currently doing that in some way resembles one or more items on that list. These could be things that you personally do, or things that happen at your institution or within communities with which you're familiar.
● [After 5 minutes] I invite you to now spend five minutes reflecting with one or two people at your table about your respective lists, seeing where they overlap, and identifying other things that may come up in conversation. Make sure someone is a note taker within your group.
● [After 5 minutes] Now that you've discussed in groups of two or three, I encourage you to share your answers and generate additional connections between unwanted outcomes and existing behaviors with everyone at your table. Let's spend five minutes doing that. Make sure you have a note taker who is listing all of the ideas.
● [After 5 minutes] So by now, you should have discussed some behaviors which lead to bad things. Let's share around the room - name one unwanted result and supporting behavior that stuck out in your discussion. Let's keep the discussion brief, perhaps 90 seconds at most per table. [Ensure someone is notetaking for the room]

Lunch (60 minutes)
Activity type: Break
Outline:
● Remind people that we're starting promptly at 1:00
● Let them know that they are moving into groups after lunch; have slide listing groups on screen
● Encourage people to go outside if the weather is nice

Activity 5 - Affinity Map (90 minutes)
Activity Type: Emergent
Goal(s): Develop and reflect on patterns and themes based on institutional roles, generate ideas for next steps
Reference material: Affinity Map - Gamestorming (https://gamestorming.com/affinity-map/)
Supplies: sticky notes, easel paper, pens
Outcomes: Participants will develop categories of action to take
Script:
● [Have participants sit with affinity groups; present affinity groups on slide before transition] Now that you are sitting with your affinity group, we will begin to brainstorm ways to improve archival delivery and discovery, based on our institutional/functional roles.
● Take a large piece of easel paper and write the following questions:
○ What changes must take place to improve and enhance archival delivery and discovery?
○ What are steps we can take to improve archival delivery and discovery?
● Next, spend 10 minutes individually brainstorming all the possible solutions, large and small. Each participant should try to develop 15-20 post-it notes with ideas.
● [After 10 minutes] Now, one volunteer can collect all the post-its and display them on a flat surface.
● Working as a group over the next 30 minutes, try to organize the post-its into clusters or columns based on their similarities. You may want to do this as a whole group, or delegate one person at a time to group the post-its, with the following person modifying as they see fit. Leave redundant post-it notes, as they indicate that multiple people are thinking the same thing. You may also want to create a parking lot for ideas that do not appear to fall into any of the emerging categories. At this point in the exercise, you are not trying to label the clusters, only trying to group like ideas together.
● [After 30 minutes] Now that you have consensus on your clusters, work together to create labels or categories of action that your clusters fall under.
Don't spend too long attempting to label your clusters, and if there is disagreement between two labels, write them both for the time being. Spend 15-20 minutes labeling.
● [After 15-20 minutes] Reflect on the categories as a group. What is surprising to you? Is there anything unexpected about how the ideas are grouped and labeled? Is there any disagreement within the group?

Break (30 minutes)
Activity type: Break
Outline:
● Remind people that we're starting promptly at 3:00

Activity 6A - 15% Solutions (30 minutes)
Activity Type: Emergent
Goal(s): To identify small solutions that attendees currently have the power, resources, and discretion to take; to transition attendees to considering how to address archival discovery and access challenges in light of trends and context identified in previous activities
Reference material: http://www.liberatingstructures.com/7-15-solutions/
Supplies: index cards or paper, pens
Outcomes: Data to support next exercise; attendees will also be better prepped to consider collective solutions by workshopping their individual 15% solutions
Script:
● So far today we have considered the contextual factors affecting discovery and access for archival materials, thought about how our own negative behaviors can create and reinforce challenges, and reflected on these themes with peer affinity groups to generate ideas for next steps.
● To keep us thinking positively and to continue to generate momentum along our emergent trajectory, we're now going to use an activity called "15% Solutions." This activity is intended to get us all thinking about the small things we currently have in our individual power to do. The question we're going to be asking ourselves for the next 30 minutes is "What is our 15 percent? Where do we individually have discretion and freedom to act? And what can we personally do without more resources or authority?"
● Take five minutes, by yourself, and try to come up with a list of 15% solutions using the paper/notecards and pens on the table.
● [After 5 minutes] Now each person should share their top ideas with your half-table of 3 or 4 people. Assign a notetaker or take turns entering notes into a Google Doc. Each individual gets 5 minutes total to be the center of attention. Tablemates, use this time to listen to the speaker - not to provide advice or critique, but to ask clarifying questions and to offer encouragement for folks to move forward.
● [After 20 minutes] So now, you've come up with some 15% solutions on your own and gotten some feedback from your tablemates to help make those smaller solutions more effective. We're now going to try another activity that's focused on thinking bigger, and generating bolder solutions.

Activity 6B - 25/10 Crowdsourcing (50 minutes)
Activity Type: Emergent
Goal(s): To share ideas across attendees; to achieve rough consensus on the most promising ideas
Reference material: http://www.liberatingstructures.com/12-2510-crowd-sourcing/
Supplies: index cards and pens at tables; Google Docs/laptop/screen (to record and display 10-12 most promising ideas); bell or alert of some kind (we can also use music during the mill and pass phase)
Outcomes: List of 10-12 most promising ideas as voted by attendees
Script:
● Our next activity is called 25/10 Crowdsourcing. This activity asks us to consider "What big idea would you recommend to improve archival discovery and access if you were ten times bolder?
And what first step would you take to get started?" If you like, this activity can directly reference the 15% solution you were just working on, but it doesn't have to.
● [3 minutes] So this is the process. First, everyone will take a single index card and write one big idea and a first step. You'll have 5 minutes. Then people will get up, mill around, and pass cards from person to person. No one reads the cards, they just pass. This part is called "Mill and Pass." When the bell rings again, people stop passing cards and read the card in their hands. Feel free to discuss it with the person who handed it to you. You'll have about 2 minutes to read the card, discuss it, and then rate the idea and first step with a score from 1 to 5 based on how strong and promising you think the idea is (1 is not your cup of coffee and 5 sends you over the moon), writing the score on the back of the card. This part is called "Read and Score." When the bell rings, cards are passed around for another round of "Mill and Pass" until the bell rings and the "Read and Score" cycle repeats. This is done for a total of five scoring rounds. At the end of cycle five, participants add the five scores on the back of the last card they are holding. Finally, the ideas with the top ten scores are identified and shared with the whole group. So five rounds of five gives a max score of 25 for each card, and at the end we see which 10 ideas and first steps scored the highest. 25-10. Everyone got it? Any questions? Great.
● [2 minutes] (Optional brief demonstration on writing idea/step, milling and passing, reading and rating, milling and passing).
● [5 minutes] So let's start. Take 5 minutes to write a bold idea and first step on your index card.
● [3 minutes] Ok everyone, time to mill and pass. Remember, no reading. Just milling and passing, milling and passing. [Ring bell] Great, now time to read and score. Read the card in your hand, and write a number on the back based on how wonderful you think it is. 1 is lowest, 5 is highest. 10 more seconds. Great. [Ring bell].
● [3 minutes] Now mill and pass again. [Ring bell]. Time to read and score again. [Ring bell]
● [3 minutes] Now mill and pass again. [Ring bell]. Time to read and score again. [Ring bell]
● [3 minutes] Now mill and pass again. [Ring bell]. Time to read and score again. [Ring bell]
● [3 minutes] Final milling and passing [Ring bell]. Ok, this is your last read and score opportunity. [Ring bell].
● Congratulations, everyone. We're done reading and scoring. I saw some excellent milling and passing as well. Everyone can sit if you'd like. Take the card you're holding with you.
● [2 minutes] Now add up the scores on the back of your card, and write that number down on the front of the card. If you have more or fewer than five ratings for some reason, calculate the average of the scores and multiply that by 5. Maths! [For tallying and sorting the transcribed cards, see the sketch after the idea table at the end of this playbook.]
● [10 minutes] "Who has a 25?" etc. [Invite each participant, if any, holding a card scored 25 to read out the idea and action step. Continue with "Who has a 24?", "Who has a 23?", and so on. Have participants tape these to a wall, with the highest scores in the leftmost column. Stop when the top 10 ideas have been identified and shared.] The activity supporter should type these ideas and first steps into a shared Google Doc on the screen.
● [10 minutes] Let's look more closely at these ideas and first steps. What do we think about the top ideas? Do you notice any patterns? How could the ideas or steps be even more clear or compelling?
● [3 minutes] (optional) What caught your attention about 25/10?

Retrospective and Prep for Day 3 (30 minutes)
Activity Type: Emergent
Goal(s): Feedback
Reference material: The 4L's: A Retrospective Technique (https://www.ebgconsulting.com/blog/the-4ls-a-retrospective-technique/)
Supplies: Google Docs for notes, post-its and pens
Outcomes: Information to check on course correction
Script:
● Now that we've concluded the last exercise for the day, we're going to spend some time reflecting on today like we did yesterday. As a reminder, we started the day with a trading card exercise, and then moved into a context map to present the state of play in the field. We then used TRIZ to identify counterproductive behaviors. We took a break for lunch, and then moved into the Affinity Map to group and sort ideas. We took a break, and then used 15% Solutions to identify where we each have power to take action, and used 25/10 Crowdsourcing to share and refine our ideas. The focus today was on an emergent approach, allowing us to explore possibilities.
● To help you reflect, we're going to use a technique called the 4Ls retrospective. The 4Ls are Liked, Learned, Lacked, and Longed For. For "Liked," think about what you enjoyed or what went well - it could be anything from the content to the tools to conversations you've had with people. For "Learned," think about something that you learned today - any new discoveries, points of interest, or highlights. For "Lacked," think about what seemed to be missing today. Was anything unclear? Did you need something to make your participation go more smoothly? For "Longed For," try to think of something that you wish existed or was possible that would ensure that the Forum would be successful. Write each of your items on a post-it note, and remember which of the 4Ls each post-it is associated with. We'll give you five minutes in individual reflection.
● [After 5 minutes] At your tables, have one person draw two lines on a sheet of the easel pad to divide it into four quadrants. Label each quadrant with one of the 4Ls. Share your responses with one another, and place them on the easel sheet. Discuss with one another, and feel free to add any others should they come up. We'll spend 10 minutes doing that.
● [After 10 minutes] OK, let's place the sheets on the wall, and we'll go around each group. You only have about a minute each to walk us through some of the 4Ls that stuck out in your group.

Facilitator debrief (15 minutes)
Activity type: Debrief
Outline:
● What do we need to start/stop/continue doing?
● What went particularly well, or where did participants struggle?
● Were there any concerns about potential Code of Conduct or Community Agreements violations?
● Do we need to change plans for tomorrow?

Day 3

Set up
● Room should be set up with 10 tables of 7 chairs each, plus 16-20 additional chairs in the back.
Each table should have the following: (confirm stock each day)
○ 1 Post-It easel pad
○ 10+ pens
○ 10+ markers (various colors; colors not significant)
○ 15+ 5x8" index cards
○ 1 roll of painter's tape
○ 7 pads of 3x3" Post-It notes per table (various colors; colors significant)
○ 1 or 2 printed copies of the Forum Playbook (this document)
● Ensure an additional supply is available of the following:
○ A few extra Post-It easel pads
○ Pens
○ Markers
○ Index cards
○ Post-It notes
○ Dot stickers
○ 1 ream printer paper
○ 24-pack bottled water
● Ensure power (surge protector?) is available at each table

Breakfast (60 minutes)
● Encourage people to sit at different tables/with different people than in Day 2

Recap of Day 2/What's Today?/Logistics (15 minutes)
Activity type: Logistics
Goal(s): Give participants important information about the event, and refresh participants on what we did yesterday.
Script:
● Today, we shift from the emergent phase into the convergent phase, and move from idea generation to determining how we want to take action. We'll first undertake a social network webbing exercise to help us understand how work around archival discovery and delivery happens. Then we'll reflect on some of our past idea generation activities, and determine specific actions that we're willing to commit to as individuals, and get feedback from participants who may be interested in them or willing to commit time to make them happen. Finally, we'll close out with a retrospective, and talk about what's next for the project and the outcomes that we expect.

Activity 7 - Social Network Webbing (60 minutes)
Activity Type: Convergent
Goal(s): Articulate connections between individuals/roles at the forum and start to identify individuals/roles who might be missing
Reference material: Social Network Webbing - Liberating Structures (http://www.liberatingstructures.com/23-social-network-webbing/)
Supplies: Different colored post-its, paper on easels, tape, legend for roles (from affinity mapping) mapped to post-it color
Outcomes:
Script:
● We're going to try to understand better how work happens in and among networks of people involved in archival discovery and delivery. This is similar to the exercise that we did at the beginning of this meeting, but this time we'll work on it together as a group. To help us understand how different roles are connected, we're going to use a particular color of Post-it for each of the roles we used for an earlier exercise.
● First, each table should take a blank sheet of paper from the easel pad. Then, everyone at the table should clearly print your name on a Post-it that best matches the color of the role in which you identify. Put the Post-its in a group in the center of the paper. Let's take five minutes and all do that.
● [After 5 minutes] Now, ask yourself who else you know who is active in this work. As you think of names (or roles), write them on another Post-it, again using the appropriate color. Then, arrange the Post-its on the paper based on each person's degrees of separation from you. You might need to add additional pieces of paper and tape them together - get creative! I invite you to take ten minutes to work on this.
● [After 10 minutes] So, now that we've articulated who is doing the work, let's ask ourselves who else we would like to include.
Again, as you think of others, write their name (or role) on the appropriately-colored Post-it and continue to build your web together, thinking about the actual and desired spread of participation. Let's all do this together for another ten minutes.
● [After 10 minutes] OK, by now we should have a pretty big web. At each table, let's take a step back together for the next 15 minutes and ask, "Who knows whom? Who has influence and expertise? Who can block progress? Who can boost progress?" As you answer these questions, illustrate them with connecting lines.
● [After 15 minutes] Now, what kinds of strategies can we develop to 1) invite, attract, and "weave" new people into this work; 2) work around blockages; and 3) boost progress? Let's take another ten minutes to discuss.
● [After 10 minutes] Let's briefly go around the room for each table to tell us what stood out most from your conversation.

Activity 8 - Who/What/When Matrix
Activity Type: Convergent
Goal(s): To identify specific next follow-up actions based on exercises and volunteers to move them forward; to identify potential participants for future project activities
Reference material: Who/What/When Matrix - Gamestorming (https://gamestorming.com/whowhatwhen-matrix/); Dot Voting: A Simple Decision-Making and Prioritizing Technique in UX - Nielsen Norman Group (https://www.nngroup.com/articles/dot-voting/)
Supplies: 1 easel pad per table, markers
Outcomes: A list of participants with articulated commitments, for future follow up by the project team
Script:
● For our last activity, we're going to turn this over to you to articulate how to move this work forward and what you'd be willing to commit to following up on back at your institution or within your communities. Thinking back to our last few activities - 15% Solutions, 25/10 Crowdsourcing, and Social Network Webbing - we've had you look at what next steps you can take as individuals, shared ideas and potential next actions, and looked at the network of people involved in work supporting archival discovery and delivery. In this exercise, we recognize that actions don't take themselves, and people don't commit as strongly to actions as they do to one another, so we're going to use a "people-first" approach to determine how you want to continue this work.
● At your tables, take a sheet from your easel pad and draw three columns: one for Who, the person or people taking the action; one for What, which is the action to be taken; and one for When, for when you think that item will be done. Actions could include something like "take a software developer out for coffee," or "set up regular calls with my peers in public services from other institutions." Ideally, everyone at your table should commit to at least one action, and there should be at least one sheet for your entire table. So, for example, if I were to commit to something, I would write my name in Who, "send email to participants with a report out about the Forum" under What, and "by February 28" under When. We'll spend ten minutes doing this at your tables.
● [After 10 minutes] Now that we have the lists written up, let's put them up on the wall. You'll also see dot stickers on your table - bring those with you when you hang up the lists on the wall. Let's go from table to table and walk through each item on the list and say who will do what by when. Again, we'll need to keep it brief, so summarize quickly.
You can ask for follow-up later if you'd like. [Max 15 minutes]
● [Transcribe lists after Forum conclusion]

Break (15 minutes)
Activity type: Break
Outline:
● Remind people that we're starting promptly at 11:00

Retrospective (30 minutes)
Activity Type: Convergent
Goal(s): Feedback
Reference material: The 4L's: A Retrospective Technique (https://www.ebgconsulting.com/blog/the-4ls-a-retrospective-technique/)
Supplies: Google Docs for notes, post-its and pens
Outcomes: Information to check on course correction
Script:
● Now that we've concluded the last exercise for the day, we're going to spend some time reflecting on today. As a reminder, we started the day with breakfast, and then moved into the social network webbing exercise to [insert phrase]. We then used the Who/What/When Matrix to identify potential next steps and ask you what you're willing to commit to moving forward. Our focus today was on a convergent approach to get you to help identify the most promising courses of action, and to motivate you as participants to contribute to future work.
● To help you reflect, we're going to use a technique called the 4Ls retrospective. The 4Ls are Liked, Learned, Lacked, and Longed For. For "Liked," think about what you enjoyed or what went well - it could be anything from the content to the tools to conversations you've had with people. For "Learned," think about something that you learned today or across the last few days - any new discoveries, points of interest, or highlights. For "Lacked," think about what seemed to be missing today. Was anything unclear? Did you need something to make your participation go more smoothly? For "Longed For," try to think of something that you wish existed or was possible that would ensure that the Forum would be successful. Write each of your items on a post-it note, and remember which of the 4Ls each post-it is associated with. We'll give you five minutes in individual reflection.
● [After 5 minutes] At your tables, have one person draw two lines on a sheet of the easel pad to divide it into four quadrants. Label each quadrant with one of the 4Ls. Share your responses with one another, and place them on the easel sheet. Discuss with one another, and feel free to add any others should they come up. We'll spend 10 minutes doing that.
● [After 10 minutes] OK, let's place the sheets on the wall, and we'll go around each group. You only have about a minute each to walk us through some of the 4Ls that stuck out in your group.

General Q&A/What's Next (30 minutes)
Activity Type: Convergent
Goal(s): Feedback; inform participants of future work
Reference material:
Supplies: Google Docs for notes
Outcomes: feedback
Script: (develop more detailed notes/script)
● Project plan
○ Forum → digest outputs → draft report
○ Working meeting → statement of principles
○ Integration handbook/case studies
○ Project whitepaper
○ Presentations/publications
● What questions do you have for us?

Lunch (120 minutes)
Activity type: Break
Outline:
● Remind people that the forum officially concludes at 2:00 PM
● People can sit where they like and follow up
● Share documents for feedback
● Share survey

Facilitator debrief (60 minutes)
Activity type: Debrief
Outline:
● What went particularly well, or where did participants struggle?
● Were there any concerns about potential Code of Conduct or Community Agreements violations?
● Were there any specifically promising discussions?
● Where did we feel like we didn't have enough time on a topic or activity?

Lighting the Way Forum - 25/10 Crowd Sourcing Ideas

Score | Tags | Idea | First step | Notes
23 | structural change | Decolonize the Archives | Marginalized groups need to be part of and leading the conversation | Community collaboration is a long-term relationship, not a one-off project
23 | structural change; collection development | Integrate anti-racism and feminist frameworks into our collection policy | Prioritize work that elevates marginalized communities |
22 | structural change; description/metadata | Crowdsourced terminology bank that has terms that should be used, i.e. non-racist | Start a committee to review current issues |
22 | structural change; community engagement; rethink policy | Indigenous data sovereignty | Larger institutions can assist by recognizing and giving expert authority to the respective group |
22 | simplify tech; collaboration/communication | Reduce reliance on bespoke software development | Document and share similar/identical system architectures |
22 | rights/reuse | Develop a strong, profession-wide stance/statement/justification that digitizing and displaying archival material is a transformative use and not copyright infringement | Develop backbones and embrace risk |
20 | shared regional/national projects; VRR | A "national" virtual reading room | Elevate and join the national aggregator idea (NAFAN) |
20 | structural change; description/metadata | Move away from name and subject authorities for archival description (when the authorized stuff doesn't fit/is wrong/is unethical) | Case study - a single collection project: create and publish local headings with crosswalks to authorized headings that don't work |
20 | community engagement; digitization strategy | Develop and implement a digitization/digital collections strategy that is based on priorities gathered from a wide selection of users/stakeholders | Survey students, faculty, patrons |
20 | UX/usability; shared regional/national projects; VRR | Develop parameters/user studies for a national "virtual reading room" | | embargoes; view only; metadata only; token/login based access; born-digital; community centered review, approval
20 | improve discovery | Get our archival resources findable where people look for all things | Break down EAD/systems to promote folder/resource-level findability distinct from (but still linked to somehow) their collection context |
20 | shared regional/national projects; resource sharing | National digital archives sharing network | Get well-resourced institutions to commit ongoing funds |
20 | shared regional/national projects; simplify tech; resource sharing | Contribute to shared infrastructure for delivery/discovery of archival description and digital assets | Stop building homegrown tools; embrace community projects |
19 | shared regional/national projects; communication/collaboration | National repository for documentation | Identify an appropriate SAA discussion & begin sharing the idea and soliciting feedback | Many archivists already share their own documentation regarding both systems and workflows, but not in a systematic way. I propose some kind of national repository where institutions can link their documents, GitHubs, etc. so they are discoverable to other professionals and we can learn from each other.
19 shared regional/national projects; resource sharing A national DAMS/access preservation system that any archive can contribute digital objects, metadata, archival description, authority records, so they don't have to build and maintain their own system Figure out who is in charge and how it will be sustainably funded 19 structural change; best practices Create standards for ethical community engagement for archival projects and interactions Convene people who do a great job already at this 19 structural change; best practices; description/metadata Create and publish anti-oppressive descriptive practices for LGBTQ+ materials Find our birds of a feather/institutional collaborators 19 open source Publicly accessible special collections libraries will only use open source discovery platforms Agree that is the standard 19 UX/usability; shared regional/national projects Conduct a distributed, lightweight user study involving all of our organizations and aggregate our findings/insights Define a lightweight interview question set we could use 18 UX/usability; shared regional/national projects Conduct a large scale usability study that incorporates feedback from a wide range of users/potential users looking at many systems (ArchivesSpace/other public interfaces, regional/consortial platforms, etc.) Form a project advisory board with members from a wide range of institution types/sizes/etc; apply for SAA Foundation funds 18 rethink tech; structural change; improve discovery Make flexible systems that allow archivists to create custom "finding aids" (i.e., not Finding Aids) that are hyper relevant to their users and collections. They should be able to talk to each other across contexts but not require them to conform to systems and standards that privilege certain communities over others. Any ideas? 
Lighting the Way Forum - 25/10 Crowd Sourcing Ideas Score Tags Idea First step Notes 18 collaboration/communication Put delivery systems designers on the reference desk/in the special collections classroom Provide cross training or form a small group at institution to facilitate active collaboration between front-line service providers and behind the scenes tech people 18 collaboration/communication Host more forums like this one at regional conferences Find potential organizers who are associated with specific regions/regional associations 18 accessiblity; UX/usability Automated accessibility checking for archives: something pluggable into CI/CD frameworks and seasoned for archival interactions Environmental scan of existing tools and reconcile with archives' use cases 18 collaboration/communication; resource sharing Institutions that have a lot of tech resources to develop systems could also provide a service to help small institutions implement them (beyond just sharing the code) Complete documentation and a database of people who would be willing to implement open source systems 18 open source; collaboration/communication Establish functional requirements for an open source request management system Solicit input from the archival community 17 UX/usability Eliminate archival jargon/terminology Conduct usability research focused on whether users understand said terminology 17 simplify tech Stop replicating the same information across multiple systems Document workflows and the data handoffs between systems and share this documentation with my organization to show redundancy 17 rethink tech; UX/usability Develop a method for aggregating item-level digital objects to present in "folder view" image viewer on a finding aid (allowing appropriate level of description for different access systems) Define functional requirements 17 collaboration/communication; structural change; collection development; community engagement Kick start a nationwide oral history initiative that will involve getting the entire community involved in the translation of existing oral histories and the documentation of new oral histories Building a genuine relationship with TON [Tohono O'odham Nation?] tribal members by setting up meetings with each of the 11 district chairs 17 description/metadata Develop a scalable way of generating minimum collection- (and optional item-)level metadata to help organizations with no current discoverable information Determining what "minimum" level is (DACS? Dublin Core? Something else completely?) 17 UX/usability; accessiblity Usability and accessibility how-to for archivists Understand hurdles and barriers in archives practice 17 community engagement Continuously encourage and support archival instruction and outreach that is community based (not about donors or classroom based) Hire archival instruction librarians and have a plan to target groups not represented in archival research 17 rethink tech; structural change Modifying/creating archival management and discovery systems that work offline Incentivize teams working in different online contexts to prototype solutions for their context and knowledge share/collaborate in the process 17 shared/national infrastructure; VRR Have a regional/nationwide virtual reading room to provide mediated access to copyrighted/culturally sensitive archival collections 16 rethink policy; rights/reuse Radical Access Well-thought out reasoning to charging people (photos, digitization, weird fees, etc.) - i.e. do we need to do this? 
How is the money used? Is this really necessary for our institution? No/limited barriers to entry/use; less information required of researchers; why IDs? ; what security "concerns" are valid and which are professional tradition and which can be discarded/forgotten?; K-12 community education/instruction outreach (i.e. "use us if you'd like!"); roving archival checkout boxes for K-12 instruction (a la Princeton's Cotsen Lib.); ILL of archival materials 16 structural change; community engagement; Turn every archives into an organization that works post-custodially while fully funding labor and technical infrastructure that allows for an appropriate balance of hyperlocal and widespread access to records based on specific cultural regulatory contexts Connect with donor communities at your archives to develop a preservation and access process that works for them 16 UX/usability Focus on users over standards, professional considerations, formats, legacy systems Stop running and start listening 16 description/metadata Let's fix all the messed up dates in our metadata Analyze dates in large aggregation of finding aids and MARC records to figure out what to do 16 UX/usability Evaluate and improve usage data Evaluate the usage data we collect; decide how to use it or lose it; and identify the gaps 16 improve discovery; collaboration/communication Create a set of requirements that all discovery and delivery systems must meet before being implemented (e.g. conforms to ADA standards for all users) Create a group to moderate the list and invite the entire community to add items/make suggestions, explain reasoning 15.5 rights/reuse Liberate the archives! Make it a priority to enable the reuse of collection items to maximize the advance of knowledge Determine the copyright status of archival items and share determination with patrons Lighting the Way Forum - 25/10 Crowd Sourcing Ideas Score Tags Idea First step Notes 15 resource sharing; community engagement; structural change Collectivize Area archival access practitioners so that large institutions pay for those services & small/under-resourced communities can have them provided pro-bono while maintaining agency Begin organizing through 1 on 1 meetings 15 description/metadata; analysis Explore applying machine-learning approaches to a massive corpus of finding aid and digital collection data (e.g. SNAC, DPLA, Hathi, Europeana); also evaluate scope/contour of collection Build a coalition of research and dev partners 15 rethink tech Collection management software that works for archives and museum collections Determine core functions of both types of description 15 professional development; collaboration/communication Host skillshare/brown bags where staff can/must participate monthly to learn from each other - break down barriers/job role divisions for better understanding and support 15 open source Archives commit to only use open source software for archival discovery and delivery Commitment from big/well resourced institutions 14 description/metadata; simplify tech Create a common data model for archival description to share across systems and make them simpler Find a community and establish principles 14 improve discovery; shared regional/national projects Build one federated system that supports discovery & access to all special collections materials (finding aids, archives, single item manuscripts, books, etc.) so users don't have to toggle between systems. 
And, allow it to enable requesting Be willing to change/reconsider the structure of the finding aid if necessary 14 description/metadata Turn finding aid inside out Turn requestable item into record linked to other description 14 crowdsourcing; description/metadata Integrate metadata/contextual information generated by researchers & students Conduct contextual inquiry studies to understand how scholars record; organize; augment archival collections 14 description/metadata; improve discovery Take a long pause with/from digitization (excluding patron requests) to focus on cleaning up our metadata and UI for improved discovery, searchability, and online access to existing digitized/digital materials 14 UX/usability User centric systems Gian teams to focus on collecting stories, outreach, feedback 14 description/metadata; translation Create a New Deal-esque translation corps focused on archival description Pick a target language of my institution 13 rethink tech Create a structured system to evaluate archival systems based on integratability Review existing models for this in healthcare systems, etc. 13 crowdsourcing; description/metadata Develop an infrastructure that allows for all kinds of description including crowd-sources and user generated description Identify a way for folks to create crowd sourced description that is consistent 12 improve discovery Users enter query into system using natural language and get relevant results Machine learning of finding aid structure and description/content 12 description/metadata; structural change Blow up LCSH Gather feedback/ideas from stakeholders in mis/underrepresented constituencies 11 advocacy; communication/collaboration Find a champion to help push your ideas Talk to faculty, donors, community, colleges, other professionals, etc. 11 improve discovery; simplify tech One search box for all library content/collections Queries return catalog resources, archives, special collections, data, all 11 staffing Create a discovery and access archivist position even if it may mean changing the title/job of a staff member whose position is very narrow/obsolete 10 data gathering/analysis Assessment project of hidden collections at many institutions Plan and instrument 10 crowdsourcing; description/metadata Let users add description (and reviews like Amazon products) Talk to vendors about how to make it happen 8 digitization strategy; rethink policy To increase access to AV - rapid mass digitization of all archival AV in collections, and then rapid transcribing and/or audio description of all AV including born-digital Lower expectations surrounding archival control/description before digitization 8 rethink policy Let patrons browse in stacks and pull their own materials "Open stacks" hours with staff attendant? 
7 | resource sharing; structural change | A fund that does not end and supports projects
Not scored | community engagement; structural change; collaboration/communication | Prioritize building communication channels with particular communities we serve (especially those with less existing wealth and power) | Prioritize collection processing to make access a possibility; decline projects that are scoped unsustainably around bespoke access solutions/ideas

Anonymized Who/What/When Matrix

Action (What) | When
Research more around minimal computing strategies, starting with WAX | March 2020
Develop plan to create finding aids for Cuba collections | March 2020
Brainstorm policy to fund U6 user-driven digitization | By end of FY20
Circle back on conversations with library copyright/privacy person regarding making data from archival collections openly available | Feb/March 2020
Re: crowdsourced description: work with digital library folx re: how to integrate crowdsourced transcriptions of legacy data/notes into our repositories | Feb/March 2020
Re: putting things where people find them: continue efforts to extract data from archival collections and deposit them into appropriate domain repos | Feb/March 2020
Share findings of user study at my institution | By end of 2020
Continue to help develop our DAMS (Islandora 8); developer is contributing back to the community | Ongoing, already in progress
Dedicate at least one hour each week to learn more about anti-racism, feminist frameworks, and open source systems | Start in 2 weeks
Take a copyright risk management course | May 18 & 19
Host zine workshop for students and show as many perspectives as possible | [no date listed]
Continue DEI work with SAA committee on education | [no date listed]
Advocacy for indigenous data sovereignty, bringing more awareness to institutions with large indigenous holdings/materials | YESTERDAY!
Solidify connections with under-represented user communities; schedule concrete events to build these | End of May
Propose setting aside staff time to research POC in the archives in order to better describe them | Later this week (proposal); in 2020 (the research)
Continue VRR conversations w/ CDL, Atlas, other UCs | At lunch! (this is a pretty long-term commitment, so... I dunno)
Get our resources where users find all the things, aka "flip EAD" (step 1: find whose idea this is; step 2: how to start?) | Now to find writer; by March to describe my idea and align
Digitization/digital collections strategy: work with stakeholders in legislative/policy papers to create priorities | Now-April (conference) to reach out/articulate plan
Collect and publish list of variables relevant to fair use and VRRs | Six months (August)
To encourage reuse of digital archives, find opportunities to discuss rightsstatements.org with the archival community | Look for opportunities over next 3 months
Open conversation with the House of Representatives to update copyright law; share template for others to contact their reps | Next week, next month
Revise and reinvent our copyright workflows within my institution to shift away from a rights clearance model to a risk assessment model | End of 2020
Rewrite collection development policy (and get it approved) | By end of March 2020
Have all staff read and discuss the anti-racist description resources and integrate that guidance with our description model | By end of March 2020
Gather/create resources for usability/accessibility for archives | End of 2020
Follow-up email/survey | End of March 2020
Work with CDL and other UCs on virtual reading room project and develop specifications to support this functionality in Aeon (e.g., authentication, delivery, etc.) | At lunch (2/12) and ongoing; write specs by June 2020
Read more about reparations frameworks / move away from philanthropic thinking | Next 6 months
Develop ArchivesSpace API helpers library (Python) | Next 2 months
Send a list of feature requests for Circa | By the end of February 2020
Explicitly identify women and POC collections materials | Ongoing
Accessibility audit of O/S projects and requirements for new projects | 2020
Solicit input on functional requirements for O/S request management | Summer
NAFAN: move forward with our partner orgs; next steps; pursue kickstarter grant funds | Now! - Summer
VRR + request management use case gathering; identify a coalition to crowdsource requirements | Now! Get discussion going
Policy + protocol for users + OAC/Calisphere contributions to address biased + racist descriptions | Now - Spring - share policy + protocol
Learn what it means to decolonize the archives and how it applies to my repository | Spring
Talk to SHRAB about incorporating top 10 priorities into upcoming strategic goals | June
Share NAFAN info w/ AASLH for their 250th commemoration, which is working on a national db for collections data | When I get home
Commit to open source software | Ongoing for 2-3 years for specific project
Deconstruct the finding aid to make things (objects, folders) findable where people look; prototype with colleague @ our place | Rough sketch - 1 year?
Think about the starting steps for decolonizing the archive, and how to convince folks interested in working on this (i.e., radical empathy, Archivists Against, etc.); try again (choir briething for liberation) | Always
Terminology bank: environmental scan | Fall 2020
Black finding aid aggregator collection | 2021
User study: student | December 2020
Discovery tool for finding aids | Summer 2020
TN regional finding aid network | [Summer 2020? unclear]
Resources findable where people search | [no date listed]
1. Require processing archivist to link FA in wiki as part of A&D workflow; update processing manual | By June 2020
2. Learn more about SEO and schema.org via lit review; look for SEO workshop via NY continuing ed resources | By August 2020
Identify creating communities and subjects of our records and incorporate them in a user study | Month (part 1); Winter 2020 (part 2)
Description audit to identify offensive language in our legacy description using the Project STAND toolkit | Begin by fall 2020
Identify problems publishing multilingual FA and create some local guidelines to address them | By May 2020
Ask PASCL if my institution can join | 2 weeks
Take these conversations back to my statewide portal / lead improvements to best descriptive practices and integration of digitized content and rights issues | Summer 2020
Digitization strategy: invite participation to inform conversation and outcomes from top 10 ideas; share at 2020 membership meeting of HBCU Library Alliance | Summer 2020/Oct 2020
ABHS: explore ways to contribute our FAs to larger networks and explore what delivery systems are available for us | Ongoing 2020
Take conversations re: decolonizing/inclusivity to Assoc. of Lib & Archives @ Baptist Inst. (ALABI) | May 2020
Reading/research on indigenous collections and the desired access and practice of access | March
Assess use of hyperlocal systems for archival discovery and delivery (especially born-digital material) and their degree of interoperability | March
Catch up on ArcLight | Next 4 weeks
Advocate for changes to Aeon API to support user needs | Next 6 weeks
Want to assist with virtual reading room but don't know how | Help?
With new head of archives (and other stakeholders), begin to talk about the need for interoperability of our systems to get our archives findable; get commitment to and create plan for creating an interoperable system | Start conversation: June 2020; plan creation start: Jan 2021
Create "local" subject headings list for our collections (with librarians/archivists) | Start: Fall 2020
Review terms / remove racist, sexist (etc.) headings from our collection guides and systems | Start: June 2020
Gather and summarize integration/architecture diagrams with the goal of identifying patterns and contributing approaches | May 2020
Identify mechanisms to support integration with Mukurtu and/or Traditional Knowledge labels | July 2020
Create repository for organizations to share their system diagrams for integrated archival discovery, and a forum for discussion | September 2020
Build consensus internally and commit to using only open source | 2020-2021
Plan for ongoing (regular) user studies | 2020-2021
For NAFAN user studies, reflect on forum findings and how they fit into the proposal | This month!
Reading list related to indigenous terminology in Canada | February-March
Promote accessibility focus library-wide and connect to broader implementation/standards discussion | 2020
Build local connections to share systems and support resources | End of February for conversations
CC licenses as access/use statements, rights statements as backups | End of summer
Replace bespoke finding aids application with open source; write about how/why/lessons to help others do a similar migration | July 2020
Prioritize digitization projects in collaboration with campus library diversity working group | Ongoing effort, but... create prioritized project list by end of spring semester 2020
Explore/research how to provide secure digital access to archives w/ personal info (what system to use?); prepare contingency plan for sensitive/intl. archives | Now; build by fall 2020
Check in on ArcLight test implementations at other institutions. Blog? | March
Get admin support for cross-repo user study | April/May
Explore shared software solutions vs. bespoke software solutions | 2020
De-prioritize LCSH in our docs and design planning | Summer/Fall 2020
Convene leadership to discuss reevaluation of terminology in FA | March
Explore alternatives to the finding aid as sole delivery product (written or prototype) | June
Quarterly (Google Analytics) Aeon data reports, shared dept.-wide, to communicate user priorities for digital projects | End of March for next one & cont.
Accessibility how-tos | By Sept.
Work on getting user (researcher + faculty) priorities into the digitization pipeline (communication + workflow) | June
Virtual Reading Room? Yes, please! | Start the convo now
Continue conversations about marketing/promoting what we struggle with as much as what we celebrate | 1 month
Map and communicate ecosystem and stakeholders | Quarter
Self-education/reading around decolonizing the archives, indigenous data security | Immediately
Tap into accessibility convos already happening; curation goals; audit FY 20-21 goals | [no date listed]
Talking to ref/academic programs about users, especially students (user advisory? usability study?); include entire ecosystem | 3 months for initial convo w/ academic programs person
Talk to library staff meeting about Lighting the Way Forum; synthesize forum work to figure out what to take back; bring top 10 ideas as example of consensus in forum | Tuesday
Self-education/reading around decolonizing the archives, indigenous data sovereignty | [no date listed]
Talk to managers group meeting re: copyright working group | March 3
Talk to reading rainbow group re: LTWF and takeaways | March ?
Change collection development policy by incorporating anti-racist and feminist frameworks (then publish!) | March ?
Say "no" on behalf of the team members to advocate for their capacity | Starting immediately
UC VRR conversations | "Today! :)"
Integrate LTWF info into ARCH 2020 lesson plans & info | Now → July
Start talking about digital access as a VRR as a way to move my org out of our hang-up on digital libraries/repositories | Now
Finish Native American protocols case study | Summer 2020
Continue subject guide project | 2021
Advocacy to get Lucidea ArchivERA / find out community needs by conducting outreach | Now
Collection development policies at my institution and how we are addressing marginalized, non-dominant cultures/content | April 2020
Connect our metadata librarian with folks who are also interested in or doing work on non-dominant terminologies (and linked data) | March 2020
Codify digitization strategies to ensure we have mechanisms in place, including self-critique, to make sure we are representing marginalized, non-dominant cultures/content | June 2020
Community development of ArcLight | July 2020
Convene local team to provide feedback on transition to ArcLight | 2/17/2020
Connect with UNC-CH and coordinate strategy for locating/remediating racist language in archival description | End of Spring
Digitization strategy / stakeholder buy-in (reach out to DLF for other institutional models) | February/March
Offer to help plan/convene initial discussion of national virtual reading room | February/March
Devise initial plan to translate Aviary into all 22 languages from my archives | April/May
Contact LTW Forum attendees about forming a working/reading group around UX / developing a user study | March 2020
Look into Circa and reach out to public services about RMS open source | Next week
Gather examples of collections and digitization priorities that are inclusive / prioritize marginalized groups | April 2020
Migrate site from Drupal 7 to Drupal 8, improving accessibility and usability; redesign search interface afterward based on usability studies; conduct usability studies and A/B tests again this year | All completed, hopefully, by end of 2020

Lighting the Way Forum Feedback (survey instrument)

[Block: Overall Satisfaction]

Q1 Lighting the Way Forum, February 10-12, 2020. Thank you so much for attending the Lighting the Way Forum (either in person or by livestream), and for taking the time to leave feedback about your experience. This survey will take about 7 minutes to complete.

Q2 How likely are you to recommend the Lighting the Way Project to a friend or colleague? (0-10 scale)

Q3 What did you like most about the Lighting the Way Forum? [open-ended response]

Q4 What did you like least about the Lighting the Way Forum? [open-ended response]

Q5 In which of the following aspects of the event were you most interested?
o Learning about the project and its goals
o Watching invited presentations
o Participating in facilitated activities
o Social interactions/informal conversation with other participants
o Other (please specify)

Q26 In which of the following goals of the event were you most interested?
o To visualize, map, build connections in archival discovery/delivery – between one another, their work, the systems they rely on, and the communities they serve
o To organize around shared opportunities/challenges in archival discovery/delivery
o To provide a platform for engagement with the project

Q28 How satisfied are you with how the Forum addressed this goal: To visualize, map, build connections in archival discovery/delivery (Extremely satisfied / Somewhat satisfied / Neither satisfied nor dissatisfied / Somewhat dissatisfied / Extremely dissatisfied)

Q29 How satisfied are you with how the Forum addressed this goal: To organize around shared opportunities/challenges in archival discovery/delivery (same scale as Q28)

Q31 How satisfied are you with how the Forum addressed this goal: To provide a platform for engagement with the project (same scale as Q28)

[Block: IMLS Build Capacity Metrics]

Q35 In this next section, you are being asked to reflect on whether the Lighting the Way Forum has helped you and your organization build capacity in terms of archival discovery and delivery.

Q36 The Lighting the Way Forum has better prepared me to improve archival discovery/delivery at my organization. (Strongly agree / Somewhat agree / Neither agree nor disagree / Somewhat disagree / Strongly disagree)

Q42 The Lighting the Way Forum has better prepared me to collaborate with people across different roles/professional fields. (same scale as Q36)

Q37 The Lighting the Way Forum helped me grow my expertise to improve archival discovery/delivery for the communities my organization serves. (same scale as Q36)

Q38 The Lighting the Way Forum was a valuable networking opportunity. (same scale as Q36)

Q39 What did you learn at the Lighting the Way Forum? [open-ended response]

[Block: Attribute Satisfaction]

EA1-EA10 How satisfied were you with the following? (Extremely satisfied / Somewhat satisfied / Neither satisfied nor dissatisfied / Somewhat dissatisfied / Extremely dissatisfied / Not applicable)
● EA1 Venue
● EA2 Food
● EA3 Lodging (Cardinal Hotel/Schwab Residential Center, booked by project team)
● EA4 Audiovisual/Livestream
● EA5 Facilitated activities (e.g., TRIZ, Speedboat, etc.)
● EA6 Plenary presentations (Day 1 only)
● EA7 Length of forum
● EA8 Length of breaks
● EA9 Logistics (including registration/reimbursement)
● EA10 Inclusiveness of the forum (active facilitation, Community Agreements, Code of Conduct, travel support)

[Block: Valuable Experiences]

Q13 What would be the most valuable experience we could facilitate within Lighting the Way Forum's project activities? Please list as many ideas as you'd like. [open-ended response]

[Block: Demand Gen]

Q14 How did you hear about the Lighting the Way Forum? Select all that apply. (Email listserv / Social media post / Media article / Friend or colleague / Conference presentation / Google / From a project team member / Personal invitation / I don't recall / Other)

Q22 Would you be interested in further opportunities to participate in the Lighting the Way project (e.g., writing case studies or giving feedback on project deliverables)? (Yes / Maybe / No)

Q33 [Shown only if Q22 = Yes or Maybe] Please provide your email address so we may contact you about future participation opportunities. [open-ended response]

[Block: Anything Else]

Q18 Is there anything else you would like to share with us? [open-ended response]

[Block: Demographics]

Q19 Which statement best describes your current employment status? (Working: paid employee / Working: self-employed / Not working: temporary layoff from a job / Not working: looking for work / Not working: retired / Not working: disabled / Not working: other / Prefer not to answer)

Q25 Did you receive travel funding to participate in the Forum? Select all that apply. (I received travel funding from the Forum / I received travel funding from my employer / I did not receive travel funding / I did not need any travel funding)

Q33 What is your ZIP code? [open-ended response]

Quantitative Feedback Summary
Lighting the Way Forum Feedback, exported July 17, 2020, 4:57 PM PDT

Note: no usable responses were recorded in the WebinarRegistrantDidNotAttend group for the per-question tables below, so that column is omitted; counts are shown for WebinarAttendee and Participant respondents.

Net Promoter Score (Q2: How likely are you to recommend the Lighting the Way Project to a friend or colleague?)

Group | Detractor | Passive | Promoter | NPS (scale -100 to 100)
All respondents | 15% | 37% | 48% | 33.62
Participant | 12.00% | 30.00% | 58.00% | 46.00
WebinarAttendee | 17.19% | 42.19% | 40.63% | 23.08
WebinarRegistrantDidNotAttend | 100.00% in a single category [which category is not recoverable from the export]
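The Net Promoter Score above follows the standard convention for a 0-10 recommendation question: ratings of 9-10 count as promoters, 7-8 as passives, and 0-6 as detractors, and the score is the percentage of promoters minus the percentage of detractors. A minimal sketch of that calculation, assuming the standard cutoffs (the function name and sample ratings are illustrative, not taken from the survey data):

    def net_promoter_score(ratings):
        """NPS from 0-10 ratings: % promoters (9-10) minus % detractors (0-6)."""
        promoters = sum(1 for r in ratings if r >= 9)
        detractors = sum(1 for r in ratings if r <= 6)
        return 100 * (promoters - detractors) / len(ratings)

    # Illustrative only: a distribution of 48% promoters, 37% passives, and
    # 15% detractors yields an NPS of about 33, matching the overall score above.
    sample = [10] * 48 + [7] * 37 + [5] * 15
    print(net_promoter_score(sample))  # 33.0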
Q5 - In which of the following aspects of the event were you most interested?

Response | WebinarAttendee | Participant | Total
Learning about the project and its goals | 15 | 11 | 26
Watching invited presentations | 31 | 4 | 35
Participating in facilitated activities | 2 | 22 | 24
Social interactions/informal conversation with other participants | 0 | 8 | 8
Other (please specify) | 3 | 2 | 5

Other (please specify) responses. WebinarAttendee: "I would say all of the above, but I was only able to watch the livestreamed presentations."; "It's a tie between the first two: learning about the project and watching presentations."; "because I was remote, this was my only option". Participant: "Developing a shared agenda for next steps"; "Hard to choose between learning about the project and its goals and participating in the facilitated activities. These seem inextricably entwined."

Q26 - In which of the following goals of the event were you most interested?

Response | WebinarAttendee | Participant | Total
To visualize, map, build connections in archival discovery/delivery | 30 | 21 | 51
To organize around shared opportunities/challenges in archival discovery/delivery | 17 | 23 | 40
To provide a platform for engagement with the project | 2 | 3 | 5

For the scaled items below, the mean response code from the export is reported per group (satisfaction items coded 1 = Extremely satisfied through 5 = Extremely dissatisfied; agreement items coded 11 = Strongly agree through 15 = Strongly disagree).

Q28 - How satisfied are you with how the Forum addressed this goal: To visualize, map, build connections in archival discovery/delivery

Response | WebinarAttendee | Participant | Total
Extremely satisfied | 10 | 10 | 20
Somewhat satisfied | 18 | 28 | 46
Neither satisfied nor dissatisfied | 18 | 6 | 24
Somewhat dissatisfied | 3 | 3 | 6
Extremely dissatisfied | 0 | 0 | 0
Mean (SD): WebinarAttendee 2.29 (0.86), n = 49; Participant 2.04 (0.77), n = 47

Q29 - How satisfied are you with how the Forum addressed this goal: To organize around shared opportunities/challenges in archival discovery/delivery

Response | WebinarAttendee | Participant | Total
Extremely satisfied | 11 | 13 | 24
Somewhat satisfied | 20 | 21 | 41
Neither satisfied nor dissatisfied | 14 | 12 | 26
Somewhat dissatisfied | 3 | 1 | 4
Extremely dissatisfied | 0 | 0 | 0
Mean (SD): WebinarAttendee 2.19 (0.86), n = 48; Participant 2.02 (0.79), n = 47

Q31 - How satisfied are you with how the Forum addressed this goal: To provide a platform for engagement with the project

Response | WebinarAttendee | Participant | Total
Extremely satisfied | 12 | 16 | 28
Somewhat satisfied | 22 | 21 | 43
Neither satisfied nor dissatisfied | 9 | 7 | 16
Somewhat dissatisfied | 5 | 3 | 8
Extremely dissatisfied | 0 | 0 | 0
Mean (SD): WebinarAttendee 2.15 (0.91), n = 48; Participant 1.94 (0.86), n = 47

Q36 - The Lighting the Way Forum has better prepared me to improve archival discovery/delivery at my organization.

Response | WebinarAttendee | Participant | Total
Strongly agree | 8 | 6 | 14
Somewhat agree | 17 | 30 | 47
Neither agree nor disagree | 17 | 8 | 25
Somewhat disagree | 3 | 2 | 5
Strongly disagree | 2 | 1 | 3
Mean (SD): WebinarAttendee 12.45 (0.99), n = 47; Participant 12.19 (0.79), n = 47

Q42 - The Lighting the Way Forum has better prepared me to collaborate with people across different roles/professional fields.

Response | WebinarAttendee | Participant | Total
Strongly agree | 7 | 18 | 25
Somewhat agree | 15 | 18 | 33
Neither agree nor disagree | 17 | 10 | 27
Somewhat disagree | 5 | 1 | 6
Strongly disagree | 3 | 0 | 3
Mean (SD): WebinarAttendee 12.62 (1.06), n = 47; Participant 11.87 (0.82), n = 47

Q37 - The Lighting the Way Forum helped me grow my expertise to improve archival discovery/delivery for the communities my organization serves.

Response | WebinarAttendee | Participant | Total
Strongly agree | 8 | 10 | 18
Somewhat agree | 20 | 23 | 43
Neither agree nor disagree | 14 | 8 | 22
Somewhat disagree | 3 | 5 | 8
Strongly disagree | 2 | 1 | 3
Mean (SD): WebinarAttendee 12.38 (0.98), n = 47; Participant 12.23 (0.97), n = 47

Q38 - The Lighting the Way Forum was a valuable networking opportunity.

Response | WebinarAttendee | Participant | Total
Strongly agree | 4 | 36 | 40
Somewhat agree | 5 | 8 | 13
Neither agree nor disagree | 22 | 2 | 24
Somewhat disagree | 10 | 1 | 11
Strongly disagree | 6 | 0 | 6
Mean (SD): WebinarAttendee 13.19 (1.06), n = 47; Participant 11.32 (0.66), n = 47

EA1 - How satisfied were you with the following: Venue

Response | WebinarAttendee | Participant | Total
Extremely satisfied | 4 | 33 | 37
Somewhat satisfied | 2 | 13 | 15
Neither satisfied nor dissatisfied | 3 | 1 | 4
Somewhat dissatisfied | 0 | 0 | 0
Extremely dissatisfied | 0 | 0 | 0
Mean (SD): WebinarAttendee 1.89 (0.87), n = 9; Participant 1.32 (0.51), n = 47

EA2 - How satisfied were you with the following: Food

Response | WebinarAttendee | Participant | Total
Extremely satisfied | 0 | 22 | 22
Somewhat satisfied | 1 | 17 | 18
Neither satisfied nor dissatisfied | 1 | 5 | 6
Somewhat dissatisfied | 0 | 2 | 2
Extremely dissatisfied | 0 | 1 | 1
Mean (SD): WebinarAttendee 2.50 (0.50), n = 2; Participant 1.79 (0.94), n = 47

EA3 - How satisfied were you with the following: Lodging (Cardinal Hotel/Schwab Residential Center, booked by project team)

Response | WebinarAttendee | Participant | Total
Extremely satisfied | 1 | 19 | 20
Somewhat satisfied | 0 | 6 | 6
Neither satisfied nor dissatisfied | 1 | 1 | 2
Somewhat dissatisfied | 0 | 1 | 1
Extremely dissatisfied | 0 | 0 | 0
Mean (SD): WebinarAttendee 2.00 (1.00), n = 2; Participant 1.41 (0.73), n = 27

EA4 - How satisfied were you with the following: Audiovisual/Livestream

Response | WebinarAttendee | Participant | Total
Extremely satisfied | 18 | 14 | 32
Somewhat satisfied | 20 | 1 | 21
Neither satisfied nor dissatisfied | 1 | 1 | 2
Somewhat dissatisfied | 5 | 0 | 5
Extremely dissatisfied | 1 | 0 | 1
Mean (SD): WebinarAttendee 1.91 (1.03), n = 45; Participant 1.19 (0.53), n = 16

EA5 - How satisfied were you with the following: Facilitated activities (e.g., TRIZ, Speedboat, etc.)

Response | WebinarAttendee | Participant | Total
Extremely satisfied | 1 | 19 | 20
Somewhat satisfied | 0 | 25 | 25
Neither satisfied nor dissatisfied | 1 | 0 | 1
Somewhat dissatisfied | 1 | 3 | 4
Extremely dissatisfied | 0 | 0 | 0
Mean (SD): WebinarAttendee 2.67 (1.25), n = 3; Participant 1.72 (0.76), n = 47

EA6 - How satisfied were you with the following: Plenary presentations (Day 1 only)

Response | WebinarAttendee | Participant | Total
Extremely satisfied | 18 | 30 | 48
Somewhat satisfied | 18 | 15 | 33
Neither satisfied nor dissatisfied | 2 | 0 | 2
Somewhat dissatisfied | 2 | 1 | 3
Extremely dissatisfied | 0 | 0 | 0
Mean (SD): WebinarAttendee 1.70 (0.78), n = 40; Participant 1.39 (0.61), n = 46

EA7 - How satisfied were you with the following: Length of forum

Response | WebinarAttendee | Participant | Total
Extremely satisfied | 5 | 22 | 27
Somewhat satisfied | 8 | 22 | 30
Neither satisfied nor dissatisfied | 9 | 2 | 11
Somewhat dissatisfied | 4 | 1 | 5
Extremely dissatisfied | 0 | 0 | 0
Mean (SD): WebinarAttendee 2.46 (0.97), n = 26; Participant 1.62 (0.67), n = 47

EA8 - How satisfied were you with the following: Length of breaks

Response | WebinarAttendee | Participant | Total
Extremely satisfied | 4 | 22 | 26
Somewhat satisfied | 4 | 18 | 22
Neither satisfied nor dissatisfied | 2 | 3 | 5
Somewhat dissatisfied | 3 | 4 | 7
Extremely dissatisfied | 0 | 0 | 0
Mean (SD): WebinarAttendee 2.31 (1.14), n = 13; Participant 1.77 (0.90), n = 47

EA9 - How satisfied were you with the following: Logistics (including registration/reimbursement)

Response | WebinarAttendee | Participant | Total
Extremely satisfied | 4 | 34 | 38
Somewhat satisfied | 4 | 6 | 10
Neither satisfied nor dissatisfied | 3 | 3 | 6
Somewhat dissatisfied | 0 | 3 | 3
Extremely dissatisfied | 0 | 0 | 0
Mean (SD): WebinarAttendee 1.91 (0.79), n = 11; Participant 1.46 (0.88), n = 46

EA10 - How satisfied were you with the following: Inclusiveness of the forum (active facilitation, Community Agreements, Code of Conduct, travel support)

Response | WebinarAttendee | Participant | Total
Extremely satisfied | 6 | 34 | 40
Somewhat satisfied | 11 | 10 | 21
Neither satisfied nor dissatisfied | 4 | 2 | 6
Somewhat dissatisfied | 1 | 1 | 2
Extremely dissatisfied | 1 | 0 | 1
Mean (SD): WebinarAttendee 2.13 (0.99), n = 23; Participant 1.36 (0.67), n = 47

Q14 - How did you hear about the Lighting the Way Forum? Select all that apply.

Response | WebinarAttendee | Participant | Total
Email listserv | 32 | 18 | 50
Social media post | 4 | 3 | 7
Media article | 0 | 0 | 0
Friend or colleague | 8 | 16 | 24
Conference presentation | 3 | 3 | 6
Google | 0 | 0 | 0
From a project team member | 2 | 11 | 13
Personal invitation | 1 | 5 | 6
I don't recall | 1 | 1 | 2
Other | 2 | 2 | 4

Other responses. WebinarAttendee: "NAFAN project". Participant: "SAA"; "From two individuals (management level) at my institution involved in archival delivery/access/discovery projects. First from someone attending DLF, then through email listservs".

Q22 - Would you be interested in further opportunities to participate in the Lighting the Way project (e.g., writing case studies or giving feedback on project deliverables)?

Response | WebinarAttendee | Participant | Total
Yes | 20 | 33 | 53
Maybe | 14 | 13 | 27
No | 12 | 1 | 13

Q25 - Did you receive travel funding to participate in the Forum? (Select all that apply)

Response | WebinarAttendee | Participant | Total
I received travel funding from the Forum | 1 | 28 | 29
I received travel funding from my employer | 1 | 20 | 21
I did not receive travel funding | 6 | 1 | 7
I did not need any travel funding | 33 | 6 | 39
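The per-question tables above are crosstabs of response counts by respondent type, with row totals. A minimal sketch of how such a table, and the per-group mean/SD summaries, could be reproduced from a raw survey export with pandas; the file name and column names here are hypothetical, not taken from the actual Qualtrics export:

    import pandas as pd

    # Hypothetical export: one row per respondent, with the respondent's
    # group and their coded answer to each question.
    df = pd.read_csv("forum_feedback.csv")  # assumed columns: "respondent_type", "Q28"

    # Counts of each Q28 response broken out by respondent type, with a
    # "Total" margin like the tables above.
    table = pd.crosstab(df["Q28"], df["respondent_type"],
                        margins=True, margins_name="Total")
    print(table)

    # Mean, standard deviation, and n of the coded responses per group,
    # comparable to the per-question summary lines above.
    print(df.groupby("respondent_type")["Q28"].agg(["mean", "std", "count"]))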
(Select all that apply)

| # | Field | WebinarAttendee | Participant | WebinarRegistrantDidNotAttend | Total |
|---|---|---|---|---|---|
| 1 | I received travel funding from the Forum | 3.45% (1) | 96.55% (28) | 0.00% (0) | 29 |
| 2 | I received travel funding from my employer | 4.76% (1) | 95.24% (20) | 0.00% (0) | 21 |
| 3 | I did not receive travel funding | 85.71% (6) | 14.29% (1) | 0.00% (0) | 7 |
| 4 | I did not need any travel funding | 84.62% (33) | 15.38% (6) | 0.00% (0) | 39 |

End of Report

Table of contents
- Executive summary
- Acknowledgements
- Project background
  - Key concepts
  - Primary audiences and principles
  - Project activities and goals
- Forum design and structure
  - The application process, response rate, and travel funding
  - Forum conceptual background and overview
  - Day 1 (February 10, 2020): Overview; Trading Cards; Plenary presentations and themes; Mad Tea; Speedboat; Low-Tech Social Network; Retrospective
  - Day 2 (February 11, 2020): Overview; Trading Cards; Context Map; TRIZ; Affinity Map; 15% Solutions and 25/10 Crowd Sourcing; Retrospective
  - Day 3 (February 12, 2020): Overview; Social Network Webbing; Who/What/When Matrix; Retrospective
  - Forum Conclusion
- Evaluation and analysis
  - Retrospective and facilitator reflection: Day 1; Day 2; Day 3
  - Participant feedback survey: What people enjoyed the most about the Forum; What people enjoyed the least about the Forum; What people learned at the Forum; The most valuable experience that the project can offer
  - Emerging themes: Day 1 (Plenary presentations; Mad Tea; Speedboat; Low Tech Social Network; Retrospective); Day 2 (Context Map; TRIZ; Affinity Map; 15% Solutions and 25/10 Crowd Sourcing; Retrospective); Day 3 (Social Network Webbing; Who/What/When Matrix; Retrospective)
- Discussion and next steps
  - Scope and focus
  - Participation and community engagement
  - Facilitation and structure of meetings and activities
  - Written contributions
  - Next steps
- Appendices
  - Application form
  - Community Agreements and Code of Conduct
  - Lighting the Way Forum Playbook
  - 25/10 Crowd Sourcing Ideas
  - Anonymized Who/What/When Matrix actions
  - Feedback survey questions
  - Quantitative feedback summary

wiegand-cultures-2021 ---- Chapter 5. Cultures of Innovation: Machine Learning as a Library Service

Sue Wiegand, Saint Mary's College

Introduction

Libraries and librarians have always been concerned with the preservation of knowledge. To this traditional role, librarians in the 20th century added a new function—discovery—teaching people to find and use the library's collected scholarship. Information Literacy, now considered the signature pedagogy in library instruction, evolved from the previous Bibliographic Instruction. As Digital Literacy, the next stage, develops, students can come to the library to learn how to leverage the greatest strengths of Machine Learning. Machines excel at recognizing patterns; researchers at all levels can experiment with innovative digital tools and strategies, and build 21st century skill sets. Librarian expertise in preservation, metadata, and sustainability through standards can be leveraged as a value-added service. Leading-edge librarians now invite all the curious to benefit from the knowledge contained in the scholarly canon, accessible through libraries as curated living collections in multiple formats at distributed locations, transformed into new knowledge using new ways to visualize and analyze scholarship.
Library collections themselves, including digitized, unique local collections, can provide the data for new insights and ways of knowing produced by Machine Learning. The library could also be viewed as a technology sandbox, a place to create knowledge, connect researchers, and bring together people, ideas, and new technologies. Many libraries are already rising to this challenge, working with other cultural institutions in creating a culture of innovation as a new learning paradigm, exemplified by Machine Learning instruction and technology tool exploration.

Library Practice

The role of the library in preserving, discovering, and creating knowledge continues to evolve. Originally, libraries came into being as collections to be preserved, managed, and disseminated, a central repository of knowledge, possibly for political reasons (Ryholt and Barjamovic 2019, 1–2). Libraries founded by scholars and devoted to learning came later, during the Middle Ages (Casson 2001, 145). In more recent times, librarians began "[c]ollecting, organizing, and making information accessible to scholars and to citizens of a democratic republic" based on values developed during the Enlightenment (Bivens-Tatum 2012, 186).

Bibliographic Instruction in libraries, and later Information Literacy, embodied the idea of learning in the library as the next step beyond collecting, with librarians instructing on information infrastructure with the goal of empowering library users to find, evaluate, and use scholarly information in print and digital formats, with an emphasis on privacy and intellectual freedom as core library values. Now, librarians are also contributing to and participating in the learning enterprise by partnering with the disciplines to produce new knowledge. This final step of knowledge creation in the library completes the scholarly communications cycle of building on previous scholarship—"standing on the shoulders of giants."

One way to cultivate innovation in libraries is to include Machine Learning in the library's array of tools, resources, and services, both behind-the-scenes and public-facing. Librarians are expert at developing standards, preserving the scholarly record, and refining metadata to enhance interdisciplinary discovery of research, scholarship, and creative works. Librarian expertise could go far beyond local library collections to a global perspective and a normative practice of participation at scale in innovative emerging technologies such as Machine Learning.

For instance, citation analysis, both of prospective collections for the library to acquire and of the institution's research outputs, would provide valuable information for further collection development and for developing researchers' toolkits. Machine Learning, with its predilection for finding patterns, would reveal gaps in the literature and open up new questions to be answered, solving problems and leading to innovation. As one example, Yewno, a multi-disciplinary platform that uses Machine Learning to help combat "Information Overload," advertises that it "helps researchers, students, and educators to deeply explore knowledge across interdisciplinary fields, sparking new ideas along the way…" and "makes [government] information accessible by breaking open silos and comprehending the complicated interconnections across agencies and organizations," among other applications to improve discovery (Yewno n.d.).
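By way of illustration, here is a minimal sketch of the citation analysis described above, written in Python with toy data; the journal names, threshold, and input format are assumptions for illustration, not a description of any particular library system. It counts how often campus publications cite each journal and flags heavily cited titles missing from the holdings:

```python
from collections import Counter

def collection_gaps(cited_journals, held_journals, min_citations=10):
    """Count how often campus researchers cite each journal and flag
    heavily cited titles that the library does not hold."""
    citation_counts = Counter(cited_journals)
    return [(journal, n) for journal, n in citation_counts.most_common()
            if n >= min_citations and journal not in held_journals]

# Hypothetical inputs: in practice these would come from parsed
# reference lists and the library's knowledge base.
cited = ["J. Doc.", "Nature", "J. Doc.", "Nature", "Nature", "LIBER Q."]
held = {"LIBER Q."}
print(collection_gaps(cited, held, min_citations=2))
# -> [('Nature', 3), ('J. Doc.', 2)]
```

At scale, the cited-journal list would be parsed from reference lists or exported from a citation database, and the holdings set drawn from the library's knowledge base; the same pattern could flag subject areas rather than individual titles.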
Also, in 2019, the Library of Congress hosted a Summit as "part of a larger effort to learn about machine learning and the role it could play in helping the Library of Congress reach its strategic goals, such as enhancing discoverability of the Library's collections, building connections between users and the Library's digital holdings, and leveraging technology to serve creative communities and the general public" (Jakeway 2020). Integration of Machine Learning technologies is already starting at high levels in the library world.

New Services

A focus on Machine Learning can inspire new library services to enhance teaching and learning. Connecting people with ideas and with technology enables library virtual spaces to be used as a learning service by networking researchers at all levels in the enterprise of knowledge creation. Finding gaps in the literature would be a helpful first step in new library discovery tools. A way this could be done is through a "Researchers' Workstation," an end-to-end toolkit that might start by using Machine Learning tools to automate alerts of new content in a narrow area of interest and help researchers at all levels find and focus on problem-solving. A Researchers' Workstation could contain a collection of analytic tools and learning modules to guide users through the phases of discovery. Then, managing citations would be an important step in the process—storing, annotating, and sorting out the most relevant. Starting research reports, keeping lab notebooks, finding datasets, and preserving the researcher's own data are all relevant to the final results. A collaboration tool would enable researchers to find others with similar interests and share data or work collaboratively from anywhere, asynchronously. Having all these tools in one serendipitous virtual place is an extension of the concept of the library as the physical place to start research and scholarship. It is merely the containers of knowledge that are different.

Some of this functionality exists already, both in Open Source software such as Zotero for citation management, and in proprietary tools that combine multiple functions, such as Mendeley from Elsevier.[1] Other commercial publishers are developing tools to enable researchers to work within their proprietary platforms, from the point of searching for ideas and finding research gaps through the process of writing and submitting finished papers for publication. The Confederation of Open Access Repositories (COAR) is similarly developing "next generation repositories" software integrating end-to-end tools for the Open Access literature archived in repositories, to "facilitate the development of new services on top of the collective network, including social networking, peer review, notifications, and usage assessment" (Rodrigues et al. 2017, 5).

What else might a researcher want to do that the library could include in a Researchers' Workstation? Finding, writing, and keeping track of grants could be incorporated at some level. Generating a timeline might be helpful, and infographics and data visualizations could improve research communication and even help make the case for the importance of the study with others, especially the public and funders. Project management tools might be welcomed by some researchers, too.
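The alerting step described above could plausibly be prototyped with off-the-shelf text similarity. The sketch below assumes scikit-learn; the interest profile, threshold, and abstracts are invented for illustration, and a production Workstation would likely use richer models:

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

def alert_on_new_papers(interest_profile, new_abstracts, threshold=0.2):
    """Score incoming abstracts against a researcher's interest profile
    and return those similar enough to trigger an alert."""
    vectorizer = TfidfVectorizer(stop_words="english")
    # Fit on the profile plus the new abstracts so both share one vocabulary.
    matrix = vectorizer.fit_transform([interest_profile] + new_abstracts)
    scores = cosine_similarity(matrix[0:1], matrix[1:]).ravel()
    return [(abstract, round(float(s), 2))
            for abstract, s in zip(new_abstracts, scores) if s >= threshold]

# Toy demonstration: only the first abstract should match the profile.
profile = "machine learning for metadata and subject classification in libraries"
incoming = [
    "Automated subject classification of web documents with machine learning.",
    "A survey of medieval monastic library architecture.",
]
for abstract, score in alert_on_new_papers(profile, incoming):
    print(score, abstract)
```

In a real service the profile would be learned from a researcher's saved citations, and the incoming abstracts would arrive from publisher or repository feeds.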
Finally, when it's time to submit the idea (whether at the preliminary or preprint stage) to something like an arXiv-like repository or an institutional repository, as well as to journals of interest (also identified through Machine Learning tools), the process of submission, peer review, revision, and re-submitting could be done seamlessly. The tools and functions in the Workstation would ideally be modular, interoperable, and easy to learn and use, as well as continuously updated. The Workstation would be a complete ecosystem in the research cycle—saving time in the Scholarly Communications process and providing one place to go to for discovery, literature review, data management, collaboration, preprint posting, peer review, publication, and post-print commenting.[2]

Collections as Data, Collections as Resources

Exemplified by the literature search that now includes a myriad of Open content on a global basis, collections is the area that provides the greatest scope for library Machine Learning innovations to date, both applied and basic/theoretical. Especially if the pathway to using the expanded collections is clear and coherent, and the library provides instruction on why and how to use the various tools to save time and increase the impact of research, researchers at all levels will benefit from partnering with librarians for a more comprehensive view of current knowledge in an area. The Always Already Computational: Collections as Data final report and project deliverables and the Collections as Data: Part to Whole Project were designed to "develop models that support collections as data implementation and holistic reconceptualization of services and roles that support scholarly use…." The Project specifically seeks "to create a framework and set of resources that guide libraries and other cultural heritage organizations in the development, description, and dissemination of collections that are readily amenable to computational analysis" (Padilla et al. 2019).

As a more holistic approach to data-driven scholarship, these resources aim to provide access to large collections to enable computational use on the national level. Some current library databases have already built this kind of functionality. JSTOR, for example, will provide up to 25,000 documents (or more at special request) in a dataset for analysis.[3] Clarivate's Content as a Service provides Web of Science data to accommodate multiple purposes.[4] Besides the many freely available bibliodata sources, researchers can sign up for developer accounts in databases such as Scopus to work with datasets for text mining and computational analysis.[5] Using library-licensed collections as data could allow researchers to save time in reading a large corpus, stay updated on a topic of interest, analyze the most important topics at a given time period, confirm gaps in the research literature for investigation, and increase the efficiency of sifting through massive amounts of research in, for instance, the race to develop a COVID-19 vaccine (Ong 2020; Vamathevan 2019).

[1] See https://www.zotero.org and https://www.mendeley.com.
[2] In 2013, I wrote a blog post that mentions the idea (Wiegand 2013).
[3] See https://www.jstor.org/dfr/about/dataset-services.
[4] See https://clarivate.com/search/?search=computational%20datasets.
[5] See https://dev.elsevier.com/ and https://guides.lib.berkeley.edu/text-mining.

Learning Spaces

Machine Learning is a concept that calls out for educating library users through all avenues, including library spaces.
Taking a clue from other GLAM (Galleries, Libraries, Archives, and Museums) cultural institutions, especially galleries and museums, libraries and archives could mount exhibits and incorporate learning into library spaces as a form of outreach to teach how and why using innovative tools will save time and improve efficiency. Inspirational, continuously-updating dashboards and exhibits could show progress and possibilities, while physical and virtual tutorials might provide a game-like interface to spark creativity. Showcasing scholarship and incorporating events and speakers help create a new culture of ideas and exploration. Events bring people together in library spaces to network for collaborative endeavors. As an example, the Cleveland Museum of Art is analyzing visitor experiences using an ArtLens app to promote its collections.[6] The Library of Congress, as mentioned, hosted a summit that explored such topics as building Machine Learning literacy, attracting interest in GLAM datasets, operationalizing Machine Learning, crowdsourcing, and copyright implications for the use of content. As another example, in 2017 the United Kingdom's National Archives attempted to demystify Machine Learning and explore ethics and applications such as

topic modeling, which was used to find key phrases in Discovery record descriptions and enable innovative exploration of the catalogue; and it was also deployed to identify the subjects being discussed across Cabinet Papers. Other projects included the development of a system that found the most important sentence in a news article to generate automated tweeting, while another team built a system to recognise computer code written in different programming languages — this is a major challenge for digital preservation. (Bell 2018)

Finally, the HG Contemporary Gallery in Chelsea, in 2019, mounted an exhibit that utilized a "machine-learning algorithm that did most of the work" (Bogost 2019).

[6] See https://www.clevelandart.org/art-museums-and-technology-developing-new-metrics-measure-visitor-engagement and https://www.clevelandart.org/artlens-gallery/artlens-app.

Sustainable Innovation

Diversity, equity, and inclusion (DEI) concerns with the scholarly record, and increasingly with recognized biases implicit in algorithms, can be addressed by a very intentional focus on the value of differing perspectives in solving problems. Kat Holmes, an inclusive design expert previously at Microsoft and now a leading user experience designer at Google, urges a framework for inclusivity that counteracts bias with different points of view by recognizing exclusion, learning from human diversity, and bringing in new perspectives (Bedrossian 2018).
Making more data available, and more diverse data, will significantly improve the imbalance perpetuated by a traditional-only corpus. In sustainability terms, Machine Learning tools must be designed to continuously seek to incorporate diverse perspectives that go beyond the traditional definitions of the scholarly canon if they are to be useful in combating bias. Collections used as data in Machine Learning might undergo analysis by researchers, including librarian researchers, to determine the balance of content. Library subject headings should be improved to better reflect the diversity of human thought, cultures, and global perspectives.

Streamlining procedures is to everyone's benefit, and saving time is universally desired. Efficiency won't fix the time crunch everyone faces, but with too much to do and too much to read, information overload is a very real threat to advancing the research agenda and confronting a multitude of escalating global problems. Machine Learning techniques, applied at scale to large corpora of textual data, could help researchers pinpoint areas where the human researcher should delve more deeply, to eliminate irrelevant sources and home in on possible solutions to problems. One instance: a new service, Scite.ai, "can automatically tell readers whether papers have been supported or contradicted by later academic work" (Khamsi 2020). The WHO (World Health Organization) is providing a Global Research Database that can be searched or downloaded.[7] In research on self-driving vehicles, a systematic literature review found more than 10,000 articles, an estimated year's worth of reading for an individual. A tool called Iris.ai allowed groupings of this archive by topic and is one of several "targeted navigation" tools in development (Extance 2018). Working together as efficiently as possible is the only way to move ahead, and Machine Learning concepts, tools, and techniques, along with training, can be applied to increasingly large textual datasets to accelerate discovery. Machine Learning, like any other technology, augments human capacities; it does not replace them.

[7] See https://www.who.int/emergencies/diseases/novel-coronavirus-2019/global-research-on-novel-coronavirus-2019-ncov.

If 10% of library resources (measured in whatever way works for each particular library), including both the time of expert librarians and staff and financial resources, were utilized for innovation, libraries would develop a virtuous, self-sustaining cycle. Technologies that are not as useful can be assessed and dropped in an agile library, the useful can be incorporated into the 90% of existing services, and the resources (people and money) repurposed. In the same way, that 10% of library resources invested into innovations such as Machine Learning, whether in library practice or in instruction and other services, will keep the program and the library fresh. Creativity is key and will be the hallmark of successful libraries in the future. Stewardship of resources such as people's skills and expertise, and strategic use of the collections budget, are already library strengths.
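Tools like Iris.ai are proprietary, so the following is only a generic sketch of the topic-grouping idea mentioned above: k-means clustering over TF-IDF vectors with scikit-learn. The corpus, cluster count, and term labels are illustrative assumptions, not a description of any named product:

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.cluster import KMeans

def group_by_topic(abstracts, n_groups=5, top_terms=5):
    """Cluster article abstracts into rough topic groups and label each
    group with its highest-weighted terms, so a reader can decide which
    part of a 10,000-item result set to read first."""
    vectorizer = TfidfVectorizer(stop_words="english", max_features=5000)
    X = vectorizer.fit_transform(abstracts)
    model = KMeans(n_clusters=n_groups, n_init=10, random_state=0).fit(X)
    terms = vectorizer.get_feature_names_out()  # needs scikit-learn >= 1.0
    labels = {}
    for k, center in enumerate(model.cluster_centers_):
        top = center.argsort()[::-1][:top_terms]
        labels[k] = [terms[i] for i in top]
    return model.labels_, labels

# Toy corpus; a real run would use thousands of abstracts and more groups.
abstracts = [
    "self driving cars lidar sensor fusion",
    "autonomous vehicle navigation and path planning",
    "pedestrian detection for driverless vehicles",
    "catalog metadata quality in academic libraries",
    "library subject heading assignment workflows",
    "digital library discovery interfaces",
]
groups, labels = group_by_topic(abstracts, n_groups=2, top_terms=3)
print(labels)
```

The cluster labels are crude compared with commercial tools, but even this level of grouping lets a researcher triage a large result set instead of reading it linearly.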
By building out new services and tools, and instructing at all levels, libraries can reinvent themselves continuously by investing in creative and sustainable innovation, from digital and data literacy to assembling modules for a library-based, customized Researchers' Workstation that uses Machine Learning to enhance the efficiency of the scholars' research cycle.

Results and more questions

A library that adopted Machine Learning as an innovation technology would improve its practices; add new services; choose, use, and license collections differently; utilize all spaces for learning; and model innovative leadership. What is a library in rapidly changing times? How can librarians reconcile past identity, add value, and leverage hard-won expertise in a new environment? Change management is a topic that all institutions will have to confront as the digital age continues, as we reinvent ourselves and our institutions in a fast-paced technological world. Value-added, distinctive, unique—these are all words that will be part of the conversation. Not only does the library add value, but librarians will have to demonstrate and quantify that value while preparing to pivot at any time in response to crises and innovative opportunities. Distinctive library resources and services that speak to the institution's academic mission and purpose will be a key feature. What does the library do that no other entity on campus can do? At each particular inflection point, how best to communicate with stakeholders about the value of the distinctive library mission? Can the library work with other cultural heritage institutions to highlight the unique contributions of all?

One possible approach: develop a library science/library studies pedagogy, as well as outreach that encompasses the Scholarship of Teaching and Learning (SoTL), that pervades everything the library does in providing resources, services, and spaces. Emphasize that library resources help people solve multi-dimensional, complex problems, and then work on new ideas to save the time of researchers, improve discovery systems, and advocate for and facilitate Open Access and Open Source alternatives while enabling, empowering, and, yes, inspiring all users to participate in and contribute to the record of human knowledge. Librarians, as the traditional keepers of the scholarly canon in written form, have standing to do this as part of our legacy and as part of our envisioned future.

From the library users' point of view, librarians should think like the audience we are trying to reach to answer the question: why come into the library or use the library website instead of more familiar alternatives? In an era of increasing surveillance, library tools could be better known for an emphasis on privacy and confidentiality, for instance. This may require thinking more deeply about how we use our metrics and finding other ways to show how use of the library contributes to student success. It is also important to gather quantitative and qualitative evidence from library users themselves, and apply the feedback in an agile improvement loop.

In the case of Open Access vs. proprietary information, librarians should make the case for Open Access (OA) by advocating, explaining, and instructing library users from the first time they do literature searches to the time they are graduate students, post-docs, and faculty. Librarians should produce Open Educational Resources (OER) as well as encourage classroom faculty to adopt these tools of affordable education.
Libraries also need to facilitate Open Access content from discovery to preservation by developing search tools that privilege OA, using Open Source software whenever possible. Librarians could lead the way to changing the Scholarly Communications system by emphasizing change at the citations level: encourage researchers to insist on being able to obtain author-archived citations in a seamless way, and facilitate that through development of new discovery tools using Machine Learning. Improving discovery of Open Access, as well as embarking on expanded library publishing programs and advancing academic research, might be the most important endeavors that librarians could undertake at this point in time, to prevent a repeat of the "serials crisis" that commoditized scholarly information and to build a more diverse, equitable, and inclusive scholarly record. Well-funded commercial publishers are already engaging scholars and researchers in new proprietary platforms that could lock in academia more thoroughly than "Big Deals" did, even as the paradigm shifts away from large, expensive publishers' platforms and library subscription cancellations mount due to budget cuts and the desire to optimize value for money.

The concept of the "inside-out library" (Dempsey 2016) provides a way of thinking about opening local collections to discovery and use in order to create new knowledge through digitization and semantic linking, with cross-disciplinary technologies to augment traditional research and scholarship. Because these ideas are so new but fast-moving, librarians need to spread the word on possibilities in library publishing. Making local collections accessible for computational research helps to diversify findings and focuses attention on larger patterns and new ideas. In 2019, for instance, the Library of Congress sought to "Maximize the Use of its Digital Collection" by launching a program "to understand the technical capabilities and tools that are required to support the discovery and use of digital collections material," developing ethical and technological standards to automate in supporting emerging research techniques and "to preprocess text material in a way that would make that content more discoverable" (Price 2019). Scholarly Communication, dissemination, and discovery of research results will continue to be an important function of the library if trusted research results are to be available to all, not just the privileged. The so-called Digital Divide isolates and marginalizes some groups and regions; libraries can be a unifying force.

An important librarian role might be to identify gaps, in research or in dissemination, and work to overcome barriers to improving highly distributed access to knowledge. Libraries specialize in connecting disparate groups. Here is what libraries can do now: instruct new researchers (including undergraduate researchers and up) in theories, skills, and techniques to find, use, populate, preserve, and cite datasets; provide server space and/or Data Management services; introduce Machine Learning and text analysis tools and techniques; provide Machine Learning and text analysis tools and/or services to researchers at all levels. Researchers are now expected or even required to provide public scholarship, i.e., to bring their research into the public realm beyond obscure research journals, and to explain and illuminate their work, connecting it to the public good, especially in the case of publicly-funded research.
Librarians can and should partner in the public dissemination of research findings through explaining, promoting, and providing innovative new tools across siloed departments to catalyze cross-disciplinary research. Scholarly Communications began with books and journals shared by scholars over time; then libraries were assembled and built to contain the written record. Librarians should ensure that the Scholarly Communications and information landscape continues into the future with widely-shared, available resources in all formats, now including interactive, web-based software, embedded data analysis tools, and technical support of emerging Open Source platforms.

In addition, the flow of research should be smooth and seamless to the researcher, whether in a Researchers' Workstation or in other library tools. The research cycle should be both clearly explained and embedded in systems and tools. The library, as a central place that cuts across narrowly-defined research areas, could provide a systemic place of collaboration. Librarians, seeing the bigger picture, could facilitate research as well as disseminate and preserve the resulting data in journals and datasets. Further investigations on how researchers work, how students learn, best practices in pedagogy, and life-long learning in the library could mark a new era in librarianship, one that involves teaching, learning, and research as a self-reinforcing cycle. Beyond being a purchaser of journals and books, libraries can expand their role in the learning process itself into a cycle of continuous change and exploration, augmented by Machine Learning.

Library Science, Research, and Pedagogy

In Library and Information Science (LIS), graduate library schools should teach about Machine Learning as a way of innovating and emphasize pervasive innovation as the new normal. Creating a culture of innovation and creativity in LIS classes and in libraries will pay off for society as a whole, if librarians promote the advantages of a culture of innovation in themselves and in library users. Subverting the stereotypes of tradition-bound libraries and librarians will revitalize the profession and our workplaces, replacing fear of change and an existential identity crisis with a spirit of creative, agile reinvention that will rise to challenges rather than seek solace in denial, whether the seemingly impossible problem is preparedness in dealing with a pandemic or creatively addressing climate change.

Academic libraries must transition from a space of transactional (one-time) actions into a transformational, learning-centered user space, both physical and virtual, that offers an enhanced experience with teaching, learning, and research—a way to re-center the library as the place to get answers that go beyond the Internet. Libraries add value: do faculty, students, and other patrons know, for instance, that when they find the perfect book on a library shelf through browsing (or on the library website with virtual browsing), it is because a librarian somewhere assigned it a call number to group similar books together? The next step in that process is to use Machine Learning to generate subject headings, and also to show the librarians accomplishing that. This process is being investigated in different types of works, from fiction to scientific literature (Golub 2006; Joorabchi 2011; Wang 2009; Short 2019).
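The investigations cited above use a range of methods; as a minimal illustration of the general idea, the sketch below (assuming scikit-learn and a toy training set of already-catalogued records, both invented here) trains a classifier to suggest a subject heading that a cataloguer would then review:

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

# Training data: titles of already-catalogued records paired with the
# subject heading a cataloguer assigned (toy examples for illustration).
texts = [
    "Dewey decimal classification of fiction for public libraries",
    "Convolutional networks for image recognition benchmarks",
    "Subject indexing practices in academic catalogues",
    "Gradient descent optimization for deep neural networks",
]
headings = ["Cataloging", "Machine learning", "Cataloging", "Machine learning"]

model = make_pipeline(
    TfidfVectorizer(stop_words="english"),
    LogisticRegression(max_iter=1000),
)
model.fit(texts, headings)

# Suggest a heading for an uncatalogued record; a human cataloguer
# reviews the suggestion rather than accepting it blindly.
print(model.predict(["Gradient methods for training convolutional networks"]))
```

With realistic training data (millions of existing catalogue records carrying human-assigned headings), this kind of suggestion pipeline is exactly the "librarian plus machine" division of labor the chapter argues for: the model proposes, the cataloguer disposes.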
Cataloging, metadata, and enabling access through shared standards and Knowledge Bases are all things librarians do that add value for library users overwhelmed with Google hits, and they are worthy of further development, including in an Open environment.

Preservation is another traditional library function, and it now includes born-digital items and digitization of special collections/archives, increasing the library role. Discovery will be enhanced by Artificial/Augmented Intelligence and Machine Learning techniques. All of this should be taught in library schools, to build a new library culture of innovation and problem-solving beyond just providing collections and information literacy instruction. The new learning paradigm is immersive in all senses, and the future, as reflected in library transformation and partnerships with researchers, galleries, archives, museums, citizen scientists, hobbyists, and life-long learners re-tooling their careers and lives, is bright. LIS programs need to reflect that.

To promote learning in libraries, librarians could design a "You belong in the Library" campaign to highlight our diverse resources and new ways of working with technology, inviting participation in innovative technologies such as Machine Learning in an increasingly rare public, non-commercial space—telling why, showing how. In many ways, libraries could model ways to achieve academic success and life success, updating a traditional role in educating, instructing, preparing for the future, explaining, promoting understanding, and inspiring.

Discussion

The larger questions now are: who is heard and who contributes? How are gaps, identified in needs analysis, reduced? What are sources of funding for libraries to develop this important work and not leave it to commercial services? Library leadership and innovative thinking must converge to devise ways for libraries to bring people together, producing more diverse, ethical, innovative, inclusive, practical, transformative, and novel library services and physical and virtual spaces for the public good.

Libraries could start with analyses of needs: what problems could be solved with more effective literature searches? What research could fill gaps and inform solutions to those needs? What kind of teaching could help build citizens and critical thinkers, rather than simply encouraging consumption of content? Another need is to diversify collections used in Machine Learning, gathering cultural perspectives that reflect true diversity of thought through inclusion. All voices should be heard and empowered. Librarians can help with that.

A Researchers' Workstation could bring together an array of tools and content to allow not only the organization, discovery, and preservation of knowledge, but also facilitate the creation of new knowledge through the sustainable library, beyond the literature search. The world is converging toward networking and collaborative research all in one place.

I would like the library to be the free platform that brings all the others together. Coming full circle, my vision is that when researchers want to work on their research, they will log on to the library and find all they need…. The library is the one place … to get your scholarly work done. (Wiegand 2013)

The library as a platform should be a shared resource—the truest library value.

Here is a scenario. Suppose, for example, scholars wish to analyze the timeline of the beginning of the Coronavirus crisis.
Logging on to the library's Researchers' Workstation, they start with the Discovery module to generate a corpus of research papers from, say, December 2019 to June 2020. Using the Machine Learning function, they search for articles and books, looking for gaps and ideas that have not yet been examined in the literature. They access and download full text, save citations, annotate and take notes, and prepare a draft outline of their research using a word processing function, writing and citing seamlessly. A Methods (protocols) section could help determine the most effective path of the prospective research.

Then, they might search for the authors of the preprints and articles they find interesting, check the authors' profiles, and contact some of them through the platform to discern interest in collaborating. The profile system would list areas of interest, current projects, availability for new projects, etc. Using the Project Management function, scholars might open a new workspace where preliminary thoughts could be shared, with attribution and acknowledgement as appropriate, and a peer review timeline chosen to invite comments while authors can still claim the idea as their own.

If the preprint is successful, and the investigation shows promise after the results are in, the scholars could search for an appropriate journal for publication, the version of record. The author, with researcher ID (also contained in his/her profile), has the article added to the final published section of the profile, with a DOI. The journal showcases the article and sends out tables-of-contents alerts and press releases where it can be picked up by news services, and authors are invited to comment publicly. Each institution would celebrate its authors' accomplishments, use the Scholars' Workstation to determine impact and metrics, and promote the institution's research progress. Finally, the article would be preserved through the library repository and also through initiatives such as LOCKSS. Future scholars would find it still available and continue to discover and build on the findings presented. All of this and more would be done through the library.

Conclusion

Machine Learning as a library service can inspire new stages of innovation, energizing and providing a blueprint for the library future—teaching, learning, and scholarship for all. The teaching part of the equation invokes the faculty audience perspective: how can librarians help classroom faculty to integrate both library instruction and library research resources (collections, expertise, spaces) into the educational enterprise (Wiegand and Kominkiewicz 2016)? How can librarians best teach skills, foster engagement, and create knowledge to make a distinctive contribution to the institution? Our answers will determine the library's future at each academic institution. Machine Learning skills, engagement, and knowledge should fit well with the library's array of services.

Learning is another traditional aspect of library services, this time from the student point of view. The library provides collections—multimedia or print on paper, digital and digitized, proprietary and open, local, redundant, rare, unique. The use of collections is taught by both librarians and disciplinary faculty in the service of learning, including life-long learning for non-academic, everyday knowledge.
Students need to know more about Machine Learning, from data literacy to digital competencies, including concerns about privacy, security, and fake news across the curriculum, while learning skills associated with Machine Learning. In addition, through Open Access, library "collections" now encompass the world beyond the library's physical and virtual spaces.

Then, as libraries, like all digitally-inflected institutions, develop "change management" strategies, they need to double down on these unique affordances and communicate them to stakeholders. The most critical strategy is embedding the Scholarship of Teaching and Learning (SoTL) in all aspects of the library workflow. Instead of simply advertising new electronic resources or describing Open Access versus proprietary resources, libraries should broadly embed the lessons of copyright, surveillance, and reproducibility into patron interactions, from the first undergraduate literature search to the faculty research consultation. Then, reinforce those lessons by emphasizing open access and data mining permissions in their discovery tools. These are aspects of the scholarly research cycle over which libraries have some control. By exerting that control, libraries will promote a culture that positions Machine Learning and other creative digital uses of library data as normal, achievable parts of the scholarly process.

To complete the Scholarly Communications lifecycle, support for research, scholarship, and creative works is increasingly provided by libraries as a springboard to the creation of knowledge, the library's newest role. This is where Machine Learning as a new paradigm fits in most compellingly as an innovative practice. Libraries can provide not only associated services such as Data Management of the datasets resulting from analyzing huge textual corpora, but also databases of proprietary and locally-produced content from inter-connected, cooperating libraries on a global scale. Researchers—faculty, students, and citizens (including alumni)—will benefit from crowdsourcing and citizen science while gaining knowledge and contributing to scholarship. But perhaps the largest benefit will be learning by doing, escaping the "black box" of blind consumerism to see how algorithms work and thus develop a more nuanced view of reality in the Machine Age.

References

Bedrossian, Rebecca. 2018. "Recognizing Exclusion is the Key to Inclusive Design: In Conversation with Kat Holmes." Campaign (blog). July 25, 2018. https://www.campaignlive.com/article/recognizing-exclusion-key-inclusive-design-conversation-kat-holmes/1488872.

Bell, Mark. 2018. "Machine Learning in the Archives." National Archives (blog). November 8, 2020. https://blog.nationalarchives.gov.uk/machine-learning-archives/.

Bivens-Tatum, Wayne. 2012. Libraries and the Enlightenment. Los Angeles: Library Juice Press. Accessed January 6, 2020. ProQuest Ebook Central.

Bogost, Ian. 2019. "The AI-Art Gold Rush is Here." The Atlantic. March 6, 2019. https://www.theatlantic.com/technology/archive/2019/03/ai-created-art-invades-chelsea-galler.

Casson, Lionel. 2001. Libraries in the Ancient World. New Haven: Yale University Press. Accessed January 6, 2020. ProQuest Ebook Central.

Dempsey, Lorcan. 2016. "Library Collections in the Life of the User: Two Directions." LIBER Quarterly 26: 338–359. https://doi.org/10.18352/lq.10170.

Extance, Andy. 2018. "How AI Technology Can Tame the Scientific Literature." Nature 561: 273–274.
https://doi.org/10.1038/d41586-018-06617-5.

Golub, K. 2006. "Automated Subject Classification of Textual Web Documents." Journal of Documentation 62: 350–371. https://doi.org/10.1108/00220410610666501.

Jakeway, Eileen. 2020. "Machine Learning + Libraries Summit: Event Summary now live!" The Signal (blog), Library of Congress. February 12, 2020. https://blogs.loc.gov/thesignal/2020/02/machine-learning-libraries-summit-event-summary-now-live/.

Joorabchi, Arash and Abdulhussin E. Mahdi. 2011. "An Unsupervised Approach to Automatic Classification of Scientific Literature Utilising Bibliographic Metadata." Journal of Information Science. https://doi.org/10.1177/016555150000000.

Khamsi, Rozanne. 2020. "Coronavirus in context: Scite.ai Tracks Positive and Negative Citations for COVID-19 Literature." Nature. https://doi.org/10.1038/d41586-020-01324-6.

Padilla, Thomas, Laurie Allen, Hannah Frost, et al. 2019. "Final Report — Always Already Computational: Collections as Data." Zenodo. May 22, 2019. https://doi.org/10.5281/zenodo.3152935.

Price, Gary. 2019. "The Library of Congress Posts Solicitation For a Machine Learning/Deep Learning Pilot Program to 'Maximize the Use of its Digital Collection.'" Library Journal. June 13, 2019.

Rodrigues, Eloy et al. 2017. "Next Generation Repositories: Behaviours and Technical Recommendations of the COAR Next Generation Repositories Working Group." Zenodo.
?iiTb,ff/Q BXQ`;fRyXRyykf�bBXkRR9d. Wiegand, Sue. 2013. “ACS Solutions: The Sturm und Drang.” ACRLog (blog), Association of College and Research Libraries. November 8, 2020. ?iiTb,ff�+`HQ;XQ`;fkyRjfy9fy ef�+b@bQHmiBQMb@i?2@bim`K@mM/@/`�M;f. Wiegand, Sue and Frances Kominkiewisz. 2016. Unpublished manuscript. “Integration of Stu- dent Learning through Library and Classroom Instruction.” Yewno. n.d. “Yewno — Transforming Information into Knowledge.” Accessed January 6, 2020. ?iiTb,ffrrrXv2rMQX+QKf. Further Reading Abbattista, Fabio, Luciana Bordoni, and Giovanni Semeraro. 2003. “Artificial Intelligence for Cultural Heritage and Digital Libraries.” Applied Artificial Intelligence 17, no. 8/9: 681. ?iiTb,ff/QBXQ`;fRyXRy3yfdRj3kdk83. Ard, Constance. 2017. “Advanced Analytics Meets Information Services.” Online Searcher 41, no. 6: 21–24. “Artificial Intelligence and Machine Learning in Libraries.” 2019. Library Technology Reports 55, no. 1: 1–29. Badke, William. 2015. “Infolit Land. The Effect of Artificial Intelligence on the Future of In- formation Literacy.” Online Searcher 39, no. 4: 71–73. Boman, Craig. 2019. “Chapter 4: An Exploration of Machine Learning in Libraries.” Library Technology Reports 55: 21–25. Breeding, Marshall. 2018. “Chapter 6: Possible Future Trends.” Library Technology Reports 54, no. 8: 31–32. Dempsey, Lorcan, Constance Malpas, and Brian Lavoie. 2014. “Collection Directions: The Evolution of Library Collections and Collecting” portal: Libraries and the Academy 14, no. 3 (July): 393-423. ?iiTb,ff/QBXQ`;fRyXRj8jfTH�XkyR9XyyRj. Enis, Matt. 2019. “Labs in the Library.” Library Journal 144, no. 3: 18–21. Finley, Thomas. 2019. “The Democratization of Artificial Intelligence: One Library’s Approach.” Information Technology & Libraries 38, no. 1: 8–13. ?iiTb,ff/QBXQ`;fRyXeyRdfBi �HXpj3BRXRyNd9. Frank, Eibe and Gordon W. Paynter. 2004. “Predicting Library of Congress Classifications From Library of Congress Subject Headings.” Journal of The American Society for Information Science and Technology 55, no. 3. ?iiTb,ff/QBXQ`;fRyXRyykf�bBXRyjey. Geary, Daniel. 2019. “How to Bring AI into Your Library.” Computers in Libraries 39, no. 7: 32–35. https://doi.org/10.5281/zenodo.1215014 https://doi.org/10.1038/s41573-019-0024-5 https://doi.org/10.1002/asi.21147 https://doi.org/10.1002/asi.21147 https://acrlog.org/2013/04/06/acs-solutions-the-sturm-und-drang/ https://acrlog.org/2013/04/06/acs-solutions-the-sturm-und-drang/ https://www.yewno.com/ https://doi.org/10.1080/713827258 https://doi.org/10.1353/pla.2014.0013 https://doi.org/10.6017/ital.v38i1.10974 https://doi.org/10.6017/ital.v38i1.10974 https://doi.org/10.1002/asi.10360 Wiegand 61 Griffey, Jason. 2019. “Chapter 5: Conclusion.” Library Technology Reports 55, no. 1: 26–28. Inayatullah, Sohail. 2014. “Library Futures: From Knowledge Keepers to Creators.” Futurist 48, no. 6: 24–28. Johnson, Ben. 2018. “Libraries in the Age of Artificial Intelligence.” Computers in Libraries 38, no. 1: 14–16. Kuhlman, C., L. Jackson, and R. Chunara. 2020. “No Computation without Representation: Avoiding Data and Algorithm Biases through Diversity.” ArXiv:2002.11836v1 [cs.CY], February. ?iiT,ff�`tBpXQ`;f�#bfkyykXRR3je. Lane, David C. and Claire Goode. 2019. “OERu’s Delivery Model for Changing Times: An Open Source NGDLE.” Paper presented at the 28th ICDE World Conference on Online Learning, Dublin, Ireland, November 2019. ?iiTb,ffQ2`mXQ`;f�bb2ibfJ�`+QKbf P1_m@L:.G1@T�T2`@6AL�G@S.6@p2`bBQMXT/7. Liu, Xiaozhong, Chun Guo, and Lin Zhang. 2014. 
"Scholar Metadata and Knowledge Generation with Human and Artificial Intelligence." Journal of the Association for Information Science & Technology 65, no. 6: 1187–1201. https://doi.org/10.1002/asi.23013.

Mitchell, Steve. 2006. "Machine Assistance in Collection Building: New Tools, Research, Issues, and Reflections." Information Technology & Libraries 25, no. 4: 190–216. https://doi.org/10.6017/ital.v25i4.3353.

Ojala, Marydee. 2019. "ProQuest's New Approach to Streamlining Selection and Acquisitions." Information Today 36, no. 1: 16–17.

Ong, Edison, Mei U. Wong, Anthony Huffman, and Yongqun He. 2020. "COVID-19 Coronavirus Vaccine Design Using Reverse Vaccinology and Machine Learning." Frontiers in Immunology 11. https://doi.org/10.3389/fimmu.2020.01581.

Orlowitz, Jake. 2017. "You're a Researcher Without a Library: What Do You Do?" A Wikipedia Librarian (blog), Medium. November 15, 2017. https://medium.com/a-wikipedia-librarian/youre-a-researcher-without-a-library-what-do-you-do-6811a30373cd.

Padilla, Thomas. 2019. Responsible Operations: Data Science, Machine Learning, and AI in Libraries. Dublin, OH: OCLC Research. https://doi.org/10.25333/xk7z-9g97.

Plosker, George. 2018. "Artificial Intelligence Tools for Information Discovery." Online Searcher 42, no. 3: 31–35. https://www.infotoday.com/OnlineSearcher/Articles/Features/Artificial-Intelligence-Tools-for-Information-Discovery-124721.shtml.

Rak, Rafal, Andrew Rowley, William Black, and Sophie Ananiadou. 2012. "Argo: an Integrative, Interactive, Text Mining-based Workbench Supporting Curation." Database: the Journal of Biological Databases and Curation. https://doi.org/10.1093/database/bas010.

Schmidt, Lena, Babatunde Kazeem Olorisade, Julian Higgins, and Luke A. McGuinness. 2020. "Data Extraction Methods for Systematic Review (Semi)automation: A Living Review Protocol." F1000Research 9: 210. https://doi.org/10.12688/f1000research.22781.2.

Schonfeld, Roger C. 2018. "Big Deal: Should Universities Outsource More Core Research Infrastructure?" Ithaka S+R. https://doi.org/10.18665/sr.306032.

Schockey, Nick. 2013. "How Open Access Empowered a 16-year-old to Make Cancer Breakthrough." June 12, 2013. http://www.openaccessweek.org/video/video/show?id=5385115%3AVideo%3A90442.
Short, Matthew. 2019. "Text Mining and Subject Analysis for Fiction; or, Using Machine Learning and Information Extraction to Assign Subject Headings to Dime Novels." Cataloging & Classification Quarterly 57, no. 5: 315–336. https://doi.org/10.1080/01639374.2019.1653413.

Thompson, Paul, Riza Theresa Batista-Navarro, and Georgio Kontonatsios. 2016. "Text Mining the History of Medicine." PloS One 11, no. 1: e0144717. https://doi.org/10.1371/journal.pone.0144717.

White, Philip. 2019. "Using Data Mining for Citation Analysis." College & Research Libraries 80, no. 1. https://scholar.colorado.edu/concern/parent/cr56n1673/file_sets/9019s3164.

Witbrock, Michael J. and Alexander G. Hauptmann. 1998. "Speech Recognition for a Digital Video Library." Journal of the American Society for Information Science 49, no. 7: 619–32. https://doi.org/10.1002/(SICI)1097-4571(19980515)49:7<619::AID-ASI4>3.0.CO;2-A.

Zuccala, Alesia, Maarten Someren, and Maurits Bellen. 2014. "A Machine-Learning Approach to Coding Book Reviews as Quality Indicators: Toward a Theory of Megacitation." Journal of the Association for Information Science & Technology 65, no. 11: 2248–60. https://doi.org/10.1002/asi.23104.

schindel-economic-2020 ---- Economic Analyses of Federal Scientific Collections: Methods for Documenting Costs and Benefits

David E. Schindel and the Economic Study Group of the Interagency Working Group on Scientific Collections

Smithsonian Scholarly Press, Washington, D.C., 2020
Cover images (from top to bottom): Aedes aegypti, a disease-transmitting mosquito, after extracting a blood meal from a human subject; photo by James Gathany (2006), courtesy of CDC Public Health Image Library (https://phil.cdc.gov/Details.aspx?pid=8923). Guatemalan bat, collected in 2011 to determine the prevalence of pathogens; CDC's Division of Vector Borne Diseases worked closely with the Global Disease Detection Regional Center in Guatemala and the Universidad del Valle de Guatemala; photo courtesy of CDC Public Health Image Library (https://phil.cdc.gov/Details.aspx?pid=18013). Rodents collected during or after World War II in Formosa (now Taiwan) as part of anti-typhus efforts; vials probably contain fleas and other ectoparasites that can transmit pathogens; photo by U.S. Army Signal Corps, courtesy of National Museum of Health and Medicine, Otis Historical Archives 343 (https://www.flickr.com/photos/medicalmuseum/3543831668/in/album-72157618366141658/). CDC scientist stacks plates with human serum samples into a robotic system that detects the presence of antibodies that protect the subject from poliovirus; photo by James Gathany, courtesy of CDC Public Health Image Library (https://phil.cdc.gov/Details.aspx?pid=22912). A microbiologist in CDC's Special Pathogens Branch lowers a rack of boxes containing cryovials into a liquid nitrogen freezer for long-term storage; photo by James Gathany, courtesy of CDC Public Health Image Library (https://phil.cdc.gov/Details.aspx?pid=10724).

Published by Smithsonian Institution Scholarly Press, P.O. Box 37012, MRC 957, Washington, D.C. 20013-7012. https://scholarlypress.si.edu

Copyright Information: This document is a work of the United States Government and is in the public domain (see 17 U.S.C. §105). Subject to the stipulations below, it may be distributed and copied with acknowledgment to the Smithsonian Institution. Copyrights to graphics included in this document where the original copyright holders or their assignees are credited are used here under the government's license and by permission. Requests to use these images must be made to the provider identified in the image credits.

Disclaimer: The findings and conclusions in this report are those of the authors and do not necessarily represent the official position of any participating agency.

Suggested Citation: Schindel, D. E. and the Economic Study Group of the Interagency Working Group on Scientific Collections. 2020. "Economic Analyses of Federal Scientific Collections: Methods for Documenting Costs and Benefits." Report. Washington, DC: Smithsonian Scholarly Press. https://doi.org/10.5479/si.13241612

Library of Congress Control Number: 2020044628. ISBN (online): 978-1-944466-41-1. ISBN (print): 978-1-944466-42-8. Publication date (online): 20 November 2020. Open access PDF available from Smithsonian Institution Scholarly Press at https://doi.org/10.5479/si.13241612

Printed in the United States of America. The paper used in this publication meets the minimum requirements of the American National Standard for Permanence of Paper for Printed Library Materials Z39.48–1992.
Contents

Executive Summary
Acknowledgments
Introduction
  Scientific Collections as a Marketplace
Types of Federal Collections
Ownership and Stewardship of Federal Collections
Costs Related to Federal Collections
  Project Collection Costs
  Institutional Collection Costs
  Services Provided by Institutional Collections
  Cost Categories and Accounting
  Method for Reconstructing Collection Budgets
  Cost Recovery
Benefits Generated by Federal Collections
  Technology/Knowledge Transfer
  Success Stories
  Option Value
  Value Added by Users
  Counter-Factual Scenarios
  Comparison among Methods
Implications for Policies and Management
  Constraint A
  Constraint B
  Constraint C
Recommendations
Appendix 1: Abbreviations and Acronyms
Appendix 2: Collections Cited
Notes
References

Executive Summary

Federal object-based scientific collections have been created to serve agency missions and, in a few cases, to comply with legislative and regulatory mandates. "Project collections" (those managed by the researchers who obtained them for restricted use) and their costs and benefits were considered too varied for standard methodologies that assess costs and benefits.
In a few cases, departments and agencies are required by legislation or regulations to retain objects in long-term "institutional collections." In most cases, decisions to retain objects are based on long-term costs relative to the perceived potential for benefits to taxpayers. Federal collections vary in their philosophies and practices of offsetting operating costs by charging users for access to their collections. Operational costs vary among institutional collections, reflecting differences in the size of collections, types of material they contain, and differences in the services they provide to the agency, extramural users, and society in general. This report describes six general services that federal institutional collections provide. Departments and agencies vary in the number of services they offer and the degree to which these services have been developed.

Returns on investment are controlled, in large measure, by decisions about what is accessioned into a collection, policies concerning user fees and access, and the services provided by a collection. Those collections that offer only the basic service of accessioning objects have limited ability to generate benefits because few users will know about and have access to objects in the collection. By offering more services, collections broaden their potential use to: future generations (through proper maintenance and preservation), intramural researchers and extramural users (through online documentation and user access programs), users in other disciplines (through data curation), and the general public (through education and outreach).

The benefits generated by federal institutional collections can take many forms, both monetary and non-monetary. These benefits are usually indirect and delayed, and the value chains that connect costs to benefits are generally difficult to document. This report describes five methodologies (and their strengths and weaknesses) that are available to federal collections for describing and estimating the benefits they generate. Departments and agencies can use the methods described here for evidence-based decisions concerning policies and management practices for their institutional collections.

Acknowledgments

The Interagency Working Group for Scientific Collections (IWGSC) consists of representatives from more than 15 Federal departments and agencies, each of which owns, manages, and/or provides financial support for scientific collections. These agencies and their collections cover a wide spectrum of scientific disciplines. In developing this report, IWGSC's Economic Study Group (ESG) considered the full range of IWGSC disciplines and organizational missions while examining collections from an economic perspective. This report was developed over an 18-month period during which the ESG received and discussed dozens of online presentations devoted to case studies of costs and benefits, as well as methodologies used to evaluate them. The findings transmitted in this report were made possible by critical contributions from the following individuals.

Members of the Economic Study Group

The following individuals participated in ESG's online meetings during which they received and discussed presentations on the economics and management of scientific collections. ESG members are scientists representing different disciplines, collection professionals, economists, and policy specialists.
Several ESG members gave presentations to the group, and all members shared their experiences and insights. ESG's interagency and interdisciplinary discussions, and the generous contributions of time and effort by its members, produced the findings and recommendations presented here. The members of the ESG are: Reed Beaman (NSF), Brad Bowzard (HHS/CDC), Vanessa Burrows (HHS/FDA), Jeffrey DeGrasse (HHS/FDA), Kevin Hackett (USDA/ARS), Marianne Henderson (HHS/NIH), Susan Lukacs (HHS/CDC), Gerry McQuillan (HHS/CDC), Scott E. Miller (SI), Tom Moreland (USDA/FS), Emily Pindilli (DOI/USGS), Cassidy R. Sugimoto (NSF), John P. Swann (HHS/FDA), Michael Walsh (DOC/NIST), Ellen Wann (HHS/NIH), and Paul Zankowski (USDA).

Presenters

The following individuals, not associated with the ESG, provided presentations that were critical in developing the report's findings and recommendations: Lindsay Powers (USGS), Erik Lichtenberg and Lars Olson (University of Maryland), Douglas Gollin (Oxford University), Jessica Jones (HHS/FDA Gulf Coast Seafood Laboratory), Peter Bretting (USDA/ARS Crop Production and Protection), Todd Ward (USDA/ARS Culture Collection), Kelly Day-Rubenstein and Paul Heisey (USDA/Economic Research Service), and Abhi Rao (HHS/NIH).

Other Contributors

Members of the IWGSC who were not ESG members provided thorough reviews of the draft report, often contributing valuable information about their collections, which was incorporated into the report. Eileen Graham (SI) provided Appendix 2: Collections Cited using information from the IWGSC Clearinghouse and its registry of U.S. Federal Scientific Collections. Keith Crane, Lauren Bartels, and Thomas Olszewski (IDA Science and Technology Policy Institute) provided support for this study under an interagency agreement with the Office of Science and Technology Policy. Deborah Paul (University of Florida) provided the benefit of her experience in natural history collection digitization, aggregation of data on collection usage, and data leadership.

Introduction

Physical objects form the basis of research in many scientific disciplines. Organisms, soil, medical samples, meteorites, and thousands of other types of objects provide the evidence that is central to the missions of many Federal departments and agencies. These missions include research, regulatory responsibilities, and complying with legislation that serve the Nation's interests. In many cases, departments and agencies decide to retain objects for long-term preservation in scientific collections, in anticipation of future use or in compliance with regulation or legislation. The long-term support for collections is a commitment of Federal resources without clear evidence of future returns on these investments. History has shown that some of these objects prove to be valuable, even critical, in solving mission-related challenges. They may help to cure diseases, save agricultural crops, avoid natural disasters, and provide other tangible benefits to the Nation. Many others may not have been used since being added to a long-term collection. Faced with their uncertain future value, how can Federal collection officials decide which objects to preserve and which to discard? How can departments and agencies justify the cost of creating and maintaining scientific collections? Are there evidence-based approaches that can reduce the costs and increase the benefits associated with Federal scientific collections?
In 2005, the Office of Science and Technology Policy (OSTP)[1] created an Interagency Working Group on Scientific Collections (IWGSC)[2] to explore and solve the common challenges that face Federal departments and agencies with scientific collections. Their 2009 report included a survey of more than 300 collections in 14 departments and agencies, documenting their size, uses, management practices and financial support (IWGSC 2009). The report also put forward recommendations, including the development of better methods for documenting the budget support needed by these collections. In 2010, OSTP directed all Federal departments and agencies with collections to implement this recommendation (OSTP 2010) and the America COMPETES Reauthorization Act of 2010[3] said:

(d) Cost Projections - The Office of Science and Technology Policy, in consultation with relevant Federal agencies, shall develop a common set of methodologies to be used by Federal agencies for the assessment and projection of costs associated with the management and preservation of their scientific collections.

This report presents the findings of a year-long IWGSC study to provide Federal departments and agencies with standard methods for documenting costs and benefits related to their scientific collections.[4] These findings enable Federal scientific collections to do the following:

• Document the costs related to operating their long-term institutional collections, whether or not they have separate line items in the budgets of their organizational structure;
• Identify how these costs are allocated to the six standard services provided by institutional collections;
• Review their philosophies and practices regarding user fees and other forms of cost recovery, and consider how they may be affecting collection use and the benefits stemming from this use;
• Consider new methods for documenting the benefits generated by use of their collections, especially the five methods presented in the report, with examples from Federal collections:
  ◦ Tracing the value chains from collection use to new products, processes, or other economic activity;
  ◦ Describing the roles their collections have played recently in breakthrough discoveries and meeting major societal challenges;
  ◦ Associating collections and the services they provide with the mission-critical roles they have played in the past, and are being maintained to serve if needed again;
  ◦ Documenting the pattern of collection use and the financial assets that users are investing in the collection; and
  ◦ Gathering data on the losses the Nation's economy and well-being would suffer if the collections and the services they provide didn't exist.

Scientific Collections as a Marketplace

Taken collectively, the collections in any discipline resemble a marketplace with providers, customers, and limited supplies of goods and capital. In this context, "goods" refers to the objects in scientific collections, not other goods and services provided by institutions with collections (e.g., jobs created, increased tourism, museum gift shop items sold; see American Alliance of Museums, 2017). Providers bear the costs of making goods available, and consumers want the highest quality goods at the lowest cost possible. The marketplace analogy is not perfect for two principal reasons. First, the prices that consumers pay for access to goods are not the result of competition or other market forces.
Only a minority of collections owned and operated by Federal departments and agencies charge fees that are set to recover costs, and they are never set to generate profits. Most Federal collections charge fees that recover only a portion of costs borne by providers, and they often charge only for shipping, or for nothing at all. Second, the benefits generated from the goods provided by collections are generally bestowed on society, not the consumers themselves. They take the form of new knowledge in the public domain, or improved public health and security, or protection against future shocks to the environment, food supply, and public safety. Relatively few Federal collections provide goods that private companies use to generate revenue and profits. Schumann (2014) provides a clear summary of the distinctions between valuations of public and private goods; Smale and Koo (2003) provide a taxonomy of values generated by scientific collections used for plant breeding. Scientific collections have been described by some authors as "global public goods" and have been discussed in the context of the "public commons" (e.g., Halewood, 2013 for plant genetic resources; Kothamasi, Spurlock, and Kiers, 2011 for agricultural microbial resources; and Reichman, Uhlir, and Dedeurwaerdere, 2015 for microbial culture collections). This literature focuses on intellectual property law pertaining to collections but does not explore methods for estimating their costs and benefits, and was not considered directly germane to this report. This study explores the other factors that determine the costs borne by providers and the benefits generated from the use of collection-based goods obtained by consumers. A clearer understanding of these factors and the relationships among them may equip Federal collections to make evidence-based policies and better management decisions (Graves 2003).

Box 1. Glossary of Terms

Accessioning is the process of transferring ownership of objects into an institutional collection. Collections may incorporate the records of the objects into the registry of that collection and objects may be physically transferred into a collection at the time of accessioning, but these procedures vary.

Deaccessioning is the process of ending legal ownership of objects in an institutional collection, either by transferring ownership to another institution or destroying the objects.

Institutional collections are scientific collections that are made available for use by qualified researchers, companies, government agencies and other qualified users (see original definition in IWGSC 2013a). They can be non-renewable or renewable (defined below), and they are generally managed by collection professionals.

Non-renewable collections consist of objects that have been collected or fabricated at a specific time and place and are no longer being produced. They cannot be replaced with identical objects so they are maintained and preserved long-term for future use. Objects in these collections may undergo some degree of destructive analysis, but they are completely consumed only under rare circumstances.

Project collections are scientific collections that are managed and used by scientists who may have obtained the physical objects for specific research or another purpose, or by others working on that or related research (see original definition in IWGSC 2013a). Objects in project collections are sometimes completely consumed by destructive analyses or discarded at the end of the project.
Renewable collections consist of living organisms that can generate replicas of themselves, or man-made objects that can be fabricated. As a result, objects can be completely consumed by destructive analyses without exhausting the supply. Renewable collections are made available in response to user demand and can be disestablished if demand wanes.

Scientific collection is used herein as shorthand for "object-based scientific collection": a group of physical objects that are used for research, development, education, and other activities related to disciplines in physical, life, earth, and planetary sciences as well as archaeology, physical anthropology and applied sciences such as engineering, agriculture, and veterinary science. Scientific collections may include the maps and notes related directly to physical objects but would not include libraries, document archives, or data repositories not associated with physical objects.

Types of Federal Collections

IWGSC (2009) included a survey of Federal scientific collections that found 14 departments or agencies with approximately 300 scientific collections. They cover the full spectrum of scientific disciplines and the majority of these departments or agencies own and manage their collections long-term. IWGSC has since created the Registry of U.S. Federal Scientific Collections (USFSC),[5] which now contains data on many hundreds of scientific collections in 19 departments or agencies. Despite the great diversity of disciplines in which Federal collections are maintained, there are only a few types of collections from the perspective of their operating principles (see Box 1 for definitions). Non-Federal collections can generally be assigned to one of these categories and many of them are eligible for Federal support through NSF grants and other Federal funding opportunities.

Most, but not all, of the objects in Federal collections begin as objects in "project collections." While a project is in progress, access to the collection and information about its contents are generally controlled by the researchers working on them. They make decisions about consuming an object, in part or in whole, through destructive analyses, choosing not to preserve a part for future use. Eventually, one of several things occurs: the project can be completed with or without publication of the results; the project leaders can retire or pass away; or the department in which the project was conducted could be reorganized, relocated, or abolished. In each case, decisions need to be made about what to do with the collection. Many objects in project collections are accessioned into "institutional collections" (sometimes referred to as "archival collections"). IWGSC agencies have developed policies governing their institutional collections[6] and these are available on the IWGSC Clearinghouse. Ownership of and responsibility for these objects passes to the institutional collection and control of the objects normally passes from researchers to professional collection managers. Their mission is to preserve the contents of the collection and to make them available for uses that serve the broader agency mission and generate benefits for society. Most institutional collections will disseminate information about the contents of the collection and will facilitate their use by qualified researchers.
There may be restrictions on the use of certain objects in a collection based on national security, public safety, or the terms of collecting permits and/or material transfer agreements. Examples include informed consent given by human subjects at the time samples are collected, and biological samples collected and exported under international agreements such as the Convention on Biological Diversity, the Nagoya Protocol on Access and Benefit Sharing, the Food and Agriculture Organization's International Treaty on Plant Genetic Resources for Food and Agriculture, and the Convention on International Trade in Endangered Species.

Most institutional collections are "non-renewable." They include inanimate objects, preserved dead organisms, or parts thereof (e.g., frozen tissue or body fluid samples). These collections normally have procedures by which researchers can request permission to perform destructive subsampling and analysis of a sample or specimen. Since the mission of institutional collections is the preservation of objects for future study, collection managers require strong justifications based on the scientific importance of the proposed analysis. Permission is rarely granted to consume the last remaining portion of a sample or specimen. In contrast, "renewable" collections consist of organisms that reproduce or replenish themselves (e.g., cell cultures; viable microbial, plant and animal germplasm) or inanimate objects that can be replicated. Destructive sampling and analyses of these objects are common and are not issues of concern for renewable collections because replacements can be grown or manufactured. These are sometimes referred to as "research resources" though IWGSC considers them scientific collections.

Ownership and Stewardship of Federal Collections

In addition to directing agencies to establish adequate and sustainable operating budgets for collections, the America COMPETES Act and OSTP (2010) directed agencies to develop and disseminate policies on the management and use of Federal collections. OSTP (2014) specified the components that should be included in these collections policies and the IWGSC Clearinghouse now presents the policy documents that comply with these requirements. Taken together, the IWGSC (2009) recommendations, America COMPETES, OSTP (2010), and OSTP (2014) direct agencies to harmonize their collections policies, budgets, and practices with agency missions as expressed in authorizing legislation and regulations.

When Federal departments and agencies obtain and add non-renewable objects to project collections, the objects normally meet some immediate mission-related need. Project collections do not encumber agency budgets beyond the resources allocated for research projects. As described above, objects in project collections can be consumed through destructive analyses when researchers decide that immediate project goals outweigh concerns about potential future use. Within the limits placed on the use of objects by collecting permits, consents, and material transfer agreements, objects in project collections can be viewed as consumable research supplies. In contrast, when agencies accession objects into their institutional collections they accept long-term responsibility, and the related costs, for maintaining the objects for future use, as stipulated in their collections policies. Complete destructive sampling of an object is permitted only under very specific conditions.
Figure 1 illustrates the diverse pathways that objects can follow before they are accessioned into Federal institutional collections. Each of the pathways leading to a Federal institutional collection can be viewed as a long-term financial obligation by the receiving agency. However, not all of the objects being added to Federal institutional collections begin as project collections obtained and owned by that agency.

Some renewable collections provide users with living organisms for research and development of agricultural products and industrial processes (see Boxes 2 and 3). Users rely on these collections to discover and develop traits found in nature's diversity, so specimens are collected based on how they expand the collection's diversity. Accessioning them into an institutional collection represents a significant financial commitment because of the costs of maintaining living populations. Other renewable collections offer precisely standardized objects to users for use as calibration standards, or highly characterized and uniform organisms for controlled experiments. This second group of renewable collections goes through an initial period of planning, development, testing, and quality control before the standardized samples or specimens can be offered to users. Planning may involve market surveys and analyses or workshops to determine market demand and to determine the exact properties and characteristics that users want the samples or specimens to have. The development phase involves activities like chemical synthesis, genetic engineering, or selective breeding to produce samples or specimens that will satisfy user demands. Testing and quality control follow for confirmation. We consider renewable collections as project collections during this initial phase. We consider them institutional collections once testing has been completed and they have been made available to users.

FIGURE 1. Processes leading to accessioning of objects into non-renewable and renewable Federal institutional collections. (A) Typical pathway leading from collecting activities to project collections to institutional collections. This process is the same for Federal and non-Federal non-renewable collections and for non-renewable and renewable collections. (B) The collecting activities of some Federal research projects and other sources go directly into the institutional collections of that agency (e.g., the component of the Centers for Disease Control and Prevention [CDC] National Health and Nutrition Examination Surveys [NHANES][7] collections that is sent directly to the CDC Biorepository; the Department of Defense's Serum Repository,[8] and U.S. Department of Agriculture's [USDA] renewable collections). (C) Non-Federal collecting activities conducted on certain Federal lands (e.g., National Parks) require collecting permits, many of which specify that the objects collected remain Federal property. (D) Some non-Federal institutions with project collections may choose to donate material to a Federal institutional collection when the project ends. The Smithsonian Institution and US Geological Survey receive many collections in this manner. (E) Non-Federal entities sometimes end their operations or encounter financial obstacles that leave them unable to continue maintaining an institutional collection. Federal institutional collections sometimes acquire these "orphaned" collections.
(F) Agencies sometimes transfer objects in project collections to non-Federal institutions.[9]

Costs Related to Federal Collections

This study explores costs associated with institutional scientific collections and standard methods that can be used to estimate and document those costs. This approach is essential and the logical predicate to evaluating benefits generated by collections: the returns on investments made by taxpayers in those collections. The evaluation of costs takes into consideration differences among institutional, project, renewable, and non-renewable collections.

Project Collection Costs

Project collections have a wide range of cost categories. The costs in each category, measured on a per object basis, vary over many orders of magnitude among collections. This variation results from the fact that project collections involve collecting activities, preserving the objects and preparing them for analyses, conducting a diverse array of analyses, interpreting data, and preparing publications. The logistics for project collections are sometimes simple, involving one or a few researchers collecting in local areas with no requirements for collecting permits other than the agreement of landowners. Others are far more complex, involving inter-institutional and international agreements or treaties that govern collecting, transfer and use of research material. For example, several different authorities may have legal responsibility for issuing permits or consents to: collect threatened and endangered species; obtain samples from human subjects; or collect and transfer research material among countries or from territories beyond national boundaries (e.g., Antarctica, the open oceans, or outer space). Significant effort and expense may be needed to obtain permits or consents before the first collecting activity can begin or access to and use of samples can be granted. Once permission to collect has been granted, the cost of obtaining samples or specimens can be as high as that of NASA space missions, deep-sea drilling operations, or expeditions into remote regions like tropical forests, to name a few. Analyses can involve a wide range of instrumentation and techniques, including expensive infrastructure such as synchrotron beams, CT scanners, DNA sequencers, and mass spectrometers. For these reasons, project collections do not lend themselves to standard methodologies for estimating costs. There are simply too many types of costs and too many factors that affect these costs to produce cost estimates that can be compared meaningfully. For these reasons, cost estimation for project collections is not treated in this report.

Institutional Collection Costs

Unlike project collections, there are relatively few types of costs involved in the operation of institutional collections. The costs per object within each cost category will vary according to:

• the type of preservation needed for non-renewable collections (e.g., ultra-cold versus refrigerated; dry versus alcohol-preserved at room temperature), and processes for reproducing renewable collections;
• geographic location, which affects salaries (due to cost of living variation) and facilities and utility expenses (due to rental, construction and utility rates);
• size of the collections (affecting economies of scale); and
• the services provided by the collection.
The first three sources of variation are relatively straightforward and can be taken into account when comparing costs among institutional collections. The most important source of variation in per object operating costs is differences among collections in the services they provide. That is, the more different services a collection provides, and the more of each service it provides, the higher the cost per object. Cost comparisons among institutional collections will only make sense if differences in the services provided are considered. Like costs, the types of benefits generated by institutional collections vary with the number of services they provide (see below, Implications for Policies and Management).

Services Provided by Institutional Collections

This study identified six different services that Federal institutional collections may elect to perform, depending on the missions of their institutions and the resources available to them (see Table 1). NSF provides support for all of these services to non-Federal collections in the biological sciences. The exact nature of some services will differ between renewable and non-renewable collections, as portrayed in Table 1. For example, accessioning decisions in non-renewable collections are supply-driven: which of the available samples are needed and supportable by the collection with available resources? For renewable collections, management decisions are driven more by user demand: what type and how many items will the user community want added to the collection to improve its coverage? For some renewable collections of living organisms, both supply and demand are factors in decisions of what to accession. Managers of these collections consider (a) the potential future demand for a new accession and (b) how a new accession will increase the genetic variability and the geographic, habitat, and taxonomic coverage of the collection.

TABLE 1. Standard services that are commonly provided by non-renewable and renewable collections

1. Accessioning
   Non-renewable collections: Taking legal ownership, though not necessarily physical possession • Verifying provenance, ownership, import permits • Receiving • Physical integration into collection
   Renewable collections: Manufacturing, growing or breeding inventory to meet user demand and to replace living contents to ensure viability of specimens/samples that are provided to users • Transferring accessions from other collections • Collecting materials in nature

2. Preserving and maintaining (both types): Facilities and environmental controls • Security and inventory control • Object conservation

3. Documenting additions
   Non-renewable collections: Importing data and metadata (e.g., collector, collecting location and date) into a collection registry • Relabeling and transfer to standard containers • Detecting and correcting errors in data and metadata
   Renewable collections: Documenting collecting source and location • Documenting specimen characteristics • Data quality control
4. Providing access to users
   Non-renewable collections: Establishing and maintaining web-based catalog and applications • Digitizing specimen/sample records • Capturing digital images of collection contents • Creating and maintaining online databases of collection contents • Creating and managing loan and visitor programs • Communicating availability to potential users • Governing access by reviewing applications for access • Managing Material Transfer Agreements • Shipping and receiving
   Renewable collections: Creating and maintaining online catalogs • Maintaining and updating inventory of holdings • Reviewing and processing orders • Managing Material Transfer Agreements • Shipping and receiving

5. Data curation (both types): Adjusting data and metadata format and terminology to community standards • Documenting user access to collection • Updating data records with corrections, additional information • Linking collection records to publications and datasets in public repositories resulting from use

6. Increasing public understanding through education and outreach (both types): Public exhibits • Developing and disseminating informational material about collection contents, uses and impacts through formal and informal education, media, and other mechanisms

The first service, accessioning, is the core service that institutional collections perform. The accessioning function is integral to the definition of institutional collections. As one study participant said, "It's what we do." Accessioning is both a service and a critical decision-point for collection managers because adding objects to an institutional collection is a commitment of space and support that usually has no clear time limit. In a few cases, legislation and/or regulations require that certain objects must be accessioned into Federal scientific collections. In other cases, agencies must make decisions based on the potential costs and benefits of long-term maintenance (see below, Implications for Policies and Management).

Preservation and maintenance ensure the security of objects and their fitness for use by future users. This is the second most common service provided, though collections within an institution may receive different levels of this expensive service. Environmental controls on temperature, humidity, exposure to sunlight and other factors can be critical in preventing the deterioration that would render objects useless for research or other activities (Stauderman and Tompkins, 2016). Some collections require highly specialized preservation (e.g., cryo-preservation, alcohol immersion). For renewable collections of living specimens, maintaining and producing healthy and viable organisms are major costs. Beyond these two basic services, agencies vary widely in providing the other four services, depending on each agency's mission and the support available.

Documenting additions to collections involves the transfer of data and metadata (e.g., collecting location and date, collector's name) from project collections and researchers or from other institutional collections. These data and metadata may contain errors and ambiguities that persist unless they are corrected at this stage or at a later stage through data curation. Metadata from renewable project collections can include instructions for maintaining living organisms, as well as data on collectors and collecting localities.
Providing access to users can include loan and visitor programs as well as digital access through web portals with information, digital images, and even trait data from the objects in the collection. This is a key service for increasing the benefits generated by institutional collections and their extramural users. These services make communities of users aware of the objects in collections and the rules governing their access and use. NSF is supporting Advancing Digitization of Biological Collections,[10] a 10-year funding initiative for non-Federal collections, to raise the visibility, discoverability, and use of collections through digitization. Collection managers are responsible for promoting access and use, while also safeguarding objects to ensure their long-term availability and fitness for use. Information about and access to some collections are not made publicly available. These include collections of virulent pathogens, materials from crime investigations, and others that could pose a threat to public health and national security.

Data curation has developed rapidly in a few types of collections over the past decades, especially those with large-scale digitization initiatives. Taken together, digitization and data curation have moved these collections into the realm of "big data." Data curation activities usually include:

• Detecting and correcting errors;
• Developing and implementing community-driven data standards and ontologies;
• Developing automated translation of analog data to digital format;
• Automating data quality control;
• Developing and using standardized specimen/sample identifiers; and
• Linking digital collection records (using standardized identifiers) to publications citing samples/specimens and data derived from samples/specimens in public databases.

For example, DNA sequences in GenBank include references to the voucher specimens from which the sequences were derived; not all collections include references to related GenBank records. Some collections have active intramural research programs that document the characteristics of objects in the collection. Examples include the USDA's National Plant Germplasm System[11] (NPGS; see Box 2), the Agricultural Research Service's (ARS) Culture Collection[12] (Box 3), and the CDC's NHANES Biospecimen Program (see Box 4). Data curation would include linking public data to the digital records of objects in the collection. The costs and benefits related to extramural research activities that characterize objects in collections are discussed below, under Value Added by Users.

Some Federal collections are used to increase public understanding through education and outreach. Objects from the Smithsonian's institutional scientific collections are displayed in the exhibits of two public museums that attract more than 10 million visitors per year (National Museum of Natural History, 7+ million visitors per year; National Zoological Park, 3+ million), as well as being featured in magazine articles and television programs. NSF's support for scientific collections is awarded partly on the basis of their broader impact, including increasing public awareness of science.

Cost Categories and Accounting

The types of expenditures related to delivering these services fall into a small number of cost categories. Table 2 shows which cost categories are relevant to each of the services offered by collections.
TABLE 2. Cost categories associated with services provided by scientific collections. The nine cost categories are: Personnel, Training, and Staff Travel; Facility Space and Modification; Equipment Acquisition and Development; Utilities; Materials and Consumables; Shipping and Receiving; IT, Web and Communications Services; Maintenance and Security Contracts; and Contracts for Exhibit/Material Design and Fabrication. Each service draws on a subset of these categories: (1) Accessioning, six categories; (2) Preserving and maintaining, six; (3) Documenting additions, three; (4) Providing access to users, seven; (5) Data curation, three; and (6) Increasing public understanding through education and outreach, four.

These associations between cost categories and particular services do not vary significantly among types of collections, geographic location, or agencies, though the costs within categories will vary. Viewing overall operating costs in the context of the services provided enables collection managers to make evidence-based management decisions, especially when the benefits arising from particular services are considered (see Implications for Policies and Management, below). Baker et al. (2014) presented cost data in terms of a similar breakdown of services provided by a non-Federal collection.

The IWGSC (2009) report included the results of a survey with more than 150 responses from 14 agencies representing about 300 Federal collections. The responding agencies indicated that

• 27% of collections have a budget line-item devoted to maintenance and management; and
• 41% of collections have no funds specifically allocated for collection care and management.

Presentations and discussions during this study confirmed that the availability of dedicated funding for the operation of institutional collections varies widely among agencies and even among collections in the same agency. Relatively few institutional collections have distinct budgets that support the cost of providing the services described above. In these few cases, the operating costs of the institutional collection are well documented and provide a solid basis for evaluating costs relative to the benefits generated by the collection. Table 3, column A presents three examples. The first two examples involve transfers of funds to a collection specifically for support of operating costs. One collection is managed by a Federal agency through an Interagency Agreement and the other is managed by a non-Federal contractor (see Table 3A for examples). These collections created systems for documenting and recovering all operating costs (see Method for Reconstructing Collection Budgets, below). More commonly, a single agency budget will combine the operating costs of one or more institutional collections along with unrelated costs, such as: collecting activities; management of project collections; research activities including analyses of objects; and preparation of publications. Other agencies have considered collection-related budgets as too small to segregate so they have included them in much larger organizational units, sometimes unrelated to research. In these latter cases, special efforts are needed to identify the costs associated with the institutional collection (see examples, Table 3B).

Method for Reconstructing Collection Budgets

Agencies that do not have separate budgets for their collections or use contractors for collection management can document their operating costs in another way.
NSF-ICF uses the following method to develop the budget for its Interagency Agreement with NSF. The International Cooperative Administrative Support Services is a similar system used by the U.S. State Department to divide the cost of administrative services provided by the State Department (e.g., office space, IT support, utilities, motor pool) among the Federal agencies housed in each embassy. Table 2 shows the cost categories associated with each of six services that can be offered by an institutional collection. This framework allows collection managers to reconstruct their operating budgets using the following procedure:

• Itemize the full-time equivalents, square footage, number of computers and other measures of resource utilization in each cost category of the services they provide;
• Calculate personnel costs directly using the compensation of the staff members involved in each service;
• Use the budget of the parent organization to determine total support for space, utilities, IT and security services, and other cost categories used by the collection; and
• Prorate the portion of space, utilities, security and other services used by the collection, using the appropriate utilization measure (e.g., number of computers for IT services, square footage for facility space and security).

TABLE 3. Examples of institutional collections with and without dedicated budgets

A. Institutional Collections with Dedicated Budgets

• Two of the three NHANES institutional collections are stored and managed by non-CDC repositories under a contract with CDC. The budgets submitted by the contractors and paid by CDC document all operational costs other than personnel costs of CDC staff that oversee the contracts. These operating costs include storing and maintaining samples and distributing samples to users whose proposals have been approved after CDC staff review.
• The National Science Foundation Ice Core Facility (NSF-ICF)[13] belongs to NSF but it is housed at the Denver Federal Center, where it is managed by the U.S. Geological Survey (USGS) and supported by an NSF-USGS Interagency Agreement. The agreement's budget represents the operating budget of the collection, including personnel costs of USGS staff.

B. Institutional Collections without Dedicated Budgets

• The third NHANES institutional collection includes samples directly accessioned into the CDC Biorepository after being collected from health survey participants. Management of the NHANES collection (storage, maintenance, and shipment of samples to qualified users) and several other institutional collections is provided by the CDC Biorepository. It can be difficult to know the operating costs of any particular collection in a centralized facility.
• The Food and Drug Administration (FDA) of the Department of Health and Human Services (HHS) has collections of food-borne pathogens and other food safety collections[14] that serve the agency's regulatory mission. Since the services provided by the collections serve regulatory activities that have separate budget line-items (e.g., seafood safety, marketplace inspections, food and feed safety, outbreak response), support for collections is distributed among several budget lines.
• ARS has non-renewable institutional collections (e.g., preserved insects, fungi) used for agricultural research. They include objects relevant to several USDA commodity programs that contribute funds for research and collection operations.
Combining support for research and collections obscures the true operating costs of the collections.[15]

Once the services/cost categories have been established and their associated costs prorated from higher organizational totals, the operational costs of a collection can be summed across services and cost categories. Most institutional collections have stable operating budgets except for construction of new facilities, major equipment upgrades, or relocation to new facilities. Other cost variations are smaller, involving occasional changes in staff positions, space utilization, or the addition of new services.

Cost Recovery

This study considered user fees as a mechanism for reducing costs, not for generating benefits or returns on investments. Accordingly, we address the issue of cost recovery in this section on costs, rather than in the following section, which focuses on benefits. We found the full range of agency philosophies concerning cost recovery. At one end of the spectrum, many agencies do not charge users any access fees. There is general concern among these agencies that charging users for access will reduce user interest, especially among potential users with limited resources (e.g., students, small institutions). Based on one year of records from three Canadian biobanks, Albert, Bartlett, Johnston, Schacter and Watson (2014) argued that cost recovery is probably limited to 5–25% of total operating costs. USDA/ARS maintains germplasm of agriculturally important plants, animals, microbes, and insects. ARS has no statutory authority to charge user fees for access, with one exception.[16] The 1990 Farm Bill established the USDA National Genetic Resources Program (NGRP), which includes renewable national germplasm collections of agriculturally important plants, animals, microbes, and insects. The Bill stipulated that the NGRP's germplasm be made available to users free-of-charge. NSF-ICF is also proactive in promoting use and does not seek cost recovery through user fees.

Other agencies recover some or all operating costs. NSF's Living Stock Centers Program[17] does not require grantees to recover full operating costs; they are encouraged to become more sustainable through user fees. The National Cancer Institute (Odeh, Miranda, Rao, et al. 2015) and the University of British Columbia[18] have developed software tools that help biobanks determine cost recovery fees based on data from cost categories. The NHANES Biospecimen Program charges a set fee per sample to cover some operational and transactional costs: collecting, storing, and processing samples; preparation of data files; and personnel costs associated with the process of reviewing proposals for access to samples. User fees are not intended to recover full operating costs. At the other end of the spectrum, the National Institute of Standards and Technology (NIST) is required by statute (15 USC 275c) to recover costs related to its Standard Reference Materials (SRM) Program. NIST maintains clear documentation of the costs associated with developing, marketing, producing, and distributing SRMs. Estimating the number of units they expect to sell each year allows the program to set prices that will recover the required operational costs (Research Triangle Institute, 2000).
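To make the budget-reconstruction and cost-recovery arithmetic concrete, here is a minimal sketch in Python of the proration procedure described above, ending with the kind of break-even fee calculation that a full cost-recovery program such as NIST's SRM Program must perform. All names, dollar figures, and utilization measures are hypothetical illustrations, not data from the report.

```python
# Minimal sketch of the collection budget-reconstruction method described
# above. All numbers and names are hypothetical, for illustration only.

# Direct personnel costs: staff compensation attributed to collection services.
personnel = {
    "accessioning": 120_000,            # e.g., 1.5 FTE curators
    "preserving_maintaining": 160_000,  # e.g., 2.0 FTE technicians
    "providing_access": 80_000,         # e.g., 1.0 FTE loans/shipping
}

# Parent organization's total support cost and total capacity per category.
parent_totals = {
    "facility_space": (900_000, 60_000),  # ($/yr, total sq ft)
    "utilities": (300_000, 60_000),       # prorated by sq ft
    "it_services": (250_000, 500),        # ($/yr, total computers)
    "security": (150_000, 60_000),        # prorated by sq ft
}

# The collection's measured utilization of each shared resource.
collection_usage = {
    "facility_space": 4_500,  # sq ft occupied by the collection
    "utilities": 4_500,
    "it_services": 12,        # computers used by collection staff
    "security": 4_500,
}

def prorated(category: str) -> float:
    """Prorate a shared cost by the collection's share of the utilization measure."""
    total_cost, total_capacity = parent_totals[category]
    return total_cost * collection_usage[category] / total_capacity

indirect = {cat: prorated(cat) for cat in parent_totals}
operating_budget = sum(personnel.values()) + sum(indirect.values())

print(f"Direct personnel: ${sum(personnel.values()):,.0f}")
for cat, cost in indirect.items():
    print(f"Prorated {cat}: ${cost:,.0f}")
print(f"Reconstructed operating budget: ${operating_budget:,.0f}")

# Break-even user fee in the style of a statutory full cost-recovery
# program: the estimated annual units distributed determines the price
# that recovers operating costs.
expected_units_per_year = 1_800
break_even_fee = operating_budget / expected_units_per_year
print(f"Break-even fee per unit: ${break_even_fee:,.2f}")
```

The essential design choice, as in the NSF-ICF method, is that each shared cost is prorated by a utilization measure appropriate to its category (square footage for space and security, computer count for IT), so the reconstructed budget can then be summed across services and cost categories.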
Benefits Generated by Federal Collections

Many, perhaps all, IWGSC agencies have experience in justifying their requests for collection support by describing in various ways the benefits their collections generate for the agency and taxpayers. Their efforts to describe these benefits can use monetary terms (e.g., commercial revenue generated, cost reductions, productivity improvements). Rates of return on investment and benefit to cost ratios can be calculated in these cases, assuming that operating costs are also well known. Other benefits are specific but qualitative, and these are often described as impacts. IWGSC (2009) included a series of sidebar examples of these benefits, each identified with areas of impact such as public health (Horowitz et al., 2010; DiEuliis et al., 2016), environmental quality (Lawrey, 1993), or public safety and national security. Other previous attempts to place a value on scientific collections have used qualitative terms (Suarez and Tsutsui 2004) or the cost of creating the collections (Bradley et al., 2014). In reviewing methods used for evaluating benefits generated by scientific collections, the study's priority was finding evidence-based methods, regardless of whether the evidence was monetary, qualitative, or descriptive. The evidence used by methods reviewed during the study includes: survey responses from users; market value data; historical records of collection use; expenditures by users on objects from collections; and potential savings to society through emergency mitigation. The following sections describe five methodologies available to Federal agencies. Each one views the impacts of collections from slightly different perspectives and comes with particular assumptions. Each one has strengths and weaknesses; some are more time- and labor-intensive than others. Agencies may find that employing several approaches in combination is useful.

Technology/Knowledge Transfer

There is a substantial body of research into the monetary impact of federally funded research in science and technology. Many studies have attempted to trace products in the marketplace and the jobs and wealth they generate back to their origins in government grants and research labs (see review by Ammon, Salter, and Martin, 2001). These studies must address the delays inherent in the R&D process leading from discovery through application, proof of concept, patenting, product development and testing, licensing, manufacture and eventual marketing. They must also somehow apportion the eventual market value of new products among all the contributing links in this value chain. For example, new crop varieties have been developed from accessions obtained from plant germplasm collections (see Box 2) but the monetary value of the new crops was also due to crop breeding that may have involved crops already in production, improvements in farming technique, and other inputs (Rubenstein, Heisey, Shoemaker, Sullivan, and Frisvold, 2015). Partitioning the net value of the new crop among these inputs would require replicate treatments that isolate and control for each contributing factor (e.g., Evenson and Gollin, 1997; Güereña, Lehmann, Thies, Enders, Karanja and Neufeldt, 2015). Since the funds invested in the original research were unavailable for other uses during this delay, appropriate discount rates must be applied to adjust for the actual cost of the research (including opportunity costs) prior to calculating returns on investment.
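The report does not prescribe a particular formula, but the discounting it describes is conventionally written as follows, where B_t and C_t are the benefits and costs realized in year t, r is the annual discount rate, and T is the evaluation horizon (notation assumed here for illustration):

```latex
% Net present value and discounted benefit-cost ratio.
% B_t: benefits realized in year t; C_t: costs (including opportunity
% costs) incurred in year t; r: annual discount rate; T: horizon.
\[
  \mathrm{NPV} \;=\; \sum_{t=0}^{T} \frac{B_t - C_t}{(1+r)^{t}},
  \qquad
  \mathrm{BCR} \;=\;
  \frac{\sum_{t=0}^{T} B_t \,/\, (1+r)^{t}}
       {\sum_{t=0}^{T} C_t \,/\, (1+r)^{t}}
\]
```

Because collection-derived benefits typically arrive many years after accessioning (large t), discounting shrinks their present value substantially, which is one reason the value chains described above are difficult to credit back to collections.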
This same approach can be applied to scientific collections, which are often the basis of research leading to discoveries, product development, and new products and processes. The U.S. Department of Agriculture, like many other Federal agencies in this study, publishes annual reports of technology transfer[19] for ARS and for USDA as a whole. These reports present numbers of patents and licenses resulting from USDA research and the royalties received. Due to the challenges and assumptions associated with this type of economic analysis, these reports do not attempt to document the portion of the marketplace value that can be credited to USDA research or collections.

Several USDA collections are the starting points of industrial research and development. NPGS acquires, maintains, develops, conducts research on, and distributes plant varieties in support of breeding new crop varieties (see Box 2). Evenson and Gollin (1997) discuss the challenges associated with estimating returns on investment and benefit to cost ratios of plant germplasm collections. The ARS Culture Collection provides microbial samples that have been critical in generating new knowledge disseminated in academic publications and patents for new commercial products (see Box 3). Collections of living microbial strains (often referred to as "culture collections") are renewable collections that are used for basic research in microbiology, and for applied research in a variety of commercial areas (e.g., agribusiness, pharmaceuticals). Furman and Stern (2011) discuss the economic valuation of Biological Resource Centers such as the ARS Culture Collection.

Agencies involved in safeguarding public health face several challenges in demonstrating the benefits generated by the use of institutional collections. There is little doubt that they contribute to the prevention, detection, and cure of diseases, but the precise pathways between the use of collections and tangible, measurable benefits can be difficult to follow (see Box 4). Economic analyses of technology/knowledge transfer that include benefit to cost estimates involve considerable effort and expense. They are normally done by contractors who specialize in economic studies.

Box 2. National Plant Germplasm System

NPGS is a distributed network of 25 plant genebanks and support labs cooperatively operated by USDA/ARS, State Agricultural Experiment Stations, and Land-Grant Universities. The repositories receive, preserve, maintain, and distribute germplasm samples and associated information to support agricultural production by making germplasm available to users around the world, including researchers, plant breeders, growers, and other qualified users. NPGS is a renewable collection so preservation and maintenance involve propagating and testing plants to ensure their health and viability. NPGS genebanks collectively manage approximately 600,000 separate germplasm accessions. Each year, NPGS distributes 250,000–300,000 samples to users and charges no user fees. The Germplasm Resources Information Network[20] (GRIN) is USDA's searchable data system that manages the inventories of plant and animal accessions in USDA's genebanks, as well as information about the plant, animal and microbial accessions held by NPGS and other USDA germplasm collections. NPGS documents the origins of the accessions it receives and their physiological and other traits.
Some accessions arrive at NPGS with considerable data and metadata, while others have little and are studied and characterized through intramural research. These new data and metadata are added to GRIN. User requests for germplasm samples vary widely, including requests for highly characterized accessions that are already cultivated as crops. These may be used in cultivation or in plant breeding aimed at developing new and more productive cultivars. Other users request wild plant relatives of crops that are poorly characterized and have no history as crop plants. These may be used for basic research and to identify, isolate, and incorporate useful traits into new plant varieties.

The benefits generated by NPGS take many forms: increased agricultural output; reduced losses to pests and environmental stress; increased knowledge that may contribute to food security; and increased potential to respond to food insecurity. Some of these can be measured quantitatively (Bretting, 2018), but sources other than NPGS also played important roles in generating these measurable benefits.

Box 3. ARS Culture Collection

The ARS Culture Collection (also known as the Northern Regional Research Lab Collection [NRRL]) is one of the largest public collections of bacteria and fungi in the world. It is housed within the Mycotoxin Prevention and Applied Microbiology Research Unit at the National Center for Agricultural Utilization Research in Peoria, Illinois. NRRL's intramural research focuses on advancing agricultural production, food safety, public health, and economic development. Data about strains in the collection provided by depositors, users, and intramural staff are added to NRRL's public database, which improves the community's ability to find samples for further study. NRRL includes two collections.

• The "Open Collection" contains 90,000 microbial strains that are owned by USDA and are made available to academic and commercial researchers without charge. These isolates represent a broad sample of biological diversity collected over more than a century. On average, 4,000 microbial cultures are distributed by the NRRL each year at no cost to users. If obtained from private culture centers, users would be charged approximately $1 million for these samples. In 2018, NRRL isolates were provided to government, academic, and industry scientists across the U.S. and 42 other countries.

• The "Patent Collection" contains 7,600 microbial strains that have been deposited, typically in association with a patent application, under the NRRL's International Depositary Authority. The NRRL is one of only two International Depositary Authorities for bacteria and fungi in the United States. Deposition of isolates in NRRL fulfills the requirements of U.S. patent applications and those of all other countries that have signed the Budapest Treaty on the International Recognition of the Deposit of Microorganisms for the Purposes of Patent Procedures. These isolates are made available to the scientific community upon issuance of an associated patent or at the request of the depositor. NRRL distributes 400 isolates from the Patent Collection per year, on average. A portion of the operating costs of the Patent Collection is recovered in two ways: depositors are charged a one-time fee when their isolates are accessioned, and requesters are charged an access fee. These fees were authorized in the 1985 Farm Bill and are updated via U.S. Patent and Trademark Office communications to the World Intellectual Property Organization.

NRRL contributes directly and indirectly to technology development and business enterprises. Direct benefits can be traced through patents that cite NRRL samples. NRRL samples contribute indirectly to technology development through the new knowledge presented in research publications that cite them. NRRL isolates have been cited in more than 65,000 scientific publications, as well as 7,500 patents. A formal economic analysis of the monetary value of these direct and indirect contributions has not been attempted.

Success Stories

Several of the sidebar examples in IWGSC (2009) are "success stories"—case studies in which collections helped to solve a problem or prevent losses that represented millions to billions of dollars. For example, USDA's collections of agricultural pests help prevent catastrophic crop losses or trade wars over disagreements about imports with dangerous insects (see Box 8). Vaccines developed using samples in collections can curtail epidemics, save lives, and avoid productivity losses, as documented in "cost-of-illness" economic studies (Byford, Torgerson, and Raftery, 2000). Other examples are used routinely by agencies to highlight the potential value of collections that can provide high-impact solutions to applied problems.
However, the benefits generated by success stories are often in areas other than the one for which the objects were collected. That is, the benefits are not the products of the day-to-day work of the collection or even the mission of the agency that owns the collection. For example, IWGSC (2009) included a sidebar about geologic rock cores that were collected for oil and gas discovery but are also proving valuable for mapping and predicting earthquakes. These success stories can be effective for public relations and for raising awareness about collections, but they are not often viewed as compelling evidence of returns on investment or estimators of benefit-to-cost ratios.

Economists sometimes draw an analogy between success stories and winning lottery tickets. Both involve a small investment (for a lottery ticket or a few objects in a collection), a very large payoff, and very low probabilities of success. They also point out some important differences. Lotteries have a known delay between purchasing a ticket and when the winner is chosen, but the waiting time until the next collection-based success story is unknowable. In addition, the value of a winning lottery ticket is known when the winner is drawn, but there is no way to predict the value of a solution that might be found in a collection. Finally, a lottery has at least one guaranteed winner each time the winning number is picked, but there is no guarantee that the solution to a critical problem is waiting somewhere in a collection. The random, unpredictable nature of success stories may limit their value in demonstrating returns on investment.
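The arithmetic behind the analogy can be made explicit. In the hypothetical Python sketch below, every input needed to compute a lottery ticket's expected value is known; the comments note why no corresponding computation is possible for collection-based success stories. The lottery parameters are invented:

# A minimal sketch of the economists' lottery analogy. The point is that
# every input needed for an expected value is known for a lottery and
# unknown for a collection-based success story.

ticket_price = 2.00
jackpot = 100_000_000
win_probability = 1 / 300_000_000  # probability a single ticket wins

lottery_ev = win_probability * jackpot - ticket_price
print(f"Lottery expected value per ticket: ${lottery_ev:.2f}")

# For a collection-based "success story," the corresponding inputs are
# not knowable in advance:
#   payoff      -> unknown until a solution is actually found
#   probability -> no defined drawing, so no odds to cite
#   timing      -> no fixed date on which a "winner" is chosen
# With every term undefined, no expected value (and hence no return on
# investment) can be computed from success stories alone.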
The following example of success stories, and the subsequent example of option values, may suggest ways that agencies can highlight benefits that are valuable but unpredictable. The National Park Service (NPS) protects and preserves an extraordinary range of habitats and life forms, from high alpine mammals to marine microorganisms. A well-known success story involves a bacterial species, Thermus aquaticus, collected and described from a hot spring in Yellowstone National Park. It was found to produce an enzyme that catalyzed the growth of the polymerase chain reaction (PCR) into a global biotechnology enterprise worth billions of dollars annually. Box 5 describes this success story.

Box 4. NHANES Value Chain

The National Health and Nutrition Examination Survey (NHANES) is a unique public health program that obtains blood and urine samples at the same time physical exam information and responses to standardized interviews are collected. These samples and data have been collected continuously since 1999 from a statistical sample of the U.S. population. The NHANES Biospecimen Program was developed to address future medical, environmental, and public health issues challenging the Nation by maintaining a collection of serum, plasma, urine, and DNA specimens. Data produced from research using NHANES biospecimens are added to the NHANES database and made available to the public on the NHANES website.

The NHANES Biospecimen Program makes samples available to any qualified researcher, though most users are other CDC Centers, Federal agencies, and academic institutions. Researchers use NHANES samples and data to establish the distributions of values for new health markers and exposure to environmental toxins in a statistically significant sample. These data are released through the NHANES website, where they can be accessed for translational research that benefits public health and society. The $40 million in annual Federal support for NHANES samples and data has been critical in generating these benefits, but the same can be said for other sources of support for new health markers. Calculating the monetary and societal benefits and assigning them to the different sources of support would take considerable time and effort and has not been attempted by CDC.

Option Value

Bishop (1982) and Fisher and Hanemann (1986) described "option value" as the benefit of having something available in case it is needed in the future, even though the probability of future use and its future value are unknown. This concept of options is used in the sale and purchase of stocks, wines, and works of literature that might become the basis of commercial films (to name a few examples). Option value can be thought of as an insurance policy that is only worth something in the occurrence of an unforeseen event. Option values are therefore related to the values claimed in success stories, described above, with one important difference. As used here, option values are directly related to agency missions and the everyday uses of collections.

The missions of many Federal collections include preventing or mitigating threats to public health and safety, such as risks to the nation's food supply. For example, NPGS is a resource for developing new and better crops (see Box 2). FDA's Foodborne Pathogen Collections are critical for ensuring the safety of food in the marketplace (see Box 6).
The U.S. has witnessed many such threats to our food security: declines in agricultural output of specific crop varieties, foodborne disease outbreaks, and crop failures due to the introduction of insects, mold, fungi, and other agricultural pests. The same is true for epidemics, airline crashes due to bird strikes, earthquakes, and other threats to public safety described in IWGSC (2009). The economic impacts of many such events have been estimated. Agencies could compile focused knowledge bases of past events that illustrate the scope of potential losses that face the country. These past losses illustrate the option value of maintaining collections that could prevent or mitigate similar losses in the future.
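The insurance framing can be illustrated with a minimal expected-value comparison. The probabilities and dollar amounts in this Python sketch are invented; in practice they would be grounded in the kind of historical event records just described:

# A minimal sketch of option value as insurance, with invented numbers:
# a collection with a known annual maintenance cost is compared with the
# expected annual loss it could avert.

annual_maintenance = 2_000_000       # known, budgeted cost of the collection
event_probability = 0.01             # assumed chance per year of a crop-failure-scale event
averted_loss_if_event = 500_000_000  # assumed loss the collection would prevent

expected_averted_loss = event_probability * averted_loss_if_event
print(f"Expected annual averted loss: ${expected_averted_loss:,.0f}")
print(f"Annual maintenance cost:      ${annual_maintenance:,.0f}")

# Like an insurance premium, the maintenance cost is justified if the
# expected averted loss exceeds it ($5M > $2M here), even though in most
# years the "policy" never pays out.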
Box 5. Yellowstone National Park Provides Basis for Biotechnology Breakthrough

The development of DNA sequencing was hampered by the minute quantities of DNA found in biological tissues. Cetus Corp. developed the polymerase chain reaction (PCR), which could produce billions of DNA copies by separating double strands through heating and assembling new complementary copies through cooling. This thermal cycling required an enzyme that could assemble complementary DNA strands while functioning at high temperatures. Kary Mullis, a Cetus researcher, was awarded the 1993 Nobel Prize in Chemistry for inventing the PCR method.

Independently, microbiologists had been exploring the microbes that live in thermal hot springs. Thomas Brock, a microbiologist at Indiana University, discovered the bacterium Thermus aquaticus in Yellowstone National Park, naming it formally in 1969 (Brock and Freeze, 1969). Brock deposited representative cultures of T. aquaticus in the American Type Culture Collection (ATCC; Brock, 1997), at that time a non-Federal collection in Rockville, MD, with significant NSF and NIH support. Taq polymerase is the synthetic enzyme developed from Brock's Yellowstone cultures in ATCC. Its heat stability and efficiency enabled the success of PCR as a research tool and business enterprise. Thermal cyclers (generally known as "PCR machines") appeared on the market in the mid-1980s [21]. They were soon widespread in genetics labs; a recent report put PCR-related sales in 2017 at $7.41 billion [22].

Value Added by Users

Success stories and option values are based on rare and unpredictable uses of collections. In contrast, this approach to documenting benefits is based on normal, everyday activities in collections. "Providing access to users" is one of the services listed in Table 1. Since this service increases the number of researchers who use a collection, it can be viewed as the service with the greatest impact on the potential benefits generated by a collection. Collections often explain their value in terms of growth (accessions per year) or activity (numbers of visitors who use the collections, or numbers of specimens/samples distributed to users), but these are input measures, not indicators of outputs or benefits. Some collections try to report the publications and/or datasets generated by users, though collecting this information from users is often difficult. Published articles and datasets are important outputs, but prior to their use by others it is difficult to assess their beneficial value beyond the professional standing of the authors.

Collections will often highlight important discoveries made by users of the collection, similar in some ways to the success stories described above. The Core Research Center (CRC) of the U.S. Geological Survey [26] has been proactive in promoting community use of the collection and increasing the discoverability of objects in the collection through data curation (see Box 7). Their policies and practices are comparable to those of companies that re-invest profits in their R&D efforts in order to increase future productivity. CRC's "virtuous cycle" is based on promoting use of the collection, capturing and incorporating the outputs from this use into the collection, and thereby increasing future use.

Box 6. FDA Foodborne Bacteria Collections

The Food and Drug Administration maintains about 10 institutional collections. Among them are some of the world's largest and most diverse collections of pathogens associated with human and veterinary illnesses found in the food and feed supply. They are housed in several facilities of FDA's Center for Food Safety and Applied Nutrition (CFSAN) in the Washington, DC and Chicago areas. The largest of these contains approximately 40,000 strains of the bacterial genus Salmonella [23]. A second collection of foodborne bacteria focuses on environmental pathogens found in seafood. It includes 5,000 well-characterized strains of the genus Vibrio housed in FDA's Gulf Coast Seafood Laboratory [24] on Dauphin Island, AL. The CFSAN strains whose whole genomes have been submitted to Genome Trakr [25] are maintained as vouchers to characterize them and to ensure the reproducibility of sequencing results.

FDA's intramural research relies on these collections to develop and improve methods for detecting pathogenic bacteria and discriminating among strains, ranging from the species level down to the agents responsible for specific disease outbreaks. The collections provide strains that are assembled into test sets that are distributed for a variety of applications, such as: proficiency testing of labs (including those contributing whole genome DNA sequences to Genome Trakr); providing positive and negative controls on cross-contamination of DNA sequencing runs; validating the sensitivity, specificity, and reliability of new methods for detecting and identifying foodborne pathogens; and rapid field testing for the presence of specific pathogens. CFSAN has genetically engineered some of these control strains to make them fluorescent and easily detectable. Strains in the collection are also subjected to high levels of sanitary treatments to detect the emergence of highly resistant strains.

FDA responds to 100–150 qualified extramural users per year by providing 500–750 isolates to academic researchers (approximately 50% of users), industrial labs (30%), and State and other Federal agencies (20%). The FDA has not attempted to trace the monetary or other impacts of the use of its collections. These impacts are generated through new and improved capabilities of public health agencies, hospitals, universities, and private companies that prevent and respond to disease outbreaks caused by foodborne pathogens. However, the use of FDA collections by these sectors is known and can be considered in the context of the foodborne disease outbreaks each year and their cost to the Nation.
CDC's NHANES Biospecimen Program also collects the results of laboratory testing done by intra- and extramural users and connects these data to the NHANES database. A list of publications resulting from studies using NHANES institutional collection samples is provided on the NHANES Biospecimen website. When the results of analyses and sample preparations done by users are integrated back into a collection and its public databases through data curation, future users are more likely to find the samples, specimens, and data they need. The collection's value to those users has increased. Accordingly, this study views the expenses borne by users for analytical procedures and sample preparations as "co-investments" in the collection and a form of return on investment.

Patterns of user demand are the basis for understanding and documenting co-investments in collections. For example, user demand for objects in a collection might be high soon after they are added to the collection but decline thereafter. Alternatively, user demand might be unrelated to the length of time since objects were accessioned. To determine the degree to which collections retain the user interest that drives co-investment, the study obtained historical data on user demand over time from several Federal collections. USGS's CRC has accepted donated cores from 1974 to the present, with most coming from around the 1980s. The core samples requested from 2016 to 2018 showed this same age distribution (see Figure 2). The Smithsonian's National Museum of Natural History (NMNH) has more than 125 million specimens and samples of plants, animals, fossils, rocks, minerals, and human artifacts. NMNH, which began computerizing its loan records in the 1960s, provided information on more than 120,000 loan requests from nine scientific departments. Figure 3 shows loan data from the Invertebrate Zoology Department, which had the most complete data.

Box 7. USGS Core Research Center (CRC)

CRC is an institutional collection [27] that contains approximately 10,000 rock cores and more than 50,000 borehole cuttings, 95% of which were donated by private companies whose intramural use of the cores did not justify the costs of storage and maintenance. USGS evaluates the quality and rarity of donations before accepting them into the Denver, CO-based repository. The collection includes cores from 35 States and cuttings from 27 States, with coverage concentrated in the Rocky Mountain region. CRC receives more than a thousand research visits per year and provided users with 10,000 samples from 2016–18.

USGS does not charge academic researchers any user fees when they request and receive samples, thereby minimizing barriers to use. However, all users must provide the following within a prescribed period after receiving samples from CRC; non-compliance can result in loss of future access to the collection. Users are required to provide:

• All data derived from analyses of CRC samples; and
• A duplicate of thin sections made from CRC core samples.

Thin sections that users provide become part of the collection and are available to future users. Photos of thin sections and cores and all associated data records are hosted on the CRC website and are downloadable from the collection's catalog. Finally, CRC estimates the dollar value of the analyses and thin sections reported by users, based on market values.
Figure 2. CRC sample requests from 2016 to 2018. The horizontal axis represents the accession years of cores.

Figure 3. Loan data for natural history museum specimens. (A) Distribution of requested samples by the year they were accessioned into the collection. (B) Distribution of requested samples showing the year of request versus the year the requested sample was collected.

Figures 2 and 3 clearly show that users request access to samples and specimens in proportion to their representation in the collection. There is no evidence that their value to users diminishes over time. This suggests that the value added through co-investment by users over time could be substantial (a sketch of this comparison follows the lists below).

Some agencies with collections are proactive in documenting their collections and promoting their extramural use. In order to document patterns of use and co-investment by users, agencies will need to have systems for capturing data on:

• Research visitors and requests for loans and sample distributions, and
• Publications and datasets released on public data repositories.

To incorporate the value generated by users into the collection (making them co-investments that add value), the collection must also provide data curation services. This would involve:

• Receiving and curating sample preparations done by users and returned to the collection;
• Adding metadata about the returned sample preparations to the digital records of the original sample or specimen;
• Capturing the metadata, digital identifiers, and web addresses of publications and datasets released on public data repositories; and
• Incorporating these metadata, identifiers, and addresses into the digital records of the original sample or specimen.
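For illustration, the comparison in Figures 2 and 3 can be approximated with a few lines of Python. The accession years and request records below are invented; a real analysis would draw on the loan and distribution systems just described:

# A hypothetical sketch of the demand-pattern analysis behind Figures 2
# and 3: compare the accession-year distribution of requested objects
# with the accession-year distribution of the whole collection.

from collections import Counter

# accession year of every object in the (toy) collection
holdings = [1975, 1981, 1982, 1983, 1985, 1985, 1992, 2001, 2010, 2015]

# accession year of each object requested by users in a review period
requests = [1981, 1982, 1985, 1985, 1983, 2010, 1975, 1992]

holdings_by_year = Counter(holdings)
requests_by_year = Counter(requests)
n_hold, n_req = len(holdings), len(requests)

print(f"{'Accession year':>14} {'% of holdings':>14} {'% of requests':>14}")
for year in sorted(holdings_by_year):
    h = 100 * holdings_by_year[year] / n_hold
    r = 100 * requests_by_year[year] / n_req
    print(f"{year:>14} {h:>13.1f}% {r:>13.1f}%")

# If the two distributions track each other (as the CRC and NMNH data
# suggest), older accessions are requested in proportion to their
# numbers, i.e., their value to users does not decay with age.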
Counter-Factual Scenarios

Counter-factual scenarios are well-established devices for economic analyses. They shift the frame of reference normally used to evaluate the value of scientific collections. Rather than exploring the value of collections to users, counter-factual scenarios explore the costs to users of not having access to the collections. Two examples are presented here.

In a study of USDA's Animal and Plant Health Inspection Service (APHIS; see Box 8 and Lichtenberg, Olson, and Lawley, 2009), the agency was the principal user of the collection. The absence of the collection would have had a direct and clear effect on the agency's ability to fulfill one of its core mission responsibilities. This allowed the study to use programmatic data to estimate the financial impact of not having access to the collections. Companies were the focus of a study of the benefits generated by NIST's SRM Program (see Box 9 and Martin, Gallaher, and O'Connor, 2000). The study relied on user surveys and interviews to estimate the financial impact on the companies if they did not have SRMs. NIST does not consider the SRM Program a scientific collection, though it has many of the characteristics of renewable collections as described in Box 1, and it provides some of the services described in Table 1. NIST's use of counter-factual scenarios may therefore be instructive for Federal scientific collections.

Economic analyses that employ counter-factual scenarios are data-rich but labor-intensive. These studies are normally done by contractors rather than agency staff to avoid any appearance of conflicts of interest and to increase credibility. The APHIS study (Box 8) relied on programmatic data because the absence of collections (the counter-factual premise) would result in clear consequences (rejection of certain imports at ports of entry). The NIST study (Box 9) relied on structured interviews of users because there were many possible consequences to users if SRMs did not exist. The Paperwork Reduction Act limits the burden placed on non-Federal survey respondents by restricting Federal surveys to nine requests for information. OMB can issue waivers based on formal requests, and some agencies have been granted waivers for cause. Complying with the Government Performance and Results Act by conducting economic analyses is one possible basis for requesting OMB waivers.

Comparison among Methods

Federal scientific collections have many stakeholders: agency researchers and administrators; Congress; OSTP and OMB; the non-Federal research community; and U.S. taxpayers. These stakeholders have different reasons for wanting to know if taxpayers are getting a good return on investments in Federal collections. Different stakeholders have different views of what constitutes "value," "benefits," and "returns on investment." They may be looking for cost savings, or ways of increasing cost-effectiveness, or seeking ways to make management and policies more evidence-based. No single evaluation method can give clear and simple benefit-to-cost ratios that will address these questions. The methods described here will shed new light on these matters in different ways, and the table of strengths and weaknesses below (Table 4) may help agencies select appropriate methods.

Box 8. Counter-Factual Scenario for USDA's Animal and Plant Health Inspection Service

Customs and Border Protection (CBP) staff inspect incoming agricultural shipments at U.S. ports of entry for potential agricultural pests. When they encounter evidence of insects, fungi, or other potential pests, they collect, preserve, and send them to USDA/APHIS area identifiers, located at larger ports and plant inspection stations around the United States. APHIS area identifiers provide the first layer of authoritative identification for commonly intercepted and unambiguously identifiable pests. When the area identifiers cannot identify them with certainty, they send samples to APHIS National Identification Services (NIS) for identification. NIS has a staff of National Taxonomists, responsible for final, authoritative identification of intercepted pests and pathogens using morphological and molecular techniques. In addition, NIS contracts with ARS staff, and both use collections in ARS Systematic Laboratories and the Smithsonian Institution as definitive resources in making identifications. The costs of on-site CBP inspections, ARS contracts, and specific other APHIS activities (e.g., quarantine of imported live plants) are recovered through fees paid by importers. The long-term costs of the USDA collections that are housed near experts at ARS facilities [28] and in the Smithsonian's National Museum of Natural History [29] are paid by taxpayers.
At the end of this process, all shipments can be classified as:

a) No potential pests found: safe for entry;
b) Potential pest found and identified as benign: safe for entry;
c) Potential pest found and identified as harmful but treatable: safe for entry after treatment;
d) Potential pest found and identified as harmful and untreatable: not safe for entry; or
e) Potential pest found and taxonomic uncertainty prevents definitive identification: not safe for entry.

Lichtenberg, Olson, and Lawley (2009) developed a counter-factual scenario based on the premise that there are no reference collections or identification guides based on them. Any shipment in which potential pests were found (cases b to e, above) would not be considered safe for entry. The absence of collections would therefore result in the rejection of all shipments in categories b and c. APHIS inspection and USDA shipment records showed that $180 million in imports fell into these categories over one year (mid-2006 to mid-2007). All inspection-related APHIS and ARS research and collection costs totaled $27 million during this period, resulting in a benefit-to-cost ratio of 4.87.

TABLE 4. Principal advantages and disadvantages of five methods for documenting benefits from scientific collections

Technology/Knowledge Transfer
Principal advantages: based on tangible outcomes, often monetary; usually connected to normal collections-based work; can be expressed in quantitative terms (e.g., benefit-cost ratios).
Principal disadvantages: difficult to connect use of a collection to the ultimate outcome (delays, other contributors to the process); sometimes serendipitous.

Success Stories
Principal advantages: can be dramatic, high value; easily understood.
Principal disadvantages: based on rare events that cannot be predicted; can be serendipitous and unrelated to normal collections-based work.

Option Value
Principal advantages: can be dramatic, high value; connects to historical events, easily understood.
Principal disadvantages: based on the probability of future use, not past performance.

Value Added by Users
Principal advantages: based on normal collection activities; highlights patterns of collection use; can be expressed in quantitative terms (e.g., rates of return).
Principal disadvantages: requires cooperation of users; requires data curation; uses a narrow definition of "value" (i.e., value to users, not others).

Counter-Factual Scenarios
Principal advantages: highlights the unique role of collections; based on customer feedback and/or performance data; can be expressed in quantitative terms (e.g., rates of return).
Principal disadvantages: customer surveys can be expensive and labor-intensive; limitations on Federal surveys (Paperwork Reduction Act); distrust of survey results.

Box 9. Counter-Factual Scenario for NIST's Standard Reference Materials Program

NIST produces and sells more than 1,300 types of highly characterized and standardized SRMs to industry, academia, and government—including companies that develop, manufacture, and/or use analytical instruments that are critical to assuring quality, verifying the accuracy of measurements, and ensuring compliance with Federal regulations. The SRM Program shares similarities with renewable collections as defined in this report. The approach taken by NIST using economic impact analyses for the SRM Program highlights an opportunity for institutional collections.

An economic analysis was commissioned by NIST to study the value of SRMs developed to measure the sulfur content of fossil fuels (Research Triangle Institute, 2000). The study interviewed representatives of nine companies from industries (e.g., coal processing, oil refining, steel production) that purchased the sulfur SRMs, asking them to report how they would have met regulatory standards if the NIST SRMs were not available. The survey asked for yearly estimates (starting with the year the company began purchasing SRMs) of: the costs and delays of finding an alternative; lost productivity and business; increased transaction costs; regulatory penalties; and other losses they would have suffered. Their estimated costs from 1988–2003, adjusted for inflation and discounted because of delays, amounted to $412 million. The adjusted cost of operating the program (producing and marketing the SRMs, shipping, billing, overhead, and other administrative expenses) was $3.7 million. This represented a benefit-to-cost ratio of 112.
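The final arithmetic of a study like the one in Box 9 can be sketched as follows. The yearly figures, inflation rate, and discount rate in this Python fragment are invented, and the actual Research Triangle Institute adjustments are not specified in this report; only the general mechanics (restating past amounts in base-year dollars, adjusting for delay, and dividing adjusted benefits by adjusted costs) are shown:

# A simplified, hypothetical sketch of a counter-factual benefit-to-cost
# computation. Every number below is invented for illustration.

yearly_avoided_costs = {   # respondents' estimated losses had SRMs not existed
    1998: 20_000_000,
    1999: 24_000_000,
    2000: 26_000_000,
    2001: 30_000_000,
}
yearly_program_costs = {1998: 250_000, 1999: 260_000, 2000: 270_000, 2001: 280_000}

base_year = 2001
inflation = 0.02       # assumed flat annual inflation
discount_rate = 0.07   # assumed annual discount rate

def adjust(amount: float, year: int) -> float:
    """Express a past amount in base-year dollars, then apply a discount
    factor for the delay between expenditure and realized benefit."""
    years = base_year - year
    real = amount * (1 + inflation) ** years    # inflation adjustment
    return real / (1 + discount_rate) ** years  # discounting for delay

benefits = sum(adjust(v, y) for y, v in yearly_avoided_costs.items())
costs = sum(adjust(v, y) for y, v in yearly_program_costs.items())
print(f"Adjusted benefits: ${benefits:,.0f}")
print(f"Adjusted costs:    ${costs:,.0f}")
print(f"Benefit-to-cost ratio: {benefits / costs:.0f}")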
Implications for Policies and Management

The methods described here for documenting and estimating costs and benefits can equip agencies to make two evidence-based decisions:

• What kinds of objects should they accession each year, and how many of them?
• Which services should they provide, and how much of them?

Some agencies operate under authorizing legislation (including the organic acts that created them) in which requirements to retain Federal ownership of certain objects are specified or implied. This may limit an agency's ability to make evidence-based decisions on what kinds of objects, and how many, they accession. The following discussion explores the tensions and trade-offs caused by unfunded mandates, and how these mandates may come into conflict with the value of understanding operating costs as called for in America COMPETES.

In containing costs and generating benefits, the managers of Federal institutional collections and their agency leadership face three structural constraints on policies and management:

A. All agencies with collections face decisions on the intake and the removal of objects from their collections in order to maximize potential use and future benefits (within budget constraints), while minimizing the risk of not having important objects when the need for them arises. Agencies lose the ability to make informed decisions when legislation or policies mandate that an agency must obtain and keep certain objects;

B. Attracting resources (facilities, staff, and funding for operating costs) enables collections to generate benefits that advance the agency's mission, but budget processes and environments are often zero-sum or declining. Unfunded mandates to obtain and keep certain objects limit an agency's ability to make informed decisions about resource allocation; and

C. Expanding the range of collection-related services provided by a collection can generate benefits that advance the agency's mission, but offering too many services (some of which may be mandated by law) may limit the quality and impact of each service provided.

Constraint A

Decisions about which objects to obtain, retain, and discard are difficult because future use and impact are unknown. Knowledge of future operational costs per object and agency priorities for areas of future benefits provides valuable guidance for these decisions. When agencies are required by legislation to maintain and preserve whole categories of objects, they lose control over future costs and the ability to pursue particular benefits.
NIH has developed policies concerning the management of its project collections; USGS has done the same for its working collections. These policies [30] have accompanying implementation guides [31] that include criteria and decision trees for deciding which objects should be transferred from NIH's project collections to institutional collections, or from the Department of the Interior's (DOI) working collections to museum property. Factors such as the uniqueness of the objects, their relevance to the agency mission, the cost of long-term maintenance, and the degree to which additions complement the rest of a collection are reasonable criteria. These policies serve as examples of the informed decisions that agencies can make when not constrained by legislative and regulatory mandates. The missions of DOI and USDA include the management of different categories of Federal land (e.g., National Parks and Forests). Several Federal laws mandate that certain types of objects (e.g., archaeological artifacts, vertebrate fossils) collected from designated Federal lands must be maintained and conserved [32].

Constraint B

Federal appropriations are the primary source of support for the services provided by Federal collections, so funding levels limit the amount of services the collections can provide. The growth, maintenance, and preservation of collections generate most of their costs, and without adequate support these basic services can limit the ability to offer other services that generate tangible benefits, especially user access, data curation, and education and outreach.

Legislative, regulatory, and policy mandates have significantly increased the number of objects that agencies must accession, maintain, and preserve, thereby limiting the resources available to support objects that serve other parts of an agency's mission. For example, DOI's collections contain tens of millions of scientific objects, many of them added to DOI collections in compliance with legal and policy mandates. Financial and management arrangements can be difficult when ownership and stewardship are assigned to different institutions. Many DOI collections are housed and managed by non-Federal repositories, often those of the researchers who made the collections. DOI retains ownership of these collections, but non-Federal institutions are their stewards. DOI's appropriations have not been adequate for supporting collection services [33]. In addition, these non-Federal institutions are not eligible to receive NSF funding for projects that would improve collections, or portions of collections, that are owned by the Federal Government.

Constraint C

Beyond the basic services of accessioning, documenting, maintaining, and preserving, collections face important and difficult decisions about which services to offer. If a collection or agency tries to offer too many other services, it may not be able to devote the resources needed to generate benefits, and it is at risk of being accused of "mission creep." Even basic services such as preservation may suffer. Once policies about accessioning and services are set, it is sometimes necessary to reduce collection growth and services if increases in funding and other resources do not keep pace. Such reductions are likely to reduce the benefits generated, so they are very difficult to explain to collection stakeholders.
Decisions about which services a collection should offer are at the center of the relationship between costs and benefits. That is, the services provided by a collection drive its operational costs, the types and amounts of benefits the collection can be expected to generate, and the appropriate methods used to document and estimate those benefits. Collections that perform only the basic service—accessioning material—have lower costs that are easier to document, but benefits will be more difficult to generate if no other services are provided. Beyond the basic service of accessioning, each additional service increases costs, but it may also increase the types and amounts of benefits a collection can generate:

• Preserving and maintaining a collection extends the time that objects will survive and can produce reliable analytical results;
• Documenting additions to a collection will facilitate use by intramural researchers;
• Providing access to users will expand the user base to the extramural research community, including other countries and scientific disciplines;
• Data curation will add value to the collection and permit the collection to document the benefits of this co-investment; and
• Increasing public understanding through collection-based education and outreach can create societal benefits beyond research and development.

Recommendations

This report was based on examples and experiences obtained from Federal scientific collections, but the following recommendations may be applicable to scientific collections in general.

A. The framework of services, costs, and benefits described here provides collections, and the organizations that own collections, with an approach for greater evidence-based policy formulation and management decision-making. Federal institutional collections should consider testing and adopting these as tools for improving operations, as well as for documenting and explaining the value of their collections to taxpayers.

B. IWGSC member agencies should consider testing one or more of the methods presented here for documenting the benefits generated by collections. Several would require new data collecting efforts and added expense. CDC NHANES (Box 4) and USGS/CRC (Box 7) are collecting and providing access to value-added data provided by users; USDA/APHIS (Box 8) is using agency data for analysis in a counter-factual scenario.

C. In choosing among the methods presented here for documenting benefits, officials should consider their mission and the types of benefits their collections can generate. For example:

i. Collections that contribute more directly to economic development might favor technology/knowledge transfer and/or counter-factual scenarios. The former can identify the collection uses related to successful innovations or outcomes, and the latter can document the costs of not having the collections;

ii. Collections that contribute to societal benefits by preparing for environmental shocks (e.g., disease outbreaks, major crop failures) might find success stories and option value more useful. Collection managers can use the former method to highlight recent events in which their collections came into use, while the latter method can describe the costs of similar shocks in the past; and

iii. Collections that primarily contribute new knowledge in the form of public data and academic publications might prefer value added by users and counter-factual scenarios.
As described above, collections can gather data on patterns of collection use and co-investment by users, and can ask selected users (through surveys) what they would have done if they did not have access to the collection.

D. Groups such as the IWGSC, the International Society for Biological and Environmental Repositories (ISBER), the Society for the Preservation of Natural History Collections (SPNHC), and others should continue in their roles as forums for information exchange and the sharing of best practices as they apply the methods described here. As collections and the organizations that own collections begin to generate reports and other documents concerning costs and benefits, the IWGSC Clearinghouse [34] can continue to serve as a useful platform for information exchange.

Appendix 1. Abbreviations and Acronyms

ARS Agricultural Research Service (part of USDA)
APHIS Animal and Plant Health Inspection Service (part of USDA)
ATCC American Type Culture Collection
BLM Bureau of Land Management (part of DOI)
CBP Customs and Border Protection (part of DHS)
CDC Centers for Disease Control and Prevention (part of HHS)
CFSAN Center for Food Safety and Applied Nutrition (part of FDA)
CRC Core Research Center (part of USGS)
DHS Department of Homeland Security
DOC Department of Commerce
DOD Department of Defense
DOE Department of Energy
DOI Department of the Interior
DOJ Department of Justice
DOS Department of State
DOT Department of Transportation
EPA Environmental Protection Agency
ERS Economic Research Service (part of USDA)
FBI Federal Bureau of Investigation (part of DOJ)
FDA Food and Drug Administration (part of HHS)
FS Forest Service (part of USDA)
GRIN Germplasm Resources Information Network (part of ARS)
HHS Department of Health and Human Services
IWGSC Interagency Working Group on Scientific Collections (part of NSTC)
NASA National Aeronautics and Space Administration
NDU National Defense University (part of DOD)
NGRP National Genetic Resources Program (part of ARS)
NHANES National Health and Nutrition Examination Surveys (part of CDC)
NIH National Institutes of Health (part of HHS)
NIS National Identification Services (part of APHIS)
NIST National Institute of Standards and Technology (part of DOC)
NLM National Library of Medicine (part of NIH)
NMFS National Marine Fisheries Service (part of NOAA)
NMNH National Museum of Natural History (part of SI)
NOAA National Oceanic and Atmospheric Administration (part of DOC)
NPGS National Plant Germplasm System (part of ARS)
NPS National Park Service (part of DOI)
NRRL Northern Regional Research Lab Collection (part of ARS)
NSF National Science Foundation
NSF-ICF National Science Foundation Ice Core Facility
NSTC National Science and Technology Council (part of OSTP)
OMB Office of Management and Budget
OSTP Office of Science and Technology Policy
PCR Polymerase Chain Reaction
PPQ Plant Protection and Quarantine Division (part of APHIS)
SI Smithsonian Institution
SRM Standard Reference Materials Program (part of NIST)
STPI Science and Technology Policy Institute
USAID Agency for International Development
USDA Department of Agriculture
USFSC Registry of U.S. Federal Scientific Collections
USGS U.S. Geological Survey (part of DOI)
VA Department of Veterans Affairs
Appendix 2. Collections Cited

Department of Defense: DoD Serum Repository
Discipline: Health Biomedical Sciences
URL: https://health.mil/Military-Health-Topics/Combat-Support/Armed-Forces-Health-Surveillance-Branch/Data-Management-and-Technical-Support/Department-of-Defense-Serum-Repository
Registry record: none

Department of Health and Human Services, Centers for Disease Control: National Health and Nutrition Examination Surveys (NHANES)
Discipline: Health Biomedical Sciences
URL: https://www.cdc.gov/nchs/nhanes/biospecimens/biospecimens.htm
Registry record: https://registry.gbif.org/collection/a4f7e9a3-c9df-443e-b874-6a7d0585453e

Department of Health and Human Services, Food and Drug Administration: Foodborne pathogens
Discipline: Health Biomedical Sciences
URL: https://www.fda.gov/
Registry record: https://registry.gbif.org/institution/f00c1f94-8fbf-4fc0-ac39-abeb3ec48723

Department of Health and Human Services, Food and Drug Administration: CFSAN Foodborne Bacteria collection
Discipline: Health Biomedical Sciences
Registry record: https://registry.gbif.org/collection/85b3c137-6d2a-4a4d-a5c5-ae570c184d46

Department of Health and Human Services, Food and Drug Administration: Gulf Coast Seafood Lab
Discipline: Health Biomedical Sciences
Registry record: https://registry.gbif.org/collection/ebbebdb2-6b55-4f33-9ce1-41add4cc48c7

Department of the Interior, National Park Service: Yellowstone National Park
Discipline: Archaeology; Anthropology; Biological Sciences
URL: https://www.nps.gov/yell/index.htm
Registry record: https://registry.gbif.org/institution/d4e85268-a913-4943-bfa2-eeb498c0ab1d

Department of the Interior, U.S. Geological Survey: Core Research Center
Discipline: Geological & Earth Sciences
URL: https://www.usgs.gov/core-science-systems/nggdp/core-research-center
Registry record: https://registry.gbif.org/collection/ced54b13-6914-402c-bf26-29a7ac2c18a5

National Institute of Standards and Technology: Standard Reference Materials
Discipline: Material Sciences; Agricultural Sciences & Natural Resources; Health Biomedical Sciences
URL: https://www.nist.gov/srm
Registry record: https://registry.gbif.org/institution/fbcb0b2b-2d4c-4f07-9e4d-bee5737fce74

National Science Foundation: NSF Ice Core Facility (NSF-ICF)
Discipline: Geological & Earth Sciences; Atmospheric Sciences
URL: https://icecores.org/
Registry record: https://registry.gbif.org/institution/7a717903-d4c7-4a83-a0cb-aaaca212228e

National Science Foundation: Living Stock Centers Program
URL: https://nsf.gov/funding/pgm_summ.jsp?pims_id=505541&org=DBI&from=home
Registry record: none

Non-Federal, Private: American Type Culture Collection (ATCC)
Discipline: Biological Sciences
URL: https://www.atcc.org/
Registry record: https://registry.gbif.org/institution/dc1823b1-3b46-47a0-bb92-ebe6f2a4a2dd
Smithsonian Institution, National Museum of Natural History: Department of Invertebrate Zoology
Discipline: Biological Sciences; Ocean & Marine Sciences
URL: https://naturalhistory.si.edu/research/invertebrate-zoology
Registry record: https://registry.gbif.org/collection/0174f5b3-da29-4967-b8dc-ce75ed53e35d

U.S. Department of Agriculture, Agricultural Research Service: National Plant Germplasm System (NPGS)
Discipline: Agricultural Sciences & Natural Resources
URL: https://www.ars-grin.gov/
Registry record: https://registry.gbif.org/institution/e45f5702-3f7a-4eaa-8cbe-bc11cee53412

U.S. Department of Agriculture, Agricultural Research Service: ARS Culture Collection
Discipline: Agricultural Sciences & Natural Resources; Health Biomedical Sciences; Biological Sciences
URL: https://nrrl.ncaur.usda.gov/
Registry record: https://registry.gbif.org/collection/2f212b5e-8619-412f-baf6-ee5f0d4b5c67

U.S. Department of Agriculture, Agricultural Research Service: Multiple ARS Systematic collections
Discipline: Biological Sciences
Registry record: https://registry.gbif.org/institution/search?q=usda/ars%20systematic

Notes

1. Abbreviations and acronyms used throughout this document are specified in Appendix 1.
2. See IWGSC Clearinghouse: https://iwgsc.nal.usda.gov
3. 42 USC 6624; Public Law 111–358, January 4, 2011, https://www.congress.gov/111/plaws/publ358/PLAW-111publ358.pdf
4. See also IWGSC, 2013b.
5. U.S. Federal scientific collections are registered in the Global Registry of Scientific Collections, https://www.gbif.org/grscicoll, which is managed by the Global Biodiversity Information Facility, http://gbif.org.
6. IWGSC agency collections policies: https://iwgsc.nal.usda.gov/agency-documents
7. CDC NHANES collection: https://registry.gbif.org/collection/a4f7e9a3-c9df-443e-b874-6a7d0585453e and https://www.cdc.gov/nchs/nhanes/biospecimens/biospecimens.htm
8. DoD Serum Repository record; see also Perdue et al., 2015.
9. When a research project ends, agencies can decide that objects in a project collection are no longer needed for mission-related research. If these objects are not considered appropriate for institutional collections, they can be offered to other agencies or non-Federal institutions. If transferred to a non-Federal institution, these objects may become part of another project collection or they may be accessioned into an institutional collection. For example, the National Marine Fisheries Service (NMFS) often transfers objects from completed project collections to university-based institutional collections. NSF has supported the integration of Federal project collections that have been transferred from agencies (e.g., the National Oceanic and Atmospheric Administration, the Forest Service) and accessioned by non-Federal institutions.
10. NSF Advancing Digitization of Biological Collections Program: https://www.nsf.gov/funding/pgm_summ.jsp?pims_id=503559
11. USDA National Plant Germplasm System collections: https://registry.gbif.org/grscicoll/institution/e45f5702-3f7a-4eaa-8cbe-bc11cee53412
12. ARS Culture Collection record: https://registry.gbif.org/grscicoll/collection/2f212b5e-8619-412f-baf6-ee5f0d4b5c67
13. National Ice Core Facility record: https://registry.gbif.org/institution/7a717903-d4c7-4a83-a0cb-aaaca212228e; also see https://icecores.org/ and https://www.usgs.gov/mission-areas/core-science-systems/about/national-science-foundation-ice-core-facility
14. FDA food safety collections: https://registry.gbif.org/grscicoll/institution/f00c1f94-8fbf-4fc0-ac39-abeb3ec48723
15. Documenting additions to renewable collections (Table 1, Service 3) involves characterizing objects (e.g., NPGS, Box 2; ARS Culture Collection, Box 3). These could be considered research or a collection service unique to this type of collection.
16. The Farm Bill authorizes the ARS Culture Collection in Peoria, IL to charge user fees for access to its Patent Collection (see Box 3).
17. NSF support for Living Stock Centers is provided through the Collections in Support of Biological Research (CSBR) Program: https://nsf.gov/funding/pgm_summ.jsp?pims_id=505541&org=DBI&from=home
18. University of British Columbia's Biobank Resource Center: https://biobanking.org/webs/biobankcosting
19. USDA and ARS Annual Reports on Technology Transfer: http://ars.usda.gov/office-of-technology-transfer/tt-reports/
20. Germplasm Resources Information Network: https://www.ars-grin.gov/
21. "Mr. Cycle" was an early PCR thermal cycler; see https://americanhistory.si.edu/collections/search/object/nmah_1000862
22. https://www.globenewswire.com/news-release/2018/03/27/1453732/0/en/Global-Polymerase-Chain-Reaction-Market-Will-Reach-USD-10-62-Billion-by-2023-Zion-Market-Research.html
23. CFSAN Foodborne Bacteria collection record: https://registry.gbif.org/grscicoll/collection/85b3c137-6d2a-4a4d-a5c5-ae570c184d46
24. Gulf Coast Seafood Lab record: https://registry.gbif.org/grscicoll/collection/ebbebdb2-6b55-4f33-9ce1-41add4cc48c7
25. Genome Trakr is a global network assembling a publicly available database of whole genome sequences of foodborne pathogens. The network includes more than 40 U.S. institutions (Federal and State public health agencies, universities, and hospitals) and 20 non-U.S. institutions (see https://www.fda.gov/food/whole-genome-sequencing-wgs-program/genometrakr-network).
26. USGS Core Research Center collection record: https://registry.gbif.org/grscicoll/collection/ced54b13-6914-402c-bf26-29a7ac2c18a5
27. IWGSC considers CRC an institutional collection. DOI does not use the terms "institutional collection" and "project collection" as defined in IWGSC (2013a). DOI classifies its collections as "museum property" and "working collection" (defined in DOI Departmental Manual Part 411, https://www.doi.gov/sites/doi.gov/files/uploads/411dm1_museum_property_policy.pdf). USGS considers CRC a working collection.
28. Beltsville fungal collection: https://www.ars.usda.gov/northeast-area/beltsville-md-barc/beltsville-agricultural-research-center/mycology-and-nematology-genetic-diversity-and-biology-laboratory/docs/us-national-fungus-collections-bpi/us-national-fungus-collections-databases/
29. USDA Systematic Entomology Laboratory: https://www.ars.usda.gov/northeast-area/beltsville-md-barc/beltsville-agricultural-research-center/systematic-entomology-laboratory/; NMNH insect collection: https://naturalhistory.si.edu/research/entomology
30. NIH Collections Policy: https://policymanual.nih.gov/1189; USGS Policy on Scientific Working Collections: https://www.usgs.gov/products/scientific-collections/usgs-policy-scientific-working-collections
31. Companion Guide: Guidance for Implementation of the NIH Policy for the Management of and Access to Scientific Collections: https://osp.od.nih.gov/wp-content/uploads/2016/08/Companion%20Guide.pdf; USGS implementation guide to collections policy: https://www.usgs.gov/products/scientific-collections/guide-planning-and-managing-scientific-working-collections-us
32. Examples include the 1906 Antiquities Act, the 1916 NPS Organic Act, and most recently, the 2009 Paleontological Resources Preservation Act (Subtitle D of the Omnibus Land Management Act of 2009: https://www.congress.gov/111/plaws/publ11/PLAW-111publ11.pdf). Legislation concerning DOI's museum collections policies: https://www.doi.gov/museum/policy
Legislation concerning DOI’s museum collections policies: https://www.doi.gov/museum/policy
33. See DOI Office of the Inspector General 2009 and 2016, and annual DOI reports on museum property management: https://www.doi.gov/museum/annual-reports
34. https://iwgsc.nal.usda.gov

References
Albert, M., J. Bartlett, R. N. Johnston, B. Schacter, and P. Watson. 2014. Biobank Bootstrapping: Is Biobank Sustainability Possible through Cost Recovery? Biopreservation and Biobanking 12(6). https://doi.org/10.1089/bio.2014.0051
American Alliance of Museums. 2017. “Museums as Economic Engines: A National Report.” Commissioned by the American Alliance of Museums and conducted by Oxford Economics.
Ammon, J., A. J. Salter, and B. R. Martin. 2001. The Economic Benefits of Publicly Funded Basic Research: A Critical Review. Research Policy 30(3):509–532. https://doi.org/10.1016/S0048-7333(00)00091-3
Baker, R. J., L. C. Bradley, H. J. Garner, and R. D. Bradley. 2014. “‘Door to Drawer’ Costs of Curation, Installation, Documentation, Databasing, and Long-Term Care of Mammal Voucher Specimens in Natural History Collections.” Occasional Papers of the Museum of Texas Tech University, Lubbock, Texas.
Bishop, R. C. 1982. Option Value: An Exposition and Extension. Land Economics 58(1):1–15.
Bradley, R. D., L. C. Bradley, H. J. Garner, and R. J. Baker. 2014. “Assessing the Value of Natural History Collections and Addressing Issues Regarding Long-Term Growth and Care.” BioScience 64:1150–1158. https://doi.org/10.1093/biosci/biu166
Bretting, P. K. 2018. 2017 Frank Meyer Medal for Plant Genetic Resources Lecture: Stewards of Our Agricultural Future. Crop Science 58:2233–2240. https://doi.org/10.2135/cropsci2018.05.0334
Brock, T. D. 1997.
The Value of Basic Research: Discovery of Thermus aquaticus and Other Extreme Thermophiles. Genetics 146(4):1207–1210.
Brock, T. D., and H. Freeze. 1969. Thermus aquaticus gen. n. and sp. n., a Nonsporulating Extreme Thermophile. Journal of Bacteriology 98(1):289–297.
Byford, S., D. J. Torgerson, and J. Raftery. 2000. Cost of Illness Studies. BMJ 320:1335. https://doi.org/10.1136/bmj.320.7245.1335
DiEuliis, D., K. R. Johnson, S. S. Morse, and D. E. Schindel. 2016. Opinion: Specimen Collections Should Have a Much Bigger Role in Infectious Disease Research and Response. Proceedings of the National Academy of Sciences 113(1):4–7. https://doi.org/10.1073/pnas.1522680112
Evenson, R. E., and D. Gollin. 1997. Genetic Resources, International Organizations, and Improvement in Rice Varieties. Economic Development and Cultural Change 45(3):471–500.
Fisher, A. C., and W. M. Hanemann. 1986. Option Value and the Extinction of Species. Advances in Applied Micro-Economics 4:169–190.
Furman, J. L., and S. Stern. 2011. “Climbing atop the Shoulders of Giants: The Impact of Institutions on Cumulative Research.” American Economic Review 101:1933–1963.
Graves, P. 2003. “Valuing Public Goods.” Challenge 46:100–112.
Güereña, D. T., J. Lehmann, J. E. Thies, A. Enders, N. Karanja, and H. Neufeldt. 2015. Partitioning the Contributions of Biochar Properties to Enhanced Biological Nitrogen Fixation in Common Bean (Phaseolus vulgaris). Biology and Fertility of Soils 51:479–491. https://doi.org/10.1007/s00374-014-0990-z
Halewood, M. 2013. What Kind of Goods Are Plant Genetic Resources for Food and Agriculture? Towards the Identification and Development of a New Global Commons. International Journal of the Commons 7(2):278–312. https://www.jstor.org/stable/26523131
Horowitz, D. B., E. C. Peters, I. Sunila, and J. C. Wolf. 2010. Treasures in Archived Histopathology Collections: Preserving the Past for Future Understanding. Histologic XLIII(1):1–8.
Interagency Working Group on Scientific Collections. 2009. “Scientific Collections: Mission-Critical Infrastructure for Federal Science Agencies.” Report of the IWGSC. https://iwgsc.nal.usda.gov/sites/default/files/IWGSC_GreenReport_FINAL_2009.pdf
Interagency Working Group on Scientific Collections. 2013a. “Recommendations for Departmental Collections Policies.” Report of the IWGSC. https://iwgsc.nal.usda.gov/sites/default/files/IWGSC_Recommend_Collxns_Policies_2013_01_28.pdf
Interagency Working Group on Scientific Collections. 2013b. “Best Practices for Budgeting for Scientific Collections.” Report of the IWGSC. https://repository.si.edu/bitstream/handle/10088/99408/IWGSC%20Collections%20Budget%20Recommendations%20FINAL-2013.pdf?sequence=1&isAllowed=y
Kothamasi, D., M. Spurlock, and E. Kiers. 2011. Agricultural Microbial Resources: Private Property or Global Commons? Nature Biotechnology 29:1091–1093. https://doi.org/10.1038/nbt.2056
Lawrey, J. D. 1993. Lichens as Monitors of Pollutant Elements at Permanent Sites in Maryland and Virginia. The Bryologist 96:339–341.
Lichtenberg, E., L. J. Olson, and C. Lawley. 2009. The Value of Systematics in Screening Imports for Invasive Pests. Unpublished manuscript, June 17, 2009.
Link, A. N., and J. T. Scott. 2012. “The Theory and Practice of Public-Sector R&D Economic Impact Analysis.” National Institute of Standards and Technology Planning Report 11-1, January 2012.
Martin, S. A., M. P. Gallaher, and A. C. O’Connor. 2000. “Economic Impact of Standard Reference Materials for Sulfur in Fossil Fuels.” National Institute of Standards and Technology Planning Report 00-1, February 2000.
Odeh, H., L. Miranda, A. Rao, J. Vaught, H. Greenman, J. McLean, D. Reed, S. Memon, B. Fombonne, P. Guan, and H. M. Moore. 2015. The Biobank Economic Modeling Tool (BEMT): Online Financial Planning to Facilitate Biobank Sustainability. Biopreservation and Biobanking 13(6):421–429. https://doi.org/10.1089/bio.2015.0089
Office of Science and Technology Policy. 2010. “Policy on Scientific Collections.” Memorandum to the Heads of Executive Departments and Agencies. OSTP, October 6, 2010. https://iwgsc.nal.usda.gov/sites/default/files/OSTP_MEMO_Scientific_Collxns_Policy_2010_10(1).pdf
Office of Science and Technology Policy. 2014. “Improving the Management of and Access to Scientific Collections.” Memorandum to the Heads of Executive Departments and Agencies. OSTP, March 20, 2014. https://iwgsc.nal.usda.gov/sites/default/files/OSTP_MEMO_Scientific_Collxns_FINAL_2014_03(1).pdf
Perdue, C. L., A. A. Eick-Cost, and M. V. Rubertone. 2015. A Brief Description of the Operation of the DoD Serum Repository. Military Medicine 180(10):10–12.
Reichman, J. H., P. F. Uhlir, and T. Dedeurwaerdere. 2015. Governing Digitally Integrated Genetic Resources, Data, and Literature: Global Intellectual Property Strategies for a Redesigned Microbial Research Commons. New York: Cambridge University Press.
Research Triangle Institute. 2000. “Economic Impact of Standard Reference Materials for Sulfur in Fossil Fuels.” National Institute of Standards and Technology Planning Report 00-1.
Robbins, J. 2006. The Search for Private Profit in the Nation’s Public Parks. New York Times, November 28, 2006. https://www.nytimes.com/2006/11/28/science/28yell.html
Rubenstein, K. D., P. Heisey, R. Shoemaker, J. Sullivan, and G. Frisvold. 2005. “Crop Genetic Resources: An Economic Appraisal.” A Report of the Economic Research Service. Economic Information Bulletin Number 2, May 2005. U.S. Department of Agriculture. https://www.ers.usda.gov/webdocs/publications/44121/17452_eib2_1_.pdf?v=0
Schuhmann, P. W. 2014. “Non-Market Valuation: Methods and Data.” Slide set, University of North Carolina, Wilmington. https://www.sesync.org/sites/default/files/education/economics-4.pdf
Smale, M., and B. Koo. 2003. Introduction: Taxonomy of Genebank Value. Brief 7. In What is a Genebank Worth? ed. Smale, M., and B. Koo, Briefs 7–12, Biotechnology and Genetic Resource Policies.
International Food Policy Research Institute, December 2003.
Stauderman, S., and W. G. Tompkins, eds. 2016. Proceedings of the Smithsonian Institution: Summit on the Museum Preservation Environment. Washington, DC: Smithsonian Institution Scholarly Press. https://doi.org/10.5479/si.9781935623878
Suarez, A. V., and N. D. Tsutsui. 2004. The Value of Museum Collections for Research and Society. BioScience 54:66–74.
Tassey, G. 2003. “Methods for Assessing the Economic Impacts of Government R&D.” National Institute of Standards and Technology Planning Report 03-1, September 2003.
U.S. Department of the Interior, Office of Inspector General. 2009. “Museum Collections: Accountability and Preservation.” Report No. C-IN-MOA-0010-2008, December 2009. https://www.doioig.gov/sites/doioig.gov/files/2010-I-0005.pdf
U.S. Department of the Interior, Office of Inspector General. 2016. “Verification Review – Recommendations for the Report ‘Department of the Interior’s Accountability and Preservation of Museum Collections’ (Audit No. C-IN-MOA-0010-2008).” Report No. 2016-CR-018. https://www.doioig.gov/sites/doioig.gov/files/2016CR018Public_0.pdf

Economic Analyses of Federal Scientific Collections: Methods for Documenting Costs and Benefits. David E. Schindel and the Economic Study Group of the Interagency Working Group on Scientific Collections. Open access PDF available from Smithsonian Institution Scholarly Press at https://doi.org/10.5479/si.13241612

zkie-zero-2020 ---- From Zero to Hero: Human-In-The-Loop Entity Linking in Low Resource Domains / Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics, pages 6982–6993, July 5–10, 2020. ©2020 Association for Computational Linguistics

From Zero to Hero: Human-In-The-Loop Entity Linking in Low Resource Domains
Jan-Christoph Klie, Richard Eckart de Castilho, Iryna Gurevych
Ubiquitous Knowledge Processing Lab (UKP-TUDA), Department of Computer Science, Technical University of Darmstadt, Germany
www.ukp.tu-darmstadt.de

Abstract
Entity linking (EL) is concerned with disambiguating entity mentions in a text against knowledge bases (KB). It is crucial in a considerable number of fields like humanities, technical writing and biomedical sciences to enrich texts with semantics and discover more knowledge. The use of EL in such domains requires handling noisy texts, low resource settings and domain-specific KBs.
Existing approaches are mostly inappropriate for this, as they depend on training data. However, in the above scenario, hardly any annotated data exists, and it needs to be created from scratch. We therefore present a novel domain-agnostic Human-In-The-Loop annotation approach: we use recommenders that suggest potential concepts and adaptive candidate ranking, thereby speeding up the overall annotation process and making it less tedious for users. We evaluate our ranking approach in a simulation on difficult texts and show that it greatly outperforms a strong baseline in ranking accuracy. In a user study, the annotation speed improves by 35% compared to annotating without interactive support; users report that they strongly prefer our system. An open-source and ready-to-use implementation based on the text annotation platform INCEpTION (https://inception-project.github.io) is made available at https://github.com/UKPLab/acl2020-interactive-entity-linking.

1 Introduction
Entity linking (EL) describes the task of disambiguating entity mentions in a text by linking them to a knowledge base (KB); e.g. the text span Earl of Orrery can be linked to the KB entry John Boyle, 5th Earl of Cork, thereby disambiguating it. EL is highly beneficial in many fields like digital humanities, classics, technical writing or biomedical sciences for applications like search (Meij et al., 2014), semantic enrichment (Schlögl and Lejtovicz, 2017) or information extraction (Nooralahzadeh and Øvrelid, 2018).

Figure 1: Difficult entity mentions with their linked entities: 1) Name Variation, 2) Spelling Variation, 3) Ambiguity

These are overwhelmingly low-resource settings: often, no annotated data exists, and coverage of open-domain knowledge bases like Wikipedia or DBPedia is low. Therefore, entity linking is frequently performed against domain-specific knowledge bases (Munnelly and Lawless, 2018a; Bartsch, 2004). In these scenarios, the first crucial step is to obtain annotated data. This data can then either be used directly by researchers for their downstream task or be used to train machine learning models for automatic annotation. For this initial data creation step, we developed a novel Human-In-The-Loop (HITL) annotation approach. Manual annotation is laborious and often prohibitively expensive. To improve annotation speed and quality, we therefore add interactive machine learning annotation support that helps the user find entities in the text and select the correct knowledge base entries for them. The more entities are annotated, the better the annotation support becomes. Throughout this work, we focus on texts from the digital humanities, more precisely, texts written in Early Modern English, including poems, biographies, novels as well as legal documents. In this domain, texts are noisy, as they were written in times when orthography was rather incidental, or they suffer from OCR and transcription errors (see Fig. 1). Tools like named entity recognizers are unavailable or perform poorly (Erdmann et al., 2019). We demonstrate the effectiveness of our approach with extensive simulation as well as a user study on different, challenging datasets. We implement our approach based on the open-source annotation platform INCEpTION (Klie et al., 2018) and publish all datasets and code.
Our contributions are the following:
1. We present a generic, KB-agnostic annotation approach for low-resource settings and provide a ready-to-use implementation so that researchers can easily annotate data for their use cases. We validate our approach extensively in a simulation and in a user study.
2. We show that statistical machine learning models can be used in an interactive entity linking setting to improve annotation speed by over 35%.

2 Related work
In the following, we give a broad overview of existing EL approaches, annotation support and Human-In-The-Loop annotation.

Entity Linking describes the task of disambiguating mentions in a text against a knowledge base. It is typically approached in three steps: 1) mention detection, 2) candidate generation and 3) candidate ranking (Shen et al., 2015) (Fig. 2). Mention detection most often relies either on gazetteers or on pretrained named entity recognizers. Candidate generation either uses precompiled candidate lists derived from labeled data or uses full-text search. Candidate ranking assigns each candidate a score; the candidate with the highest score is returned as the final prediction.
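The three steps can be made concrete with a minimal sketch. Everything below (the function names, the Candidate record, the toy knowledge base and its identifiers) is our own illustration of the pipeline just described, not code from the paper, and each step is deliberately naive:

```python
# Minimal sketch of the three-step entity linking pipeline.
# All names and data are illustrative, not taken from the paper's code.
from dataclasses import dataclass

@dataclass
class Candidate:
    entity_id: str    # knowledge base identifier (made up here)
    label: str        # human-readable KB label
    description: str  # short KB description
    score: float = 0.0

def detect_mentions(text):
    """Step 1: find spans that may refer to KB entities.
    Trivial stand-in: treat capitalized tokens as mentions."""
    return [tok for tok in text.split() if tok[:1].isupper()]

def generate_candidates(mention, kb):
    """Step 2: retrieve KB entries whose label overlaps the mention.
    A real system would use full-text / fuzzy search over a KB index."""
    m = mention.lower()
    return [c for c in kb if m in c.label.lower() or c.label.lower() in m]

def rank_candidates(mention, context, candidates):
    """Step 3: score each candidate and sort; the top one is the prediction.
    Naive scoring: character overlap between mention and label."""
    for c in candidates:
        c.score = len(set(mention.lower()) & set(c.label.lower()))
    return sorted(candidates, key=lambda c: c.score, reverse=True)

kb = [Candidate("E1", "Dublin", "capital of Ireland"),
      Candidate("E2", "Trinity College", "constituent college of the University of Dublin")]
context = "Dublin is the capital of Ireland"
for mention in detect_mentions(context):
    ranked = rank_candidates(mention, context, generate_candidates(mention, kb))
    if ranked:
        print(mention, "->", ranked[0].label)
```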
Existing systems rely on the availability of certain resources like a large Wikipedia as well as software tools, and they are often restricted in the knowledge base they can link to. Off-the-shelf systems like Dexter (Ceccarelli et al., 2013), DBPedia Spotlight (Daiber et al., 2013) and TagMe (Ferragina and Scaiella, 2010) most often can only link against Wikipedia or a related knowledge base like Wikidata or DBPedia. They require good Wikipedia coverage for computing frequency statistics like popularity, view count or PageRank (Guo et al., 2013). These features work very well for standard datasets due to their Zipfian distribution of entities, leading to high reported scores on state-of-the-art datasets (Ilievski et al., 2018; Milne and Witten, 2008). However, these systems are rarely applied out-of-domain, such as in digital humanities or classical studies. Compared to state-of-the-art approaches, only a limited amount of research has been performed on entity linking against domain-specific knowledge bases. AGDISTIS (Usbeck et al., 2014) is a knowledge-base-agnostic approach based on the HITS algorithm; its mention detection relies on gazetteers compiled from resources like Wikipedia and thereby performs string matching. Brando et al. (2016) propose REDEN, an approach based on graph centrality to link French authors to literary criticism texts. It requires additional linked data that is aligned with the custom knowledge base; they use DBPedia. As we work in a domain-specific low-resource setting, access to large corpora which could be used to compute popularity priors is limited. We do not have suitable named entity linking tools, gazetteers or a sufficient amount of labeled training data. Therefore, it is challenging to use state-of-the-art systems.

Human-in-the-loop annotation: HITL machine learning describes an interactive scenario where a machine learning (ML) system and a human work together to improve their performance. The ML system gives predictions, and the human corrects them if they are wrong and helps to spot things that have been overlooked by the machine. The system uses this feedback to improve, leading to better predictions and thereby reducing the effort of the human. In natural language processing, it has been applied in scenarios like interactive text summarization (Gao et al., 2018), parsing (He et al., 2016) or data generation (Wallace et al., 2019). Regarding machine-learning assisted annotation, Yimam et al. (2014) propose an annotation editor that, during annotation, interactively trains a model using annotations made by the user. They use string matching and MIRA (Crammer and Singer, 2003) as recommenders, evaluate on POS and NER annotation and show improvements in annotation speed. TASTY (Arnold et al., 2016) is a system that is able to perform EL against Wikipedia on the fly while typing a document. A pretrained neural sequence tagger performs mention detection. Candidates are precomputed, and the candidate with the highest text similarity is chosen. The system updates its suggestions after interactions such as writing, rephrasing, removing or correcting suggested entity links. Corrections are used as training data for the neural model. However, it is not yet suitable for our scenario for the following reasons: to overcome the cold start problem, it needs annotated training data in addition to a precomputed index for candidate generation, and it only links against Wikipedia.

3 Architecture
The following section describes the three components of our annotation framework, following the standard entity linking pipeline (see Fig. 2). Throughout this work, we mainly focus on the candidate ranking step. We call the text span which contains an entity the mention and the sentence the mention is in the context. Each candidate from the knowledge base is assumed to have a label and a description. For instance, in Fig. 2, one mention is Dublin, the context is “Dublin is the capital of Ireland”, the label of the first candidate is Trinity College and its description is “constituent college of the University of Dublin in Ireland”.

Figure 2: Entity linking pipeline: First, mentions of entities in the text need to be found. Then, given a mention, candidate entities are generated. Finally, entities are ranked and the top entity is chosen.

Mention Detection: In the annotation setting, we rely on users to mark text spans that contain annotations. As support, we provide suggestions given by different recommender models: similar to Yimam et al. (2014), we use a string matcher suggesting annotations for mentions which have been annotated before. We also propose a new Levenshtein string matcher based on Levenshtein automata (Schulz and Mihov, 2002). In contrast to the string matcher, it suggests annotations for spans within a Levenshtein distance of 1 or 2. Preliminary experiments with ML models for mention detection, such as a Conditional Random Field with handcrafted features, did not perform well and yielded noisy suggestions, requiring further investigation.

Candidate Generation: We index the knowledge base and use full-text search to retrieve candidates based on the surface form of the annotated mention. We use fuzzy search to help in cases where the mention and the knowledge base label are almost the same but not identical (e.g. Dublin vs. Dublyn). In the interactive setting, the user can also query this index and search the knowledge base during annotation, e.g. in cases when the gold entity is not ranked high enough or when the surface form and knowledge base label are not the same (Zeus vs. Jupiter).
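To illustrate what the Levenshtein recommender suggests, here is a behavioral sketch. The paper builds on Levenshtein automata (Schulz and Mihov, 2002) for efficiency; the brute-force dynamic-programming scan below is only a stand-in that shows the matching behavior, with function names and the toy example being ours:

```python
# Behavioral sketch of the Levenshtein recommender: suggest spans of new text
# that lie within edit distance <= k of a previously annotated mention.
# Spans of <= 3 characters are skipped to reduce noise, mirroring the paper.

def edit_distance(a: str, b: str) -> int:
    """Classic dynamic-programming Levenshtein distance."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(prev[j] + 1,                  # deletion
                           cur[j - 1] + 1,               # insertion
                           prev[j - 1] + (ca != cb)))    # substitution
        prev = cur
    return prev[-1]

def suggest(tokens, annotated_mentions, k=1, min_len=4):
    """Yield (start, end, matched_mention) for token n-grams whose text is
    within edit distance k of a known mention."""
    max_ngram = max(len(m.split()) for m in annotated_mentions)
    for n in range(1, max_ngram + 1):
        for i in range(len(tokens) - n + 1):
            span = " ".join(tokens[i:i + n])
            if len(span) < min_len:
                continue
            for mention in annotated_mentions:
                if edit_distance(span.lower(), mention.lower()) <= k:
                    yield (i, i + n, mention)

tokens = "he fled to the Castle of Castlekevyn in November".split()
print(list(suggest(tokens, ["Castlekevin"], k=1)))
# -> [(6, 7, 'Castlekevin')]
```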
Candidate Ranking: We follow Zheng et al. (2010) and model candidate ranking as a learning-to-rank problem: given a mention and a list of candidates, sort the candidates so that the most relevant candidate is at the top. For training, we guarantee that the gold candidate is present in the candidate list. For evaluation, the gold candidate can be absent from the candidate list if the candidate search failed to find it. This interaction is the core Human-in-the-loop in our approach. For training, we rephrase the task as preference learning: by selecting an entity label from the candidate list, users express that the selected one was preferred over all other candidates. These preferences are used to train state-of-the-art pairwise learning-to-rank models from the literature: the gradient boosted trees variant LightGBM (Ke et al., 2017), RankSVM (Joachims, 2002) and RankNet (Burges et al., 2005). Models are retrained in the background when new annotations are made, thus improving over time with an increasing number of annotations. We use a set of generic handcrafted features which are described in Table 1. These models were chosen as they can work with little data, train quickly and allow introspection. Using deep models or word embeddings as input features proved too slow to be interactive. We also leverage pretrained Sentence-BERT embeddings (Reimers and Gurevych, 2019) trained on Natural Language Inference data written in simple English. These are not fine-tuned by us during training. Although they come from a different domain, we conjecture that the WordPiece tokenization of BERT helps with the spelling variance of our texts, in contrast to traditional word embeddings, which would have many out-of-vocabulary words. For specific tasks, custom features can easily be incorporated, e.g. entity type information, time information for diachronic entity linking, or location information or distance for annotating geographical entities.

• Mention exactly matches label
• Label is prefix/postfix of mention
• Mention is prefix/postfix of label
• Label is substring of mention and vice versa
• Levenshtein distance between mention and label
• Levenshtein distance between context and description
• Jaro-Winkler distance between mention and label
• Jaro-Winkler distance between context and description
• Sørensen-Dice index between context and description
• Jaccard coefficient between context and description
• Exact match of Soundex encoding of mention and label
• Phonetic Match Rating of mention and label
• Cosine distance between SBERT embeddings of context and description (Reimers and Gurevych, 2019)
• Query length
* Query exactly matches label
* Query is prefix/postfix of label/mention
* Query is substring of mention/label
* Levenshtein distance between query and label
• Levenshtein distance between query and mention
• Jaro-Winkler distance between query and label
• Jaro-Winkler distance between query and mention

Table 1: Features used for candidate ranking. Starred features were also used by Zheng et al. (2010).
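A handful of these features can be computed in a few lines; the sketch below derives a small subset of Table 1 for one (mention, context, candidate) triple. The feature names and toy inputs are ours, and the Sentence-BERT feature is left as a comment since it needs a pretrained model (e.g. the sentence-transformers package):

```python
# Sketch: a small subset of the handcrafted ranking features from Table 1
# for one (mention, context, candidate) triple. Names and inputs are made up.

def jaccard(a_tokens, b_tokens):
    """Jaccard coefficient between two token sets."""
    a, b = set(a_tokens), set(b_tokens)
    return len(a & b) / len(a | b) if a | b else 0.0

def features(mention, context, label, description):
    m, l = mention.lower(), label.lower()
    return {
        "exact_match":      float(m == l),
        "label_is_prefix":  float(m.startswith(l)),   # label is prefix of mention
        "label_is_postfix": float(m.endswith(l)),
        "label_in_mention": float(l in m),
        "mention_in_label": float(m in l),
        "jaccard_ctx_desc": jaccard(context.lower().split(),
                                    description.lower().split()),
        # "sbert_cosine": cosine distance between Sentence-BERT embeddings of
        # context and description (requires a pretrained model; omitted here).
    }

print(features("Dublyn", "Dublyn is the capital of Ireland",
               "Dublin", "capital and largest city of Ireland"))
```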
4 Datasets
There are very few datasets available that can be used for EL against domain-specific knowledge bases, further stressing our point that we need more of them, thereby requiring approaches like ours to create them. We use three datasets: AIDA-YAGO, Women Writers Online (WWO) and the 1641 Depositions. AIDA consists of Reuters news stories. To the best of our knowledge, WWO has not been considered for automatic EL so far. The 1641 Depositions have been used in automatic EL, but only when linking against DBPedia, which has very low entity coverage (Munnelly and Lawless, 2018b). We preprocess the data, split it into sentences, tokenize it and reduce noise. For WWO, we derive an RDF KB from their personography; for 1641, we derive a knowledge base from the annotations. The exact processing steps as well as example texts are described in the appendix. The resulting datasets for WWO and the 1641 Depositions are also made available in the accompanying code repository.

AIDA-YAGO: For validating our approach, we evaluate on the state-of-the-art AIDA-YAGO dataset introduced by Hoffart et al. (2011). Originally, this dataset is linked against YAGO and Wikipedia. We map the Wikipedia URLs to Wikidata and link against this KB, as Wikidata is available in RDF and the official Wikidata SPARQL endpoint offers full-text search; it does not offer fuzzy search, though.

Women Writers Online: Women Writers Online (https://www.wwp.northeastern.edu/wwo) is a collection of texts by pre-Victorian women writers. It includes texts on a wide range of topics and from various genres, including poems, plays, and novels. They represent different states of the English language between 1400 and 1850. A subset of documents has been annotated with named entities (persons, works, places) (Melson and Flanders, 2010). Persons have also been linked to create a personography, a structured representation of persons’ biographies containing names, titles, and time and place of birth and death. The texts are challenging to disambiguate due to spelling variance, ciphering of names and a lack of standardized orthography. Sometimes, people are referred to not by name but by rank or function, e.g. the king. This dataset is interesting, as it contains documents with heterogeneous topics and text genres, causing low redundancy.

1641 Depositions: The 1641 Depositions (http://1641.tcd.ie/) contain legal texts in the form of court witness statements recorded after the Irish Rebellion of 1641. In this conflict, Irish and English Catholics revolted against English and Scottish Protestants and their colonization of Ireland. It lasted over 10 years and ended with the Irish Catholics’ defeat and the foreign rule of Ireland. The depositions have been transcribed from 17th-century handwriting, keeping the old language and orthography. These documents have been used to analyze the rebellion, perform cold case reviews of the atrocities committed and gain insights into contemporary life of this era. Part of the documents have been annotated with named entities that are linked to DBPedia (Munnelly and Lawless, 2018b).

Table 2: Data statistics of the three datasets: total number of documents (#D), tokens (#T) and entities (#E), average number of entities per sentence (#E/S), and percentage of entities that are not linked (%NIL). We also report the average number of entities linked to a mention (Avg. Amb.), the average number of candidates when searching for a mention in the KB (Avg. #Cand.) and the Gini coefficient, which measures how balanced the entity distribution is.

Corpus  #D    #T         #E      #E/S  %NIL   Avg. Amb.  Avg. #Cand.  Gini
AIDA    1393  301,418    34,929  1.59  20.37  1.08       6.98         0.73
WWO     74    1,461,401  14,651  0.34  7.42   1.08       16.66        0.56
1641    16    11,895     480     2.40  0.0    1.01       36.29        0.44
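The Gini coefficient in the last column of Table 2 can be computed directly from how often each entity is linked; a minimal sketch with made-up counts:

```python
# Sketch: Gini coefficient over entity-link counts, as reported in Table 2.
# 0 means all entities are linked equally often; values near 1 mean a few
# entities absorb most links. The counts below are invented for illustration.

def gini(counts):
    xs = sorted(counts)
    n, total = len(xs), sum(xs)
    if n == 0 or total == 0:
        return 0.0
    # Standard formula over sorted values x_1 <= ... <= x_n:
    # G = (2 * sum_i i * x_i) / (n * total) - (n + 1) / n
    weighted = sum(i * x for i, x in enumerate(xs, 1))
    return 2.0 * weighted / (n * total) - (n + 1.0) / n

print(round(gini([10, 10, 10, 10]), 2))  # perfectly balanced -> 0.0
print(round(gini([1, 1, 1, 97]), 2))     # highly skewed      -> 0.72
```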
As the coverage of DBPedia was not sufficient (only around 20% of the entities are in DBPedia), we manually created a domain-specific knowledge base for this dataset containing the places and people mentioned. To increase difficulty and reduce overfitting, we added additional related entities from DBPedia. The number of persons thereby increases tenfold (130 → 1383) and the number of places twentyfold (99 → 2119). Details can be found in Appendix A.1. While generating a KB from gold data is not ideal, creating or completing a knowledge base during annotation is not uncommon (see e.g. Wolfe et al., 2015). The texts are difficult to disambiguate for the same reasons as for WWO. The depositions are interesting, as they contain documents from the same domain (witness reports), but feature many different actors and events.

Table 2 contains several statistics regarding the three datasets. AIDA and 1641 contain on average at least one entity per sentence, whereas WWO, while larger, is only sparsely annotated. In contrast to the other two, 1641 contains no entities linked to NIL. This is caused by the fact that we created the KB for 1641 from the gold annotations, and for entities previously NIL, new entities were created by hand; before that, the original corpus linking to DBPedia had 77% NIL annotations. The average ambiguity, that is, how many different entities were linked to mentions with the same surface form, is quite high for AIDA and WWO and quite low for 1641. We explain the latter by the extreme variance in surface form, as even mentions of the same name are often written differently (e.g. Castlekevyn vs. Castlekevin). Also, 1641 contains many hapax legomena (mentions that only occur once). The average number of candidates is comparatively larger for WWO and 1641, as we use fuzzy search for these. Finally, the distributions of assigned entities in WWO and 1641 are also more balanced, expressed by a lower Gini coefficient (Dodge, 2008). These last two aspects, together with noisy texts and low resources, cause entity linking to be much more difficult than on state-of-the-art datasets like AIDA.

5 Experiments
To validate our approach, we first evaluate recommender performance. Then, non-interactive ranking performance is evaluated similarly to state-of-the-art EL. Afterwards, we simulate a user annotating corpora with our Human-In-The-Loop ranker. Finally, we conduct a user study to test it in a realistic setting. Similar to other work on EL, our main metric for ranking is accuracy. We also measure Accuracy@5, as our experiments showed that users can quickly scan and select the right entity from a list of five elements; in our annotation editor, the candidate list shows the first five elements without scrolling. As a baseline, we use the Most-Frequently Linked Entity baseline (MFLEB). It assigns, given a mention, the entity that was most often linked to it in the training data.
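The MFLE baseline and the Accuracy@k metric are simple enough to sketch directly from their definitions above; the mention strings and entity ids below are invented:

```python
# Sketch: the Most-Frequently Linked Entity (MFLE) baseline and Accuracy@k.
# Training data maps each observed mention to the entities it was linked to;
# at test time the most frequently linked entity for that mention comes first.
from collections import Counter, defaultdict

def train_mfle(annotations):
    """annotations: iterable of (mention, entity_id) pairs."""
    counts = defaultdict(Counter)
    for mention, entity in annotations:
        counts[mention.lower()][entity] += 1
    # For each mention: entities sorted by how often they were linked to it.
    return {m: [e for e, _ in c.most_common()] for m, c in counts.items()}

def accuracy_at_k(model, test, k):
    """Fraction of test pairs whose gold entity is in the top-k prediction."""
    hits = sum(gold in model.get(mention.lower(), [])[:k]
               for mention, gold in test)
    return hits / len(test)

train = [("Dublin", "E1"), ("Dublin", "E1"), ("Dublin", "E2"), ("Cork", "E3")]
test = [("Dublin", "E1"), ("Dublin", "E2"), ("Cork", "E3")]
model = train_mfle(train)
print(accuracy_at_k(model, test, 1))  # 2/3: E2 is not ranked first for "Dublin"
print(accuracy_at_k(model, test, 5))  # 1.0
```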
It can be seen that for AIDA and WWO, the performance of all three recommenders is quite good (recall is about 60% and 40%) while for 1641, it is only around 20%. The Levenshtein recommender increases recall and reduces precision. The impact is most pronounced for 1641, where it improves recall upon the string matching recommender by around 50%. In sum- mary, we suggest using the string matching rec- 6987 Dataset Model P R F1 AIDA String 0.43 0.60 0.50 Leven@1 0.31 0.55 0.40 Leven@2 0.19 0.57 0.28 WWO String 0.17 0.38 0.23 Leven@1 0.11 0.40 0.16 Leven@2 0.04 0.42 0.07 1641 String 0.12 0.14 0.13 Leven@1 0.16 0.19 0.17 Leven@2 0.12 0.22 0.15 Table 3: Recommender performance in Precision, Recall and F1 score for String matching recommender and Levenshtein recommender with distance 1 and 2. For AIDA, we evaluate on the test set, for the other datasets, we use 10-fold cross validation. ommender for domains where texts are clean and exhibit low spelling variance. We consider the Levenshtein recommender to be more suitable for domains with noisy texts. 5.2 Candidate ranking performance We evaluate EL candidate ranking in a non- interactive setting first to estimate the upper bound ranking performance. As we are the first to per- form EL on our version of WWO and 1641, it also serves as a difficulty comparison between AIDA as the state-of-the-art dataset and datasets from our domain-specific setting. For AIDA, we use the ex- isting train, development and test split; for the other two corpora, we perform 10-fold cross validation as we observed high variance in score when us- ing different train-test splits. Features related to user queries are not used in this experiment. We assume that the gold candidate always exists in training and evaluation data. The results of this experiment are depicted in Table 4. It can be seen that for AIDA, the MFLE baseline is particularly strong, being better than all trained models. For the other datasets, the baseline is weaker than all, show- ing that popularity is a weak feature in our setting. For AIDA, LightGBM performs best, for WWO and 1641, the RankNet is best closely followed by the RankSVM. The accuracy@5 is compara- tively high as there are cases where the candidate list is relatively short. Regarding training times, LightGBM trains extremely fast with RankSVM being a close second. They are fast enough to re- train after each user annotation. The RankNet trains two to four times slower than both. Data Model A@1 A@5 |C| t AIDA MFLEB 0.56 0.71 31 LightGBM 0.44 0.72 9 RankSVM 0.37 0.69 56 RankNet 0.42 0.70 190 WWO MFLEB 0.32 0.77 19 LightGBM 0.37 0.83 2 RankSVM 0.46 0.86 15 RankNet 0.52 0.87 37 1641 MFLEB 0.28 0.75 38 LightGBM 0.35 0.77 1 RankSVM 0.48 0.80 1 RankNet 0.55 0.83 2 Table 4: Ranking scores when using all the data. We report Accuracy@1 (Gold Candidate was ranked high- est, Accuracy@5 (Gold Candidate was in top 5 predic- tions of the ranker)). |C| denotes the average number of candidates found for each mention. For AIDA, we evaluate on the test set, for the other datasets, we use 10-fold cross validation. We also measure the training time t in seconds averaged over 10 runs. Feature importance The models we chose for ranking are white-box; they allow us to introspect the importance they give to each feature, thereby explaining their scoring choice. For the RankSVM, we follow Guyon et al. (2002) and use the square of the model weights as importance. For Light- GBM, we use the number of times a feature is used to make a split in a decision tree. 
We train RankSVM and LightGBM models on all data and report the most and least important features in Fig. 3. We normalize the weights by the L1 norm. It can be seen that both models rely on the Levenshtein distance between mention and label as well as on Sentence-BERT. The other text similarity features are also used, if sparingly. Simple features like exact match, contains, or prefix and postfix seem not to have a large impact. In general, LightGBM uses more features than the RankSVM. Even though Sentence-BERT was trained on Natural Language Inference (NLI) data, which contains only relatively simple sentences, it is still relied on by both models for all datasets. The high importance of the Levenshtein distance between mention and label for 1641 is expected and can be explained by the fact that the knowledge base labels were often derived from the mentions in the text when creating the domain-specific knowledge base for this dataset. When trained on AIDA, the RankSVM assigns a high importance to the Jaccard distance between context and description. We attribute this to the fact that entity descriptions in Wikidata are quite short; if they are similar to the context, then it is very likely a match.

Figure 3: Feature importance of the respective models for different datasets. For the RankSVM, we use the squared weights; for LightGBM, we use the number of times a feature is used for splitting. Both are normalized to sum up to 1. ML stands for Mention-Label, CD for Context-Description.

5.3 Simulation
We simulate the Human-In-The-Loop setting by modeling a user annotating an unannotated corpus linearly. In the beginning, they annotate an initial seed of 10 entities without annotation support, which are then used to bootstrap the ranker. At every step, the user annotates several entities with the ranker as assistance. After an annotation batch is finished, this new data is added to the training set, and the ranker is retrained and evaluated. Only LightGBM and RankSVM are used, as the RankNet turned out to be too slow. We do not evaluate on a holdout set. Instead, we follow Erdmann et al. (2019) and simulate annotating the complete corpus and evaluate on the very same data, as we are interested in how an annotated subset helps to annotate the rest of the data, not how well the model generalizes. We assume that users annotate mention spans perfectly, i.e. we use gold spans. Candidate generation is simulated in three phases, relying on the fact that the gold entity is given by the dataset: first, search for the mention only. If the gold entity was not found, search for the first word of the mention only. If this also does not return the gold entity, search for the gold entity label. All candidates retrieved by these searches for a mention are used as training data. We also experimented with using only candidates to which the ranker assigned a higher score than the gold one. This, however, did not affect the performance. Therefore, we use all negative candidates.
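The three-phase candidate search can be sketched as follows; `search` stands in for the full-text/fuzzy KB query (here a plain substring match), and the KB entries are invented:

```python
# Sketch of the simulated three-phase candidate search: try the mention, then
# its first word, then fall back to the gold label. All candidates seen along
# the way are kept, mirroring the use of all negatives as training data.

def search(query, kb):
    """Stand-in for the full-text/fuzzy KB query: substring match on labels."""
    q = query.lower()
    return [e for e in kb if q in e["label"].lower()]

def simulated_candidates(mention, gold_label, kb):
    """Returns (phase in which the gold entity was found, candidates seen)."""
    seen = []
    queries = [mention, mention.split()[0], gold_label]
    for phase, query in enumerate(queries, start=1):
        seen += [c for c in search(query, kb) if c not in seen]
        if any(c["label"] == gold_label for c in seen):
            return phase, seen
    return None, seen

kb = [{"label": "Luke Toole"}, {"label": "Castlekevin"}, {"label": "Dublin"}]
print(simulated_candidates("Colonel Toole", "Luke Toole", kb))
# -> (3, [{'label': 'Luke Toole'}])
```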
Fig. 4 depicts the simulation results. All models outperform the MFLE baseline over most of the annotation process. It can be seen that both of our models achieve high performance even when trained on very few annotations. The RankSVM handles low data better than LightGBM but quickly reaches its peak performance, as it is a linear model with limited learning capacity. The LightGBM does not plateau that early. This potentially allows first using a RankSVM for the cold start and, once enough annotations have been made, LightGBM, thereby combining the best of both models. Comparing the performance on the three datasets, we notice that the performance for AIDA is much higher. Also, the baseline rises much more steeply, hinting again that AIDA is easier and that popularity is a very strong feature there. For 1641, the curves continue to rise, hinting that more data is needed to reach maximum performance.

Dataset  Phase 1  Phase 2  Phase 3
AIDA     0.20     0.00     0.80
WWO      0.26     0.27     0.47
1641     0.55     0.06     0.39

Table 5: Percentage of times the simulated user found the gold entity in the candidate list by searching for the mention (Phase 1), for the first word of the mention (Phase 2) or for the gold label (Phase 3).

Table 5 shows how the simulated user searched for the gold entities. We see that for WWO and 1641, the user often does not need to spend much effort searching for the gold label; using the mention is enough in around 50% of the cases. We attribute this to the fuzzy search, which the official Wikidata endpoint does not offer.

Figure 4 (panels: AIDA-CoNLL, Women Writers, 1641 Depositions; x-axis: number of annotations; y-axis: Accuracy@1 and Accuracy@5 for the MFLE baseline, LightGBM and RankSVM): Human-in-the-loop simulation results for our three datasets and models. We can see that we get good Accuracy@5 with only a few annotations, especially for the RankSVM. This shows that the system is useful even at the beginning of the annotation process, alleviating the cold start problem.

5.4 User Study
In order to validate the viability of our approach in a realistic scenario, we conduct a user study. For that, we augmented the existing annotation tool INCEpTION (https://inception-project.github.io) (Klie et al., 2018) with our Human-In-The-Loop entity ranking and automatic suggestions. Fig. 5 shows a screenshot of the annotation editor itself. We let five users reannotate parts of the 1641 corpus. It was chosen as it has a high density of entity mentions while being small enough to be annotated in under one hour. Users come from various academic backgrounds, e.g. natural language processing, computer science and digital humanities. Roughly half of them have previous experience with annotating. We compare two configurations: one uses our ranking and the Levenshtein recommender; the other uses the ranking of the full-text search with the string matching recommender. We randomly selected eight documents, which we split into two sets of four documents. To reduce bias, we assign users to four groups based on which part and which ranking they use first. Users are given detailed instructions and a warm-up document, not used in the evaluation, to get used to the annotation process. We measure annotation time, the number of suggestions used and the number of search queries performed. After the annotation is finished, we ask users to fill out a survey asking which system they prefer, how they experienced the annotation process and what suggestions they have to improve it. The evaluation of the user study shows that using our approach, users on average annotated 35% faster and needed 15% fewer search queries. Users commented positively on the ranking performance and the annotation suggestions for both systems. For our ranking, users reported that the gold entity often ranked first or close to the top; they rarely observed gold candidates being sorted close to the end of the candidate list. We conduct a paired-sample t-test to estimate the significance of our user study. Our null hypothesis is that the reranking system does not improve the average annotation time. Conducting the test yields t = 3.332, p = 0.029. We therefore reject the null hypothesis with p = 0.029 < 0.05, meaning that we have ample evidence that our reranking speeds up annotation.
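The test itself is a standard paired-sample t-test over per-user annotation times; a sketch with invented measurements (the study's raw times are not given in the text):

```python
# Sketch: paired-sample t-test over per-user annotation times.
# The times below are made up; only t = 3.332, p = 0.029 is reported above.
from scipy import stats

baseline_minutes = [30.1, 25.4, 28.9, 33.2, 27.5]  # without reranking (invented)
reranked_minutes = [19.8, 17.2, 20.1, 22.4, 18.9]  # with reranking (invented)

t, p = stats.ttest_rel(baseline_minutes, reranked_minutes)
print(f"t = {t:.3f}, p = {p:.3f}")  # reject the null hypothesis if p < 0.05
```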
The evaluation of the user study 5 https://inception-project.github.io shows that using our approach, users on average annotated 35% faster and needed 15% less search queries. Users positively commented on the rank- ing performance and the annotation suggestions for both systems. For our ranking, users reported that the gold entity often ranked first or close to top; they rarely observed that gold candidates were sorted close to the end of the candidate list. We conduct a paired sample t-test to estimate the significance of our user study. Our null-hypothesis is that the reranking system does not improve the average annotation time. Conducting the test yields the following: t = 3.332, p = 0.029. We therefore reject the null hypothesis with p = 0.029 < 0.05, meaning that we have ample evidence that our reranking speeds up annotation time. Recommender suggestions made up around 30% of annotations. We did not measure a significant difference between string and Levenshtein recom- mender. About the latter, users liked that it can suggest annotations for inexact matches. How- ever, they criticized the noisier suggestions, espe- cially for shorter mentions (e.g. annotating joabe (a name) yielded suggestions for to be). In the future, we will address this issue by filtering out more potentially unhelpful suggestions and using annotation rejections as a blacklist. 6 Conclusion We presented a domain-agnostic annotation ap- proach for annotating entity linking for low- resource domains. It consists of two main com- https://inception-project.github.io 6990 Figure 5: For our user study, we extend the INCEpTION annotation framework: 1© entity linking search field, 2© candidate list, 3© linked named entity, 4© entity linking recommendation. ponents: recommenders that are algorithms that suggest potential annotations to users and a ranker that, given a mention span, ranks potential entity candidates so that they show up higher in the can- didate list, making it easier to find for users. Both systems are retrained whenever new annotations are made, forming the Human-In-The-Loop. Our approach does not require the existence of external resources like labeled data, tools like named entity recognizers or large-scale resources like Wikipedia. It can be applied to any domain, only requiring a knowledge base whose entities have a label and a description. In this paper, we evaluate on three datasets: AIDA, which is often used to validate state-of-the-art entity linking sys- tems as well as WWO and 1641 from the humanities. We show that in simulation, only a very small sub- set needs to be annotated (fewer than 100) for the ranker to reach high accuracy. In a user study, re- sults show that users prefer our approach compared to the typical annotation process; annotation speed improves by around 35% when using our system relative to using no reranking support. In the future, we want to investigate more power- ful recommenders, combine interactive entity link- ing with knowledge base completion and use online learning to leverage deep models, despite their long training time. Acknowledgments We thank the anonymous reviewers and Kevin Stowe for their detailed and helpful comments. We also want to thank the Women Writers Project which made the Women Writers Online text col- lection available to us. 
This work was supported by the German Research Foundation under grants № EC 503/1-1 and GU 798/21-1, as well as by the German Federal Ministry of Education and Research (BMBF) under the promotional reference 01UG1816B (CEDIFOR).

References
Sebastian Arnold, Robert Dziuba, and Alexander Löser. 2016. TASTY: Interactive Entity Linking As-You-Type. In Proceedings of COLING 2016, the 26th International Conference on Computational Linguistics: System Demonstrations, pages 111–115.
Sabine Bartsch. 2004. Annotating a Corpus for Building a Domain-specific Knowledge Base. In Proceedings of the Fourth International Conference on Language Resources and Evaluation (LREC '04), pages 1669–1672.
Carmen Brando, Francesca Frontini, and Jean-Gabriel Ganascia. 2016. REDEN: Named Entity Linking in Digital Literary Editions Using Linked Data Sets. Complex Systems Informatics and Modeling Quarterly, (7):60–80.
Chris Burges, Tal Shaked, Erin Renshaw, Ari Lazier, Matt Deeds, Nicole Hamilton, and Greg Hullender. 2005. Learning to Rank using Gradient Descent. In Proceedings of the 22nd International Conference on Machine Learning (ICML '05), pages 89–96.
Diego Ceccarelli, Claudio Lucchese, Salvatore Orlando, Raffaele Perego, and Salvatore Trani. 2013. Dexter. In Proceedings of the Sixth International Workshop on Exploiting Semantic Annotations in Information Retrieval (ESAIR '13), pages 17–20.
Koby Crammer and Yoram Singer. 2003. Ultraconservative Online Algorithms for Multiclass Problems. JMLR, 3:951–991.
Joachim Daiber, Max Jakob, Chris Hokamp, and Pablo N. Mendes. 2013. Improving Efficiency and Accuracy in Multilingual Entity Extraction. In Proceedings of the 9th International Conference on Semantic Systems (I-SEMANTICS '13), pages 121–124.
Yadolah Dodge. 2008. The Concise Encyclopedia of Statistics. Springer.
Alexander Erdmann, David Joseph Wrisley, Benjamin Allen, Christopher Brown, Sophie Cohen-Bodénès, Micha Elsner, Yukun Feng, Brian Joseph, Béatrice Joyeux-Prunel, and Marie-Catherine de Marneffe. 2019. Practical, Efficient, and Customizable Active Learning for Named Entity Recognition in the Digital Humanities. In Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics, pages 2223–2234.
Paolo Ferragina and Ugo Scaiella. 2010. TAGME: On-the-fly Annotation of Short Text Fragments (by Wikipedia Entities). In Proceedings of the 19th ACM International Conference on Information and Knowledge Management (CIKM '10), pages 1625–1628.
Yang Gao, Christian M. Meyer, and Iryna Gurevych. 2018. APRIL: Interactively Learning to Summarise by Combining Active Preference Learning and Reinforcement Learning. In Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing, pages 4120–4130.
Stephen Guo, Ming-Wei Chang, and Emre Kiciman. 2013. To Link or Not to Link? A Study on End-to-End Tweet Entity Linking. In Proceedings of the 2013 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, pages 1020–1030.
Isabelle Guyon, Jason Weston, Stephen Barnhill, and Vladimir Vapnik. 2002. Gene Selection for Cancer Classification using Support Vector Machines.
Machine Learning, 46:389–422.
Luheng He, Julian Michael, Mike Lewis, and Luke Zettlemoyer. 2016. Human-in-the-Loop Parsing. In Proceedings of the 2016 Conference on Empirical Methods in Natural Language Processing, pages 2337–2342.
Johannes Hoffart, Mohamed Amir Yosef, Ilaria Bordino, Hagen Fürstenau, Manfred Pinkal, Marc Spaniol, Bilyana Taneva, Stefan Thater, and Gerhard Weikum. 2011. Robust Disambiguation of Named Entities in Text. In Proceedings of EMNLP '11, pages 782–792.
Filip Ilievski, Piek Vossen, and Stefan Schlobach. 2018. Systematic Study of Long Tail Phenomena in Entity Linking. In Proceedings of the 27th International Conference on Computational Linguistics, pages 664–674.
Thorsten Joachims. 2002. Optimizing Search Engines using Clickthrough Data. In Proceedings of the Eighth ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (KDD '02), pages 133–142.
Guolin Ke, Qi Meng, Thomas Finley, Taifeng Wang, Wei Chen, Weidong Ma, Qiwei Ye, and Tie-Yan Liu. 2017. LightGBM: A Highly Efficient Gradient Boosting Decision Tree. In Advances in Neural Information Processing Systems 30, pages 3146–3154.
Jan-Christoph Klie, Michael Bugert, Beto Boullosa, Richard Eckart de Castilho, and Iryna Gurevych. 2018. The INCEpTION Platform: Machine-Assisted and Knowledge-Oriented Interactive Annotation. In Proceedings of the 27th International Conference on Computational Linguistics: System Demonstrations, pages 5–9.
Edgar Meij, Krisztian Balog, and Daan Odijk. 2014. Entity Linking and Retrieval for Semantic Search. In Proceedings of the 7th ACM International Conference on Web Search and Data Mining (WSDM '14), pages 683–684.
John Melson and Julia Flanders. 2010. Not Just One of Your Holiday Games: Names and Name Encoding in the Women Writers Project Textbase. White paper, Women Writers Project, Brown University.
David Milne and Ian H. Witten. 2008. Learning to Link with Wikipedia. In Proceedings of the 17th ACM Conference on Information and Knowledge Management (CIKM '08), pages 509–518.
Gary Munnelly and Séamus Lawless. 2018a. Constructing a Knowledge Base for Entity Linking on Irish Cultural Heritage Collections. Procedia Computer Science, 137:199–210.
Gary Munnelly and Séamus Lawless. 2018b. Investigating Entity Linking in Early English Legal Documents. In Proceedings of the 18th ACM/IEEE Joint Conference on Digital Libraries (JCDL '18), pages 59–68.
Farhad Nooralahzadeh and Lilja Øvrelid. 2018. SIRIUS-LTG: An Entity Linking Approach to Fact Extraction and Verification. In Proceedings of the First Workshop on Fact Extraction and VERification (FEVER), pages 119–123.
Nils Reimers and Iryna Gurevych. 2019. Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks. In Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing, pages 3980–3990.
Matthias Schlögl and Katalin Lejtovicz. 2017. APIS – Austrian Prosopographical Information System. In Proceedings of the Second Conference on Biographical Data in a Digital World 2017.
Klaus U. Schulz and Stoyan Mihov. 2002. Fast String Correction with Levenshtein Automata. International Journal on Document Analysis and Recognition, 5(1):67–85.
Wei Shen, Jianyong Wang, and Jiawei Han. 2015. Entity Linking with a Knowledge Base: Issues, Techniques, and Solutions. IEEE Transactions on Knowledge and Data Engineering, 27(2):443–460.
Ricardo Usbeck, Axel-Cyrille Ngonga Ngomo, Michael Röder, Daniel Gerber, Sandro Athaide Coelho, Sören Auer, and Andreas Both. 2014. AGDISTIS – Graph-Based Disambiguation of Named Entities Using Linked Data. In The Semantic Web – ISWC 2014, pages 457–471.
Eric Wallace, Pedro Rodriguez, Shi Feng, Ikuya Yamada, and Jordan Boyd-Graber. 2019. Trick Me If You Can: Human-in-the-Loop Generation of Adversarial Question Answering Examples. Transactions of the Association for Computational Linguistics, 7:387–401.
Travis Wolfe, Mark Dredze, James Mayfield, Paul McNamee, Craig Harman, Tim Finin, and Benjamin Van Durme. 2015. Interactive Knowledge Base Population. arXiv:1506.00301.
Seid Muhie Yimam, Chris Biemann, Richard Eckart de Castilho, and Iryna Gurevych. 2014. Automatic Annotation Suggestions and Custom Annotation Layers in WebAnno.
A Appendices

A.1 Dataset creation

The following section describes how we preprocess the raw texts from WWO and 1641. Example texts can be found in Table 6. The respective code and datasets will be made available upon acceptance.

A.1.1 Women Writers Online

We use the following checkout of the WWO data, which was graciously provided by the Women Writers Project6 (Revision: 36425; Last Changed Rev: 36341; Last Changed Date: 2019-02-19).

6 https://www.wwp.northeastern.edu/

The texts themselves are provided as TEI7. We use DKPro Core8 to read in the TEI, split the raw text into sentences, and tokenize it with the JTokSegmenter. When an annotation spans two sentences, we merge these sentences; such splits are mostly caused by an overly eager sentence splitter. We convert the personography, which is encoded in XML, to RDF, including all properties that were encoded in it.

A.1.2 1641 Depositions

We use a subset of the 1641 depositions provided by Gary Munnelly. The raw data can be found on GitHub9. The texts themselves are provided as NIF10. We use DKPro Core11 to read in the NIF, split the raw text into sentences, and tokenize it with the JTokSegmenter. As with WWO, when an annotation spans two sentences, we merge these sentences. We use the knowledge base that comes with the NIF and create entities for all mentions that were NIL. We carefully deduplicate entities, e.g., Luke Toole and Colonel Toole are mapped to the same entity. In order to increase the difficulty of this dataset, we add additional entities from DBPedia: all Irish people, Irish cities and buildings in Ireland, all popes, and royalty born between 1550 and 1650. For that, we execute SPARQL queries against DBPedia for instances of dbc:Popes, dbc:Royality, and dbc:17th-century Irish people, and keep entries with a birth date before 1650 and a death date between 1600 and 1700. For the places, we search for dbo:Castle, dbo:HistoricPlace, and dbo:Building instances that are located in Ireland.
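The exact queries are not reproduced in the paper. As a rough illustration of this augmentation step, the following Python sketch runs one such query against the public DBPedia SPARQL endpoint using the SPARQLWrapper library. The endpoint URL, the underscore forms of the category identifiers (e.g., dbc:17th-century_Irish_people), and the precise date filters are assumptions inferred from the description above, not the authors' code.

```python
# Hedged sketch of the DBPedia augmentation step: fetch 17th-century Irish
# people with plausible birth/death dates. Category and property names are
# assumptions inferred from the prose above, not the paper's actual queries.
from SPARQLWrapper import SPARQLWrapper, JSON

ENDPOINT = "https://dbpedia.org/sparql"

QUERY = """
PREFIX xsd: <http://www.w3.org/2001/XMLSchema#>
PREFIX dct: <http://purl.org/dc/terms/>
PREFIX dbo: <http://dbpedia.org/ontology/>
PREFIX dbc: <http://dbpedia.org/resource/Category:>

SELECT DISTINCT ?person ?birth ?death WHERE {
  ?person dct:subject dbc:17th-century_Irish_people ;
          dbo:birthDate ?birth ;
          dbo:deathDate ?death .
  FILTER (?birth < "1650-01-01"^^xsd:date &&
          ?death >= "1600-01-01"^^xsd:date &&
          ?death <  "1700-01-01"^^xsd:date)
}
"""

def fetch_candidate_entities():
    """Return (URI, birth, death) triples for knowledge-base augmentation."""
    sparql = SPARQLWrapper(ENDPOINT)
    sparql.setQuery(QUERY)
    sparql.setReturnFormat(JSON)
    bindings = sparql.query().convert()["results"]["bindings"]
    return [(b["person"]["value"], b["birth"]["value"], b["death"]["value"])
            for b in bindings]

if __name__ == "__main__":
    for uri, birth, death in fetch_candidate_entities():
        print(uri, birth, death)
```

Analogous queries over dbc:Popes and over the dbo place classes, restricted to Ireland via a location property, would produce the remaining distractor entities.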
The following table shows how many entities were in the original KB and how many were added:

Persons in gold data: 130
Places in gold data: 99
Persons added from DBPedia: 1,253
Places added from DBPedia: 2,020

7 https://tei-c.org/
8 https://dkpro.github.io/dkpro-core/
9 https://github.com/munnellg/1641DepositionsCorpus
10 https://persistence.uni-leipzig.org/nlp2rdf/
11 https://dkpro.github.io/dkpro-core/

Table 6: Example sentences from these corpora (linked named entities are highlighted in the original).

WWO: "The following Lines occasion'd by the Marriage of Edward Herbert Esquire, and Mrs. Elizabeth Herbert. Cupid one day ask'd his Mother, When she meant that he shou'd Wed? You're too Young, my Boy, she said: Nor has Nature made another Fit to match with Cupid's Bed." (Finch, Anne: Miscellany poems, on several occasions, 1713)

1641: "Joseph Joice of Kisnebrasney in the kings County gentleman sworne and examined deposeth and saith That after the Rebellion was begun in the County aforesaid vizt about the xxth of November 1641 This deponent for saffty fled to the Castle of knocknamease in the same County" (Deposition of Joseph Joice, 164312)

A.2 Experiments

A.2.1 Full text search

For AIDA and Wikidata, we use the official SPARQL endpoint and the Mediawiki API Query Service13, which does not support fuzzy search. For WWO and 1641, we host the created RDF in a Fuseki14 instance and use the built-in functionality to index it via Lucene.

A.2.2 Timing

Timing was performed on a desktop PC with a Ryzen 3600 CPU and a GeForce RTX 2060 GPU.

13 https://www.mediawiki.org/wiki/Wikidata_Query_Service/User_Manual/MWAPI
14 https://jena.apache.org/documentation/fuseki2/

stanford-ai-2021 ---- Artificial Intelligence Index Report 2021

INTRODUCTION TO THE 2021 AI INDEX REPORT

Welcome to the fourth edition of the AI Index Report! This year we significantly expanded the amount of data available in the report, worked with a broader set of external organizations to calibrate our data, and deepened our connections with Stanford's Institute for Human-Centered Artificial Intelligence (HAI). The AI Index Report tracks, collates, distills, and visualizes data related to artificial intelligence.
Its mission is to provide unbiased, rigorously vetted, and globally sourced data for policymakers, researchers, executives, journalists, and the general public to develop intuitions about the complex field of AI. The report aims to be the world's most credible and authoritative source for data and insights about AI.

COVID AND AI

The 2021 report shows the effects of COVID-19 on AI development from multiple perspectives. The Technical Performance chapter discusses how an AI startup used machine-learning-based techniques to accelerate COVID-related drug discovery during the pandemic, and our Economy chapter suggests that AI hiring and private investment were not significantly adversely influenced by the pandemic, as both grew during 2020. If anything, COVID-19 may have led to a higher number of people participating in AI research conferences, as the pandemic forced conferences to shift to virtual formats, which in turn led to significant spikes in attendance.

CHANGES FOR THIS EDITION

In 2020, we surveyed more than 140 readers from government, industry, and academia about what they found most useful about the report and what we should change. The main suggested areas for improvement were:

• Technical performance: We significantly expanded this chapter in 2021, carrying out more of our own analysis.
• Diversity and ethics data: We gathered more data for this year's report, although our investigation surfaced several areas where the AI community currently lacks good information.
• Country comparisons: Readers were generally interested in being able to use the AI Index for cross-country comparisons. To support this, we:
  • gathered more data to allow for comparison among countries, especially relating to economics and bibliometrics; and
  • included a thorough summary of the various AI strategies adopted by different countries and how they evolved over time.

PUBLIC DATA AND TOOLS

The AI Index 2021 Report is supplemented by raw data and an interactive tool. We invite each member of the AI community to use the data and tool in a way most relevant to their work and interests.

• Raw data and charts: The public data and high-resolution images of all the charts in the report are available on Google Drive (https://drive.google.com/drive/folders/1YY9rj8bGSJDLgIq09FwmF2y1k_FazJUm).
• Global AI Vibrancy Tool: We revamped the Global AI Vibrancy Tool (https://aiindex.stanford.edu/vibrancy) this year, allowing for better interactive visualization when comparing up to 26 countries across 22 indicators. The updated tool provides transparent evaluation of the relative position of countries based on users' preference; identifies relevant national indicators to guide policy priorities at a country level; and shows local centers of AI excellence for not just advanced economies but also emerging markets.
• Issues in AI measurement: In fall 2020, we published "Measurement in AI Policy: Opportunities and Challenges" (https://arxiv.org/abs/2009.09071), a report that lays out a variety of AI measurement issues discussed at a conference hosted by the AI Index in fall 2019.
Table of Contents

Introduction to the 2021 AI Index Report
Top 9 Takeaways
AI Index Steering Committee & Staff
How to Cite the Report
Acknowledgments
Report Highlights
Chapter 1: Research and Development
Chapter 2: Technical Performance
Chapter 3: The Economy
Chapter 4: AI Education
Chapter 5: Ethical Challenges of AI Applications
Chapter 6: Diversity in AI
Chapter 7: AI Policy and National Strategies
Appendix

TOP 9 TAKEAWAYS

1. AI investment in drug design and discovery increased significantly: "Drugs, Cancer, Molecular, Drug Discovery" received the greatest amount of private AI investment in 2020, with more than USD 13.8 billion, 4.5 times higher than 2019.

2. The industry shift continues: In 2019, 65% of graduating North American PhDs in AI went into industry—up from 44.4% in 2010, highlighting the greater role industry has begun to play in AI development.

3. Generative everything: AI systems can now compose text, audio, and images to a sufficiently high standard that humans have a hard time telling the difference between synthetic and non-synthetic outputs for some constrained applications of the technology.

4. AI has a diversity challenge: In 2019, 45% of new U.S. resident AI PhD graduates were white—by comparison, 2.4% were African American and 3.2% were Hispanic.

5. China overtakes the US in AI journal citations: After surpassing the United States in the total number of journal publications several years ago, China now also leads in journal citations; however, the United States has consistently (and significantly) more AI conference papers (which are also more heavily cited) than China over the last decade.

6. The majority of US AI PhD grads are from abroad—and they're staying in the US: The percentage of international students among new AI PhDs in North America continued to rise in 2019, to 64.3%—a 4.3% increase from 2018. Among foreign graduates, 81.8% stayed in the United States and 8.6% have taken jobs outside the United States.

7. Surveillance technologies are fast, cheap, and increasingly ubiquitous: The technologies necessary for large-scale surveillance are rapidly maturing, with techniques for image classification, face recognition, video analysis, and voice identification all seeing significant progress in 2020.

8. AI ethics lacks benchmarks and consensus: Though a number of groups are producing a range of qualitative or normative outputs in the AI ethics domain, the field generally lacks benchmarks that can be used to measure or assess the relationship between broader societal discussions about technology development and the development of the technology itself. Furthermore, researchers and civil society view AI ethics as more important than industrial organizations do.

9. AI has gained the attention of the U.S. Congress: The 116th Congress is the most AI-focused congressional session in history, with the number of mentions of AI in the congressional record more than triple that of the 115th Congress.
AI INDEX STEERING COMMITTEE & STAFF

AI Index Steering Committee
Co-Directors: Jack Clark (OECD, GPAI); Raymond Perrault (SRI International)
Members: Erik Brynjolfsson (Stanford University); John Etchemendy (Stanford University); Deep Ganguli (Stanford University); Barbara Grosz (Harvard University); Terah Lyons (Partnership on AI); James Manyika (McKinsey Global Institute); Juan Carlos Niebles (Stanford University); Michael Sellitto (Stanford University); Yoav Shoham (Founding Director; Stanford University, AI21 Labs)

AI Index Staff
Research Manager and Editor in Chief: Daniel Zhang (Stanford University)
Program Manager: Saurabh Mishra (Stanford University)

How to Cite This Report

Daniel Zhang, Saurabh Mishra, Erik Brynjolfsson, John Etchemendy, Deep Ganguli, Barbara Grosz, Terah Lyons, James Manyika, Juan Carlos Niebles, Michael Sellitto, Yoav Shoham, Jack Clark, and Raymond Perrault, "The AI Index 2021 Annual Report," AI Index Steering Committee, Human-Centered AI Institute, Stanford University, Stanford, CA, March 2021.

The AI Index 2021 Annual Report by Stanford University is licensed under Attribution-NoDerivatives 4.0 International. To view a copy of this license, visit http://creativecommons.org/licenses/by-nd/4.0/. The AI Index is an independent initiative at Stanford University's Human-Centered Artificial Intelligence Institute (HAI). We thank our supporting partners. We welcome feedback and new ideas for next year. Contact us at AI-Index-Report@stanford.edu. The AI Index was conceived within the One Hundred Year Study on AI (AI100) (https://ai100.stanford.edu/).

Acknowledgments

We appreciate the following organizations and individuals who provided data, analysis, advice, and expert commentary for inclusion in the AI Index 2021 Report:

Organizations
arXiv: Jim Entwood, Paul Ginsparg, Joe Halpern, Eleonora Presani
AI Ethics Lab: Cansu Canca, Yasemin Usta
Black in AI: Rediet Abebe, Hassan Kane
Bloomberg Government: Chris Cornillie
Burning Glass Technologies: Layla O'Kane, Bledi Taska, Zhou Zhou
Computing Research Association: Andrew Bernat, Susan Davidson
Elsevier: Clive Bastin, Jörg Hellwig, Sarah Huggett, Mark Siebert
Intento: Grigory Sapunov, Konstantin Savenkov
International Federation of Robotics: Susanne Bieller, Jeff Burnstein
Joint Research Center, European Commission: Giuditta De Prato, Montserrat López Cobo, Riccardo Righi
LinkedIn: Guy Berger, Mar Carpanelli, Di Mo, Virginia Ramsey
Liquidnet: Jeffrey Banner, Steven Nichols
McKinsey Global Institute: Brittany Presten
Microsoft Academic Graph: Iris Shen, Kuansan Wang
National Institute of Standards and Technology: Patrick Grother
Nesta: Joel Klinger, Juan Mateos-Garcia, Kostas Stathoulopoulos
NetBase Quid: Zen Ahmed, Scott Cohen, Julie Kim
PostEra: Aaron Morris
Queer in AI: Raphael Gontijo Lopes
State of AI Report: Nathan Benaich, Ian Hogarth
Women in Machine Learning: Sarah Tan, Jane Wang

Individuals
ActivityNet: Fabian Caba (Adobe Research); Bernard Ghanem (King Abdullah University of Science and Technology); Cees Snoek (University of Amsterdam)
AI Brain Drain and Faculty Departure: Michael Gofman (University of Rochester); Zhao Jin (Cheung Kong Graduate School of Business)
Automated Theorem Proving: Geoff Sutcliffe (University of Miami); Christian Suttner (Connion GmbH)
Boolean Satisfiability Problem: Lars Kotthoff (University of Wyoming)
Corporate Representation at AI Research Conferences: Nuruddin Ahmed (Ivey Business School, Western University); Muntasir Wahed (Virginia Tech)
Conference Attendance: Maria Gini, Gita Sukthankar (AAMAS); Carol Hamilton (AAAI); Dan Jurafsky (ACL); Walter Scheirer, Ramin Zabih (CVPR); Jörg Hoffmann, Erez Karpas (ICAPS); Paul Oh (IROS); Pavlos Peppas, Michael Thielscher (KR)
Ethics at AI Conferences: Pedro Avelar, Luis Lamb, Marcelo Prates (Federal University of Rio Grande do Sul)
ImageNet: Lucas Beyer, Alexey Dosovitskiy, Neil Houlsby (Google)
MLPerf/DAWNBench: Cody Coleman (Stanford University); Peter Mattson (Google)
Molecular Synthesis: Philippe Schwaller (IBM Research–Europe)
Visual Question Answering: Dhruv Batra, Devi Parikh (Georgia Tech/FAIR); Ayush Shrivastava (Georgia Tech)
You Only Look Once: Xiang Long (Baidu)

Advice and Expert Commentary
Alexey Bochkovskiy; Baidu's PaddlePaddle Computer Vision Team; Chenggang Xu (Cheung Kong Graduate School of Business); Mohammed AlQuraishi (Columbia University); Evan Schnidman (EAS Innovation); Fangzhen Lin (Hong Kong University of Science and Technology); David Kanter (MLCommons); Sam Bowman (New York University); Maneesh Agrawala, Jeannette Bohg, Emma Brunskill, Chelsea Finn, Aditya Grover, Tatsunori Hashimoto, Dan Jurafsky, Percy Liang, Sharon Zhou (Stanford University); Vamsi Sistla (University of California, Berkeley); Simon King (University of Edinburgh); Ivan Goncharov (Weights & Biases)

Graduate Researchers
Ankita Banerjea, Yu-chi Tsao (Stanford University)

Report and Website
Michi Turner (report graphic design and cover art); Nancy King (report editor); Michael Taylor (report data visualization); Kevin Litman-Navarro (Global AI Vibrancy Tool design and development); Travis Taylor (AI Index website design); Digital Avenues (AI Index website development)

REPORT HIGHLIGHTS

CHAPTER 1: RESEARCH & DEVELOPMENT

• The number of AI journal publications grew by 34.5% from 2019 to 2020—a much higher percentage growth than from 2018 to 2019 (19.6%).
• In every major country and region, the highest proportion of peer-reviewed AI papers comes from academic institutions. But the second most important originators are different: In the United States, corporate-affiliated research represents 19.2% of the total publications, whereas government is the second most important in China (15.6%) and the European Union (17.2%).
• In 2020, and for the first time, China surpassed the United States in the share of AI journal citations in the world, having briefly overtaken the United States in the overall number of AI journal publications in 2004 and then retaken the lead in 2017. However, the United States has consistently (and significantly) more cited AI conference papers than China over the last decade.
• In response to COVID-19, most major AI conferences took place virtually and registered a significant increase in attendance as a result. The number of attendees across nine conferences almost doubled in 2020.
• In just the last six years, the number of AI-related publications on arXiv grew by more than sixfold, from 5,478 in 2015 to 34,736 in 2020.
• AI publications represented 3.8% of all peer-reviewed scientific publications worldwide in 2019, up from 1.3% in 2011.
CHAPTER 2: TECHNICAL PERFORMANCE

• Generative everything: AI systems can now compose text, audio, and images to a sufficiently high standard that humans have a hard time telling the difference between synthetic and non-synthetic outputs for some constrained applications of the technology. That promises to generate a tremendous range of downstream applications of AI for both socially useful and less useful purposes. It is also causing researchers to invest in technologies for detecting generative models; the DeepFake Detection Challenge data indicates how well computers can distinguish between different outputs.
• The industrialization of computer vision: Computer vision has seen immense progress in the past decade, primarily due to the use of machine learning techniques (specifically deep learning). New data shows that computer vision is industrializing: Performance is starting to flatten on some of the largest benchmarks, suggesting that the community needs to develop and agree on harder ones that further test performance. Meanwhile, companies are investing increasingly large amounts of computational resources to train computer vision systems at a faster rate than ever before, and technologies for use in deployed systems—like object-detection frameworks for analysis of still frames from videos—are maturing rapidly, indicating further AI deployment.
• Natural Language Processing (NLP) outruns its evaluation metrics: Rapid progress in NLP has yielded AI systems with significantly improved language capabilities that have started to have a meaningful economic impact on the world. Google and Microsoft have both deployed the BERT language model into their search engines, while other large language models have been developed by companies ranging from Microsoft to OpenAI. Progress in NLP has been so swift that technical advances have started to outpace the benchmarks to test for them. This can be seen in the rapid emergence of systems that obtain human-level performance on SuperGLUE, an NLP evaluation suite developed in response to earlier NLP progress overshooting the capabilities being assessed by GLUE.
• New analyses on reasoning: Most measures of technical problems show, for each time point, the performance of the best system at that time on a fixed benchmark. New analyses developed for the AI Index offer metrics that allow for an evolving benchmark, and for the attribution to individual systems of credit for a share of the overall performance of a group of systems over time. These are applied to two symbolic reasoning problems: Automated Theorem Proving and Satisfiability of Boolean formulas.
• Machine learning is changing the game in healthcare and biology: The landscape of the healthcare and biology industries has evolved substantially with the adoption of machine learning. DeepMind's AlphaFold applied deep learning techniques to make a significant breakthrough in the decades-long biology challenge of protein folding. Scientists use ML models to learn representations of chemical molecules for more effective chemical synthesis planning. PostEra, an AI startup, used ML-based techniques to accelerate COVID-related drug discovery during the pandemic.

CHAPTER 3: THE ECONOMY

• "Drugs, Cancer, Molecular, Drug Discovery" received the greatest amount of private AI investment in 2020, with more than USD 13.8 billion, 4.5 times higher than in 2019.
• Brazil, India, Canada, Singapore, and South Africa are the countries with the highest growth in AI hiring from 2016 to 2020. Despite the COVID-19 pandemic, AI hiring continued to grow across sample countries in 2020.
• More private investment in AI is being funneled into fewer startups. Despite the pandemic, 2020 saw a 9.3% increase in the amount of private AI investment from 2019—a higher percentage increase than from 2018 to 2019 (5.7%), though the number of newly funded companies decreased for the third year in a row.
• Despite growing calls to address ethical concerns associated with using AI, efforts to address these concerns in the industry are limited, according to a McKinsey survey. For example, issues such as equity and fairness in AI continue to receive comparatively little attention from companies. Moreover, fewer companies in 2020 viewed personal or individual privacy risks as relevant compared with 2019, and there was no change in the percentage of respondents whose companies are taking steps to mitigate these particular risks.
• Despite the economic downturn caused by the pandemic, half the respondents in a McKinsey survey said that the coronavirus had no effect on their investment in AI, while 27% actually reported increasing their investment. Less than a fourth of businesses decreased their investment in AI.
• The United States recorded a decrease in its share of AI job postings from 2019 to 2020—the first drop in six years. The total number of AI jobs posted in the United States also decreased by 8.2%, from 325,724 in 2019 to 300,999 in 2020.

CHAPTER 4: AI EDUCATION

• An AI Index survey conducted in 2020 suggests that the world's top universities have increased their investment in AI education over the past four years. The number of courses that teach students the skills necessary to build or deploy a practical AI model at the undergraduate and graduate levels has increased by 102.9% and 41.7%, respectively, over the last four academic years.
• More AI PhD graduates in North America chose to work in industry in the past 10 years, while fewer opted for jobs in academia, according to an annual survey from the Computing Research Association (CRA). The share of new AI PhDs who chose industry jobs increased by 48% in the past decade, from 44.4% in 2010 to 65.7% in 2019. By contrast, the share of new AI PhDs entering academia dropped by 44%, from 42.1% in 2010 to 23.7% in 2019.
• In the last 10 years, AI-related PhDs have gone from 14.2% of all CS PhDs granted in the United States to around 23% as of 2019, according to the CRA survey. At the same time, other previously popular CS PhD specializations have declined in popularity, including networking, software engineering, and programming languages/compilers, all of which saw a reduction in PhDs granted relative to 2010, while the AI and Robotics/Vision specializations saw a substantial increase.
• After a two-year increase, the number of AI faculty departures from universities to industry jobs in North America dropped from 42 in 2018 to 33 in 2019 (28 of these were tenured faculty and five were untenured). Carnegie Mellon University had the largest number of AI faculty departures between 2004 and 2019 (16), followed by the Georgia Institute of Technology (14) and the University of Washington (12).
• The percentage of international students among new AI PhDs in North America continued to rise in 2019, to 64.3%—a 4.3% increase from 2018.
Among foreign graduates, 81.8% stayed in the United States and 8.6% took jobs outside the United States.
• In the European Union, the vast majority of specialized AI academic offerings are taught at the master's level; robotics and automation is by far the most frequently taught course in the specialized bachelor's and master's programs, while machine learning (ML) dominates in the specialized short courses.

CHAPTER 5: ETHICAL CHALLENGES OF AI APPLICATIONS

• The number of papers with ethics-related keywords in titles submitted to AI conferences has grown since 2015, though the average number of paper titles matching ethics-related keywords at major AI conferences remains low over the years.
• The five news topics that received the most attention in 2020 related to the ethical use of AI were the release of the European Commission's white paper on AI, Google's dismissal of ethics researcher Timnit Gebru, the AI ethics committee formed by the United Nations, the Vatican's AI ethics plan, and IBM's exit from the facial-recognition business.

CHAPTER 6: DIVERSITY IN AI

• The percentages of female AI PhD graduates and tenure-track computer science (CS) faculty have remained low for more than a decade. Female graduates of AI PhD programs in North America have accounted for less than 18% of all PhD graduates on average, according to an annual survey from the Computing Research Association (CRA). An AI Index survey suggests that female faculty make up just 16% of all tenure-track CS faculty at several universities around the world.
• The CRA survey suggests that in 2019, among new U.S. resident AI PhD graduates, 45% were white, while 22.4% were Asian, 3.2% were Hispanic, and 2.4% were African American.
• The percentage of white (non-Hispanic) new computing PhDs has changed little over the last 10 years, accounting for 62.7% on average. The share of Black or African American (non-Hispanic) and Hispanic computing PhDs in the same period is significantly lower, with averages of 3.1% and 3.3%, respectively.
• Participation in Black in AI workshops, which are co-located with the Conference on Neural Information Processing Systems (NeurIPS), has grown significantly in recent years. The numbers of attendees and submitted papers in 2019 were 2.6 times higher than in 2017, while the number of accepted papers was 2.1 times higher.
• In a membership survey by Queer in AI in 2020, almost half the respondents said they view the lack of inclusiveness in the field as an obstacle they have faced in becoming a practitioner in the AI/ML field. More than 40% of members surveyed said they have experienced discrimination or harassment at work or school.

CHAPTER 7: AI POLICY AND NATIONAL STRATEGIES

• Since Canada published the world's first national AI strategy in 2017, more than 30 other countries and regions have published similar documents as of December 2020.
• The launch of the Global Partnership on AI (GPAI) and the Organisation for Economic Co-operation and Development (OECD) AI Policy Observatory and Network of Experts on AI in 2020 promoted intergovernmental efforts to work together to support the development of AI for all.
• In the United States, the 116th Congress was the most AI-focused congressional session in history.
The number of mentions of AI by this Congress in legislation, committee reports, and Congressional Research Service (CRS) reports is more than triple that of the 115th Congress.

CHAPTER 1: Research & Development

Chapter Preview: Overview; Chapter Highlights; 1.1 Publications (Peer-Reviewed AI Publications; AI Journal Publications; AI Conference Publications; AI Patents; arXiv Publications); 1.2 Conferences (Conference Attendance; Highlight: Corporate Representation at AI Research Conferences); 1.3 AI Open-Source Software Libraries (GitHub Stars)

OVERVIEW

The report opens with an overview of the research and development (R&D) efforts in artificial intelligence (AI) because R&D is fundamental to AI progress. Since the technology first captured the imagination of computer scientists and mathematicians in the 1950s, AI has grown into a major research discipline with significant commercial applications. The number of AI publications has increased dramatically in the past 20 years. The rise of AI conferences and preprint archives has expanded the dissemination of research and scholarly communications. Major powers, including China, the European Union, and the United States, are racing to invest in AI research. The R&D chapter aims to capture the progress in this increasingly complex and competitive field.

This chapter begins by examining AI publications—from peer-reviewed journal articles to conference papers and patents, including the citation impact of each—using data from the Elsevier/Scopus and Microsoft Academic Graph (MAG) databases, as well as data from the arXiv paper preprint repository and Nesta. It examines contributions to AI R&D from major AI entities and geographic regions and considers how those contributions are shaping the field. The second and third sections discuss R&D activities at major AI conferences and on GitHub.
1.1 PUBLICATIONS

AI publications include peer-reviewed publications, journal articles, conference papers, and patents. To track trends among these publications and to assess the state of AI R&D activities around the world, the following datasets were used: the Elsevier/Scopus database for peer-reviewed publications; the Microsoft Academic Graph (MAG) database for all journals, conference papers, and patent publications; and arXiv and Nesta data for electronic preprints.

PEER-REVIEWED AI PUBLICATIONS

This section presents data from the Scopus database by Elsevier. Scopus contains 70 million peer-reviewed research items curated from more than 5,000 international publishers. The 2019 version of the data shown below is derived from an entirely new set of publications, so figures for all peer-reviewed AI publications differ from those in previous years' AI Index reports. Due to changes in the methodology for indexing publications, the accuracy of the dataset increased from 80% to 84% (see the Appendix for more details).

Overview

Figure 1.1.1a shows the number of peer-reviewed AI publications, and Figure 1.1.1b shows the share of those among all peer-reviewed publications in the world. The total number of publications grew by nearly 12 times between 2000 and 2019. Over the same period, the share of peer-reviewed AI publications increased from 0.82% of all publications in 2000 to 3.8% in 2019.

[Figure 1.1.1a: Number of peer-reviewed AI publications (in thousands), 2000-19. Source: Elsevier/Scopus, 2020]

By Region1

Among the total number of peer-reviewed AI publications in the world, East Asia & Pacific has held the largest share since 2004, followed by Europe & Central Asia and North America (Figure 1.1.2). Between 2009 and 2019, South Asia and Sub-Saharan Africa experienced the highest growth in the number of peer-reviewed AI publications, increasing by eight- and sevenfold, respectively.

1 Regions in this chapter are classified according to the World Bank analytical grouping (https://datatopics.worldbank.org/world-development-indicators/images/figures-png/world-by-region-map.pdf).
[Figure 1.1.1b: Peer-reviewed AI publications as a share of all publications, 2000-19; 3.8% in 2019. Source: Elsevier/Scopus, 2020]

[Figure 1.1.2: Peer-reviewed AI publications (% of world total) by region, 2000-19; 2019 shares: East Asia & Pacific 36.9%, Europe & Central Asia 25.1%, North America 17.0%, South Asia 8.8%, Middle East & North Africa 5.5%, Latin America & Caribbean 2.7%, Sub-Saharan Africa 0.7%. Source: Microsoft Academic Graph, 2020]

By Geographic Area

To compare the activity among the world's major AI players, this section shows trends in peer-reviewed AI publications coming out of China, the European Union, and the United States. As of 2019, China led in the share of peer-reviewed AI publications in the world, after overtaking the European Union in 2017 (Figure 1.1.3). It published 3.5 times more peer-reviewed AI papers in 2019 than it did in 2014—while the European Union published just 2 times more papers and the United States 2.75 times more over the same period.

[Figure 1.1.3: Peer-reviewed AI publications (% of world total) by geographic area, 2000-19; 2019 shares: China 22.4%, EU 16.4%, US 14.6%. Source: Elsevier/Scopus, 2020]

By Institutional Affiliation

The following charts show the number of peer-reviewed AI publications affiliated with corporate, government, medical, and other institutions in China (Figure 1.1.4a), the European Union (Figure 1.1.4b), and the United States (Figure 1.1.4c).2 In 2019, roughly 95.4% of overall peer-reviewed AI publications in China were affiliated with academia, compared with 81.9% in the European Union and 89.6% in the United States. Those affiliation categories are not mutually exclusive, as some authors could be affiliated with more than one type of institution.

[Figure 1.1.4a: Number of peer-reviewed AI publications in China by institutional affiliation, 2000-19; 2019 values: Government 4,352, Corporate 1,675, Medical 382, Other 14. Source: Elsevier/Scopus, 2020]
The data suggests that, excluding academia, government institutions—more than those in other categories—consistently contribute the highest percentage of peer-reviewed AI publications in both China and the European Union (15.6% and 17.2%, respectively, in 2019), while in the United States the highest portion is corporate-affiliated (19.2%).

2 Across all three geographic areas, the number of papers affiliated with academia exceeds that of government-, corporate-, and medical-affiliated ones; therefore, the academia affiliation is not shown, as it would distort the graphs.

[Figure 1.1.4b: Number of peer-reviewed AI publications in the European Union by institutional affiliation, 2000-19; 2019 values: Government 3,523, Corporate 1,594, Medical 508, Other 187. Source: Elsevier/Scopus, 2020]

[Figure 1.1.4c: Number of peer-reviewed AI publications in the United States by institutional affiliation, 2000-19; 2019 values: Corporate 3,513, Government 2,277, Medical 718, Other 120. Source: Elsevier/Scopus, 2020]

Academic-Corporate Collaboration

Since the 1980s, R&D collaboration between academia and industry in the United States has grown in importance and popularity, made visible by the proliferation of industry-university research centers as well as corporate contributions to university research. Figure 1.1.5 shows that between 2015 and 2019, the United States produced the highest number of hybrid academic-corporate, co-authored, peer-reviewed AI publications—more than double the amount in the European Union, which comes in second, followed by China in third place.

[Figure 1.1.5: Number of academic-corporate peer-reviewed AI publications by geographic area, 2015-19 (sum); top producers include the United States, the European Union, China, the United Kingdom, Germany, Japan, France, Canada, South Korea, the Netherlands, Switzerland, India, Hong Kong, Spain, and Italy. Source: Elsevier/Scopus, 2020]
To assess how academic-corporate collaborations affect the Field-Weighted Citation Impact (FWCI) of AI publications from different geographic regions, see Figure 1.1.6. FWCI measures how the number of citations received by a set of publications compares with the average number of citations received by similar publications in the same year, discipline, and format (book, article, conference paper, etc.). A value of 1.0 represents the world average; values above or below 1.0 mean publications are cited more or less than expected relative to that average. For example, an FWCI of 0.75 means 25% fewer citations than the world average. The chart shows the FWCI for all peer-reviewed AI publications on the y-axis and the total number (on a log scale) of academic-corporate co-authored publications on the x-axis. To increase the signal-to-noise ratio of the FWCI metric, only countries that had more than 1,000 peer-reviewed AI publications in 2020 are included.

[Figure 1.1.6: Peer-reviewed AI publications' Field-Weighted Citation Impact versus number of academic-corporate peer-reviewed AI publications (log scale), 2019; countries shown include the United States, China, the European Union, the United Kingdom, Germany, Japan, and others. Source: Elsevier/Scopus, 2020]
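As a concrete illustration of the FWCI definition above, the following toy sketch expresses the metric as a ratio. It is not Elsevier's implementation; the real expected-citation baselines are computed over Scopus-wide cohorts, so the function name and the worked numbers here are illustrative assumptions only.

```python
def fwci(citations: int, expected_citations: float) -> float:
    """Field-Weighted Citation Impact: actual citations divided by the
    average citations of comparable publications (same publication year,
    discipline, and format). A value of 1.0 is the world average."""
    if expected_citations <= 0:
        raise ValueError("expected-citation baseline must be positive")
    return citations / expected_citations

# A paper cited 9 times, where comparable papers average 12 citations,
# has an FWCI of 9 / 12 = 0.75, i.e., 25% fewer citations than the
# world average, matching the example in the text.
assert fwci(9, 12) == 0.75
```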
AI JOURNAL PUBLICATIONS

The next three sections chart the trends in AI journal publications, conference publications, and patents, as well as their respective citations, which provide a signal of R&D impact, based on data from Microsoft Academic Graph. MAG3 is a knowledge graph consisting of more than 225 million publications (as of the end of November 2019).

3 See "An Overview of Microsoft Academic Service (MAS) and Applications" (https://www.microsoft.com/en-us/research/publication/an-overview-of-microsoft-academic-service-mas-and-applications-2/) and "A Review of Microsoft Academic Services for Science of Science Studies" (https://www.microsoft.com/en-us/research/publication/a-review-of-microsoft-academic-services-for-science-of-science-studies/) for more details.

Overview

Overall, the number of AI journal publications in 2020 was 5.4 times higher than in 2000 (Figure 1.1.7a). In 2020, the number of AI journal publications increased by 34.5% from 2019—a much higher percentage growth than from 2018 to 2019 (19.6%). Similarly, the share of AI journal publications among all publications in the world jumped by 0.4 percentage points in 2020, higher than the average annual change of 0.03 percentage points over the past five years (Figure 1.1.7b).

[Figure 1.1.7a: Number of AI journal publications (in thousands), 2000-20. Source: Microsoft Academic Graph, 2020]

[Figure 1.1.7b: AI journal publications as a share of all journal publications, 2000-20; 2.2% in 2020. Source: Microsoft Academic Graph, 2020]

By Region

Figure 1.1.8 shows the share of AI journal publications—the dominant publication type by volume in the MAG database—by region between 2000 and 2020. East Asia & Pacific, Europe & Central Asia, and North America are responsible for the majority of AI journal publications over the past 21 years, with the lead position among the three regions changing over time. In 2020, East Asia & Pacific held the highest share (26.7%), followed by North America (14.0%) and Europe & Central Asia (13.3%). Additionally, in the last 10 years, South Asia and Middle East & North Africa saw the most significant growth, as the number of AI journal publications in those two regions grew six- and fourfold, respectively.

[Figure 1.1.8: AI journal publications (% of world total) by region, 2000-20; 2020 shares: East Asia & Pacific 26.7%, North America 14.0%, Europe & Central Asia 13.3%, South Asia 4.9%, Middle East & North Africa 3.1%, Latin America & Caribbean 1.3%, Sub-Saharan Africa 0.3%. Source: Microsoft Academic Graph, 2020]

By Geographic Area

Figure 1.1.9 shows that among the three major AI powers, China has had the largest share of AI journal publications in the world since 2017, with 18.0% in 2020, followed by the United States (12.3%) and the European Union (8.6%).

[Figure 1.1.9: AI journal publications (% of world total) by geographic area, 2000-20; 2020 shares: China 18.0%, US 12.3%, EU 8.6%. Source: Microsoft Academic Graph, 2020]

Citation

In terms of the share of AI journal citations, Figure 1.1.10 shows that China (20.7%) overtook the United States (19.8%) in 2020 for the first time, while the European Union (11.0%) continued to lose overall share.

[Figure 1.1.10: AI journal citations (% of world total) by geographic area, 2000-20; 2020 shares: China 20.7%, US 19.8%, EU 11.0%. Source: Microsoft Academic Graph, 2020]
AI CONFERENCE PUBLICATIONS

Overview

Between 2000 and 2019, the number of AI conference publications increased fourfold, although the growth has flattened out in the past ten years: the number of publications in 2019 was just 1.09 times higher than in 2010.4

4 Note that conference data for 2020 in the MAG system is not yet complete. See the Appendix for details.

[Figure 1.1.11a: Number of AI conference publications (in thousands), 2000-20. Source: Microsoft Academic Graph, 2020]

[Figure 1.1.11b: AI conference publications as a share of all conference publications, 2000-20; 20.2% in 2020. Source: Microsoft Academic Graph, 2020]

By Region

Figure 1.1.12 shows that, similar to the trends in AI journal publications, East Asia & Pacific, Europe & Central Asia, and North America are the world's dominant sources of AI conference publications. Specifically, East Asia & Pacific took the lead starting in 2004, accounting for more than 27% in 2020. North America overtook Europe & Central Asia to claim second place in 2018 with 20.1%, rising to 21.7% in 2020.

[Figure 1.1.12: AI conference publications (% of world total) by region, 2000-20; 2020 shares: East Asia & Pacific 27.3%, North America 21.7%, Europe & Central Asia 18.6%, South Asia 5.1%, Middle East & North Africa 2.2%, Latin America & Caribbean 1.7%, Sub-Saharan Africa 0.3%. Source: Microsoft Academic Graph, 2020]
By Geographic Area

China overtook the United States in the share of AI conference publications in the world in 2019 (Figure 1.1.13). Its share has grown significantly since 2000: China's percentage of AI conference publications in 2019 was almost nine times higher than in 2000. The share of conference publications from the European Union peaked in 2011 and continues to decline.

[Figure 1.1.13: AI conference publications (% of world total) by geographic area, 2000-20; 2020 shares: US 19.4%, China 15.2%, EU 12.8%. Source: Microsoft Academic Graph, 2020]

Citation

With respect to citations of AI conference publications, Figure 1.1.14 shows that the United States has held a dominant lead among the major powers over the past 21 years. The United States tops the list with 40.1% of overall citations in 2020, followed by China (11.8%) and the European Union (10.9%).

[Figure 1.1.14: AI conference citations (% of world total) by geographic area, 2000-20; 2020 shares: US 40.1%, China 11.8%, EU 10.9%. Source: Microsoft Academic Graph, 2020]

AI PATENTS

Overview

The total number of AI patents published in the world has increased steadily over the past two decades, growing from 21,806 in 2000 to more than 4.5 times that, or 101,876, in 2019 (Figure 1.1.15a). The share of AI patents among all patents published in the world exhibits a smaller increase, from around 2% in 2000 to 2.9% in 2020 (Figure 1.1.15b). The AI patent data is incomplete—only 8% of the dataset in 2020 includes a country or regional affiliation. There is reason to question the data on the share of AI patent publications by both region and geographic area, and it is therefore not included in the main report. See the Appendix for details.

[Figure 1.1.15a: Number of AI patent publications (in thousands), 2000-20. Source: Microsoft Academic Graph, 2020]
[Figure 1.1.15b: AI patent publications as a share of all patent publications, 2000-20; 2.9% in 2020. Source: Microsoft Academic Graph, 2020]

ARXIV PUBLICATIONS

In addition to the traditional avenues for publishing academic papers (discussed above), AI researchers have embraced the practice of publishing their work (often pre–peer review) on arXiv, an online repository of electronic preprints. arXiv allows researchers to share their findings before submitting them to journals and conferences, which greatly accelerates the cycle of information discovery and dissemination. The number of AI-related publications in this section includes preprints on arXiv under cs.AI (artificial intelligence), cs.CL (computation and language), cs.CV (computer vision), cs.NE (neural and evolutionary computing), cs.RO (robotics), cs.LG (machine learning in computer science), and stat.ML (machine learning in statistics).

Overview

In just six years, the number of AI-related publications on arXiv grew more than sixfold, from 5,478 in 2015 to 34,736 in 2020 (Figure 1.1.16).

[Figure 1.1.16: Number of AI-related publications on arXiv (in thousands), 2015-20. Source: arXiv, 2020]

By Region

The analysis by region shows that while North America still holds the lead in the global share of arXiv AI-related publications, its share has been decreasing—from 41.6% in 2017 to 36.3% in 2020 (Figure 1.1.17). Meanwhile, the share of publications from East Asia & Pacific has grown steadily in the past five years—from 17.3% in 2015 to 26.5% in 2020.

[Figure 1.1.17: arXiv AI-related publications (% of world total) by region, 2015-20; 2020 shares: North America 36.3%, East Asia & Pacific 26.5%, Europe & Central Asia 22.9%, South Asia 4.0%, Middle East & North Africa 2.5%, Latin America & Caribbean 1.3%, Sub-Saharan Africa 0.3%. Source: arXiv, 2020]

By Geographic Area

While the total number of AI-related publications on arXiv is increasing among all three major AI powers, China is catching up with the United States (Figure 1.1.18a and Figure 1.1.18b). The share of publication counts from the European Union, on the other hand, has remained largely unchanged.

[Figure 1.1.18a: Number of AI-related publications on arXiv by geographic area, 2015-20; 2020 counts: US 11,280, EU 6,505, China 5,440. Source: arXiv, 2020]

[Figure 1.1.18b: arXiv AI-related publications (% of world total) by geographic area, 2015-20; 2020 shares: US 32.5%, EU 18.7%, China 15.7%. Source: arXiv, 2020]
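The counts in this section come from the AI Index's own data pipeline, but per-category figures like those above can be roughly approximated from arXiv's public export API. The sketch below is an illustration under that assumption, using arXiv's documented search_query syntax and the feedparser library; it is not the Index's methodology. Note also that a preprint can carry several of these categories, so per-category counts overlap and do not sum to the deduplicated total.

```python
# Hedged sketch: approximate per-category arXiv publication counts for 2020
# via the public export API. This only illustrates the categories listed
# above; the AI Index computes its figures with its own pipeline.
import time
import feedparser  # pip install feedparser

CATEGORIES = ["cs.AI", "cs.CL", "cs.CV", "cs.NE", "cs.RO", "cs.LG", "stat.ML"]
API = "http://export.arxiv.org/api/query"

def count_in_2020(category: str) -> int:
    # submittedDate range filtering is part of arXiv's documented query
    # syntax; max_results=0 returns only the total hit count, no entries.
    query = (f"{API}?search_query=cat:{category}"
             f"+AND+submittedDate:[202001010000+TO+202012312359]"
             f"&max_results=0")
    feed = feedparser.parse(query)
    # The Atom response reports the hit count via opensearch:totalResults.
    return int(feed.feed.opensearch_totalresults)

if __name__ == "__main__":
    for cat in CATEGORIES:
        print(cat, count_in_2020(cat))
        time.sleep(3)  # be polite to the public API
```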
By Field of Study
Among the seven arXiv fields of study related to AI tracked here, publications in robotics (cs.RO) and machine learning in computer science (cs.LG) saw the fastest growth between 2015 and 2020, increasing by 11 times and 10 times, respectively (Figure 1.1.19). In 2020, cs.LG and computer vision (cs.CV) led in the overall number of publications, accounting for 32.0% and 31.7%, respectively, of all AI-related publications on arXiv. Between 2019 and 2020, the fastest-growing categories of the seven studied here were computation and language (cs.CL), up 35.4%, and cs.RO, up 35.8%.

[Figure 1.1.19: Number of AI-related publications on arXiv by field of study, 2015-20. 2020 counts: cs.LG 11,098; cs.CV 11,001; cs.CL 5,573; cs.RO 2,571; cs.AI 1,923; stat.ML 1,818; cs.NE 743. Source: arXiv, 2020]

Deep Learning Papers on arXiv
With increased access to data and significant improvements in computing power, the field of deep learning (DL) is growing at breakneck speed. Researchers from Nesta used a topic modeling algorithm to identify the deep learning papers on arXiv by analyzing the abstracts of arXiv papers under the computer science (cs) and machine learning in statistics (stat.ML) categories. Figure 1.1.20 suggests that in the last five years alone, the overall number of DL publications on arXiv grew almost sixfold.

[Figure 1.1.20: Number of deep learning publications on arXiv (in thousands), 2010-19. Source: arXiv/Nesta, 2020]
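Nesta's exact method is not reproduced here, but the general recipe, fitting a topic model to abstracts and flagging papers whose dominant topic looks like deep learning, is easy to sketch. The fragment below is a generic illustration using scikit-learn's LDA implementation; the toy abstracts, topic count, threshold, and chosen topic index are all hypothetical.

```python
# A generic sketch of topic-modeling abstracts to flag deep-learning papers
# (illustrative only; not Nesta's actual pipeline or parameters).
from sklearn.decomposition import LatentDirichletAllocation
from sklearn.feature_extraction.text import CountVectorizer

abstracts = [  # hypothetical stand-ins for arXiv cs/stat.ML abstracts
    "we train a deep convolutional neural network on imagenet",
    "a bayesian nonparametric model for topic discovery in text",
    "gradient descent convergence for deep neural network training",
    "kernel methods for structured prediction in graphical models",
]

vectorizer = CountVectorizer(stop_words="english")
doc_term = vectorizer.fit_transform(abstracts)

lda = LatentDirichletAllocation(n_components=2, random_state=0)
doc_topics = lda.fit_transform(doc_term)  # rows: papers, cols: topic weights

# Inspect each topic's top words to decide which topic(s) represent DL.
words = vectorizer.get_feature_names_out()
for k, topic in enumerate(lda.components_):
    print(k, [words[i] for i in topic.argsort()[-5:][::-1]])

# Flag a paper as a DL paper if its weight on the chosen topic is high.
DL_TOPIC = 0  # hypothetical index, chosen after inspecting the top words
is_dl = doc_topics[:, DL_TOPIC] > 0.5
```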
1.2 CONFERENCES
Conference attendance is an indication of broader industrial and academic interest in a scientific field. In the past 20 years, AI conferences have grown not only in size but also in number and prestige. This section presents data on the trends in attendance at and submissions to major AI conferences.

CONFERENCE ATTENDANCE
Last year saw a significant increase in participation levels at AI conferences, as most were offered in a virtual format. Only the 34th Association for the Advancement of Artificial Intelligence (AAAI) Conference on Artificial Intelligence was held in person, in February 2020. Conference organizers report that a virtual format allows for higher attendance by researchers from all over the world, though exact attendance numbers are difficult to measure.

Due to the atypical nature of the 2020 attendance data, the 11 major AI conferences tracked here have been split into two categories based on their 2019 attendance: large AI conferences, with over 3,000 attendees, and small AI conferences, with fewer than 3,000 attendees. Figure 1.2.1 shows that in 2020 the total number of attendees across nine conferences almost doubled. In particular, the International Conference on Intelligent Robots and Systems (IROS) extended its virtual conference to allow users to watch events for up to three months, which explains its high attendance count. Because the International Joint Conference on Artificial Intelligence (IJCAI) was held in 2019 and January 2021, but not in 2020, it does not appear on the charts. (For the AAMAS conference, 2020 attendance is based on the number of users reported by the platform that recorded the talks and managed the online conference; for the KR conference, 2020 attendance is based on the number of registrations; for the ICAPS conference, the 2020 attendance figure of 450 is an estimate, as some participants may have used anonymous Zoom accounts.)

[Figure 1.2.1: Attendance at large AI conferences, 2010-20. Latest attendance: IROS 25,719; NeurIPS 22,011; ICML 10,800; CVPR 7,500; AAAI 4,884; ICRA 3,050; IJCAI 3,015. Source: Conference Data]

[Figure 1.2.2: Attendance at small AI conferences, 2010-20. Latest attendance: ICLR 5,600; ACL 3,972; AAMAS 3,726; KR 469; ICAPS 450. Source: Conference Data]

Corporate Representation at AI Research Conferences
Researchers from Virginia Tech and the Ivey Business School at Western University found that large technology firms have increased their participation in major AI conferences. In their paper "The De-Democratization of AI: Deep Learning and the Compute Divide in Artificial Intelligence Research" (https://arxiv.org/abs/2010.15581), the researchers use the share of papers affiliated with firms over time at AI conferences to illustrate the increased presence of firms in AI research. They argue that the unequal distribution of compute power in academia, which they refer to as the "compute divide," is adding to inequality in the era of deep learning. Big tech firms tend to have more resources to design AI products, but they also tend to be less diverse than less elite or smaller institutions, which raises concerns about bias and fairness within AI. All 10 major AI conferences displayed in Figure 1.2.3 show an upward trend in corporate representation, further extending the compute divide.
[Figure 1.2.3: Share of Fortune Global 500 tech-affiliated papers at major AI conferences, 2000 vs. 2019. 2019 shares: KDD 30.8%, NeurIPS 29.0%, ACL 28.7%, EMNLP 28.5%, ICML 27.9%, ECCV 25.6%, ICCV 23.7%, CVPR 21.9%, AAAI 19.3%, IJCAI 17.5%. Source: Ahmed & Wahed, 2020]

1.3 AI OPEN-SOURCE SOFTWARE LIBRARIES
A software library is a collection of computer code that is used to create applications and products. Popular AI-specific software libraries, such as TensorFlow and PyTorch, help developers create their AI solutions quickly and efficiently. This section analyzes the popularity of software libraries through GitHub data.

GITHUB STARS
GitHub is a code hosting platform that AI researchers and developers frequently use to upload, comment on, and download software. GitHub users can "star" a project to save it in their list, thereby expressing their interests and likes, similar to the "like" function on Twitter and other social media platforms. As AI researchers upload packages on GitHub that mention the use of an open-source library, the "star" function on GitHub can be used to measure the popularity of various open-source AI programming libraries.

Figure 1.3.1 suggests that TensorFlow (developed by Google and publicly released in 2017) is the most popular AI software library. The second most popular library in 2020 is Keras (also developed by Google and built on top of TensorFlow 2.0). Excluding TensorFlow, Figure 1.3.2 shows that PyTorch (created by Facebook) is another library that is becoming increasingly popular.

[Figure 1.3.1: Cumulative GitHub stars by AI library (in thousands), 2014-20: TensorFlow 153; Keras 51; PyTorch 46; Scikit-learn 45; BVLC/caffe 31; MXNet 19; CNTK 17; Theano 9; Caffe2 8. Source: GitHub, 2020]

[Figure 1.3.2: Cumulative GitHub stars by AI library excluding TensorFlow (in thousands), 2014-20. Source: GitHub, 2020]
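For a sense of where such figures come from, a repository's current star count can be read from GitHub's public REST API, as in the minimal sketch below. The chart itself tracks cumulative stars over time, which would instead require paging through each repository's timestamped stargazer records; this sketch only reads present-day totals.

```python
# A minimal sketch of reading current star counts from the public GitHub
# REST API (unauthenticated requests are rate-limited).
import json
import urllib.request

REPOS = ["tensorflow/tensorflow", "keras-team/keras", "pytorch/pytorch",
         "scikit-learn/scikit-learn"]

for repo in REPOS:
    req = urllib.request.Request(
        f"https://api.github.com/repos/{repo}",
        headers={"Accept": "application/vnd.github+json"},
    )
    with urllib.request.urlopen(req) as resp:
        data = json.load(resp)
    # "stargazers_count" is the repository's current number of stars.
    print(f'{repo}: {data["stargazers_count"]:,} stars')
```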
CHAPTER 2: Technical Performance

Chapter Preview
Overview
Chapter Highlights
COMPUTER VISION
2.1 Computer Vision—Image: Image Classification (ImageNet; Top-1 Accuracy; Top-5 Accuracy; Training Time; Training Costs; Highlight: Harder Tests Beyond ImageNet); Image Generation (STL-10: Fréchet Inception Distance (FID) Score; FID Versus Real Life); Deepfake Detection (Deepfake Detection Challenge); Human Pose Estimation (COCO: Keypoint Detection Challenge; COCO: DensePose Challenge); Semantic Segmentation (Cityscapes); Embodied Vision
2.2 Computer Vision—Video: Activity Recognition (ActivityNet: Temporal Action Localization Task; Hardest Activity); Object Detection (You Only Look Once); Face Detection and Recognition (NIST Face Recognition Vendor Test)
2.3 Language: English Language Understanding Benchmarks (SuperGLUE; SQuAD); Commercial Machine Translation (Number of Commercially Available MT Systems); GPT-3
2.4 Language Reasoning Skills: Vision and Language Reasoning (Visual Question Answering Challenge; Visual Commonsense Reasoning Task)
2.5 Speech: Speech Recognition (Transcribe Speech: LibriSpeech; Speaker Recognition: VoxCeleb); Highlight: The Race Gap in Speech Recognition Technology
2.6 Reasoning: Boolean Satisfiability Problem; Automated Theorem Proving (ATP)
2.7 Healthcare and Biology: Molecular Synthesis (Test Set Accuracy for Forward Chemical Synthesis Planning); COVID-19 and Drug Discovery; AlphaFold and Protein Folding
Expert Highlights

Access the public data: https://drive.google.com/drive/folders/18AxWbJ5hZWtikMjVb645ysN4yB5jjs2z?usp=sharing

Overview
This chapter highlights the technical progress in various subfields of AI, including computer vision, language, speech, concept learning, and theorem proving. It uses a combination of quantitative measurements, such as common benchmarks and prize challenges, and qualitative insights from academic papers to showcase the developments in state-of-the-art AI technologies. While technological advances allow AI systems to be deployed more widely and easily than ever, concerns about the use of AI are also growing, particularly when it comes to issues such as algorithmic bias. The emergence of new AI capabilities, such as the ability to synthesize images and videos, also poses ethical challenges.
CHAPTER HIGHLIGHTS
• Generative everything: AI systems can now compose text, audio, and images to a sufficiently high standard that humans have a hard time telling the difference between synthetic and non-synthetic outputs for some constrained applications of the technology. That promises to generate a tremendous range of downstream applications of AI for both socially useful and less useful purposes. It is also causing researchers to invest in technologies for detecting generative models; the Deepfake Detection Challenge data indicates how well computers can distinguish between different outputs.
• The industrialization of computer vision: Computer vision has seen immense progress in the past decade, primarily due to the use of machine learning techniques (specifically deep learning). New data shows that computer vision is industrializing: performance is starting to flatten on some of the largest benchmarks, suggesting that the community needs to develop and agree on harder ones that further test performance. Meanwhile, companies are investing increasingly large amounts of computational resources to train computer vision systems at a faster rate than ever before, and technologies for use in deployed systems, such as object-detection frameworks for analysis of still frames from videos, are maturing rapidly, indicating further AI deployment.
• Natural language processing (NLP) outruns its evaluation metrics: Rapid progress in NLP has yielded AI systems with significantly improved language capabilities that have started to have a meaningful economic impact on the world. Google and Microsoft have both deployed the BERT language model into their search engines, while other large language models have been developed by companies ranging from Microsoft to OpenAI. Progress in NLP has been so swift that technical advances have started to outpace the benchmarks designed to test for them. This can be seen in the rapid emergence of systems that obtain human-level performance on SuperGLUE, an NLP evaluation suite developed in response to earlier NLP progress overshooting the capabilities being assessed by GLUE.
• New analyses on reasoning: Most measures of technical problems show, for each point in time, the performance of the best system on a fixed benchmark. New analyses developed for the AI Index offer metrics that allow for an evolving benchmark, and for attributing to individual systems credit for a share of the overall performance of a group of systems over time. These are applied to two symbolic reasoning problems: Automated Theorem Proving and Satisfiability of Boolean formulas.
• Machine learning is changing the game in healthcare and biology: The landscape of the healthcare and biology industries has evolved substantially with the adoption of machine learning. DeepMind's AlphaFold applied deep learning techniques to make a significant breakthrough in the decades-long biology challenge of protein folding. Scientists use ML models to learn representations of chemical molecules for more effective chemical synthesis planning. PostEra, an AI startup, used ML-based techniques to accelerate COVID-related drug discovery during the pandemic.
COMPUTER VISION
Introduced in the 1960s, the field of computer vision has seen significant progress and in recent years has started to reach human levels of performance on some restricted visual tasks. Common computer vision tasks include object recognition, pose estimation, and semantic segmentation. The maturation of computer vision technology has unlocked a range of applications: self-driving cars, medical image analysis, consumer applications (e.g., Google Photos), security applications (e.g., surveillance, satellite imagery analysis), industrial applications (e.g., detecting defective parts in manufacturing and assembly), and others.

2.1 COMPUTER VISION—IMAGE

IMAGE CLASSIFICATION
In the 2010s, the field of image recognition and classification began to switch from classical AI techniques to ones based on machine learning and, specifically, deep learning. Since then, image recognition has shifted from being an expensive, domain-specific technology to being one that is more affordable and applicable to more areas, primarily due to advancements in the underlying technology (algorithms, compute hardware, and the utilization of larger datasets).

ImageNet
Created by computer scientists from Stanford University and Princeton University in 2009, ImageNet (http://www.image-net.org/) is a dataset of over 14 million images across more than 20,000 categories that expands and improves the data available for researchers to train AI algorithms. In 2012, researchers from the University of Toronto used techniques based on deep learning to set a new state of the art in the ImageNet Large Scale Visual Recognition Challenge (https://papers.nips.cc/paper/4824-imagenet-classification-with-deep-convolutional-neural-networks). Since then, deep learning techniques have ruled the competition leaderboards; several widely used techniques have debuted in ImageNet competition entries. In 2015, a team from Microsoft Research said it had surpassed human-level performance on the image classification task via the use of "residual networks" (https://arxiv.org/abs/1502.01852), an innovation that subsequently proliferated into other AI systems. (It is worth noting that the human baseline for this metric comes from a single Stanford graduate student who took roughly the same test as the AI systems.) Even after the end of the competition in 2017, researchers continue to use the ImageNet dataset to test and develop computer vision applications.

The image classification task of the ImageNet Challenge asks machines to assign a class label to an image based on the main object in the image. The following graphs explore the evolution of the top-performing ImageNet systems over time, as well as how algorithmic and infrastructure advances have allowed researchers to increase the efficiency of training image recognition systems and reduce the absolute time it takes to train high-performing ones.

ImageNet: Top-1 Accuracy
Top-1 accuracy tests how well an AI system can assign the correct label to an image: specifically, whether its single most highly probable prediction (out of all possible labels) is the same as the target label. In recent years, researchers have started to focus on improving performance on ImageNet by pre-training their systems on extra training data, for instance photos from Instagram or other social media sources. By pre-training on these datasets, they are able to use ImageNet data more effectively, which further improves performance.
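Top-1 and top-5 accuracy have compact definitions in code. The following is an illustrative sketch over raw class scores, not the official evaluation tooling:

```python
# A small sketch of top-k accuracy from a classifier's class scores
# (illustrative; not tied to any specific model or benchmark harness).
import numpy as np

def top_k_accuracy(scores: np.ndarray, labels: np.ndarray, k: int) -> float:
    """scores: (n_samples, n_classes) logits or probabilities;
    labels: (n_samples,) integer class ids."""
    # Indices of the k highest-scoring classes for each sample.
    top_k = np.argsort(scores, axis=1)[:, -k:]
    hits = (top_k == labels[:, None]).any(axis=1)
    return float(hits.mean())

scores = np.array([[0.1, 0.6, 0.3],
                   [0.5, 0.2, 0.3],
                   [0.5, 0.1, 0.4]])
labels = np.array([1, 2, 0])
print(top_k_accuracy(scores, labels, k=1))  # 0.67: sample 1 misses at top-1
print(top_k_accuracy(scores, labels, k=2))  # 1.0: its label is in the top 2
```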
Figure 2.1.1 shows that recent systems with extra training data make 1 error out of every 10 tries on top-1 accuracy, versus 4 errors out of every 10 tries in December 2012. The model from the Google Brain team achieved 90.2% top-1 accuracy in January 2021.

[Figure 2.1.1: ImageNet Challenge top-1 accuracy, 01/2013-01/2021: 90.2% with extra training data; 86.5% without. Source: Papers with Code, 2020; AI Index, 2021]

ImageNet: Top-5 Accuracy
Top-5 accuracy asks whether the correct label appears among the classifier's top five predictions. Figure 2.1.2 shows that top-5 accuracy has improved from around 85% in 2013 to almost 99% in 2020. (For data on human performance, a human was shown 500 images and then asked to annotate 1,500 test images; their error rate was 5.1% for top-5 classification. This is a very rough baseline, but it gives us a sense of human performance on this task.)

[Figure 2.1.2: ImageNet Challenge top-5 accuracy, 01/2013-01/2021: 98.8% with extra training data; 97.9% without; human performance 94.9%. Source: Papers with Code, 2020; AI Index, 2021]

ImageNet: Training Time
Along with measuring the raw improvement in accuracy over time, it is useful to evaluate how long it takes to train image classifiers on ImageNet to a standard performance level, as this sheds light on advances in the underlying computational infrastructure for large-scale AI training. This is important to measure because the faster you can train a system, the more quickly you can evaluate it and update it with new data. Therefore, the faster ImageNet systems can be trained, the more productive organizations can become at developing and deploying AI systems. Imagine the difference between waiting a few seconds for a system to train versus waiting a few hours, and what that difference means for the type and volume of ideas researchers explore and how risky those ideas might be.
What follows are the results from MLPerf, a competition run by the MLCommons organization (https://mlcommons.org/en/) that challenges entrants to train an ImageNet network using a common (residual network) architecture, and then ranks systems according to the absolute "wall clock" time it takes them to train a system. (The next MLPerf update is planned for June 2021.) As shown in Figure 2.1.3, the training time on ImageNet has fallen from 6.2 minutes (December 2018) to 47 seconds (July 2020). At the same time, the amount of hardware used to achieve these results has increased dramatically; frontier systems have been dominated by the use of "accelerator" chips, starting with GPUs in the 2018 results and transitioning to Google's TPUs for the best-in-class results from 2019 and 2020.

[Figure 2.1.3: ImageNet training time and hardware of the best system. Best-system training time fell from 6.2 minutes (12/2018) to 1.3 minutes (06/2019) to 47 seconds (07/2020), while the number of accelerators used rose from 640 to 1,024 to 4,096. Source: MLPerf, 2020]

Distribution of Training Time: MLPerf does not just show the state of the art for each competition period; it also makes available all the data behind each entry in each competition cycle. This, in turn, reveals the distribution of training times for each period (Figure 2.1.4). (Note that in each MLPerf competition, competitors typically submit multiple entries that use different permutations of hardware.) Figure 2.1.4 shows that in the past couple of years, training times have shortened, as has the variance between MLPerf entries. At the same time, competitors have started to use larger and larger numbers of accelerator chips to speed training times. This is in line with broader trends in AI development, as large-scale training becomes better understood, with a higher degree of shared best practices and infrastructure.

[Figure 2.1.4: Distribution of ImageNet training times across MLPerf entries (minutes, log scale), 2018-20. Source: MLPerf, 2020]

ImageNet: Training Costs
How much does it cost to train a contemporary image-recognition system? The answer, according to tests run by the Stanford DAWNBench team (https://dawn.cs.stanford.edu/benchmark/), is a few dollars in 2020, down by around 150 times from costs in 2017 (Figure 2.1.5). To put this in perspective, what cost one entrant around USD 1,100 in October 2017 now costs about USD 7.43. This represents progress in algorithm design as well as a drop in the costs of cloud-computing resources.

[Figure 2.1.5: ImageNet training cost to 93% accuracy (USD, log scale), 12/2017-07/2020, falling to $7.43. Source: DAWNBench, 2020]
Harder Tests Beyond ImageNet
In spite of the progress in performance on ImageNet, current computer vision systems are still not perfect. To better study their limitations, researchers have in recent years started to develop more challenging image classification benchmarks. Since ImageNet is already a large dataset that requires a nontrivial amount of resources to use, it does not make sense to simply expand the resolution of the images in ImageNet or the absolute size of the dataset, as either action would further increase the cost to researchers of training systems on ImageNet. Instead, people have tried to find new ways to test the robustness of image classifiers by creating custom datasets, many of which are compatible with ImageNet (and are typically smaller). These include:

IMAGENET ADVERSARIAL (https://arxiv.org/abs/1907.07174): A dataset of images similar to those found in ImageNet but incorporating natural confounders (e.g., a butterfly sitting on a carpet with a texture similar to the butterfly's) and images that are persistently misclassified by contemporary systems. These images "cause consistent classification mistakes due to scene complications encountered in the long tail of scene configurations and by exploiting classifier blind spots," according to the researchers. Making progress on ImageNet Adversarial could therefore improve the ability of models to generalize.

IMAGENET-C (https://arxiv.org/abs/1903.12261): A dataset of common ImageNet images with 75 visual corruptions applied to them (e.g., changes in brightness and contrast, pixelation, fog effects, etc.). By testing systems against this, researchers can gather even more information about the generalization capabilities of these models.

IMAGENET-RENDITION (https://arxiv.org/abs/2006.16241): This tests generalization by seeing how well ImageNet-trained models can categorize 30,000 illustrations of 200 ImageNet classes. Since ImageNet is built out of photos, generalization here indicates that systems have learned something more subtle about what they are trying to classify, because they are able to "understand" the relationship between illustrations and the photographed images they have been trained on.

What Is the Timetable for Tracking This Data?
As these benchmarks are relatively new, the plan is to wait a couple of years for the community to test a range of systems against them, which will generate the temporal information necessary to make graphs tracking progress over time.
IMAGE GENERATION
Image generation is the task of generating images that look indistinguishable from "real" images. Image generation systems have a variety of uses, ranging from augmenting search capabilities (it is easier to search for a specific image if you can generate other images like it) to serving as an aid for other generative uses (e.g., editing images, creating content for specific purposes, generating multiple variations of a single image to help designers brainstorm, and so on). In recent years, image generation progress has accelerated as a consequence of the continued improvement in deep learning-based algorithms, as well as the use of increased computation and larger datasets.

STL-10: Fréchet Inception Distance (FID) Score
One way to measure progress in image generation is via the Fréchet Inception Distance (FID), which measures the distance between the feature distributions of real and generated images as seen by a pretrained image classifier: identical distributions score 0, and scores closer to 0 mean the synthetic images are statistically harder to tell apart from real ones. Figure 2.1.6 shows the progress of generative models over the past two years at generating convincing synthetic images in the STL-10 dataset (https://cs.stanford.edu/~acoates/stl10/), which is designed to test how effective systems are at generating images and gleaning other information about them.

[Figure 2.1.6: STL-10 Fréchet Inception Distance (FID) score, 01/2018-07/2020, improving to 25.4. Source: Papers with Code, 2020]

FID Versus Real Life
FID has drawbacks as an evaluation technique; specifically, it assesses progress on image generation via quantitative metrics that use data from the model itself, rather than other evaluation techniques. Another approach is using teams of humans to evaluate the outputs of these models; for instance, the Human eYe Perceptual Evaluation (HYPE) method tries to judge image quality by showing synthetically generated images to humans and using their qualitative ratings to drive the evaluation methodology. This approach is more expensive and slower to run than typical evaluations, but it may become more important as generative models get better.
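Concretely, FID fits a Gaussian to the classifier-feature distributions of the real and generated images and computes FID = ||mu_r - mu_g||^2 + Tr(Sigma_r + Sigma_g - 2(Sigma_r Sigma_g)^(1/2)). A minimal sketch follows, assuming the feature vectors have already been extracted by a pretrained network (the feature extraction step itself is omitted):

```python
# A minimal sketch of the FID computation over pre-extracted features
# (shape: n_samples x n_features for each image set).
import numpy as np
from scipy import linalg

def fid(real_feats: np.ndarray, gen_feats: np.ndarray) -> float:
    mu_r, mu_g = real_feats.mean(axis=0), gen_feats.mean(axis=0)
    sigma_r = np.cov(real_feats, rowvar=False)
    sigma_g = np.cov(gen_feats, rowvar=False)
    # Matrix square root of the product of the two covariances.
    covmean = linalg.sqrtm(sigma_r @ sigma_g)
    if np.iscomplexobj(covmean):   # numerical noise can introduce tiny
        covmean = covmean.real     # imaginary components; discard them
    diff = mu_r - mu_g
    return float(diff @ diff + np.trace(sigma_r + sigma_g - 2.0 * covmean))

rng = np.random.default_rng(0)
real = rng.normal(0.0, 1.0, size=(512, 64))  # toy "real" features
gen = rng.normal(0.1, 1.0, size=(512, 64))   # toy "generated" features
print(fid(real, gen))  # small but nonzero: the distributions nearly match
```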
Qualitative Examples: To get a sense of progress, you can look at the evolution in the quality of synthetically generated images over time. Figure 2.1.7 shows best-in-class examples of synthetic images of human faces, ordered over time. By 2018, performance on this task had become sufficiently good that it is difficult for humans to easily discern further progress (though it is possible to train machine learning systems to spot fakes, it is becoming more challenging). This provides a visceral example of recent progress in this domain and underscores the need for new evaluation methods to gauge future progress. In addition, in recent years people have turned to doing generative modeling on a broader range of categories than just images of people's faces, which is another way to test for generalization.

[Figure 2.1.7: GAN progress on face generation; best-in-class synthetic faces from 2014, 2015, 2016, 2017, 2018, and 2020. Source: Goodfellow et al., 2014; Radford et al., 2016; Liu & Tuzel, 2016; Karras et al., 2018; Karras et al., 2019; Goodfellow, 2019; Karras et al., 2020; AI Index, 2021]

DEEPFAKE DETECTION
Advances in image synthesis have created new opportunities as well as threats. For instance, in recent years, researchers have harnessed breakthroughs in synthetic imagery to create AI systems that can generate synthetic images of human faces, then superimpose those faces onto the faces of other people in photographs or movies. People call this application of generative technology a "deepfake." Malicious uses of deepfakes include misinformation and the creation of (predominantly misogynistic) pornography. To try to combat this, researchers are developing deepfake-detection technologies.

Deepfake Detection Challenge (DFDC)
Created in September 2019 by Facebook, the Deepfake Detection Challenge (DFDC; https://www.kaggle.com/c/deepfake-detection-challenge/) measures progress on deepfake-detection technology. A two-part challenge, DFDC asks participants to train and test their models on a public dataset of around 100,000 clips. Submissions are scored on log loss (https://www.kaggle.com/dansbecker/what-is-log-loss), a classification metric based on probabilities; a smaller log loss means a more accurate prediction of which videos are deepfakes. According to Figure 2.1.8, log loss dropped by around 0.5 as the challenge progressed between December 2019 and March 2020.

[Figure 2.1.8: Deepfake Detection Challenge log loss, 01/2020-03/2020, improving to 0.19. Source: Kaggle, 2020]
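For reference, the log-loss metric itself is a short computation over predicted probabilities. A sketch with toy labels and predictions follows; the clipping constant below is a common convention to avoid taking log(0), not necessarily the exact value the challenge used:

```python
# A short sketch of binary log loss as used to score deepfake classifiers:
# predictions are probabilities that each clip is a deepfake.
import numpy as np

def log_loss(y_true: np.ndarray, y_prob: np.ndarray, eps: float = 1e-15) -> float:
    # Clip probabilities so that log(0) never occurs.
    p = np.clip(y_prob, eps, 1 - eps)
    return float(-np.mean(y_true * np.log(p) + (1 - y_true) * np.log(1 - p)))

y_true = np.array([1, 0, 1, 1])          # 1 = deepfake, 0 = real
y_prob = np.array([0.9, 0.2, 0.7, 0.6])  # model's deepfake probabilities
print(log_loss(y_true, y_prob))  # ~0.30; confident wrong answers cost most
```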
HUMAN POSE ESTIMATION
Human pose estimation is the problem of estimating the positions of human body parts or joints (wrists, elbows, etc.) from a single image. Human pose estimation is a classic "omni-use" AI capability. Systems that are good at this task can be used for a range of applications, such as creating augmented reality applications for the fashion industry, analyzing behaviors gleaned from physical body analysis in crowds, surveilling people for specific behaviors, aiding with analysis of live sporting and athletic events, mapping the movements of a person to a virtual avatar, and so on.

Common Objects in Context (COCO): Keypoint Detection Challenge
Common Objects in Context (COCO) is a large-scale dataset for object detection, segmentation, and captioning, with 330,000 images and 1.5 million object instances. Its Keypoint Detection Challenge requires machines to simultaneously detect an object or a person and localize their body keypoints: points in the image that stand out, such as a person's elbows, knees, and other joints. The task evaluates algorithms based on average precision (AP), a metric that can be used to measure the accuracy of object detectors. Figure 2.1.9 shows that the accuracy of algorithms on this task has improved by roughly 33% in the past four years, with the latest machine scoring 80.8% on average precision.

[Figure 2.1.9: COCO Keypoint Challenge average precision, 07/2016-07/2020, reaching 80.8%. Source: COCO Leaderboard, 2020]

Common Objects in Context (COCO): DensePose Challenge
DensePose, or dense human pose estimation, is the task of extracting a 3D mesh model of a human body from a 2D image. After open-sourcing a system called DensePose in 2018, Facebook built DensePose-COCO, a large-scale dataset of image-to-surface correspondences annotated on 50,000 COCO images. Since then, DensePose has become a canonical benchmark dataset. The COCO DensePose Challenge (https://cocodataset.org/index.htm#densepose-eval) involves simultaneously detecting people, segmenting their bodies, and estimating the correspondences between image pixels that belong to a human body and a template 3D model. Average precision is calculated based on the geodesic point similarity (GPS) metric, a correspondence matching score that measures the geodesic distances between the estimated points and the true locations of the body points on the image. Accuracy grew from 56% in 2018 to 72% in 2019 (Figure 2.1.10).

[Figure 2.1.10: COCO DensePose Challenge average precision, 03/2018-09/2019, reaching 72%. Source: arXiv & CodaLab, 2020]

SEMANTIC SEGMENTATION
Semantic segmentation is the task of classifying each pixel in an image with a particular label, such as person, cat, etc. Where image classification tries to assign a label to the entire image, semantic segmentation tries to isolate the distinct entities and objects in a given image, allowing for more fine-grained identification. Semantic segmentation is a basic input technology for self-driving cars (identifying and isolating objects on roads), image analysis, medical applications, and more.

Cityscapes
Cityscapes is a large-scale dataset of diverse urban street scenes across 50 different cities, recorded during the daytime over several months (spring, summer, and fall). The dataset contains 5,000 images with high-quality, pixel-level annotations and 20,000 weakly labeled ones. Semantic scene understanding, especially in the urban space, is crucial to the environmental perception of autonomous vehicles, and Cityscapes is useful for training deep neural networks to understand the urban environment.

One Cityscapes task that focuses on semantic segmentation is the pixel-level semantic labeling task. This task requires an algorithm to predict the per-pixel semantic labeling of the image, partitioning an image into categories such as cars, buses, people, trees, and roads. Participants are evaluated on the intersection-over-union (IoU) metric (https://towardsdatascience.com/metrics-to-evaluate-your-semantic-segmentation-model-6bcb99639aa2); a higher IoU score means better segmentation accuracy. Between 2014 and 2020, the mean IoU increased by 35% (Figure 2.1.11). There was a significant boost to progress in 2016 and 2017, when people started using residual networks in these systems.

[Figure 2.1.11: Cityscapes pixel-level semantic labeling task, mean intersection-over-union (mIoU), 01/2015-01/2020: 85.1% with extra training data; 82.3% without. Source: Papers with Code, 2020]
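The IoU metric divides the overlap between the predicted and true pixels of a class by their union. A compact sketch of per-class IoU and mean IoU over toy label maps:

```python
# A compact sketch of per-class IoU and mean IoU for semantic segmentation,
# given predicted and ground-truth label maps of the same shape.
import numpy as np

def mean_iou(pred: np.ndarray, truth: np.ndarray, n_classes: int) -> float:
    ious = []
    for c in range(n_classes):
        inter = np.logical_and(pred == c, truth == c).sum()
        union = np.logical_or(pred == c, truth == c).sum()
        if union > 0:                 # skip classes absent from both maps
            ious.append(inter / union)
    return float(np.mean(ious))

pred = np.array([[0, 0, 1],
                 [1, 1, 2],
                 [2, 2, 2]])
truth = np.array([[0, 0, 1],
                  [1, 2, 2],
                  [2, 2, 2]])
print(mean_iou(pred, truth, n_classes=3))  # class IoUs 1.0, 0.67, 0.80 -> ~0.82
```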
EMBODIED VISION
The performance data so far shows that computer vision systems have advanced tremendously in recent years. Object recognition, semantic segmentation, and human pose estimation, among others, have now achieved significant levels of performance. Note that these visual tasks are somewhat passive or disembodied: they operate on images or videos taken from camera systems that are not physically able to interact with the surrounding environment. As a consequence of the continuous improvement in those passive tasks, researchers have now started to develop more advanced AI systems that can be interactive or embodied; that is, systems that can physically interact with and modify the surrounding environment in which they operate: for example, a robot that can visually survey a new building and autonomously navigate it, or a robot that can learn to assemble pieces by watching visual demonstrations instead of being manually programmed to do so. Progress in this area is currently driven by the development of sophisticated simulation environments, where researchers can deploy robots in virtual spaces, simulate what their cameras would see and capture, and develop AI algorithms for navigation, object search, and object grasping, among other interactive tasks. Because of the relatively early nature of this field, there are few standardized metrics to measure progress. Instead, here are brief highlights of some of the available simulators, their year of release, and other significant features:
• Thor (AI2, 2017; https://ai2thor.allenai.org) focuses on sequential abstract reasoning with predefined "magic" actions that are applicable to objects.
• Gibson (Stanford, 2018; http://gibsonenv.stanford.edu) focuses on visual navigation in photorealistic environments obtained with 3D scanners.
• iGibson (Stanford, 2019; http://svl.stanford.edu/igibson/) focuses on full interactivity in large realistic scenes mapped from real houses and made interactive: navigation plus manipulation (known in robotics as "mobile manipulation").
• AI Habitat (Facebook, 2019; https://aihabitat.org/) focuses on visual navigation with an emphasis on much faster execution, enabling more computationally expensive approaches.
• ThreeDWorld (MIT and Stanford, 2020; http://www.threedworld.org/) focuses on photorealistic environments built with game engines, and adds simulation of flexible materials, fluids, and sounds.
• SEAN-EP (Yale, 2020; https://sean.interactive-machines.com/#sean-ep) is a human-robot interaction environment with simulated virtual humans that enables the collection of remote demonstrations from humans via a web browser.
• Robosuite (Stanford and UT Austin, 2020; https://robosuite.ai/) is a modular simulation framework and benchmark for robot learning.

2.2 COMPUTER VISION—VIDEO
Video analysis is the task of making inferences over sequential image frames, sometimes with the inclusion of an audio feed. Though many AI tasks rely on single-image inferences, a growing body of applications requires computer vision machines to reason about videos. For instance, identifying a specific dance move benefits from seeing a variety of frames connected in a temporal sequence; the same is true of making inferences about an individual seen moving through a crowd, or a machine carrying out a sequence of movements over time.

ACTIVITY RECOGNITION
The task of activity recognition is to identify various activities from video clips. It has many important everyday applications, including surveillance by video cameras and autonomous navigation of robots. Research on video understanding is still focused on short events, such as videos that are a few seconds long; longer-term video understanding is slowly gaining traction.

ActivityNet
Introduced in 2015, ActivityNet is a large-scale video benchmark for human-activity understanding. The benchmark tests how well algorithms can label and categorize human behaviors in videos. By improving performance on tasks like ActivityNet, AI researchers are developing systems that can categorize more complex behaviors than those that can be contained in a single image, like characterizing the behavior of pedestrians on a self-driving car's video feed or providing better labeling of specific movements in sporting events.

ActivityNet: Temporal Action Localization Task
The temporal action localization task in the ActivityNet challenge (http://activity-net.org/challenges/2020/tasks/anet_localization.html) asks machines to detect time segments in a 600-hour set of untrimmed videos, each of which contains several activities. Evaluation on this task focuses on (1) localization: how well the system can localize the interval with the precise start time and end time; and (2) recognition: how well the system can recognize the activity and classify it into the correct category (such as throwing, climbing, or walking the dog). Figure 2.2.1 shows that the highest mean average precision (mAP) on the temporal action localization task among submissions has grown by 140% in the last five years.

[Figure 2.2.1: ActivityNet temporal action localization task, mean average precision (mAP), 2016-20, reaching 42.8%. Source: ActivityNet, 2020]
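Average precision summarizes a ranked list of predictions by integrating precision over recall. The sketch below shows only the core computation; the full ActivityNet protocol adds details omitted here, such as temporal-IoU matching between predicted and ground-truth segments and averaging over classes and thresholds:

```python
# A bare-bones sketch of average precision (AP) from scored predictions:
# sort by confidence, sweep the ranked list, and accumulate precision at
# each correctly retrieved item (all-point interpolation).
import numpy as np

def average_precision(scores, is_correct, n_positives):
    """scores: confidence per prediction; is_correct: whether each prediction
    matched a ground-truth instance; n_positives: total ground-truth count."""
    order = np.argsort(scores)[::-1]
    correct = np.asarray(is_correct, dtype=float)[order]
    tp = np.cumsum(correct)
    precision = tp / np.arange(1, len(correct) + 1)
    # Sum precision at each recall step, normalized by total positives.
    return float(np.sum(precision * correct) / n_positives)

scores = [0.9, 0.8, 0.7, 0.6]
is_correct = [True, False, True, True]
print(average_precision(scores, is_correct, n_positives=4))  # ~0.60
```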
ActivityNet: Hardest Activity
Figure 2.2.2 shows the hardest activities in the temporal action localization task in 2020 and how their mean average precision compares with the 2019 results. Drinking coffee remained the hardest activity in 2020. Rock-paper-scissors, though still the 10th hardest activity, saw the greatest improvement among all activities, increasing by 129.2%: from 6.6% in 2019 to 15.22% in 2020.

[Figure 2.2.2: ActivityNet hardest activities by mean average precision, 2019-20: drinking coffee, high jump, polishing furniture, putting in contact lenses, removing curlers, rock-paper-scissors, running a marathon, shot put, smoking a cigarette, throwing darts. Source: ActivityNet, 2020]

OBJECT DETECTION
Object detection is the task of identifying a given object in an image. Frequently, image classification and image detection are coupled together in deployed systems. One way to get a proxy measure for the improvement in deployed object recognition systems is to study the advancement of widely used object detection systems.

You Only Look Once (YOLO)
You Only Look Once (YOLO) is a widely used open-source system for object detection, so its progress on a standard task is tracked here across YOLO variants to give a sense of how research percolates into widely used open-source tools. YOLO has gone through multiple iterations since it was first published in 2015. Over time, YOLO has been optimized along two constraints, performance and inference latency, as shown in Figure 2.2.3. What this means, specifically, is that by measuring YOLO, one can measure the advancement of systems that might not have the best absolute performance but are designed around real-world needs, like low-latency inference over video streams. YOLO systems might not always deliver the absolute best performance as defined in the research literature, but they represent good performance in the face of trade-offs such as inference time.

[Figure 2.2.3: You Only Look Once (YOLO) mean average precision (mAP50), 12/2016-11/2020: YOLOv2 (resolution unclear), YOLOv3 (608), YOLOv4 (608), PP-YOLO (608), reaching 65.2. Source: Redmon & Farhadi (2016, 2018); Bochkovskiy et al. (2020); Long et al. (2020)]
FACE DETECTION AND RECOGNITION
Facial detection and recognition is one of the use cases for AI that has a sizable commercial market, and it has generated significant interest from governments and militaries. Progress in this category therefore gives us a sense of the rate of advancement in economically significant parts of AI development.

National Institute of Standards and Technology (NIST) Face Recognition Vendor Test (FRVT)
The Face Recognition Vendor Tests (FRVT) by the National Institute of Standards and Technology (NIST) provide independent evaluations of commercially available and prototype face recognition technologies. FRVT measures the performance of automated face recognition technologies used for a wide range of civil and governmental tasks (primarily in law enforcement and homeland security), including verification of visa photos, mug shot images, and child abuse images. Figure 2.2.4 shows the results of the top-performing 1:1 algorithms measured on the false non-match rate (FNMR) across several different datasets. FNMR is the rate at which the algorithm fails to match an image with the correct individual. Facial recognition technologies on mug-shot-type and visa photos have improved most significantly in the past four years, falling from error rates of close to 50% to a fraction of a percent in 2020. (Details and examples of the various datasets are available in the periodically updated FRVT 1:1 verification reports: https://pages.nist.gov/frvt/html/frvt11.html#_overview_.)

[Figure 2.2.4: NIST FRVT 1:1 verification accuracy by dataset, 2017-20 (false non-match rate, log scale). 2020 values: MUGSHOT photos 0.0022 (FNMR @ FMR ≤ 0.00001); MUGSHOT photos, DT ≥ 12 yrs, 0.0023 (FNMR @ FMR ≤ 0.00001); VISA photos 0.0025 (FNMR @ FMR ≤ 0.000001); VISABORDER photos 0.0035 (FNMR @ FMR ≤ 0.000001); BORDER photos 0.0064 (FNMR @ FMR = 0.000001); WILD photos 0.0293 (FNMR @ FMR ≤ 0.00001). Source: National Institute of Standards and Technology, 2020]
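FNMR at a fixed false match rate (FMR) can be illustrated with a simple threshold sweep: choose the score threshold that holds the false match rate on impostor (different-person) comparisons at the target, then measure how often genuine (same-person) comparisons fall below it. A simplified sketch on synthetic score distributions follows; NIST's actual protocol is considerably more involved:

```python
# A simplified sketch of FNMR at a fixed FMR, the headline FRVT metric.
import numpy as np

def fnmr_at_fmr(genuine, impostor, target_fmr=1e-5):
    """genuine/impostor: similarity scores for same-person and
    different-person comparisons, respectively."""
    impostor = np.sort(impostor)
    # Threshold above which at most target_fmr of impostor scores remain.
    k = int(np.ceil(len(impostor) * (1 - target_fmr)))
    threshold = impostor[min(k, len(impostor) - 1)]
    # Genuine comparisons scoring below the threshold are false non-matches.
    return float(np.mean(genuine < threshold))

rng = np.random.default_rng(0)
genuine = rng.normal(0.8, 0.1, 100_000)   # toy same-person scores
impostor = rng.normal(0.2, 0.1, 100_000)  # toy different-person scores
print(fnmr_at_fmr(genuine, impostor, target_fmr=1e-4))
```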
2.3 LANGUAGE
Natural language processing (NLP) involves teaching machines to interpret, classify, manipulate, and generate language. From the early use of handwritten rules and statistical techniques to the recent adoption of generative models and deep learning, NLP has become an integral part of our lives, with applications in text generation, machine translation, question answering, and other tasks. In recent years, advances in natural language processing technology have led to significant changes in large-scale systems that billions of people access. For instance, in late 2019, Google started to deploy its BERT algorithm into its search engine, leading to what the company said was a significant improvement in its in-house quality metrics. Microsoft followed suit, announcing later in 2019 that it was using BERT to augment its Bing search engine.

ENGLISH LANGUAGE UNDERSTANDING BENCHMARKS

SuperGLUE
Launched in May 2019, SuperGLUE (https://super.gluebenchmark.com/) is a single-metric benchmark that evaluates the performance of a model on a series of language understanding tasks on established datasets. SuperGLUE replaced the prior GLUE benchmark (introduced in 2018) with more challenging and diverse tasks. The SuperGLUE score is calculated by averaging scores on a set of tasks. Microsoft's DeBERTa model now tops the SuperGLUE leaderboard, with a score of 90.3, compared with an average score of 89.8 for SuperGLUE's "human baselines." This does not mean that AI systems have surpassed human performance on every SuperGLUE task, but it does mean that average performance across the entire suite has exceeded that of the human baseline. The rapid pace of progress (Figure 2.3.1) suggests that SuperGLUE may need to be made more challenging or replaced by harder tests in the future, just as SuperGLUE replaced GLUE.

[Figure 2.3.1: SuperGLUE benchmark scores, 07/2019-01/2021, reaching 90.3 versus a human baseline of 89.8. Source: SuperGLUE Leaderboard, 2020]

SQuAD
The Stanford Question Answering Dataset, or SQuAD, is a reading-comprehension benchmark that measures how accurately an NLP model can provide short answers to a series of questions about a small article of text. The SQuAD test makers established a human performance benchmark by having a group of people read Wikipedia articles on a variety of topics and then answer questions about those articles. Models are given the same task and are evaluated on the F1 score: the average overlap between the model's predicted answer and the correct answer. Higher scores indicate better performance.

Two years after the introduction of the original SQuAD in 2016, SQuAD 2.0 was developed once the initial benchmark revealed increasingly fast progress by participants (mirroring the trend seen in GLUE and SuperGLUE). SQuAD 2.0 combines the 100,000 questions in SQuAD 1.1 with over 50,000 unanswerable questions written by crowdworkers to resemble answerable ones. The objective is to test how well systems can answer questions and determine when no answer exists. As Figure 2.3.2 shows, the F1 score for SQuAD 1.1 improved from 67.75 in August 2016 to surpass human performance of 91.22 in September 2018, a 25-month period, whereas SQuAD 2.0 took just 10 months to beat human performance (from 66.3 in May 2018 to 89.47 in March 2019). In 2020, the most advanced models on SQuAD 1.1 and SQuAD 2.0 reached F1 scores of 95.38 and 93.01, respectively.

[Figure 2.3.2: SQuAD 1.1 and SQuAD 2.0 F1 scores, 07/2016-01/2020: SQuAD 1.1 95.4 (human 91.2); SQuAD 2.0 93.0 (human 89.5). Source: CodaLab Worksheets, 2020]
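The token-overlap F1 at the heart of the SQuAD metric is easy to sketch. The official evaluation script also normalizes text (lowercasing, stripping punctuation and articles) and takes the maximum score over several reference answers; the fragment below keeps only the core computation:

```python
# A sketch of the token-overlap F1 used in SQuAD-style evaluation:
# precision and recall are computed over the bag of tokens shared by the
# predicted and gold answer strings.
from collections import Counter

def squad_f1(prediction: str, gold: str) -> float:
    pred_tokens = prediction.lower().split()
    gold_tokens = gold.lower().split()
    common = Counter(pred_tokens) & Counter(gold_tokens)
    n_common = sum(common.values())
    if n_common == 0:
        return 0.0
    precision = n_common / len(pred_tokens)
    recall = n_common / len(gold_tokens)
    return 2 * precision * recall / (precision + recall)

print(squad_f1("the 1921 silver dollar", "1921 silver dollar"))  # ~0.86
```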
COMMERCIAL MACHINE TRANSLATION (MT)
Machine translation (MT), the subfield of computational linguistics that investigates the use of software to translate text or speech from one language to another, has seen significant improvement due to advances in machine learning. Recent progress in MT has prompted developers to shift from symbolic approaches toward ones that use both statistical and deep learning approaches.

Number of Commercially Available MT Systems
The trend in the number of commercially available systems speaks to the significant growth of commercial machine translation technology and its rapid adoption in the commercial marketplace. In 2020, the number of commercially available independent cloud MT systems with pre-trained models increased to 28, from 8 in 2017, according to Intento, a startup that evaluates commercially available MT services (Figure 2.3.3).

[Figure 2.3.3: Number of independent machine translation services (preview and commercial), 05/2017-07/2020. Source: Intento, 2020]

GPT-3
In July 2020, OpenAI unveiled GPT-3, the largest known dense language model. GPT-3 has 175 billion parameters and was trained on 570 gigabytes of text. For comparison, its predecessor, GPT-2, was over 100 times smaller, at 1.5 billion parameters. This increase in scale leads to surprising behavior: GPT-3 is able to perform tasks it was not explicitly trained on with zero to few training examples (referred to as zero-shot and few-shot learning, respectively). This behavior was mostly absent in the much smaller GPT-2. Furthermore, for some tasks (though not all; e.g., SuperGLUE and SQuAD 2.0), GPT-3 outperforms state-of-the-art models that were explicitly trained to solve those tasks with far more training examples.

Figure 2.3.4, adapted from the GPT-3 paper (https://arxiv.org/pdf/2005.14165.pdf), demonstrates the impact of scale (in terms of model parameters) on task accuracy (higher is better) in zero-, one-, and few-shot learning regimes. Each point on the curve corresponds to an average performance accuracy, aggregated across 42 accuracy-oriented benchmarks. As model size increases, average accuracy in all task regimes increases accordingly. Few-shot learning accuracy increases more rapidly with scale than zero-shot learning, which suggests that large models can perform surprisingly well given minimal context.

[Figure 2.3.4: GPT-3 average performance across 42 benchmarks versus number of parameters (billions, log scale): zero-shot 42.6%, one-shot 51.0%, few-shot 57.4%. Source: OpenAI (Brown et al.), 2020]
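Zero-, one-, and few-shot evaluation differ only in how many worked examples are placed in the model's context window before the query. The strings below, adapted from the English-to-French translation example in the GPT-3 paper, illustrate the format; they are illustrative only, as the paper's prompts vary by task:

```python
# Few-shot: worked examples appear in the prompt itself; no weights are
# updated. The model is simply asked to continue the pattern.
few_shot_prompt = """Translate English to French:
sea otter => loutre de mer
peppermint => menthe poivrée
plush giraffe => girafe en peluche
cheese =>"""

# Zero-shot: the same task with no worked examples, only an instruction.
zero_shot_prompt = """Translate English to French:
cheese =>"""
```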
That a single model can achieve state-of-the-art, or close to state-of-the-art, performance in limited-training-data regimes is impressive. Most models until now have been designed for a single task and thus can be evaluated effectively by a single metric. In light of GPT-3, we anticipate novel benchmarks that are explicitly designed to evaluate the zero- to few-shot learning performance of language models. This will not be straightforward. Developers are increasingly finding novel model capabilities (e.g., the ability to generate a website from a text description) that will be difficult to define, let alone measure performance on. Nevertheless, the AI Index is committed to tracking performance in this new context as it evolves.

Despite its impressive capabilities, GPT-3 has several shortcomings, many of which are outlined in the original paper. For example, it can generate racist, sexist, and otherwise biased text. Furthermore, GPT-3 (like other language models) can generate unpredictable and factually inaccurate text. Techniques for controlling and "steering" such outputs to better align with human values are nascent but promising. GPT-3 is also expensive to train, which means that only a limited number of organizations with abundant resources can currently afford to develop and deploy such models. Finally, GPT-3 has an unusually large number of uses, from chatbots to computer code generation to search. Future users are likely to discover more applications, both good and bad, making it difficult to identify the range of possible uses and forecast their impact on society. Nevertheless, research to address harmful outputs and uses is ongoing at several universities and industrial research labs, including OpenAI. For more details, refer to work by Bender and Gebru et al. (https://faculty.washington.edu/ebender/papers/Stochastic_Parrots.pdf) and the proceedings of a recent Stanford Institute for Human-Centered Artificial Intelligence (HAI) workshop (which included researchers from OpenAI), "Understanding the Capabilities, Limitations, and Societal Impact of Large Language Models" (https://arxiv.org/abs/2102.02503).

2.4 LANGUAGE REASONING SKILLS

VISION AND LANGUAGE REASONING
Vision and language reasoning is a research area that addresses how well machines can jointly reason about visual and textual data.
Visual Question Answering (VQA) Challenge
The VQA challenge, introduced in 2015, requires machines to provide an accurate natural language answer, given an image from a public dataset and a natural language question about that image. Figure 2.4.1 shows that accuracy has grown by almost 40% since the challenge's first installment at the International Conference on Computer Vision (ICCV) in 2015. The highest accuracy in the 2020 challenge was 76.4%, moving closer to the human baseline of 80.8% and representing a 1.1% absolute increase over the top 2019 algorithm.

[Figure 2.4.1: Visual Question Answering (VQA) challenge accuracy, 10/2015 to 04/2020; best accuracy 76.4% versus an 80.8% human baseline. Source: VQA Challenge, 2020 (https://visualqa.org/)]

Visual Commonsense Reasoning (VCR) Task
The Visual Commonsense Reasoning (VCR) task, first introduced in 2018, asks machines to answer a challenging question about a given image and to justify that answer with reasoning (whereas VQA requests only an answer). The VCR dataset contains 290,000 pairs of multiple-choice questions, answers, and rationales, as well as over 110,000 images from movie scenes. The main evaluation mode for the VCR task is the Q->AR score, which requires machines to first choose the right answer (A) to a question (Q) among four answer choices (Q->A) and then select the correct rationale (R) among four rationale choices given that answer (QA->R). A higher score is better, and human performance on this task is a Q->AR score of 85. The best-performing machine improved the Q->AR score from 44 in 2018 to 70.5 in 2020 (Figure 2.4.2), a 60.2% increase over the 2018 result.

[Figure 2.4.2: Visual Commonsense Reasoning (VCR) task Q->AR score, 11/2018 to 07/2020; best score 70.5 versus human performance of 85. Source: VCR Leaderboard, 2020 (https://visualcommonsense.com/)]

2.5 SPEECH
A major aspect of AI research is the analysis and synthesis of human speech conveyed via audio data. In recent years, machine learning approaches have drastically improved performance across a range of tasks.

SPEECH RECOGNITION
Speech recognition, or automatic speech recognition (ASR), is the process that enables machines to recognize spoken words and convert them to text. Since IBM introduced its first speech recognition technology in 1962, the technology has evolved, with voice-driven applications such as Amazon Alexa, Google Home, and Apple Siri becoming increasingly prevalent. The flexibility and predictive power of deep neural networks, in particular, has allowed speech recognition to become more accessible.

Transcribe Speech: LibriSpeech
LibriSpeech, first introduced in 2015, is a dataset made up of 1,000 hours of speech from audiobooks. It has become widely used for the development and testing of speech recognition technologies. In recent years, neural-network-based AI systems have started to dramatically improve performance on LibriSpeech, lowering the word error rate (WER; 0% is optimal performance) to around 2% (Figure 2.5.1a and Figure 2.5.1b).
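For reference, WER is the word-level edit distance (substitutions, insertions, and deletions) between the reference transcript and the system's output, normalized by the length of the reference; a minimal sketch:

def word_error_rate(reference: str, hypothesis: str) -> float:
    """WER = (substitutions + insertions + deletions) / reference length,
    computed with a standard Levenshtein dynamic program over words."""
    ref, hyp = reference.split(), hypothesis.split()
    # dist[i][j] = edit distance between ref[:i] and hyp[:j]
    dist = [[0] * (len(hyp) + 1) for _ in range(len(ref) + 1)]
    for i in range(len(ref) + 1):
        dist[i][0] = i
    for j in range(len(hyp) + 1):
        dist[0][j] = j
    for i in range(1, len(ref) + 1):
        for j in range(1, len(hyp) + 1):
            sub = dist[i - 1][j - 1] + (ref[i - 1] != hyp[j - 1])
            dist[i][j] = min(sub, dist[i - 1][j] + 1, dist[i][j - 1] + 1)
    return dist[len(ref)][len(hyp)] / len(ref)

print(word_error_rate("the cat sat on the mat", "the cat sat on a mat"))  # ~0.167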
Developers can test their systems on LibriSpeech in two ways:
• Test Clean determines how well systems can transcribe speech from a higher-quality subset of the LibriSpeech dataset. This test gives clues about how AI systems might perform in more controlled environments.
• Test Other determines how well systems can deal with lower-quality parts of the LibriSpeech dataset. This test suggests how AI systems might perform in noisier (and perhaps more realistic) environments.
There has been substantial progress recently on both test sets, with an important trend emerging in the past two years: the gap between frontier systems' performance on Test Clean and Test Other has closed significantly, shrinking from an absolute difference of more than seven points in late 2015 to less than one point in 2020. This reveals dramatic improvements in the robustness of ASR systems over time and suggests that we might be saturating performance on LibriSpeech; in other words, harder tests may be needed.

[Figure 2.5.1a: LibriSpeech word error rate, Test Clean, 2016 to 2020; best WER 1.4. Source: Papers with Code, 2020]
[Figure 2.5.1b: LibriSpeech word error rate, Test Other, 2016 to 2020; best WER 2.6. Source: Papers with Code, 2020]

Speaker Recognition: VoxCeleb
Speaker identification tests how well machine learning systems can attribute speech to a particular person. The VoxCeleb dataset (https://www.robots.ox.ac.uk/~vgg/data/voxceleb/), first introduced in 2017, contains over a million utterances from 6,000 distinct speakers, and its associated speaker-identification task tests the error rate of systems that try to attribute a particular utterance to a particular speaker. A better (lower) score on VoxCeleb is a proxy for how well a machine can distinguish one voice among 6,000. The evaluation metric for VoxCeleb is the equal error rate (EER), a metric commonly used for identity-verification systems. EER captures both the false positive rate (assigning a label incorrectly) and the false negative rate (failing to assign a correct label): it is the error rate at the decision threshold where the two are equal. In recent years, progress on this task has come from hybrid systems, which fuse contemporary deep learning approaches with more structured algorithms developed by the broader speech-processing community. As of 2020, error rates have dropped to the point where computers have a very high (99.4%) ability to attribute utterances to a given speaker (Figure 2.5.2). Still, obstacles remain: these systems face challenges in processing speakers with different accents and in differentiating among speakers when confronted with a much larger pool (it is harder to identify one person in a set of a billion people than to pick out one person from the VoxCeleb training set of 6,000).

[Figure 2.5.2: VoxCeleb equal error rate, 2017 to 2020; best EER 0.6%. Source: VoxCeleb, 2020]
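As a rough illustration of the EER idea described above, the sketch below estimates it from verification scores, assuming higher scores mean "same speaker" and sweeping the decision threshold until the false positive and false negative rates meet; the scores and labels are made up:

import numpy as np

def equal_error_rate(scores: np.ndarray, labels: np.ndarray) -> float:
    """Estimate EER by sweeping thresholds over the observed scores and
    returning the error rate where FPR and FNR are closest."""
    best_gap, eer = float("inf"), 1.0
    for threshold in np.unique(scores):
        accept = scores >= threshold
        fpr = np.mean(accept[labels == 0])   # impostors wrongly accepted
        fnr = np.mean(~accept[labels == 1])  # true speakers wrongly rejected
        if abs(fpr - fnr) < best_gap:
            best_gap, eer = abs(fpr - fnr), (fpr + fnr) / 2
    return float(eer)

scores = np.array([0.9, 0.8, 0.7, 0.4, 0.3, 0.1])  # similarity scores
labels = np.array([1, 1, 0, 1, 0, 0])              # 1 = same speaker
print(equal_error_rate(scores, labels))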
The Race Gap in Speech Recognition Technology
Researchers from Stanford University found that state-of-the-art ASR systems exhibit significant racial and gender disparities: they misunderstand Black speakers roughly twice as often as white speakers. In the paper "Racial Disparities in Automated Speech Recognition" (https://www.pnas.org/content/117/14/7684), the authors ran thousands of audio snippets of white and Black speakers, transcribed from interviews conducted with 42 white speakers and 73 Black speakers, through leading speech-to-text services from Amazon, Apple, Google, IBM, and Microsoft. The results suggest that, on average, the systems made 19 errors per hundred words for white speakers and 35 errors per hundred words for Black speakers, nearly twice as many. Moreover, the systems performed particularly poorly for Black men, with more than 40 errors per hundred words (Figure 2.5.3). The breakdown by ASR system shows that the gaps are similar across companies (Figure 2.5.4). This research emphasizes the importance of addressing bias in AI technologies and ensuring equity as they mature and are deployed.

[Figure 2.5.3: Word error rate by race and gender on leading speech-to-text services, 2019. Source: Koenecke et al., 2020]
[Figure 2.5.4: Word error rate by service (Apple, IBM, Google, Amazon, Microsoft) and race, 2019. Source: Koenecke et al., 2020]

2.6 REASONING
This section measures progress on symbolic (or logical) reasoning in AI, which is the process of drawing conclusions from sets of assumptions. We consider two major reasoning problems, Boolean Satisfiability (SAT) and Automated Theorem Proving (ATP). Each has real-world applications (e.g., circuit design, scheduling, software verification) and poses significant measurement challenges. The SAT analysis shows how to assign credit for the overall improvement in the field to individual systems over time. The ATP analysis shows how to measure performance given an evolving test set. All analyses below are original to this report. Lars Kotthoff wrote the text and performed the analysis for the SAT section. Geoff Sutcliffe, Christian Suttner, and Raymond Perrault wrote the text and performed the analysis for the ATP section.
This work had not been published at the time of writing; consequently, a more academically rigorous version of this section (with references, more precise details, and further context) is included in the Appendix.

BOOLEAN SATISFIABILITY PROBLEM
Analysis and text by Lars Kotthoff

The SAT problem asks whether there is an assignment of values to a set of Boolean variables, joined by logical connectives, that makes the logical formula they form true. Many real-world problems, such as circuit design, automated theorem proving, and scheduling, can be represented and solved efficiently as SAT problems.

To measure a snapshot of state-of-the-art performance, the top-, median-, and bottom-ranked SAT solvers from each of the last five years (2016–2020) of the SAT Competition (http://www.satcompetition.org/), which has been running for almost 20 years, were examined. In particular, all 15 solvers were run on all 400 SAT instances from the main track of the 2020 competition, and the time (in CPU seconds) each took to solve all instances was measured.5 Critically, every solver was run on the same hardware, so that comparisons across years would not be confounded by improvements in hardware efficiency over time.

While the performance of the best solvers did not change significantly from 2016 to 2018, large improvements are evident in 2019 and 2020 (Figure 2.6.1). These improvements affect not only the best solvers but also their competitors: the median-ranked solver of 2019 performs better than the top-ranked solvers of all previous years, and the median-ranked solver of 2020 is almost on par with the top-ranked solver of 2019.

Performance improvements in SAT, and in hard computational AI problems more generally, come primarily from two kinds of algorithmic advances: novel techniques and more efficient implementations of existing techniques. Typically, improvements arise from novel techniques, but more efficient implementations can also increase performance, which makes it difficult to assess where a given improvement actually comes from. To address this problem, the temporal Shapley value, which quantifies the contribution of an individual system to state-of-the-art performance over time, was measured (see the Appendix for more details). Figure 2.6.2 shows the temporal Shapley value contributions of each solver for the different competition years. Note that the contributions of the 2016 solvers are highest because there is no earlier state of the art in our evaluation to compare them with, so their contribution is not discounted.

5 Acknowledgments: The Advanced Research Computing Center at the University of Wyoming provided resources for gathering the computational data. Austin Stephen performed the computational experiments.
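To ground the definition, here is a minimal brute-force illustration of the SAT decision problem; competition solvers rely on clause learning and careful engineering rather than enumeration, so this is for exposition only. Clauses follow the DIMACS-style convention in which a positive integer denotes a variable and a negative integer its negation:

from itertools import product

def is_satisfiable(clauses, num_vars):
    """Try every assignment of the Boolean variables; return one that makes
    every clause true, or None. Exponential in num_vars: illustration only."""
    for bits in product([False, True], repeat=num_vars):
        # Literal v > 0 is true iff bits[v-1]; literal -v is true iff not bits[v-1].
        if all(any(bits[abs(lit) - 1] == (lit > 0) for lit in clause)
               for clause in clauses):
            return bits
    return None

# (x1 or not x2) and (x2 or x3) and (not x1 or not x3)
print(is_satisfiable([[1, -2], [2, 3], [-1, -3]], num_vars=3))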
[Figure 2.6.1: Total time to solve all 400 instances for each solver and year (lower is better), 2016–2020. Source: Kotthoff, 2020]
[Figure 2.6.2: Temporal Shapley value contributions of individual solvers to the state of the art over time (higher is better), 2016–2020. Source: Kotthoff, 2020]

According to the temporal Shapley value, the best solver of 2020 contributes significantly more than the median- and bottom-ranked solvers do. The 2020 winner, Kissat, has the highest temporal Shapley value of any solver outside the first year. The changes it incorporates, compared with previous solvers, are almost exclusively more efficient data structures and algorithms; Kissat thus impressively demonstrates the impact of good engineering on state-of-the-art performance. By contrast, smallsat, the solver with the largest temporal Shapley value (but not the winner) in 2019, focuses on improved heuristics rather than a more efficient implementation. The same is true of Candy, the solver with the largest temporal Shapley value in 2017, whose main novelty is to analyze the structure of a SAT instance and apply heuristics based on that analysis. Interestingly, neither solver ranked first in its respective year; both were outperformed by versions of the Maple solver, which nevertheless contribute less to the state of the art. This indicates that incremental improvements, while not necessarily exciting, are important for good performance in practice. Based on our limited analysis of the field, novel techniques and more efficient implementations have made equally important contributions to the state of the art in SAT solving, and incremental improvements of established solvers are as likely to result in top performance as more substantial changes to solvers without a long track record.

AUTOMATED THEOREM PROVING (ATP)
Analysis and text by Geoff Sutcliffe, Christian Suttner, and Raymond Perrault

Automated Theorem Proving (ATP) concerns the development and use of systems that automate sound reasoning: the derivation of conclusions that follow inevitably from facts. ATP systems are at the heart of many computational tasks, including software verification. The TPTP problem library (http://www.tptp.org/) was used to evaluate the performance of ATP algorithms from 1997 to 2020 and to measure the fraction of problems solved by any system over time (see the Appendix for more details).
The analysis covers the whole TPTP (over 23,000 problems) as well as four salient subsets (each ranging between 500 and 5,500 problems): clause normal form (CNF), first-order form (FOF), monomorphic typed first-order form (TF0) with arithmetic, and monomorphic typed higher-order form (TH0) theorems, all including use of the equality operator. Figure 2.6.3 shows that the fraction of problems solved climbs consistently, indicating progress in the field. The noticeable progress from 2008 to 2013 included strong advances in the FOF, TF0, and TH0 subsets. In FOF, which has been used in many domains (e.g., mathematics, real-world knowledge, software verification), there were significant improvements in the Vampire, E, and iProver systems. In TF0 (primarily used for solving problems in mathematics and computer science) and TH0 (useful in subtle and complex topics such as philosophy and logic), there was rapid initial progress as systems developed techniques that solved "low-hanging fruit" problems. In 2014–2015, there was another burst of progress in TF0, as the Vampire system became capable of processing TF0 problems. Notably, progress has continued since 2015 but has slowed, with no indication of rapid advances or breakthroughs in the last few years.

While this analysis demonstrates progress in ATP, there is obviously room for much more. Two keys to solving ATP problems are axiom selection (given a large set of axioms, only some of which are needed for a proof of the conjecture, how to select an adequate subset) and search choice (at each stage of an ATP system's search for a solution, which logical formula(e) to select for attention). The latter issue has been at the forefront of ATP research since the field's inception in the 1960s, while the former has become increasingly important as large bodies of knowledge are encoded for ATP. In the last decade, there has been growing use of machine learning to address these two key challenges (e.g., in the MaLARea and Enigma ATP systems). Recent results from the CADE ATP System Competition (CASC; http://www.tptp.org/CASC) have shown that the emergence of machine learning is a potential game-changer for ATP.
[Figure 2.6.3: Percentage of TPTP problems solved, 1997–2020; by 2020: 71.4% of TF0, 64.0% of TH0, 61.1% of FOF, 57.8% of CNF, and 49.2% of all problems. Source: Sutcliffe, Suttner & Perrault, 2020]

2.7 HEALTHCARE AND BIOLOGY

MOLECULAR SYNTHESIS
Text by Nathan Benaich and Philippe Schwaller, in collaboration with the "State of AI Report" (https://www.stateof.ai/)

Over the last 25 years, the pharmaceutical industry has shifted from developing drugs from natural sources (e.g., plants) to conducting large-scale screens with chemically synthesized molecules. Machine learning allows scientists to determine which potential drugs are worth evaluating in the lab and the most effective way of synthesizing them. Various ML models can learn representations of chemical molecules for the purposes of chemical synthesis planning. One way to approach chemical synthesis planning is to represent chemical reactions with a text notation (SMILES strings) and cast the task as a machine translation problem. Work since 2018 has used the transformer architecture trained on large datasets of single-step reactions. Later work in 2020 approached forward prediction and retrosynthesis as a sequence of graph edits, in which the predicted molecules are built from scratch. Notably, these approaches offer an avenue to rapidly sweep through a list of candidate drug-like molecules in silico and output synthesizability scores and synthesis plans. This enables medicinal chemists to prioritize candidates for empirical validation and could ultimately let the pharmaceutical industry mine the vast chemical space to unearth novel drugs that benefit patients.

Test Set Accuracy for Forward Chemical Synthesis Planning
Figure 2.7.1 shows the top-1 accuracy of models benchmarked on a freely available dataset of one million reactions from U.S. patents.6 Top-1 accuracy means that the product the model predicts with the highest likelihood corresponds to the product reported in the ground truth. The data suggests steady progress in chemical synthesis planning over the last three years, with accuracy growing by 15.6% between 2017 and 2020. The latest molecular transformer scored 92% top-1 accuracy in November 2020.

[Figure 2.7.1: Chemical synthesis plans benchmark, top-1 test accuracy, 12/2017 to 11/2020, on the standard 500k benchmark and the human prediction benchmark; approaches include GNN-based models over graph edits and seq2seq-with-attention and transformer models over SMILES; best human: 76.5%. Source: Schwaller, 2020]

6 Acknowledgment: Philippe Schwaller at IBM Research–Europe and the University of Bern provided instructions and resources for gathering and analyzing the data.
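Top-1 accuracy as used in this benchmark reduces to a simple count; a minimal sketch, assuming each model output is a list of candidate products ranked by likelihood (the SMILES strings below are arbitrary examples):

def top1_accuracy(ranked_predictions, ground_truth):
    """Fraction of reactions where the highest-likelihood prediction
    (index 0) matches the reported product."""
    hits = sum(preds[0] == truth
               for preds, truth in zip(ranked_predictions, ground_truth))
    return hits / len(ground_truth)

preds = [["CCO", "CCN"], ["CC(=O)O", "CCO"], ["c1ccccc1", "CCC"]]
truth = ["CCO", "CCO", "c1ccccc1"]
print(top1_accuracy(preds, truth))  # ~0.667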
COVID-19 AND DRUG DISCOVERY
AI-powered drug discovery has gone open source to combat the COVID-19 pandemic. COVID Moonshot (https://covid.postera.ai/covid) is a crowdsourced initiative joined by over 500 international scientists to accelerate the development of a COVID-19 antiviral. The consortium of scientists submits molecular designs pro bono, with no claims. PostEra (https://postera.ai/), an AI startup, uses machine learning and computational tools to assess how easily the submitted compounds can be made and to generate synthetic routes for them. After the first week, Moonshot received over 2,000 submissions, and PostEra designed synthetic routes in under 48 hours; human chemists would have taken three to four weeks to accomplish the same task. Figure 2.7.2 shows the cumulative number of submissions over time: Moonshot received over 10,000 submissions from 365 contributors around the world in just four months. Toward the end of August 2020, the crowdsourcing had served its purpose, and the emphasis moved to optimizing the lead compounds and setting up for animal testing. As of February 2021, Moonshot aims to nominate a clinical candidate by the end of March.

[Figure 2.7.2: PostEra, total number of Moonshot submissions, 03/2020 to 12/2020; 15,545 in total. Source: PostEra, 2020]

ALPHAFOLD AND PROTEIN FOLDING
The protein folding problem, a grand challenge in structural biology, asks how to determine the three-dimensional structure of proteins (essential components of life) from their one-dimensional representations: sequences of amino acids.7 A solution to this problem would have wide-ranging applications, from better understanding the cellular basis of life, to fueling drug discovery, to curing diseases, to engineering de novo proteins for industrial tasks, and more. In recent years, machine learning approaches have started to make a meaningful difference on the protein folding problem. Most notably, DeepMind's AlphaFold debuted in 2018 at the Critical Assessment of Protein Structure Prediction (CASP) competition, a biennial competition that fosters and measures progress on protein folding. At CASP, competing teams are given amino acid sequences and tasked with predicting the three-dimensional structures of the corresponding proteins. Those structures are determined through laborious and expensive experimental methods (e.g., nuclear magnetic resonance spectroscopy, X-ray crystallography, cryo-electron microscopy) and are unknown to the competitors.
Performance on CASP is commonly measured by the Global Distance Test (GDT) score, a number between 0 and 100 that measures the similarity between two protein structures; a higher GDT score is better. Figure 2.7.3, adapted from the DeepMind blog post (https://deepmind.com/blog/article/alphafold-a-solution-to-a-50-year-old-grand-challenge-in-biology), shows the median GDT scores of the best team on some of the hardest proteins to predict (the "free-modeling" category) at CASP over the last 14 years. In the past, winning algorithms were typically based on physics-based models; in the last two competitions, however, DeepMind's AlphaFold and AlphaFold 2 achieved their winning scores through the partial incorporation of deep learning techniques.

[Figure 2.7.3: CASP median accuracy (GDT_TS) of predictions in free-modeling by the best team, CASP7 (2006) through CASP14 (2020), highlighting AlphaFold (CASP13) and AlphaFold 2 (CASP14). Source: DeepMind, 2020]

7 Currently, most protein folding algorithms leverage multiple sequence alignments (many copies of a protein sequence representing the same protein across evolution) rather than just a single sequence.

EXPERT HIGHLIGHTS
This year, the AI Index asked AI experts to share their thoughts on the most significant technical AI breakthroughs in 2020. Here is a summary of their responses, along with a couple of individual highlights.

What was the single most impressive AI advancement in 2020?
• The two systems mentioned most often, by a significant margin, were AlphaFold (DeepMind), a protein structure prediction model, and GPT-3 (OpenAI), a generative text model.

What single trend will define AI in 2021?
• Experts predict that more advances will be built using pretrained models. For instance, GPT-3 is a large NLP model that can subsequently be fine-tuned for excellent performance on specific, narrow tasks. Similarly, 2020 saw various computer vision advances built on top of models pretrained on very large image datasets.

What aspect of AI technical progress, deployment, and development are you most excited to see in 2021?
• "It's interesting to note the dominance of the Transformers architecture, which started for machine translation but has become the de facto neural network architecture. More broadly, whereas NLP trailed vision in terms of adoption of deep learning, now it seems like advances in NLP are also driving vision." — Percy Liang, Stanford University
• "The incredible recent advancements in language generation have had a profound effect on the fields of NLP and machine learning, rendering formerly difficult research challenges and datasets suddenly useless while simultaneously encouraging new research efforts into the fascinating emergent capabilities (and important failings) of these complex new models." — Carissa Schoenick, Allen Institute for AI

CHAPTER 3: THE ECONOMY
Access the public data for this chapter: https://drive.google.com/drive/folders/1LUBMFoGssJN3sgtUTKf9IyfM447AM3Ik?usp=sharing

OVERVIEW
The rise of artificial intelligence (AI) inevitably raises the question of how much these technologies will affect businesses, labor, and the economy more generally. Given the recent progress and numerous breakthroughs in AI, the field offers substantial benefits and opportunities for businesses, from productivity gains through automation to tailoring products to consumers with algorithms, analyzing data at scale, and more. However, the boost in efficiency and productivity promised by AI also presents great challenges: companies must scramble to find and retain skilled talent to meet their production needs while implementing measures to mitigate the risks of using AI. Moreover, the COVID-19 pandemic has caused chaos and continued uncertainty for the global economy. How have private companies relied on and scaled AI technologies to help their businesses navigate this most difficult time?

This chapter looks at the increasingly intertwined relationship between AI and the global economy from the perspectives of jobs, investment, and corporate activity. It first analyzes worldwide demand for AI talent, using data on hiring rates and skill penetration rates from LinkedIn as well as AI job postings from Burning Glass Technologies. It then looks at trends in private AI investment, using statistics from S&P Capital IQ (CapIQ), Crunchbase, and Quid. The third and final section analyzes trends in the adoption of AI capabilities across companies, in robot installations across countries, and in mentions of AI on corporate earnings calls, drawing from McKinsey's Global Survey on AI, the International Federation of Robotics (IFR), and Prattle, respectively.

CHAPTER HIGHLIGHTS
• "Drugs, Cancer, Molecular, Drug Discovery" received the greatest amount of private AI investment in 2020, with more than USD 13.8 billion, 4.5 times higher than in 2019.
• Brazil, India, Canada, Singapore, and South Africa are the countries with the highest growth in AI hiring from 2016 to 2020. Despite the COVID-19 pandemic, AI hiring continued to grow across the sample countries in 2020.
• More private investment in AI is being funneled into fewer startups. Despite the pandemic, 2020 saw a 9.3% increase in the amount of private AI investment over 2019 (a higher percentage increase than the 5.7% recorded in 2019), though the number of newly funded companies decreased for the third year in a row.
• Despite growing calls to address the ethical concerns associated with using AI, efforts to address them in industry remain limited, according to a McKinsey survey. For example, issues such as equity and fairness in AI continue to receive comparatively little attention from companies. Moreover, fewer companies in 2020 viewed personal or individual privacy risks as relevant than in 2019, and there was no change in the percentage of respondents whose companies are taking steps to mitigate these particular risks.
• Despite the economic downturn caused by the pandemic, half the respondents in a McKinsey survey said the coronavirus had no effect on their investment in AI, while 27% reported increasing their investment. Less than a fourth of businesses decreased their investment in AI.
• The United States recorded a decrease in its share of AI job postings from 2019 to 2020, the first drop in six years. The total number of AI jobs posted in the United States also decreased by 8.2%, from 325,724 in 2019 to 300,999 in 2020.

3.1 JOBS
Attracting and retaining skilled AI talent is challenging. This section examines the latest trends in AI hiring, labor demand, and skill penetration, with data from LinkedIn and Burning Glass.

AI HIRING
How rapidly are AI jobs growing in different countries? This section first looks at LinkedIn data on the AI hiring rate in different countries. The AI hiring rate is calculated as the number of LinkedIn members who include AI skills on their profile or work in AI-related occupations and who added a new employer in the same month their new job began, divided by the total number of LinkedIn members in the country. This rate is then indexed to the average month in 2016; for example, an index of 1.05 in December 2020 indicates a hiring rate 5% higher than the average month of 2016. LinkedIn makes month-to-month comparisons to account for any potential lags in members updating their profiles. The index for a year is the average index over all months within that year.

The data suggests that the hiring rate increased across all sample countries in 2020. Brazil, India, Canada, Singapore, and South Africa are the countries with the highest growth in AI hiring from 2016 to 2020 (Figure 3.1.1). Across the 14 countries analyzed, the AI hiring rate in 2020 was, on average, 2.2 times higher than in 2016; for the top country, Brazil, the hiring index grew by more than 3.5 times. Moreover, despite the COVID-19 pandemic, AI hiring continued to grow across the 14 sampled countries in 2020 (Figure 3.1.2). For more cross-country comparisons, see the AI Index Global AI Vibrancy Tool.

[Figure 3.1.1: AI hiring index by country, 2020. Source: LinkedIn, 2020]

1 Countries included are a sample of eligible countries with at least 40% labor force coverage by LinkedIn and at least 10 AI hires in any given month. China and India were also included in this sample because of their increasing importance in the global economy, but LinkedIn coverage in these countries does not reach 40% of the workforce. Insights for these countries may not provide as full a picture as in other countries, and should be interpreted accordingly.
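As a concrete illustration of the indexing arithmetic described above, here is a minimal sketch; the monthly hiring rates are made-up numbers, not LinkedIn data:

def hiring_index(monthly_rates, monthly_rates_2016):
    """Index each month's AI hiring rate to the average 2016 month;
    the yearly index is the mean of the monthly indexes."""
    baseline = sum(monthly_rates_2016) / len(monthly_rates_2016)
    monthly_indexes = [rate / baseline for rate in monthly_rates]
    return sum(monthly_indexes) / len(monthly_indexes)

rates_2016 = [0.010, 0.011, 0.009, 0.010]  # illustrative hiring rates
rates_2020 = [0.021, 0.022, 0.020, 0.023]
print(hiring_index(rates_2020, rates_2016))  # 2.15: hiring roughly doubled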
[Figure 3.1.2: AI hiring index by country, 2016–2020; 2020 values: Brazil 3.4, India 2.8, Canada 2.7, Singapore 2.5, South Africa 2.3, Germany 2.2, Australia 2.1, United States 2.1, Argentina 2.0, United Kingdom 1.8, Turkey 1.8, Italy 1.7, France 1.6, China 1.3. Source: LinkedIn, 2020]

AI LABOR DEMAND
This section analyzes AI labor demand based on data from Burning Glass, an analytics firm that collects postings from over 45,000 online job sites. To develop a comprehensive, real-time portrait of labor market demand, Burning Glass aggregates job postings, removes duplicates, and extracts data from the posting text. Note that Burning Glass expanded its data coverage in 2020 to include more job sites; as a result, the numbers in this report should not be directly compared with data in the 2019 report.

Global AI Labor Demand
Demand for AI labor in the six countries covered by Burning Glass data (the United States, the United Kingdom, Canada, Australia, New Zealand, and Singapore) has grown significantly in the last seven years (Figure 3.1.3). On average, the share of AI job postings among all job postings in 2020 was more than five times larger than in 2013. Of the six countries, Singapore exhibited the largest growth: its percentage of AI job postings across all job roles in 2020 was 13.5 times larger than in 2013. The United States is the only one of the six countries that recorded a decrease in its share of AI job postings from 2019 to 2020, the first drop in six years. This may be due to the coronavirus pandemic or to the country's relatively more mature AI labor market. The total number of AI jobs posted in the United States also decreased by 8.2%, from 325,724 in 2019 to 300,999 in 2020.

[Figure 3.1.3: AI job postings as a percentage of all job postings by country, 2013–2020; 2020 shares: Singapore 2.4%, United States 0.8%, United Kingdom 0.8%, Canada 0.7%, Australia 0.5%, New Zealand 0.2%. Source: Burning Glass, 2020]

U.S. AI Labor Demand: By Skill Cluster
Taking a closer look at AI labor demand in the United States between 2013 and 2020, Figure 3.1.4 breaks down demand during that period year by year according to skill cluster.
Each skill cluster consists of a list of AI-related skills; for example, the neural networks cluster includes skills such as deep learning and convolutional neural networks. The Economy chapter appendix provides a complete list of the AI skills under each cluster. Between 2013 and 2020, AI jobs related to machine learning and to artificial intelligence experienced the fastest growth in online AI job postings in the United States, increasing from 0.1% of total jobs to 0.5% and from 0.03% to 0.3%, respectively. As noted earlier, 2020 shows a decrease in the share of AI jobs among overall job postings across all skill clusters.

[Figure 3.1.4: AI job postings as a percentage of all U.S. job postings by skill cluster, 2013–2020; 2020 shares: machine learning 0.5%, artificial intelligence 0.3%, and roughly 0.1% each for visual image recognition, robotics, neural networks, natural language processing, and autonomous driving. Source: Burning Glass, 2020]

U.S. Labor Demand: By Industry
To dig deeper into how AI job demand in the U.S. labor market varies across industries, this section looks at the share of AI job postings among all jobs posted in the United States by industry in 2020 (Figure 3.1.5), as well as the trend from 2013 to 2020 (Figure 3.1.6). In 2020, industries focused on information (2.8%); professional, scientific, and technical services (2.5%); and agriculture, forestry, fishing, and hunting (2.1%) had the highest shares of AI job postings among all job postings in the United States. While the first two have always dominated demand for AI jobs, the agriculture, forestry, fishing, and hunting industry saw the biggest jump in AI job share from 2019 to 2020, by almost 1 percentage point.

[Figure 3.1.5: AI job postings as a percentage of all U.S. job postings by industry, 2020, led by information; professional, scientific, and technical services; and agriculture, forestry, fishing, and hunting. Source: Burning Glass, 2020]
[Figure 3.1.6: AI job postings as a percentage of all U.S. job postings by industry, 2013–2020; 2020 shares range from 2.8% (information) to 0.1% (accommodation and food services). Source: Burning Glass, 2020]

U.S. Labor Demand: By State
As the competition for AI talent intensifies, where are companies seeking employees with machine learning, data science, and other AI-related skills within the United States? Figure 3.1.7 examines labor demand by U.S. state in 2020, plotting the share of AI job postings among all job postings on the y-axis and the total number of AI jobs posted (on a log scale) on the x-axis. The chart shows that the District of Columbia has the highest share of AI jobs posted (1.88%), overtaking the 2019 leader, Washington state, while California remains the state with the highest number of AI job postings (63,433). In addition to Washington, D.C., six states registered AI job shares above 1% of all postings (Washington, Virginia, Massachusetts, California, New York, and Maryland), compared with five last year. California also has more AI job postings than the next three states combined: Texas (22,539), New York (18,580), and Virginia (17,718).

[Figure 3.1.7: AI job postings, total (log scale) and as a percentage of all job postings, by U.S. state and district, 2020. Source: Burning Glass, 2020]

AI SKILL PENETRATION
How prevalent are AI skills across occupations?
The AI skill penetration metric shows the average share of AI skills among the top 50 skills in each occupation, using LinkedIn data that includes the skills listed on a member's profile, the positions held, and the locations of those positions.

Global Comparison
For cross-country comparison, the relative penetration rate of AI skills is measured as the sum of the penetration of each AI skill across occupations in a given country, divided by the average global penetration of AI skills across the same occupations. For example, a relative penetration rate of 2 means that the average penetration of AI skills in that country is twice the global average across the same set of occupations. Among the sample countries shown in Figure 3.1.8, the aggregated data from 2015 to 2020 shows that India (2.83 times the global average) has the highest relative AI skill penetration rate, followed by the United States (1.99), China (1.40), Germany (1.27), and Canada (1.13).2

[Figure 3.1.8: Relative AI skill penetration rate by country, 2015–2020. Source: LinkedIn, 2020]

Global Comparison: By Industry
To provide a sectoral decomposition of AI skill penetration across industries and sample countries, Figure 3.1.9 aggregates data for the five industries with the highest AI skill penetration globally over the last five years: education; finance; hardware and networking; manufacturing; and software and IT services.3 India has the highest relative AI skill penetration across all five industries, while the United States and China frequently appear high on the list. Other pockets of specialization worth highlighting, with relative skill penetration rates above 1, include Germany in hardware and networking as well as manufacturing, and Israel in manufacturing and education.

[Figure 3.1.9: Relative AI skill penetration rate by industry (education, finance, hardware and networking, manufacturing, software and IT services), 2015–2020. Source: LinkedIn, 2020]

2, 3 Countries included are a select sample of eligible countries with at least 40% labor force coverage by LinkedIn and at least 10 AI hires in any given month. China and India were included in this sample because of their increasing importance in the global economy, but LinkedIn coverage in these countries does not reach 40% of the workforce. Insights for these countries may not provide as full a picture as other countries, and should be interpreted accordingly.
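As a concrete illustration of the relative penetration calculation described above, a minimal sketch with made-up per-occupation AI skill penetration values:

def relative_ai_skill_penetration(country, global_avg):
    """Ratio of a country's summed AI-skill penetration across occupations
    to the global average penetration across the same occupations."""
    occupations = country.keys() & global_avg.keys()  # compare like with like
    return (sum(country[o] for o in occupations)
            / sum(global_avg[o] for o in occupations))

country = {"data scientist": 0.30, "engineer": 0.12, "analyst": 0.06}
world = {"data scientist": 0.15, "engineer": 0.06, "analyst": 0.03}
print(relative_ai_skill_penetration(country, world))  # 2.0: twice the global average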
3.2 INVESTMENT
This section explores the investment activity of private companies, as analyzed by NetBase Quid based on data from CapIQ and Crunchbase. Specifically, it looks at the latest trends in corporate AI investment, including private investment, public offerings, mergers and acquisitions (M&A), and minority stakes related to AI. The section then focuses on private investment in AI: how much private funding goes into AI startups, which sectors attract significant investment, and in which countries.

CORPORATE INVESTMENT
Total global investment in AI, including private investment, public offerings, M&A, and minority stakes, increased by 40% in 2020 relative to 2019, for a total of USD 67.9 billion (Figure 3.2.1). Given the pandemic, many small businesses suffered disproportionately; the resulting industry consolidation and increased M&A activity in 2020 drove up total corporate investment in AI. M&A made up the majority of the total investment amount in 2020, increasing by 121.7% relative to 2019. Several high-profile AI-related acquisitions took place in 2020, including NVIDIA's acquisition of Mellanox Technologies and Capgemini's acquisition of Altran Technologies.

[Figure 3.2.1: Global corporate investment in AI by investment activity (private investment, public offering, merger/acquisition, minority stake), 2015–2020; total investment grew from roughly USD 12.8 billion in 2015 to USD 67.9 billion in 2020. Source: CapIQ, Crunchbase, and NetBase Quid, 2020]

STARTUP ACTIVITY
The following section analyzes trends in private investment in AI startups that have received over USD 400,000 of investment in the last 10 years. While the amount of private investment in AI has soared dramatically in recent years, the rate of growth has slowed.

Global Trend
More private investment in AI is being funneled into fewer startups. Despite the pandemic, 2020 saw a 9.3% increase in the amount of private AI investment over 2019, a higher percentage increase than the 5.7% recorded in 2019 (Figure 3.2.2), though the number of companies funded decreased for the third year in a row (Figure 3.2.3). While private investment reached a record high of more than USD 40 billion in 2020, the 9.3% year-over-year increase compares with a largest-ever increase of 59.0% between 2017 and 2018. Moreover, the number of funded AI startups continued a sharp decline from its 2017 peak.
[Figure 3.2.2: Private investment in funded AI companies (in millions of U.S. dollars), 2015–2020. Source: CapIQ, Crunchbase, and NetBase Quid, 2020]
[Figure 3.2.3: Number of newly funded AI companies in the world, 2015–2020. Source: CapIQ, Crunchbase, and NetBase Quid, 2020]

Regional Comparison
As Figure 3.2.4 shows, the United States remains the leading destination for private investment, with over USD 23.6 billion in funding in 2020, followed by China (USD 9.9 billion) and the United Kingdom (USD 1.9 billion). A closer examination of the three contenders leading the AI race (the United States, China, and the European Union) further confirms the United States' dominant position in private AI investment. While China saw an exceptionally high amount of private AI investment in 2018, its 2020 investment level is less than half that of the United States (Figure 3.2.5). It is important to note, however, that China has strong public investment in AI: both central and local governments in China are spending heavily on AI R&D.4

[Figure 3.2.4: Private investment in AI by country, 2020, led by the United States, China, the United Kingdom, Israel, Canada, Germany, France, India, Japan, Singapore, and Australia. Source: CapIQ, Crunchbase, and NetBase Quid, 2020]
[Figure 3.2.5: Private investment in AI by geographic area, 2015–2020; 2020 totals (millions of U.S. dollars): United States 23,597; China 9,933; rest of the world 6,662; EU 2,044. Source: CapIQ, Crunchbase, and NetBase Quid, 2020]

4 See "A Brief Examination of Chinese Government Expenditures on Artificial Intelligence R&D" (2020) by the Institute for Defense Analyses for more details: https://www.ida.org/-/media/feature/publications/a/ab/a-brief-examination-of-chinese-government-expenditures-on-artificial-intelligence-r-and-d/d-12068.ashx

Focus Area Analysis
Figure 3.2.6 ranks the top 10 focus areas that received the greatest amount of private AI investment in 2020, alongside their respective investment amounts in 2019.
The “Drugs, Cancer, Molecular, Drug Discovery” area tops the list, with more than USD 13.8 billion in private AI investment—4.5 times higher than in 2019—followed by “Autonomous Vehicles, Fleet, Autonomous Driving, Road” (USD 4.5 billion) and “Students, Courses, Edtech, English Language” (USD 4.1 billion). In addition to “Drugs, Cancer, Molecular, Drug Discovery,” both “Games, Fans, Gaming, Football” and “Students, Courses, Edtech, English Language” saw a significant increase in the amount of private AI investment from 2019 to 2020. The former was largely driven by several financing rounds for gaming and sports startups in the United States and South Korea, while the latter was boosted by investments in an online education platform in China.

[Figure 3.2.6: Global private investment in AI by focus area, 2019 vs. 2020, in millions of U.S. dollars; focus areas include Drugs, Cancer, Molecular, Drug Discovery; Autonomous Vehicles, Fleet, Autonomous Driving, Road; Students, Courses, Edtech, English Language; Open Source, Compute, Hadoop, DevOps; Speech Recognition, Computer Interaction, Dialogue, Machine Translation; Money Laundering, Anti-Fraud, Fraud Detection, Fraud Prevention; Fashion, Shopping Experience, Beauty, Visual Search; Games, Fans, Gaming, Football; Semiconductor, Chip, Data Centers, Processor; Bank, Card, Credit Cards, Gift. Source: CapIQ, Crunchbase, and NetBase Quid, 2020.]

Related coverage:
https://techcrunch.com/2020/05/22/statespace-the-platform-that-trains-gamers-raises-15-million/
https://www.sportbusiness.com/news/kakao-vx-raises-money-for-virtual-golf-business/
https://www.prnewswire.com/news-releases/yuanfudao-raises-us2-2-billion-in-new-financing-valuing-the-company-at-us15-5-billion-becoming-the-most-valued-ed-tech-company-worldwide-301157837.html

3.3 CORPORATE ACTIVITY

This section reviews how corporations have capitalized on advances in AI, using AI and automation to their advantage and generating value at scale. While the number of corporations starting to deploy AI technologies has surged in recent years, the economic turmoil and impact of COVID-19 in 2020 slowed that rate of adoption. The latest trends in corporate AI activities are examined through data on the adoption of AI capabilities from McKinsey’s Global Survey on AI, trends in robot installations across the globe from the International Federation of Robotics (IFR), and mentions of AI in corporate earnings calls from Prattle.

INDUSTRY ADOPTION

This section shares the results of a McKinsey & Company survey of 2,395 respondents: individuals representing companies from a range of regions, industries, sizes, functional specialties, and tenures. McKinsey & Company’s “The State of AI in 2020” report contains the full results of this survey, including insights on how different companies have adopted AI across functions, core best practices shared among the companies generating the greatest value from AI, and the impacts of the COVID-19 pandemic on these companies’ AI investment plans.

Global Adoption of AI

The 2020 survey results suggest no increase in AI adoption relative to 2019. Over 50% of respondents say that their organizations have adopted AI in at least one business function (Figure 3.3.1). In 2019, 58% of respondents said their companies had adopted AI in at least one function, although the 2019 survey asked about companies’ AI adoption differently. In 2020, companies in developed Asia-Pacific countries led in AI adoption, followed by those in India and North America. While AI adoption was about equal across regions in 2019, this year’s respondents working for companies in Latin America and in other developing countries are much less likely to report adopting AI in at least one business function.
[Figure 3.3.1: AI adoption by organizations globally, 2020 (% of respondents), by region: developed Asia-Pacific, India, North America, Europe, developing markets (incl. China, MENA), Latin America, and all geographies. Source: McKinsey & Company, 2020.]

Survey report: https://www.mckinsey.com/Business-Functions/McKinsey-Analytics/Our-Insights/Global-survey-The-state-of-AI-in-2020

AI Adoption by Industry and Function

Respondents representing companies in high tech and telecom were most likely to report AI adoption in 2020, similar to the 2019 results, followed in second place by both financial services and automotive and assembly (Figure 3.3.2). In another repeat from 2019 (and 2018), the 2020 survey suggests that the functions where companies are most likely to adopt AI vary by industry (Figure 3.3.3). For example, respondents in the automotive and assembly industry report greater AI adoption for manufacturing-related tasks than for any other function; respondents in financial services report greater AI adoption for risk functions; and respondents in high tech and telecom report greater AI adoption for product and service development functions. Across industries, companies in 2020 are most likely to report using AI for service operations (such as field services, customer care, and back office), product and service development, and marketing and sales, similar to the survey results in 2019.

[Figure 3.3.2: AI adoption by industry, 2020 (% of respondents): high tech/telecom, automotive and assembly, financial services, business, legal, and professional services, healthcare/pharma, consumer goods/retail. Source: McKinsey & Company, 2020.]

Type of AI Capabilities Adopted

The type of AI capabilities adopted varies by industry (Figure 3.3.4). Across industries, companies in 2020 were most likely to identify other machine learning techniques, robotic process automation, and computer vision as capabilities adopted in at least one business function. Industries tend to adopt AI capabilities that best serve their core functions. For example, physical robotics and autonomous vehicles are most frequently adopted by industries where manufacturing and distribution play a large role, such as automotive and assembly, and consumer goods and retail. Natural language processing capabilities, such as text understanding, speech understanding, and text generation, are frequently adopted by industries with high volumes of customer or operational data in text form; these include business, legal, and professional services; financial services; healthcare; and high tech and telecom.

[Figure 3.3.3: AI adoption by industry and function, 2020 (% of respondents), covering human resources, manufacturing, marketing and sales, product and/or service development, risk, service operations, strategy and corporate finance, and supply-chain management. Source: McKinsey & Company, 2020.]
[Figure 3.3.4: AI capabilities embedded in standard business processes, 2020 (% of respondents), by industry; capabilities include autonomous vehicles, computer vision, conversational interfaces, deep learning, natural language generation, NL speech understanding, NL text understanding, other machine learning techniques, physical robotics, and robotic process automation. Source: McKinsey & Company, 2020.]

Consideration and Mitigation of Risks from Adopting AI

Only a minority of companies acknowledge the risks associated with AI, and even fewer report taking steps to mitigate those risks (Figure 3.3.5 and Figure 3.3.6). Relative to 2019, the share of survey respondents citing each risk as relevant has largely remained flat; that is, most changes were not statistically significant. Cybersecurity remains the only risk that a majority of respondents say their organizations consider relevant. A number of less commonly cited risks, such as national security and political stability, were more likely to be seen as relevant by companies in 2020 than in 2019.

Despite growing calls to attend to ethical concerns associated with the use of AI, efforts to address these concerns in the industry are limited. For example, concerns such as equity and fairness in AI use continue to receive comparatively little attention from companies. Moreover, fewer companies in 2020 view personal or individual privacy as a risk from adopting AI compared with 2019, and there is no change in the percentage of respondents whose companies are taking steps to mitigate this particular risk.

[Figure 3.3.5: Risks from adopting AI that organizations consider relevant, 2019 vs. 2020 (% of respondents): cybersecurity, regulatory compliance, explainability, personal/individual privacy, organizational reputation, workforce/labor displacement, equity and fairness, physical safety, national security, political stability. Source: McKinsey & Company, 2020.]

[Figure 3.3.6: Risks from adopting AI that organizations take steps to mitigate, 2019 vs. 2020 (% of respondents), same risk categories as Figure 3.3.5. Source: McKinsey & Company, 2020.]
The Effect of COVID-19

Despite the economic downturn caused by the pandemic, half of respondents said the pandemic had no effect on their investment in AI, while 27% actually reported increasing their investment. Less than a fourth of businesses decreased their investment in AI (Figure 3.3.7).[5] By industry, respondents in healthcare and pharma as well as automotive and assembly were the most likely to report that their companies had increased investment in AI.

[5] Figures may not sum to 100% because of rounding.

[Figure 3.3.7: Changes in AI investments amid the COVID-19 pandemic (% of respondents reporting decreased investment, no effect, or increased investment), by industry. Source: McKinsey & Company, 2020.]

INDUSTRIAL ROBOT INSTALLATIONS

Right now, AI is being deployed widely on consumer devices like smartphones and in personal vehicles (e.g., self-driving technology), but relatively little AI is deployed on actual robots.[6] That may change as researchers develop software to integrate AI-based approaches with contemporary robots. For now, it is possible to measure global sales of industrial robots to draw conclusions about the amount of AI-ready infrastructure being bought worldwide. While the COVID-19-induced economic crisis will lead to a decline in robot sales in the short term, the International Federation of Robotics (IFR) expects the pandemic to generate global growth opportunities for the robotics industry in the medium term.

Global Trend

After six years of growth, the number of new industrial robots installed worldwide decreased by 12%, from 422,271 units in 2018 to 373,240 units in 2019 (Figure 3.3.8). The decline is a product of trade tensions between the United States and China, as well as challenges faced by the two primary customer industries: automotive and electrical/electronics. The automotive industry took the lead with 28% of total installations, followed by electrical/electronics (24%), metal and machinery (12%), plastics and chemical products (5%), and food and beverages (3%).[7] It is important to note that these metrics measure installed infrastructure that is susceptible to adopting new AI technologies; they do not indicate whether every new robot uses a significant amount of AI.

[Figure 3.3.8: Global industrial robot installations, 2012-19, in thousands of units. Source: International Federation of Robotics, 2020.]
[6] For more insights on the adoption of AI and robots by industry, see the National Bureau of Economic Research working paper based on the 2018 Annual Business Survey by the U.S. Census Bureau, “Advancing Technologies Adoption and Use by U.S. Firms: Evidence From the Annual Business Survey” (2020): http://www.nber.org/papers/w28290

[7] Note that there is no information on the customer industry for approximately 20% of robots installed.

Regional Comparison

Asia, Europe, and North America—three of the largest industrial robot markets—all witnessed the end of a six-year growth period in robot installations (Figure 3.3.9). North America experienced the sharpest decline, of 16%, in 2019, compared with 5% in Europe and 13% in Asia. Figure 3.3.10 shows the number of installations in the five major industrial robot markets. All five—together accounting for 73% of global robot installations—saw roughly the same decline, except for Germany, which saw a slight bump in installations between 2018 and 2019. Despite the downward trend in China, it is worth noting that the country had more industrial robots installed in 2019 than the other four countries combined.

[Figure 3.3.9: New industrial robot installations by region (Asia, Europe, North America, others, South and Central America, Africa), 2017-19, in thousands of units. Source: International Federation of Robotics, 2020.]

[Figure 3.3.10: New industrial robot installations in five major markets (China, Japan, United States, South Korea, Germany) and the rest of the world, 2017-19, in thousands of units. Source: International Federation of Robotics, 2020.]

EARNINGS CALLS

Mentions of AI in corporate earnings calls have increased substantially since 2013, as Figure 3.3.11 shows. In 2020, the number of mentions of AI in earnings calls was two times higher than mentions of big data, cloud, and machine learning combined, though that figure declined by 8.5% from 2019. Mentions of big data peaked in 2017 and have since declined by 57%.

[Figure 3.3.11: Mentions of AI in corporate earnings calls, 2011-20; 2020 values: artificial intelligence 4,734; machine learning 1,356; big data 652; cloud 310. Source: Prattle & Liquidnet, 2020.]
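The “two times higher” comparison can be checked directly, assuming the labels in Figure 3.3.11 are the 2020 endpoint values; a quick sketch:

    # 2020 mention counts as labeled in Figure 3.3.11
    mentions_2020 = {
        "artificial intelligence": 4_734,
        "machine learning": 1_356,
        "big data": 652,
        "cloud": 310,
    }

    # Ratio of AI mentions to the other three categories combined
    others = sum(v for k, v in mentions_2020.items() if k != "artificial intelligence")
    print(round(mentions_2020["artificial intelligence"] / others, 2))  # 2.04, about two times the rest combined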
CHAPTER 4: AI Education

Chapter Preview:
Overview 109
Chapter Highlights 110
4.1 STATE OF AI EDUCATION IN HIGHER EDUCATION INSTITUTIONS 111
  Undergraduate AI Course Offerings 111
  Undergraduate Courses That Teach AI Skills 111
  Intro-Level AI and ML Courses 111
  Graduate AI Course Offerings 113
  Graduate Courses That Focus on AI Skills 113
  Faculty Who Focus on AI Research 113
4.2 AI AND CS DEGREE GRADUATES IN NORTH AMERICA 114
  CS Undergraduate Graduates in North America 114
  New CS PhDs in the United States 114
  New CS PhDs by Specialty 115
  New CS PhDs with AI/ML and Robotics/Vision Specialties 117
  New AI PhDs Employment in North America 118
  Industry vs. Academia 118
  New International AI PhDs 119
4.3 AI EDUCATION IN THE EUROPEAN UNION AND BEYOND 120
  AI Offerings in EU27 120
  By Content Taught in AI-Related Courses 121
  International Comparison 122
HIGHLIGHT: AI BRAIN DRAIN AND FACULTY DEPARTURE 123

ACCESS THE PUBLIC DATA: https://drive.google.com/drive/folders/1MaGQgZ5KkRlOnxXa8wTydiA-KfrlI-Ra?usp=sharing

OVERVIEW

As AI has become a more significant driver of economic activity, there has been increased interest from people who want to understand it and gain the necessary qualifications to work in the field. At the same time, rising AI demand from industry is tempting more professors to leave academia for the private sector. This chapter focuses on trends in the skills and training of AI talent through various education platforms and institutions. What follows is an examination of data from an AI Index survey on the state of AI education in higher education institutions, along with a discussion of computer science (CS) undergraduate graduates and PhD graduates who specialized in AI-related disciplines, based on the annual Computing Research Association (CRA) Taulbee Survey. The final section explores trends in AI education in Europe, drawing on statistics from the Joint Research Centre (JRC) at the European Commission.

CHAPTER HIGHLIGHTS

• An AI Index survey conducted in 2020 suggests that the world’s top universities have increased their investment in AI education over the past four years. The number of courses that teach students the skills necessary to build or deploy a practical AI model at the undergraduate and graduate levels has increased by 102.9% and 41.7%, respectively, over the last four academic years.

• More AI PhD graduates in North America chose to work in industry in the past 10 years, while fewer opted for jobs in academia, according to an annual survey from the Computing Research Association (CRA). The share of new AI PhDs who chose industry jobs increased by 48% in the past decade, from 44.4% in 2010 to 65.7% in 2019. By contrast, the share of new AI PhDs entering academia dropped by 44%, from 42.1% in 2010 to 23.7% in 2019.

• In the last 10 years, AI-related PhDs have gone from 14.2% of the total CS PhDs granted in the United States to around 23% as of 2019, according to the CRA survey.
At the same time, other previously popular CS PhD specialties have declined in popularity: networking, software engineering, and programming languages/compilers all saw a reduction in PhDs granted relative to 2010, while the AI and robotics/vision specializations saw a substantial increase.

• After a two-year increase, the number of AI faculty departures from universities to industry jobs in North America dropped from 42 in 2018 to 33 in 2019 (28 of these were tenured faculty and five were untenured). Carnegie Mellon University had the largest number of AI faculty departures between 2004 and 2019 (16), followed by the Georgia Institute of Technology (14) and the University of Washington (12).

• The percentage of international students among new AI PhDs in North America continued to rise in 2019, to 64.3%—a 4.3 percentage point increase from 2018. Among foreign graduates, 81.8% stayed in the United States and 8.6% took jobs outside the United States.

• In the European Union, the vast majority of specialized AI academic offerings are taught at the master’s level; robotics and automation is by far the most frequently taught subject in specialized bachelor’s and master’s programs, while machine learning (ML) dominates in specialized short courses.

4.1 STATE OF AI EDUCATION IN HIGHER EDUCATION INSTITUTIONS

In 2020, the AI Index developed a survey that asked computer science departments or schools of computing and informatics at top-ranking universities around the world and in emerging economies about four aspects of their AI education: undergraduate program offerings, graduate program offerings, offerings on AI ethics, and faculty expertise and diversity. The survey was completed by 18 universities from nine countries.[1] Results from the AI Index survey indicate that universities have increased both the number of AI courses they offer that teach students how to build and deploy a practical AI model and the number of AI-focused faculty.

UNDERGRADUATE AI COURSE OFFERINGS

Course offerings at the undergraduate level were examined by evaluating trends in courses that teach students the skills necessary to build or deploy a practical AI model, intro-level AI and ML courses, and enrollment statistics.

Undergraduate Courses That Teach AI Skills

The survey results suggest that CS departments have invested heavily in practical AI courses over the past four academic years (AY).[2] The number of courses on offer that teach students the skills necessary to build or deploy a practical AI model has increased by 102.9%, from 102 in AY 2016–17 to 207 in AY 2019–20, across 18 universities (Figure 4.1.1).

[Figure 4.1.1: Number of undergraduate courses that teach students the skills necessary to build or deploy a practical AI model, AY 2016-20. Source: AI Index, 2020.]
Intro-Level AI and ML Courses

The data shows that the number of students who enrolled or attempted to enroll in an Introduction to Artificial Intelligence or Introduction to Machine Learning course jumped by almost 60% over the past four academic years (Figure 4.1.2).[3] The slight drop in enrollment in intro-level AI and ML courses in AY 2019–20 is mostly driven by the decrease in the number of course offerings at U.S. universities. Intro-level course enrollment in the European Union has gradually increased, by 165% over the past four academic years, while such enrollment in the United States saw a clear dip in growth in the last academic year (Figure 4.1.3). Six of the eight U.S. universities surveyed say that the number of (attempted) enrollments in the introductory AI and ML courses decreased within the last year. Some universities cited students taking leaves during the pandemic as the main cause of the drop; others mentioned structural changes in intro-level AI course offerings—such as the creation of an Intro to Data Science course last year—that may have driven students away from traditional intro to AI and ML courses.

[1] The survey was distributed to 73 universities online over three waves from November 2020 to January 2021 and completed by 18 universities, a 24.7% response rate. The 18 universities are—Belgium: Katholieke Universiteit Leuven; Canada: McGill University; China: Shanghai Jiao Tong University, Tsinghua University; Germany: Ludwig Maximilian University of Munich, Technical University of Munich; Russia: Higher School of Economics, Moscow Institute of Physics and Technology; Switzerland: École Polytechnique Fédérale de Lausanne; United Kingdom: University of Cambridge; United States: California Institute of Technology, Carnegie Mellon University (Department of Machine Learning), Columbia University, Harvard University, Stanford University, University of Wisconsin–Madison, University of Texas at Austin, Yale University. University rankings referenced: https://www.timeshighereducation.com/world-university-rankings/2021/world-ranking and https://www.timeshighereducation.com/world-university-rankings/2020/emerging-economies-university-rankings

[2] See https://drive.google.com/file/d/11w_XMEdC_KkbRQqE-ThrfHunw8EDYsaR/view?usp=sharing for a list of keywords on practical artificial intelligence models provided to the survey respondents. A course is defined as a set of classes that requires a minimum of 2.5 class hours (including lecture, lab, TA hours, etc.) per week for at least 10 weeks in total. Multiple courses with the same titles and numbers count as one course.

[3] For universities that have a cap on course registration, the number of students who attempted to enroll in the intro-level AI and ML courses is included.

[Figure 4.1.2: Number of students who enrolled or attempted to enroll in Intro to AI and Intro to ML courses, AY 2016-20. Source: AI Index, 2020.]

[Figure 4.1.3: Percentage change in the number of students who enrolled or attempted to enroll in Intro to AI and Intro to ML courses, by geographic area, AY 2016-20 (AY 2016-17 = base); cumulative change: EU 165.1%, US 21.0%. Source: AI Index, 2020.]
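Figure 4.1.3 plots cumulative percentage change indexed to AY 2016-17 rather than year-over-year growth. A minimal sketch of that indexing; the enrollment counts below are made up purely to reproduce the 165.1% EU endpoint:

    def index_to_base(series: list[float]) -> list[float]:
        # Cumulative percentage change of each observation relative to the first one.
        base = series[0]
        return [round((x - base) / base * 100, 1) for x in series]

    # Hypothetical EU enrollment counts, AY 2016-17 through AY 2019-20
    eu_enrollment = [1_000, 1_600, 2_100, 2_651]
    print(index_to_base(eu_enrollment))  # [0.0, 60.0, 110.0, 165.1]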
GRADUATE AI COURSE OFFERINGS

The survey also looks at course offerings at the graduate or advanced degree level, specifically at graduate courses that teach students the skills necessary to build or deploy a practical AI model.[4]

Graduate Courses That Focus on AI Skills

Graduate offerings that teach students the skills required to build or deploy a practical AI model increased by 41.7% over the last four academic years, from 151 courses in AY 2016–17 to 214 in AY 2019–20 (Figure 4.1.4).

FACULTY WHO FOCUS ON AI RESEARCH

As shown in Figure 4.1.5, the number of tenure-track faculty with a primary research focus on AI at the surveyed universities grew significantly over the past four academic years, in keeping with the rising demand for AI classes and degree programs. The number of AI-focused faculty grew by 59.1%, from 105 in AY 2016–17 to 167 in AY 2019–20.

[4] See https://drive.google.com/file/d/11w_XMEdC_KkbRQqE-ThrfHunw8EDYsaR/view?usp=sharing for a list of keywords on practical artificial intelligence models provided to the survey respondents. A course is defined as a set of classes that requires a minimum of 2.5 class hours (including lecture, lab, TA hours, etc.) per week for at least 10 weeks in total. Multiple courses with the same titles and numbers count as one course.

[Figure 4.1.4: Number of graduate courses that teach students the skills necessary to build or deploy a practical AI model, AY 2016-20. Source: AI Index, 2020.]

[Figure 4.1.5: Number of tenure-track faculty who primarily focus their research on AI, AY 2016-20. Source: AI Index, 2020.]

4.2 AI AND CS DEGREE GRADUATES IN NORTH AMERICA

This section presents findings from the annual Taulbee Survey of the Computing Research Association (CRA): https://cra.org/resources/taulbee-survey/. The annual CRA survey documents trends in student enrollment, degree production, employment of graduates, and faculty salaries in academic units in the United States and Canada that grant doctoral degrees in computer science (CS), computer engineering (CE), or information (I). Academic units include departments of computer science and computer engineering or, in some cases, colleges or schools of information or computing.

CS UNDERGRADUATE GRADUATES IN NORTH AMERICA

Most AI-related courses in North America are part of the CS course offerings at the undergraduate level. The number of new CS undergraduate graduates at doctoral institutions in North America has grown steadily in the last 10 years (Figure 4.2.1).
More than 28,000 undergraduates completed CS degrees in 2019, around three times the number in 2010.

NEW CS PHDS IN THE UNITED STATES

This section examines the trend of CS PhD graduates in the United States, with a focus on those with AI-related specialties.[5] The CRA survey includes 20 specialties in total, two of which are directly related to the field of AI: “artificial intelligence/machine learning” and “robotics/vision.”

[5] New CS PhDs in this section include PhD graduates from academic units (departments, colleges, or schools within universities) of computer science in the United States.

[Figure 4.2.1: Number of new CS undergraduate graduates at doctoral institutions in North America, 2010-19. Source: CRA Taulbee Survey, 2020.]

NEW CS PHDS BY SPECIALTY

Among all computer science PhD graduates in 2019, those who specialized in artificial intelligence/machine learning (22.8%), theory and algorithms (8.0%), and robotics/vision (7.3%) top the list (Figure 4.2.2). The AI/ML specialty has been the most popular for the past decade, and the number of AI/ML graduates in 2019 is higher than the number for the next five specialties combined. Moreover, robotics/vision jumped from the eighth most popular specialization in 2018 to the third in 2019.

[Figure 4.2.2: New CS PhDs (% of total) in the United States by specialty, 2019; specialties range from artificial intelligence/machine learning at the top to information science at the bottom. Source: CRA Taulbee Survey, 2020.]

Over the past 10 years, AI/ML and robotics/vision are the CS PhD specializations that exhibit the most significant growth relative to the 18 other specializations (Figure 4.2.3). The percentage of AI/ML-specialized CS PhD graduates among all new CS PhDs in 2019 is 8.6 percentage points (pp) larger than in 2010, followed by robotics/vision-specialized doctorates at 2.4 pp. By contrast, the shares of CS PhDs specializing in networks (-4.8 pp), software engineering (-3.6 pp), and programming languages/compilers (-3.0 pp) experienced negative growth relative to 2010.
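A note on units: this chapter mixes percentage-point differences (a plain difference of two shares) with relative changes (growth of the share itself). A small sketch distinguishing the two, using the AI/ML shares quoted in this chapter (14.2% in 2010, 22.8% in 2019):

    def pp_change(old_share: float, new_share: float) -> float:
        # Percentage-point change: a plain difference of two shares.
        return new_share - old_share

    def relative_change(old_share: float, new_share: float) -> float:
        # Relative change of the share itself, in percent.
        return (new_share - old_share) / old_share * 100

    print(round(pp_change(14.2, 22.8), 1))       # 8.6 pp, as reported above
    print(round(relative_change(14.2, 22.8)))    # 61 (%), the relative figure cited in the next section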
[Figure 4.2.3: Percentage-point change in new CS PhDs in the United States from 2010 to 2019 by specialty; AI/ML and robotics/vision show the largest gains, while networks, software engineering, and programming languages/compilers show the largest declines. Source: CRA Taulbee Survey, 2020.]

NEW CS PHDS WITH AI/ML AND ROBOTICS/VISION SPECIALTIES

Figure 4.2.4a and Figure 4.2.4b take a closer look at the number of recent AI PhDs specializing in AI/ML or robotics/vision in the United States. Between 2010 and 2019, the number of AI/ML-focused graduates grew by 77%, while the percentage of these new PhDs among all CS PhD graduates increased by 61%. The number of both AI/ML and robotics/vision PhD graduates reached an all-time high in 2019.

[Figure 4.2.4a: New CS PhDs with AI/ML and robotics/vision specialties in the United States, 2010-19 (counts). Figure 4.2.4b: New CS PhDs (% of total) with AI/ML (22.8% in 2019) and robotics/vision (7.3% in 2019) specialties in the United States, 2010-19. Source: CRA Taulbee Survey, 2020.]

NEW AI PHDS EMPLOYMENT IN NORTH AMERICA

Where do new AI PhD graduates choose to work? This section captures the employment trends of new AI PhDs in academia and industry across North America.[6]

[Figure 4.2.5a: Employment of new AI PhDs in academia or industry in North America, 2010-19 (counts). Figure 4.2.5b: Employment of new AI PhDs (% of total) in academia (23.7% in 2019) or industry (65.7% in 2019) in North America, 2010-19. Source: CRA Taulbee Survey, 2020.]
Industry vs. Academia

In the past 10 years, the number of new AI PhD graduates in North America who chose industry jobs has continued to grow: the industry share increased by 48%, from 44.4% in 2010 to 65.7% in 2019 (Figure 4.2.5a and Figure 4.2.5b). By contrast, the share of new AI PhDs entering academia dropped by 44%, from 42.1% in 2010 to 23.7% in 2019. As is clear from Figure 4.2.5b, these changes largely reflect the fact that the number of PhD graduates entering academia has remained roughly level through the decade, while the large increase in PhD output is primarily being absorbed by industry.

[6] New AI PhDs in this section include PhD graduates who specialize in artificial intelligence from academic units (departments, colleges, or schools within universities) of computer science, computer engineering, and information in the United States and Canada.

NEW INTERNATIONAL AI PHDS

The percentage of international students among new AI PhD graduates in North America continued to rise in 2019, to 64.3%—a 4.3 percentage point increase from 2018 (Figure 4.2.6). For comparison, among all 2019 PhDs with a known specialty area, international students account for 63.4% of computer engineering, 59.6% of computer science, and 29.5% of information degree recipients. Moreover, among foreign AI PhD graduates in 2019 in the United States specifically, 81.8% stayed in the United States for employment and 8.6% took jobs outside the United States (Figure 4.2.7). In comparison, among all international student graduates with known specialties, 77.9% stayed in the United States while 10.4% were employed elsewhere.

[Figure 4.2.6: New international AI PhDs (% of total new AI PhDs) in North America, 2010-19; 64.3% in 2019. Source: CRA Taulbee Survey, 2020.]

[Figure 4.2.7: International new AI PhDs (% of total) in the United States by location of employment, 2019: United States 81.8%, outside the United States 8.6%, unknown 9.6%. Source: CRA Taulbee Survey, 2020.]

4.3 AI EDUCATION IN THE EUROPEAN UNION AND BEYOND

This section presents research from the Joint Research Centre (JRC) at the European Commission that assessed the academic offerings of advanced digital skills in the 27 European Union member states as well as six other countries: the United Kingdom, Norway, Switzerland, Canada, the United States, and Australia. This was the second such study,[7] and the 2020 version addressed four technological domains: artificial intelligence (AI), high performance computing (HPC), cybersecurity (CS), and data science (DS), applying text-mining and machine-learning techniques to extract content related to study programs addressing the specific domains.
See the reports “Academic Offer of Advanced Digital Skills in 2019–20: International Comparison” (https://ec.europa.eu/jrc/en/publication/academic-offer-advanced-digital-skills-2019-20-international-comparison) and “Estimation of Supply and Demand of Tertiary Education Places in Advanced Digital Profiles in the EU” (https://ec.europa.eu/jrc/en/publication/estimation-supply-and-demand-tertiary-education-places-advanced-digital-profiles-eu) for more detail.

AI OFFERINGS IN EU27

The study revealed a total of 1,032 AI programs across program scopes and program levels in the 27 EU countries (Figure 4.3.1). The overwhelming majority of specialized AI academic offerings in the EU are taught at the master’s level, which leads to a degree that equips students with strong competencies for the workforce. Germany leads the other member nations in offering the most specialized AI programs, followed by the Netherlands, France, and Sweden. France tops the list in offering the most AI programs at the master’s level.

[Figure 4.3.1: Number of specialized AI programs (bachelor, master, short courses) in EU27, 2019-20, by country, from Germany (most) down to Bulgaria (fewest). Source: Joint Research Centre, European Commission, 2020.]

[7] Note that the 2020 report introduced methodological improvements over the 2019 version; therefore, a strict comparison is not possible. Improvements include the removal of certain keywords and the addition of others to identify the programs. Still, more than 90% of all detected programs in the 2020 edition are triggered by keywords present in the 2019 study.

By Content Taught in AI-Related Courses

What types of AI technologies are the most popular among the course offerings at the three levels of specialized AI programs in the European Union? The data suggests that robotics and automation is by far the most frequently taught subject in the specialized bachelor’s and master’s programs, while machine learning dominates in the specialized short courses (Figure 4.3.2). As short courses cater to working professionals, the trend shows that machine learning has become one of the key competencies in the professional development and implementation of AI.

[Figure 4.3.2: Specialized AI programs (% of total) by content area in EU27, 2019-20, for short courses, bachelor’s, and master’s programs; content areas include robotics and automation, machine learning, AI applications, AI ethics, computer vision, natural language processing, connected and automated vehicles, knowledge representation and reasoning/planning/searching/optimization, multi-agent systems, philosophy of AI, AI (generic), and audio processing. Source: Joint Research Centre, European Commission, 2020.]
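The content-area tallies behind Figure 4.3.2 come from keyword-driven text mining of program descriptions. The JRC pipeline itself is not published as code, so the following is only a toy multi-label tagger under that assumption; the keyword map is abridged and hypothetical:

    # Hypothetical keyword map; the actual JRC keyword lists are longer and curated.
    CONTENT_KEYWORDS = {
        "Machine Learning": ["machine learning", "deep learning", "neural network"],
        "Robotics & Automation": ["robotics", "automation"],
        "Computer Vision": ["computer vision", "image recognition"],
        "AI Ethics": ["ethics", "accountability", "explainability"],
    }

    def tag_program(description: str) -> list[str]:
        # Return every content area whose keywords appear in a program description.
        text = description.lower()
        return [area for area, kws in CONTENT_KEYWORDS.items()
                if any(kw in text for kw in kws)]

    print(tag_program("MSc covering deep learning, robotics, and the ethics of AI"))
    # ['Machine Learning', 'Robotics & Automation', 'AI Ethics']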
It is also important to mention the role of AI ethics and AI applications, as both content areas claim a significant share of the education offerings across the three program levels. AI ethics—including courses on security, safety, accountability, and explainability—accounts for 14% of the curriculum on average, while AI applications—such as courses on big data, the internet of things, and virtual reality—take a similar share on average.

INTERNATIONAL COMPARISON

The JRC report compared AI education in the 27 EU member states with that in other countries in Europe, including Norway, Switzerland, and the United Kingdom, as well as Canada, the United States, and Australia. Figure 4.3.3 shows the total of 1,680 specialized AI programs across all countries considered in the 2019–20 academic year. The United States appears to have offered more programs specialized in AI than any other geographic area, although EU27 comes in a close second in terms of the number of AI-specialized master’s programs.

[Figure 4.3.3: Number of specialized AI programs by geographic area (United States, United Kingdom, EU27, Australia, Canada, Norway, Switzerland) and level (bachelor, master, short courses), 2019-20. Source: Joint Research Centre, European Commission, 2020.]

HIGHLIGHT: AI BRAIN DRAIN AND FACULTY DEPARTURE

Michael Gofman and Zhao Jin, researchers from the University of Rochester and the Cheung Kong Graduate School of Business, respectively, published a paper titled “Artificial Intelligence, Education, and Entrepreneurship” (https://papers.ssrn.com/sol3/papers.cfm?abstract_id=3449440) in 2019 that explores the relationship between the domain-specific knowledge of university students and their ability to establish startups and attract funding.[8] As the source of variation in students’ AI-specific knowledge, the co-authors used the departure of AI professors—what they referred to as “an unprecedented brain drain”—from universities to industry between 2004 and 2018. They relied on data hand-collected from LinkedIn, complemented by author affiliations from the Scopus database of academic publications and conferences. The paper found that AI faculty departures have a negative effect on AI startups founded by students who graduate from the universities where those professors used to work, with the researchers pointing to a chilling effect on future AI entrepreneurs in the years following the faculty departures.
PhD students are the most affected, compared with undergraduate and master’s students, and the negative impact intensifies when the AI professors who leave are replaced by faculty from lower-ranked schools or by untenured AI professors. With the updated 2019 data from Gofman and Jin, Figure 4.4.1 shows that after a two-year increase, the total number of AI faculty departures from universities in North America to industry dropped from 42 in 2018 to 33 in 2019 (28 of these were tenured faculty and 5 untenured). Between 2004 and 2019, Carnegie Mellon University had the largest number of AI faculty departures (16), followed by the Georgia Institute of Technology (14) and the University of Washington (12), as shown in Figure 4.4.2.

[8] See the AI Brain Drain Index for more details: http://www.aibraindrain.org/

[Figure 4.4.1: Number of AI faculty departures (tenured, untenured, and total) in North America, 2004-19. Source: Gofman and Jin, 2020.]

[Figure 4.4.2: Number of AI faculty departures in North America (with university affiliation) by university, 2004-18, led by Carnegie Mellon University, the Georgia Institute of Technology, and the University of Washington. Source: Gofman and Jin, 2020.]

CHAPTER 5: Ethical Challenges of AI Applications

Chapter Preview:
Overview 127
Chapter Highlights 128
5.1 AI PRINCIPLES AND FRAMEWORKS 129
5.2 GLOBAL NEWS MEDIA 131
5.3 ETHICS AT AI CONFERENCES 132
5.4 ETHICS OFFERINGS AT HIGHER EDUCATION INSTITUTIONS 134

ACCESS THE PUBLIC DATA: https://drive.google.com/drive/folders/1LH5cQis-PBlO6t6lm3eMB0QTctrfCFT1?usp=sharing

OVERVIEW

As artificial intelligence-powered innovations become ever more prevalent in our lives, the ethical challenges of AI applications are increasingly evident and subject to scrutiny. As previous chapters have addressed, the use of various AI technologies can lead to unintended but harmful consequences, such as privacy intrusion; discrimination based on gender, race/ethnicity, sexual orientation, or gender identity; and opaque decision-making, among other issues. Addressing existing ethical challenges and building responsible, fair AI innovations before they are deployed has never been more important. This chapter tackles the efforts to address the ethical issues that have arisen alongside the rise of AI applications. It first looks at the recent proliferation of documents charting AI principles and frameworks, as well as how the media covers AI-related ethical issues.
It then reviews ethics-related research presented at AI conferences and the kinds of ethics courses offered by computer science (CS) departments at universities around the world.

The AI Index team was surprised to discover how little data there is on this topic. Though a number of groups are producing a range of qualitative or normative outputs in the AI ethics domain, the field generally lacks benchmarks that can be used to measure or assess the relationship between broader societal discussions about technology development and the development of the technology itself. One data point, covered in the technical performance chapter, is the study by the National Institute of Standards and Technology on facial recognition performance with a focus on bias. Figuring out how to create more quantitative data presents a challenge for the research community, but it is a useful one to focus on. Policymakers are keenly aware of ethical concerns pertaining to AI, but it is easier for them to manage what they can measure, so finding ways to translate qualitative arguments into quantitative data is an essential step in the process.

CHAPTER HIGHLIGHTS

• The number of papers with ethics-related keywords in titles submitted to AI conferences has grown since 2015, though the average number of paper titles matching ethics-related keywords at major AI conferences remains low over the years.

• The five news topics that got the most attention in 2020 related to the ethical use of AI were the release of the European Commission’s white paper on AI, Google’s dismissal of ethics researcher Timnit Gebru, the AI ethics committee formed by the United Nations, the Vatican’s AI ethics plan, and IBM’s exit from the facial-recognition business.

5.1 AI PRINCIPLES AND FRAMEWORKS

Since 2015, governments, private companies, intergovernmental organizations, and research/professional organizations have been producing normative documents that chart approaches to managing the ethical challenges of AI applications. Those documents, which include principles, guidelines, and more, provide frameworks for addressing the concerns and assessing the strategies attached to developing, deploying, and governing AI within various organizations. Some common themes that emerge from these AI principles and frameworks include privacy, accountability, transparency, and explainability.

The publication of AI principles signals that organizations are paying heed to and establishing a vision for AI governance. Even so, the proliferation of so-called ethical principles has met with criticism from ethics researchers and human rights practitioners who oppose the imprecise usage of ethics-related terms. The critics also point out that the principles lack institutional frameworks and are non-binding in most cases; their vague and abstract nature fails to offer direction on how to implement AI-related ethics guidelines.

Researchers from the AI Ethics Lab in Boston created a ToolBox (https://aiethicslab.com/big-picture/) that tracks the growing body of AI principles. A total of 117 documents relating to AI principles were published between 2015 and 2020.
Data shows that research and professional organizations were among the earliest to roll out AI principle documents, and private companies have to date issued the largest number of publications on AI principles among all organization types (Figure 5.1.1). Europe and Central Asia have the highest number of publications as of 2020 (52), followed by North America (41) and East Asia and Pacific (14), according to Figure 5.1.2. In terms of rolling out ethics principles, 2018 was the clear high-water mark for tech companies—including IBM, Google, and Facebook—as well as various U.K., EU, and Australian government agencies.

[Figure 5.1.1: Number of new AI ethics principles by organization type (research/professional organization, private company, intergovernmental organization/agency, government agency), 2015-20. Source: AI Ethics Lab, 2020.]

[Figure 5.1.2: Number of new AI ethics principles by region (East Asia and Pacific, Europe and Central Asia, global, Latin America and Caribbean, Middle East and North Africa, North America, South Asia), 2015-20. Source: AI Ethics Lab, 2020.]

5.2 GLOBAL NEWS MEDIA

How has the news media covered the topic of the ethical use of AI technologies? This section analyzes data from NetBase Quid, which searched the archived news database of LexisNexis for articles that discuss AI ethics,[1] covering 60,000 English-language news sources and over 500,000 blogs in 2020. The search found 3,047 articles related to AI technologies that include terms such as “human rights,” “human values,” “responsibility,” “human control,” “fairness,” “discrimination” or “nondiscrimination,” “transparency,” “explainability,” “safety and security,” “accountability,” and “privacy.” (See the Appendix for more details on search terms.) NetBase Quid clustered the resulting media narratives into seven large themes based on language similarity.

[1] The methodology looks for articles that contain keywords related to AI ethics as determined by a Harvard research study: https://cyber.harvard.edu/publication/2020/principled-ai
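As a rough illustration of this kind of keyword screen (the NetBase Quid pipeline is proprietary; the term list below is the one quoted above, while the matching logic is a simplifying assumption):

    ETHICS_TERMS = {
        "human rights", "human values", "responsibility", "human control",
        "fairness", "discrimination", "nondiscrimination", "transparency",
        "explainability", "safety and security", "accountability", "privacy",
    }

    def is_ai_ethics_article(text: str) -> bool:
        # Crude screen: the article must mention AI and at least one ethics term.
        lowered = text.lower()
        mentions_ai = "artificial intelligence" in lowered or " ai " in lowered
        return mentions_ai and any(term in lowered for term in ETHICS_TERMS)

    corpus = ["An article on fairness in artificial intelligence hiring tools ..."]
    print(sum(is_ai_ethics_article(doc) for doc in corpus))  # 1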
Figure 5.2.1 shows that articles relating to AI ethics guidance and frameworks topped the list of the most covered news topics (21%) in 2020, followed by research and education (20%) and facial recognition (20%). The five news topics that received the most attention in 2020 related to the ethical use of AI were:

1. The release of the European Commission's white paper on AI (5.9%)
2. Google's dismissal of ethics researcher Timnit Gebru (3.5%)
3. The AI ethics committee formed by the United Nations (2.7%)
4. The Vatican's AI ethics plan (2.6%)
5. IBM's exit from the facial-recognition business (2.5%)

Figure 5.2.1: News Coverage on AI Ethics (% of Total) by Theme (guidance and frameworks; research and education; facial recognition; algorithm bias; robots and autonomous cars; AI explainability; data privacy; enterprise efforts), 2020. Source: CAPIQ, Crunchbase, and NetBase Quid, 2020.

1 The methodology looks for articles that contain keywords related to AI ethics, as determined by a Harvard research study: https://cyber.harvard.edu/publication/2020/principled-ai

5.3 ETHICS AT AI CONFERENCES

Researchers are writing more papers that focus directly on the ethics of AI, with submissions in this area more than doubling from 2015 to 2020. To measure the role of ethics in AI research, researchers from the Federal University of Rio Grande do Sul in Porto Alegre, Brazil, searched for ethics-related terms in the titles of papers at leading AI, machine learning, and robotics conferences (https://arxiv.org/abs/1809.08328). As Figure 5.3.1 shows, there has been a significant increase in the number of papers with ethics-related keywords in titles submitted to AI conferences since 2015. Further analysis in Figure 5.3.2 shows the average number of keyword matches across all publications at six major AI conferences: despite the growth visible in the previous chart, the average number of paper titles matching ethics-related keywords at these conferences has remained low over the years.

Changes are coming to AI conferences, though. Starting in 2020, the topic of ethics was more tightly integrated into conference proceedings. For instance, the Neural Information Processing Systems (NeurIPS) conference, one of the biggest AI research conferences in the world, asked researchers to submit "Broader Impacts" statements alongside their work for the first time in 2020, which led to a deeper integration of ethical concerns into technical work. Additionally, there has been a recent proliferation of conferences and workshops that focus specifically on responsible AI, including the new Artificial Intelligence, Ethics, and Society Conference by the Association for the Advancement of Artificial Intelligence and the Conference on Fairness, Accountability, and Transparency by the Association for Computing Machinery.
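The counting behind Figures 5.3.1 and 5.3.2 reduces to a per-conference, per-year tally of titles that match a keyword list. The sketch below shows that shape under stated assumptions: the keyword stems and paper records are invented, not the actual list or corpus used by the Federal University of Rio Grande do Sul team.

    # Hedged sketch of per-conference, per-year ethics-keyword title matching.
    from collections import defaultdict

    ETHICS_STEMS = ["ethic", "fairness", "accountab", "transparen", "privacy"]

    papers = [  # hypothetical records: (conference, year, title)
        ("NeurIPS", 2019, "Fairness Constraints in Classification"),
        ("NeurIPS", 2019, "Faster Convolutions on GPUs"),
        ("ICML", 2019, "Scalable Deep Learning"),
    ]

    matching = defaultdict(int)  # (conference, year) -> titles with a match
    totals = defaultdict(int)    # (conference, year) -> all titles

    for conf, year, title in papers:
        totals[(conf, year)] += 1
        if any(stem in title.lower() for stem in ETHICS_STEMS):
            matching[(conf, year)] += 1

    for key in sorted(totals):
        print(key, matching[key], "of", totals[key], "titles match")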
Figure 5.3.1: Number of Paper Titles Mentioning Ethics Keywords at AI Conferences, 2000–19. Source: Prates et al., 2018.
Figure 5.3.2: Average Number of Paper Titles Mentioning Ethics Keywords at Select Large AI Conferences (AAAI, ICML, ICRA, IJCAI, IROS, NIPS/NeurIPS), 2000–19. Source: Prates et al., 2018.

5.4 ETHICS OFFERINGS AT HIGHER EDUCATION INSTITUTIONS

Chapter 4 introduced a survey of computer science departments or schools at top universities around the world designed to assess the state of AI education in higher education institutions.2 In part, the survey asked whether the CS department or university offers opportunities to learn about the ethical side of AI and CS. Among the 16 universities that completed this part of the survey, 13 reported some type of relevant offering. Figure 5.4.1 shows that 11 of the 18 departments report hosting keynote events or panel discussions on AI ethics, while 7 offer stand-alone courses on AI ethics in CS or other departments at their university. Some universities also offer classes on ethics in the computer science field in general, including stand-alone CS ethics courses and ethics modules embedded in the CS curriculum.3

Figure 5.4.1: AI Ethics Offerings at CS Departments of Top Universities around the World (keynote events or panel discussions on AI ethics; university-wide undergraduate general-requirement courses on broadly defined ethics; stand-alone courses on AI ethics in CS or other departments; stand-alone courses on CS ethics in CS or other departments; ethics modules embedded into CS courses; AI ethics-related student groups/organizations), AY 2019–20. Source: AI Index, 2020.

2 The survey was distributed to 73 universities online over three waves from November 2020 to January 2021 and completed by 18 universities, a 24.7% response rate. The 18 universities are—Belgium: Katholieke Universiteit Leuven; Canada: McGill University; China: Shanghai Jiao Tong University, Tsinghua University; Germany: Ludwig Maximilian University of Munich, Technical University of Munich; Russia: Higher School of Economics, Moscow Institute of Physics and Technology; Switzerland: École Polytechnique Fédérale de Lausanne; United Kingdom: University of Cambridge; United States: California Institute of Technology, Carnegie Mellon University (Department of Machine Learning), Columbia University, Harvard University, Stanford University, University of Wisconsin–Madison, University of Texas at Austin, Yale University.
3 The survey did not explicitly present "ethics modules embedded into CS courses" as an option; respondents wrote these in under the "Others" option. The option will be included in next year's survey.

CHAPTER 6: DIVERSITY IN AI

Chapter Preview:
Overview
Chapter Highlights
6.1 Gender Diversity in AI: Women in Academic AI Settings; Women in the AI Workforce; Women in Machine Learning Workshops
6.2 Racial and Ethnic Diversity in AI: New AI PhDs in the United States by Race/Ethnicity; New Computing PhDs in the United States by Race/Ethnicity; CS Tenure-Track Faculty by Race/Ethnicity; Black in AI
6.3 Gender Identity and Sexual Orientation in AI: Queer in AI

ACCESS THE PUBLIC DATA: https://drive.google.com/drive/folders/1Ma_pmLLqYlrayfUkitRhrznIjtrVyjXf?usp=sharing

OVERVIEW

While artificial intelligence (AI) systems have the potential to dramatically affect society, the people building AI systems are not representative of the people those systems are meant to serve. The AI workforce remains predominantly male and lacking in diversity in both academia and industry, despite many years of warnings about the disadvantages and risks this engenders. The lack of diversity in race and ethnicity, gender identity, and sexual orientation not only risks creating an uneven distribution of power in the workforce; it also, equally important, reinforces existing inequalities generated by AI systems, reduces the scope of individuals and organizations for whom these systems work, and contributes to unjust outcomes.

This chapter presents diversity statistics for the AI workforce and academia. It draws on collaborations with various organizations—in particular, Women in Machine Learning (WiML), Black in AI (BAI), and Queer in AI (QAI)—each of which aims to improve diversity in some dimension of the field.1 The data is neither comprehensive nor conclusive. In preparing this chapter, the AI Index team encountered significant challenges stemming from the sparsity of publicly available demographic data, which limits the degree to which statistical analyses can assess the impact of the lack of diversity in the AI workforce on society as well as on broader technology development. The diversity problem in AI is well known, and making more data available from both academia and industry is essential to measuring the scale of the problem and addressing it.

There are many dimensions of diversity that this chapter does not cover, including AI professionals with disabilities; nor does it consider diversity through an intersectional lens. Other dimensions will be addressed in future iterations of this report. Moreover, these diversity statistics tell only part of the story.
The daily challenges of minorities and marginalized groups working in AI, as well as the structural problems within organizations that contribute to the lack of diversity, require more extensive data collection and analysis.

1 We thank Women in Machine Learning, Black in AI, and Queer in AI for their work to increase diversity in AI, for sharing their data, and for partnering with us.

CHAPTER HIGHLIGHTS

• The percentages of female AI PhD graduates and tenure-track computer science (CS) faculty have remained low for more than a decade. Female graduates of AI PhD programs in North America have accounted for less than 18% of all PhD graduates on average, according to an annual survey from the Computing Research Association (CRA). An AI Index survey suggests that female faculty make up just 16% of all tenure-track CS faculty at several universities around the world.
• The CRA survey suggests that in 2019, among new U.S. resident AI PhD graduates, 45% were white, 22.4% were Asian, 3.2% were Hispanic, and 2.4% were African American.
• The percentage of white (non-Hispanic) new computing PhDs has changed little over the last 10 years, accounting for 62.7% on average. The shares of Black or African American (non-Hispanic) and Hispanic computing PhDs over the same period are significantly lower, averaging 3.1% and 3.3%, respectively.
• Participation in Black in AI workshops, which are co-located with the Conference on Neural Information Processing Systems (NeurIPS), has grown significantly in recent years. The numbers of attendees and submitted papers in 2019 were 2.6 times higher than in 2017, while the number of accepted papers was 2.1 times higher.
• In a 2020 membership survey by Queer in AI, almost half the respondents said they view the lack of inclusiveness in the field as an obstacle they have faced in becoming a queer practitioner in the AI/ML field. More than 40% of members surveyed said they have experienced discrimination or harassment as a queer person at work or school.

6.1 GENDER DIVERSITY IN AI

WOMEN IN ACADEMIC AI SETTINGS

Chapter 4 introduced the AI Index survey that evaluates the state of AI education in CS departments at top universities around the world, along with the Computing Research Association's annual Taulbee Survey on the enrollment, production, and employment of PhDs in information, computer science, and computer engineering in North America. Data from both surveys show that the percentages of female AI and CS PhD graduates and of tenure-track CS faculty remain low.

Figure 6.1.1: Female New AI and CS PhDs (% of Total New AI and CS PhDs) in North America, 2010–19 (in 2019: AI 22.1%, CS 20.3%). Source: CRA Taulbee Survey, 2020.
Figure 6.1.2: Tenure-Track Faculty at CS Departments of Top Universities around the World by Gender, AY 2019–20 (male: 575, or 83.9%; female: 110, or 16.1%). Source: AI Index, 2020.
Female graduates of AI PhD programs and CS PhD programs have accounted for 18.3% of all PhD graduates on average over the past 10 years (Figure 6.1.1). Among the 17 universities that completed the AI Index survey of CS programs globally, female faculty make up just 16.1% of all tenure-track faculty whose primary research focus area is AI (Figure 6.1.2).

WOMEN IN THE AI WORKFORCE

Chapter 3 introduced the "global relative AI skills penetration rate," a measure that reflects the prevalence of AI skills across occupations, or the intensity with which people in certain occupations use AI skills. Figure 6.1.3 shows AI skills penetration by country for female and male labor pools in a set of select countries.2 The data suggest that in the majority of these countries, the AI skills penetration rate for women is lower than that for men. Among the 12 countries examined, India, South Korea, Singapore, and Australia are the closest to gender equity in AI skills penetration.

Figure 6.1.3: Relative AI Skills Penetration Rate by Gender, 2015–20 (countries shown: India, United States, South Korea, Singapore, China, Canada, France, Germany, Australia, United Kingdom, South Africa, Italy). Source: LinkedIn, 2020.

2 Countries included are a select sample of eligible countries with at least 40% labor force coverage by LinkedIn and at least 10 AI hires in any given month. China and India were included because of their increasing importance in the global economy, but LinkedIn coverage in these countries does not reach 40% of the workforce. Insights for these countries may not provide as full a picture as for other countries and should be interpreted accordingly.
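The report does not spell out the formula behind Figure 6.1.3, so the sketch below encodes one plausible reading: each country-gender group's share of members listing AI skills, divided by the global average share, so that values above 1 indicate above-average penetration. LinkedIn's actual methodology is occupation-weighted and more involved, and the profile records here are invented.

    # Hedged sketch: an assumed definition of a relative AI skills penetration rate.
    from collections import defaultdict

    profiles = [  # hypothetical records: (country, gender, lists_ai_skill)
        ("India", "female", True), ("India", "male", True),
        ("United States", "female", False), ("United States", "male", True),
    ]

    def ai_share(rows):
        """Fraction of profiles in `rows` that list an AI skill."""
        return sum(1 for _, _, ai in rows if ai) / len(rows)

    global_share = ai_share(profiles)

    groups = defaultdict(list)
    for country, gender, ai in profiles:
        groups[(country, gender)].append((country, gender, ai))

    # Relative rate: group share divided by the global average share.
    for (country, gender), rows in sorted(groups.items()):
        print(f"{country} ({gender}): {ai_share(rows) / global_share:.2f}")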
WOMEN IN MACHINE LEARNING WORKSHOPS

Women in Machine Learning (WiML), founded in 2006 by Hanna Wallach, Jenn Wortman, and Lisa Wainer, is an organization that runs events and programs to support women in the field of machine learning (ML) (https://wimlworkshop.org/). This section presents statistics from its annual technical workshops, which are held at NeurIPS. In 2020, WiML also hosted for the first time a full-day "Un-Workshop" at the 2020 International Conference on Machine Learning, which drew 812 participants.

Workshop Participants

The number of participants attending WiML workshops at NeurIPS has increased steadily since the workshops were first offered in 2006. According to the organization, the 2020 WiML workshop was completely virtual because of the pandemic and was delivered on a new platform (Gather.Town); both factors may make attendance numbers harder to compare with those of previous years. Figure 6.1.4 shows an estimate of 925 attendees in 2020, based on the number of individuals who accessed the virtual platform.

Figure 6.1.4: Number of Participants at the WiML Workshop at NeurIPS, 2006–20 (2020: 925). Source: Women in Machine Learning, 2020.

Over the past 10 years, WiML workshops have expanded their programs to include mentoring roundtables, where more senior participants offer one-on-one feedback and professional advice, in addition to the main session of keynotes and poster presentations. These expanded offerings may have contributed to the increase in attendance since 2014. Between 2016 and 2019, WiML workshop attendance averaged about 10% of overall NeurIPS attendance.

Demographics Breakdown

The following geographic, professional position, and gender breakdowns are based only on participants at the 2020 WiML workshop at NeurIPS who consented to having their information aggregated and who spent at least 10 minutes on the virtual platform through which the workshop was offered. Among these participants, 89.5% were women and/or nonbinary and 10.4% were men (Figure 6.1.5), and a large majority were from North America (Figure 6.1.6). Further, as shown in Figure 6.1.7, students—including PhD, master's, and undergraduate students—make up more than half the participants (54.6%). Among participants who work in industry, research scientist/engineer and data scientist/engineer are the most commonly held positions.
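The eligibility rule just described (consent plus at least 10 minutes on the platform) is straightforward to express in code. The sketch below assumes hypothetical record fields; it illustrates the filter-then-tally step, not WiML's actual processing.

    # Hedged sketch: filter by consent and time-on-platform, then tally gender.
    from collections import Counter

    participants = [  # hypothetical attendance records; field names are assumed
        {"gender": "woman and/or nonbinary", "consented": True, "minutes": 95},
        {"gender": "man", "consented": True, "minutes": 42},
        {"gender": "woman and/or nonbinary", "consented": False, "minutes": 30},
        {"gender": "man", "consented": True, "minutes": 3},
    ]

    eligible = [p for p in participants if p["consented"] and p["minutes"] >= 10]
    counts = Counter(p["gender"] for p in eligible)
    for gender, n in counts.most_common():
        print(f"{gender}: {100 * n / len(eligible):.1f}%")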
Figure 6.1.5: Participants of the WiML Workshop at NeurIPS (% of Total) by Gender, 2020 (woman and/or nonbinary: 89.5%; man: 10.4%). Source: Women in Machine Learning, 2020.
Figure 6.1.6: Participants of the WiML Workshop at NeurIPS (% of Total) by Continent of Residence (North America, Europe, Asia, Africa, Central and South America and the Caribbean, Australia and Oceania, Middle East), 2020. Source: Women in Machine Learning, 2020.
Figure 6.1.7: Participants of the WiML Workshop at NeurIPS (% of Total) by Top 10 Professional Positions (PhD student, research scientist/engineer, MSc student, data scientist/engineer, undergraduate student, postdoctoral researcher, software engineer, professor pre-tenure, professor post-tenure, program/product manager), 2020. Source: Women in Machine Learning, 2020.

6.2 RACIAL AND ETHNIC DIVERSITY IN AI

NEW AI PHDS IN THE UNITED STATES BY RACE/ETHNICITY

According to the CRA Taulbee Survey, among the new AI PhDs in 2019 who are U.S. residents, the largest percentage (45.6%) are white (non-Hispanic), followed by Asian (22.4%). By comparison, 2.4% were African American (non-Hispanic) and 3.2% were Hispanic (Figure 6.2.1).

Figure 6.2.1: New U.S. Resident AI PhDs (% of Total) by Race/Ethnicity, 2019 (white, non-Hispanic: 45.6%; unknown: 24.8%; Asian: 22.4%; Hispanic: 3.2%; Black or African American, non-Hispanic: 2.4%; multiracial, non-Hispanic: 1.6%). Source: CRA Taulbee Survey, 2020.

NEW COMPUTING PHDS IN THE UNITED STATES BY RACE/ETHNICITY

Figure 6.2.2 shows all PhDs awarded in the United States to U.S. residents across departments of computer science (CS), computer engineering (CE), and information (I) between 2010 and 2019. The CRA survey indicates that the percentage of white (non-Hispanic) new PhDs has changed little over the last 10 years, accounting for 62.7% on average. The shares of new Black or African American (non-Hispanic) and Hispanic computing PhDs over the same period are significantly lower, averaging 3.1% and 3.3%, respectively. We are not able to compare the numbers for new AI and CS PhDs in 2019 because of the share of unknown cases (24.8% for new AI PhDs and 8.5% for CS PhDs).

Figure 6.2.2: New Computing PhDs, U.S. Resident (% of Total) by Race/Ethnicity, 2010–19 (2019 values: white, non-Hispanic: 58.9%; Asian: 24.4%; unknown: 8.5%; Hispanic, any race: 3.2%; Black or African American, non-Hispanic: 2.5%; multiracial, non-Hispanic: 1.7%; Native Hawaiian/Pacific Islander: 0.6%; American Indian or Alaska Native: 0.3%). Source: CRA Taulbee Survey, 2020.
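The decade averages quoted above are simple means of yearly shares. The sketch below shows that computation with invented placeholder numbers rather than the actual CRA Taulbee series.

    # Hedged sketch: average each group's yearly share of new computing PhDs.
    yearly_share = {  # hypothetical: group -> share of new PhDs per year (%)
        "White (non-Hispanic)": [63.5, 62.9, 62.4, 61.9],
        "Black or African American (non-Hispanic)": [3.0, 3.1, 3.1, 3.2],
        "Hispanic, any race": [3.2, 3.3, 3.3, 3.4],
    }

    for group, shares in yearly_share.items():
        print(f"{group}: {sum(shares) / len(shares):.1f}% average share")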
CS TENURE-TRACK FACULTY BY RACE/ETHNICITY

Figure 6.2.3 shows data from the AI Index education survey.3 Among the 15 universities that completed the question about the racial makeup of their faculty, approximately 67.0% of tenure-track faculty are white, followed by Asian (14.3%), other races (8.3%), and mixed/other race, ethnicity, or origin (6.3%). The smallest groups among tenure-track faculty are those of Black or African origin and of Hispanic, Latino, or Spanish origin, who account for 0.6% and 0.8%, respectively.

Figure 6.2.3: Tenure-Track Faculty (% of Total) at CS Departments of Top Universities in the World by Race/Ethnicity, AY 2019–20 (white: 67.0%; Asian: 14.3%; other races: 8.3%; mixed/other race, ethnicity, or origin: 6.3%; Middle Eastern or North African: 2.7%; Hispanic, Latino, or Spanish origin: 0.8%; Black or African: 0.6%). Source: AI Index, 2020.

BLACK IN AI

Black in AI (BAI), founded in 2017 by Timnit Gebru and Rediet Abebe, is a multi-institutional and transcontinental initiative that aims to increase the presence of Black people in the field of AI. As of 2020, BAI has around 3,000 community members and allies, has held more than 10 workshops at major AI conferences, and has helped increase the number of Black people participating in major AI conferences globally 40-fold. Figure 6.2.4 shows the number of attendees, submitted papers, and accepted papers at the annual Black in AI Workshop, which is co-located with NeurIPS.4 The numbers of attendees and submitted papers in 2019 were 2.6 times higher than in 2017, while the number of accepted papers was 2.1 times higher.

Figure 6.2.4: Number of Attendees, Submitted Papers, and Accepted Papers at the Black in AI Workshop Co-located with NeurIPS, 2017–19. Source: Black in AI, 2020.

3 The survey was distributed to 73 universities online over three waves from November 2020 to January 2021 and completed by 18 universities, a 24.7% response rate.
The 18 universities are—Belgium: Katholieke Universiteit Leuven; Canada: McGill University; China: Shanghai Jiao Tong University, Tsinghua University; Germany: Ludwig Maximilian University of Munich, Technical University of Munich; Russia: Higher School of Economics, Moscow Institute of Physics and Technology; Switzerland: École Polytechnique Fédérale de Lausanne; United Kingdom: University of Cambridge; United States: California Institute of Technology, Carnegie Mellon University (Department of Machine Learning), Columbia University, Harvard University, Stanford University, University of Wisconsin–Madison, University of Texas at Austin, Yale University.

4 The 2020 data are clearly affected by the pandemic and are not included as a result. For more information, see the Black in AI impact report: https://blackinai2020.vercel.app/

6.3 GENDER IDENTITY AND SEXUAL ORIENTATION IN AI

QUEER IN AI

This section presents data from a membership survey by Queer in AI (QAI),5 an organization that aims to make the AI/ML community one that welcomes, supports, and values queer scientists. Founded in 2018 by William Agnew, Raphael Gontijo Lopes, and Eva Breznik, QAI builds a visible community of queer and ally AI/ML scientists through meetups, poster sessions, mentoring, and other initiatives.

Demographics Breakdown

According to the 2020 survey, which received around 100 responses, about 31.5% of respondents identify as gay, followed by bisexual, queer, and lesbian (Figure 6.3.1); around 37.0% and 26.1% of respondents identify as cis male and cis female, respectively, followed by gender queer, gender fluid, nonbinary, and others (Figure 6.3.2). Trans female and trans male respondents account for 5.0% and 2.5% of total members, respectively. Moreover, the past three years of surveys show that students make up the majority of QAI members, around 41.7% of all respondents on average (Figure 6.3.3), followed by junior-level professionals in academia or industry.

Figure 6.3.1: QAI Membership Survey: What Is Your Sexual Orientation?, 2020 (responses: gay, bisexual, queer, lesbian, straight, asexual, pansexual, others). Source: Queer in AI, 2020.
Figure 6.3.2: QAI Membership Survey: What Is Your Gender Identity?, 2020 (responses: cis male, cis female, gender queer, gender fluid, nonbinary and others, trans female). Source: Queer in AI, 2020.
Figure 6.3.3: QAI Membership Survey: How Would You Describe Your Position?, 2018–20 (responses: student, junior academic, junior industry, others). Source: Queer in AI, 2020.

5 Queer in AI presents the survey results at its workshop at the annual NeurIPS conference: https://sites.google.com/view/queer-in-ai/home?authuser=0
Experience as Queer Practitioners

QAI also surveyed its members on their experiences as queer AI/ML practitioners. As shown in Figure 6.3.4, 81.4% regard the lack of role models as a major obstacle to their careers, and 70.9% think the lack of community contributes to the same phenomenon. Almost half the respondents also view the lack of inclusiveness in the field as an obstacle. Moreover, more than 40% of QAI members have experienced discrimination or harassment as a queer person at work or school (Figure 6.3.5). Around 9.7% have encountered discrimination or harassment on more than five occasions.

Figure 6.3.4: QAI Membership Survey: What Are Obstacles You Have Faced in Becoming a Queer AI/ML Practitioner?, 2020 (responses: lack of role models, lack of community, lack of inclusiveness, lack of work/school support, economic hardship, harassment/discrimination). Source: Queer in AI, 2020.
Figure 6.3.5: QAI Membership Survey: Have You Experienced Discrimination/Harassment as a Queer Person at Your Job or School?, 2020 (responses: 0 times, 1 time, 2 times, 5+ times, others). Source: Queer in AI, 2020.

CHAPTER 7: AI POLICY AND NATIONAL STRATEGIES
Chapter Preview:
Overview
Chapter Highlights
7.1 National and Regional AI Strategies: Published Strategies (2017, 2018, 2019, 2020); Strategies in Development (as of December 2020); Highlight: National AI Strategies and Human Rights
7.2 International Collaboration on AI: Intergovernmental Initiatives; Summits and Meetings; Bilateral Agreements
7.3 U.S. Public Investment in AI: Federal Budget for Non-Defense AI R&D; U.S. Department of Defense Budget Request; U.S. Government Contract Spending
7.4 AI and Policymaking: Legislation Records on AI; Mentions of AI and ML in Congressional/Parliamentary Proceedings; Central Banks; U.S. AI Policy Papers

ACCESS THE PUBLIC DATA: https://drive.google.com/drive/folders/1B1bY5jgmloqRBTILkNg7Y2uw-77_004C?usp=sharing

OVERVIEW

AI is set to shape global competitiveness over the coming decades, promising to grant early adopters a significant economic and strategic advantage. To date, national governments and regional and intergovernmental organizations have raced to put in place AI-targeted policies to maximize the promise of the technology while also addressing its social and ethical implications. This chapter navigates the landscape of AI policymaking and tracks efforts at the local, national, and international levels to promote and govern AI technologies. It begins with an overview of national and regional AI strategies and then reviews activities at the intergovernmental level. The chapter then takes a closer look at public investment in AI in the United States, as well as at how legislative bodies, central banks, and nongovernmental organizations are responding to the growing need for a policy framework for AI technologies.

CHAPTER HIGHLIGHTS

• Since Canada published the world's first national AI strategy in 2017, more than 30 other countries and regions have published similar documents as of December 2020.
• The launch of the Global Partnership on AI (GPAI) and the Organisation for Economic Co-operation and Development (OECD) AI Policy Observatory and Network of Experts on AI in 2020 promoted intergovernmental efforts to support the development of AI for all.
• In the United States, the 116th Congress was the most AI-focused congressional session in history. The number of mentions of AI by this Congress in legislation, committee reports, and Congressional Research Service (CRS) reports is more than triple that of the 115th Congress.

7.1 NATIONAL AND REGIONAL AI STRATEGIES

To guide and foster the development of AI, countries and regions around the world are establishing strategies and initiatives to coordinate governmental and intergovernmental efforts. Since Canada published the world's first national AI strategy in 2017, more than 30 other countries and regions have published similar documents as of December 2020. This section presents an overview of select national and regional AI strategies, including details on the strategies of G20 countries, Estonia, and Singapore, as well as references to strategy documents for many others. Sources include the websites of national or regional governments, the OECD AI Policy Observatory (OECD.AI, https://oecd.ai/), and news coverage. An "AI strategy" is defined as a policy document that communicates the objective of supporting the development of AI while also maximizing the benefits of AI for society.
Excluded are broader innovation or digital strategy documents that do not focus predominantly on AI, such as Brazil's E-Digital Strategy and Japan's Integrated Innovation Strategy. Countries with published AI strategies: 32. Countries developing AI strategies: 22.

Published Strategies

2017

Canada
• AI Strategy: Pan-Canadian AI Strategy
• Responsible Organization: Canadian Institute for Advanced Research (CIFAR)
• Highlights: The Canadian strategy emphasizes developing Canada's future AI workforce, supporting major AI innovation hubs and scientific research, and positioning the country as a thought leader on the economic, ethical, policy, and legal implications of artificial intelligence.
• Funding (December 2020 conversion rate): CAD 125 million (USD 97 million)
• Recent Updates: In November 2020, CIFAR published its most recent annual report, "AICan," which tracks progress on implementing the national strategy. The report highlighted substantial growth in Canada's AI ecosystem, as well as research and activities related to health care and AI's impact on society, among other outcomes of the strategy.

China
• AI Strategy: A Next Generation Artificial Intelligence Development Plan
• Responsible Organization: State Council of the People's Republic of China
• Highlights: China's AI strategy is one of the most comprehensive in the world, encompassing R&D, talent development through education and skills acquisition, ethical norms, and implications for national security. It sets specific targets: bringing the AI industry in line with competitors by 2020; becoming the global leader in fields such as unmanned aerial vehicles (UAVs) and voice and image recognition by 2025; and emerging as the primary center for AI innovation by 2030.
• Funding: N/A
• Recent Updates: China established a New Generation AI Innovation and Development Zone in February 2019 and released the "Beijing AI Principles" in May 2019 with a multi-stakeholder coalition of academic institutions and private-sector players such as Tencent and Baidu.

Japan
• AI Strategy: Artificial Intelligence Technology Strategy
• Responsible Organization: Strategic Council for AI Technology
• Highlights: The strategy lays out three discrete phases of AI development: the first focuses on the utilization of data and AI in related service industries, the second on the public use of AI and the expansion of service industries, and the third on creating an overarching ecosystem in which the various domains are merged.
• Funding: N/A
• Recent Updates: In 2019, the Integrated Innovation Strategy Promotion Council launched another AI strategy, aimed at overcoming issues faced by Japan and making use of the country's strengths to open up future opportunities.
Others
Finland: Finland's Age of Artificial Intelligence
United Arab Emirates: UAE Strategy for Artificial Intelligence

2018

European Union
• AI Strategy: Coordinated Plan on Artificial Intelligence
• Responsible Organization: European Commission
• Highlights: This strategy document outlines the commitments and actions agreed on by EU member states, Norway, and Switzerland to increase investment and build their AI talent pipeline. It emphasizes the value of public-private partnerships, creating European data spaces, and developing ethics principles.
• Funding (December 2020 conversion rate): At least EUR 1 billion (USD 1.1 billion) per year for AI research and at least EUR 4.9 billion (USD 5.4 billion) for other aspects of the strategy
• Recent Updates: A first draft of the ethics guidelines was released in June 2018, followed by an updated version in April 2019.

France
• AI Strategy: AI for Humanity: French Strategy for Artificial Intelligence
• Responsible Organizations: Ministry for Higher Education, Research and Innovation; Ministry of Economy and Finance; Directorate General for Enterprises; Public Health Ministry; Ministry of the Armed Forces; National Research Institute for Digital Sciences; Interministerial Directorate of Digital Technology and the Information and Communication System
• Highlights: The main themes include developing an aggressive policy for big data; targeting four strategic sectors, namely health care, environment, transport, and defense; boosting French efforts in research and development; planning for the impact of AI on the workforce; and ensuring inclusivity and diversity within the field.
• Funding (December 2020 conversion rate): EUR 1.5 billion (USD 1.8 billion) through 2022
• Recent Updates: The French National Research Institute for Digital Sciences (Inria) has committed to playing a central role in coordinating the national AI strategy and will report annually on its progress.
Germany
• AI Strategy: AI Made in Germany
• Responsible Organizations: Federal Ministry of Education and Research; Federal Ministry for Economic Affairs and Energy; Federal Ministry of Labour and Social Affairs
• Highlights: The strategy focuses on cementing Germany's position as a research powerhouse and strengthening the value of its industries. There is also an emphasis on the public interest and on working to better the lives of people and the environment.
• Funding (December 2020 conversion rate): EUR 500 million (USD 608 million) in the 2019 budget and EUR 3 billion (USD 3.6 billion) for implementation through 2025
• Recent Updates: In November 2019, the government published an interim progress report on the German AI strategy.

India
• AI Strategy: National Strategy on Artificial Intelligence: #AIforAll
• Responsible Organization: National Institution for Transforming India (NITI Aayog)
• Highlights: The Indian strategy focuses on both economic growth and ways to leverage AI to increase social inclusion, while also promoting research to address important issues such as ethics, bias, and privacy related to AI. It emphasizes sectors such as agriculture, health, and education, where public investment and government initiative are necessary.
• Funding (December 2020 conversion rate): INR 7,000 crore (USD 949 million)
• Recent Updates: In 2019, the Ministry of Electronics and Information Technology released its own proposal to set up a national AI program with an allocated INR 400 crore (USD 54 million). The Indian government also formed a committee in late 2019 to push for an organized AI policy and to establish the precise functions of government agencies in furthering India's AI mission.

Mexico
• AI Strategy: Artificial Intelligence Agenda MX (2019 agenda-in-brief version)
• Responsible Organizations: IA2030Mx; Ministry of Economy (Economía)
• Highlights: As Latin America's first strategy, the Mexican strategy focuses on developing a strong governance framework, mapping the needs of AI in various industries, and identifying governmental best practices, with an emphasis on developing Mexico's AI leadership.
• Funding: N/A
• Recent Updates: According to the Inter-American Development Bank's recent fAIr LAC report, Mexico is in the process of establishing concrete AI policies to further implementation.
United Kingdom
• AI Strategy: Industrial Strategy: Artificial Intelligence Sector Deal
• Responsible Organization: Office for Artificial Intelligence (OAI)
• Highlights: The U.K. strategy emphasizes a strong partnership among business, academia, and government and identifies five foundations for a successful industrial strategy: becoming the world's most innovative economy, creating jobs and better earnings potential, upgrading infrastructure, fostering favorable business conditions, and building prosperous communities throughout the country.
• Funding (December 2020 conversion rate): GBP 950 million (USD 1.3 billion)
• Recent Updates: Between 2017 and 2019, the U.K.'s Select Committee on AI released an annual report on the country's progress. In November 2020, the government announced a major increase in defense spending of GBP 16.5 billion (USD 21.8 billion) over four years, with a major emphasis on AI technologies that promise to revolutionize warfare.

Others
Sweden: National Approach to Artificial Intelligence
Taiwan: Taiwan AI Action Plan

2019

Estonia
• AI Strategy: National AI Strategy 2019–2021
• Responsible Organization: Ministry of Economic Affairs and Communications (MKM)
• Highlights: The strategy lays out actions for both the public and private sectors to increase investment in AI research and development, while also improving the legal environment for AI in Estonia. In addition, it hammers out the framework for a steering committee to oversee the implementation and monitoring of the strategy.
• Funding (December 2020 conversion rate): EUR 10 million (USD 12 million) through 2021
• Recent Updates: The Estonian government released an update on the AI task force in May 2019.

Russia
• AI Strategy: National Strategy for the Development of Artificial Intelligence
• Responsible Organizations: Ministry of Digital Development, Communications and Mass Media; Government of the Russian Federation
• Highlights: The Russian AI strategy places a strong emphasis on national interests and lays down guidelines for the development of an "information society" between 2017 and 2030. These include a national technology initiative, departmental projects for federal executive bodies, and programs such as the Digital Economy of the Russian Federation, designed to implement the AI framework across sectors.
• Funding: N/A
• Recent Updates: In December 2020, Russian president Vladimir Putin took part in the Artificial Intelligence Journey Conference, where he presented four ideas for AI policy: establishing experimental legal frameworks for the use of AI, developing practical measures to introduce AI algorithms, providing neural network developers with competitive access to big data, and boosting private investment in domestic AI industries.
Singapore
• AI Strategy: National Artificial Intelligence Strategy
• Responsible Organization: Smart Nation and Digital Government Office (SNDGO)
• Highlights: Launched by Smart Nation Singapore, a government agency that seeks to transform Singapore's economy and usher in a new digital age, the strategy identifies five national AI projects in the following fields: transport and logistics, smart cities and estates, health care, education, and safety and security.
• Funding (December 2020 conversion rate): While the 2019 strategy does not mention funding, in 2017 the government launched its national program, AI Singapore, with a pledge to invest SGD 150 million (USD 113 million) over five years.
• Recent Updates: In November 2020, SNDGO published its inaugural annual update on the Singaporean government's data protection efforts, describing the measures taken to date to strengthen public sector data security and safeguard citizens' private data.

United States
• AI Strategy: American AI Initiative
• Responsible Organization: The White House
• Highlights: The American AI Initiative prioritizes the need for the federal government to invest in AI R&D, reduce barriers to federal resources, and ensure technical standards for the safe development, testing, and deployment of AI technologies. The White House also emphasizes developing an AI-ready workforce and signals a commitment to collaborating with foreign partners while promoting U.S. leadership in AI. The initiative, however, lacks specifics on the program's timeline, on whether additional resources will be dedicated to AI development, and on other practical considerations.
• Funding: N/A
• Recent Updates: The U.S. government released its year-one annual report in February 2020, followed in November by the first guidance memorandum for federal agencies on regulating artificial intelligence applications in the private sector, including principles that encourage AI innovation and growth and increase public trust and confidence in AI technologies. The National Defense Authorization Act (NDAA) for Fiscal Year 2021 called for a National AI Initiative to coordinate AI research and policy across the federal government.

South Korea
• AI Strategy: National Strategy for Artificial Intelligence
• Responsible Organization: Ministry of Science, ICT and Future Planning (MSIP)
• Highlights: The Korean strategy calls for plans to facilitate the use of AI by businesses and to streamline regulations to create a more favorable environment for the development and use of AI and other new industries. The Korean government also plans to leverage its dominance in the global supply of memory chips to build the next generation of smart chips by 2030.
• Funding (December 2020 conversion rate): KRW 2.2 trillion (USD 2 billion)
• Recent Updates: N/A

Others
Colombia: National Policy for Digital Transformation and Artificial Intelligence
Czech Republic: National Artificial Intelligence Strategy of the Czech Republic
Lithuania: Lithuanian Artificial Intelligence Strategy: A Vision for the Future
Luxembourg: Artificial Intelligence: A Strategic Vision for Luxembourg
Malta: Malta: The Ultimate AI Launchpad
Netherlands: Strategic Action Plan for Artificial Intelligence
Portugal: AI Portugal 2030
Qatar: National Artificial Intelligence for Qatar

2020

Indonesia
• AI Strategy: National Strategy for the Development of Artificial Intelligence (Stranas KA)
• Responsible Organizations: Ministry of Research and Technology (Menristek); National Research and Innovation Agency (BRIN); Agency for the Assessment and Application of Technology (BPPT)
• Highlights: The Indonesian strategy aims to guide the country's AI development between 2020 and 2045. It focuses on education and research, health services, food security, mobility, smart cities, and public sector reform.
• Funding: N/A
• Recent Updates: None
Saudi Arabia
• AI Strategy: National Strategy on Data and AI (NSDAI)
• Responsible Organization: Saudi Data and Artificial Intelligence Authority (SDAIA)
• Highlights: As part of an effort to diversify the country's economy away from oil and boost the private sector, the NSDAI aims to accelerate AI development in five critical sectors: health care, mobility, education, government, and energy. By 2030, Saudi Arabia intends to train 20,000 data and AI specialists, attract USD 20 billion in foreign and local investment, and create an environment that will attract at least 300 AI and data startups.
• Funding: N/A
• Recent Updates: At the summit where the Saudi government released its strategy, the country's National Center for Artificial Intelligence (NCAI) signed collaboration agreements with China's Huawei and Alibaba Cloud to design AI-related Arabic-language systems.

Others
Hungary: Hungary's Artificial Intelligence Strategy
Norway: National Strategy for Artificial Intelligence
Serbia: Strategy for the Development of Artificial Intelligence in the Republic of Serbia for the Period 2020–2025
Spain: National Artificial Intelligence Strategy

Strategies in Development (as of December 2020)

Strategies in Public Consultation

Brazil
• AI Strategy Draft: Brazilian Artificial Intelligence Strategy
• Responsible Organization: Ministry of Science, Technology and Innovation (MCTI)
• Highlights: Brazil's national AI strategy was announced in 2019 and is currently in the public consultation stage. According to the OECD, the strategy aims to cover relevant topics bearing on AI, including its impact on the economy, ethics, development, education, and jobs, and to coordinate specific public policies addressing such issues.
• Funding: N/A
• Recent Updates: In October 2020, the country's largest research facility dedicated to AI was launched in collaboration with IBM, the University of São Paulo, and the São Paulo Research Foundation.

Italy
• AI Strategy Draft: Proposal for an Italian Strategy for Artificial Intelligence
• Responsible Organization: Ministry of Economic Development (MISE)
• Highlights: This document sets out a proposed strategy for the sustainable development of AI, aimed at improving Italy's competitiveness in the field. It focuses on improving AI-based skills and competencies, fostering AI research, establishing a regulatory and ethical framework to ensure a sustainable AI ecosystem, and developing a robust data infrastructure to fuel these developments.
• Funding (December 2020 conversion rate): EUR 1 billion (USD 1.1 billion) through 2025, with expected matching funds from the private sector bringing the total investment to EUR 2 billion.
• Recent Updates: None

Others
Cyprus: National Strategy for Artificial Intelligence
Ireland: National Irish Strategy on Artificial Intelligence
Poland: Artificial Intelligence Development Policy in Poland
Uruguay: Artificial Intelligence Strategy for Digital Government

Strategies Announced

Argentina
• Related Document: N/A
• Responsible Organization: Ministry of Science, Technology and Productive Innovation (MINCYT)
• Status: Argentina’s AI plan is part of the Argentine Digital Agenda 2030 but has not yet been published. It is intended to cover the decade between 2020 and 2030, and reports indicate that it could deliver substantial benefits for the agricultural sector.

Australia
• Related Documents: Artificial Intelligence Roadmap / An AI Action Plan for all Australians
• Responsible Organizations: Commonwealth Scientific and Industrial Research Organisation (CSIRO), Data 61, and the Australian government
• Status: The Australian government published a road map in 2019 (in collaboration with the national science agency, CSIRO) and a discussion paper on an AI action plan in 2020 as frameworks for developing a national AI strategy. In its 2018–19 budget, the Australian government earmarked AUD 29.9 million (USD 22.2 million [December 2020 conversion rate]) over four years to strengthen the country’s capabilities in AI and machine learning (ML). In addition, CSIRO published a research paper on Australia’s AI Ethics Framework in 2019 and launched a public consultation, which is expected to produce a forthcoming strategy document.

Turkey
• Related Document: N/A
• Responsible Organizations: Presidency of the Republic of Turkey Digital Transformation Office; Ministry of Industry and Technology; Scientific and Technological Research Council of Turkey; Science, Technology and Innovation Policies Council
• Status: The strategy has been announced but not yet published. According to media sources, it will focus on talent development, scientific research, ethics and inclusion, and digital infrastructure.
Others
Austria: Artificial Intelligence Mission Austria (official report)
Bulgaria: Concept for the Development of Artificial Intelligence in Bulgaria Until 2030 (concept document)
Chile: National AI Policy (official announcement)
Israel: National AI Plan (news article)
Kenya: Blockchain and Artificial Intelligence Taskforce (news article)
Latvia: On the Development of Artificial Intelligence Solutions (official report)
Malaysia: National Artificial Intelligence (AI) Framework (news article)
New Zealand: Artificial Intelligence: Shaping a Future New Zealand (official report)
Sri Lanka: Framework for Artificial Intelligence (news article)
Switzerland: Artificial Intelligence (official guidelines)
Tunisia: National Artificial Intelligence Strategy (task force announced)
Ukraine: Concept of Artificial Intelligence Development in Ukraine (concept document)
Vietnam: Artificial Intelligence Development Strategy (official announcement)

Read more on AI national strategies:
• Tim Dutton: An Overview of National AI Strategies
• Organisation for Economic Co-operation and Development: OECD AI Policy Observatory
• Canadian Institute for Advanced Research: Building an AI World, Second Edition
• Inter-American Development Bank: Artificial Intelligence for Social Good in Latin
America and the Caribbean: The Regional Landscape and 12 Country Snapshots

National AI Strategies and Human Rights
In 2020, Global Partners Digital and Stanford’s Global Digital Policy Incubator published a report examining governments’ national AI strategies from a human rights perspective, titled “National Artificial Intelligence Strategies and Human Rights: A Review.” The report assesses the extent to which governments and regional organizations have incorporated human rights considerations into their national AI strategies, and it makes recommendations to policymakers looking to develop or review AI strategies in the future. The report found that among the strategies of 30 states and two regional strategies (from the European Union and the Nordic-Baltic states), a number refer to the impact of AI on human rights, with the right to privacy the most commonly mentioned, followed by equality and nondiscrimination (Table 7.1.1). However, very few strategy documents provide deep analysis or concrete assessment of the impact of AI applications on human rights. Specifics as to how, and to what depth, human rights should be protected in the context of AI are largely missing, in contrast to the level of specificity on other issues such as economic competitiveness and innovation advantage.

Table 7.1.1: Mapping human rights referenced in national AI strategies
• The right to privacy: Australia, Belgium, China, Czech Republic, Germany, India, Italy, Luxembourg, Malta, Netherlands, Norway, Portugal, Qatar, South Korea, United States
• The right to equality/nondiscrimination: Australia, Belgium, Czech Republic, Denmark, Estonia, EU, France, Germany, Italy, Malta, Netherlands, Norway
• The right to an effective remedy: Australia (responsibility and ability to hold humans responsible), Denmark, Malta, Netherlands
• The rights to freedom of thought, expression, and access to information: France, Netherlands, Russia
• The right to work: France, Russia

7.2 INTERNATIONAL COLLABORATION ON AI

Given the scale of the opportunities and the challenges presented by AI, a number of international efforts have recently been announced that aim to develop multilateral AI strategies. This section provides an overview of those international initiatives from governments committed to working together to support the development of AI for all.
These multilateral initiatives on AI suggest that organizations are taking a variety of approaches to tackling the practical applications of AI and scaling those solutions for maximum global impact. Many countries turn to international organizations for global AI norm formulation, while others engage in partnerships or bilateral agreements. Among the topics under discussion, the ethics of AI, meaning the ethical challenges raised by current and future applications of AI, stands out as a particular focus area for intergovernmental efforts. Countries such as Japan, South Korea, the United Kingdom, the United States, and members of the European Union are active participants in intergovernmental efforts on AI. China, a major AI powerhouse, has instead opted to engage in a number of science and technology bilateral agreements that stress cooperation on AI as part of the Digital Silk Road under the Belt and Road Initiative (BRI) framework. For example, AI is mentioned in China’s economic cooperation with the United Arab Emirates under the BRI.

INTERGOVERNMENTAL INITIATIVES
Intergovernmental working groups consist of experts and policymakers from member states who study and report on the most urgent challenges related to developing and deploying AI and then make recommendations based on their findings. These groups are instrumental in identifying and developing strategies for the most pressing issues in AI technologies and their applications.

Working Groups

Global Partnership on AI (GPAI)
• Participants: Australia, Brazil, Canada, France, Germany, India, Italy, Japan, Mexico, the Netherlands, New Zealand, South Korea, Poland, Singapore, Slovenia, Spain, the United Kingdom, the United States, and the European Union (as of December 2020)
• Host of Secretariat: OECD
• Focus Areas: Responsible AI; data governance; the future of work; innovation and commercialization
• Recent Activities: Two International Centres of Expertise (the International Centre of Expertise in Montreal for the Advancement of Artificial Intelligence and the French National Institute for Research in Digital Science and Technology (INRIA) in Paris) are supporting the work in the four focus areas and held the Montreal Summit 2020 in December 2020. Moreover, the data governance working group published the beta version of the group’s framework in November 2020.

OECD Network of Experts on AI (ONE AI)
• Participants: OECD countries
• Host: OECD
• Focus Areas: Classification of AI; implementing trustworthy AI; policies for AI; AI compute
• Recent Activities: ONE AI convened its first meeting in February 2020, when it also launched the OECD AI Policy Observatory. In November 2020, the working group on the classification of AI presented a first look at an AI classification framework based on the OECD’s definition of AI, divided into four dimensions (context, data and input, AI model, task and output), that aims to guide policymakers in designing adequate policies for each type of AI system.

High-Level Expert Group on Artificial Intelligence (HLEG)
• Participants: EU countries
• Host: European Commission
• Focus Areas: Ethics guidelines for trustworthy AI
• Recent Activities: Since its launch at the recommendation
of the EU AI strategy in 2018, HLEG has presented the EU Ethics Guidelines for Trustworthy Artificial Intelligence and a series of policy and investment recommendations, as well as an assessment checklist related to the guidelines.

Ad Hoc Expert Group (AHEG) for the Recommendation on the Ethics of Artificial Intelligence
• Participants: United Nations Educational, Scientific and Cultural Organization (UNESCO) member states
• Host: UNESCO
• Focus Areas: Ethical issues raised by the development and use of AI
• Recent Activities: The AHEG produced a revised first draft Recommendation on the Ethics of Artificial Intelligence, which was transmitted in September 2020 to UNESCO member states for their comments by December 31, 2020.

Summits and Meetings

AI for Good Global Summit
• Participants: Global (with the United Nations and its agencies)
• Hosts: International Telecommunication Union, XPRIZE Foundation
• Focus Areas: Trusted, safe, and inclusive development of AI technologies and equitable access to their benefits

AI Partnership for Defense
• Participants: Australia, Canada, Denmark, Estonia, Finland, France, Israel, Japan, Norway, South Korea, Sweden, the United Kingdom, and the United States
• Hosts: Joint Artificial Intelligence Center, U.S. Department of Defense
• Focus Areas: AI ethical principles for defense

China-Association of Southeast Asian Nations (ASEAN) AI Summit
• Participants: Brunei, Cambodia, China, Indonesia, Laos, Malaysia, Myanmar, the Philippines, Singapore, Thailand, and Vietnam
• Hosts: China Association for Science and Technology; Guangxi Zhuang Autonomous Region, China
• Focus Areas: Infrastructure construction, digital economy, and innovation-driven development

BILATERAL AGREEMENTS
Bilateral agreements focusing on AI are another form of international collaboration that has been gaining in popularity in recent years. AI is usually included in the broader context of collaboration on the development of digital economies, though India stands apart for investing in multiple bilateral agreements specifically geared toward AI.

India and United Arab Emirates
Invest India and the UAE Ministry of Artificial Intelligence signed a memorandum of understanding in July 2018 to collaborate on fostering innovative AI ecosystems and other policy concerns related to AI. The two countries will convene a working committee aimed at increasing investment in AI startups and research activities in partnership with the private sector.

India and Germany
It was reported in October 2019 that India and Germany will likely sign an agreement including partnerships on the use of artificial intelligence (especially in farming).

United States and United Kingdom
The U.S. and the U.K.
announced a declaration in September 2020, through the Special Relationship Economic Working Group, that the two countries will enter into a bilateral dialogue on advancing AI in line with shared democratic values and further cooperation in AI R&D efforts.

India and Japan
India and Japan were reported to have finalized an agreement in October 2020 that focuses on collaboration on digital technologies, including 5G and AI.

France and Germany
France and Germany signed a road map for a Franco-German Research and Innovation Network on artificial intelligence as part of the Declaration of Toulouse in October 2019 to advance European efforts in the development and application of AI, taking into account ethical guidelines.

7.3 U.S. PUBLIC INVESTMENT IN AI

This section examines public investment in AI in the United States based on data from the U.S. Networking and Information Technology Research and Development (NITRD) program and Bloomberg Government.

FEDERAL BUDGET FOR NON-DEFENSE AI R&D
In September 2019, the White House National Science and Technology Council released a report attempting to total up all public-sector AI R&D funding, the first time such a figure had been published. This funding is disbursed as grants to government laboratories and research universities or in the form of government contracts. These federal budget figures, however, do not include substantial AI R&D investments by the Department of Defense (DOD) and the intelligence sector, as they were withheld from publication for national security reasons. As shown in Figure 7.3.1, federal civilian agencies (those agencies that are not part of the DOD or the intelligence sector) allocated USD 973.5 million to AI R&D for FY 2020, a figure that rose to USD 1.1 billion once congressional appropriations and transfers were factored in. For FY 2021, federal civilian agencies budgeted USD 1.5 billion, which is almost 55% higher than their FY 2020 request.

[Figure 7.3.1: U.S. federal budget for non-defense AI R&D, FY 2020–21 (2020 request, 2020 enacted, 2021 request; in millions of U.S. dollars). Source: U.S. NITRD Program, 2020 | Chart: 2021 AI Index Report]
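As a quick sanity check on that growth figure, the percentage increase implied by the two requests can be computed directly from the dollar amounts quoted above; the snippet below is only a back-of-the-envelope verification, not part of the NITRD methodology.

```python
# Back-of-the-envelope check on the growth of the non-defense AI R&D budget.
# Dollar figures are taken from the prose above (U.S. NITRD Program data).
request_fy2020 = 973.5e6   # USD, FY 2020 request by federal civilian agencies
request_fy2021 = 1.5e9     # USD, FY 2021 request

increase = request_fy2021 / request_fy2020 - 1
print(f"FY 2021 request is {increase:.1%} higher than the FY 2020 request")
# -> roughly 54%, in line with the report's "almost 55%"
```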
U.S. DEPARTMENT OF DEFENSE AI R&D BUDGET REQUEST
While the official DOD budget is not publicly available, Bloomberg Government has analyzed the department’s publicly available budget request for research, development, test, and evaluation (RDT&E), data that sheds light on its spending on AI R&D. With 305 unclassified DOD R&D programs specifying the use of AI or ML technologies, the combined U.S. military budget for AI R&D in FY 2021 is USD 5.0 billion (Figure 7.3.2). This figure appears consistent with the USD 5.0 billion enacted the previous year. However, the FY 2021 figure reflects a budget request rather than a final enacted budget. As noted above, once congressional appropriations are factored in, the true level of funding available to DOD AI R&D programs in FY 2021 may rise substantially.

The top five projects set to receive the highest amount of AI R&D investment in FY 2021 are:
• Rapid Capability Development and Maturation, by the U.S. Army (USD 284.2 million)
• Counter WMD Technologies and Capabilities Development, by the DOD Threat Reduction Agency (USD 265.2 million)
• Algorithmic Warfare Cross-Functional Team (Project Maven), by the Office of the Secretary of Defense (USD 250.1 million)
• Joint Artificial Intelligence Center (JAIC), by the Defense Information Systems Agency (USD 132.1 million)
• High Performance Computing Modernization Program, by the U.S. Army (USD 99.6 million)

In addition, the Defense Advanced Research Projects Agency (DARPA) alone is investing USD 568.4 million in AI R&D, an increase of USD 82 million from FY 2020.

[Figure 7.3.2: U.S. DOD budget for AI-specific research, development, test, and evaluation (RDT&E), FY 2018–21 (FY 2018–20 enacted; FY 2021 request; in millions of U.S. dollars), alongside the DOD’s own reported AI R&D budget (labeled values: USD 927 million and USD 841 million). Sources: Bloomberg Government & U.S. Department of Defense, 2020 | Chart: 2021 AI Index Report]

Important data caveat: This chart illustrates the challenge of working with contemporary government data sources to understand spending on AI. By one measure (requests that include AI-relevant keywords), the DOD is requesting more than USD 5 billion for AI-specific research, development, test, and evaluation in 2021. However, DOD’s own accounting produces a radically smaller number: USD 841 million.
This relates to the issue of defining where an AI system ends and another system begins: for instance, an initiative that uses AI for drones may also count hardware-related expenditures for the drones within its “AI” budget request, even though the AI software component is much smaller.

U.S. GOVERNMENT AI-RELATED CONTRACT SPENDING
Another indicator of public investment in AI technologies is the level of spending on government contracts across the federal government. Contracting for products and services supplied by private businesses typically occupies the largest share of an agency’s budget. Bloomberg Government built a model that captures contract spending on AI technologies by adding up all contracting transactions that contain a set of more than 100 AI-specific keywords in their titles or descriptions. The data reveals that the amount the federal government spends on contracts for AI products and services has reached an all-time high and shows no sign of slowing down. Note, however, that vendors may pad their applications with AI-related keywords during the procurement process, so some of these contracts may have a relatively small AI component compared with the rest of the technology being procured.

Total Contract Spending
Federal departments and agencies spent a combined USD 1.8 billion on unclassified AI-related contracts in FY 2020. This represents a more than 25% increase from the USD 1.5 billion agencies spent in FY 2019 (Figure 7.3.3). AI spending in 2020 was more than six times higher than it was just five years ago (about USD 300 million in FY 2015). To put this in perspective, however, the federal government spent USD 682 billion on contracts in FY 2020, so AI currently represents roughly 0.25% of government contract spending.

[Figure 7.3.3: U.S. government total contract spending on AI, FY 2001–20, reaching USD 1,837 million in FY 2020. Source: Bloomberg Government, 2020 | Chart: 2021 AI Index Report]

Contract Spending by Department and Agency
Figure 7.3.4 shows that in FY 2020, the DOD spent more on AI-related contracts than any other federal department or agency (USD 1.4 billion). In second and third place are NASA (USD 139.1 million) and the Department of Homeland Security (USD 112.3 million). DOD, NASA, and the Department of Health and Human Services top the list for the most contract spending on AI over the past 10 years combined (Figure 7.3.5). In fact, DOD’s total contract spending on AI from 2001 to 2020 (USD 3.9 billion) is more than what was spent by the other 44 departments and agencies combined (USD 2.9 billion) over the same period.
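The keyword-matching approach behind the Bloomberg Government model described above can be illustrated with a minimal sketch. The keyword list, record fields, and sample transactions below are hypothetical stand-ins; the actual model uses more than 100 AI-specific keywords and proprietary contract data.

```python
# Minimal sketch of keyword-based tallying of AI-related contract spending.
# The keywords, field names, and records below are illustrative only.
AI_KEYWORDS = {"artificial intelligence", "machine learning", "neural network",
               "computer vision", "natural language processing"}

def is_ai_contract(title: str, description: str) -> bool:
    """Flag a transaction if any AI keyword appears in its title or description."""
    text = f"{title} {description}".lower()
    return any(kw in text for kw in AI_KEYWORDS)

def ai_spend_by_year(transactions):
    """Sum obligations of keyword-matched transactions per fiscal year."""
    totals = {}
    for t in transactions:
        if is_ai_contract(t["title"], t["description"]):
            totals[t["fiscal_year"]] = totals.get(t["fiscal_year"], 0.0) + t["amount"]
    return totals

# Hypothetical records; note the second one matches on a keyword alone even
# though most of its cost may be hardware -- the caveat discussed in the text.
sample = [
    {"fiscal_year": 2020, "amount": 2.5e6, "title": "Analytics platform",
     "description": "natural language processing for case triage"},
    {"fiscal_year": 2020, "amount": 40e6, "title": "Drone fleet procurement",
     "description": "airframes with onboard computer vision module"},
]
print(ai_spend_by_year(sample))  # {2020: 42500000.0}
```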
Looking ahead, DOD spending on AI contracts is only expected to grow, as the Pentagon’s Joint Artificial Intelligence Center (JAIC), established in June 2018, is still in the early stages of driving DOD’s AI spending. In 2020, JAIC awarded two massive contracts: one to Booz Allen Hamilton for the five-year, USD 800 million Joint Warfighter program, and another to Deloitte Consulting for a four-year, USD 106 million enterprise cloud environment for the JAIC, known as the Joint Common Foundation.

[Figure 7.3.4: Top 10 contract spending on AI by U.S. government department and agency, 2020 (in millions of U.S. dollars). Source: Bloomberg Government, 2020 | Chart: 2021 AI Index Report]

[Figure 7.3.5: Top 10 contract spending on AI by U.S. government department and agency, 2001–20 (sum, in millions of U.S. dollars). Source: Bloomberg Government, 2020 | Chart: 2021 AI Index Report]

7.4 AI AND POLICYMAKING

As AI gains attention and importance, policies and initiatives related to the technology are becoming higher priorities for governments, private companies, technical organizations, and civil society. This section examines how three of these four are setting the agenda for AI policymaking, including the legislative and monetary authority of national governments, as well as think tanks, civil society, and the technology and consultancy industry.

LEGISLATION RECORDS ON AI
The number of congressional and parliamentary records on AI is an indicator of governmental interest in developing AI capabilities and in legislating issues pertaining to AI. In this section, we use data from Bloomberg and McKinsey & Company to ascertain the number of these records and how that number has evolved over the last 10 years. Bloomberg Government identified all legislation (passed or introduced), reports published by congressional committees, and CRS reports that referenced one or more AI-specific keywords. McKinsey & Company searched for the terms “artificial intelligence” and “machine learning” on the websites of the U.S. Congressional Record, the U.K. Parliament, and the Parliament of Canada.

[Figure 7.4.1: Mentions of AI in the U.S. Congressional Record by legislative session, 2001–20, broken out by Congressional Research Service reports, committee reports, and legislation. Source: Bloomberg Government, 2020 | Chart: 2021 AI Index Report]
For the United States, each count indicates that AI or ML was mentioned during a particular event contained in the Congressional Record, including the reading of a bill; for the U.K. and Canada, each count indicates that AI or ML was mentioned in a particular comment or remark during the proceedings.1

1 If a speaker or member mentioned artificial intelligence (AI) or machine learning (ML) multiple times within remarks, or multiple speakers mentioned AI or ML within the same event, it is counted only once. Counts for AI and ML are separate, as they were conducted in separate searches. Mentions of the abbreviations “AI” or “ML” are not included.

U.S. Congressional Record
The 116th Congress (January 1, 2019–January 3, 2021) was the most AI-focused congressional session in history. The number of mentions of AI by this Congress in legislation, committee reports, and CRS reports is more than triple that of the 115th Congress, and congressional interest in AI continued to accelerate in 2020. Figure 7.4.1 shows that during this congressional session, 173 distinct pieces of legislation either focused on or contained language about AI technologies, their development, use, and the rules governing them. During that two-year period, various House and Senate committees and subcommittees commissioned 70 reports on AI, while the CRS, tasked as a fact-finding body for members of Congress, published 243 reports about or referencing AI.

Mentions of AI and ML in Congressional/Parliamentary Proceedings
As shown in Figures 7.4.2–7.4.4, the number of mentions of artificial intelligence and machine learning in the proceedings of the U.S. Congress and the U.K. Parliament continued to rise in 2020, while there were fewer mentions in the parliamentary proceedings of Canada.

[Figure 7.4.2: Mentions of AI and ML in the proceedings of the U.S. Congress, 2011–20. Sources: U.S. Congressional Record website, the McKinsey Global Institute, 2020 | Chart: 2021 AI Index Report]

[Figure 7.4.3: Mentions of AI and ML in the proceedings of the U.K. Parliament, 2011–20. Sources: Parliament of U.K. website, the McKinsey Global Institute, 2020 | Chart: 2021 AI Index Report]

[Figure 7.4.4: Mentions of AI and ML in the proceedings of the Canadian Parliament, 2011–20. Sources: Canadian Parliament website, the McKinsey Global Institute, 2020 | Chart: 2021 AI Index Report]
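The per-event counting rule described above has a simple mechanical form; the sketch below, under the assumption of a flat list of event records, mirrors the stated rules: each event counts at most once per term, AI and ML are tallied in separate searches, and bare abbreviations are ignored.

```python
# Sketch of the McKinsey-style mention count: one count per event per term,
# full phrases only (the abbreviations "AI" and "ML" are excluded).
# The event record structure is a hypothetical stand-in.
TERMS = ("artificial intelligence", "machine learning")

def count_mentions(events):
    """events: list of dicts with 'year' and 'text'. Returns {(year, term): count}."""
    counts = {}
    for event in events:
        text = event["text"].lower()
        for term in TERMS:
            if term in text:  # multiple mentions within one event count once
                key = (event["year"], term)
                counts[key] = counts.get(key, 0) + 1
    return counts

events = [
    {"year": 2020, "text": "Artificial intelligence, and again artificial intelligence."},
    {"year": 2020, "text": "We discussed AI at length."},  # abbreviation only: not counted
]
print(count_mentions(events))
# {(2020, 'artificial intelligence'): 1}
```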
CENTRAL BANKS
Central banks play a key role in conducting currency and monetary policy in a country or a monetary union. Like many other institutions, central banks are tasked with integrating AI into their operations and relying on big data analytics to assist them with forecasting, risk management, and financial supervision. Prattle, a provider of automated investment research solutions, monitors mentions of AI in the communications of central banks, including meeting minutes, monetary policy papers, press releases, speeches, and other official publications. Figure 7.4.5 shows a significant increase in mentions of AI across 16 central banks over the past 10 years, with the number reaching a peak of 1,020 in 2019. The sharp decline in 2020 can be explained by the COVID-19 pandemic, as most central bank communications focused on responses to the economic downturn. Moreover, the Federal Reserve in the United States, Norges Bank in Norway, and the European Central Bank top the list for the highest aggregate number of AI mentions in communications over the past five years (Figure 7.4.6).

[Figure 7.4.5: Mentions of AI in central bank communications around the world, 2011–20. Source: Prattle/LiquidNet, 2020 | Chart: 2021 AI Index Report]

[Figure 7.4.6: Mentions of AI in central bank communications around the world by bank, 2016–20 (sum). Source: Prattle/LiquidNet, 2020 | Chart: 2021 AI Index Report]

U.S. AI POLICY PAPERS
What are the AI policy initiatives outside national governments and intergovernmental bodies? We monitored 42 prominent organizations that deliver policy papers on topics related to AI and assessed the primary and secondary topics of policy papers published in 2019 and 2020. (See the Appendix for a complete list of organizations included.)
Those organizations are either U.S.-based or have a sizable presence in the United States, and we grouped them into three categories: think tanks, policy institutes, and academia (27); civil society organizations, associations, and consortiums (9); and industry and consultancy (6). AI policy papers are defined as research papers, research reports, blog posts, and briefs that focus on a specific policy issue related to AI and provide clear recommendations for policymakers. A primary topic is the main focus of a policy paper, while a secondary topic is one the paper either briefly touches on or treats as a sub-focus. Combined data for 2019 and 2020 suggest that innovation and technology, international affairs and international security, and industry and regulation are the main focuses of AI policy papers in the United States (Figure 7.4.7). Fewer documents placed a primary focus on topics related to AI ethics (such as ethics; equity and inclusion; privacy, safety, and security; and justice and law enforcement), which have largely been secondary topics. Moreover, topics bearing on the physical sciences, energy and environment, humanities, and democracy have received the least attention in U.S. AI policy papers.

[Figure 7.4.7: U.S. AI policy products by topic, 2019–20 (sum), split into primary and secondary topics. Source: Stanford HAI & AI Index, 2020 | Chart: 2021 AI Index Report]

Appendix

CHAPTER 1: Research and Development
CHAPTER 2: Technical Performance
CHAPTER 3: The Economy
CHAPTER 4: AI Education
CHAPTER 5: Ethical Challenges of AI Applications
CHAPTER 6: Diversity in AI
CHAPTER 7: AI Policy and National Strategies
GLOBAL AI VIBRANCY

CHAPTER 1: RESEARCH & DEVELOPMENT

ELSEVIER
Prepared by Jörg Hellwig and Thomas A. Collins

Source
Elsevier’s Scopus database of scholarly publications has indexed more than 81 million peer-reviewed documents. This data was compiled by Elsevier.

Methodology
Scopus tags its papers with keywords, publication dates, country affiliations, and other bibliographic information. The Elsevier AI classifier leveraged the following features, extracted from the Scopus records returned by querying against the approximately 800 provided AI search terms. Each record fed into the feature creation also maintained a list of each search term that hit for that particular record:
• hasAbs: Boolean value for whether or not the record had an abstract text section (e.g., some records contain only a title and optional keywords)
• coreCnt: number of core-scored search terms present for the record
• mediumCnt: number of medium-scored search terms present for the record
• lowCnt: number of low-scored search terms present for the record
• totalCnt: total number of search terms present for the record
• pcntCore: coreCnt / totalCnt
• pcntMedium: mediumCnt / totalCnt
• pcntLow: lowCnt / totalCnt
• totalWeight: 5 * coreCnt + 3 * mediumCnt + 1 * lowCnt
• normWeight: totalWeight / (title.length + abstract.length) if the record has an abstract, else totalWeight / title.length
• hasASJC: Boolean value for whether the record has an associated ASJC list
• isAiASJC: does the ASJC list contain 1702?
• isCompSciASJC: does the ASJC list contain a 17XX ASJC code (“1700” through “1712”)?
• isCompSubj: does the Scopus record have a Computer Science subject code associated with it? This should track 1:1 with isCompSciASJC. Scopus has 27 major subject areas, of which one is Computer Science; the feature checks whether the publication is within Computer Science, but it is not an exclusion criterion.
• pcntCompSciASJC: percentage of ASJC codes for the record that are from the CompSci ASJC code list
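The count and weight features above translate directly into code. The sketch below computes them exactly as defined in the list, assuming a simple record shape (title string, optional abstract string, and per-tier term counts); that record shape is an illustrative stand-in for an actual Scopus record.

```python
# Sketch of the Elsevier classifier's count/weight features, following the
# definitions in the list above. The input record format is illustrative.
def classifier_features(title: str, abstract: str | None,
                        core: int, medium: int, low: int) -> dict:
    total = core + medium + low
    total_weight = 5 * core + 3 * medium + 1 * low
    # normWeight divides by title+abstract length when an abstract exists,
    # otherwise by title length alone.
    text_len = len(title) + (len(abstract) if abstract else 0)
    return {
        "hasAbs": abstract is not None,
        "coreCnt": core, "mediumCnt": medium, "lowCnt": low, "totalCnt": total,
        "pcntCore": core / total if total else 0.0,
        "pcntMedium": medium / total if total else 0.0,
        "pcntLow": low / total if total else 0.0,
        "totalWeight": total_weight,
        "normWeight": total_weight / text_len if text_len else 0.0,
    }

print(classifier_features("Deep learning for X", "We apply neural networks ...",
                          core=2, medium=1, low=0))
```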
Details on Elsevier’s dataset defining AI, country affiliations, and AI subcategories can be found in the 2018 AI Index Report Appendix.

Nuance
• The Scopus system is retroactively updated. As a result, the number of papers for a given query may increase over time.
• Members of the Elsevier team commented that data on papers published after 1995 would be the most reliable. The raw data has 1996 as the starting year for Scopus data.

Nuances specific to AI publications by region
• Papers are counted using whole counting rather than fractional counting. Papers assigned to multiple countries (or regions) due to collaborations are counted toward each country (or region). This explains why top-line numbers in a given year may not match individual country numbers. For example, a paper assigned to Germany, France, and the United States will appear in each country’s count, but only once for Europe (plus once for the U.S.), as well as being counted only once at the global level.
• “Other” includes all other countries that have published one or more AI papers on Scopus.

Nuances specific to publications by topic
• The 2017 AI Index Report showed only AI papers within the CS category. In the 2018 and 2019 reports, all papers tagged as AI were included, regardless of whether they fell into the larger CS category.
• Scopus has a subject category called AI, which is a subset of CS, but this is relevant only for a subject-category approach to defining AI papers. The methodology used for the report includes all papers, since increasingly not all AI papers fall under CS.

Nuances specific to methodology
• The entire data collection process was done internally by Elsevier. The AI Index was not involved in the keyword selection process or the counting of relevant papers.
• The boundaries of AI are difficult to establish, in part because of rapidly increasing applications in many fields, such as speech recognition, computer vision, robotics, cybersecurity, bioinformatics, and healthcare. But limits are also difficult to define because of AI’s methodological dependency on many areas, such as logic, probability and statistics, optimization, photogrammetry, neuroscience, and game theory, to name just a few. Given the community’s interest in AI bibliometrics, it would be valuable if groups producing these studies strove for a level of transparency in their methods, which would support the reproducibility of results, particularly on different underlying bibliographic databases.
AI Training Set
A training set of approximately 1,500 publications defines the AI field. The set consists only of EIDs (the Scopus identifiers of the underlying publications); the publications themselves can be searched and downloaded either from Scopus directly or via the API. The training set is a set of publications randomly selected from the initial seven million publications. After running the algorithm, we verify the results of the training set against the gold set (expert hand-checked publications that are definitely AI).

MICROSOFT ACADEMIC GRAPH: METHODOLOGY
Prepared by Zhihong Shen, Boya Xie, Chiyuan Huang, Chieh-Han Wu, and Kuansan Wang

Source
The Microsoft Academic Graph (MAG)1 is a heterogeneous graph containing scientific publication records and citation relationships between those publications, as well as authors, institutions, journals, conferences, and fields of study. The graph is used to power experiences in Bing, Cortana, Word, and Microsoft Academic, and it is currently updated on a weekly basis. Learn more about MAG at https://www.microsoft.com/en-us/research/project/microsoft-academic-graph/.

Methodology
MAG Data Attribution: Each paper is counted exactly once. When a paper has multiple authors or regions, the credit is distributed equally across the unique regions. For example, if a paper has two authors from the United States, one from China, and one from the United Kingdom, then the United States, China, and the United Kingdom each get one-third credit.

Metrics: Total number of published papers (journal papers, conference papers, patents, repository2); total number of citations of published papers.

Definition: The citation and reference counts represent the number of the respective metrics for AI papers, collected from all papers. For example, in “OutAiPaperCitationCountryPairByYearConf.csv,” a row stating “China, United States, 2016, 14955” means that China’s conference AI papers published in 2016 received 14,955 citations from (all) U.S. papers indexed by MAG.

Curating the MAG Dataset and References: Generally speaking, the robots sit on top of a Bing crawler, read everything from the web, and have access to the entire web index. As a result, MAG is able to program the robots to conduct more web searches than a typical human can complete. This helps disambiguate entities with the same names. For example, for authors, MAG additionally uses all the CVs and institutional homepages on the web as signals to recognize and verify claims3. MAG has found this approach to be superior to the results of the best of the KDD Cup 2013 competition, which uses only data from within publication records and Open Researcher and Contributor Identifiers (ORCIDs).
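The attribution rule above has a direct implementation: each paper contributes exactly one unit of credit, split equally across its unique regions, so the US/China/UK example yields one-third each. A minimal sketch follows (the paper records are hypothetical); note the contrast with the whole counting used for the Scopus data above, where each collaborating country receives a full count.

```python
# Sketch of MAG-style fractional attribution: each paper contributes exactly
# 1.0 credit in total, split equally across its unique regions.
from collections import defaultdict

def region_credits(papers):
    """papers: list of lists of author regions (duplicates allowed)."""
    credits = defaultdict(float)
    for author_regions in papers:
        unique = set(author_regions)  # regions, not authors, share the credit
        for region in unique:
            credits[region] += 1.0 / len(unique)
    return dict(credits)

# Two U.S. authors, one Chinese, one British: each region gets one-third.
print(region_credits([["US", "US", "China", "UK"]]))
# {'US': 0.333..., 'China': 0.333..., 'UK': 0.333...}
```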
Notes About the MAG Data
Conference Papers: After the contents and data sources were scrutinized, it was determined that some of the 2020 conference papers were not properly tagged with their conference venue. Many conference papers in the MAG system come in as arXiv papers, but due to issues arising from some data sources (including delays in DBLP and web form changes on the ACM website), they were possibly omitted as 2020 conference papers (ICML-PKDD, IROS, etc.). However, the top AI conferences (selected not in terms of publication count alone, but considering both publication and citation counts as well as community prestige) are complete. In 2020, the top 20 conferences presented 103,000 papers, which is 13.7% of all AI conference papers, and they received 7.15 million citations collectively, contributing 47% of all citations received by AI conference papers. The number of 2020 conference publications is slightly lower than in 2019. Data is known to be missing for ICCV and NAACL, and about 100 Autonomous Agents and Multiagent Systems (AAMAS) conference papers are erroneously attributed to an eponymous journal.

Unknown Countries for Journals and Conferences: For the past 20 to 30 years, 30% of journal and conference affiliation data lacks affiliation by country or region, due to errors in paper format, data sources, and PDF parsing, among other causes.

1 See “A Review of Microsoft Academic Services for Science of Science Studies” and “Microsoft Academic Graph: When Experts Are Not Enough” for more details.
2 Repository as a publication type in MAG refers to both preprints and postprints. In the AI domain, it predominantly comes from arXiv. See “Is Preprint the Future of Science? A Thirty Year Journey of Online Preprint Services” for details.
3 See “Machine Verification for Paper and Author Claims” and “How Microsoft Academic Uses Knowledge to Address the Problem of Conflation/Disambiguation” for details.

MICROSOFT ACADEMIC GRAPH: PATENT DATA CHALLENGE
As mentioned in the report, the patent data, especially the affiliation information, is incomplete in the MAG database. The reason for the low coverage is twofold. First, applications published by the patent offices often identify the inventors by their residencies, not their affiliations. While patent applications often include information about the “assignees” of a patent, this does not necessarily mean the underlying inventions originate from the assignee institutions; detected affiliations may therefore be inaccurate. In cases where a patent discloses the scholarly publications underlying the invention, MAG can infer inventors’ affiliations through those publications. Second, to maximize intellectual property protection around the globe, institutions typically file multiple patent applications for the same invention under various jurisdictions. These multiple filings, while they appear very different because the titles and inventor names are often translated into local languages, are in fact the result of a single invention. Raw patent counts therefore inflate the inventions in their respective domains. To remediate this issue, MAG uses the patent family ID feature to combine all filings with the original filing, which allows the database to count filings of the same origin from around the world only once.4

4 Read “Sharpening Insights into the Innovation Landscape with a New Approach to Patents” for more details.
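The family-based deduplication described above amounts to grouping filings by family ID and counting each family once; the sketch below illustrates this, with all record fields being hypothetical stand-ins rather than the actual MAG schema.

```python
# Sketch of patent-family deduplication: filings of the same invention share a
# family ID, so each family is counted once. Field names are illustrative.
def count_inventions(filings):
    """filings: list of dicts with a 'family_id' key. Returns distinct inventions."""
    return len({f["family_id"] for f in filings})

filings = [
    {"family_id": "F1", "office": "USPTO", "title": "Neural control system"},
    {"family_id": "F1", "office": "EPO",   "title": "Neuronales Steuerungssystem"},
    {"family_id": "F2", "office": "JPO",   "title": "Image recognition device"},
]
print(count_inventions(filings))  # 2, not 3: the two F1 filings are one invention
```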
Conflating the multiple patent applications of the same invention is not perfect, however, and over-conflation of patents is more noticeable in MAG than for scholarly articles. These challenges raise questions about the reliability of data on the share of AI patent publications by both region and geographic area; those charts are included below.

[Figure 1.4.1: AI patent publications (% of world total) by region, 2000–20. 2020 values: North America 3.2%, East Asia & Pacific 2.6%, Europe & Central Asia 1.6%, South Asia 0.1%, with the remaining regions near 0.0%. Source: Microsoft Academic Graph, 2020 | Chart: 2021 AI Index Report]

[Figure 1.4.2: AI patent publications (% of world total) by geographic area, 2000–20. 2020 values: US 3.2%, EU 1.3%, China 0.4%. Source: Microsoft Academic Graph, 2020 | Chart: 2021 AI Index Report]

[Figure 1.4.3: AI patent citations (% of world total) by geographic area, 2000–20. 2020 values: US 8.6%, EU 2.1%, China 2.0%. Source: Microsoft Academic Graph, 2020 | Chart: 2021 AI Index Report]

MICROSOFT ACADEMIC GRAPH: MEASUREMENT CHALLENGES AND ALTERNATIVE DEFINITION OF AI
As the AI Index team discussed in the paper “Measurement in AI Policy: Opportunities and Challenges,” choosing how to define AI and correctly capture relevant bibliometric data remains challenging. Data in the main report is based on a restricted definition of AI, adopted by MAG, that aligns with what has been used in previous AI Index reports. One consequence is that such a definition excludes many AI publications from venues considered to be core AI venues: for example, only 25% of publications at the 2020 AAAI conference are included in the original conference dataset. To spur discussion on this important topic, this section presents the MAG data with an alternative definition of AI used by the Organisation for Economic Co-operation and Development (OECD).
OECD defines AI publications as papers in the MAG database tagged with a field of study that is categorized under either the “artificial intelligence” or the “machine learning” field of study, as well as their subtopics in the MAG taxonomy.5 This is a more liberal definition than the one used by MAG, which considers only those publications tagged with “artificial intelligence” to be AI publications. For example, an application paper in biology that uses ML techniques will be counted as an AI publication under the OECD definition, but not under the MAG definition unless the paper is specifically tagged in the AI category. Charts corresponding to those in the main text but using the OECD definition are presented below. The overall trends are very similar.

5 Read the OECD.AI Policy Observatory MAG methodological note for more details on the MAG-OECD definition of AI and “A Web-scale System for Scientific Knowledge Exploration” on the MAG taxonomy.
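The difference between the two definitions reduces to which field-of-study tags qualify a paper. A sketch follows, under the simplifying assumption that each paper carries a flat set of tag names; in reality the MAG taxonomy is hierarchical, so the qualifying subtopics would first be expanded into these sets.

```python
# Sketch of the two AI-paper definitions as tag filters. Real MAG fields of
# study form a hierarchy; subtopics are assumed pre-expanded into the sets.
MAG_AI_TAGS = {"artificial intelligence"}                       # MAG definition
OECD_AI_TAGS = {"artificial intelligence", "machine learning"}  # OECD (+ subtopics)

def is_ai_paper(tags, qualifying):
    """A paper counts as AI if any of its tags is in the qualifying set."""
    return bool(set(tags) & qualifying)

# The biology/ML example from the text: counted under OECD, not under MAG.
paper_tags = {"biology", "machine learning"}
print(is_ai_paper(paper_tags, MAG_AI_TAGS))   # False
print(is_ai_paper(paper_tags, OECD_AI_TAGS))  # True
```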
AI Conference Publications (OECD Definition)

All charts below: Source: Microsoft Academic Graph, 2020 | Chart: 2021 AI Index Report.

[Figure 1.5.5a — OECD DEFINITION: NUMBER of AI CONFERENCE PUBLICATIONS, 2000-20.]

[Figure 1.5.5b — OECD DEFINITION: AI CONFERENCE PUBLICATIONS (% of ALL CONFERENCE PUBLICATIONS), 2000-20. 2020 value: 37.6%.]

[Figure 1.5.6 — OECD DEFINITION: AI CONFERENCE PUBLICATIONS (% of WORLD TOTAL) by REGION, 2000-20. 2020 shares: East Asia & Pacific 26.5%; North America 21.2%; Europe & Central Asia 20.0%; South Asia 5.0%; Middle East & North Africa 2.4%; Latin America & Caribbean 1.8%; Sub-Saharan Africa 0.4%.]

[Figure 1.5.7 — OECD DEFINITION: AI CONFERENCE PUBLICATIONS (% of WORLD TOTAL) by GEOGRAPHIC AREA, 2000-20. 2020 shares: US 18.9%; China 15.0%; EU 14.2%.]

[Figure 1.5.8 — OECD DEFINITION: AI CONFERENCE CITATIONS (% of WORLD TOTAL) by GEOGRAPHIC AREA, 2000-20. 2020 shares: US 14.3%; China 4.2%; EU 4.2%.]

AI Patent Publications (OECD Definition)

[Figure 1.5.9a — OECD DEFINITION: NUMBER of AI PATENT PUBLICATIONS, 2000-20.]
[Figure 1.5.9b — OECD DEFINITION: AI PATENT PUBLICATIONS (% of ALL PATENT PUBLICATIONS), 2000-20. 2020 value: 4.6%.]

[Figure 1.5.10 — OECD DEFINITION: AI PATENT PUBLICATIONS (% of WORLD TOTAL) by REGION, 2000-20. 2020 shares: North America 3.3%; East Asia & Pacific 2.1%; Europe & Central Asia 1.4%; South Asia 0.1%; Middle East & North Africa 0.1%; Latin America & Caribbean 0.0%; Sub-Saharan Africa 0.0%.]

[Figure 1.5.11 — OECD DEFINITION: AI PATENT PUBLICATIONS (% of WORLD TOTAL) by GEOGRAPHIC AREA, 2000-20. 2020 shares: US 3.2%; EU 1.0%; China 0.4%.]

[Figure 1.5.12 — OECD DEFINITION: AI PATENT CITATIONS (% of WORLD TOTAL) by GEOGRAPHIC AREA, 2000-20. 2020 shares: US 5.6%; China 1.2%; EU 1.1%.]

PAPERS ON ARXIV
Prepared by Jim Entwood and Eleonora Presani

Source
arXiv.org (http://arxiv.org) is an online archive of research articles in the fields of physics, mathematics, computer science, quantitative biology, quantitative finance, statistics, electrical engineering and systems science, and economics. arXiv is owned and operated by Cornell University.

Methodology
Raw data for our analysis was provided by representatives at arXiv.org. The keywords we selected, and their respective categories, are below:
Artificial intelligence (cs.AI)
Computation and language (cs.CL)
Computer vision and pattern recognition (cs.CV)
Machine learning (cs.LG)
Neural and evolutionary computing (cs.NE)
Robotics (cs.RO)
Machine learning in stats (stats.ML)

For most categories, arXiv provided data for 2015–2020. To review other categories' submission rates on arXiv, see arXiv's submission statistics (https://arxiv.org/help/stats/2017_by_area/index and https://arxiv.org/about/reports/submission_category_by_year). The arXiv team has been expanding the publicly available submission statistics; this is a Tableau-based application with tabs at the top for various displays of submission stats and filters on the sidebar to drill down by topic (hover over the charts to view individual categories). The data is displayed on a monthly basis, with download options. arXiv is actively looking at ways to better support AI/ML researchers as the field grows and discovering content becomes more challenging.
For example, there may be ways to create finer-grained categories in arXiv for machine learning to help researchers in subfields share and find work more easily. The other rapidly expanding area is computer vision, where there is considerable overlap with ML applications of computer vision.

Nuance
• Categories are self-identified by authors, and those shown are the ones selected as the "primary" category; there is not a single automated categorization process. Additionally, papers relevant to artificial intelligence or machine learning may be filed under other subfields or keywords.
• arXiv team members suggest that participation on arXiv can breed greater participation, meaning that an increase in a subcategory on arXiv could drive over-indexed participation by certain communities.

NESTA
Prepared by Joel Kliger and Juan Mateos-Garcia

Source
Details can be found in the publication "Deep Learning, Deep Change? Mapping the Development of the Artificial Intelligence General Purpose Technology" (https://arxiv.org/pdf/1808.06355.pdf).

Methodology
Deep learning papers were identified through a topic-modeling analysis of the abstracts of arXiv papers in the CS (computer science) and stats.ML (statistics: machine learning) arXiv categories. The data was enriched with institutional affiliation and geographic information from the Microsoft Academic Graph and the Global Research Identifier. Nesta's arXlive tool is available at https://arxlive.org/.

Access the Code
The code for data collection and processing can be found at https://github.com/nestauk/nesta/tree/dev/nesta/core/routines/arxiv or, without the infrastructure overhead, at https://github.com/nestauk/arxiv_ai/tree/master/ai_index.

GITHUB STARS
Source
GitHub star-history (https://github.com/timqian/star-history, hosted at https://star-history.t9t.io) was used to retrieve the data.

Methodology
The visual in the report shows the number of stars for various GitHub repositories over time. The repositories include: apache/incubator-mxnet, BVLC/caffe, caffe2/caffe2, dmlc/mxnet, fchollet/keras, Microsoft/CNTK, pytorch/pytorch, scikit-learn/scikit-learn, tensorflow/tensorflow, Theano/Theano, and torch/torch7.

Nuance
The GitHub Archive currently does not provide a way to count when users remove a star from a repository; therefore, the reported data slightly overestimates the number of stars. A comparison with the actual number of stars for the repositories on GitHub (which can be read directly from the GitHub API, as in the sketch below) reveals that the numbers are fairly close and that the trends remain unchanged.
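For illustration only, a minimal sketch of such a spot check using the public GitHub REST API's repository endpoint (unauthenticated requests are rate-limited; the three repositories shown are a subset of the list above):

import json
import urllib.request

REPOS = ["pytorch/pytorch", "tensorflow/tensorflow", "scikit-learn/scikit-learn"]

for repo in REPOS:
    # GET /repos/{owner}/{repo} returns repository metadata, including the
    # current star count in the "stargazers_count" field.
    with urllib.request.urlopen(f"https://api.github.com/repos/{repo}") as resp:
        data = json.load(resp)
    print(repo, data["stargazers_count"])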
CHAPTER 2: TECHNICAL PERFORMANCE

IMAGENET: ACCURACY
Prepared by Jörg Hellwig and Thomas A. Collins

Source
Data on ImageNet accuracy was retrieved through an arXiv literature review. All results reported were tested on the LSVRC 2012 validation set, as the results on the test set, which are not significantly different, are not public. Their ordering may differ from the results reported on the LSVRC website, since those results were obtained on the test set.

Dates we report correspond to the day a paper was first published to arXiv, and top-1 accuracy corresponds to the result reported in the most recent version of each paper. We selected a top result at any given point in time from 2012 to Nov. 17, 2019. Some of the results we mention were submitted to LSVRC competitions over the years. Image classification was part of LSVRC through 2014; in 2015, it was replaced with an object localization task, where results for classification were still reported but were no longer part of the competition, having been replaced by more difficult tasks. For papers published in 2014 and later, we report the best result obtained using a single model (we did not include ensembles) and single-crop testing. For the three earliest models (AlexNet, ZFNet, Five Base), we report the results for ensembles of models.

While we report the results as described above, due to the diversity in models, evaluation methods, and accuracy metrics, there are many other ways to report ImageNet performance. Some possible choices include:
• Evaluation set: validation set (available publicly) or test set (available only to LSVRC organizers)
• Performance metric: top-1 accuracy (whether the correct label is the first predicted label for each image) or top-5 accuracy (whether the correct label is among the top five predicted labels for each image); a toy sketch computing both metrics follows the paper lists below
• Evaluation method: single-crop or multi-crop

To highlight progress in top-5 accuracy, we have taken scores from the following papers, without extra training data:
Fixing the Train-Test Resolution Discrepancy: FixEfficientNet
Adversarial Examples Improve Image Recognition
OverFeat: Integrated Recognition, Localization and Detection Using Convolutional Networks
Local Relation Networks for Image Recognition
Densely Connected Convolutional Networks
Revisiting Unreasonable Effectiveness of Data in Deep Learning Era
Squeeze-and-Excitation Networks
EfficientNet: Rethinking Model Scaling for Convolutional Neural Networks
MultiGrain: A Unified Image Embedding for Classes and Instances
Billion-Scale Semi-Supervised Learning for Image Classification
GPipe: Efficient Training of Giant Neural Networks Using Pipeline Parallelism
RandAugment: Practical Data Augmentation with No Separate Search
Fixing the Train-Test Resolution Discrepancy
To highlight progress in top-5 accuracy, we have taken scores from the following papers, with extra training data:
Meta Pseudo Labels
Self-Training with Noisy Student Improves ImageNet Classification
Big Transfer (BiT): General Visual Representation Learning
ImageNet Classification with Deep Convolutional Neural Networks
ESPNetv2: A Light-Weight, Power Efficient, and General Purpose Convolutional Neural Network
Xception: Deep Learning with Depthwise Separable Convolutions
EfficientNet: Rethinking Model Scaling for Convolutional Neural Networks

To highlight progress in top-1 accuracy, we have taken scores from the following papers, without extra training data:
Fixing the Train-Test Resolution Discrepancy: FixEfficientNet
Adversarial Examples Improve Image Recognition
OverFeat: Integrated Recognition, Localization and Detection Using Convolutional Networks
Densely Connected Convolutional Networks
Revisiting Unreasonable Effectiveness of Data in Deep Learning Era
Dual Path Networks
Res2Net: A New Multi-Scale Backbone Architecture
Billion-Scale Semi-Supervised Learning for Image Classification
Squeeze-and-Excitation Networks
EfficientNet: Rethinking Model Scaling for Convolutional Neural Networks
MultiGrain: A Unified Image Embedding for Classes and Instances
RandAugment: Practical Data Augmentation with No Separate Search
Fixing the Train-Test Resolution Discrepancy

To highlight progress in top-1 accuracy, we have taken scores from the following papers, with extra training data:
Meta Pseudo Labels
Sharpness-Aware Minimization for Efficiently Improving Generalization
An Image Is Worth 16x16 Words: Transformers for Image Recognition at Scale
Fixing the Train-Test Resolution Discrepancy: FixEfficientNet
Self-Training with Noisy Student Improves ImageNet Classification
Big Transfer (BiT): General Visual Representation Learning
ImageNet Classification with Deep Convolutional Neural Networks
ESPNetv2: A Light-Weight, Power Efficient, and General Purpose Convolutional Neural Network
Xception: Deep Learning with Depthwise Separable Convolutions
EfficientNet: Rethinking Model Scaling for Convolutional Neural Networks

The estimate of human-level performance is from Russakovsky et al., 2015 (https://arxiv.org/pdf/1409.0575.pdf). Learn more about the LSVRC ImageNet competition (http://image-net.org/challenges/LSVRC/) and the ImageNet dataset (http://image-net.org/).
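For illustration only, a minimal sketch computing the top-1 and top-5 metrics defined above from per-image class scores (random scores and labels are placeholders, not ImageNet data):

import numpy as np

def top_k_accuracy(scores, labels, k):
    # scores: (n_samples, n_classes); labels: (n_samples,) integer class indices.
    top_k = np.argsort(scores, axis=1)[:, -k:]       # k highest-scoring classes
    hits = (top_k == labels[:, None]).any(axis=1)    # is the label among them?
    return hits.mean()

rng = np.random.default_rng(0)
scores = rng.random((100, 1000))                     # 1,000 ImageNet classes
labels = rng.integers(0, 1000, size=100)
print(top_k_accuracy(scores, labels, 1), top_k_accuracy(scores, labels, 5))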
Links for the papers above (Papers with Code):
https://paperswithcode.com/paper/meta-pseudo-labels
https://paperswithcode.com/paper/sharpness-aware-minimization-for-efficiently-1
https://paperswithcode.com/paper/an-image-is-worth-16x16-words-transformers
https://paperswithcode.com/paper/self-training-with-noisy-student-improves
https://paperswithcode.com/paper/large-scale-learning-of-general-visual
https://paperswithcode.com/paper/imagenet-classification-with-deep
https://paperswithcode.com/paper/espnetv2-a-light-weight-power-efficient-and
https://paperswithcode.com/paper/xception-deep-learning-with-depthwise
https://paperswithcode.com/paper/efficientnet-rethinking-model-scaling-for
https://paperswithcode.com/paper/fixing-the-train-test-resolution-discrepancy-2
https://paperswithcode.com/paper/fixing-the-train-test-resolution-discrepancy
https://paperswithcode.com/paper/adversarial-examples-improve-image
https://paperswithcode.com/paper/overfeat-integrated-recognition-localization
https://paperswithcode.com/paper/local-relation-networks-for-image-recognition
https://paperswithcode.com/paper/densely-connected-convolutional-networks
https://paperswithcode.com/paper/revisiting-unreasonable-effectiveness-of-data
https://paperswithcode.com/paper/dual-path-networks
https://paperswithcode.com/paper/res2net-a-new-multi-scale-backbone
https://paperswithcode.com/paper/billion-scale-semi-supervised-learning-for
https://paperswithcode.com/paper/squeeze-and-excitation-networks
https://paperswithcode.com/paper/multigrain-a-unified-image-embedding-for
https://paperswithcode.com/paper/gpipe-efficient-training-of-giant-neural
https://paperswithcode.com/paper/randaugment-practical-data-augmentation-with

IMAGENET: TRAINING TIME
Trends can also be observed by studying research papers that discuss the time it takes to train ImageNet on any infrastructure. To gather this data, we looked at research papers from the past few years that tried to optimize training ImageNet to a standard accuracy level while competing on reducing the overall training time.

Source
The data is sourced from MLPerf (https://mlperf.org/). Detailed data for runs in specific years is available:
2020: MLPerf Training v0.7 Results (https://mlperf.org/training-results-0-7)
2019: MLPerf Training v0.6 Results (https://mlperf.org/training-results-0-6)
2018: MLPerf Training v0.5 Results (https://mlperf.org/training-results-0-5)

Notes
MLPerf categorizes submissions by system availability. Available In Cloud systems are available for rent in the cloud.
Available On Premise systems contain only components that are available for purchase. Preview systems must be submittable as Available In Cloud or Available On Premise in the next submission round. Research, Development, or Internal (RDI) systems contain experimental, in-development, or internal-use hardware or software.

Each row in the results table is a set of results produced by a single submitter using the same software stack and hardware platform. Each row contains the following information:
• Submitter: the organization that submitted the results
• System: general system description
• Processor and count: the type and number of CPUs used, if CPUs perform the majority of ML compute
• Accelerator and count: the type and number of accelerators used, if accelerators perform the majority of ML compute
• Software: the ML framework and primary ML hardware library used
• Benchmark results: training time to reach a specified target quality, measured in minutes
• Details: link to metadata for the submission
• Code: link to code for the submission
• Notes: arbitrary notes from the submitter

IMAGENET: TRAINING COST
Source
DAWNBench (https://dawn.cs.stanford.edu) is a benchmark suite for end-to-end deep-learning training and inference. Computation time and cost are critical resources in building deep models, yet many existing benchmarks focus solely on model accuracy. DAWNBench provides a reference set of common deep-learning workloads for quantifying training time, training cost, inference latency, and inference cost across different optimization strategies, model architectures, software frameworks, clouds, and hardware.

Note
The DAWNBench data source has been deprecated for the period after March 2020; MLPerf is the most reliable and up-to-date source for AI compute measurements.

COCO: KEYPOINT DETECTION
The COCO keypoint detection data is sourced from the COCO keypoints leaderboard (https://cocodataset.org/#keypoints-leaderboard).

COCO: DENSEPOSE ESTIMATION
We gathered data from the CodaLab 2020 challenge and read arXiv repository papers to build comprehensive data on technical progress in this challenge. The papers and sources used in our survey include:
DensePose: Dense Human Pose Estimation in the Wild (https://arxiv.org/abs/1802.00434)
COCO-DensePose 2018 CodaLab (https://competitions.codalab.org/competitions/19636#results)
Parsing R-CNN for Instance-Level Human Analysis (https://arxiv.org/abs/1811.12596)
Capture Dense: Markerless Motion Capture Meets Dense Pose Estimation (https://arxiv.org/abs/1812.01783)
Slim DensePose: Thrifty Learning from Sparse Annotations and Motion Cues (https://arxiv.org/abs/1906.05706)
COCO-DensePose 2020 CodaLab (https://competitions.codalab.org/competitions/20660#results)
Transferring Dense Pose to Proximal Animal Classes (https://arxiv.org/abs/2003.00080)
Making DensePose Fast and Light (https://arxiv.org/abs/2006.15190)
SimPose: Effectively Learning DensePose and Surface Normals of People from Simulated Data (http://arxiv.org/abs/2007.15506)
ACTIVITYNET: TEMPORAL LOCALIZATION TASK
In the challenge, there are three separate tasks, but they focus on the main problem of temporally localizing where activities happen in untrimmed videos from the ActivityNet benchmark (http://www.activity-net.org/). We have compiled several attributes for the task of temporal localization at the challenge over the last four rounds. Overall stats and trends for this task, as well as some detailed analysis (e.g., how has the performance for individual activity classes improved over the years? Which are the hardest and easiest classes now? Which classes have improved the most over the years?), are linked below. See the Performance Diagnosis (2020) tab for a detailed trends update, and see ActivityNet Statistics in the public data folder (https://docs.google.com/spreadsheets/d/1yVmy433Dp9WjV-g_ZbFKSdRLrqRKKPHk61AtRKxVfW4/edit?usp=sharing) for more details.

YOLO (YOU ONLY LOOK ONCE)
YOLO is a neural network model mainly used for the detection of objects in images and in real-time video. mAP (mean average precision) is the metric used to measure the accuracy of object detectors: for each class, average precision summarizes the precision-recall curve, and mAP is the mean of these per-class average precisions (a toy sketch follows the reference list below). The performance of YOLO has increased gradually with the development of new architectures and versions in past years; as model size increases, mean average precision increases as well, with a corresponding decrease in the FPS of the processed video. We conducted a detailed survey of arXiv papers and GitHub repositories to segment progress in YOLO across its various versions. Below are the references for the original sources:
YOLOv1: You Only Look Once: Unified, Real-Time Object Detection (https://arxiv.org/abs/1506.02640)
YOLOv2: YOLO9000: Better, Faster, Stronger (https://arxiv.org/abs/1612.08242); YOLO: Real-Time Object Detection (https://pjreddie.com/yolo/)
YOLOv3: YOLOv3: An Incremental Improvement (https://arxiv.org/abs/1804.02767); Learning Spatial Fusion for Single-Shot Object Detection (https://arxiv.org/abs/1911.09516); GitHub: ultralytics/yolov3 (https://github.com/ultralytics/yolov3)
YOLOv4: YOLOv4: Optimal Speed and Accuracy of Object Detection (https://arxiv.org/abs/2004.10934); GitHub: AlexeyAB/darknet (https://github.com/AlexeyAB/darknet)
YOLOv5: GitHub: ultralytics/yolov5 (https://github.com/ultralytics/yolov5)
PP-YOLO: PP-YOLO: An Effective and Efficient Implementation of Object Detector (https://arxiv.org/abs/2007.12099)
Poly-YOLO: Poly-YOLO: Higher Speed, More Precise Detection and Instance Segmentation for YOLOv3 (https://arxiv.org/abs/2005.13243)
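For illustration only, a minimal sketch of the mAP computation described above, with hypothetical per-class precision-recall points and a simple rectangle-rule integration (real evaluators interpolate precision and match detections by IoU):

import numpy as np

def average_precision(recall, precision):
    # Area under the precision-recall curve, rectangle rule on sorted points.
    order = np.argsort(recall)
    r, p = np.asarray(recall)[order], np.asarray(precision)[order]
    widths = np.diff(np.concatenate(([0.0], r)))  # recall increments
    return float(np.sum(widths * p))

# Hypothetical precision-recall points for two object classes.
ap_car    = average_precision([0.2, 0.5, 1.0], [1.0, 0.8, 0.6])
ap_person = average_precision([0.3, 0.6, 0.9], [0.9, 0.7, 0.5])
print("mAP:", np.mean([ap_car, ap_person]))  # mean of per-class APs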
VISUAL QUESTION ANSWERING (VQA)
VQA accuracy data was provided by the VQA team (https://visualqa.org/people.html). Learn more about VQA at https://visualqa.org/index.html; more details on VQA 2020 are available at https://drive.google.com/file/d/1oypaw0uhBTRSQFtq7TqLvlvVuLWTOzwc/view.

Methodology
Given an image and a natural language question about the image, the task is to provide an accurate natural language answer. The challenge is hosted on the VQA Challenge website (https://visualqa.org/challenge.html) and run on EvalAI (http://evalai.cloudcv.org/; challenge page: https://evalai.cloudcv.org/web/challenges/challenge-page/514/overview).

The VQA v2.0 training, validation, and test sets, containing more than 250,000 images and 1.1 million questions, are available on the download page (https://visualqa.org/download.html). All questions are annotated with 10 concise, open-ended answers each. Annotations on the training and validation sets are publicly available.

VQA Challenge 2020 is the fifth edition of the VQA Challenge. Results from previous editions were announced at the VQA Challenge Workshops at CVPR 2019, CVPR 2018, CVPR 2017, and CVPR 2016. More details about past challenges can be found here: VQA Challenge 2019 (https://visualqa.org/challenge_2019.html), VQA Challenge 2018 (https://visualqa.org/challenge_2018.html), VQA Challenge 2017 (https://visualqa.org/challenge_2017.html), VQA Challenge 2016 (https://visualqa.org/vqa_v1_challenge.html).

VQA had 10 humans answer each question. More details about the VQA evaluation metric and human accuracy can be found on the evaluation page (https://visualqa.org/evaluation.html; see the Evaluation Code section) and in sections three ("Answers") and four ("Inter-Human Agreement") of the paper (https://arxiv.org/pdf/1505.00468.pdf). See slide 56 of the 2020 Challenge deck (https://drive.google.com/file/d/1yJISTi9PhQblI6aLgkMnojstx2frN5iY/view?usp=sharing) for the progress graph in VQA; the values corresponding to the progress graph are available in a sheet (https://docs.google.com/spreadsheets/d/1f4VLkRG2NtrcTQXTOwZwNRw68G5BzrFP_OeZKwNJVSs/edit?usp=sharing), as is information about the teams that participated in the 2020 challenge and their accuracies (https://docs.google.com/spreadsheets/d/1tDl54e6db5MDnlzqod4I6Kim5-rBGlR_tMo-An8_w10/edit?usp=sharing). For more details about the teams, please refer to the VQA website (https://visualqa.org/roe.html).

PAPERS WITH CODE: PAPER AND CODE LINKING
We used Papers with Code (PWC, https://paperswithcode.com) for referencing technical progress where available. Learn more about PWC at https://paperswithcode.com/about, and see the state-of-the-art tables at https://paperswithcode.com/sota.

Methodology
For papers, we follow specific ML-related categories on arXiv (see the list below) and the major ML conferences (NeurIPS, ICML, ICLR, etc.). For code, we follow GitHub repositories mentioning papers. We have good coverage of core ML topics but are missing some applications (for instance, applications of ML in medicine or bioinformatics, which are usually in journals behind paywalls). For code, the dataset is fairly unbiased (as long as the paper is freely available). For tasks (e.g., "image classification"), the dataset has annotations on 1,600 state-of-the-art papers from the database, published in 2018 Q3. For state-of-the-art tables (e.g., "image classification on ImageNet"), the data has been scraped from different sources (see the full list at https://github.com/paperswithcode/sota-extractor), and a large number focusing on CV and NLP were hand-annotated. A significant portion of the data was contributed by users, who have added data based on their own preferences and interests.

The arXiv categories we follow:
ARXIV_CATEGORIES = {"cs.CV", "cs.AI", "cs.LG", "cs.CL", "cs.NE", "stat.ML", "cs.IR"}

Process of Extracting the Dataset at Scale
1) Follow the various paper sources (described above) for new papers.
2) Conduct a number of predefined searches on GitHub (e.g., for READMEs containing links to arXiv).
3) Extract GitHub links from papers.
4) Extract paper links from GitHub.
5) Run validation tests to decide whether links from 3) and 4) are bona fide links or false positives.
6) Let the community fix any errors and/or add any missing values.
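For illustration only, a simplified sketch of the flavor of steps 3) and 4): regex extraction of arXiv IDs from repository READMEs and of GitHub repositories from paper text. These patterns and the example repository name are assumptions; PWC's actual pipeline and validation are more involved.

import re

ARXIV_RE  = re.compile(r"arxiv\.org/(?:abs|pdf)/(\d{4}\.\d{4,5})")
GITHUB_RE = re.compile(r"github\.com/([\w.-]+/[\w.-]+)")

readme = "Code for our paper: https://arxiv.org/abs/1905.00537"
paper  = "Our implementation: https://github.com/example/superglue-baselines"  # hypothetical repo

print(ARXIV_RE.findall(readme))   # ['1905.00537']
print(GITHUB_RE.findall(paper))   # ['example/superglue-baselines']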
NIST FRVT
Source
There are two FRVT evaluation leaderboards: 1:1 Verification (https://pages.nist.gov/frvt/html/frvt11.html) and 1:N Identification (https://pages.nist.gov/frvt/html/frvt1N.html).

Nuances about FRVT evaluation metrics
• Wild photos have some identity-labeling errors: the best algorithms reach a low false non-match rate (FNMR), but obtaining complete convergence is difficult. This task will be retired in the future.
• The Wild data became public in 2018 and has become easier over time. Wild images come from public web sources, so it is possible those same images have been scraped from the web by developers.
• There is no training data in FRVT, only test data.
• The 1:1 and 1:N tasks should be studied separately. The differences include algorithmic approaches; in particular, fast search algorithms are especially useful in 1:N, whereas speed is not a factor in 1:1.

SUPERGLUE
The SuperGLUE benchmark data was pulled from the SuperGLUE leaderboard (https://super.gluebenchmark.com/leaderboard). Details about the SuperGLUE benchmark are in the SuperGLUE paper (https://arxiv.org/abs/1905.00537) and the SuperGLUE software toolkit (https://jiant.info/). The tasks and evaluation metrics for SuperGLUE are:

NAME | IDENTIFIER | METRIC
Broad Coverage Diagnostics | AX-b | Matthew's Corr
CommitmentBank | CB | Avg. F1 / Accuracy
Choice of Plausible Alternatives | COPA | Accuracy
Multi-Sentence Reading Comprehension | MultiRC | F1a / EM
Recognizing Textual Entailment | RTE | Accuracy
Words in Context | WiC | Accuracy
The Winograd Schema Challenge | WSC | Accuracy
BoolQ | BoolQ | Accuracy
Reading Comprehension with Commonsense Reasoning | ReCoRD | F1 / Accuracy
Winogender Schema Diagnostics | AX-g | Gender Parity / Accuracy

VISUAL COMMONSENSE REASONING (VCR)
Technical progress for VCR is taken from the VCR leaderboard (https://visualcommonsense.com/leaderboard/). VCR has two subtasks:
• Question Answering (Q->A): A model is provided a question and has to pick the best answer out of four choices; only one of the four is correct.
• Answer Justification (QA->R): A model is provided a question along with the correct answer, and it must justify the answer by picking the best rationale among four choices.
The two parts are combined in the Q->AR metric, in which a model gets a question right only if it answers correctly and picks the right rationale. Models are evaluated in terms of accuracy (%); a toy sketch of the combined metric follows.
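For illustration only, a minimal sketch of the combined Q->AR metric over hypothetical per-example results:

answer_correct    = [True, True, False, True]   # Q->A results (hypothetical)
rationale_correct = [True, False, False, True]  # QA->R results (hypothetical)

# An example counts toward Q->AR only if both parts are correct.
q_ar = [a and r for a, r in zip(answer_correct, rationale_correct)]
print(f"Q->A  accuracy: {sum(answer_correct) / len(answer_correct):.2f}")  # 0.75
print(f"Q->AR accuracy: {sum(q_ar) / len(q_ar):.2f}")                      # 0.50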
VOXCELEB
VoxCeleb (https://www.robots.ox.ac.uk/~vgg/data/voxceleb/) is an audio-visual dataset consisting of short clips of human speech, extracted from interview videos uploaded to YouTube. VoxCeleb contains speech from more than 7,000 speakers spanning a wide range of ethnicities, accents, professions, and ages, amounting to over a million utterances (face-tracks are captured "in the wild," with background chatter, laughter, overlapping speech, pose variation, and different lighting conditions) recorded over 2,000 hours (both audio and video). Each segment is at least three seconds long. The data contains an audio dataset based on celebrity voices, shorts, films, and conversational pieces (e.g., talk shows). The initial VoxCeleb1 (100,000 utterances from 1,251 celebrities on YouTube) was expanded to VoxCeleb2 (1 million utterances from 6,112 celebrities).

In earlier years of the challenge, top-1 and top-5 scores were also reported. For the top-1 score, the system is correct if the target label is the class to which it assigns the highest probability; for the top-5 score, the system is correct if the target label is among the five predictions with the highest probabilities. In both cases, the score is computed as the number of times a predicted label matches the target label, divided by the number of data points evaluated.

The data is extracted from different years of the submission challenges, including:
2017: VoxCeleb: A Large-Scale Speaker Identification Dataset (http://www.robots.ox.ac.uk/~vgg/publications/2017/Nagrani17/nagrani17.pdf)
2018: VoxCeleb2: Deep Speaker Recognition (http://www.robots.ox.ac.uk/~vgg/publications/2018/Chung18a/chung18a.pdf)
2019: VoxCeleb: Large-Scale Speaker Verification in the Wild (https://www.robots.ox.ac.uk/~vgg/publications/2019/Nagrani19/nagrani19.pdf)
2020: Query Expansion System for the VoxCeleb Speaker Recognition Challenge 2020 (https://arxiv.org/pdf/2011.02882.pdf)

BOOLEAN SATISFIABILITY PROBLEM
Analysis and text by Lars Kotthoff

Primary Source and Data Sets
The Boolean Satisfiability Problem (SAT) asks whether there is an assignment of values to a set of Boolean variables, joined by logical connectives, that makes the logical formula it represents true. SAT was the first problem proven NP-complete, and the first algorithms to solve it were developed in the 1960s. Many real-world problems, such as circuit design, automated theorem proving, and scheduling, can be represented and solved efficiently as SAT.

The annual SAT competition (http://www.satcompetition.org/) is designed to present a snapshot of the state of the art and has been running for almost 20 years. We took the top-ranked, median-ranked, and bottom-ranked solvers from each of the last five years (2016-2020) of the SAT competition and ran all 15 solvers on all 400 SAT instances from the main track of the 2020 competition. More information on the competition, as well as the solvers and instances, is available at the SAT competition website.

Results
We ran each solver on each instance on the same hardware, with a time limit of 5,000 CPU seconds per instance, and measured the time it took a solver to solve an instance in CPU seconds. Ranked solvers always return correct results, hence we do not consider correctness as a metric. Except for the 2020 competition solvers, we evaluated the performance of the SAT solvers on a set of instances different from the set of instances they competed on.
Further, our hardware is different from what was used for the SAT competition, so the results we report here will differ from the exact results reported for the respective SAT competitions.

The Shapley value is a concept from cooperative game theory that assigns to each player a contribution to the total value that a coalition generates. It quantifies how important each player is to the coalition and has several desirable properties that make the distribution of the total value to the individual players fair. The Shapley value is used, for example, to distribute airport costs to users, to allocate funds to different marketing campaigns, and in machine learning, where it helps render complex black-box models more explainable. In our context, it quantifies the contribution of a solver to the state of the art through the average performance improvement it provides over a set of other solvers and over all subsets of solvers (Fréchette et al. (2016), https://ada.liacs.nl/papers/FreEtAl16.pdf). For a given set of solvers, we choose the respective best solver for each instance. By including another solver and being able to choose it, overall solving performance improves; the difference relative to the original set of solvers is the marginal contribution of the added solver. The Shapley value is the average marginal contribution over all sets of solvers (a toy computation is sketched below).

Quantifying the contribution of a solver through the Shapley value compares solvers from earlier competitions to solvers in later competitions. This is often not a fair comparison: later solvers are often improved versions of earlier solvers, and the contribution of an earlier solver to the future state of the art will always be low. The temporal Shapley value (Kotthoff et al. (2018), https://www.ijcai.org/Proceedings/2018/0716.pdf) solves this problem by taking into account the time a particular solver was introduced when quantifying its contribution to the state of the art.
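For illustration only, a toy Shapley computation for a three-solver portfolio. The runtimes, the timeout, and the coalition value function (total runtime saved versus the timeout by the virtual-best solver) are assumptions chosen for the sketch, not the paper's exact setup:

from itertools import permutations
from math import factorial

# Hypothetical runtimes (CPU seconds) of three solvers on three instances.
runtimes = {"A": [10.0, 50.0, 90.0], "B": [40.0, 20.0, 80.0], "C": [30.0, 60.0, 15.0]}
TIMEOUT = 100.0

def value(coalition):
    # Coalition value: runtime saved when the best member is chosen per instance.
    if not coalition:
        return 0.0
    return sum(TIMEOUT - min(runtimes[s][i] for s in coalition) for i in range(3))

solvers = sorted(runtimes)
shapley = dict.fromkeys(solvers, 0.0)
for order in permutations(solvers):          # average marginal contribution
    seen = set()                             # over all join orders
    for s in order:
        shapley[s] += value(seen | {s}) - value(seen)
        seen.add(s)
n_orders = factorial(len(solvers))
print({s: v / n_orders for s, v in shapley.items()})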
AUTOMATED THEOREM PROVING
Analysis and text by Geoff Sutcliffe, Christian Suttner, and Raymond Perrault

1. Motivation
Automated Theorem Proving (ATP), also referred to as Automated Deduction, is a subfield of automated reasoning concerned with the development and use of systems that automate sound reasoning: the derivation of conclusions that follow inevitably from facts. ATP systems are at the heart of many computational tasks and are used commercially, e.g., for integrated circuit design and computer program verification. ATP problems are typically solved by showing that a conjecture is or is not a logical consequence of a set of axioms. ATP problems are encoded in a chosen logic, and an ATP system for that logic is used to (attempt to) solve the problem. A key concern of ATP research is the development of more powerful systems, capable of solving more difficult problems within the same resource limits. In order to assess the merits of new techniques, sound empirical evaluations of ATP systems are key.

2. Analysis
For the evaluation of ATP systems, there exists a large and growing collection of problems called the TPTP problem library. The current release, v7.4.0 (released June 10, 2020), contains 23,291 ATP problems structured into 54 topic domains (e.g., set theory, software verification, philosophy). Orthogonally, the TPTP is divided into Specialist Problem Classes (SPCs), each of which contains problems with a specified set of logical, language, and syntactic characteristics (e.g., first-order logic theorems with some use of equality). The SPCs allow ATP system developers to select problems and evaluate their systems appropriately. Since its first release in 1993, many researchers have used the TPTP as an appropriate and convenient basis for ATP system evaluation. Over the years, the TPTP has also increasingly been used as a conduit for ATP users to contribute samples of their problems to ATP system developers. This exposes the problems to ATP system developers, who can then improve their systems' performance on the problems, which completes a cycle that provides users with more effective tools.

Associated with the TPTP is the TSTP solution library, which maintains updated results from running all current versions of ATP systems (available to the maintainer) on all the TPTP problems. One use of the TSTP is to compute TPTP problem difficulty ratings: easy problems, which are solved by all ATP systems, have a rating of 0.0; difficult problems, which are solved by only some ATP systems, have ratings between 0.0 and 1.0; unsolved problems, which are not solved by any ATP system, have a rating of 1.0. Note that the rating for a problem is not strictly decreasing, as different ATP systems and versions become available for populating the TSTP. The history of each TPTP problem's ratings is saved with the problem, which makes it possible to tell when the problem was first solved by any ATP system (the point at which its rating dropped below 1.0). That information has been used here to obtain an indication of progress in the field.

The simplest way to measure progress takes a fixed set of problems that has been available (and unchanged) in the TPTP from some chosen initial release and then, for each TPTP release from then on, counts how many of the problems had been solved as of that release. The analysis reports the fraction of problems solved for each release. This simple approach is unambiguous, but it does not take into account new problems added to the TPTP after the initial release. The analysis used here extends the "Fixed Set" analysis by taking into account new problems added after the initial release. As it is not possible to run all previously available ATP systems on new problems when they are added, this approach assumes that if a problem is unsolved by current ATP systems when it is added to the TPTP, then it would have been unsolved by previously available ATP systems. Under that assumption, the new problem is retrospectively "added" to prior TPTP releases for the analysis. If a problem is solved when it is added to the TPTP, it is ignored, because it may have been solved in prior versions as well and therefore should not serve as an indicator of progress.
This analysis reports the fraction of problems solved for each release, but note that the fraction is with respect to both the number of problems actually in the release and the problems retrospectively "added." The growing-set analysis is performed on the whole TPTP and on four SPCs. These were chosen because many ATP problems in those forms have been contributed to the TPTP, and correspondingly there are many ATP systems that can attempt them; they represent the "real world" demand for ATP capability. The table in the public data folder (https://docs.google.com/spreadsheets/d/1NsRsa1T8b2BNLXKcjj3K4gNp2gkYG4jHMiZhhy26cu4/edit#gid=208696205) shows the breakdown of TPTP problems by content field, as well as by the SPCs used in the analysis. The totals are slightly larger than those shown in the analysis, as some problems were left out for technical reasons (no scores available, problems revised over time, etc.).

CHAPTER 3: ECONOMY

LINKEDIN
Prepared by Mar Carpanelli, Ramanujam MV, and Nathan Williams

Country Sample
Included countries represent a select sample of eligible countries with at least 40% labor force coverage by LinkedIn and at least 10 AI hires in any given month. China and India were included in this sample because of their increasing importance in the global economy, but LinkedIn coverage in these countries does not reach 40% of the workforce. Insights for these countries may not provide as full a picture as for other countries and should be interpreted accordingly.

Skills
LinkedIn members self-report their skills on their LinkedIn profiles. Currently, more than 35,000 distinct, standardized skills are identified by LinkedIn. These have been coded and classified by taxonomists at LinkedIn into 249 skill groupings, which are the skill groups represented in the dataset. The top skills that make up the AI skill grouping are machine learning, natural language processing, data structures, artificial intelligence, computer vision, image processing, deep learning, TensorFlow, Pandas (software), and OpenCV, among others. Skill groupings are derived by expert taxonomists through a similarity-index methodology that measures skill composition at the industry level. Industries are classified according to the ISIC 4 industry classification (Zhu et al., 2018).

AI Skills Penetration
The aim of this indicator is to measure the intensity of AI skills in an entity (a particular country, industry, gender, etc.) through the following methodology (a minimal sketch follows the list):
• Compute frequencies for all self-added skills by LinkedIn members in a given entity (occupation, industry, etc.) in 2015–2020.
• Re-weight skill frequencies using a TF-IDF model to get the top 50 most representative skills in that entity. These 50 skills compose the "skill genome" of that entity.
• Compute the share of skills that belong to the AI skill group out of the top skills in the selected entity.
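For illustration only, a minimal sketch of the final step, assuming hypothetical skill names and pre-computed TF-IDF weights (a small k is used so the toy input is visible end to end):

AI_SKILL_GROUP = {"machine learning", "deep learning", "computer vision",
                  "natural language processing", "tensorflow"}

def ai_skill_penetration(weighted_skills, k=50):
    # weighted_skills: {skill: TF-IDF weight} for one entity (country, occupation...).
    top_k = sorted(weighted_skills, key=weighted_skills.get, reverse=True)[:k]
    return sum(skill in AI_SKILL_GROUP for skill in top_k) / len(top_k)

# Hypothetical "skill genome" for the occupation of engineer.
engineer = {"cad": 0.9, "machine learning": 0.8, "matlab": 0.7, "deep learning": 0.6}
print(ai_skill_penetration(engineer, k=4))  # 2 of the top 4 skills are AI -> 0.5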
Interpretation: The AI skill penetration rate signals the prevalence of AI skills across occupations, or the intensity with which LinkedIn members utilize AI skills in their jobs. For example, the top 50 skills for the occupation of engineer are calculated based on the weighted frequency with which they appear in LinkedIn members' profiles. If four of the skills that engineers possess belong to the AI skill group, the penetration of AI skills among engineers is estimated to be 8% (i.e., 4/50).

Relative AI Skills Penetration
To allow for skill penetration comparisons across countries, the skill genomes are calculated and a relevant benchmark is selected (e.g., the global average). A ratio is then constructed between a country's AI skill penetration and the benchmark's, controlling for occupations.
Interpretation: A country's relative AI skill penetration of 1.5 indicates that AI skills are 1.5 times as frequent as in the benchmark, for an overlapping set of occupations.

Global Comparison
For cross-country comparison, we present the relative penetration rate of AI skills, measured as the sum of the penetration of each AI skill across occupations in a given country, divided by the average global penetration of AI skills across the overlapping occupations in a sample of countries.
Interpretation: A relative penetration rate of 2 means that the average penetration of AI skills in that country is two times the global average across the same set of occupations.

Global Comparison: By Industry
The relative AI skill penetration by country and industry provides an in-depth sectoral decomposition of AI skill penetration across industries and sample countries.
Interpretation: A country's relative AI skill penetration rate of 2 in the education sector means that the average penetration of AI skills in that country is two times the global average across the same set of occupations in that sector.

LinkedIn AI Hiring Index
The LinkedIn AI hiring rate is calculated as the total number of LinkedIn members who are identified as AI talent and added a new employer in the same month the new job began, divided by the total number of LinkedIn members in the country. By analyzing only the timeliest data, it is possible to make month-to-month comparisons and account for any potential lags in members updating their profiles. The baseline time period is typically a year, and the index is expressed relative to the average month of that year. The AI hiring rate is indexed against the average hiring in 2016; for example, an index of 3.5 for Brazil in 2020 indicates that the AI hiring rate in 2020 is 3.5 times the 2016 average.
Interpretation: The hiring index measures the rate of hiring in the AI field, specifically how fast each country is experiencing growth in AI hiring.

Top AI Skills
AI skills most frequently added by members during the 2015–2020 period.

BURNING GLASS TECHNOLOGIES
Prepared by Bledi Taska, Layla O'Kane, and Zhou Zhou

Burning Glass Technologies delivers job market analytics that empower employers, workers, and educators to make data-driven decisions. The company's artificial intelligence technology analyzes hundreds of millions of job postings and real-life career transitions to provide insight into labor market patterns. This real-time strategic intelligence offers crucial insights, such as which jobs are most in demand, the specific skills employers need, and the career directions that offer the highest potential for workers. For more information, visit burning-glass.com.
Job Posting Data
To support these analyses, Burning Glass mined its dataset of millions of job postings collected since 2010. Burning Glass collects postings from over 45,000 online job sites to develop a comprehensive, real-time portrait of labor market demand. It aggregates job postings, removes duplicates, and extracts data from the posting text, including information on job title, employer, industry, and region, as well as required experience, education, and skills.

Job postings are useful for understanding trends in the labor market because they allow for a detailed, real-time look at the skills employers seek. To assess the representativeness of job postings data, Burning Glass conducts a number of analyses comparing the distribution of its job postings to the distribution in official government and other third-party sources in the United States. The primary source of government data on U.S. job postings is the Job Openings and Labor Turnover Survey (JOLTS) program, conducted by the Bureau of Labor Statistics.

To understand the share of job openings captured by Burning Glass data, it is important to first note that Burning Glass and JOLTS collect data on job postings differently. Burning Glass captures new postings: a posting appears in the data only in the first month it is found and is considered a duplicate and removed in subsequent months. JOLTS captures active postings: a posting appears in the data for every month it is still actively posted, meaning the same posting can be counted in two or more consecutive months if it has not been filled. To allow for an apples-to-apples volume comparison, the Burning Glass counts need to be inflated to account for active postings, not only new postings. The number of postings from Burning Glass can be inflated using the ratio of new jobs to active jobs in Help Wanted OnLine™ (HWOL), a method used in Carnevale, Jayasundera, and Repnikov (2014). Based on this calculation, the share of jobs online as captured by Burning Glass was roughly 85% of the jobs captured in JOLTS in 2016; the labor market demand captured by Burning Glass data represents over 85% of total labor demand. Jobs not posted online are usually in small businesses (the classic example being the "Help Wanted" sign in the restaurant window) and union hiring halls.

Measuring Demand for AI
In order to measure employer demand for AI skills, Burning Glass uses its skills taxonomy of over 17,000 skills. The AI skills from Burning Glass data are listed below, with their associated skill clusters. While some skills are considered to be in the AI cluster specifically, for the purposes of this report, all skills below were considered AI skills. A job posting was considered an AI job if it requested one or more of these skills (a minimal sketch of this rule follows the list).
Artificial Intelligence: Expert System, IBM Watson, IPSoft Amelia, Ithink, Virtual Agents, Autonomous Systems, Lidar, OpenCV, Path Planning, Remote Sensing

Natural Language Processing (NLP): ANTLR, Automatic Speech Recognition (ASR), Chatbot, Computational Linguistics, Distinguo, Latent Dirichlet Allocation, Latent Semantic Analysis, Lexalytics, Lexical Acquisition, Lexical Semantics, Machine Translation (MT), Modular Audio Recognition Framework (MARF), MoSes, Natural Language Processing, Natural Language Toolkit (NLTK), Nearest Neighbor Algorithm, OpenNLP, Sentiment Analysis/Opinion Mining, Speech Recognition, Text Mining, Text to Speech (TTS), Tokenization, Word2Vec

Neural Networks: Caffe Deep Learning Framework, Convolutional Neural Network (CNN), Deep Learning, Deeplearning4j, Keras, Long Short-Term Memory (LSTM), MXNet, Neural Networks, Pybrain, Recurrent Neural Network (RNN), TensorFlow

Machine Learning: AdaBoost algorithm, Boosting (Machine Learning), Chi Square Automatic Interaction Detection (CHAID), Classification Algorithms, Clustering Algorithms, Decision Trees, Dimensionality Reduction, Google Cloud Machine Learning Platform, Gradient boosting, H2O (software), Libsvm, Machine Learning, Madlib, Mahout, Microsoft Cognitive Toolkit, MLPACK (C++ library), Mlpy, Random Forests, Recommender Systems, Scikit-learn, Semi-Supervised Learning, Supervised Learning (Machine Learning), Support Vector Machines (SVM), Semantic Driven Subtractive Clustering Method (SDSCM), Torch (Machine Learning), Unsupervised Learning, Vowpal, Xgboost

Robotics: Blue Prism, Electromechanical Systems, Motion Planning, Motoman Robot Programming, Robot Framework, Robotic Systems, Robot Operating System (ROS), Robot Programming, Servo Drives / Motors, Simultaneous Localization and Mapping (SLAM)

Visual Image Recognition: Computer Vision, Image Processing, Image Recognition, Machine Vision, Object Recognition
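For illustration only, a minimal sketch of the one-or-more-skills rule using a small subset of the skill list above (naive substring matching; a real pipeline would tokenize and disambiguate):

AI_SKILLS = {"machine learning", "computer vision", "tensorflow",
             "speech recognition", "robot operating system (ros)"}

def is_ai_posting(posting_text):
    # Flag a posting as an AI job if it requests at least one AI skill.
    text = posting_text.lower()
    return any(skill in text for skill in AI_SKILLS)

print(is_ai_posting("Seeking engineer with TensorFlow and computer vision."))  # True
print(is_ai_posting("Seeking accountant with Excel experience."))              # False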
NETBASE QUID

Prepared by Julie Kim

NetBase Quid is a big data analytics platform that inspires full-picture thinking by drawing connections across massive amounts of unstructured data. The software applies advanced natural language processing technology, semantic analysis, and artificial intelligence algorithms to reveal patterns in large, unstructured datasets and to generate visualizations that allow users to gain actionable insights. NetBase Quid uses Boolean queries to search for focus areas, topics, and keywords within its archived news and blogs, companies, and patents databases, as well as any custom uploaded datasets. Searches of the news can be filtered by published date, source region, source category, or industry category, and searches of the companies database by region, investment amount, operating status, organization type (private/public), and founding year. NetBase Quid then visualizes these data points based on semantic similarity.

Search, Data Sources, and Scope

Here, 3.6 million public and private company profiles from multiple data sources are indexed in order to search across company descriptions, while filtering and including metadata ranging from investment information to firmographic information, such as founding year, HQ location, and more. Company information is updated on a weekly basis. The Quid algorithm reads large amounts of text data from each document (news article, company description, etc.) to make links between documents based on similar language. This process is repeated at an immense scale and produces a network with distinct clusters identifying topics or focus areas. Trends are identified based on the keywords, phrases, people, companies, and institutions that Quid extracts, and on the other metadata fed into the software. (A minimal sketch of this kind of document linking appears at the end of this section.)

Data

Organization data is embedded from CapIQ and Crunchbase. These include all types of companies (private, public, operating, operating as a subsidiary, out of business) throughout the world. The investment data includes private investments, M&A, public offerings, and minority stakes made by PE/VCs, corporate venture arms, governments, and institutions both within and outside the United States. Some data is simply unreachable, for instance when the investors or the funding amounts are undisclosed. Quid also embeds firmographic information such as founding year and HQ location. NetBase Quid embeds CapIQ data as a default and adds in data from Crunchbase for companies not captured in CapIQ. This yields not only comprehensive and accurate data on global organizations but also captures early-stage startups and funding events. Company information is uploaded on a weekly basis.

Methodology

Boolean queries are used to search for focus areas, topics, and keywords in the archived company database, within business descriptions and websites. Search results can be filtered by HQ region, investment amount, operating status, organization type (private/public), and founding year. Quid then visualizes these companies. If a search returns more than 7,000 companies, Quid selects the 7,000 most relevant for visualization based on its language algorithm.

Boolean Search: "artificial intelligence" or "AI" or "machine learning" or "deep learning"

Companies:
• Chart 3.2.1: Global AI & ML companies that have received investment (private, IPO, M&A) from 01/01/2011 to 12/31/2020.
• Charts 3.2.2–3.2.6: Global AI & ML companies that have received over USD 400,000 in investment over the last 10 years (January 1, 2011, to December 31, 2020); 7,000 companies out of 7,500 were selected through Quid's relevance algorithm.

Target Event Definitions
• Private investments: A private placement is a private sale of newly issued securities (equity or debt) by a company to a selected investor or a selected group of investors. The stakes that buyers take in private placements are often minority stakes (under 50%), although it is possible to take control of a company through a private placement as well, in which case the private placement would be a majority stake investment.
• Minority investment: These refer to minority stake acquisitions in Quid, which take place when the buyer acquires less than 50% of the existing ownership stake in entities, asset products, and business divisions.
• M&A: This refers to a buyer acquiring more than 50% of the existing ownership stake in entities, asset products, and business divisions.
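Quid's linking algorithm is proprietary, but the general pattern described above, vectorizing descriptions and linking sufficiently similar documents, can be sketched with standard tools. A minimal sketch using scikit-learn; the similarity threshold and the toy descriptions are invented, and this is not Quid's actual pipeline:

```python
# Minimal sketch of similarity-based document linking in the spirit of the
# pipeline described above (not Quid's actual algorithm). Documents become
# TF-IDF vectors; pairs above a similarity threshold are linked; connected
# components of the resulting graph would play the role of topic clusters.

from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

docs = [
    "computer vision startup building retail analytics",
    "retail analytics platform using computer vision",
    "drug discovery with machine learning",
    "machine learning models for drug discovery pipelines",
]

tfidf = TfidfVectorizer().fit_transform(docs)
sim = cosine_similarity(tfidf)

THRESHOLD = 0.3  # invented; real systems tune this
links = [(i, j) for i in range(len(docs)) for j in range(i + 1, len(docs))
         if sim[i, j] >= THRESHOLD]
print(links)  # expect documents (0, 1) and (2, 3) to be linked
```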
MCKINSEY & COMPANY

Source

This survey was written, fielded, and analyzed by McKinsey & Company. Additional results from the Global AI Survey are available at https://www.mckinsey.com/business-functions/mckinsey-analytics/our-insights/global-survey-the-state-of-ai-in-2020.

Methodology

The survey was conducted online, was in the field from June 9 to June 19, 2020, and garnered responses from 2,395 participants representing the full range of regions, industries, company sizes, functional specialties, and tenures. Of those respondents, 1,151 said their organizations had adopted AI in at least one function and were asked questions about their organizations' AI use. To adjust for differences in response rates, the data are weighted by the contribution of each respondent's nation to global GDP. McKinsey also conducted interviews with executives between May and August 2020 about their companies' use of AI. All quotations from executives were gathered during those interviews.

Note

Survey respondents are limited by their perception of their organization's AI adoption.

INTERNATIONAL FEDERATION OF ROBOTICS

Source

Data was received directly from the International Federation of Robotics' (IFR) 2020 World Robotics Report. Learn more about IFR at https://ifr.org/.

Methodology

The data displayed is the number of industrial robots installed by country. Industrial robots are defined by the ISO 8373:2012 standard. More information on IFR's methodology is available at https://ifr.org/downloads/press2018/WR%20Industrial%20Robots%202019_Chapter_1.pdf.

Nuance
• It is unclear how to identify what percentage of robot units run software that would be classified as "AI," and it is unclear to what extent AI development contributes to industrial robot usage.
• This metric was called "robot imports" in the 2017 AI Index Report.

PRATTLE (EARNINGS CALLS ONLY)

Prepared by Jeffrey Banner and Steven Nichols

Source

Liquidnet provides sentiment data that predicts the market impact of central bank and corporate communications. Learn more about Liquidnet at https://www.liquidnet.com/.

CRA TAULBEE SURVEY

Prepared by Betsy Bizot (CRA senior research associate) and Stu Zweben (CRA survey chair, professor emeritus at The Ohio State University)

Source

Computing Research Association (CRA) members are 200-plus North American organizations active in computing research: academic departments of computer science and computer engineering; laboratories and centers in industry, government, and academia; and affiliated professional societies (AAAI, ACM, CACS/AIC, IEEE Computer Society, SIAM, USENIX). CRA's mission is to enhance innovation by joining with industry, government, and academia to strengthen research and advanced education in computing. Learn more about CRA at https://cra.org/.

Methodology

The CRA Taulbee Survey gathers data during the fall of each academic year from over 200 PhD-granting departments. Details about the survey can be found at https://cra.org/resources/taulbee-survey/. Taulbee does not directly survey the students; the department identifies each new PhD's area of specialization as well as their type of employment. Data is collected from September to January of each academic year for PhDs awarded in the previous academic year, and results are published in May after data collection closes. The 2019 data points were thus newly available in spring 2020, and the numbers for 2020 will be available in May 2021.
The CRA Taulbee Survey is sent only to doctoral departments of computer science, computer engineering, and information science/systems. Historically, (a) Taulbee covers one-quarter to one-third of total BS CS recipients in the United States; (b) the percentage of women earning bachelor's degrees is lower in the Taulbee schools than overall; and (c) Taulbee tracks the trends in overall CS degree production.

Nuances
• Of particular interest in PhD job market trends are the metrics on the AI PhD area of specialization. The categorization of specialty areas changed in 2008 and was clarified in 2016: from 2004 to 2007, AI and robotics were grouped together; from 2008 onward, AI is separate; and in 2016 the survey clarified to respondents that AI includes ML.
• On the trends in new tenure-track hires (overall and particularly at AAU schools): in the 2018 Taulbee Survey, for the first time, we asked how many new hires had come from the following sources: new PhD, postdoc, industry, and other academic. Results indicated that 29% of new assistant professors came from another academic institution. Some may have been teaching or research faculty rather than tenure-track, but there is probably some movement between institutions, meaning the total number hired overstates the total who are actually new.

CHAPTER 4: AI EDUCATION

AI INDEX EDUCATION SURVEY

Prepared by Daniel Zhang (Stanford Institute for Human-Centered Artificial Intelligence)

Methodology

The survey was distributed online to 73 universities over three waves from November 2020 to January 2021 and completed by 18 universities, a 24.7% response rate. The selection of universities is based on the World University Rankings 2021 and the Emerging Economies University Rankings 2020 by Times Higher Education. The 18 universities are:
• Belgium: Katholieke Universiteit Leuven
• Canada: McGill University
• China: Shanghai Jiao Tong University, Tsinghua University
• Germany: Ludwig Maximilian University of Munich, Technical University of Munich
• Russia: Higher School of Economics, Moscow Institute of Physics and Technology
• Switzerland: École Polytechnique Fédérale de Lausanne
• United Kingdom: University of Cambridge
• United States: California Institute of Technology, Carnegie Mellon University (Department of Machine Learning), Columbia University, Harvard University, Stanford University, University of Wisconsin–Madison, University of Texas at Austin, Yale University

Key Definitions
• Major or study program: a set of required and elective courses in an area of discipline, such as AI, that leads to a bachelor's degree upon successful completion.
• Course: a set of classes that require a minimum of 2.5 class hours (including lecture, lab, TA hours, etc.) per week for at least 10 weeks in total. Multiple courses with the same titles and numbers count as one course.
• Practical Artificial Intelligence Models - Keywords: Adaptive learning, AI Application, Anomaly detection, Artificial general intelligence, Artificial intelligence, Audio processing, Automated vehicle, Automatic translation, Autonomous system, Autonomous vehicle, Business intelligence, Chatbot, Computational creativity, Computational linguistics, Computational neuroscience, Computer vision, Control theory, Cyber physical system, Deep learning, Deep neural network, Expert system, Face recognition, Human-AI interaction, Image processing, Image recognition, Inductive programming, Intelligence software, Intelligent agent, Intelligent control, Intelligent software development, Intelligence system, Knowledge representation and reasoning, Machine learning, Machine translation, Multi-agent system, Narrow artificial intelligence, Natural language generation, Natural language processing, Natural language understanding, Neural network, Pattern recognition, Predictive analysis, Recommender system, Reinforcement learning, Robot system, Robotics, Semantic web, Sentiment analysis, Service robot, Social robot, Sound synthesis, Speaker identification, Speech processing, Speech recognition, Speech synthesis, Strong artificial intelligence, Supervised learning, Support vector machine, Swarm intelligence, Text mining, Transfer learning, Unsupervised learning, Voice recognition, Weak artificial intelligence (adapted from Joint Research Centre, European Commission, p. 68: https://publications.jrc.ec.europa.eu/repository/bitstream/JRC121680/jrc121680_jrc121680_academic_offer_of_advanced_digital_skills.pdf)
• AI Ethics - Keywords: Accountability, Consent, Contestability, Ethics, Equality, Explainability, Fairness, Non-discrimination, Privacy, Reliability, Safety, Security, Transparency, Trustworthy AI, Uncertainty, Well-being (adapted from the same JRC report)

EU ACADEMIC OFFERING, JOINT RESEARCH CENTRE, EUROPEAN COMMISSION

Prepared by Giuditta De-Prato, Montserrat López Cobo, and Riccardo Righi

Source

The Joint Research Centre (JRC) is the European Commission's science and knowledge service. The JRC employs scientists to carry out research in order to provide independent scientific advice and support to EU policy. Learn more about the JRC at https://ec.europa.eu/info/departments/joint-research-centre_en.
Methodology

By means of text-mining techniques, the study identifies AI-related education programs from the program descriptions present in the database. To query the database, a list of domain-specific keywords is obtained through a multistep methodology involving (i) selection of top keywords from AI-specific scientific journals; (ii) extraction of terms representative of the industrial dimension of the technology; (iii) topic modeling; and (iv) validation by experts. In this edition, the list of keywords has been enlarged to better cover certain AI subdomains and to expand to related transversal domains, such as philosophy and ethics in AI. The keywords are then grouped into categories, which are used to analyze the content areas taught in the identified programs. The content areas are adapted from the JRC report "Defining Artificial Intelligence: Towards an Operational Definition and Taxonomy of Artificial Intelligence" (https://ec.europa.eu/jrc/en/publication/ai-watch-defining-artificial-intelligence), produced in the context of AI Watch (https://knowledge4policy.ec.europa.eu/ai-watch_en).

The education programs are classified as specialized or broad according to the depth with which they address artificial intelligence. Specialized programs have a strong focus on AI, e.g., "automation and computer vision" or "advanced computer science (computational intelligence)." Broad programs address the domain in a more generic way, usually aiming to build wider profiles or referring to the domain within a program specialized in a different discipline (e.g., biomedical engineering). Because of methodological improvements introduced in this edition, namely the addition of new keywords, a strict comparison with the 2019 study is not possible; still, more than 90% of the programs detected in this edition are triggered by keywords present in the 2019 study.

The original source on which queries are performed is the Studyportals database, which covers over 207,000 programs from 3,700 universities in over 120 countries. Studyportals collects information from institutions' websites, and its database is regularly updated. Although this source offers the widest coverage among all those identified, it still suffers from gaps, mostly because it only tracks English-language programs. This poses a comparability issue between English-native-speaking countries and the rest, and also between countries with differing levels of English as a teaching language in higher education. Bachelor's-level studies, which are mostly taught in the native language, are expected to be more affected than master's programs, which attract more international audiences and faculties. As a consequence, this study may show only a partial picture of the level of inclusion of advanced digital skills in bachelor's degree programs.
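The keyword query and the specialized/broad split can be illustrated in miniature. In this sketch the keyword list and program records are invented stand-ins for the JRC's curated list and the Studyportals data, and title matching is used as a crude proxy for "strong focus":

```python
# Minimal sketch of keyword-triggered retrieval plus the specialized/broad
# distinction described above. Keywords and programs are invented examples,
# not the JRC's actual curated list or the Studyportals database.

from typing import Optional

AI_KEYWORDS = {"artificial intelligence", "computer vision",
               "machine learning", "computational intelligence"}

programs = [
    ("Automation and Computer Vision", "MSc focused on vision systems."),
    ("Biomedical Engineering", "Includes one module on machine learning."),
    ("Art History", "No related content."),
]

def classify(title: str, description: str) -> Optional[str]:
    """'specialized' if AI keywords appear in the title (a crude proxy for a
    strong AI focus), 'broad' if only in the description, None otherwise."""
    title_l, desc_l = title.lower(), description.lower()
    if any(k in title_l for k in AI_KEYWORDS):
        return "specialized"
    if any(k in desc_l for k in AI_KEYWORDS):
        return "broad"
    return None

for title, desc in programs:
    print(title, "->", classify(title, desc))
```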
CHAPTER 5: ETHICAL CHALLENGES OF AI APPLICATIONS

NETBASE QUID

Prepared by Julie Kim

Quid is a data analytics platform within the NetBase Quid portfolio that applies advanced natural language processing technology, semantic analysis, and artificial intelligence algorithms to reveal patterns in large, unstructured datasets and generate visualizations that allow users to gain actionable insights. Quid uses Boolean queries to search for focus areas, topics, and keywords within its archived news and blogs, companies, and patents databases, as well as any custom uploaded datasets. Users can filter news searches by published date, source region, source category, or industry category, and company searches by region, investment amount, operating status, organization type (private/public), and founding year. Quid then visualizes these data points based on semantic similarity.

Network

Searched global news from January 1, 2020, to December 31, 2020, for [AI technology keywords + Harvard ethics principles keywords] (see https://ai-hr.cyber.harvard.edu/primp-viz.html).

Search Query: (AI OR ["artificial intelligence"]("artificial intelligence" OR "pattern recognition" OR algorithms) OR ["machine learning"]("machine learning" OR "predictive analytics" OR "big data" OR "pattern recognition" OR "deep learning") OR ["natural language"]("natural language" OR "speech recognition") OR NLP OR "computer vision" OR ["robotics"]("robotics" OR "factory automation") OR "intelligent systems" OR ["facial recognition"]("facial recognition" OR "face recognition" OR "voice recognition" OR "iris recognition") OR ["image recognition"]("image recognition" OR "pattern recognition" OR "gesture recognition" OR "augmented reality") OR ["semantic search"]("semantic search" OR "data-mining" OR "full-text search" OR "predictive coding") OR "semantic web" OR "text analytics" OR "virtual assistant" OR "visual search") AND (ethics OR "human rights" OR "human values" OR "responsibility" OR "human control" OR "fairness" OR discrimination OR non-discrimination OR "transparency" OR "explainability" OR "safety and security" OR "accountability" OR "privacy")

News Dataset

Data Source

Quid indexes millions of global-source, English-language news articles and blog posts from LexisNexis. The platform has archived news and blogs from August 2013 to the present, updating every 15 minutes. Sources include over 60,000 news outlets and over 500,000 blogs.

Visualization in Quid Software

Quid uses Boolean queries to search for topics, trends, and keywords within the archived news database, with the ability to filter results by published date, source region, source category, or industry category. (In this case, we only looked at global news published from January 1, 2020, to December 31, 2020.) Quid then selects the 10,000 most relevant stories using its NLP algorithm and visualizes the de-duplicated unique articles.

ETHICS IN AI CONFERENCES

Prepared by Marcelo Prates, Pedro Avelar, and Luis C. Lamb

Source

Prates, Marcelo, Pedro Avelar, and Luis C. Lamb. 2018. "On Quantifying and Understanding the Role of Ethics in AI Research: A Historical Account of Flagship Conferences and Journals." September 21, 2018. https://arxiv.org/pdf/1809.08328.pdf.

Methodology

The percentage of keyword matches has a straightforward interpretation: for each category (classical/trending/ethics), it is the share of papers whose title (or abstract, in the case of the AAAI and NeurIPS figures) contains at least one keyword match. The percentages do not necessarily add up to 100%, since the categories are not mutually exclusive: one paper can match all three (a tagging sketch appears after the keyword lists below).
To measure how much ethics in AI is discussed, ethics-related terms are searched for in the titles of papers in flagship AI, machine learning, and robotics conferences and journals. The ethics keywords used were: Accountability, Accountable, Employment, Ethic, Ethical, Ethics, Fool, Fooled, Fooling, Humane, Humanity, Law, Machine Bias, Moral, Morality, Privacy, Racism, Racist, Responsibility, Rights, Secure, Security, Sentience, Sentient, Society, Sustainability, Unemployment, and Workforce.

The classical and trending keyword sets were compiled from the areas in the most cited book on AI, by Russell and Norvig [2012], and by curating the terms that appeared most frequently in paper titles in the venues over time. The keywords chosen for the classical category were: Cognition, Cognitive, Constraint Satisfaction, Game Theoretic, Game Theory, Heuristic Search, Knowledge Representation, Learning, Logic, Logical, Multiagent, Natural Language, Optimization, Perception, Planning, Problem Solving, Reasoning, Robot, Robotics, Robots, Scheduling, Uncertainty, and Vision. The curated trending keywords were: Autonomous, Boltzmann Machine, Convolutional Networks, Deep Learning, Deep Networks, Long Short Term Memory, Machine Learning, Mapping, Navigation, Neural, Neural Network, Reinforcement Learning, Representation Learning, Robotics, Self Driving, Self-Driving, Sensing, Slam, Supervised/Unsupervised Learning, and Unmanned.

The ethics terms searched for were based on the issues identified in the papers below, and on the topics called for discussion in the First AAAI/ACM Conference on AI, Ethics, and Society:
• J. Bossmann. "Top 9 Ethical Issues in Artificial Intelligence." World Economic Forum, 2016. https://www.weforum.org/agenda/2016/10/top-10-ethical-issues-in-artificial-intelligence/.
• Emanuelle Burton, Judy Goldsmith, Sven Koenig, Benjamin Kuipers, Nicholas Mattei, and Toby Walsh. "Ethical Considerations in Artificial Intelligence Courses." AI Magazine, 38(2):22–34, 2017.
• The Royal Society Working Group (P. Donnelly, R. Browsword, Z. Gharamani, N. Griffiths, D. Hassabis, S. Hauert, H. Hauser, N. Jennings, N. Lawrence, S. Olhede, M. du Sautoy, Y.W. Teh, J. Thornton, C. Craig, N. McCarthy, J. Montgomery, T. Hughes, F. Fourniol, S. Odell, W. Kay, T. McBride, N. Green, B. Gordon, A. Berditchevskaia, A. Dearman, C. Dyer, F. McLaughlin, M. Lynch, G. Richardson, C. Williams, and T. Simpson). Machine Learning: The Power and Promise of Computers That Learn by Example. The Royal Society, 2017.
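Because the three keyword sets are matched independently, one title can carry several tags, which is why the per-category percentages need not sum to 100%. A minimal sketch with abbreviated keyword sets and invented titles:

```python
# Minimal sketch of the non-exclusive keyword tagging described above.
# Keyword sets are abbreviated from the lists in this section; titles invented.

CATEGORIES = {
    "classical": {"reasoning", "planning", "logic", "knowledge representation"},
    "trending": {"deep learning", "neural", "reinforcement learning"},
    "ethics": {"ethics", "privacy", "accountability", "fairness"},
}

def tag(title: str) -> set[str]:
    """Return every category whose keyword list matches the title."""
    t = title.lower()
    return {cat for cat, kws in CATEGORIES.items() if any(k in t for k in kws)}

titles = [
    "Deep Learning for Automated Planning",
    "Privacy-Preserving Reinforcement Learning",
    "A Logic for Knowledge Representation",
]
for title in titles:
    print(title, "->", tag(title))
# A single paper can match all three categories, so per-category
# percentages computed over a corpus need not add up to 100%.
```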
Conference and Public Venue - Sample

The AI group contains papers from the main artificial intelligence and machine learning conferences, such as AAAI, IJCAI, ICML, and NIPS, and also from the Artificial Intelligence Journal and the Journal of Artificial Intelligence Research (JAIR). The robotics group contains papers published in the IEEE Transactions on Robotics and Automation (now IEEE Transactions on Robotics), ICRA, and IROS. The CS group contains papers published in mainstream computer science venues such as the Communications of the ACM, IEEE Computer, ACM Computing Surveys, and the ACM and IEEE Transactions.

Codebase

The code and data are hosted at https://github.com/marceloprates/Ethics-AI-Data.

CHAPTER 6: DIVERSITY IN AI

LINKEDIN

AI Skills Penetration

The aim of this indicator is to measure the intensity of AI skills in an entity (a particular country, industry, gender, etc.) through the following methodology:
• Compute frequencies for all self-added skills of LinkedIn members in a given entity (occupation, industry, etc.) in 2015–2020.
• Re-weight skill frequencies using a TF-IDF model to get the top 50 most representative skills in that entity. These 50 skills compose the "skill genome" of that entity.
• Compute the share of skills that belong to the AI skill group out of the top skills in the selected entity.

Interpretation: The AI skill penetration rate signals the prevalence of AI skills across occupations, or the intensity with which LinkedIn members utilize AI skills in their jobs. For example, the top 50 skills for the occupation of engineer are calculated based on the weighted frequency with which they appear in LinkedIn members' profiles. If four of the skills that engineers possess belong to the AI skill group, the penetration of AI skills among engineers is estimated to be 8% (4/50).

Relative AI Skills Penetration

To allow for skills penetration comparisons across countries, the skill genomes are calculated and a relevant benchmark is selected (e.g., the global average). A ratio is then constructed between a country's and the benchmark's AI skill penetrations, controlling for occupations.

Interpretation: A country's relative AI skill penetration of 1.5 indicates that AI skills are 1.5 times as frequent as in the benchmark, for an overlapping set of occupations.

Global Comparison: By Gender

The relative AI skill penetration by country and gender provides an in-depth decomposition of AI skill penetration across female and male labor pools in the sample countries.

Interpretation: A country's relative AI skill penetration rate of 2 for women means that the average penetration of AI skills among women in that country is two times the global average across the same set of occupations among women. If, in the same country, the relative AI skill penetration rate for men is 1.9, the average penetration of AI skills among women in that country is about 5% higher than that of men (2/1.9 - 1 ≈ 0.05) for the same set of occupations.
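The skill-genome computation is, in essence, term-frequency re-weighting followed by a share. A minimal sketch with invented counts; the TF-IDF form used here is a simple textbook variant, not LinkedIn's production model, and the top-3 cutoff stands in for the top-50 rule:

```python
# Minimal sketch of the skill-genome / AI-penetration computation described
# above. All counts are invented; TOP_K mirrors the "top 50" rule in miniature.

import math

# skill -> frequency among members of one entity (e.g., one occupation)
entity_counts = {"python": 900, "machine learning": 400, "excel": 950,
                 "deep learning": 150, "sql": 700, "negotiation": 300}
# skill -> number of entities (out of n_entities) in which the skill appears;
# this drives the IDF term that down-weights ubiquitous skills
doc_freq = {"python": 40, "machine learning": 12, "excel": 80,
            "deep learning": 6, "sql": 55, "negotiation": 35}
n_entities = 100
AI_SKILLS = {"machine learning", "deep learning"}
TOP_K = 3  # LinkedIn uses the top 50; small here for the toy data

tfidf = {s: c * math.log(n_entities / doc_freq[s])
         for s, c in entity_counts.items()}
genome = sorted(tfidf, key=tfidf.get, reverse=True)[:TOP_K]
penetration = len(AI_SKILLS & set(genome)) / len(genome)
print(genome, f"AI skill penetration: {penetration:.0%}")
# A relative penetration is then the ratio of this rate to a benchmark's rate.
```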
BGOV’s proprietary tools ingest and organize semi-structured government data sets and documents, enabling users to track and forecast investment in key markets. Methodology The BGOV data included in this section was drawn from three original sources: Contract Spending: BGOV’s Contracts Intelligence Tool ingests on a twice-daily basis all contract spending data published to the beta.SAM.gov Data Bank, and structures the data to ensure a consistent picture of government spending over time. For the section “U.S. Government Contract Spending,” BGOV analysts used FPDS-NG data, organized by the Contracts Intelligence Tool, to build a model of government spending on artificial intelligence- related contracts in the fiscal years 2000 through 2021. BGOV’s model used a combination of government- defined produce service codes and more than 100 AI-related keywords and acronyms to identify AI-related contract spending. Defense RDT&E Budget: BGOV organized all 7,057 budget line items included in the RDT&E budget request based on data available on the DOD Comptroller website. For the section “U.S. Department of Defense (DOD) Budget,” BGOV used a set of more than a dozen AI- specific keywords to identify 305 unique budget activities related to artificial intelligence and machine learning worth a combined USD 5.0 billion in FY 2021. Congressional Record (available on Congressional Record website): BGOV maintains a repository of congressional documents, including bills, amendments, bill summaries, Congressional Budget Office assessments, reports published by congressional committees, Congressional Research Service (CRS), and others. For the section “U.S. Congressional Record,” BGOV analysts identified all legislation (passed or introduced), congressional committee reports, and CRS reports that referenced one or more of a dozen AI- specific keywords. Results are organized by a two-year congressional session. L I Q U I D N E T Prepared by Jeffrey Banner and Steven Nichols Source Liquidnet provides sentiment data that predicts the market impact of central bank and corporate communications. Learn more about Liquidnet here. Examples of Central Bank Mentions Here are some examples of how AI is mentioned by central banks: In the first case, China uses a geopolitical environment simulation and prediction platform that works by crunching huge amounts of data and then providing foreign policy suggestions to Chinese diplomats or the Bank of Japan use of AI prediction models for foreign exchange rates. For the second case, many central banks are leading communications through either official documents—for example, on July 25, 2019, the Dutch Central Bank (DNB) published Guidelines for the use of AI in financial services and launched its six “SAFEST” principles for regulated firms to use AI responsibly—or a speech on June 4, 2019, by the Bank of England’s Executive Director of U.K. Deposit Takers Supervision James Proudman, titled “Managing Machines: The Governance of Artificial Intelligence,” focused on the increasingly important strategic issue of how boards of regulated financial services should use AI. 
MCKINSEY GLOBAL INSTITUTE

Source

Data collection and analysis was performed by the McKinsey Global Institute (MGI; https://www.mckinsey.com/mgi/overview).

Canada (House of Commons)

Data was collected using the Hansard search feature on the Parliament of Canada website (https://www.ourcommons.ca/Search/en/publications/hansard). MGI searched for the terms "Artificial Intelligence" and "Machine Learning" (quotes included) and downloaded the results as a CSV. The date range was set to "all debates." Data is as of December 31, 2020; data is available online from August 31, 2002. Each count indicates that Artificial Intelligence or Machine Learning was mentioned in a particular comment or remark during the proceedings of the House of Commons. Within an event or conversation, if a member mentions AI or ML multiple times within one remark, it appears only once; however, if during the same event the speaker mentions AI or ML in separate comments (with other speakers in between), it appears multiple times. Counts for Artificial Intelligence and Machine Learning are separate, as they were conducted in separate searches. Mentions of the abbreviations AI or ML are not included.

United Kingdom (House of Commons, House of Lords, Westminster Hall, and Committees)

Data was collected using the Find References feature of the Hansard website of the U.K. Parliament (https://hansard.parliament.uk/). MGI searched for the terms "Artificial Intelligence" and "Machine Learning" (quotes included) and catalogued the results. Data is as of December 31, 2020; data is available online from January 1, 1800 onward. Contains Parliamentary information licensed under the Open Parliament Licence v3.0 (https://www.parliament.uk/site-information/copyright-parliament/open-parliament-licence/). As in Canada, each count indicates that Artificial Intelligence or Machine Learning was mentioned in a particular comment or remark during a proceeding: a member mentioning AI or ML multiple times within one remark appears only once, while separate comments by the same speaker during the same event appear multiple times. Counts for the two terms are separate, and mentions of the abbreviations AI or ML are not included.

United States (Senate and House of Representatives)

Data was collected using the advanced search feature of the U.S. Congressional Record website (https://www.congress.gov/congressional-record). MGI searched the terms "Artificial Intelligence" and "Machine Learning" (quotes included) and downloaded the results as a CSV. The "word variant" option was not selected, and proceedings included the Senate, the House of Representatives, and Extensions of Remarks, but not the Daily Digest. Data is as of December 31, 2020, and is available online from the 104th Congress (1995) onward. Each count indicates that Artificial Intelligence or Machine Learning was mentioned during a particular event contained in the Congressional Record, including the reading of a bill. If a speaker mentioned AI or ML multiple times within remarks, or multiple speakers mentioned AI or ML within the same event, it appears only once. Counts for the two terms are separate, and mentions of the abbreviations AI or ML are not included.
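The counting rule used across the three legislatures (at most one count per comment or remark, separate searches per term, abbreviations excluded) is easy to mirror in code. A minimal sketch over invented records:

```python
# Minimal sketch of the mention-counting rule described above: each comment
# counts at most once per term, and the two terms are counted separately.
# The records are invented stand-ins for Hansard / Congressional Record rows.

records = [
    {"event": "Debate A", "speaker": "Member 1",
     "text": "Artificial intelligence will reshape work. Artificial intelligence!"},
    {"event": "Debate A", "speaker": "Member 2",
     "text": "Machine learning needs oversight."},
    {"event": "Debate A", "speaker": "Member 1",
     "text": "I return to artificial intelligence."},
]

TERMS = ["artificial intelligence", "machine learning"]  # abbreviations excluded

counts = {term: sum(term in r["text"].lower() for r in records)
          for term in TERMS}
print(counts)  # {'artificial intelligence': 2, 'machine learning': 1}
```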
U.S. AI POLICY PAPERS

Source

Data collection and analysis was performed by the Stanford Institute for Human-Centered Artificial Intelligence and the AI Index.

Organizations

To develop a more nuanced understanding of the thought leadership that motivates AI policy, we tracked policy papers published by 36 organizations across three broad categories:

Think Tanks, Policy Institutes & Academia: Organizations where experts (often from academia and the political sphere) provide information and advice on specific policy problems. We included the following 27 organizations: AI PULSE at UCLA Law, American Enterprise Institute, Aspen Institute, Atlantic Council, Berkeley Center for Long-Term Cybersecurity, Brookings Institution, Carnegie Endowment for International Peace, Cato Institute, Center for a New American Security, Center for Strategic and International Studies, Council on Foreign Relations, Georgetown Center for Security and Emerging Technology (CSET), Harvard Belfer Center, Harvard Berkman Klein Center, Heritage Foundation, Hudson Institute, MacroPolo, MIT Internet Policy Research Initiative, New America Foundation, NYU AI Now Institute, Princeton School of Public and International Affairs, RAND Corporation, Rockefeller Foundation, Stanford Institute for Human-Centered Artificial Intelligence (HAI), Stimson Center, Urban Institute, and Wilson Center.

Civil Society, Associations & Consortiums: Not-for-profit institutions, including community-based organizations and NGOs, advocating for a range of societal issues. We included the following nine organizations: Algorithmic Justice League, Alliance for Artificial Intelligence in Healthcare, Amnesty International, EFF, Future of Privacy Forum, Human Rights Watch, IJIS, Institute of Electrical and Electronics Engineers, and Partnership on AI.

Industry & Consultancy: Professional practices providing expert advice to clients and large industry players.
We included six prominent organizations in this space: Accenture, Bain & Co., BCG, Deloitte, Google AI, and McKinsey & Company.

Methodology

Each broad topic area is based on a collection of underlying keywords that describe the content of a specific paper. We included 17 topics that represented the majority of discourse related to AI in 2019–2020. The topic areas and associated keywords are listed below.
• Health & Biological Sciences: medicine, healthcare systems, drug discovery, care, biomedical research, insurance, health behaviors, COVID-19, global health
• Physical Sciences: chemistry, physics, astronomy, earth science
• Energy & Environment: energy costs, climate change, energy markets, pollution, conservation, oil & gas, alternative energy
• International Affairs & International Security: international relations, international trade, developing countries, humanitarian assistance, warfare, regional security, national security, autonomous weapons
• Justice & Law Enforcement: civil justice, criminal justice, social justice, police, public safety, courts
• Communications & Media: social media, disinformation, media markets, deepfakes
• Government & Public Administration: federal government, state government, local government, public sector efficiency, public sector effectiveness, government services, government benefits, government programs, public works, public transportation
• Democracy: elections, rights, freedoms, liberties, personal freedoms
• Industry & Regulation: economy, antitrust, M&A, competition, finance, management, supply chain, telecom, economic regulation, technical standards, autonomous vehicle industry & regulation
• Innovation & Technology: advancements and improvements in AI technology, R&D, intellectual property, patents, entrepreneurship, innovation ecosystems, startups, computer science, engineering
• Education & Skills: early childhood, K-12, higher education, STEM, schools, classrooms, reskilling
• Workforce & Labor: labor supply and demand, talent, immigration, migration, personnel economics, future of work
• Social & Behavioral Sciences: sociology, linguistics, anthropology, ethnic studies, demography, geography, psychology, cognitive science
• Humanities: arts, music, literature, language, performance, theater, classics, history, philosophy, religion, cultural studies
• Equity & Inclusion: biases, discrimination, gender, race, socioeconomic inequality, disabilities, vulnerable populations
• Privacy, Safety & Security: anonymity, GDPR, consumer protection, physical safety, human control, cybersecurity, encryption, hacking
• Ethics: transparency, accountability, human values, human rights, sustainability, explainability, interpretability, decision-making norms

GLOBAL AI VIBRANCY

OVERVIEW

The tables below show the high-level pillars, sub-pillars, and indicators covered by the Global AI Vibrancy Tool. Each sub-pillar is composed of individual indicators reported in the Global AI Vibrancy Codebook (https://drive.google.com/file/d/1HzSGtHVy4ZO4jmjekEF6AUTc6G5gkKhB/view?usp=sharing). There are 22 metrics in total: 14 under the Research and Development (R&D) pillar, 6 under the Economy pillar, and 2 under the Inclusion pillar, specific to gender diversity. To aid data-driven decision-making in the design of national policy strategies, the Global AI Vibrancy is available as a web tool.
R&D pillar (sub-pillar: variable)
• Conference Publications: Number of AI conference papers*
• Conference Publications: Number of AI conference papers per capita
• Conference Publications: Number of AI conference citations*
• Conference Publications: Number of AI conference citations per capita
• Journal Publications: Number of AI journal papers*
• Journal Publications: Number of AI journal papers per capita
• Journal Publications: Number of AI journal citations*
• Journal Publications: Number of AI journal citations per capita
• Innovation > Patents: Number of AI patents*
• Innovation > Patents: Number of AI patents per capita
• Innovation > Patents: Number of AI patent citations*
• Innovation > Patents: Number of AI patent citations per capita
• Journal Publications > Deep Learning: Number of Deep Learning papers*
• Journal Publications > Deep Learning: Number of Deep Learning papers per capita

Economy pillar (sub-pillar: variable)
• Skills: Relative Skill Penetration
• Skills: Number of unique AI occupations (job titles)
• Labor: AI hiring index
• Investment: Total AI Private Investment*
• Investment: AI Private Investment per capita
• Investment: Number of Startups Funded*
• Investment: Number of funded startups per capita

Inclusion pillar (sub-pillar: variable)
• Gender Diversity: AI Skill Penetration (female)
• Gender Diversity: Number of unique AI occupations (job titles), female

The web tool allows users to adjust the weight of each metric based on their individual preferences. The default settings offer three weighting options:
• All weights to midpoint: assigns equal weights to all indicators.
• Only absolute metrics: assigns maximum weights to absolute metrics; per capita metrics are not considered.
• Only per capita metrics: assigns maximum weights to per capita metrics; absolute metrics are not considered.

The charts update automatically when any weight is changed. The user can select a "Global" or "National" view to visualize the results. The "Global" view offers a cross-country comparison based on the weights selected by the user. The "National" view offers a country deep dive to assess which AI indicators a given country is relatively strong in. The country-metric values are scaled 0–100, where 100 indicates that a given country has the highest value in the global distribution for that metric, and values near 0 or 1 indicate relatively low values in the global distribution. This can help identify areas for improvement and national policy strategies to support a vibrant AI ecosystem.

CONSTRUCTION OF THE GLOBAL AI VIBRANCY: COMPOSITE MEASURE

Source

The data is collected by the AI Index using the diverse datasets referenced in the 2020 AI Index Report chapters.

Methodology
Step 1: Obtain, harmonize, and integrate data on individual attributes across countries and time.
Step 2: Use a min-max scaler to normalize each country-year-specific indicator to the 0–100 range.
Step 3: Take the arithmetic mean per country-indicator for a given year.
Step 4: Build a modular weighted aggregate over the available pillars and individual indicators.

Aggregate Measure

The AI Vibrancy Composite Index for country c in year t can be expressed as:

\[ \mathrm{Vibrancy}_{c,t} = \sum_{p} W_p \cdot \frac{1}{N} \sum_{i \in p} w_i \, \hat{x}_{i,c,t} \]

where c represents a country, t represents a year, \(\hat{x}_{i,c,t}\) is the scaled (0–100) individual indicator, \(w_i\) is the weight assigned to individual indicators, \(W_p\) is the weight specific to one of the three high-level pillars, and N is the number of indicators available for a given country for a specific year.

Normalization

To adjust for differences in units of measurement and ranges of variation, all 22 variables were normalized into the [0, 100] range, with higher scores representing better outcomes. A minimum-maximum normalization method was adopted, given the minimum and maximum values of each variable. The normalization formula is:

\[ \hat{x}_{i,c,t} = 100 \times \frac{x_{i,c,t} - \min_c(x_{i,t})}{\max_c(x_{i,t}) - \min_c(x_{i,t})} \]

Coverage and Nuances

A threshold of 73% coverage was chosen to select the final list of countries, based on the average of available data between 2015 and 2020. Russia and South Korea were added manually, due to their growing importance in the global AI landscape, even though they did not pass the 73% threshold.
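A minimal sketch of Steps 2–4 in Python; the countries, raw values, and weights are invented, and only four of the 22 indicators are mimicked:

```python
# Minimal sketch of the min-max normalization (Step 2) and weighted
# aggregation (Steps 3-4) described above. Countries, values, and weights
# are invented; the real tool uses the 22 indicators listed below with
# user-adjustable weights.

def min_max(values: dict) -> dict:
    """Scale a cross-country series to the 0-100 range (Step 2)."""
    lo, hi = min(values.values()), max(values.values())
    return {c: 100 * (v - lo) / (hi - lo) for c, v in values.items()}

# indicator -> (pillar, {country: raw value}); four toy indicators only
raw = {
    "ai_journal_papers":  ("R&D",     {"A": 1200, "B": 300, "C": 60}),
    "ai_patents":         ("R&D",     {"A": 800,  "B": 450, "C": 20}),
    "private_investment": ("Economy", {"A": 9.5,  "B": 2.1, "C": 0.4}),
    "ai_hiring_index":    ("Economy", {"A": 1.3,  "B": 1.1, "C": 1.0}),
}
indicator_weight = {name: 1.0 for name in raw}   # "all weights to midpoint"
pillar_weight = {"R&D": 1.0, "Economy": 1.0}

scaled = {name: (pillar, min_max(vals)) for name, (pillar, vals) in raw.items()}

def vibrancy(country: str) -> float:
    """Weighted mean of scaled indicators over the N available for a country."""
    total, n = 0.0, 0
    for name, (pillar, vals) in scaled.items():
        if country in vals:
            total += pillar_weight[pillar] * indicator_weight[name] * vals[country]
            n += 1
    return total / n

for country in ["A", "B", "C"]:
    print(country, round(vibrancy(country), 1))  # A scores 100 on every toy series
```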
RESEARCH AND DEVELOPMENT INDICATORS

1. Number of AI conference papers* (R&D > Conference Publications): Total count of published AI conference papers attributed to institutions in the given country. Source: Microsoft Academic Graph (MAG).
2. Number of AI conference papers per capita (R&D > Conference Publications): The same count in per capita terms; the denominator is the population (in tens of millions) for a given year to obtain scaled values. Source: MAG.
3. Number of AI conference citations* (R&D > Conference Publications): Total count of AI conference citations attributed to institutions in the given country. Source: MAG.
4. Number of AI conference citations per capita (R&D > Conference Publications): The same count in per capita terms; the denominator is the population (in tens of millions) for a given year to obtain scaled values. Source: MAG.
5. Number of AI journal papers* (R&D > Journal Publications): Total count of published AI journal papers attributed to institutions in the given country. Source: MAG.
6. Number of AI journal papers per capita (R&D > Journal Publications): The same count in per capita terms; the denominator is the population (in tens of millions) for a given year to obtain scaled values. Source: MAG.
7. Number of AI journal citations* (R&D > Journal Publications): Total count of AI journal citations attributed to institutions in the given country. Source: MAG.
8. Number of AI journal citations per capita (R&D > Journal Publications): The same count in per capita terms; the denominator is the population (in tens of millions) for a given year to obtain scaled values. Source: MAG.
9. Number of AI patents* (R&D > Innovation > Patents): Total count of published AI patents attributed to institutions in the given country. Source: MAG.
10. Number of AI patents per capita (R&D > Innovation > Patents): The same count in per capita terms; the denominator is the population (in tens of millions) for a given year to obtain scaled values. Source: MAG.
11. Number of AI patent citations* (R&D > Innovation > Patents): Total count of published AI patent citations attributed to institutions in the country of the originating patent filing. Source: MAG.
12. Number of AI patent citations per capita (R&D > Innovation > Patents): The same count in per capita terms; the denominator is the population (in tens of millions) for a given year to obtain scaled values. Source: MAG.
13. Number of deep learning papers* (R&D > Journal Publications > Deep Learning): Total count of arXiv papers on deep learning attributed to institutions in the given country. Source: arXiv, NESTA.
14. Number of deep learning papers per capita (R&D > Journal Publications > Deep Learning): The same count in per capita terms; the denominator is the population (in tens of millions) for a given year to obtain scaled values. Source: arXiv, NESTA.

ECONOMY INDICATORS

15. Relative skill penetration (Economy > Skills): The relative skill penetration rate compares how prevalent AI skills are in the average occupation in each country against a benchmark (e.g., the global average), controlling for the same set of occupations. Source: LinkedIn Economic Graph.
16. AI hiring index (Economy > Labor): The AI hiring rate is the percentage of LinkedIn members who had any AI skills (see the appendix for the AI skill grouping) on their profile and added a new employer to their profile in the same month the new job began, divided by the total number of LinkedIn members in the country. This rate is then indexed to the average month in 2015–2016; for example, an index of 1.05 indicates a hiring rate 5% higher than the average month in 2015–2016. Source: LinkedIn Economic Graph.
17. Total Amount of Funding* (Economy > Investment): Total amount of private investment funding received by AI startups (nominal USD). Source: Crunchbase, CapIQ, NetBase Quid.
18. Total per capita funding (Economy > Investment): The same amount in per capita terms; the denominator is the population (in tens of millions) for a given year to obtain appropriately scaled values. Source: Crunchbase, CapIQ, NetBase Quid.
19. Number of companies funded* (Economy > Investment): Total number of AI companies founded in the given country. Source: Crunchbase, CapIQ, NetBase Quid.
20. Number of companies funded per capita (Economy > Investment): The same count in per capita terms; the denominator is the population (in tens of millions) for a given year to obtain appropriately scaled values. Source: Crunchbase, CapIQ, NetBase Quid.
INCLUSION INDICATORS

21. AI skill penetration (female) (Inclusion > Gender Diversity): The relative skill penetration rate compares how prevalent AI skills are in the average occupation in each country against a benchmark (e.g., the global average), controlling for the same set of occupations. Source: LinkedIn Economic Graph.
22. Number of unique AI occupations (job titles), female (Inclusion > Gender Diversity): Number of unique AI occupations (or job titles) with high AI skill penetration for women in a given country. Source: LinkedIn Economic Graph.